Innova ESI
AWS Glue Developer - Data Engineering
Job Location
Pune, India
Job Description
We are hiring an AWS Glue Developer to join our team for a multinational company (MNC) in collaboration with Innova. This role requires a strong data engineering background with 5 years of experience working with Apache PySpark, AWS Glue, and the AWS stack. You will work on data integration, ETL processes, and the design of dimensional data models using AWS Glue and other cloud technologies. The ideal candidate will also have a strong understanding of DevOps and DataOps, and an SRE mindset for managing the data pipeline lifecycle.

Key Responsibilities:
- Design, develop, and manage ETL pipelines using AWS Glue, focusing on data ingestion, transformation, and loading in data lakes and data warehouses.
- Use Apache PySpark and SparkSQL for large-scale data processing, with Python for efficient data transformation and manipulation.
- Integrate structured and unstructured data from various sources into AWS Glue, using its capabilities to handle complex data workflows and transformations.
- Apply dimensional modeling techniques, including star schema and snowflake schema, to design and manage data structures suitable for data warehousing.
- Work extensively with AWS services such as AWS Glue, AWS S3, AWS Step Functions, AWS Redshift, and other serverless technologies to orchestrate, process, and manage data.
- Use AWS Step Functions to orchestrate complex data workflows, including triggering ETL jobs, managing execution order, and monitoring job status.
- Implement DevOps, CloudOps, and DataOps best practices, ensuring automated deployment, monitoring, and governance of data pipelines. Adopt an SRE (Site Reliability Engineering) mindset to ensure high availability, scalability, and reliability.
- Implement CI/CD pipelines to automate the deployment and continuous delivery of data pipelines using tools like Jenkins, GitLab, or other industry-standard CI/CD tools.
- Ensure version control using Git: collaborate on code changes, manage branches, and handle pull requests effectively.
- Work with cross-functional teams to translate business requirements into technical solutions. Collaborate with data architects, data scientists, and other stakeholders to understand data needs and ensure solutions align with business objectives.
- Continuously optimize the performance of ETL pipelines for speed, scalability, and fault tolerance. Monitor system performance and troubleshoot issues related to AWS Glue jobs, PySpark transformations, and AWS resources.
- Maintain clear and comprehensive documentation for data pipelines, architectures, and best practices, and report regularly on project status and risks.

Requirements:
- 5 years of experience in data engineering with a strong focus on AWS Glue, PySpark, and AWS cloud technologies.
- Proficiency in Apache PySpark for handling large-scale data processing tasks using Python.
- Strong experience with SQL and SparkSQL for querying and transforming data.
- Experience with AWS Glue for managing ETL workflows, data transformations, and data cataloging.
- AWS Step Functions expertise for orchestrating and automating complex workflows.
- Good understanding of dimensional modeling and of creating effective data models for analytics and reporting.
- Familiarity with the AWS cloud stack (including S3, Redshift, Lambda, Glue, Kinesis, etc.).
- Knowledge of DevOps, DataOps, and CI/CD pipelines for continuous delivery and deployment of data-related systems.
- Strong experience with Git and version control practices for collaborative development.
- Hands-on experience in API design and microservices architecture.
- Software engineering skills, including coding practices, debugging, and automation for data engineering tasks.
- Strong analytical and problem-solving skills, with the ability to troubleshoot data issues in complex data environments.
- Communication skills: ability to communicate complex technical concepts effectively to non-technical stakeholders, both in writing and verbally.

(ref:hirist.tech)
Location: Pune, IN
Posted Date: 5/1/2025
Contact Information
Contact: Human Resources, Innova ESI