- Employment type: Full Time
- Experience: 8-9 years
- Salary: 8L to 11L per annum
- Location: Work From Home
- Work timing: 9:45 AM to 6:45 PM IST
- Working Days: 5 Days
- Education: Any Degree
Key Responsibilities
• Design, develop, and optimize data pipelines using Databricks and Apache Airflow.
• Implement PySpark-based transformations and processing in Databricks for handling large-scale data.
• Develop and maintain SQL-based data pipelines, ensuring performance tuning and optimization.
• Create Python scripts for automation, data transformation, and API-based data ingestion.
• Work with Airflow DAGs to schedule and orchestrate data workflows efficiently (see the illustrative sketch after this list).
• Optimize data lake and data warehouse performance for scalability and reliability.
• Integrate data pipelines with cloud platforms (AWS, Azure, or GCP) and various data storage solutions.
• Ensure adherence to data security, governance, and compliance standards.
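To give a flavour of the orchestration work described above, here is a minimal sketch of an Airflow DAG that submits a PySpark job to Databricks on a daily schedule. It assumes Airflow 2.x with the Databricks provider installed; the DAG name, connection ID, cluster spec, and script path are hypothetical placeholders, not part of this role's actual codebase.

```python
# Illustrative sketch: daily Airflow DAG that submits a PySpark job to Databricks.
# All names below (dag_id, connection, cluster spec, script path) are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

with DAG(
    dag_id="daily_sales_pipeline",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_transform = DatabricksSubmitRunOperator(
        task_id="run_pyspark_transform",
        databricks_conn_id="databricks_default",  # assumes a configured Databricks connection
        new_cluster={
            "spark_version": "13.3.x-scala2.12",  # example runtime; adjust to the workspace
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
        spark_python_task={
            # hypothetical PySpark transformation script stored in DBFS
            "python_file": "dbfs:/pipelines/transform_sales.py",
        },
    )
```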
Required Skills & Qualifications
• 8-9 years of experience in Data Engineering or related fields.
• Strong expertise in Databricks (PySpark, Delta Lake, DBSQL).
• Proficiency in Apache Airflow for scheduling and orchestrating workflows.
• Advanced SQL skills for data extraction, transformation, and performance tuning.
• Strong programming skills in Python (pandas, NumPy, PySpark, APIs).
• Experience with big data technologies and distributed computing.
• Hands-on experience with cloud platforms (AWS / Azure / GCP).
• Expertise in data warehousing and data modeling concepts.
• Understanding of CI/CD pipelines and version control (Git).
• Experience with data governance, security, and compliance best practices.
• Excellent troubleshooting, debugging, and performance optimization skills.