As a Senior Data Engineer II, you will provide technical leadership to clients on a team that designs and develops pioneering, large-scale cluster data-processing systems. You will advise sophisticated organizations on large-scale data and analytics and work with client teams to deliver results.
Additionally, as a senior member of our Consulting team, you will help Think Big, A Teradata Company, establish thought leadership in the big data space by contributing white papers and technical commentary and by representing our company at industry conferences.
• Design and develop code, scripts, and data pipelines that leverage structured and unstructured data integrated from multiple sources
• Install and configure software
• Participate in and help lead requirements and design workshops with our clients
• Develop project deliverable documentation
• Lead small teams of developers and coordinate development activities
• Mentor junior members of the team in software development best practices
• Other duties as assigned
• Proven expertise in production software development
• 10+ years of experience programming in Java, Python, SQL, or C/C++
• Proficient in SQL, NoSQL, relational database design and methods for efficiently retrieving data
• Strong analytical skills
• Creative problem solver
• Excellent verbal and written communications skills
• Strong team player capable of working in a demanding start-up environment
• Experience building complex and non-interactive systems (batch, distributed, etc.)
• Experience building large scale data science pipelines end to end
• Ability to understand and tune data science models in a variety of languages (Python, R, Scala)
• Prior consulting experience
• Prior experience working as a data engineer in a telecom company
• Good cross-industry experience
• Experience with Spark, Hadoop, and JMS (e.g., JBoss)
• Experience designing and tuning high performance systems
• Familiarity with data lakes (Kylo, NiFi)
• Prior experience with data warehousing and business intelligence systems
• Professional or academic background that includes mathematics, statistics, machine learning and data mining
• Linux expertise
• Experience with Java EE and REST web services
• Commercial experience with Elasticsearch and Apache Spark
• Experience with TDD and BDD
• Prior work and/or research experience with unstructured data and data modeling
• Familiarity with different development methodologies (e.g., agile, waterfall, XP, scrum, etc.)
• Firm understanding of the Python memory model, classes, subclassing, designing classes for reuse, and using static string constants rather than in-line constants
• Ability to configure a Jenkins build, create/update a Jira ticket, and enable automated tests in a Gradle/Maven build
• Understanding of how to segregate data based on access control rules, when and how to encrypt data (whole record vs. individual fields), and when and how to mask fields
• Able to create and deploy Spark jobs via YARN or Mesos, read from a streaming source, and produce filtered or enhanced output
• Understanding of basic modeling techniques and tool sets; able to implement simple Python or R analytic routines
• Must be able to interact and communicate with the client in meetings.
• Must be able to write programming code in applicable languages.
• Must be able to write project documentation in English.
• Willingness to travel up to 75% of the time
• Fluency in other languages would be desirable
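To make the class-design expectations above concrete, here is a minimal sketch, with hypothetical class and field names, of static string constants held at class level (rather than repeated in-line literals) and a base class designed for reuse via subclassing:

```python
class Record:
    """Base record; field names are class-level string constants."""

    # Static string constants: defined once, reused everywhere, instead of
    # repeating the literals "id" / "payload" in-line throughout the code.
    FIELD_ID = "id"
    FIELD_PAYLOAD = "payload"

    def __init__(self, raw: dict):
        self.id = raw[self.FIELD_ID]
        self.payload = raw[self.FIELD_PAYLOAD]

    def describe(self) -> str:
        return f"Record({self.id})"


class EventRecord(Record):
    """Subclass reuses the parent's parsing and extends it with a timestamp."""

    FIELD_TIMESTAMP = "ts"

    def __init__(self, raw: dict):
        super().__init__(raw)  # reuse the base-class field extraction
        self.ts = raw[self.FIELD_TIMESTAMP]

    def describe(self) -> str:
        return f"Event({self.id} @ {self.ts})"


event = EventRecord({"id": "e1", "payload": {}, "ts": 1500000000})
print(event.describe())  # Event(e1 @ 1500000000)
```

Keeping field names as class constants means a schema rename touches one line, and subclasses inherit the constants automatically.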
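The data-segregation and masking bullet can likewise be sketched in a few lines. This is an illustrative stand-in, not a production control: the rule names ("allow"/"mask"/"deny") and record layout are hypothetical, and real deployments would pair masking with proper encryption and access-control infrastructure.

```python
MASK_CHAR = "*"


def mask_value(value: str, keep_last: int = 4) -> str:
    """Mask all but the last `keep_last` characters (e.g. card numbers)."""
    if len(value) <= keep_last:
        return MASK_CHAR * len(value)
    return MASK_CHAR * (len(value) - keep_last) + value[-keep_last:]


def apply_rules(record: dict, rules: dict) -> dict:
    """Apply per-field access rules: 'allow' | 'mask' | 'deny' (default)."""
    out = {}
    for field, value in record.items():
        action = rules.get(field, "deny")  # default-deny unknown fields
        if action == "allow":
            out[field] = value
        elif action == "mask":
            out[field] = mask_value(str(value))
        # "deny": the field is segregated out of the result entirely
    return out


record = {"name": "Ada", "card": "4111111111111111", "ssn": "123-45-6789"}
rules = {"name": "allow", "card": "mask"}
print(apply_rules(record, rules))
# {'name': 'Ada', 'card': '************1111'}
```

The default-deny choice reflects the "segregate data based on access control rules" requirement: any field without an explicit rule never leaves the function.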
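The Spark bullet above describes a read-filter-enhance shape. Since an actual job would be packaged and submitted to a YARN or Mesos cluster, this pure-Python stand-in sketches only the pipeline logic: a generator plays the streaming source, and the transform filters some events and enriches the rest. The event fields and threshold are hypothetical.

```python
def source():
    """Stand-in for a streaming source (e.g. a message queue or Kafka topic)."""
    yield {"user": "u1", "latency_ms": 40}
    yield {"user": "u2", "latency_ms": 900}
    yield {"user": "u3", "latency_ms": 1500}


def pipeline(events, slow_threshold_ms=1000):
    """Drop fast events (filter), then tag the slow ones (enhance)."""
    for event in events:
        if event["latency_ms"] < slow_threshold_ms:
            continue  # filter step: discard events under the threshold
        event["severity"] = "slow"  # enhance step: add a derived field
        yield event


print(list(pipeline(source())))
# [{'user': 'u3', 'latency_ms': 1500, 'severity': 'slow'}]
```

In a real Spark job the same filter and map steps would be expressed as DataFrame or RDD transformations and submitted to the cluster's resource manager.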
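Finally, a "simple analytic routine" of the kind the checklist mentions might look like the following: ordinary least squares for a single predictor, written in plain Python so it runs without any modeling library. The data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y ~ x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y over the variance of x gives the OLS slope.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
print(slope, intercept)  # 2.0 0.0
```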
Postgraduate Diploma or foreign equivalent in Computer Science or a related technical field, followed by ten (10) years of progressively responsible professional experience programming in Java, Python, or C/C++. Experience with the production software development lifecycle. Completed courses in Machine Learning, Apache Spark, Introduction to Data Science, and R Programming; experience with Linux, SQL, relational database design, and methods for efficiently retrieving data. Experience building complex and non-interactive systems (batch, distributed, etc.).
Salary in line with market rate.
Closing date for applications is 22 September 2017.