About the Company
Lift Ventures, a remote-first startup studio whose portfolio of businesses has reached over 250 million consumers to date, is seeking a seasoned and talented Data Engineer for SuperSummary, our fast-growing EdTech business. SuperSummary is a subscription-based website and mobile app offering a library of professionally written study guides and other educational tools and resources on thousands of books for students, teachers, and readers of all types.
About the Job
At SuperSummary, data drives our most important decisions — from determining which literature titles to cover in our Study Guide Library to how to best meet user needs on our site. As an integral part of our data-centric approach, we're looking for a skilled Data Engineer to join our dynamic team.
The Data Engineer will report to the Data Science Manager and interact frequently with stakeholders across the company (including product managers and department heads) who rely on data insights for building product features, creating POCs, outlining business strategy, and reporting on progress toward company goals. The ideal candidate has a proven track record in data integration, data modeling, data quality assurance, and ETL.
This role is 100% remote and is open to candidates based in Latin America. Our fully remote team is distributed across the globe, with the majority of the team working in the United States, Brazil, and the Philippines.
Key Functional Responsibilities
Perform production-level automation of data collection procedures, including web scrapers.
Integrate multiple datasets in our cloud architecture.
Lead the implementation of a Data Governance framework for the whole company.
Develop scalable data pipelines, preferably with Databricks.
Identify and implement improvements to infrastructure and to our data collection and processing methods.
Maintain, update, expand and provide quality assurance for existing data sets.
Perform data collection, transformation, and cleansing tasks, including writing and maintaining web scrapers and using external APIs.
Work with stakeholders including the Executive, Product, and Content Production teams to assist with data-related technical issues and support their data needs.
Collaborate with the Analytics team in building data models, intermediate tables, views and visualizations to provide insights to end business users.
Help with the integration of data sources into BI solutions and other reporting / visualization tools.
Sample Projects
Integrate traffic and session data from different sources (Amplitude, Google Analytics, AWR) with our current datasets.
Lead the implementation of a Data Governance framework for the whole company.
Address different data quality issues in our existing datasets.
Create custom data models, tables and views for product analytics.
Participate in the constant improvement of our search engine.
Help with the integration of LLM technologies into our data and product development activities.
Minimum Requirements
Professional working proficiency in English — we are a global team!
At least 3 years of hands-on experience with Microsoft Azure (preferred) or another cloud platform, including core services like compute, storage, and databases.
Strong experience building and deploying data projects in cloud environments using tools like Azure Data Factory, Synapse Analytics, S3, SageMaker (or equivalent in other cloud ecosystems) is a must.
3-6 years of experience in data engineering with a strong foundation in Python and SQL programming languages.
Experience working within data governance frameworks, especially in the implementation of best practices for data quality assurance.
Experience consuming APIs and integrating the retrieved data with other data sets.
Experience integrating different sources into BI tools and providing business analysts with datasets for analytics.
Proficiency in data modeling and in database design and creation.
Bachelor's degree or equivalent in a relevant field (engineering, computer science, etc.).
Strong problem-solving skills and a creative approach to overcoming technical challenges.
Excellent communication skills, capable of collaborating effectively with cross-functional and international teams.
Desired Skills and Experience
Strong BI background.
Data Governance experience.
Experience with APIs and tools related to web traffic and SEO: Google Analytics, Amplitude/Mixpanel, Ahrefs.
Proficiency in Python's data analytics stack (NumPy, pandas, etc.).
Personal Attributes
Strong problem-solving skills and meticulous attention to detail.
Ability to work independently and as part of a team in a fast-paced environment.
Excellent time management skills with the ability to report progress towards goals.
Proactive and innovative mindset, always seeking ways to improve processes and drive efficiency.
Positive attitude, open to receiving and giving constructive feedback. Approach challenges with optimism, constantly looking for ways to improve.
Adaptable and quick to learn new technologies and business domains.
Perks & Benefits
Work with a distributed, global team that has been remote-first since 2018.
Competitive salary, health benefits, and vacation policy.
Workspace improvement stipend.
Make a big impact on a small Data team.
EEOC Statement
SuperSummary supports workplace diversity and does not discriminate on the basis of age, race, national origin, religion, gender identity or expression, sexual orientation, pregnancy, physical or mental disability, or any other protected class. We welcome diverse perspectives and are dedicated to fostering an inclusive workplace where everyone can grow and thrive. We understand that candidates may not meet every requirement in the job description, but we strongly encourage individuals from all backgrounds to apply. If you're passionate about this role and our mission, even if your experience doesn't perfectly match, we'd love to hear from you and explore how you can contribute to our team.