-
Design, develop, and maintain automated QA frameworks for data pipelines, APIs, and analytics platforms using Python and SQL.
-
Build reusable testing utilities for data validation, regression testing, and pipeline certification.
-
Integrate automated tests into CI/CD pipelines to support continuous testing and deployment.
-
Develop unit, integration, and end-to-end test cases for complex data workflows.
-
Leverage AI-assisted testing tools to generate test cases, identify edge cases, and improve test coverage.
-
Validate ETL/ELT pipelines to ensure accurate ingestion, transformation, and delivery of data.
-
Create automated checks for data completeness, consistency, accuracy, and timeliness.
-
Test ingestion and transformation of complex datasets, including XBRL financial data.
-
Implement reconciliation and audit mechanisms across source-to-target mappings.
-
Apply AI-driven anomaly detection to identify data quality issues and pipeline failures.
-
Develop and execute test strategies for Apache Iceberg-based data lakehouse architectures, including:
-
Schema evolution validation
-
Time travel and versioning accuracy
-
Partitioning and performance behavior
-
Validate and compare the performance and consistency of materialized views and Iceberg tables, including:
-
Query performance benchmarking
-
Data freshness and latency
-
Storage efficiency and maintenance overhead
-
Ensure alignment between precomputed datasets (materialized views) and underlying source data.
-
Implement automated validation for data quality rules, lineage, and metadata accuracy.
-
Support context engineering by validating that datasets include proper business context, definitions, and relationships.
-
Integrate QA processes with enterprise data catalogs and metadata systems to ensure discoverability and trust.
-
Validate AI-generated metadata, lineage, and transformations for accuracy and traceability.
-
Apply AI/ML and generative AI tools to enhance QA processes, including intelligent test generation, defect prediction, and automated root cause analysis.
-
Validate data readiness for AI/ML and generative AI use cases, ensuring datasets meet quality, completeness, and governance standards.
-
Collaborate with data and AI teams to test data pipelines supporting RAG, analytics, and machine learning workflows.
-
Ensure alignment with responsible AI practices, including traceability, explainability, and data integrity.
-
Support enterprise data management programs and Office of the Chief Data Officer (OCDO) initiatives by ensuring data quality and reliability across systems.
-
Contribute to data maturity assessments by evaluating data quality, testing coverage, and governance adherence.
-
Align QA processes with Federal Data Strategy and Evidence Act requirements.
-
Work closely with data engineers, data architects, and analysts to define test strategies and acceptance criteria.
-
Participate in stakeholder engagement sessions and listening campaigns to understand data quality expectations and pain points.
-
Document test results, defects, and quality metrics for both technical and non-technical stakeholders.
-
Operate within Agile teams to iteratively improve data quality processes and tooling.
-
Promote adoption of AI-driven efficiencies and automation across QA and data engineering workflows.
-
Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field.
-
5+ years of experience in QA engineering, data testing, or software development.
-
Strong programming skills in Python and advanced proficiency in SQL.
-
Experience building automated test frameworks for data platforms and ETL pipelines.
-
Hands-on experience with:
-
AWS data services (S3, Glue, Redshift, Lambda, etc.)
-
Apache Iceberg or similar data lake technologies
-
Experience validating materialized views and performance-optimized data structures.
-
Familiarity with XBRL or complex financial/regulatory datasets.
-
Understanding of data modeling, metadata, and data governance principles.
-
Experience with CI/CD tools and automated testing integration.
-
Demonstrated proficiency with AI tools and AI-assisted development/testing workflows.
-
Understanding of data quality requirements for AI/ML and analytics use cases.
-
U.S. Citizenship required; ability to obtain and maintain a federal clearance.
-
Experience supporting federal agencies such as SEC, DHS, Treasury, or Federal Reserve System.
-
Familiarity with data catalog and governance tools (e.g., Collibra, Alation, ServiceNow).
-
Experience with Apache Spark or distributed data processing frameworks.
-
Knowledge of data quality tools and observability platforms.
-
Exposure to data maturity frameworks (e.g., EDM DCAM, TDWI).
-
Experience testing large-scale cloud data platforms and lakehouse architectures.
-
Experience validating data pipelines supporting AI/ML, analytics, or generative AI solutions.
-
Familiarity with AI-driven testing tools or frameworks.
