Hi Marco,
It is not clear which checks you are interested in.
Beam does not have any standard business-specific data quality checks.
However, you can add your own checks at various stages of the pipeline.
The checks broadly fall into two categories:
1. Checks on a single element: These are easy to do, as you can write a
transform that checks each element on its own.
2. Checks that correlate data across elements, such as "at least a single
domain has 2 pages": For these, you can aggregate first and then apply
the check that you need to the aggregated result.

Thanks,
Ankur


On Sun, 1 Jan 2023 at 11:53, Sofia’s World <mmistr...@gmail.com> wrote:

> Hi all
>  Are there any facilities to do dqchecks on apache beam?
> Got few jobs that download data from web..do some filters transformation
> and aggregation..
> Want to introduce dqchecks so Job fails if certain conditions are not met
> eg number of outputs....
> Is that achievable in beam?
> Thanks. Marco
>
