Hi Marco,

It is not clear which checks you are interested in. Beam does not have any standard business-specific data quality checks. However, you can add your own checks at various stages of the pipeline. The checks broadly fall into 2 categories:

1. Checks on a single element: These are easy to do, as you can write a transform that checks each element.
2. Checks that correlate data across elements, such as "at least one domain has 2 pages" etc.: For these, you can use an aggregation and then apply the check that you need.
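As a rough illustration of the two categories, here is a minimal sketch using the Python SDK for a batch pipeline. The function names (`check_element`, `check_min_count`, `run`), the `"domain"` field, and the minimum-count threshold are all hypothetical and only stand in for whatever rules your jobs need. It assumes that raising an exception inside a transform is an acceptable way to make the job fail.

```python
def check_element(record):
    """Category 1: per-element check (hypothetical rule: a domain must be set)."""
    if not record.get("domain"):
        raise ValueError(f"DQ check failed: record has no domain: {record!r}")
    return record


def check_min_count(count, minimum):
    """Category 2: check on an aggregate (hypothetical rule: minimum number of outputs)."""
    if count < minimum:
        raise ValueError(
            f"DQ check failed: expected at least {minimum} outputs, got {count}"
        )
    return count


def run(records, minimum=2):
    # Imported here so the check functions above can be tested without Beam installed.
    import apache_beam as beam

    # Leaving the `with` block runs the pipeline; an exception raised by a
    # check makes the run fail.
    with beam.Pipeline() as pipeline:
        checked = (
            pipeline
            | "Create" >> beam.Create(records)
            | "CheckEachElement" >> beam.Map(check_element)
        )
        _ = (
            checked
            | "CountAll" >> beam.combiners.Count.Globally()
            | "CheckCount" >> beam.Map(check_min_count, minimum)
        )


if __name__ == "__main__":
    run([{"domain": "example.com"}, {"domain": "example.org"}])
```

Keeping the checks as plain functions means they can be unit-tested on their own and reused in any stage. Note that some runners retry a failed bundle a few times before the job is marked as failed, so the failure may not be immediate.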
Thanks,
Ankur

On Sun, 1 Jan 2023 at 11:53, Sofia’s World <mmistr...@gmail.com> wrote:

> Hi all
> Are there any facilities to do dqchecks on apache beam?
> Got few jobs that download data from web..do some filters transformation
> and aggregation..
> Want to introduce dqchecks so Job fails if certain conditions are not met
> eg number of outputs....
> Is that achievable in beam?
> Thanks. Marco
>