I like the use of metrics since it doesn't depend on external resources.
I believe a small amount of code could be shared between runners
for the PAssert metric verification.
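
To make that concrete, the shared piece could be little more than a helper
that queries the pipeline's metrics for PAssert's success/failure counters
after run(). A rough sketch follows, with the caveats that the PAssertVerifier
name is made up, that using PAssert.class as the counter namespace is an
assumption, that the exact Metrics query accessors (counters()/attempted()
vs. getCounters()/getAttempted()) vary across SDK versions, and that the
expected assertion count would have to be supplied by the caller:

import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.MetricNameFilter;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.MetricsFilter;
import org.apache.beam.sdk.testing.PAssert;

/** Sketch of a runner-agnostic PAssert verification helper (hypothetical). */
class PAssertVerifier {

  /** Fails if any assertion failed, or if fewer assertions succeeded than expected. */
  static void verify(PipelineResult result, int expectedAssertions) {
    // Query only the PAssert counters; PAssert.class as the namespace is an assumption.
    MetricQueryResults metrics =
        result
            .metrics()
            .queryMetrics(
                MetricsFilter.builder()
                    .addNameFilter(MetricNameFilter.named(PAssert.class, PAssert.SUCCESS_COUNTER))
                    .addNameFilter(MetricNameFilter.named(PAssert.class, PAssert.FAILURE_COUNTER))
                    .build());

    long successes = 0;
    long failures = 0;
    for (MetricResult<Long> counter : metrics.counters()) {
      // attempted()/name() may be getAttempted()/getName() depending on the SDK version.
      long value = counter.attempted();
      if (PAssert.SUCCESS_COUNTER.equals(counter.name().name())) {
        successes += value;
      } else if (PAssert.FAILURE_COUNTER.equals(counter.name().name())) {
        failures += value;
      }
    }

    if (failures > 0) {
      throw new AssertionError("Found " + failures + " failed PAssert assertion(s)");
    }
    if (successes < expectedAssertions) {
      throw new AssertionError(
          "Expected " + expectedAssertions + " successful PAssert assertion(s), saw " + successes);
    }
  }
}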

I would say that PAssert by itself and PAssert with metrics are two levels
of testing available. For runners that don't support metrics, PAssert alone
still gives a signal (albeit a weaker one), and runners that do support
metrics will have a stronger signal for execution correctness.
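
To illustrate the two levels, here is roughly what a test looks like with and
without the extra check, assuming a helper like the verify() sketch above
(class and method names are again just for illustration):

import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
import org.junit.Test;

public class TwoLevelsOfSignalTest {

  @Rule public final transient TestPipeline pipeline = TestPipeline.create();

  @Test
  public void testWithAndWithoutMetricVerification() {
    PCollection<Integer> output = pipeline.apply(Create.of(1, 2, 3));

    // Level 1: PAssert alone. If a runner never executes the assertion
    // transform, nothing fails and the test still passes.
    PAssert.that(output).containsInAnyOrder(1, 2, 3);

    PipelineResult result = pipeline.run();
    result.waitUntilFinish();

    // Level 2 (only on runners with metrics support): verify that the single
    // assertion above actually ran and succeeded.
    PAssertVerifier.verify(result, 1);
  }
}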

On Fri, Apr 7, 2017 at 11:59 AM, Aviem Zur <[email protected]> wrote:

> Currently, PAssert assertions may not actually execute, and tests will pass
> while silently hiding issues.
>
> Up until now, several runners have implemented a check that the expected
> number of successful assertions has actually happened, and that no failed
> assertions have happened. (The runners which check this are the Dataflow
> runner and the Spark runner.)
>
> This has been valuable in the past to find bugs which were hidden by
> passing tests.
>
> The work to repeat this in https://issues.apache.org/jira/browse/BEAM-1726
> has surfaced bugs in the Flink runner that were also hidden by passing tests.
> However, with the removal of aggregators in
> https://issues.apache.org/jira/browse/BEAM-1148, this ticket will be harder
> to implement, since the Flink runner does not support metrics.
>
> I believe that validating that runners do in fact support the Beam model is
> a blocker for the first stable release. (BEAM-1726 was also marked as a
> blocker for the Flink runner.)
>
> I think we have one of two choices here:
> 1. Keep implementing this for each runner separately.
> 2. Implement this in a runner-agnostic way (for runners which support
> metrics, use metrics; for those that do not, use a fallback implementation,
> perhaps using files or some other method). This should be covered by the
> following ticket: https://issues.apache.org/jira/browse/BEAM-1763
>
> Thoughts?
>
