I agree that the DirectRunner should drop late data. Late data dropping is
optional, but many people use the DirectRunner for testing, so it should
behave the same way other runners do; otherwise users may be surprised.
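
For anyone following along, here is a rough sketch of the rule I would
expect a runner to apply outside of GBK: an element is droppable once the
end of its window plus the allowed lateness is behind the input watermark.
This is plain Python, not Beam code; `Window`, `is_droppable` and
`drop_late` are hypothetical names made up for illustration, and the exact
comparison is my assumption about the intended semantics rather than a copy
of LateDataDroppingDoFnRunner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical stand-in for a window; timestamps are in milliseconds."""
    max_timestamp: int  # last timestamp that still belongs to the window


def is_droppable(window, allowed_lateness_ms, input_watermark_ms):
    # Assumed rule: the window has expired once its end plus the allowed
    # lateness is strictly before the current input watermark.
    return window.max_timestamp + allowed_lateness_ms < input_watermark_ms


def drop_late(elements, allowed_lateness_ms, input_watermark_ms):
    # elements is a list of (value, window) pairs; keep only those whose
    # window has not yet expired.
    return [
        (value, window)
        for value, window in elements
        if not is_droppable(window, allowed_lateness_ms, input_watermark_ms)
    ]
```

With zero allowed lateness and a watermark of 1000, an element in a window
ending at 999 would be dropped, while the same element survives if the
allowed lateness is 100.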

On Fri, Jan 3, 2020 at 3:33 AM Jan Lukavský <[email protected]> wrote:

> Hi,
>
> I just found out that DirectRunner is apparently not using
> LateDataDroppingDoFnRunner, which means that it doesn't drop late data
> in cases where there is no GBK operation involved (dropping in GBK seems
> to be correct). There is apparently no @Category(ValidatesRunner) test
> for that behavior (because DirectRunner would fail it), so the question
> is: should late data dropping be considered part of the model (of which
> DirectRunner should be a canonical implementation), and therefore be
> fixed there, or is late data dropping an optional feature of a
> runner?
>
> I'm strongly in favor of the first option, and I think it is likely that
> all real-world runners adhere to that (I didn't check, though).
>
> Opinions?
>
>   Jan
>
>
