Hi,

I just found out that DirectRunner does not appear to use LateDataDroppingDoFnRunner, which means it does not drop late data when no GBK operation is involved (dropping in GBK seems to work correctly). There also appears to be no @Category(ValidatesRunner) test for this behavior (DirectRunner would fail it). So the question is: should late data dropping be considered part of the model, of which DirectRunner should be a canonical implementation, and therefore be fixed there? Or is late data dropping an optional feature of a runner?
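
To make concrete what I mean by "late data dropping", here is a minimal plain-Java sketch. It assumes the rule is that an element becomes droppable once the input watermark has passed the end of its window plus the allowed lateness. The class and method names (LateDataSketch, expiryTime, isDroppable) are hypothetical and are not Beam API; this only illustrates the rule and is not a copy of what LateDataDroppingDoFnRunner actually does.

```java
import java.time.Duration;
import java.time.Instant;

// Plain-Java illustration only; every name in this class is hypothetical, not Beam API.
final class LateDataSketch {

    // Assumed rule: a window expires at its max timestamp plus the allowed lateness.
    static Instant expiryTime(Instant windowMaxTimestamp, Duration allowedLateness) {
        return windowMaxTimestamp.plus(allowedLateness);
    }

    // An element is droppable when the input watermark is already past its window's expiry time.
    static boolean isDroppable(Instant windowMaxTimestamp,
                               Duration allowedLateness,
                               Instant inputWatermark) {
        return expiryTime(windowMaxTimestamp, allowedLateness).isBefore(inputWatermark);
    }

    public static void main(String[] args) {
        Instant windowEnd = Instant.parse("2020-01-01T00:09:59.999Z");
        Duration lateness = Duration.ofMinutes(1);

        // Watermark is still within the allowed lateness: the element is kept.
        System.out.println(isDroppable(windowEnd, lateness,
                Instant.parse("2020-01-01T00:10:30Z"))); // false

        // Watermark is past window end + allowed lateness: the element is droppable.
        System.out.println(isDroppable(windowEnd, lateness,
                Instant.parse("2020-01-01T00:11:30Z"))); // true
    }
}
```

The question is whether a check like this should apply to every stateless step as well, and not only at GBK.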

I'm strongly in favor of the first option, and I think all real-world runners likely adhere to it (I didn't check, though).

Opinions?

 Jan
