Hi,

I have written a PoC implementation of this in [1] and I'd like to discuss some implementation details. First of all, I'd appreciate any feedback about this. There are some known issues:

 1) need to figure out how to get Coder of input PCollection of stateful ParDo inside StatefulDoFnRunner

 2) there are performance considerations, that can be solved probably only by Sorted Map State [2]

 3) additional work is needed for allowedLateness to work correctly (and there are at least two ways how to solve this), see the design doc [3]

 4) more tests (for batch and validatesRunner) are needed

I have come across a few bugs in DirectRunner, which I tried to solve:

 a) timers seem to be broken in stateful pardo with side inputs

 b) timers need to be sorted by timestamp, otherwise state might be cleared before it gets chance to be flushed


Thanks for feedback,

 Jan


[1] https://github.com/apache/beam/pull/8774

[2] http://mail-archives.apache.org/mod_mbox/beam-dev/201905.mbox/%3ccalstk6+ldemtjmnuysn3vcufywjkhmgv1isfbdmxthoqh91...@mail.gmail.com%3e

[3] https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/


On 5/23/19 4:40 PM, Robert Bradshaw wrote:
Thanks for writing this up.

I think the justification for adding this to the model needs to be
that it is useful (you have this covered, though some examples would
be nice) and that it's something that can't easily be done by users
themselves (specifically, though it can be (relatively) cheaply done
in streaming and batch, it's done in very different ways, and also
that it's hard to do via composition).

On Thu, May 23, 2019 at 4:10 PM Jan Lukavský <je...@seznam.cz> wrote:
Hi,

I have written a very brief draft of how it might be possible to
implement @RequireTimeSortedInput discussed in [1]. I see the document
[2] a starting point for a discussion. There are several open questions,
which I believe can be resolved by this great community. :-)

Jan

[1] http://mail-archives.apache.org/mod_mbox/beam-dev/201905.mbox/browser

[2]
https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/

Reply via email to