Hello all, I have a streaming pipeline with a JmsIO source and downstream there are some custom transforms that include the Wait.on() transform. This is done to wait on certain things like writing to database before publishing messages to JMS at the end of my pipeline.
This custom transform is also used in a batch pipeline and it works perfectly, waiting works as expected and all is good. However, as I mentioned, I am now using this custom transform in a streaming pipeline and it appears that the Wait.on() never "finishes" until the watermark gets past the end of the window. (I am using a window because the custom transform has GroupByKey in it; and also it uses FileIO.writeDynamic which uses GroupByKey internally it seems.) I realize this is how Wait.on() (and the trigger Never.ever() used inside) are supposed to work but then I wonder how to make it work in a streaming setup? I tried with different window types and triggers, but that doesn't seem to make any difference. The only thing that "unblocks" the Wait.on() is simply moving the watermark further - by pushing more data to the JMS source. In my use case, the source is not always producing data constantly, sometimes it is just a big burst of messages once a day. If those messages fit into one window, processing would hold them until next batch of data arrives (possibly, next day) which is not desired - I would like that it basically works as in batch mode, so the whole custom transform is executed for each window - together with Wait.on() working and expected, and subsequent transforms after waiting. Does it make sense or am I missing something in the general design of windowing/triggers/watermark/Wait.on/...? -- Paweł
