If I understand correctly - your pipeline has some kind of windowing, and on every trigger downstream of the combiner, the pipeline updates a cache with a single, non-windowed value. Is that correct?
What are your keys for this pipeline? You could work this out with, as you noted, a timer that fires periodically and keeps some state holding the value that you want to write to the cache. Is this a Python or Java pipeline? What is the runner?

Best
-P.

On Mon, Nov 25, 2019 at 1:27 PM Aaron Dixon <[email protected]> wrote:

> Suppose I trigger a Combine per-element (in a high-volume stream) and use
> a ParDo as a sink.
>
> I assume there is no guarantee about the order that my ParDo will see
> these triggers, especially as it processes in parallel, anyway.
>
> That said, my sink writes to a db or cache and I would not like the cache
> to ever regress its value to something "before" what it has already written.
>
> Is the best way to solve this problem to always write the event-time in
> the cache and do a compare-and-swap, only updating the sink if the triggered
> value in-hand is later than the target value?
>
> Or is there a better way to guarantee that my ParDo sink will process
> elements in-order? (E.g., if I can give up per-event/real-time, then a
> delay-based trigger would probably be sufficient, I imagine.)
>
> Thanks for advice!
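For reference, the compare-and-swap idea from the original question can be sketched as below. This is a minimal, hypothetical in-memory stand-in for the real cache (in practice the sink ParDo would issue an equivalent conditional write against Redis, a database row, etc., ideally atomically on the server side): each write carries its event-time, and the cache only accepts a value whose timestamp is strictly newer than what it already holds, so late-arriving or reordered triggers can never regress the stored value.

```python
import threading


class EventTimeGuardedCache:
    """Toy cache that refuses writes older than what it already holds.

    This mimics the compare-and-swap scheme from the thread: store the
    event-time alongside the value, and only update when the incoming
    event-time is newer. A real sink would push this check into the
    external store (e.g. a conditional update) rather than a local lock.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._store = {}  # key -> (event_time, value)

    def put_if_newer(self, key, event_time, value):
        """Write (event_time, value) only if event_time beats the stored one.

        Returns True if the write was accepted, False if it was stale.
        """
        with self._lock:
            current = self._store.get(key)
            if current is None or event_time > current[0]:
                self._store[key] = (event_time, value)
                return True
            return False

    def get(self, key):
        entry = self._store.get(key)
        return entry[1] if entry is not None else None


# Triggers arriving out of order: the older pane cannot regress the cache.
cache = EventTimeGuardedCache()
cache.put_if_newer("total", 1000, 42)   # accepted
cache.put_if_newer("total", 900, 17)    # stale trigger, rejected
print(cache.get("total"))               # prints 42
```

Whether this is "best" depends on the store: if the cache supports atomic conditional writes, pushing the timestamp comparison server-side avoids a read-modify-write race between parallel ParDo instances.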
