Hi, First, a bit of clarification (or refinement): a windowing strategy is used in all subsequent GroupByKey operations until another windowing strategy is specified. That being said, from quickly glancing at the windowed write-code I have the suspicion that triggers are not used for windowed writing and that instead some other scheme is used for determining when to close a file.
I’m sure others with more knowledge of those parts will jump in later but I nevertheless wanted to give you this quick answer. Best, Aljoscha > On 7. May 2017, at 00:57, Ankur Chauhan <an...@malloc64.com> wrote: > > That I believe is expected behavior. The windowing strategy is applied at the > following group by key operation. To have the windows fire the way you want, > try putting a group by key immediately after the desired windowing function. > > The messages right now are being bundled aggressively for performance reasons > and doing a gbk would ensure desired bundle delineations. > > Ankur Chauhan > Sent from my iPhone > > On May 6, 2017, at 14:11, Josh <jof...@gmail.com <mailto:jof...@gmail.com>> > wrote: > >> Hi all, >> I am using a PubSubIO source, windowing every 10 seconds and then doing >> something with the windows, for example: >> pipeline >> .apply(PubsubIO.readPubsubMessagesWithAttributes().fromSubscription(..)) >> .apply(Window.into(FixedWindows.of(Duration.standardSeconds(10)))) >> .apply(MapElements >> .into(TypeDescriptors.strings()) >> .via((PubsubMessage msg) -> msg.getAttribute(..))) >> .apply(new WriteOneFilePerWindow(..)); >> My expectation was that if I publish a pubsub message, and then publish >> another 10+ seconds later, a single file should be written for the previous >> 10 second window. However I find that I need to publish a lot of messages >> for any files to be written at all (e.g. 30+ messages). >> Is this expected behaviour when using PubSubIO? Is there a way to tweak it >> to fire the windows more eagerly? >> Thanks, >> Josh