Hi,

+1 to what Kenn asked: your pipeline is in streaming mode and GIB preserves windowing, the elements are buffered until one of these conditions are true: batchsize reached or end of window. I your case I think it is the second one.

Best

Etienne

On 28/02/2020 19:15, Kenneth Knowles wrote:
What are the timestamps on the elements?

On Fri, Feb 28, 2020 at 8:36 AM Vasu Gupta <dev.vasugu...@gmail.com <mailto:dev.vasugu...@gmail.com>> wrote:

    Edit: Issue is on Direct Runner(Not Direction Runner - mistyped)
    Issue Details:
    Input data: 7 key-value Packets like: a-1, a-4, b-3, c-5, d-1,
    e-4, e-5
    Batch Size: 5
    Expected output: a-1,4, b-3, c-5, d-1, e-4,5
    Getting Packets with irregular size like a-1, b-5, e-4,5 OR a-1,4,
    c-5 etc
    But i always got correct number of packets with BATCH_SIZE = 1

    On 2020/02/27 20:40:16, Kenneth Knowles <k...@apache.org
    <mailto:k...@apache.org>> wrote:
    > Can you share some more details? What is the expected output and
    what
    > output are you seeing?
    >
    > On Thu, Feb 27, 2020 at 9:39 AM Vasu Gupta
    <dev.vasugu...@gmail.com <mailto:dev.vasugu...@gmail.com>> wrote:
    >
    > > Hey folks, I am using Apache beam Framework in Java with
    Direction Runner
    > > for local testing purposes. When using GroupIntoBatches with
    batch size 1
    > > it works perfectly fine i.e. the output of the transform is
    consistent and
    > > as expected. But when using with batch size > 1 the output
    Pcollection has
    > > less data than it should be.
    > >
    > > Pipeline flow:
    > > 1. A Transform for reading from pubsub
    > > 2. Transform for making a KV out of the data
    > > 3. A Fixed Window transform of 1 second
    > > 4. Applying GroupIntoBatches transform
    > > 5. And last, Logging the resulting Iterables.
    > >
    > > Weird thing is that it batch_size > 1 works great when running on
    > > DataflowRunner but not with DirectRunner. I think the issue
    might be with
    > > Timer Expiry since GroupIntoBatches uses BagState internally.
    > >
    > > Any help will be much appreciated.
    > >
    >

Reply via email to