Hi
I have been trying out the Apache Beam Go SDK for running aggregations on
logs from Kafka. I wrote a Kafka reader using ParDoFns and aggregate the
result using WindowInto and GroupByKey. However, the `GroupByKey` only emits
its output once my Kafka consumption DoFn finishes emitting messages. I am
using the direct runner for testing this out. It looks like a bug in the
runner/GroupByKey implementation, as it seems to assume that the input will
be exhausted even in the case of windowed input.

Pipeline representation:
Impulse -> Kafka consumption ParDo (consumes Kafka messages in a loop and
emits metrics with beam.EventTime) -> add timestamp ParDo ->
WindowInto (fixed windows) -> add window max timestamp as key ParDo ->
*GroupByKey*
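
Roughly, the code looks like the sketch below (simplified: segmentio/kafka-go
is just a stand-in for my Kafka client, the broker/topic names are
placeholders, and the timestamp assignment is folded into the consumption
DoFn here):

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/apache/beam/sdks/v2/go/pkg/beam"
	"github.com/apache/beam/sdks/v2/go/pkg/beam/core/graph/mtime"
	"github.com/apache/beam/sdks/v2/go/pkg/beam/core/graph/window"
	"github.com/apache/beam/sdks/v2/go/pkg/beam/x/beamx"
	kafka "github.com/segmentio/kafka-go"
)

func init() {
	// Register DoFns so they can be serialized by portable runners.
	beam.RegisterFunction(consumeKafka)
	beam.RegisterFunction(keyByWindowMax)
	beam.RegisterFunction(countPerWindow)
}

// consumeKafka loops over Kafka messages indefinitely and emits each value
// with an explicit event time via the beam.EventTime emitter parameter.
func consumeKafka(ctx context.Context, _ []byte, emit func(beam.EventTime, string)) error {
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"}, // placeholder broker
		Topic:   "logs",                     // placeholder topic
		GroupID: "beam-agg",
	})
	defer r.Close()
	for {
		m, err := r.ReadMessage(ctx)
		if err != nil {
			return err
		}
		emit(mtime.FromTime(m.Time), string(m.Value))
	}
}

// keyByWindowMax keys each element by its window's max timestamp so that
// GroupByKey gathers everything that landed in the same fixed window.
func keyByWindowMax(w beam.Window, v string) (int64, string) {
	return int64(w.MaxTimestamp()), v
}

// countPerWindow counts the grouped values and prints one line per window.
func countPerWindow(key int64, values func(*string) bool) {
	var v string
	n := 0
	for values(&v) {
		n++
	}
	fmt.Printf("window ending %v: %d messages\n", mtime.Time(key), n)
}

func main() {
	beam.Init()
	p, s := beam.NewPipelineWithRoot()

	imp := beam.Impulse(s)
	msgs := beam.ParDo(s, consumeKafka, imp)
	windowed := beam.WindowInto(s, window.NewFixedWindows(time.Minute), msgs)
	keyed := beam.ParDo(s, keyByWindowMax, windowed)
	grouped := beam.GroupByKey(s, keyed)
	beam.ParDo0(s, countPerWindow, grouped)

	if err := beamx.Run(context.Background(), p); err != nil {
		panic(err)
	}
}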

The *GroupByKey* produces nothing until the Kafka consumption ParDo exits,
which doesn't seem like the desired behaviour. I haven't been able to
pinpoint the exact line of code where this gets stuck. Can someone help me
out with this?

Thanks for your help.

Best
Anand Singh Kunwar
