Ning Kang created BEAM-11906: -------------------------------- Summary: No trigger early repeatedly for session windows Key: BEAM-11906 URL: https://issues.apache.org/jira/browse/BEAM-11906 Project: Beam Issue Type: Improvement Components: runner-dataflow Affects Versions: 2.28.0, 2.23.0 Reporter: Ning Kang
Originated from: https://stackoverflow.com/questions/66381608/apache-beam-does-not-trigger-early-repeatedly-for-session-windows-on-google-data The following pipeline fires early after each element when running locally using DirectRunner, but there are no early triggers when running on google cloud dataflow. On dataflow it triggers only after the session window has closed. {code:python} ( p | 'read' >> beam.io.ReadFromPubSub(subscription = 'projects/xxx/subscriptions/xxx-sub') | 'json' >> beam.Map(lambda x: json.loads(x.decode('utf-8'))) | 'kv' >> beam.Map(lambda x: (x['id'], x['amount'])) | 'window' >> beam.WindowInto(window.Sessions(15*60), trigger=trigger.Repeatedly(trigger.AfterCount(1)), accumulation_mode=AccumulationMode.ACCUMULATING) | 'group' >> beam.GroupByKey() | 'log' >> beam.Map(lambda x: logging.info(x)) ) {code} Apache Beam versions tried: 2.23 and 2.28. -- This message was sent by Atlassian Jira (v8.3.4#803005)