Hi Beam Community,
I am working on a Beam pipeline that reads data from both an unbounded and a bounded source, and I want to leverage Beam state management in the pipeline. To put data into Beam state, I have to convert it into key-value pairs (e.g. KV<String, Object>). Because I am reading from an unbounded source, I am forced to apply windowing + triggering before grouping the data by key. I have chosen to use GlobalWindows() (a simplified sketch of the setup is at the end of this message).

I am able to kick off the Dataflow job that runs my pipeline, but I have noticed that Dataflow uses only 1 worker node to perform the work and never scales the job out to more workers, so I am not getting the benefit of distributed processing.

I have posted the question on Stack Overflow: https://stackoverflow.com/questions/55242684/join-bounded-and-non-bounded-source-data-flow-job-not-scaling but I am also reaching out on the mailing list to get some help, or to learn what I am missing. Any help would be appreciated.

Thanks,
- Maulik
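
P.S. For reference, here is a rough sketch of the windowing + triggering I am applying before the GroupByKey. The key/value types and the 30-second trigger delay are simplified stand-ins, not my exact pipeline code:

    import org.apache.beam.sdk.transforms.GroupByKey;
    import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
    import org.apache.beam.sdk.transforms.windowing.GlobalWindows;
    import org.apache.beam.sdk.transforms.windowing.Repeatedly;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.joda.time.Duration;

    // Window the keyed elements into the global window with a repeating
    // processing-time trigger, so GroupByKey can emit panes on the
    // unbounded input instead of waiting for the (never-arriving) end
    // of the global window.
    static PCollection<KV<String, Iterable<String>>> windowAndGroup(
        PCollection<KV<String, String>> keyed) {
      return keyed
          .apply("GlobalWindow",
              Window.<KV<String, String>>into(new GlobalWindows())
                  .triggering(Repeatedly.forever(
                      AfterProcessingTime.pastFirstElementInPane()
                          .plusDelayOf(Duration.standardSeconds(30))))
                  .discardingFiredPanes())
          .apply("GroupByKey", GroupByKey.create());
    }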