Re: Samza runner bundle implementation

2019-01-18 Thread Xinyu Liu
Your understanding is correct: both GroupIntoBatches and the Stateful processing of ParDo are on a per-key basis. The DoFn in this example uses bufferState to accumulate the batch and uses countState to keep track of the size of the batch. I believe GroupIntoBatches implements the same logic, basic

Re: Samza runner bundle implementation

2019-01-18 Thread Daniel Chen
On 1/18/19, 1:33 PM, "Deshpande, Omkar" wrote: Hey Xinyu, I am trying to implement "Batched RPC" as described in - https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbeam.apache.org%2Fblog%2F2017%2F08%2F28%2Ftimely-processing.html&data=02%7C01%7Cdchen1%40linkedin.c

Re: Samza runner bundle implementation

2019-01-18 Thread Deshpande, Omkar
Hey Xinyu, I am trying to implement "Batched RPC" as described in - https://beam.apache.org/blog/2017/08/28/timely-processing.html The documentation for GroupIntoBatches says "Batches will contain only elements of a single key". And my understanding is for "Batched RPC", I need a batch of keys.

Re: Samza runner bundle implementation

2019-01-18 Thread Xinyu Liu
sorry, the correct link to the first reference: https://github.com/apache/beam/blob/master/runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnOp.java . Thanks, Xinyu On Fri, Jan 18, 2019 at 10:35 AM Xinyu Liu wrote: > Hi, Omkar, > > Your observation is correct. Currently bund

Re: Samza runner bundle implementation

2019-01-18 Thread Xinyu Liu
Hi, Omkar, Your observation is correct. Currently bundle is implemented in a per-event basis (Code is DoFnOp.processElement, https://docs.google.com/spreadsheets/d/1pIUQ8J658B7GPNDt5dwiJBRQyWep1n2rXlkONyA6ZMM/edit#gid=1709587251). We are working on supporting bundles in Samza right now so in Beam

Re: Samza runner bundle implementation

2019-01-18 Thread Daniel Chen
+ Xinyu On 1/18/19, 10:05 AM, "Deshpande, Omkar" wrote: Hello, I am using Samza runner with Apache Beam. Is there any documentation available on how bundles are implemented in the Samza runner? I have observed every Kafka record getting processed in its own bundle. How can I

Samza runner bundle implementation

2019-01-18 Thread Deshpande, Omkar
Hello, I am using Samza runner with Apache Beam. Is there any documentation available on how bundles are implemented in the Samza runner? I have observed every Kafka record getting processed in its own bundle. How can I get larger bundles? 2.9.0 0.14.1 Thanks, Omkar