Your understanding is correct: both GroupIntoBatches and the Stateful
processing of ParDo are on a per-key basis. The DoFn in this example
uses bufferState
to accumulate the batch and uses countState to keep track of the size of
the batch. I believe GroupIntoBatches implements the same logic, basic
On 1/18/19, 1:33 PM, "Deshpande, Omkar" wrote:
Hey Xinyu,
I am trying to implement "Batched RPC" as described in -
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbeam.apache.org%2Fblog%2F2017%2F08%2F28%2Ftimely-processing.html&data=02%7C01%7Cdchen1%40linkedin.c
Hey Xinyu,
I am trying to implement "Batched RPC" as described in -
https://beam.apache.org/blog/2017/08/28/timely-processing.html
The documentation for GroupIntoBatches says "Batches will contain only elements
of a single key".
And my understanding is for "Batched RPC", I need a batch of keys.
sorry, the correct link to the first reference:
https://github.com/apache/beam/blob/master/runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnOp.java
.
Thanks,
Xinyu
On Fri, Jan 18, 2019 at 10:35 AM Xinyu Liu wrote:
> Hi, Omkar,
>
> Your observation is correct. Currently bund
Hi, Omkar,
Your observation is correct. Currently bundle is implemented in a per-event
basis (Code is DoFnOp.processElement,
https://docs.google.com/spreadsheets/d/1pIUQ8J658B7GPNDt5dwiJBRQyWep1n2rXlkONyA6ZMM/edit#gid=1709587251).
We are working on supporting bundles in Samza right now so in Beam
+ Xinyu
On 1/18/19, 10:05 AM, "Deshpande, Omkar" wrote:
Hello,
I am using Samza runner with Apache Beam. Is there any documentation
available on how bundles are implemented in the Samza runner?
I have observed every Kafka record getting processed in its own bundle. How
can I
Hello,
I am using Samza runner with Apache Beam. Is there any documentation available
on how bundles are implemented in the Samza runner?
I have observed every Kafka record getting processed in its own bundle. How can
I get larger bundles?
2.9.0 0.14.1
Thanks,
Omkar