On 1/18/19, 1:33 PM, "Deshpande, Omkar" <omkar_deshpa...@intuit.com> wrote:

    Hey Xinyu,
    
    I am trying to implement "Batched RPC" as described in - 
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbeam.apache.org%2Fblog%2F2017%2F08%2F28%2Ftimely-processing.html&amp;data=02%7C01%7Cdchen1%40linkedin.com%7C77e8e21b921042e5a8f908d67d8c8da8%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636834439965710970&amp;sdata=wWeVLlQVp3udG2Xg70SovqB66feNxMxYLxxCZRsJIc0%3D&amp;reserved=0
    The documentation for GroupIntoBatches says "Batches will contain only 
elements of a single key". 
    And my understanding is for "Batched RPC", I need a batch of keys. So, I am 
not sure if I can use GroupIntoBatches.
    
    On 1/18/19, 10:40 AM, "Xinyu Liu" <xinyuliu...@gmail.com> wrote:
    
        This email is from an external sender.
        
        
        sorry, the correct link to the first reference:
        
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fbeam%2Fblob%2Fmaster%2Frunners%2Fsamza%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fbeam%2Frunners%2Fsamza%2Fruntime%2FDoFnOp.java&amp;data=02%7C01%7Cdchen1%40linkedin.com%7C77e8e21b921042e5a8f908d67d8c8da8%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636834439965710970&amp;sdata=FQ9nJn1S7%2BRTiRwvwKrxAvFhTF076SNuY6RfUnsdkRI%3D&amp;reserved=0
        .
        
        Thanks,
        Xinyu
        
        On Fri, Jan 18, 2019 at 10:35 AM Xinyu Liu <xinyuliu...@gmail.com> 
wrote:
        
        > Hi, Omkar,
        >
        > Your observation is correct. Currently bundle is implemented in a
        > per-event basis (Code is DoFnOp.processElement,
        > 
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.google.com%2Fspreadsheets%2Fd%2F1pIUQ8J658B7GPNDt5dwiJBRQyWep1n2rXlkONyA6ZMM%2Fedit%23gid%3D1709587251&amp;data=02%7C01%7Cdchen1%40linkedin.com%7C77e8e21b921042e5a8f908d67d8c8da8%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636834439965710970&amp;sdata=Y48p6YZWoB5AaYgUeVHYrWD3JITAcjvVH5%2FgyZbIN44%3D&amp;reserved=0).
        > We are working on supporting bundles in Samza right now so in Beam we 
can
        > take advantage of it. Bundling is also critical to have better python
        > performance so we are trying to get it out very soon (Feb-March).
        >
        > On the other hand, in java if you want to process in a batch fashion, 
you
        > can use the Beam GroupIntoBatches api (
        > 
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbeam.apache.org%2Freleases%2Fjavadoc%2F2.0.0%2Forg%2Fapache%2Fbeam%2Fsdk%2Ftransforms%2FGroupIntoBatches.html&amp;data=02%7C01%7Cdchen1%40linkedin.com%7C77e8e21b921042e5a8f908d67d8c8da8%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636834439965710970&amp;sdata=ZgKm2XXoromrgxW%2BeEyALoZT%2BLR3fwh3eQkxM17rW5g%3D&amp;reserved=0).
        > This will group the elements into batches and then deliver to your 
ParDo
        > afterwards. Please let us know whether this works for you.
        >
        > Thanks,
        > Xinyu
        >
        >
        >
        > On Fri, Jan 18, 2019 at 10:27 AM Daniel Chen <dch...@linkedin.com> 
wrote:
        >
        >> + Xinyu
        >>
        >> On 1/18/19, 10:05 AM, "Deshpande, Omkar" <omkar_deshpa...@intuit.com>
        >> wrote:
        >>
        >>     Hello,
        >>
        >>     I am using Samza runner with Apache Beam. Is there any 
documentation
        >> available on how bundles are implemented in the Samza runner?
        >>     I have observed every Kafka record getting processed in its own
        >> bundle. How can I get larger bundles?
        >>
        >>      <beam.version>2.9.0 <samza.version>0.14.1
        >>
        >>     Thanks,
        >>     Omkar
        >>
        >>
        >>
        
    
    

Reply via email to