Threading/parallelism is up to the runner and does not map 1-1 to the java
memory model since most runners will execute in a distributed manner. In
general, runners will attempt to break up the work as evenly as possible
and schedule work across multiple machines / cpu cores at all times to
maximize the throughput / minimize time for execution of the job This is
abstracted away much by getting users to write DoFns that apply with ParDo.
Please take a look at this explanation about ParDo (
https://cloud.google.com/dataflow/model/par-do) to get a better
understanding of its usage and as a place to look at some examples.

On Mon, Jun 20, 2016 at 12:44 PM, amir bahmanyari <[email protected]>
wrote:

> Thanks JB.
> I am executing FlinkPipelineRunner...& later will experirnt the same with
> SparkRunner....any examples pls?
> Cheers
>
> ------------------------------
> *From:* Jean-Baptiste Onofré <[email protected]>
> *To:* [email protected]
> *Sent:* Monday, June 20, 2016 12:35 PM
> *Subject:* Re: Multi-threading implementation equivalence in Beam
>
> Hi Amir,
>
> the DirectPipelineRunner uses multi-thread to achieve ParDo execution
> for instance.
>
> You mean example of Beam pipeline code ? Or example of runner ?
>
> Regards
> JB
>
> On 06/20/2016 09:25 PM, amir bahmanyari wrote:
> > Hi Colleagues,
> > Hope you all had a great weekend. Another novice question :-)
> > Is there a pipeline parallelism/threading model provide by Beam that
> > equates the multi-threading model in Java for instance?
> > Any examples if so?
> > Thanks again,
> > Amir
>
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
>
>

Reply via email to