Ah, thank you!
- If you create a data set from a Java/Scala collection, this data source
has the parallelism one.
- The map function is chained to that source, so it runs with parallelism
one as well.
- To run it with a higher parallelism, use "setParallelism(...)" on the
mapFunction, or call "
Yeah, sorry. I would like to do something simple like this, but using
Java Threads.
DataSet> input = env.fromCollection(in);
DataSet output = input.map(new HighWorkLoad());
ArrayList result = output.consume(); // ? like collect but in
parallel, some operation that consumes the pipeline.
return r
It is not quite easy to understand what you are trying to do.
Can you post your program here? Then we can take a look and give you a good
answer...
On Mon, Jun 29, 2015 at 3:47 PM, Juan Fumero <
juan.jose.fumero.alfo...@oracle.com> wrote:
> Is there any other way to apply the function in paralle
Is there any other way to apply the function in parallel and return the
result to the client in parallel?
Thanks
Juan
On Mon, 2015-06-29 at 15:01 +0200, Stephan Ewen wrote:
> In general, avoid collect if you can. Collect brings data top the
> client, where the computation is not parallel any mor
In general, avoid collect if you can. Collect brings data top the client,
where the computation is not parallel any more.
Try to do as much on the DataSet as possible.
On Mon, Jun 29, 2015 at 2:58 PM, Juan Fumero <
juan.jose.fumero.alfo...@oracle.com> wrote:
> Hi Stephan,
> so should I use ano
Hi Stephan,
so should I use another method instead of collect? It seems
multithread is not working with this.
Juan
On Mon, 2015-06-29 at 14:51 +0200, Stephan Ewen wrote:
> Hi Juan!
>
>
> This is an artifact of a workaround right now. The actual collect()
> logic happens in the flatMap() a
Hi Juan!
This is an artifact of a workaround right now. The actual collect() logic
happens in the flatMap() and the sink is a dummy that executes nothing. The
flatMap writes the data to be collected to the "accumulator" that delivers
it back.
Greetings,
Stephan
On Mon, Jun 29, 2015 at 2:30 PM,
Hi,
I am starting with Flink. I have tried to look for the documentation
but I havent found it clear.
I wonder the difference between these two states:
FlatMap RUNNING vs DataSink RUNNIG.
FlatMap is doing data any data transformation? Compilation? In which
point is actually executing the f