But I guess that this gives only one task across all the cluster's nodes. I would like to run several tasks, but I would like Spark not to run more than one map on each of my nodes at a time. Say I have 4 different tasks and 2 nodes, where each node has 2 cores. Currently Hadoop runs 2 maps in parallel on each node (all 4 tasks in parallel), but I would like to somehow force it to run only 1 task per node and hand that node another task only after the first one finishes.
______________________________________________________________
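One possible way to get that behaviour, sketched below rather than taken from this thread, is to tell Spark that each task needs as many CPUs as an executor has, so an executor only ever runs one task at a time. The app name and core counts here are assumptions matching the 2-cores-per-node example above:

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: with 2 cores per executor, reserving 2 CPUs per task
// means each node runs a single task at a time; 4 tasks on 2 nodes
// would then execute as two waves of one task per node.
val conf = new SparkConf()
  .setAppName("one-task-per-node-sketch")   // app name is made up
  .set("spark.executor.cores", "2")         // cores available to each executor
  .set("spark.task.cpus", "2")              // CPUs each task reserves
val sc = new SparkContext(conf)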
The number of tasks is decided by the number of input partitions. If you want only one map or flatMap running at once, just call coalesce() or repartition() to gather the data into one partition. However, this is not recommended, because it will not be executed in parallel efficiently.

2014-10-28 17:27 GMT+08:00 <jan.zi...@centrum.cz>:

Hi,

I am currently struggling with how to properly set up Spark to perform only one map, flatMap, etc. at once. In other words, my map uses a multi-core algorithm, so I would like to have only one map running at a time so that it can use all of the machine's cores.

Thank you in advance for advice and replies.

Jan
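For illustration, here is a minimal sketch of the coalesce() approach described above; the input and output paths and the multi-core processing function are hypothetical placeholders, not something from the original question:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("single-partition-sketch"))

// Placeholder input; any RDD works the same way.
val data = sc.textFile("hdfs:///path/to/input")

// coalesce(1) collapses the data into a single partition, so the job
// has exactly one map task -- which is why the reply warns that this
// loses parallelism across the cluster.
val result = data
  .coalesce(1)
  .map(record => heavyMultiCoreProcessing(record))   // hypothetical multi-core function

result.saveAsTextFile("hdfs:///path/to/output")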
--------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org