> On 30 Nov 2015, at 17:47, Kashmar, Ali <ali.kash...@emc.com> wrote:
> Do the parallel instances of each task get distributed across the cluster or 
> is it possible that they all run on the same node?

Yes, slots are requested from all nodes of the cluster. But keep in mind that 
multiple tasks (forming a local pipeline) can be scheduled to the same slot (1 
slot can hold many tasks).

Have you seen this? 
https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/job_scheduling.html

> If they can all run on the same node, what happens when that node crashes? 
> Does the job manager recreate them using the remaining open slots?

What happens: The job manager tries to restart the program with the same 
parallelism. Thus if you have enough free slots available in your cluster, this 
works smoothly (so yes, the remaining/available slots are used)

With a YARN cluster the task manager containers are restarted automatically. In 
standalone mode, you have to take care of this yourself.


Does this help?

– Ufuk

Reply via email to