Given a set of transformations, does Spark create multiple DAGs and pick one
by some metric, such as a higher degree of concurrency, the way the typical
task-graph model in parallel computing suggests? Or does it simply build one
DAG by going through the transformations/tasks in order?

I am thinking that during the shuffle phase there can be many ways to
shuffle, which could result in multiple candidate DAGs, with one ultimately
chosen by some metric. However, this is a complete guess; I am not sure what
exactly happens underneath Spark.
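
For concreteness, here is the kind of pipeline I have in mind (a minimal
Scala sketch; the input path is made up). Printing rdd.toDebugString shows
the lineage graph Spark builds for it, with indentation marking the shuffle
boundary introduced by reduceByKey:

import org.apache.spark.sql.SparkSession

object LineageDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LineageDemo")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // A chain of transformations with one shuffle (reduceByKey).
    // "input.txt" is just a placeholder path.
    val counts = sc.textFile("input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _) // shuffle boundary

    // Prints the single lineage that Spark recorded for this chain;
    // indentation in the output marks stage (shuffle) boundaries.
    println(counts.toDebugString)

    spark.stop()
  }
}

Is there ever more than one such graph under consideration here, or just
this one?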

thanks!
