1. Yes, if two jobs depend on each other, they can't be parallelized.
2. Imagine something like a web application driver. You only get one SparkContext, but you want to run many concurrent jobs. They have nothing to do with each other, so there is no reason to keep them sequential.
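To make point 2 concrete, here is a toy sketch in plain Python. It is not Spark code and not how Spark's scheduler is implemented. It only models the two policies the documentation describes: FIFO, where the first job gets priority on all free slots, and fair sharing, where task slots are handed out round robin. The function name `simulate`, the job names, and the slot and task counts are all made up for illustration, and every task is assumed to take exactly one time step.

```python
def simulate(jobs, slots, mode):
    """Toy model of scheduling jobs inside one application.

    jobs  -- list of (name, task_count) in submission order (hypothetical jobs)
    slots -- number of tasks the cluster can run per time step (assumption)
    mode  -- "FIFO" or "FAIR"
    Returns {job name: time step at which its last task ran}.
    """
    remaining = {name: count for name, count in jobs}
    order = [name for name, _ in jobs]
    finished = {}
    step = 0
    while len(finished) < len(order):
        step += 1
        free = slots
        active = [name for name in order if remaining[name] > 0]
        if mode == "FIFO":
            # Earlier jobs take every slot they can use; later jobs get leftovers.
            for name in active:
                take = min(free, remaining[name])
                remaining[name] -= take
                free -= take
                if free == 0:
                    break
        else:
            # FAIR: hand out one slot at a time, cycling over the active jobs.
            while free > 0 and any(remaining[name] > 0 for name in active):
                for name in active:
                    if free == 0:
                        break
                    if remaining[name] > 0:
                        remaining[name] -= 1
                        free -= 1
        for name in order:
            if remaining[name] == 0 and name not in finished:
                finished[name] = step
    return finished


# Two unrelated requests hitting the same driver: a long report, then a short query.
jobs = [("long_report", 12), ("short_query", 2)]
print(simulate(jobs, slots=4, mode="FIFO"))  # {'long_report': 3, 'short_query': 4}
print(simulate(jobs, slots=4, mode="FAIR"))  # {'short_query': 1, 'long_report': 4}
```

In this toy run the short query waits behind the long report under FIFO but finishes in the first step under fair sharing, which is why scheduling inside a single application still matters when that application serves several users. In real Spark the policy is selected with the `spark.scheduler.mode` property (set it to `FAIR`), and jobs only run concurrently if they are submitted from separate threads of the driver.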
Hope this helps.

-------- Original message --------
From: bit1...@163.com
Date: 06/01/2015 4:14 AM (GMT-05:00)
To: user <user@spark.apache.org>
Subject: Don't understand "schedule jobs within an Application"

Hi, sparks,

The following is copied from the Spark online documentation, http://spark.apache.org/docs/latest/job-scheduling.html. Basically, I have two questions about it:

1. If two jobs in an application have a dependency, that is, one job depends on the result of the other, then I think they will have to run sequentially.
2. Since job scheduling happens within one application, I don't think it will benefit multiple users, as the last sentence says. In my opinion, multiple users can benefit only from cross-application scheduling.

Maybe I haven't understood job scheduling well. Could someone elaborate on this? Thanks very much.

By default, Spark’s scheduler runs jobs in FIFO fashion. Each job is divided into “stages” (e.g. map and reduce phases), and the first job gets priority on all available resources while its stages have tasks to launch, then the second job gets priority, etc. If the jobs at the head of the queue don’t need to use the whole cluster, later jobs can start to run right away, but if the jobs at the head of the queue are large, then later jobs may be delayed significantly. Starting in Spark 0.8, it is also possible to configure fair sharing between jobs. Under fair sharing, Spark assigns tasks between jobs in a “round robin” fashion, so that all jobs get a roughly equal share of cluster resources. This means that short jobs submitted while a long job is running can start receiving resources right away and still get good response times, without waiting for the long job to finish. This mode is best for multi-user settings.

bit1...@163.com