Spark schedules all the jobs you submit into a common task queue. The difference between FIFO and FAIR is how that queue is handled: FIFO runs jobs in submission order, while FAIR tries to divide cluster resources equally among all running jobs.
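For reference, switching the scheduler is a single config setting, set when you build the session. A minimal sketch (the app name is just a placeholder):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("fair-scheduling-example") // placeholder name
    .config("spark.scheduler.mode", "FAIR") // default is FIFO
    .getOrCreate()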
The problem you have is different. The driver (actually the Spark API) blocks on certain actions, like writing. So what actually happens is that the driver does not process the rest of the flow until that write call finishes, and therefore never gets the opportunity to schedule additional jobs. To run multiple writes concurrently in Spark you need to call the Spark API from multiple threads (using a thread pool or futures). That way the driver can run multiple writes concurrently, adding all the related jobs/tasks to the common queue. At that point you can decide whether you want FIFO or FAIR; in some cases (because of data locality) the FAIR scheduler can produce quicker overall processing.
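A minimal sketch of the futures approach, assuming two DataFrames written to placeholder paths (the pool size, paths, and names are just examples):

  import java.util.concurrent.Executors
  import scala.concurrent.{Await, ExecutionContext, Future}
  import scala.concurrent.duration.Duration
  import org.apache.spark.sql.{DataFrame, SparkSession}

  val spark = SparkSession.builder()
    .appName("concurrent-writes-example") // placeholder name
    .getOrCreate()

  // Dedicated pool so the blocking write calls don't starve the global context
  implicit val ec: ExecutionContext =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

  // Each write runs in its own thread, so jobs land in the queue concurrently
  def writeAsync(df: DataFrame, path: String): Future[Unit] =
    Future { df.write.mode("overwrite").parquet(path) }

  // Example data; in practice these would be your real DataFrames
  val df1 = spark.range(100).toDF("id")
  val df2 = spark.range(100).toDF("id")

  val writes = Seq(
    writeAsync(df1, "/tmp/out1"), // placeholder path
    writeAsync(df2, "/tmp/out2")  // placeholder path
  )

  // Block the main thread only once, until every write has finished
  Await.result(Future.sequence(writes), Duration.Inf)

With this in place, both writes submit their jobs to the scheduler at the same time, and the FIFO/FAIR choice actually matters.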
