Shivaram: Yes, we can call it "gang scheduling" or "barrier
synchronization". Spark doesn't support it now. The proposal is to have a
proper support in Spark's job scheduler, so we can integrate well with
MPI-like frameworks.

On Tue, May 8, 2018 at 11:17 AM Nan Zhu <zhunanmcg...@gmail.com> wrote:

> .....how I skipped the last part........
>
> On Tue, May 8, 2018 at 11:16 AM, Reynold Xin <r...@databricks.com> wrote:
>
>> Yes, Nan, totally agree. To be on the same page, that's exactly what I
>> wrote wasn't it?
>>
>> On Tue, May 8, 2018 at 11:14 AM Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>
>>> besides that, one of the things which is needed by multiple frameworks
>>> is to schedule tasks in a single wave
>>>
>>> i.e.
>>>
>>> if some frameworks like xgboost/mxnet requires 50 parallel workers,
>>> Spark is desired to provide a capability to ensure that either we run 50
>>> tasks at once, or we should quit the complete application/job after some
>>> timeout period
>>>
>>> Best,
>>>
>>> Nan
>>>
>>> On Tue, May 8, 2018 at 11:10 AM, Reynold Xin <r...@databricks.com>
>>> wrote:
>>>
>>>> I think that's what Xiangrui was referring to. Instead of retrying a
>>>> single task, retry the entire stage, and the entire stage of tasks need to
>>>> be scheduled all at once.
>>>>
>>>>
>>>> On Tue, May 8, 2018 at 8:53 AM Shivaram Venkataraman <
>>>> shiva...@eecs.berkeley.edu> wrote:
>>>>
>>>>>
>>>>>>
>>>>>>>    - Fault tolerance and execution model: Spark assumes
>>>>>>>    fine-grained task recovery, i.e. if something fails, only that task 
>>>>>>> is
>>>>>>>    rerun. This doesn’t match the execution model of distributed ML/DL
>>>>>>>    frameworks that are typically MPI-based, and rerunning a single task 
>>>>>>> would
>>>>>>>    lead to the entire system hanging. A whole stage needs to be re-run.
>>>>>>>
>>>>>>> This is not only useful for integrating with 3rd-party frameworks,
>>>>>> but also useful for scaling MLlib algorithms. One of my earliest attempts
>>>>>> in Spark MLlib was to implement All-Reduce primitive (SPARK-1485
>>>>>> <https://issues.apache.org/jira/browse/SPARK-1485>). But we ended up
>>>>>> with some compromised solutions. With the new execution model, we can set
>>>>>> up a hybrid cluster and do all-reduce properly.
>>>>>>
>>>>>>
>>>>> Is there a particular new execution model you are referring to or do
>>>>> we plan to investigate a new execution model ?  For the MPI-like model, we
>>>>> also need gang scheduling (i.e. schedule all tasks at once or none of 
>>>>> them)
>>>>> and I dont think we have support for that in the scheduler right now.
>>>>>
>>>>>>
>>>>>>> --
>>>>>>
>>>>>> Xiangrui Meng
>>>>>>
>>>>>> Software Engineer
>>>>>>
>>>>>> Databricks Inc. [image: http://databricks.com]
>>>>>> <http://databricks.com/>
>>>>>>
>>>>>
>>>>>
>>>
> --

Xiangrui Meng

Software Engineer

Databricks Inc. [image: http://databricks.com] <http://databricks.com/>

Reply via email to