Thanks for starting this discussion. I'd also like to see some improvements
in this area, and I'm glad to hear that the Pandas UDFs / Arrow functionality
might be useful.  I'm wondering if, from your initial investigations, you
found anything lacking in the Arrow format or possible improvements that
would simplify the data representation?  Also, while data could be handed
off in a UDF, would it make sense to also discuss a more formal way to
externalize the data that would also work for the Scala API?
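As a rough illustration of the kind of hand-off in question, here is a minimal
sketch using the grouped-map Pandas UDF API from Spark 2.3; the per-group
training step is a hypothetical placeholder, not a real library call:

    from pyspark.sql.functions import pandas_udf, PandasUDFType
    from pyspark.sql.types import StructType, StructField, DoubleType
    import pandas as pd

    result_schema = StructType([StructField("partition_metric", DoubleType())])

    @pandas_udf(result_schema, PandasUDFType.GROUPED_MAP)
    def train_partition(pdf):
        # pdf arrives as a pandas.DataFrame for one group, serialized via Arrow.
        # A real integration would hand pdf off to an external framework here;
        # the mean() below is only a stand-in for that training call.
        metric = float(pdf["label"].mean())
        return pd.DataFrame({"partition_metric": [metric]})

    # Usage (given a DataFrame df with "worker_id" and "label" columns):
    # df.groupBy("worker_id").apply(train_partition).show()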

Thanks,
Bryan

On Wed, May 9, 2018 at 4:31 PM, Xiangrui Meng <m...@databricks.com> wrote:

> Shivaram: Yes, we can call it "gang scheduling" or "barrier
> synchronization". Spark doesn't support it now. The proposal is to add
> proper support in Spark's job scheduler, so we can integrate well with
> MPI-like frameworks.
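For reference, a sketch of what such a barrier-style stage could look like
from the user side (the barrier() / BarrierTaskContext names below are
illustrative of the proposal, not an API Spark exposes today):

    def train(iterator):
        # Hypothetical barrier semantics: every task in the stage is launched
        # together, and barrier() blocks until all peers have reached it.
        from pyspark import BarrierTaskContext  # illustrative name
        ctx = BarrierTaskContext.get()
        partition = list(iterator)
        ctx.barrier()          # wait for all workers before training starts
        # ... start the MPI-like worker / hand off to xgboost, mxnet, etc. ...
        yield len(partition)

    # Hypothetical usage: rdd.barrier().mapPartitions(train).collect()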
>
>
> On Tue, May 8, 2018 at 11:17 AM Nan Zhu <zhunanmcg...@gmail.com> wrote:
>
>> .....how I skipped the last part........
>>
>> On Tue, May 8, 2018 at 11:16 AM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> Yes, Nan, totally agree. To be on the same page, that's exactly what I
>>> wrote, wasn't it?
>>>
>>> On Tue, May 8, 2018 at 11:14 AM Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>>
>>>> Besides that, one of the things needed by multiple frameworks is to
>>>> schedule tasks in a single wave
>>>>
>>>> i.e.
>>>>
>>>> if a framework like xgboost/mxnet requires 50 parallel workers, Spark
>>>> should provide a capability to ensure that either all 50 tasks run at
>>>> once, or the complete application/job is aborted after some timeout
>>>> period
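A crude driver-side sketch of that all-or-nothing behavior (a placeholder
only; proper support would have to live in the scheduler itself, and
defaultParallelism is just a rough proxy for available task slots):

    import time

    def wait_for_slots(sc, required_tasks, timeout_s):
        # Poll until the cluster looks able to run `required_tasks` tasks
        # concurrently, or give up after `timeout_s` seconds.
        deadline = time.time() + timeout_s
        while time.time() < deadline:
            if sc.defaultParallelism >= required_tasks:
                return
            time.sleep(5)
        raise RuntimeError("could not schedule %d concurrent tasks within "
                           "%d seconds; aborting the job"
                           % (required_tasks, timeout_s))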
>>>>
>>>> Best,
>>>>
>>>> Nan
>>>>
>>>> On Tue, May 8, 2018 at 11:10 AM, Reynold Xin <r...@databricks.com>
>>>> wrote:
>>>>
>>>>> I think that's what Xiangrui was referring to. Instead of retrying a
>>>>> single task, retry the entire stage, and the entire stage of tasks needs
>>>>> to be scheduled all at once.
>>>>>
>>>>>
>>>>> On Tue, May 8, 2018 at 8:53 AM Shivaram Venkataraman <
>>>>> shiva...@eecs.berkeley.edu> wrote:
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>    - Fault tolerance and execution model: Spark assumes
>>>>>>>>    fine-grained task recovery, i.e. if something fails, only that
>>>>>>>>    task is rerun. This doesn’t match the execution model of
>>>>>>>>    distributed ML/DL frameworks that are typically MPI-based, and
>>>>>>>>    rerunning a single task would lead to the entire system hanging.
>>>>>>>>    A whole stage needs to be re-run.
>>>>>>>>
>>>>>>> This is not only useful for integrating with 3rd-party frameworks,
>>>>>>> but also useful for scaling MLlib algorithms. One of my earliest
>>>>>>> attempts in Spark MLlib was to implement an All-Reduce primitive
>>>>>>> (SPARK-1485 <https://issues.apache.org/jira/browse/SPARK-1485>), but
>>>>>>> we ended up with some compromise solutions. With the new execution
>>>>>>> model, we can set up a hybrid cluster and do all-reduce properly.
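For readers unfamiliar with the primitive, here is a single-process sketch of
all-reduce semantics (every worker ends up with the elementwise sum of all
workers' vectors); this is only a local illustration, not the distributed
implementation discussed in SPARK-1485:

    import numpy as np

    def all_reduce_sum(worker_vectors):
        # Reference semantics: each worker receives the sum of everyone's
        # contribution, e.g. summed gradients in distributed training.
        total = np.sum(worker_vectors, axis=0)
        return [total.copy() for _ in worker_vectors]

    # e.g. outs = all_reduce_sum([np.ones(4), 2 * np.ones(4)])
    # -> each element of outs is array([3., 3., 3., 3.])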
>>>>>>>
>>>>>>>
>>>>>> Is there a particular new execution model you are referring to, or do
>>>>>> we plan to investigate a new execution model?  For the MPI-like model,
>>>>>> we also need gang scheduling (i.e. schedule all tasks at once or none
>>>>>> of them), and I don't think we have support for that in the scheduler
>>>>>> right now.
>>>>>>
>>>>>>>
>>>>>>>> --
>>>>>>>
>>>>>>> Xiangrui Meng
>>>>>>>
>>>>>>> Software Engineer
>>>>>>>
>>>>>>> Databricks Inc. <http://databricks.com/>
>>>>>>>
>>>>>>
>>>>>>
>>>>
>> --
>
> Xiangrui Meng
>
> Software Engineer
>
> Databricks Inc. <http://databricks.com/>
>
