Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

Xiangrui Meng Mon, 04 Mar 2019 15:18:13 -0800

On Mon, Mar 4, 2019 at 3:10 PM Mark Hamstra <m...@clearstorydata.com> wrote:


> :) Sorry, that was ambiguous. I was seconding Imran's comment.
>

Could you also help review Xingbo's design sketch and help evaluate the
cost?


>
> On Mon, Mar 4, 2019 at 3:09 PM Xiangrui Meng <men...@gmail.com> wrote:
>
>>
>>
>> On Mon, Mar 4, 2019 at 1:56 PM Mark Hamstra <m...@clearstorydata.com>
>> wrote:
>>
>>> +1
>>>
>>
>> Mark, just to be clear, are you +1 on the SPIP or Imran's point?
>>
>>
>>>
>>> On Mon, Mar 4, 2019 at 12:52 PM Imran Rashid <im...@therashids.com>
>>> wrote:
>>>
>>>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng <men...@gmail.com> wrote:
>>>>
>>>>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <
>>>>> felixcheun...@hotmail.com> wrote:
>>>>>
>>>>>> IMO upfront allocation is less useful. Specifically too expensive for
>>>>>> large jobs.
>>>>>>
>>>>>
>>>>> This is also an API/design discussion.
>>>>>
>>>>
>>>> I agree with Felix -- this is more than just an API question.  It has a
>>>> huge impact on the complexity of what you're proposing.  You might be
>>>> proposing big changes to a core and brittle part of Spark, which is already
>>>> short of experts.
>>>>
>>>
>> To my understanding, Felix's comment is mostly about the user interface,
>> stating that upfront allocation is less useful, especially for large jobs. I
>> agree that for large jobs we had better have dynamic allocation, which was
>> mentioned in the YARN support section of the companion scoping doc. We
>> restrict the new container type to upfront (initial) requests to keep
>> things simple. However, upfront allocation already meets the requirements of
>> basic workflows like data + DL training/inference + data. Saying "it is less
>> useful specifically for large jobs" kinda missed the fact that "it is super
>> useful for basic use cases".
>>
>> Your comment is mostly on the implementation side, which IMHO is the
>> KEY question to conclude this vote: does the design sketch sufficiently
>> demonstrate that the internal changes to the Spark scheduler are manageable?
>> I read Xingbo's design sketch and I think it is doable, which led to my +1.
>> But I'm not an expert on the scheduler, so I would feel more confident if
>> the design were reviewed by some scheduler experts. I also read the design
>> sketch for supporting different cluster managers, which I think is less
>> critical than the internal scheduler changes.
>>
>>
>>>
>>>> I don't see any value in having a vote on "does feature X sound cool?"
>>>>
>>>
>> I believe no one would disagree. To prepare the companion doc, we went
>> through several rounds of discussions to provide concrete stories such that
>> the proposal is not just "cool".
>>
>>
>>>
>>>>
>>>> We have to evaluate the potential benefit against the risks the feature
>>>> brings and the continued maintenance cost.  We don't need super low-level
>>>> details, but we have to have a sketch of the design to be able to make that
>>>> tradeoff.
>>>>
>>>
>> Could you review the design sketch from Xingbo, help evaluate the cost,
>> and provide feedback?
>>
>>
>
