Sean, thanks for your input and for making a pass on the updated SPIP!

As the next step, how about having a remote meeting to discuss the
remaining topics? I started a doodle poll here
<https://doodle.com/poll/33cthyc6f8i8naya>. Due to time constraints, I
suggest limiting the attendees to committers and posting the meeting
summary to JIRA afterward.

On Tue, Mar 19, 2019 at 10:16 AM Sean Owen <sro...@gmail.com> wrote:

> This looks like a great level of detail. The broad strokes look good to me.
>
> I'm happy with just about any story around what to do with Mesos GPU
> support now, but it might at least deserve a mention: does the existing
> Mesos config simply become a deprecated alias for
> spark.executor.accelerator.gpu.count, with no further support added to
> Mesos? That seems entirely coherent, and if that's agreeable, it could
> be worth a line here.
>

I would go with the deprecated alias option, but I would defer the decision
to a committer who is willing to shepherd the Mesos sub-project.
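
To make the alias idea concrete, here is a minimal sketch (a rough
illustration, not a patch; it assumes the existing Mesos setting is
spark.mesos.gpus.max and reuses the spark.executor.accelerator.gpu.count
name from the doc, and the helper name is hypothetical) of how the old key
could be resolved into the new one:

  import org.apache.spark.SparkConf

  // Hypothetical helper: map the deprecated Mesos key onto the proposed
  // accelerator key, without overriding an explicitly set new value.
  def resolveGpuCount(conf: SparkConf): SparkConf = {
    val oldKey = "spark.mesos.gpus.max"                  // existing Mesos config
    val newKey = "spark.executor.accelerator.gpu.count"  // proposed config
    if (conf.contains(oldKey) && !conf.contains(newKey)) {
      println(s"Warning: $oldKey is deprecated, please use $newKey instead.")
      conf.set(newKey, conf.get(oldKey))
    }
    conf
  }

Whoever ends up shepherding the Mesos side can decide where (and whether)
such a mapping actually belongs.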


>
> I think it could go into Spark 3 but need not block it. This doesn't
> say it does, merely says it's desirable to have it ready for 3.0 if
> possible. That seems like a fine position.
>
> On Mon, Mar 18, 2019 at 1:56 PM Xingbo Jiang <jiangxb1...@gmail.com>
> wrote:
> >
> > Hi all,
> >
> > I updated the SPIP doc and stories; I hope they now contain a clear scope
> of the changes and enough details for the SPIP vote.
> > Please review the updated docs, thanks!
> >
> > On Wed, Mar 6, 2019 at 8:35 AM, Xiangrui Meng <men...@gmail.com> wrote:
> >>
> >> How about letting Xingbo make a major revision to the SPIP doc to make
> it clear what is being proposed? I like Felix's suggestion to switch to the
> new Heilmeier template, which helps clarify what is proposed and what is
> not. Then let's review the new SPIP and resume the vote.
> >>
> >> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid <im...@therashids.com>
> wrote:
> >>>
> >>> OK, I suppose we are getting bogged down in what a vote on an SPIP
> means anyway, which I guess we can set aside for now.  With the level of
> detail in this proposal, I feel like there is a reasonable chance I'd
> still -1 the design or implementation.
> >>>
> >>> And the other thing you're implicitly asking the community for is to
> prioritize this feature for continued review and maintenance.  There is
> already work to be done on things like making barrier mode support dynamic
> allocation (SPARK-24942), bugs in failure handling (e.g. SPARK-25250), and
> the general efficiency of failure handling (e.g. SPARK-25341, SPARK-20178).
> I'm very concerned about getting spread too thin.
> >>>
> >>>
> >>> But if this is really just a vote on (1) is better GPU support
> important for Spark, in some form, in some release? and (2) is it
> *possible* to do this in a safe way?  then I will vote +0.
> >>>
> >>> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves <tgraves...@yahoo.com>
> wrote:
> >>>>
> >>>> To me, most of the questions here are implementation/design
> questions. I've had this issue in the past with SPIPs, where I expected to
> see more high-level design details but was basically told that belongs in
> the design JIRA follow-on. This makes me think we need to revisit what an
> SPIP really needs to contain, which should be done in a separate thread.
> Note that personally I would be for having more high-level details in it.
> >>>> But the way I read our documentation on SPIPs right now, that detail
> is all optional. Maybe we could argue it's based on what reviewers request,
> but perhaps we should make the wording of that more clearly required.
> Thoughts?  We should probably keep that discussion separate if people want
> to talk about it.
> >>>>
> >>>> For this SPIP in particular, the reason I +1'd it is because it came
> down to 2 questions:
> >>>>
> >>>> 1) Do I think Spark should support this -> my answer is yes, I think
> this would improve Spark; users have been requesting both better GPU
> support and support for controlling container requests at a finer
> granularity for a while.  If Spark doesn't support this then users may go
> to something else, so I think we should support it.
> >>>>
> >>>> 2) Do I think it's possible to design and implement it without
> causing large instabilities?  My opinion here again is yes. I agree with
> Imran and others that the scheduler piece needs to be looked at very
> closely, as we have had a lot of issues there, and that is why I was asking
> for more details in the design JIRA:
> https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe it's
> possible to do.
> >>>>
> >>>> If others have reservations about similar questions, then I think we
> should resolve them here, or take the discussion of what an SPIP is to a
> different thread and then come back to this. Thoughts?
> >>>>
> >>>> Note there is already a high-level design for at least the core
> piece, which is what people seem most concerned with, so including it in
> the SPIP should be straightforward.
> >>>>
> >>>> Tom
> >>>>
> >>>> On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <
> im...@therashids.com> wrote:
> >>>>
> >>>>
> >>>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng <men...@gmail.com>
> wrote:
> >>>>
> >>>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <
> felixcheun...@hotmail.com> wrote:
> >>>>
> >>>> IMO upfront allocation is less useful; specifically, it is too
> expensive for large jobs.
> >>>>
> >>>>
> >>>> This is also an API/design discussion.
> >>>>
> >>>>
> >>>> I agree with Felix -- this is more than just an API question.  It has
> a huge impact on the complexity of what you're proposing.  You might be
> proposing big changes to a core and brittle part of Spark, which is already
> short of experts.
> >>>>
> >>>> I don't see any value in having a vote on "does feature X sound
> cool?"  We have to evaluate the potential benefit against the risks the
> feature brings and the continued maintenance cost.  We don't need super
> low-level details, but we have to have a sketch of the design to be able to
> make that tradeoff.
>
