Thanks for sending the updated docs. Can you please give everyone the ability to comment? I have some comments, but overall I think this is a good proposal and addresses my prior concerns.
My only real concern is that I notice some mention of "must dos" for Spark 3.0. I don't want to make any commitment to holding Spark 3.0 for parts of this; I think that is an entirely separate decision. However, I'm guessing this is just a minor wording issue, and you really mean that's a minimal set of features you are aiming for, which is reasonable.
On Mon, Mar 18, 2019 at 12:56 PM Xingbo Jiang <jiangxb1...@gmail.com> wrote:
> Hi all,
>
> I updated the SPIP doc
> <https://docs.google.com/document/d/1C4J_BPOcSCJc58HL7JfHtIzHrjU0rLRdQM3y7ejil64/edit#>
> and stories
> <https://docs.google.com/document/d/12JjloksHCdslMXhdVZ3xY5l1Nde3HRhIrqvzGnK_bNE/edit#heading=h.udyua28eu3sg>.
> I hope it now contains a clear scope of the changes and enough details for
> the SPIP vote.
> Please review the updated docs, thanks!
>
> Xiangrui Meng <men...@gmail.com> wrote on Wed, Mar 6, 2019 at 8:35 AM:
>
>> How about letting Xingbo make a major revision to the SPIP doc to make it
>> clear what is proposed? I like Felix's suggestion to switch to the new
>> Heilmeier template, which helps clarify what is proposed and what is not.
>> Then let's review the new SPIP and resume the vote.
>>
>> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid <im...@therashids.com> wrote:
>>
>>> OK, I suppose then we are getting bogged down into what a vote on an
>>> SPIP means anyway, which I guess we can set aside for now. With the
>>> level of detail in this proposal, I feel like there is a reasonable chance
>>> I'd still -1 the design or implementation.
>>>
>>> And the other thing you're implicitly asking the community for is to
>>> prioritize this feature for continued review and maintenance. There is
>>> already work to be done in things like making barrier mode support dynamic
>>> allocation (SPARK-24942), bugs in failure handling (e.g. SPARK-25250), and
>>> general efficiency of failure handling (e.g. SPARK-25341, SPARK-20178). I'm
>>> very concerned about getting spread too thin.
>>>
>>> But if this is really just a vote on (1) is better GPU support important
>>> for Spark, in some form, in some release? and (2) is it *possible* to do
>>> this in a safe way? then I will vote +0.
>>>
>>> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves <tgraves...@yahoo.com> wrote:
>>>
>>>> So to me most of the questions here are implementation/design
>>>> questions. I've had this issue in the past with SPIPs, where I expected to
>>>> have more high-level design details but was basically told that belongs in
>>>> the design JIRA follow-on. This makes me think we need to revisit what a
>>>> SPIP really needs to contain, which should be done in a separate thread.
>>>> Note personally I would be for having more high-level details in it.
>>>> But the way I read our documentation on a SPIP right now, that detail is
>>>> all optional. Now maybe we could argue it's based on what reviewers request,
>>>> but really perhaps we should make the wording of that more required.
>>>> Thoughts? We should probably separate that discussion if people want to
>>>> talk about that.
>>>>
>>>> For this SPIP in particular, the reason I +1 it is because it came down
>>>> to 2 questions:
>>>>
>>>> 1) Do I think Spark should support this? My answer is yes. I think
>>>> this would improve Spark; users have been requesting both better GPU
>>>> support and support for controlling container requests at a finer
>>>> granularity for a while. If Spark doesn't support this then users may go
>>>> to something else, so I think we should support it.
>>>>
>>>> 2) Do I think it's possible to design and implement it without causing
>>>> large instabilities? My opinion here again is yes. I agree with Imran and
>>>> others that the scheduler piece needs to be looked at very closely, as we
>>>> have had a lot of issues there, and that is why I was asking for more
>>>> details in the design JIRA:
>>>> https://issues.apache.org/jira/browse/SPARK-27005. But I do believe
>>>> it's possible to do.
>>>>
>>>> If others have reservations on similar questions then I think we should
>>>> resolve them here, or take the discussion of what a SPIP is to a different
>>>> thread and then come back to this. Thoughts?
>>>>
>>>> Note there is already a high-level design for at least the core piece,
>>>> which is what people seem concerned with, so including it in the SPIP
>>>> should be straightforward.
>>>>
>>>> Tom
>>>>
>>>> On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <im...@therashids.com> wrote:
>>>>
>>>>
>>>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng <men...@gmail.com> wrote:
>>>>
>>>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <felixcheun...@hotmail.com>
>>>> wrote:
>>>>
>>>> IMO upfront allocation is less useful. Specifically too expensive for
>>>> large jobs.
>>>>
>>>>
>>>> This is also an API/design discussion.
>>>>
>>>>
>>>> I agree with Felix -- this is more than just an API question. It has a
>>>> huge impact on the complexity of what you're proposing. You might be
>>>> proposing big changes to a core and brittle part of Spark, which is already
>>>> short of experts.
>>>>
>>>> I don't see any value in having a vote on "does feature X sound cool?"
>>>> We have to evaluate the potential benefit against the risks the feature
>>>> brings and the continued maintenance cost. We don't need super low-level
>>>> details, but we have to have a sketch of the design to be able to make that
>>>> tradeoff.
>>>>
>>>