Hi all, I updated the SPIP doc <https://docs.google.com/document/d/1C4J_BPOcSCJc58HL7JfHtIzHrjU0rLRdQM3y7ejil64/edit#> and stories <https://docs.google.com/document/d/12JjloksHCdslMXhdVZ3xY5l1Nde3HRhIrqvzGnK_bNE/edit#heading=h.udyua28eu3sg>. I hope they now contain a clear scope of the changes and enough detail for the SPIP vote. Please review the updated docs, thanks!

On Wed, Mar 6, 2019 at 8:35 AM, Xiangrui Meng <men...@gmail.com> wrote:

> How about letting Xingbo make a major revision to the SPIP doc to make it
> clear what is proposed? I like Felix's suggestion to switch to the new
> Heilmeier template, which helps clarify what is proposed and what is not.
> Then let's review the new SPIP and resume the vote.
>
> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid <im...@therashids.com> wrote:
>
>> OK, I suppose then we are getting bogged down in what a vote on an SPIP
>> means then anyway, which I guess we can set aside for now. With the level
>> of detail in this proposal, I feel like there is a reasonable chance I'd
>> still -1 the design or implementation.
>>
>> And the other thing you're implicitly asking the community for is to
>> prioritize this feature for continued review and maintenance. There is
>> already work to be done in things like making barrier mode support dynamic
>> allocation (SPARK-24942), bugs in failure handling (e.g. SPARK-25250), and
>> general efficiency of failure handling (e.g. SPARK-25341, SPARK-20178). I'm
>> very concerned about getting spread too thin.
>>
>> But if this is really just a vote on (1) is better GPU support important
>> for Spark, in some form, in some release? and (2) is it *possible* to do
>> this in a safe way? then I will vote +0.
>>
>> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves <tgraves...@yahoo.com> wrote:
>>
>>> So to me most of the questions here are implementation/design questions.
>>> I've had this issue in the past with SPIPs where I expected to have more
>>> high level design details but was basically told that belongs in the design
>>> jira follow on. This makes me think we need to revisit what a SPIP really
>>> needs to contain, which should be done in a separate thread. Note
>>> personally I would be for having more high level details in it.
>>>
>>> But the way I read our documentation on a SPIP right now, that detail is
>>> all optional. Now maybe we could argue it's based on what reviewers request,
>>> but really perhaps we should make the wording of that more required.
>>> Thoughts? We should probably separate that discussion if people want to
>>> talk about that.
>>>
>>> For this SPIP in particular the reason I +1 it is because it came down
>>> to 2 questions:
>>>
>>> 1) do I think Spark should support this -> my answer is yes, I think
>>> this would improve Spark. Users have been requesting both better GPU
>>> support and support for controlling container requests at a finer
>>> granularity for a while. If Spark doesn't support this then users may go
>>> to something else, so I think we should support it
>>>
>>> 2) do I think it's possible to design and implement it without causing
>>> large instabilities? My opinion here again is yes. I agree with Imran and
>>> others that the scheduler piece needs to be looked at very closely as we
>>> have had a lot of issues there, and that is why I was asking for more
>>> details in the design jira:
>>> https://issues.apache.org/jira/browse/SPARK-27005. But I do believe
>>> it's possible to do.
>>>
>>> If others have reservations on similar questions then I think we should
>>> resolve them here or take the discussion of what a SPIP is to a different
>>> thread and then come back to this. Thoughts?
>>>
>>> Note there is a high level design for at least the core piece, which is
>>> what people seem concerned with, already, so including it in the SPIP should
>>> be straightforward.
>>>
>>> Tom
>>>
>>> On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <
>>> im...@therashids.com> wrote:
>>>
>>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng <men...@gmail.com> wrote:
>>>
>>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <felixcheun...@hotmail.com>
>>> wrote:
>>>
>>> IMO upfront allocation is less useful. Specifically too expensive for
>>> large jobs.
>>>
>>> This is also an API/design discussion.
>>>
>>> I agree with Felix -- this is more than just an API question. It has a
>>> huge impact on the complexity of what you're proposing. You might be
>>> proposing big changes to a core and brittle part of Spark, which is already
>>> short of experts.
>>>
>>> I don't see any value in having a vote on "does feature X sound cool?"
>>> We have to evaluate the potential benefit against the risks the feature
>>> brings and the continued maintenance cost. We don't need super low-level
>>> details, but we have to have a sketch of the design to be able to make that
>>> tradeoff.
>>>