:) Sorry, that was ambiguous. I was seconding Imran's comment.

On Mon, Mar 4, 2019 at 3:09 PM Xiangrui Meng <men...@gmail.com> wrote:
> On Mon, Mar 4, 2019 at 1:56 PM Mark Hamstra <m...@clearstorydata.com>
> wrote:
>
>> +1
>
> Mark, just to be clear, are you +1 on the SPIP or Imran's point?
>
>> On Mon, Mar 4, 2019 at 12:52 PM Imran Rashid <im...@therashids.com>
>> wrote:
>>
>>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng <men...@gmail.com> wrote:
>>>
>>>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <felixcheun...@hotmail.com>
>>>> wrote:
>>>>
>>>>> IMO upfront allocation is less useful. Specifically too expensive for
>>>>> large jobs.
>>>>
>>>> This is also an API/design discussion.
>>>
>>> I agree with Felix -- this is more than just an API question. It has a
>>> huge impact on the complexity of what you're proposing. You might be
>>> proposing big changes to a core and brittle part of Spark, which is already
>>> short of experts.
>
> To my understanding, Felix's comment is mostly on the user interfaces,
> stating upfront allocation is less useful, especially for large jobs. I
> agree that for large jobs we better have dynamic allocation, which was
> mentioned in the YARN support section in the companion scoping doc. We
> restrict the new container type to initially requested to keep things
> simple. However, upfront allocation already meets the requirements of basic
> workflows like data + DL training/inference + data. Saying "it is less
> useful specifically for large jobs" kinda missed the fact that "it is super
> useful for basic use cases".
>
> Your comment is mostly on the implementation side, which IMHO is the
> KEY question to conclude this vote: does the design sketch sufficiently
> demonstrate that the internal changes to the Spark scheduler are manageable?
> I read Xingbo's design sketch and I think it is doable, which led to my +1.
> But I'm not an expert on the scheduler. So I would feel more confident if
> the design was reviewed by some scheduler experts. I also read the design
> sketch to support different cluster managers, which I think is less
> critical than the internal scheduler changes.
>
>>> I don't see any value in having a vote on "does feature X sound cool?"
>
> I believe no one would disagree. To prepare the companion doc, we went
> through several rounds of discussions to provide concrete stories such that
> the proposal is not just "cool".
>
>>> We have to evaluate the potential benefit against the risks the feature
>>> brings and the continued maintenance cost. We don't need super low-level
>>> details, but we have to have a sketch of the design to be able to make that
>>> tradeoff.
>
> Could you review the design sketch from Xingbo, help evaluate the cost,
> and provide feedback?