Thanks for the comments, Andrey.

- I agree that instead of ResourceManagerGateway#sendSlotReport, we should add the default slot resource profile to ResourceManagerGateway#registerTaskExecutor.
- If I understand correctly, the reason you suggest doing the default slot resource profile first, and then doing step 3 in a way that supports both TaskExecutorGateway#requestSlot and TaskExecutorGateway#requestResource, is to avoid splitting code paths with the feature option? I think we can do that, but I also want to point out that this can only reduce the code split by the feature option (which is good), not eliminate it. We still need the feature option for the fundamental differences, e.g. creating new SlotIDs on allocation vs. allocating to free slots with existing SlotIDs.
- I don't really think we can do steps 5, 6 and 7 independently. Basically they all make changes to the same component. We can probably do steps 6 and 7 independently, but I think they both depend on step 5.

In general, I would say it's good to have as little code as possible split by the feature option, which makes the later clean-up easier. But if that cannot be done easily, I would rather not put too much effort into a good abstraction and deduplication between the new code path and the original one that we are removing soon.

What do you think?

Thank you~

Xintong Song


On Mon, Sep 16, 2019 at 5:59 PM Andrey Zagrebin <and...@ververica.com> wrote:

> Hi Xintong,
>
> Thanks for sharing the implementation steps. I also think they make sense
> with the feature option.
>
> I was wondering if we could order the steps in a way that each change does
> not affect other components too much, always having a working system;
> then maybe the feature option does not always need to split the code. Here
> are some thoughts.
>
> - We could do the default slot profile first and include it in the TM
> registration. I would suggest adding it
> to ResourceManagerGateway#registerTaskExecutor, not sendSlotReport.
> This way RM knows about it but does not use it at this point.
> (parts of step 4, 6)
>
> - We could try to do step 3 first in a way that it also supports the
> current way of allocation in TaskExecutorGateway#requestSlot with the
> default slot profile
> and sends reports both with available resources and with free default
> slots which correspond to the available resources. We can just remove free
> default slots later.
> The new way of TaskExecutorGateway#requestResource could also be
> implemented here but not used yet.
>
> - Then step 5 can use the new TaskExecutorGateway#requestResource and the
> default slot profile
>
> - Not sure whether steps 5 and 7 can be implemented independently without
> regression of what we have. Maybe if we do step 7 first, it will have only
> default slots at first and it will simplify step 5 later.
>
> Best,
> Andrey
>
> On Mon, Sep 16, 2019 at 5:53 AM Xintong Song <tonysong...@gmail.com>
> wrote:
>
> > Thanks for the comments, Till and Wenlong.
> >
> > @Wenlong
> > Regarding slot sharing, the general idea is to request a slot with
> > resources for tasks of the entire slot sharing group. Details can be found
> > in FLIP-53 [1], regarding how to decide the slot sharing groups and how to
> > manage task resources within the shared slots.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Mon, Sep 16, 2019 at 10:42 AM wenlong.lwl <wenlong88....@gmail.com>
> > wrote:
> >
> > > Hi, Xintong, thanks for the great proposal. Big +1 for the feature! It is
> > > something like mapreduce-1.0 to mapreduce-2.0.
> > >
> > > I like the design on the whole. One point may need to be included in the
> > > proposal: how do we deal with slot sharing groups and dynamic slot
> > > allocation? It can be quite different with dynamic slot allocation.
> > >
> > > On Fri, 13 Sep 2019 at 16:42, Till Rohrmann <trohrm...@apache.org>
> > > wrote:
> > >
> > > > Thanks for the update Xintong. From a high level perspective the
> > > > implementation plan looks good to me.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Thu, Sep 12, 2019 at 11:04 AM Xintong Song <tonysong...@gmail.com>
> > > > wrote:
> > > >
> > > > > Added implementation steps for this FLIP on the wiki page [1].
> > > > >
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > > [1]
> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> > > > >
> > > > > On Tue, Aug 20, 2019 at 3:43 PM Xintong Song <tonysong...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > @Zili
> > > > > >
> > > > > > As far as I know, Timo is drafting a FLIP that has taken the number 55.
> > > > > > There is a round-up number maintained on the FLIP wiki page [1] that shows
> > > > > > which number should be used for the new FLIP, which should be increased by
> > > > > > whoever takes the number for a new FLIP.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > > [1]
> > > > > > https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> > > > > >
> > > > > > On Tue, Aug 20, 2019 at 3:28 AM Zili Chen <wander4...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> We suddenly skipped FLIP-55 lol.
> > > > > >>
> > > > > >>
> > > > > >> Xintong Song <tonysong...@gmail.com> wrote on Mon, Aug 19, 2019 at 10:23 PM:
> > > > > >>
> > > > > >> > Hi everyone,
> > > > > >> >
> > > > > >> > We would like to start a discussion thread on "FLIP-56: Dynamic Slot
> > > > > >> > Allocation" [1]. This is originally part of the discussion thread for
> > > > > >> > "FLIP-53: Fine Grained Resource Management" [2]. As Till suggested, we
> > > > > >> > would like to split the original discussion into two topics, and start a
> > > > > >> > separate new discussion thread as well as FLIP process for this one.
> > > > > >> >
> > > > > >> > Thank you~
> > > > > >> >
> > > > > >> > Xintong Song
> > > > > >> >
> > > > > >> >
> > > > > >> > [1]
> > > > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> > > > > >> >
> > > > > >> > [2]
> > > > > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-53-Fine-Grained-Resource-Management-td31831.html
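
For readers following the thread, the "fundamental difference" mentioned in the top reply (allocating to free slots with existing SlotIDs vs. creating new SlotIDs on allocation) can be illustrated with a small toy model. This is only a sketch under stated assumptions and is not Flink code: the classes `Profile`, `Slot` and `ToyTaskExecutor`, the boolean feature option, and the memory-only resource model are all hypothetical. Only the method names `requestSlot` and `requestResource` are borrowed from the discussion, and their signatures here are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy model only; none of these types are real Flink classes. */
final class SlotAllocationSketch {

    /** Hypothetical stand-in for a slot resource profile (memory only). */
    static final class Profile {
        final int memoryMb;
        Profile(int memoryMb) { this.memoryMb = memoryMb; }
    }

    /** A slot identified by a numeric id within one task executor. */
    static final class Slot {
        final int slotId;
        final Profile profile;
        boolean allocated;
        Slot(int slotId, Profile profile, boolean allocated) {
            this.slotId = slotId;
            this.profile = profile;
            this.allocated = allocated;
        }
    }

    /** Toy task executor exposing both allocation styles behind a feature option. */
    static final class ToyTaskExecutor {
        private final boolean dynamicSlotAllocation; // hypothetical feature option
        private final List<Slot> slots = new ArrayList<>();
        private int availableMemoryMb;
        private int nextSlotId = 0;

        ToyTaskExecutor(boolean dynamicSlotAllocation, int totalMemoryMb, int numDefaultSlots) {
            this.dynamicSlotAllocation = dynamicSlotAllocation;
            this.availableMemoryMb = totalMemoryMb;
            if (!dynamicSlotAllocation) {
                // Original path: slots with fixed ids and the default profile exist up front.
                Profile defaultProfile = new Profile(totalMemoryMb / numDefaultSlots);
                for (int i = 0; i < numDefaultSlots; i++) {
                    slots.add(new Slot(nextSlotId++, defaultProfile, false));
                }
            }
        }

        /** Original path: allocate a free slot that already has an id. */
        Optional<Slot> requestSlot(int slotId) {
            for (Slot slot : slots) {
                if (slot.slotId == slotId && !slot.allocated) {
                    slot.allocated = true;
                    availableMemoryMb -= slot.profile.memoryMb;
                    return Optional.of(slot);
                }
            }
            return Optional.empty();
        }

        /** New path: carve a slot out of the available resources and give it a new id. */
        Optional<Slot> requestResource(Profile requested) {
            if (!dynamicSlotAllocation || requested.memoryMb > availableMemoryMb) {
                return Optional.empty();
            }
            availableMemoryMb -= requested.memoryMb;
            Slot slot = new Slot(nextSlotId++, requested, true);
            slots.add(slot);
            return Optional.of(slot);
        }

        int remainingMemoryMb() { return availableMemoryMb; }

        int slotCount() { return slots.size(); }
    }

    public static void main(String[] args) {
        ToyTaskExecutor staticTm = new ToyTaskExecutor(false, 4096, 4);
        System.out.println("static: slot 0 granted = " + staticTm.requestSlot(0).isPresent());

        ToyTaskExecutor dynamicTm = new ToyTaskExecutor(true, 4096, 4);
        Optional<Slot> created = dynamicTm.requestResource(new Profile(1536));
        System.out.println("dynamic: slot created = " + created.isPresent()
                + ", remaining MB = " + dynamicTm.remainingMemoryMb());
    }
}
```

In this toy model the two paths share the resource bookkeeping but differ in where slot ids come from, which is the part that a feature option would still have to switch on even if the rest of the code were shared.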