Hi

Thanks for starting this work weihua, I think unifying
DeclarativeSlotManager and FineGrainedSlotManager is valuable.

I agree with @Matthias and @John that we need a way to ensure that
DeclarativeSlotManager's capabilities are fully covered by
FineGrainedSlotManager

1. For their functional differences, can you give some detailed tests to
verify that the new FineGrainedSlotManager has these capabilities? This can
effectively verify the new functions

2. I'm worried that many functions are not independent and it is difficult
to migrate step-by-step. You can list the relationship between them in
detail.

3. As John mentioned, give a smoke test for FineGrainedSlotManager is a
good idea. Or you can add some test information to the
DeclarativeSlotManager to determine how many tests have used it. In this
way, we can gradually construct test cases for FineGrainedSlotManager
during the development process.


Best,
Shammon


On Tue, Feb 28, 2023 at 10:22 PM John Roesler <vvcep...@apache.org> wrote:

> Thanks for the FLIP, Weihua!
>
> I’ve read the FLIP, and it sounds good to me. We need to avoid
> proliferating alternative implementations wherever possible. I have just a
> couple of comments:
>
> 1. I share Matthias’s concern about ensuring the behavior is really the
> same. One suggestion I’ve used for this kind of thing is, as a smoke test,
> to update the DeclarativeSlotManager to just delegate to the
> FineGrainedSlotManager. If the full test suite still passes, you can be
> pretty sure the new default is really ok. It would not be a good idea to
> actually keep that in for the release, since it would remove the option to
> fall back in case of bugs. Either way, we need to make sure all test
> scenarios are present for the FGSM.
>
> 4. In addition to changing the default, would it make sense to log a
> deprecation warning on initialization if the DeclarativeSlotManager is used?
>
> Thanks again,
> John
>
> On Tue, Feb 28, 2023, at 07:20, Matthias Pohl wrote:
> > Hi Weihua,
> > Thanks for your proposal. From a conceptual point: AFAIU, the
> > DeclarativeSlotManager covers a subset (i.e. only evenly sized slots) of
> > what the FineGrainedSlotManager should be able to achieve (variable slot
> > size per task manager). Is this the right assumption/understanding? In
> this
> > sense, merging both implementations into a single one sounds good. A few
> > more general comments, though:
> >
> > 1. Did you do a proper test coverage analysis? That's not mentioned in
> the
> > current version of the FLIP. I'm bringing this up because we ran into the
> > same issue when fixing the flaws that popped up after introducing the
> > multi-component leader election (see FLIP-285 [1]). There is a risk that
> by
> > removing the legacy code we decrease test coverage because certain
> > test cases that were covered for the legacy classes might not be
> > necessarily covered in the new implementation, yet (see FLINK-30338 [2]
> > which covers this issue for the leader election case). Ideally, we don't
> > want to remove test cases accidentally because they were only implemented
> > for the DeclarativeSlotManager but missed for the FineGrainedSlotManager.
> >
> > 2. DeclarativeSlotManager and FineGrainedSlotManager feel quite big in
> > terms of lines of code. Without knowing whether it's actually a
> reasonable
> > thing to do: Instead of just adding more features to the
> > FineGrainedSlotManager, have you thought of cutting out functionality
> into
> > smaller sub-components along this refactoring? Such a step-by-step
> approach
> > might improve the overall codebase and might make reviewing the
> refactoring
> > easier. I did a first pass over the code and struggled to identify code
> > blocks that could be moved out of the SlotManager implementation(s).
> > Therefore, I might be wrong with this proposal. I haven't worked on this
> > codebase in that detail that it would allow me to come up with a
> judgement
> > call. I wanted to bring it up, anyway, because I'm curious whether that
> > could be an option. There's a comment created by Chesnay (CC'd) in the
> > JavaDoc of TaskExecutorManager [3] indicating something similar. I'm
> > wondering whether he can add some insights here.
> >
> > 3. For me personally, having a more detailed summary comparing the
> > subcomponents of both SlotManager implementations with where
> > their functionality matches and where they differ might help understand
> the
> > consequences of the changes proposed in FLIP-298.
> >
> > Best,
> > Matthias
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-285%3A+Refactoring+LeaderElection+to+make+Flink+support+multi-component+leader+election+out-of-the-box
> > [2] https://issues.apache.org/jira/browse/FLINK-30338
> > [3]
> >
> https://github.com/apache/flink/blob/f611ea8cb5deddb42429df2c99f0c68d7382e9bd/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/TaskExecutorManager.java#L66-L68
> >
> > On Tue, Feb 28, 2023 at 6:14 AM Matt Wang <wang...@163.com> wrote:
> >
> >> This is a good proposal for me, it will make the code of the SlotManager
> >> more clear.
> >>
> >>
> >>
> >> --
> >>
> >> Best,
> >> Matt Wang
> >>
> >>
> >> ---- Replied Message ----
> >> | From | David Morávek<d...@apache.org> |
> >> | Date | 02/27/2023 22:45 |
> >> | To | <dev@flink.apache.org> |
> >> | Subject | Re: [DISCUSS] FLIP-298: Unifying the Implementation of
> >> SlotManager |
> >> Hi Weihua, I still need to dig into the details, but the overall
> sentiment
> >> of this change sounds reasonable.
> >>
> >> Best,
> >> D.
> >>
> >> On Mon, Feb 27, 2023 at 2:26 PM Zhanghao Chen <
> zhanghao.c...@outlook.com>
> >> wrote:
> >>
> >> Thanks for driving this topic. I think this FLIP could help clean up the
> >> codebase to make it easier to maintain. +1 on it.
> >>
> >> Best,
> >> Zhanghao Chen
> >> ________________________________
> >> From: Weihua Hu <huweihua....@gmail.com>
> >> Sent: Monday, February 27, 2023 20:40
> >> To: dev <dev@flink.apache.org>
> >> Subject: [DISCUSS] FLIP-298: Unifying the Implementation of SlotManager
> >>
> >> Hi everyone,
> >>
> >> I would like to begin a discussion on FLIP-298: Unifying the
> Implementation
> >> of SlotManager[1]. There are currently two types of SlotManager in
> Flink:
> >> DeclarativeSlotManager and FineGrainedSlotManager.
> FineGrainedSlotManager
> >> should behave as DeclarativeSlotManager if the user does not configure
> the
> >> slot request profile.
> >>
> >> Therefore, this FLIP aims to unify the implementation of SlotManager in
> >> order to reduce maintenance costs.
> >>
> >> Looking forward to hearing from you.
> >>
> >> [1]
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-298%3A+Unifying+the+Implementation+of+SlotManager
> >>
> >> Best,
> >> Weihua
> >>
> >>
>

Reply via email to