Thanks Xintong for driving this effort. I haven't finished the whole document yet, but I have a couple of questions:

1. Regarding network memory, the document says it will be derived by the framework automatically. I'm wondering whether we should remove this dimension from the user-facing API?
2. Regarding the fraction based quota, I don't quite get the meaning of "slotSharingGroupOnHeapManagedMem" and "slotSharingGroupOffHeapManagedMem". What if the sharing group mixes specified and UNKNOWN resource requirements?
3. IIUC, even if a user has set resource requirements, let's say 500MB of off-heap managed memory, during execution the operator may or may not actually get 500MB of off-heap managed memory, right?

Best,
Kurt

On Mon, Sep 2, 2019 at 8:36 PM Zhu Zhu <reed...@gmail.com> wrote:

> Thanks Xintong for proposing this improvement. Fine grained resources can be very helpful when users have good planning on resources.
>
> I have a few questions:
>
> 1. Currently in a batch job, vertices from different regions can run at the same time in slots from the same sharing group, as long as they do not have data dependencies on each other and the available slot count is not smaller than the *max* parallelism of all tasks. With the changes in this FLIP, however, tasks from different regions cannot share slots anymore. Once the available slot count is smaller than the *sum* of the parallelism of tasks from all regions, tasks may need to be executed sequentially, which might result in a performance regression. Is this (performance regression for existing DataSet jobs) considered a necessary and accepted trade-off in this FLIP?
>
> 2. The network memory depends on the input/output ExecutionEdge count and thus can be different even for parallel instances of the same JobVertex. Does this mean that when adding task resources to calculate the slot resource for a sharing group, the max possible network memory of the vertex instances shall be used? This might result in a larger resource requirement than actually needed.
>
> And some minor comments:
>
> 1. Regarding "fracManagedMemOnHeap = 1 / numOpsUseOnHeapManagedMemory", I guess you mean numOpsUseOnHeapManagedMemoryInTheSameSharedGroup?
> 2. I think the *StreamGraphGenerator* in the #Slot Sharing section and implementation step 4 should be *StreamingJobGraphGenerator*, as *StreamGraphGenerator* is not aware of the JobGraph and pipelined regions.
>
> Thanks,
> Zhu Zhu
>
> On Mon, Sep 2, 2019 at 11:59 AM, Xintong Song <tonysong...@gmail.com> wrote:
>
> > Updated the FLIP wiki page [1], with the following changes.
> >
> > - Remove the step of converting pipelined edges between different slot sharing groups into blocking edges.
> > - Set `allSourcesInSamePipelinedRegion` to true by default.
> >
> > Thank you~
> >
> > Xintong Song
> >
> > On Mon, Sep 2, 2019 at 11:50 AM Xintong Song <tonysong...@gmail.com> wrote:
> >
> > > Regarding changing the edge type, I think actually we don't need to do this for batch jobs either, because we don't have public interfaces for users to explicitly set slot sharing groups in the DataSet API and SQL/Table API. We have such interfaces in the DataStream API only.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > > On Tue, Aug 27, 2019 at 10:16 PM Xintong Song <tonysong...@gmail.com> wrote:
> > >
> > >> Thanks for the correction, Till.
> > >>
> > >> Regarding your comments:
> > >>
> > >> - You are right, we should not change the edge type for streaming jobs. Then I think we can change the option 'allSourcesInSamePipelinedRegion' in step 2 to 'isStreamingJob', and implement the current step 2 before the current step 1, so we can use this option to decide whether we should change the edge type. What do you think?
> > >> - Agree. It should be easier to make the default value of 'allSourcesInSamePipelinedRegion' (or 'isStreamingJob') 'true', and set it to 'false' when using the DataSet API or blink planner.
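[Editor's note] Zhu Zhu's comment on "fracManagedMemOnHeap = 1 / numOpsUseOnHeapManagedMemory" above can be made concrete with a minimal sketch of the even-split rule — the fraction is computed over operators in the same slot sharing group, not the whole job. The function name is hypothetical, not the actual FLIP-53 code:

```python
def onheap_managed_mem_fractions(num_ops_in_shared_group):
    """Even-split rule for operators with UNKNOWN resource requirements:
    each operator using on-heap managed memory in the SAME slot sharing
    group gets 1 / numOpsUseOnHeapManagedMemoryInTheSameSharedGroup."""
    if num_ops_in_shared_group <= 0:
        return []
    return [1.0 / num_ops_in_shared_group] * num_ops_in_shared_group
```

So a sharing group with four such operators would give each a fraction of 0.25, and the fractions always sum to 1 within the group.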
> > >>
> > >> Thank you~
> > >>
> > >> Xintong Song
> > >>
> > >> On Tue, Aug 27, 2019 at 8:59 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > >>
> > >>> Thanks for creating the implementation plan, Xintong. Overall, the implementation plan looks good. I had a couple of comments:
> > >>>
> > >>> - What will happen if a user has defined a streaming job with two slot sharing groups? Would the code insert a blocking data exchange between these two groups? If yes, then this breaks existing Flink streaming jobs.
> > >>> - How do we detect unbounded streaming jobs to set the allSourcesInSamePipelinedRegion to `true`? Wouldn't it be easier to set it to false if we are using the DataSet API or the Blink planner with a bounded job?
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>> On Tue, Aug 27, 2019 at 2:16 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > >>>
> > >>> > I guess there is a typo, since the link to FLIP-53 is
> > >>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
> > >>> >
> > >>> > Cheers,
> > >>> > Till
> > >>> >
> > >>> > On Tue, Aug 27, 2019 at 1:42 PM Xintong Song <tonysong...@gmail.com> wrote:
> > >>> >
> > >>> >> Added implementation steps for this FLIP on the wiki page [1].
> > >>> >>
> > >>> >> Thank you~
> > >>> >>
> > >>> >> Xintong Song
> > >>> >>
> > >>> >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > >>> >>
> > >>> >> On Mon, Aug 19, 2019 at 10:29 PM Xintong Song <tonysong...@gmail.com> wrote:
> > >>> >>
> > >>> >> > Hi everyone,
> > >>> >> >
> > >>> >> > As Till suggested, the original "FLIP-53: Fine Grained Resource Management" splits into two separate FLIPs,
> > >>> >> >
> > >>> >> > - FLIP-53: Fine Grained Operator Resource Management [1]
> > >>> >> > - FLIP-56: Dynamic Slot Allocation [2]
> > >>> >> >
> > >>> >> > We'll continue using this discussion thread for FLIP-53. For FLIP-56, I just started a new discussion thread [3].
> > >>> >> >
> > >>> >> > Thank you~
> > >>> >> >
> > >>> >> > Xintong Song
> > >>> >> >
> > >>> >> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
> > >>> >> > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> > >>> >> > [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-56-Dynamic-Slot-Allocation-td31960.html
> > >>> >> >
> > >>> >> > On Mon, Aug 19, 2019 at 2:55 PM Xintong Song <tonysong...@gmail.com> wrote:
> > >>> >> >
> > >>> >> >> Thanks for the comments, Yang.
> > >>> >> >>
> > >>> >> >> Regarding your questions:
> > >>> >> >>
> > >>> >> >>> 1. How to calculate the resource specification of TaskManagers? Do they all have the same resource spec calculated based on the configuration? I think we still have wasted resources in this situation. Or we could start TaskManagers with different specs.
> > >>> >> >>
> > >>> >> >> I agree with you that we can further improve resource utilization by customizing task executors with different resource specifications. However, I'm in favor of limiting the scope of this FLIP and leaving this as a future optimization. The plan for that part is to move the logic of deciding task executor specifications into the slot manager and make the slot manager pluggable, so inside the slot manager plugin we can have different logics for deciding the task executor specifications.
> > >>> >> >>
> > >>> >> >>> 2. If a slot is released and returned to the SlotPool, can it be reused by another SlotRequest whose requested resource is smaller than it?
> > >>> >> >>
> > >>> >> >> No, I think the slot pool should always return slots if they do not exactly match the pending requests, so that the resource manager can deal with the extra resources.
> > >>> >> >>
> > >>> >> >>> - If yes, what happens to the available resource in the TaskManager?
> > >>> >> >>> - What is the SlotStatus of the cached slot in SlotPool? The AllocationId is null?
> > >>> >> >>
> > >>> >> >> The allocation id does not change as long as the slot is not returned from the job master, no matter whether it is occupied or available in the slot pool. I think we have the same behavior currently. No matter how many tasks the job master deploys into the slot, concurrently or sequentially, it is one allocation from the cluster to the job until the slot is freed from the job master.
> > >>> >> >>
> > >>> >> >>> 3. In a session cluster, some jobs are configured with operator resources, meanwhile other jobs are using UNKNOWN. How to deal with this situation?
> > >>> >> >>
> > >>> >> >> As long as we do not mix unknown / specified resource profiles within the same job / slot, there shouldn't be a problem. The resource manager converts unknown resource profiles in slot requests to specified default resource profiles, so they can be dynamically allocated from task executors' available resources just like other slot requests with specified resource profiles.
> > >>> >> >>
> > >>> >> >> Thank you~
> > >>> >> >>
> > >>> >> >> Xintong Song
> > >>> >> >>
> > >>> >> >> On Mon, Aug 19, 2019 at 11:39 AM Yang Wang <danrtsey...@gmail.com> wrote:
> > >>> >> >>
> > >>> >> >>> Hi Xintong,
> > >>> >> >>>
> > >>> >> >>> Thanks for your detailed proposal. I think many users are suffering from wasted resources. The resource spec of all task managers is the same, and we have to scale up all task managers to make the heavy one more stable. So we will benefit a lot from fine grained resource management. We could get better resource utilization and stability.
> > >>> >> >>>
> > >>> >> >>> Just to share some thoughts.
> > >>> >> >>>
> > >>> >> >>> 1. How to calculate the resource specification of TaskManagers? Do they all have the same resource spec calculated based on the configuration? I think we still have wasted resources in this situation. Or we could start TaskManagers with different specs.
> > >>> >> >>> 2. If a slot is released and returned to the SlotPool, can it be reused by another SlotRequest whose requested resource is smaller than it?
> > >>> >> >>>    - If yes, what happens to the available resource in the TaskManager?
> > >>> >> >>>    - What is the SlotStatus of the cached slot in SlotPool? The AllocationId is null?
> > >>> >> >>> 3. In a session cluster, some jobs are configured with operator resources, meanwhile other jobs are using UNKNOWN. How to deal with this situation?
> > >>> >> >>>
> > >>> >> >>> Best,
> > >>> >> >>> Yang
> > >>> >> >>>
> > >>> >> >>> On Fri, Aug 16, 2019 at 8:57 PM, Xintong Song <tonysong...@gmail.com> wrote:
> > >>> >> >>>
> > >>> >> >>> > Thanks for the feedback, Yangze and Till.
> > >>> >> >>> >
> > >>> >> >>> > Yangze,
> > >>> >> >>> >
> > >>> >> >>> > I agree with you that we should make the scheduling strategy pluggable and optimize the strategy to reduce the memory fragmentation problem, and thanks for the inputs on the potential algorithmic solutions. However, I'm in favor of keeping this FLIP focused on the overall mechanism design rather than strategies. Solving the fragmentation issue should be considered an optimization, and I agree with Till that we probably should tackle this afterwards.
> > >>> >> >>> >
> > >>> >> >>> > Till,
> > >>> >> >>> >
> > >>> >> >>> > - Regarding splitting the FLIP, I think it makes sense. The operator resource management and dynamic slot allocation do not have much dependency on each other.
> > >>> >> >>> > - Regarding the default slot size, I think this is similar to FLIP-49 [1] where we want all the deriving to happen in one place. I think it would be nice to pass the default slot size into the task executor in the same way that we pass in the memory pool sizes in FLIP-49 [1].
> > >>> >> >>> > - Regarding the return value of TaskExecutorGateway#requestResource, I think you're right. We should avoid using null as the return value. I think we probably should throw an exception here.
> > >>> >> >>> >
> > >>> >> >>> > Thank you~
> > >>> >> >>> >
> > >>> >> >>> > Xintong Song
> > >>> >> >>> >
> > >>> >> >>> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > >>> >> >>> >
> > >>> >> >>> > On Fri, Aug 16, 2019 at 2:18 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > >>> >> >>> >
> > >>> >> >>> > > Hi Xintong,
> > >>> >> >>> > >
> > >>> >> >>> > > thanks for drafting this FLIP. I think your proposal helps to execute batch jobs more efficiently. Moreover, it enables the proper integration of the Blink planner, which is very important as well.
> > >>> >> >>> > >
> > >>> >> >>> > > Overall, the FLIP looks good to me. I was wondering whether it wouldn't make sense to actually split it up into two FLIPs: operator resource management and dynamic slot allocation. I think these two FLIPs could be seen as orthogonal and it would decrease the scope of each individual FLIP.
> > >>> >> >>> > >
> > >>> >> >>> > > Some smaller comments:
> > >>> >> >>> > >
> > >>> >> >>> > > - I'm not sure whether we should pass in the default slot size via an environment variable. Without having unified the way Flink components are configured [1], I think it would be better to pass it in as part of the configuration.
> > >>> >> >>> > > - I would avoid returning a null value from TaskExecutorGateway#requestResource if the request cannot be fulfilled. Either we should introduce an explicit return value saying this or throw an exception.
> > >>> >> >>> > >
> > >>> >> >>> > > Concerning Yangze's comments: I think you are right that it would be helpful to make the selection strategy pluggable. Also, batching slot requests to the RM could be a good optimization. For the sake of keeping the scope of this FLIP smaller, I would try to tackle these things after the initial version has been completed (without spoiling these optimization opportunities). In particular, batching the slot requests depends on the current scheduler refactoring and could also be realized on the RM side only.
> > >>> >> >>> > >
> > >>> >> >>> > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-54%3A+Evolve+ConfigOption+and+Configuration
> > >>> >> >>> > >
> > >>> >> >>> > > Cheers,
> > >>> >> >>> > > Till
> > >>> >> >>> > >
> > >>> >> >>> > > On Fri, Aug 16, 2019 at 11:11 AM Yangze Guo <karma...@gmail.com> wrote:
> > >>> >> >>> > >
> > >>> >> >>> > > > Hi, Xintong
> > >>> >> >>> > > >
> > >>> >> >>> > > > Thanks for proposing this FLIP. The general design looks good to me, +1 for this feature.
> > >>> >> >>> > > >
> > >>> >> >>> > > > Since slots in the same task executor could have different resource profiles, we will meet the resource fragmentation problem. Think about this case:
> > >>> >> >>> > > > - Request A wants 1G memory while requests B & C want 0.5G memory each.
> > >>> >> >>> > > > - There are two task executors, T1 & T2, with 1G and 0.5G free memory respectively.
> > >>> >> >>> > > > If B comes first and we cut a slot from T1 for B, A must wait for resources freed by other tasks. But A could have been scheduled immediately if we had cut a slot from T2 for B.
> > >>> >> >>> > > >
> > >>> >> >>> > > > The logic of findMatchingSlot now becomes finding a task executor which has enough resources and then cutting a slot from it. The current method could be seen as a "first-fit strategy", which works well in general but is sometimes not optimal.
> > >>> >> >>> > > >
> > >>> >> >>> > > > Actually, this problem can be abstracted as the "bin packing problem" [1]. Here are some common approximation algorithms:
> > >>> >> >>> > > > - First fit
> > >>> >> >>> > > > - Next fit
> > >>> >> >>> > > > - Best fit
> > >>> >> >>> > > >
> > >>> >> >>> > > > But it becomes a multi-dimensional bin packing problem if we take CPU into account. It is then hard to define which one is the best fit. Some research has addressed this problem, such as Tetris [2].
> > >>> >> >>> > > >
> > >>> >> >>> > > > Here are some thoughts about it:
> > >>> >> >>> > > > 1. We could make the strategy of finding a matching task executor pluggable, and let users configure the best strategy for their scenario.
> > >>> >> >>> > > > 2. We could support a batch request interface in the RM, because we have opportunities to optimize if we have more information. If we know A, B, and C at the same time, we could always make the best decision.
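[Editor's note] Yangze's T1/T2 example above can be sketched to show why first fit can fragment memory while best fit avoids it in this case. This is illustrative Python, not Flink code; all names are hypothetical and memory is in GB:

```python
def first_fit(request_gb, free_mem_gb):
    # Return the first task executor with enough free memory
    # (the behavior Yangze describes as the current findMatchingSlot logic).
    for te, free in free_mem_gb.items():
        if free >= request_gb:
            return te
    return None

def best_fit(request_gb, free_mem_gb):
    # Return the task executor that leaves the least leftover memory
    # after cutting the slot, to reduce fragmentation.
    fitting = [(free - request_gb, te)
               for te, free in free_mem_gb.items() if free >= request_gb]
    return min(fitting)[1] if fitting else None
```

With `{"T1": 1.0, "T2": 0.5}` free, first fit places B's 0.5G request on T1 and blocks A's later 1G request, while best fit picks T2 and keeps T1 whole for A.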
> > >>> >> >>> > > >
> > >>> >> >>> > > > [1] http://www.or.deis.unibo.it/kp/Chapter8.pdf
> > >>> >> >>> > > > [2] https://www.cs.cmu.edu/~xia/resources/Documents/grandl_sigcomm14.pdf
> > >>> >> >>> > > >
> > >>> >> >>> > > > Best,
> > >>> >> >>> > > > Yangze Guo
> > >>> >> >>> > > >
> > >>> >> >>> > > > On Thu, Aug 15, 2019 at 10:40 PM Xintong Song <tonysong...@gmail.com> wrote:
> > >>> >> >>> > > >
> > >>> >> >>> > > > > Hi everyone,
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > We would like to start a discussion thread on "FLIP-53: Fine Grained Resource Management" [1], where we propose how to improve Flink resource management and scheduling.
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > This FLIP mainly discusses the following issues.
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > - How to support tasks with fine grained resource requirements.
> > >>> >> >>> > > > > - How to unify resource management for jobs with / without fine grained resource requirements.
> > >>> >> >>> > > > > - How to unify resource management for streaming / batch jobs.
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > Key changes proposed in the FLIP are as follows.
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > - Unify memory management for operators with / without fine grained resource requirements by applying a fraction based quota mechanism.
> > >>> >> >>> > > > > - Unify resource scheduling for streaming and batch jobs by setting slot sharing groups for pipelined regions during the compiling stage.
> > >>> >> >>> > > > > - Dynamically allocate slots from task executors' available resources.
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > Please find more details in the FLIP wiki document [1]. Looking forward to your feedback.
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > Thank you~
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > Xintong Song
> > >>> >> >>> > > > >
> > >>> >> >>> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Resource+Management
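[Editor's note] The "dynamically allocate slots from task executors' available resources" change in the announcement above can be sketched roughly as cutting slots out of a task executor's remaining budget. Class and method names are hypothetical (not the actual SlotManager/TaskExecutor API), and an unfulfillable request is signaled explicitly rather than silently, matching the discussion about TaskExecutorGateway#requestResource:

```python
class TaskExecutorResources:
    """Sketch of dynamic slot allocation; memory is tracked in MB."""

    def __init__(self, total_mb):
        self.available_mb = total_mb

    def request_slot(self, profile_mb):
        # Cut a slot matching the requested profile from the available
        # resources; return None explicitly when the request cannot be
        # fulfilled, leaving the available budget unchanged.
        if profile_mb <= self.available_mb:
            self.available_mb -= profile_mb
            return profile_mb
        return None
```

For example, a 1024MB task executor can serve one 600MB slot request, after which a second 600MB request is rejected because only 424MB remains.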