[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo updated FLINK-15959: --- Description: Flink removed `-n` option after FLIP-6, change to ResourceManager start a new worker when required. But I think maintain a certain amount of slots is necessary. These workers will start immediately when ResourceManager starts and would not release even if all slots are free. Here are some resons: # Users actually know how many resources are needed when run a single job, initialize all workers when cluster starts can speed up startup process. # Job schedule in topology order, next operator won't schedule until prior execution slot allocated. The TaskExecutors will start in several batchs in some cases, it might slow down the startup speed. # Flink support [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out tasks evenly across all available registered TaskManagers], but it will only effect if all TMs are registered. Start all TMs at begining can slove this problem. *suggestion:* * Add config "taskmanager.minimum.numberOfTotalSlots" and "taskmanager.maximum.numberOfTotalSlots", default behavior is still like before. * Start plenty number of workers to satisfy minimum slots when ResourceManager accept leadership(subtract recovered workers). * Don't comlete slot request until minimum number of slots are registered, and throw exeception when exceed maximum. *update* Finally, we'd like to introduce three config options related to the minimum resources restriction: * slotmanager.min-total-resource.cpu * slotmanager.min-total-resource.memory * slotmanager.number-of-slots.min Note that these configuration do not take effect for standalone clusters, where how many slots are allocated is not controlled by Flink. These config are best effort and Flink will not block the job progress even if the min resources are not guaranteed. was: Flink removed `-n` option after FLIP-6, change to ResourceManager start a new worker when required. But I think maintain a certain amount of slots is necessary. These workers will start immediately when ResourceManager starts and would not release even if all slots are free. Here are some resons: # Users actually know how many resources are needed when run a single job, initialize all workers when cluster starts can speed up startup process. # Job schedule in topology order, next operator won't schedule until prior execution slot allocated. The TaskExecutors will start in several batchs in some cases, it might slow down the startup speed. # Flink support [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out tasks evenly across all available registered TaskManagers], but it will only effect if all TMs are registered. Start all TMs at begining can slove this problem. *suggestion:* * Add config "taskmanager.minimum.numberOfTotalSlots" and "taskmanager.maximum.numberOfTotalSlots", default behavior is still like before. * Start plenty number of workers to satisfy minimum slots when ResourceManager accept leadership(subtract recovered workers). * Don't comlete slot request until minimum number of slots are registered, and throw exeception when exceed maximum. > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Assignee: xiangyu feng >Priority: Major > Labels: auto-deprioritized-major, auto-deprioritized-minor, > auto-unassigned, pull-request-available > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", defau
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-15959: --- Labels: auto-deprioritized-major auto-deprioritized-minor auto-unassigned pull-request-available (was: auto-deprioritized-major auto-deprioritized-minor auto-unassigned) > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Assignee: xiangyu feng >Priority: Major > Labels: auto-deprioritized-major, auto-deprioritized-minor, > auto-unassigned, pull-request-available > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo updated FLINK-15959: --- Priority: Major (was: Not a Priority) > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Major > Labels: auto-deprioritized-major, auto-deprioritized-minor, > auto-unassigned > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo updated FLINK-15959: --- Parent: FLINK-25318 Issue Type: Sub-task (was: Improvement) > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor, > auto-unassigned > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-15959: --- Labels: auto-deprioritized-major auto-deprioritized-minor auto-unassigned (was: auto-deprioritized-major auto-unassigned stale-minor) Priority: Not a Priority (was: Minor) This issue was labeled "stale-minor" 7 days ago and has not received any updates so it is being deprioritized. If this ticket is actually Minor, please raise the priority and ask a committer to assign you the issue or revive the public discussion. > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor, > auto-unassigned > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-15959: --- Labels: auto-deprioritized-major auto-unassigned stale-minor (was: auto-deprioritized-major auto-unassigned) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issues has been marked as Minor but is unassigned and neither itself nor its Sub-Tasks have been updated for 180 days. I have gone ahead and marked it "stale-minor". If this ticket is still Minor, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized. > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, stale-minor > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-15959: --- Labels: auto-deprioritized-major auto-unassigned (was: auto-unassigned stale-major) Priority: Minor (was: Major) This issue was labeled "stale-major" 7 ago and has not received any updates so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion. > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-15959: --- Labels: auto-unassigned stale-major (was: auto-unassigned) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issues has been marked as Major but is unassigned and neither itself nor its Sub-Tasks have been updated for 30 days. I have gone ahead and added a "stale-major" to the issue". If this ticket is a Major, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized. > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Major > Labels: auto-unassigned, stale-major > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-15959: -- Fix Version/s: (was: 1.14.0) > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Major > Labels: auto-unassigned > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz updated FLINK-15959: - Fix Version/s: (was: 1.13.0) 1.14.0 > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Major > Labels: auto-unassigned > Fix For: 1.14.0 > > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-15959: --- Labels: auto-unassigned (was: stale-assigned) > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Assignee: Yangze Guo >Priority: Major > Labels: auto-unassigned > Fix For: 1.13.0 > > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-15959: --- Labels: stale-assigned (was: ) > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Assignee: Yangze Guo >Priority: Major > Labels: stale-assigned > Fix For: 1.13.0 > > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo updated FLINK-15959: --- Fix Version/s: (was: 1.12.0) 1.13.0 > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Assignee: Yangze Guo >Priority: Major > Fix For: 1.13.0 > > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo updated FLINK-15959: --- Fix Version/s: 1.12.0 > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Assignee: Yangze Guo >Priority: Major > Fix For: 1.12.0 > > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15959) Add min number of slots configuration to limit total number of slots
[ https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo updated FLINK-15959: --- Summary: Add min number of slots configuration to limit total number of slots (was: Add min/max number of slots configuration to limit total number of slots) > Add min number of slots configuration to limit total number of slots > > > Key: FLINK-15959 > URL: https://issues.apache.org/jira/browse/FLINK-15959 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: YufeiLiu >Priority: Major > > Flink removed `-n` option after FLIP-6, change to ResourceManager start a new > worker when required. But I think maintain a certain amount of slots is > necessary. These workers will start immediately when ResourceManager starts > and would not release even if all slots are free. > Here are some resons: > # Users actually know how many resources are needed when run a single job, > initialize all workers when cluster starts can speed up startup process. > # Job schedule in topology order, next operator won't schedule until prior > execution slot allocated. The TaskExecutors will start in several batchs in > some cases, it might slow down the startup speed. > # Flink support > [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out > tasks evenly across all available registered TaskManagers], but it will only > effect if all TMs are registered. Start all TMs at begining can slove this > problem. > *suggestion:* > * Add config "taskmanager.minimum.numberOfTotalSlots" and > "taskmanager.maximum.numberOfTotalSlots", default behavior is still like > before. > * Start plenty number of workers to satisfy minimum slots when > ResourceManager accept leadership(subtract recovered workers). > * Don't comlete slot request until minimum number of slots are registered, > and throw exeception when exceed maximum. -- This message was sent by Atlassian Jira (v8.3.4#803005)