[jira] [Updated] (FLINK-17019) Implement FIFO Physical Slot Assignment in SlotPoolImpl

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17019: Fix Version/s: (was: 1.11.0) 1.12.0 > Implement FIFO Physical Slot Assignment in Sl

[jira] [Updated] (FLINK-17017) Implement Bulk Slot Allocation

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17017: Fix Version/s: (was: 1.11.0) 1.12.0 > Implement Bulk Slot Allocation >

[jira] [Updated] (FLINK-17016) Use PipelinedRegionSchedulingStrategy in DefaultScheduler (for Blink Planner)

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17016: Fix Version/s: (was: 1.11.0) 1.12.0 > Use PipelinedRegionSchedulingStrategy in Defa

[jira] [Updated] (FLINK-17018) DefaultExecutionSlotAllocator allocates slots in bulks ignoring slot sharing

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17018: Fix Version/s: (was: 1.11.0) 1.12.0 > DefaultExecutionSlotAllocator allocates slots

[jira] [Updated] (FLINK-17542) Unify slot request timeout handling for streaming and batch tasks

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17542: Fix Version/s: (was: 1.11.0) 1.12.0 > Unify slot request timeout handling for strea

[jira] [Updated] (FLINK-17330) Avoid scheduling deadlocks caused by cyclic input dependencies between regions

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17330: Fix Version/s: (was: 1.11.0) 1.12.0 > Avoid scheduling deadlocks caused by cyclic i

[jira] [Updated] (FLINK-15626) Remove legacy scheduler

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15626: Fix Version/s: (was: 1.11.0) 1.12.0 > Remove legacy scheduler > ---

[jira] [Updated] (FLINK-15031) Calculate required shuffle memory before allocating slots if resources are specified

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15031: Fix Version/s: (was: 1.11.0) 1.12.0 > Calculate required shuffle memory before allo

[jira] [Updated] (FLINK-13056) Optimize region failover performance on calculating vertices to restart

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-13056: Fix Version/s: (was: 1.11.0) > Optimize region failover performance on calculating vertices to restart

[jira] [Updated] (FLINK-17760) Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17760: Fix Version/s: 1.12.0 > Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore > -

[jira] [Updated] (FLINK-17760) Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17760: Affects Version/s: 1.11.0 > Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

[jira] [Updated] (FLINK-14606) Simplify params of Execution#processFail

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14606: Fix Version/s: (was: 1.11.0) > Simplify params of Execution#processFail >

[jira] [Updated] (FLINK-15178) TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15178: Fix Version/s: (was: 1.11.0) > TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffl

[jira] [Updated] (FLINK-14510) Remove the lazy vertex attaching mechanism from ExecutionGraph

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14510: Fix Version/s: (was: 1.11.0) 1.12.0 > Remove the lazy vertex attaching mechanism fr

[jira] [Updated] (FLINK-14236) Make LazyFromSourcesSchedulingStrategy do lazy scheduling based on partition state only

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14236: Fix Version/s: (was: 1.11.0) > Make LazyFromSourcesSchedulingStrategy do lazy scheduling based on part

[jira] [Updated] (FLINK-14233) All task state changes should be notified to SchedulingStrategy (SchedulerNG)

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14233: Fix Version/s: (was: 1.11.0) > All task state changes should be notified to SchedulingStrategy (Schedu

[jira] [Closed] (FLINK-15813) Set default value of jobmanager.execution.failover-strategy to region

2020-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-15813. --- Resolution: Fixed done via master b425c05d57ace5cf27591dbd6798b6131211f6c1 release-1.1 6fa0c92567a9d8163

[jira] [Created] (FLINK-17821) Kafka010TableITCase>KafkaTableTestBase.testKafkaSourceSink failed on AZP

2020-05-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17821: --- Summary: Kafka010TableITCase>KafkaTableTestBase.testKafkaSourceSink failed on AZP Key: FLINK-17821 URL: https://issues.apache.org/jira/browse/FLINK-17821 Project: Flink

[jira] [Commented] (FLINK-17821) Kafka010TableITCase>KafkaTableTestBase.testKafkaSourceSink failed on AZP

2020-05-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111779#comment-17111779 ] Zhu Zhu commented on FLINK-17821: - [~wanglijie95] yes, it's the same root cause. Thanks

[jira] [Closed] (FLINK-17821) Kafka010TableITCase>KafkaTableTestBase.testKafkaSourceSink failed on AZP

2020-05-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-17821. --- Resolution: Duplicate > Kafka010TableITCase>KafkaTableTestBase.testKafkaSourceSink failed on AZP > -

[jira] [Assigned] (FLINK-17018) DefaultExecutionSlotAllocator allocates slots in bulks ignoring slot sharing

2020-05-20 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-17018: --- Assignee: Zhu Zhu > DefaultExecutionSlotAllocator allocates slots in bulks ignoring slot sharing >

[jira] [Updated] (FLINK-17645) REAPER_THREAD.start() in SafetyNetCloseableRegistry failed, causing the repeated failover.

2020-05-20 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17645: Affects Version/s: 1.12.0 > REAPER_THREAD.start() in SafetyNetCloseableRegistry failed, causing the > rep

[jira] [Updated] (FLINK-17645) REAPER_THREAD.start() in SafetyNetCloseableRegistry failed, causing the repeated failover.

2020-05-20 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17645: Fix Version/s: 1.12.0 > REAPER_THREAD.start() in SafetyNetCloseableRegistry failed, causing the > repeate

[jira] [Closed] (FLINK-17645) REAPER_THREAD.start() in SafetyNetCloseableRegistry failed, causing the repeated failover.

2020-05-20 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-17645. --- Resolution: Fixed Fixed via master 9f3a71183ea4b14a396ecf66e4377da07b06a689 release-1.1 d40826d23c8993f462

[jira] [Assigned] (FLINK-17814) Translate native kubernetes document to Chinese

2020-05-20 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-17814: --- Assignee: CaoZhen > Translate native kubernetes document to Chinese > -

[jira] [Commented] (FLINK-17814) Translate native kubernetes document to Chinese

2020-05-20 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112774#comment-17112774 ] Zhu Zhu commented on FLINK-17814: - I have assigned the ticket to you [~caozhen1937]. >

[jira] [Assigned] (FLINK-17019) Implement FIFO Physical Slot Assignment in SlotPoolImpl

2020-05-21 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-17019: --- Assignee: Zhu Zhu > Implement FIFO Physical Slot Assignment in SlotPoolImpl > -

[jira] [Assigned] (FLINK-17016) Use PipelinedRegionSchedulingStrategy in DefaultScheduler (for Blink Planner)

2020-05-21 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-17016: --- Assignee: Zhu Zhu > Use PipelinedRegionSchedulingStrategy in DefaultScheduler (for Blink Planner) >

[jira] [Assigned] (FLINK-17726) Scheduler should take care of tasks directly canceled by TaskManager

2020-05-22 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-17726: --- Assignee: Nicholas Jiang > Scheduler should take care of tasks directly canceled by TaskManager > -

[jira] [Commented] (FLINK-17726) Scheduler should take care of tasks directly canceled by TaskManager

2020-05-22 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113820#comment-17113820 ] Zhu Zhu commented on FLINK-17726: - I have assigned the ticket to you. [~nicholasjiang] B

[jira] [Closed] (FLINK-17542) Unify slot request timeout handling for streaming and batch tasks

2020-05-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-17542. --- Resolution: Abandoned Superseded by FLINK-17018. > Unify slot request timeout handling for streaming and ba

[jira] [Closed] (FLINK-17017) Implement Bulk Slot Allocation

2020-05-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-17017. --- Resolution: Abandoned Superseded by FLINK-17018. > Implement Bulk Slot Allocation > ---

[jira] [Updated] (FLINK-17018) Allocates slots in bulks on pipelined region scheduling

2020-05-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17018: Summary: Allocates slots in bulks on pipelined region scheduling (was: DefaultExecutionSlotAllocator allo

[jira] [Updated] (FLINK-17017) Implements bulk allocation for physical slots

2020-05-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17017: Summary: Implements bulk allocation for physical slots (was: Implement Bulk Slot Allocation) > Implement

[jira] [Reopened] (FLINK-17017) Implements bulk allocation for physical slots

2020-05-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reopened FLINK-17017: - Reopened for 'bulk slot allocation' part of FLINK-17018. > Implements bulk allocation for physical slots >

[jira] [Updated] (FLINK-17017) Implements bulk allocation for physical slots

2020-05-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-17017: Description: SlotProvider should support bulk slot allocation so that we can check to see if the resource

[jira] [Commented] (FLINK-17923) It will throw MemoryAllocationException if rocksdb statebackend and Python UDF are used in the same slot

2020-05-27 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117512#comment-17117512 ] Zhu Zhu commented on FLINK-17923: - We ([~yunta]) just found that we can also have option #

[jira] [Commented] (FLINK-15037) Introduce LimittingMemoryManager as operator scope MemoyManager

2019-12-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988444#comment-16988444 ] Zhu Zhu commented on FLINK-15037: - Thanks for the feedback [~ykt836]. This task is not f

[jira] [Comment Edited] (FLINK-15037) Introduce LimittingMemoryManager as operator scope MemoyManager

2019-12-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988444#comment-16988444 ] Zhu Zhu edited comment on FLINK-15037 at 12/5/19 4:09 AM: -- Than

[jira] [Commented] (FLINK-14566) Enable to get/set whether an operator uses managed memory

2019-12-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988458#comment-16988458 ] Zhu Zhu commented on FLINK-14566: - Thanks for confirming the design and helping with the

[jira] [Updated] (FLINK-14701) Slot leaks if SharedSlotOversubscribedException happens

2019-12-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14701: Priority: Minor (was: Major) > Slot leaks if SharedSlotOversubscribedException happens >

[jira] [Commented] (FLINK-15031) Calculate required shuffle memory cases before allocating slots in resources specified

2019-12-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988603#comment-16988603 ] Zhu Zhu commented on FLINK-15031: - Thanks [~trohrmann] for the feedback. I can work on

[jira] [Comment Edited] (FLINK-15031) Calculate required shuffle memory cases before allocating slots in resources specified

2019-12-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988603#comment-16988603 ] Zhu Zhu edited comment on FLINK-15031 at 12/5/19 9:03 AM: -- Than

[jira] [Updated] (FLINK-15031) Calculate required shuffle memory before allocating slots if resources are specified

2019-12-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15031: Summary: Calculate required shuffle memory before allocating slots if resources are specified (was: Calcu

[jira] [Updated] (FLINK-13056) Optimize region failover performance on calculating vertices to restart

2019-12-08 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-13056: Fix Version/s: (was: 1.10.0) 1.11.0 > Optimize region failover performance on calcu

[jira] [Commented] (FLINK-13056) Optimize region failover performance on calculating vertices to restart

2019-12-08 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991173#comment-16991173 ] Zhu Zhu commented on FLINK-13056: - Postpone this improvement to 1.11. > Optimize region

[jira] [Updated] (FLINK-15031) Calculate required shuffle memory before allocating slots if resources are specified

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15031: Fix Version/s: (was: 1.10.0) 1.11.0 > Calculate required shuffle memory before allo

[jira] [Commented] (FLINK-15031) Calculate required shuffle memory before allocating slots if resources are specified

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991514#comment-16991514 ] Zhu Zhu commented on FLINK-15031: - Postpone to 1.11 since several pre-requisites to use t

[jira] [Updated] (FLINK-14162) Unify SchedulerOperations#allocateSlotsAndDeploy implementation for all scheduling strategies

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14162: Fix Version/s: (was: 1.10.0) 1.11.0 > Unify SchedulerOperations#allocateSlotsAndDep

[jira] [Updated] (FLINK-14233) All task state changes should be notified to SchedulingStrategy (SchedulerNG)

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14233: Fix Version/s: (was: 1.10.0) 1.11.0 > All task state changes should be notified to

[jira] [Updated] (FLINK-14606) Simplify params of Execution#processFail

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14606: Fix Version/s: (was: 1.10.0) 1.11.0 > Simplify params of Execution#processFail > --

[jira] [Updated] (FLINK-14234) All partition consumable events should be notified to SchedulingStrategy (SchedulerNG)

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14234: Fix Version/s: (was: 1.10.0) 1.11.0 > All partition consumable events should be not

[jira] [Updated] (FLINK-14236) Make LazyFromSourcesSchedulingStrategy do lazy scheduling based on partition state only

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14236: Fix Version/s: (was: 1.10.0) 1.11.0 > Make LazyFromSourcesSchedulingStrategy do laz

[jira] [Updated] (FLINK-15031) Calculate required shuffle memory before allocating slots if resources are specified

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15031: Parent: (was: FLINK-14058) Issue Type: Task (was: Sub-task) > Calculate required shuffle memo

[jira] [Created] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler is not shown in WebUI

2019-12-09 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-15169: --- Summary: Errors happen in the scheduling of DefaultScheduler is not shown in WebUI Key: FLINK-15169 URL: https://issues.apache.org/jira/browse/FLINK-15169 Project: Flink

[jira] [Updated] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler is not shown in WebUI

2019-12-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15169: Description: WebUI relies on {{ExecutionGraph#failureInfo}} and {{Execution#failureCause}} to generate er

[jira] [Commented] (FLINK-14058) FLIP-53 Fine Grained Operator Resource Management

2019-12-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992444#comment-16992444 ] Zhu Zhu commented on FLINK-14058: - Yes we can close it. We may need some time for the re

[jira] [Commented] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler is not shown in WebUI

2019-12-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992455#comment-16992455 ] Zhu Zhu commented on FLINK-15169: - 1. Execution#processFail is only invoked on task fail

[jira] [Comment Edited] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler is not shown in WebUI

2019-12-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992455#comment-16992455 ] Zhu Zhu edited comment on FLINK-15169 at 12/10/19 11:27 AM:

[jira] [Created] (FLINK-15178) Task crash due to mmap allocation failure for BLOCKING shuffle

2019-12-10 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-15178: --- Summary: Task crash due to mmap allocation failure for BLOCKING shuffle Key: FLINK-15178 URL: https://issues.apache.org/jira/browse/FLINK-15178 Project: Flink Issue T

[jira] [Updated] (FLINK-15178) TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle

2019-12-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15178: Summary: TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle (was: Task crash due to

[jira] [Updated] (FLINK-15178) TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle

2019-12-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15178: Description: I met this issue when running testing batch(DataSet) job with 1000 parallelism. Some TMs cras

[jira] [Comment Edited] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler is not shown in WebUI

2019-12-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992602#comment-16992602 ] Zhu Zhu edited comment on FLINK-15169 at 12/10/19 2:29 PM: --- If

[jira] [Commented] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler is not shown in WebUI

2019-12-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992602#comment-16992602 ] Zhu Zhu commented on FLINK-15169: - If we can have an interface {{ExecutionVertexOperatio

[jira] [Comment Edited] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler is not shown in WebUI

2019-12-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992602#comment-16992602 ] Zhu Zhu edited comment on FLINK-15169 at 12/10/19 2:30 PM: --- If

[jira] [Commented] (FLINK-15178) TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle

2019-12-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993300#comment-16993300 ] Zhu Zhu commented on FLINK-15178: - [~pnowojski] I tried -XX:-UseCompressedOops but the i

[jira] [Comment Edited] (FLINK-15178) TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle

2019-12-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993300#comment-16993300 ] Zhu Zhu edited comment on FLINK-15178 at 12/11/19 8:31 AM: --- [~

[jira] [Commented] (FLINK-15178) TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle

2019-12-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993323#comment-16993323 ] Zhu Zhu commented on FLINK-15178: - Thanks for the explanation [~pnowojski]. Yes the iss

[jira] [Commented] (FLINK-15178) TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle

2019-12-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993477#comment-16993477 ] Zhu Zhu commented on FLINK-15178: - Yes, my result is from the master {{Rev:1710207, Date

[jira] [Comment Edited] (FLINK-15178) TaskExecutor crashes due to mmap allocation failure for BLOCKING shuffle

2019-12-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993477#comment-16993477 ] Zhu Zhu edited comment on FLINK-15178 at 12/11/19 12:37 PM:

[jira] [Commented] (FLINK-15013) Flink (on YARN) sometimes needs too many slots

2019-12-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994287#comment-16994287 ] Zhu Zhu commented on FLINK-15013: - [~trohrmann] yes that's what I mean. And location pre

[jira] [Updated] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler are not shown in WebUI

2019-12-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15169: Summary: Errors happen in the scheduling of DefaultScheduler are not shown in WebUI (was: Errors happen i

[jira] [Created] (FLINK-15224) Resource requirements are not respected when fulfilling a slot request with unresolvedRootSlots from a SlotSharingManager

2019-12-12 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-15224: --- Summary: Resource requirements are not respected when fulfilling a slot request with unresolvedRootSlots from a SlotSharingManager Key: FLINK-15224 URL: https://issues.apache.org/jira/brows

[jira] [Updated] (FLINK-15224) Resource requirements are not respected when fulfilling a slot request with unresolvedRootSlots from a SlotSharingManager

2019-12-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15224: Description: In {{SchedulerImpl#allocateMultiTaskSlot}}, if a slot request cannot be fulfilled immediatel

[jira] [Updated] (FLINK-15224) Resource requirements are not respected when fulfilling a slot request with unresolvedRootSlots from a SlotSharingManager

2019-12-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15224: Description: In {{SchedulerImpl#allocateMultiTaskSlot}}, if a slot request cannot be fulfilled immediatel

[jira] [Updated] (FLINK-15224) Resource requirements are not respected when fulfilling a slot request with unresolvedRootSlots from a SlotSharingManager

2019-12-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15224: Description: In {{SchedulerImpl#allocateMultiTaskSlot}}, if a slot request cannot be fulfilled immediatel

[jira] [Commented] (FLINK-15013) Flink (on YARN) sometimes needs too many slots

2019-12-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994787#comment-16994787 ] Zhu Zhu commented on FLINK-15013: - Yes, it is ok to not respect input preferences if fal

[jira] [Commented] (FLINK-15224) Resource requirements are not respected when fulfilling a slot request with unresolvedRootSlots from a SlotSharingManager

2019-12-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995723#comment-16995723 ] Zhu Zhu commented on FLINK-15224: - The managed memory weight is not in ResourceProfile i

[jira] [Commented] (FLINK-15249) Improve PipelinedRegions calculation with Union Set

2019-12-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997171#comment-16997171 ] Zhu Zhu commented on FLINK-15249: - Hi [~nppoly], did you measure the region building per

[jira] [Commented] (FLINK-15249) Improve PipelinedRegions calculation with Union Set

2019-12-17 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998122#comment-16998122 ] Zhu Zhu commented on FLINK-15249: - Thanks for the reply [~nppoly]. According to the resu

[jira] [Assigned] (FLINK-15320) JobManager crashes in the standalone model when cancelling job which subtask' status is scheduled

2019-12-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-15320: --- Assignee: Zhu Zhu > JobManager crashes in the standalone model when cancelling job which subtask'

[jira] [Commented] (FLINK-15320) JobManager crashes in the standalone model when cancelling job which subtask' status is scheduled

2019-12-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999815#comment-16999815 ] Zhu Zhu commented on FLINK-15320: - To summarize, this issue happens when an external job

[jira] [Created] (FLINK-15325) Input location preference which affects task distribution may make certain job performance worse

2019-12-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-15325: --- Summary: Input location preference which affects task distribution may make certain job performance worse Key: FLINK-15325 URL: https://issues.apache.org/jira/browse/FLINK-15325

[jira] [Updated] (FLINK-15325) Input location preference which affects task distribution may make certain job performance worse

2019-12-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15325: Environment: (was: When running TPC-DS jobs in a session cluster, we observed that sometimes tasks are

[jira] [Updated] (FLINK-15325) Input location preference which affects task distribution may make certain job performance worse

2019-12-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15325: Description: When running TPC-DS jobs in a session cluster, we observed that sometimes tasks are not even

[jira] [Updated] (FLINK-15325) Input location preference which affects task distribution may make certain job performance worse

2019-12-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15325: Description: When running TPC-DS jobs in a session cluster, we observed that sometimes tasks are not even

[jira] [Commented] (FLINK-15325) Input location preference which affects task distribution may make certain job performance worse

2019-12-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999895#comment-16999895 ] Zhu Zhu commented on FLINK-15325: - [~trohrmann] do you think we should have such a confi

[jira] [Commented] (FLINK-15320) JobManager crashes in the standalone model when cancelling job which subtask' status is scheduled

2019-12-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999898#comment-16999898 ] Zhu Zhu commented on FLINK-15320: - Thanks [~trohrmann]. And also thanks [~lining] for re

[jira] [Updated] (FLINK-15325) Input location preference which affects task distribution may make certain job performance worse

2019-12-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15325: Issue Type: Improvement (was: Bug) > Input location preference which affects task distribution may make c

[jira] [Commented] (FLINK-16139) Co-location constraints are not reset on task recovery in DefaultScheduler

2020-02-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038956#comment-17038956 ] Zhu Zhu commented on FLINK-16139: - cc [~gjy] > Co-location constraints are not reset on

[jira] [Created] (FLINK-16139) Co-location constraints are not reset on task recovery in DefaultScheduler

2020-02-18 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-16139: --- Summary: Co-location constraints are not reset on task recovery in DefaultScheduler Key: FLINK-16139 URL: https://issues.apache.org/jira/browse/FLINK-16139 Project: Flink

[jira] [Created] (FLINK-16180) Replacing vertexExecution in ScheduledUnit with executionVertexID

2020-02-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-16180: --- Summary: Replacing vertexExecution in ScheduledUnit with executionVertexID Key: FLINK-16180 URL: https://issues.apache.org/jira/browse/FLINK-16180 Project: Flink Issu

[jira] [Updated] (FLINK-16180) Replacing vertexExecution in ScheduledUnit with executionVertexID

2020-02-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16180: Parent: FLINK-15626 Issue Type: Sub-task (was: Improvement) > Replacing vertexExecution in Schedu

[jira] [Commented] (FLINK-16145) ScheduledUnit toString method throw NPE when Execution is null

2020-02-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040582#comment-17040582 ] Zhu Zhu commented on FLINK-16145: - Thanks for reporting this issue [~liuyufei]! This is

[jira] [Updated] (FLINK-16180) Replacing vertexExecution in ScheduledUnit with executionVertexID

2020-02-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16180: Description: {{ScheduledUnit#vertexExecution}} is nullable but {{ProgrammedSlotProvider}} requires it to

[jira] [Commented] (FLINK-16180) Replacing vertexExecution in ScheduledUnit with executionVertexID

2020-02-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040713#comment-17040713 ] Zhu Zhu commented on FLINK-16180: - cc [~gjy] [~trohrmann] > Replacing vertexExecution i

[jira] [Updated] (FLINK-16180) Replacing vertexExecution in ScheduledUnit with executionVertexID

2020-02-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16180: Description: {{ScheduledUnit#vertexExecution}} is nullable but {{ProgrammedSlotProvider}} requires it to

[jira] [Resolved] (FLINK-16139) Co-location constraints are not reset on task recovery in DefaultScheduler

2020-02-20 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu resolved FLINK-16139. - Resolution: Fixed Fixed via: master: 8c01397018f20865094dec8c37cf44279651a279 release-1.10: fd3e6e7dcaee

[jira] [Assigned] (FLINK-15813) Set default value of jobmanager.execution.failover-strategy to region

2020-02-21 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-15813: --- Assignee: Zhu Zhu > Set default value of jobmanager.execution.failover-strategy to region > ---

[jira] [Commented] (FLINK-16017) Improve attachJobGraph Performance

2020-02-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044298#comment-17044298 ] Zhu Zhu commented on FLINK-16017: - Yes region creation is now invoked twice, one from R
