[jira] [Commented] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371290#comment-17371290 ] Zhu Zhu commented on FLINK-15031: - Discussed with Till offline. His concern was that the network

[jira] [Comment Edited] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371207#comment-17371207 ] Zhu Zhu edited comment on FLINK-15031 at 6/29/21, 8:18 AM: --- I think it should

[jira] [Commented] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371207#comment-17371207 ] Zhu Zhu commented on FLINK-15031: - I think it should be an advanced and experimental config. It can be

[jira] [Commented] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370577#comment-17370577 ] Zhu Zhu commented on FLINK-15031: - Thanks for reviving this discussion! This improvement is necessary

[jira] [Updated] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15031: Summary: Automatically calculate required network memory for fine-grained jobs (was: Automatically

[jira] [Reopened] (FLINK-15031) Automatically calculate required shuffle memory for fine-grained jobs

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reopened FLINK-15031: - Assignee: Jin Xing (was: Zhu Zhu) > Automatically calculate required shuffle memory for fine-grained

[jira] [Updated] (FLINK-15031) Automatically calculate required shuffle memory for fine-grained jobs

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15031: Summary: Automatically calculate required shuffle memory for fine-grained jobs (was: Calculate required

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Fix Version/s: 1.13.2 1.14.0 > StackOverflowException can happen when a large scale

[jira] [Assigned] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22945: --- Assignee: Gen Luo (was: Luo Gen) > StackOverflowException can happen when a large scale job is

[jira] [Assigned] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22945: --- Assignee: Luo Gen > StackOverflowException can happen when a large scale job is CANCELING/FAILING

[jira] [Commented] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370462#comment-17370462 ] Zhu Zhu commented on FLINK-22945: - [~pltbkd] I have assigned you the ticket. Feel free to open a fix for

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Priority: Critical (was: Major) > StackOverflowException can happen when a large scale job is

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Labels: (was: auto-deprioritized-critical) > StackOverflowException can happen when a large scale job

[jira] [Commented] (FLINK-23153) Benchmark not compiling

2021-06-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369262#comment-17369262 ] Zhu Zhu commented on FLINK-23153: - Thanks for reporting this issue [~Thesharing]. I have assigned you

[jira] [Assigned] (FLINK-23153) Benchmark not compiling

2021-06-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-23153: --- Assignee: Zhilong Hong > Benchmark not compiling > --- > >

[jira] [Commented] (FLINK-23005) Optimize the deployment of tasks

2021-06-21 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366577#comment-17366577 ] Zhu Zhu commented on FLINK-23005: - Thanks [~Thesharing] for looking into the problem and proposing an

[jira] [Assigned] (FLINK-23005) Optimize the deployment of tasks

2021-06-21 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-23005: --- Assignee: Zhilong Hong > Optimize the deployment of tasks > > >

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Description: The pending requests in ExecutionSlotAllocator are not cleared when a job transitions to

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Component/s: Runtime / Coordination > StackOverflowException can happen when a large scale job is

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Summary: StackOverflowException can happen when a large scale job is CANCELING/FAILING (was:

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELED/FAILED

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Description: The pending requests in ExecutionSlotAllocator are not cleared when a job transitions to

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELED/FAILED

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Priority: Critical (was: Major) > StackOverflowException can happen when a large scale job is

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELED/FAILED

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Issue Type: Bug (was: Improvement) > StackOverflowException can happen when a large scale job is

[jira] [Created] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELED/FAILED

2021-06-09 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-22945: --- Summary: StackOverflowException can happen when a large scale job is CANCELED/FAILED Key: FLINK-22945 URL: https://issues.apache.org/jira/browse/FLINK-22945 Project: Flink

[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-06-08 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-19142: Labels: pull-request-available (was: pull-request-available stale-assigned) > Investigate slot hijacking

[jira] [Closed] (FLINK-22863) ArrayIndexOutOfBoundsException may happen when building rescale edges

2021-06-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22863. --- Resolution: Fixed Fixed via master: 739a12add50c90e020e4b9aaafc1cc45465fa937 release-1.13:

[jira] [Comment Edited] (FLINK-22115) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2021-06-03 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357036#comment-17357036 ] Zhu Zhu edited comment on FLINK-22115 at 6/4/21, 3:09 AM: -- Close this ticket

[jira] [Closed] (FLINK-22115) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2021-06-03 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22115. --- Resolution: Won't Fix Close this ticket since it may be already fixed and has been inactive for too long.

[jira] [Commented] (FLINK-22863) ArrayIndexOutOfBoundsException may happen when building rescale edges

2021-06-03 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356274#comment-17356274 ] Zhu Zhu commented on FLINK-22863: - Thanks for reporting this issue. [~Thesharing] It is indeed a

[jira] [Updated] (FLINK-22863) ArrayIndexOutOfBoundsException may happen when building rescale edges

2021-06-03 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22863: Priority: Blocker (was: Critical) > ArrayIndexOutOfBoundsException may happen when building rescale

[jira] [Comment Edited] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354757#comment-17354757 ] Zhu Zhu edited comment on FLINK-16069 at 6/1/21, 2:54 AM: -- Even if the main

[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354757#comment-17354757 ] Zhu Zhu commented on FLINK-16069: - Even if the main thread can have the highest priority, GC problem can

[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354420#comment-17354420 ] Zhu Zhu commented on FLINK-16069: - Yes a dedicated {{serializationExecutor}} is an alternative. One

[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-27 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352943#comment-17352943 ] Zhu Zhu commented on FLINK-16069: - Thanks for the suggestion and sorry for the late reply! [~trohrmann]

[jira] [Updated] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-27 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16069: Labels: (was: stale-major) > Creation of TaskDeploymentDescriptor can block main thread for long time >

[jira] [Commented] (FLINK-22677) Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real asynchronous fashion

2021-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347310#comment-17347310 ] Zhu Zhu commented on FLINK-22677: - I will take a look to see how we can improve the partition

[jira] [Assigned] (FLINK-22305) Improve log messages of sort-merge blocking shuffle

2021-05-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22305: --- Assignee: Yingjie Cao > Improve log messages of sort-merge blocking shuffle >

[jira] [Closed] (FLINK-14327) Getting "Could not forward element to next operator" error

2021-05-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-14327. --- Fix Version/s: (was: 1.9.4) Resolution: Invalid Close it because the ticket has been inactive

[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-19142: Labels: pull-request-available (was: auto-unassigned pull-request-available) > Investigate slot

[jira] [Assigned] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-19142: --- Assignee: Zhu Zhu > Investigate slot hijacking from preceding pipelined regions after failover >

[jira] [Comment Edited] (FLINK-17726) Scheduler should take care of tasks directly canceled by TaskManager

2021-04-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332834#comment-17332834 ] Zhu Zhu edited comment on FLINK-17726 at 4/27/21, 1:50 AM: --- I think it is a

[jira] [Commented] (FLINK-17726) Scheduler should take care of tasks directly canceled by TaskManager

2021-04-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332834#comment-17332834 ] Zhu Zhu commented on FLINK-17726: - I think it is a potential issue and is not a real production problem

[jira] [Commented] (FLINK-22115) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2021-04-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331685#comment-17331685 ] Zhu Zhu commented on FLINK-22115: - Hi [~wym_maozi], I could not reproduce this problem (or related

[jira] [Commented] (FLINK-22115) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2021-04-22 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17327150#comment-17327150 ] Zhu Zhu commented on FLINK-22115: - Thanks for reporting this issue. [~wym_maozi] I will take a look. The

[jira] [Assigned] (FLINK-14510) Remove the lazy vertex attaching mechanism from ExecutionGraph

2021-04-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-14510: --- Assignee: (was: Zhu Zhu) > Remove the lazy vertex attaching mechanism from ExecutionGraph >

[jira] [Assigned] (FLINK-12138) Limit input split count of each source task for better failover experience

2021-04-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-12138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-12138: --- Assignee: (was: Zhu Zhu) > Limit input split count of each source task for better failover

[jira] [Updated] (FLINK-14510) Remove the lazy vertex attaching mechanism from ExecutionGraph

2021-04-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14510: Fix Version/s: (was: 1.13.0) > Remove the lazy vertex attaching mechanism from ExecutionGraph >

[jira] [Closed] (FLINK-22037) Remove the redundant blocking queue from DeployingTasksBenchmarkBase

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22037. --- Resolution: Fixed Fixed via b89736b1dd6f350d16529def539f1a9ebac909f1 > Remove the redundant blocking queue

[jira] [Updated] (FLINK-22037) Remove the redundant blocking queue from DeployingTasksBenchmarkBase

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22037: Component/s: (was: Runtime / Coordination) > Remove the redundant blocking queue from

[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311960#comment-17311960 ] Zhu Zhu commented on FLINK-16069: - From what I can see, heartbeat timeout happens because the scheduled

[jira] [Updated] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22007: Fix Version/s: (was: 1.13.0) > PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing >

[jira] [Closed] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22007. --- Resolution: Duplicate > PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing >

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311456#comment-17311456 ] Zhu Zhu commented on FLINK-22007: - PartitionReleaseInBatchJobBenchmark is working

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311426#comment-17311426 ] Zhu Zhu commented on FLINK-22007: - I'd like to wait a bit for the next run of the scheduler

[jira] [Assigned] (FLINK-22037) Remove the redundant blocking queue from DeployingTasksBenchmarkBase

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22037: --- Assignee: Zhilong Hong > Remove the redundant blocking queue from DeployingTasksBenchmarkBase >

[jira] [Closed] (FLINK-20757) Optimize data broadcast for sort-merge shuffle

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-20757. --- Resolution: Fixed Done via ae0a615f4490c548fcb53b15d2f6f0595371d303 > Optimize data broadcast for

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311313#comment-17311313 ] Zhu Zhu commented on FLINK-22007: - [~pnowojski] FLINK-21332 is merged and hopefully

[jira] [Closed] (FLINK-21332) Optimize releasing result partitions in RegionPartitionReleaseStrategy

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21332. --- Resolution: Fixed Done via 9951be845e14026b17518373c73e28796e63407d > Optimize releasing result partitions

[jira] [Closed] (FLINK-21330) Optimize the performance of PipelinedRegionSchedulingStrategy

2021-03-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21330. --- Resolution: Fixed Done via 5f0c76f2e87326cf844d9914e8b8f6cd7f311c8f > Optimize the performance of

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310496#comment-17310496 ] Zhu Zhu commented on FLINK-22007: - Hopefully we can get the fix merged tomorrow. I will disable the

[jira] [Closed] (FLINK-19938) Implement shuffle data read scheduling for sort-merge blocking shuffle

2021-03-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-19938. --- Resolution: Fixed Done via f1e69bbde05cb834e6726e88b3c354299922ed46 > Implement shuffle data read

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310470#comment-17310470 ] Zhu Zhu commented on FLINK-22007: - Hi [~pnowojski], thanks for reporting this problem! We also noticed

[jira] [Closed] (FLINK-21731) Add benchmarks for DefaultScheduler's creation, scheduling and deploying

2021-03-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21731. --- Resolution: Fixed The benchmarks are enabled in flink-benchmark via

[jira] [Assigned] (FLINK-21850) Improve document and config description of sort-merge blocking shuffle

2021-03-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21850: --- Assignee: Yingjie Cao > Improve document and config description of sort-merge blocking shuffle >

[jira] [Closed] (FLINK-20740) Use managed memory to avoid direct memory OOM error for sort-merge shuffle (introduce a separated buffer pool)

2021-03-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-20740. --- Resolution: Fixed done via a1f079d1968ae286bd8d91b48801a732b88b0bc7 > Use managed memory to avoid direct

[jira] [Closed] (FLINK-21331) Optimize calculating tasks to restart in RestartPipelinedRegionFailoverStrategy

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21331. --- Resolution: Fixed done via 9c95cc19bed1a8c9dddcfa3969614474ee4934c2 > Optimize calculating tasks to

[jira] [Commented] (FLINK-21117) KafkaProducerExactlyOnceITCase fails with "Exceeded checkpoint tolerable failure threshold."

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17309124#comment-17309124 ] Zhu Zhu commented on FLINK-21117: - another instance:

[jira] [Closed] (FLINK-21975) Remove hamcrest dependency from SchedulerBenchmarkBase

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21975. --- Resolution: Fixed Fixed via 94ce6f9f638f7e346344e6e078ecbdd8933b44d6 > Remove hamcrest dependency from

[jira] [Commented] (FLINK-20329) Elasticsearch7DynamicSinkITCase hangs

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17309121#comment-17309121 ] Zhu Zhu commented on FLINK-20329: - Another instance:

[jira] [Assigned] (FLINK-21975) Remove hamcrest dependency from SchedulerBenchmarkBase

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21975: --- Assignee: Zhilong Hong > Remove hamcrest dependency from SchedulerBenchmarkBase >

[jira] [Assigned] (FLINK-21920) Optimize DefaultScheduler#allocateSlots

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21920: --- Assignee: Zhilong Hong > Optimize DefaultScheduler#allocateSlots >

[jira] [Assigned] (FLINK-21915) Optimize Execution#finishPartitionsAndUpdateConsumers

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21915: --- Assignee: Zhilong Hong > Optimize Execution#finishPartitionsAndUpdateConsumers >

[jira] [Updated] (FLINK-21788) Throw PartitionNotFoundException if the partition file has been lost for blocking shuffle

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-21788: Affects Version/s: 1.9.3 1.10.3 1.11.3

[jira] [Closed] (FLINK-21731) Add benchmarks for DefaultScheduler's creation, scheduling and deploying

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21731. --- Resolution: Fixed done via 08ef809930ed69d0a0d2752664e88c61c6a1b869 > Add benchmarks for

[jira] [Assigned] (FLINK-21731) Add benchmarks for DefaultScheduler's creation, scheduling and deploying

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21731: --- Assignee: Zhilong Hong > Add benchmarks for DefaultScheduler's creation, scheduling and deploying

[jira] [Updated] (FLINK-21788) Throw PartitionNotFoundException if the partition file has been lost for blocking shuffle

2021-03-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-21788: Priority: Critical (was: Blocker) > Throw PartitionNotFoundException if the partition file has been lost

[jira] [Updated] (FLINK-21788) Throw PartitionNotFoundException if the partition file has been lost for blocking shuffle

2021-03-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-21788: Priority: Blocker (was: Major) > Throw PartitionNotFoundException if the partition file has been lost

[jira] [Updated] (FLINK-21788) Throw PartitionNotFoundException if the partition file has been lost for blocking shuffle

2021-03-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-21788: Parent: (was: FLINK-19614) Issue Type: Bug (was: Sub-task) > Throw

[jira] [Commented] (FLINK-21788) Throw PartitionNotFoundException if the partition file has been lost for blocking shuffle

2021-03-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308329#comment-17308329 ] Zhu Zhu commented on FLINK-21788: - which versions are affected by this problem [~kevin.cyj]? > Throw

[jira] [Comment Edited] (FLINK-21788) Throw PartitionNotFoundException if the partition file has been lost for blocking shuffle

2021-03-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308329#comment-17308329 ] Zhu Zhu edited comment on FLINK-21788 at 3/25/21, 3:44 AM: --- Which versions are

[jira] [Assigned] (FLINK-21788) Throw PartitionNotFoundException if the partition file has been lost for blocking shuffle

2021-03-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21788: --- Assignee: Yingjie Cao > Throw PartitionNotFoundException if the partition file has been lost for

[jira] [Closed] (FLINK-21951) Fix wrong if condition in BufferReaderWriterUtil#writeBuffers

2021-03-24 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21951. --- Assignee: Yingjie Cao Resolution: Fixed Fixed via 3533d9822f0f653f2848b31de0fc239b3a12dcef > Fix

[jira] [Closed] (FLINK-21328) Optimize the initialization of DefaultExecutionTopology

2021-03-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21328. --- Resolution: Fixed done via 0ca36150bc76056ca4ceae28c7eb6ec0a2dd5127 > Optimize the initialization of

[jira] [Closed] (FLINK-21777) Replace the 4M data writing cache of sort-merge shuffle with writev system call

2021-03-17 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21777. --- Resolution: Fixed done via 61bc9429d6afde18711bcc726f077ade5e3b44a8 > Replace the 4M data writing cache of

[jira] [Updated] (FLINK-21707) Job is possible to hang when restarting a FINISHED task with POINTWISE BLOCKING consumers

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-21707: Fix Version/s: 1.12.3 1.13.0 > Job is possible to hang when restarting a FINISHED task

[jira] [Assigned] (FLINK-19614) Further optimization of sort-merge based blocking shuffle

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-19614: --- Assignee: Yingjie Cao > Further optimization of sort-merge based blocking shuffle >

[jira] [Updated] (FLINK-19614) Further optimization of sort-merge based blocking shuffle

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-19614: Fix Version/s: 1.13.0 > Further optimization of sort-merge based blocking shuffle >

[jira] [Assigned] (FLINK-20758) Use region file mechanism for shuffle data reading before we switch to managed memory

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-20758: --- Assignee: Yingjie Cao > Use region file mechanism for shuffle data reading before we switch to >

[jira] [Assigned] (FLINK-19938) Implement shuffle data read scheduling for sort-merge blocking shuffle

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-19938: --- Assignee: Yingjie Cao > Implement shuffle data read scheduling for sort-merge blocking shuffle >

[jira] [Closed] (FLINK-21778) Use heap memory instead of direct memory as index entry cache for sort-merge shuffle

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21778. --- Resolution: Fixed done via f165c7261d6f90a1390efcc3b98a00ae60a67ef3 > Use heap memory instead of direct

[jira] [Assigned] (FLINK-20757) Optimize data broadcast for sort-merge shuffle

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-20757: --- Assignee: Yingjie Cao > Optimize data broadcast for sort-merge shuffle >

[jira] [Assigned] (FLINK-21777) Replace the 4M data writing cache of sort-merge shuffle with writev system call

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21777: --- Assignee: Yingjie Cao > Replace the 4M data writing cache of sort-merge shuffle with writev system

[jira] [Assigned] (FLINK-21778) Use heap memory instead of direct memory as index entry cache for sort-merge shuffle

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21778: --- Assignee: Yingjie Cao > Use heap memory instead of direct memory as index entry cache for

[jira] [Closed] (FLINK-21707) Job is possible to hang when restarting a FINISHED task with POINTWISE BLOCKING consumers

2021-03-16 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21707. --- Resolution: Fixed fixed via: master(1.13): c1aa6b41841342ddf6e168f326a183db8a8edcac 1.12:

[jira] [Updated] (FLINK-21707) Job is possible to hang when restarting a FINISHED task with POINTWISE BLOCKING consumers

2021-03-15 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-21707: Affects Version/s: (was: 1.11.3) > Job is possible to hang when restarting a FINISHED task with

[jira] [Assigned] (FLINK-21707) Job is possible to hang when restarting a FINISHED task with POINTWISE BLOCKING consumers

2021-03-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21707: --- Assignee: Zhu Zhu > Job is possible to hang when restarting a FINISHED task with POINTWISE >

[jira] [Commented] (FLINK-21707) Job is possible to hang when restarting a FINISHED task with POINTWISE BLOCKING consumers

2021-03-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299534#comment-17299534 ] Zhu Zhu commented on FLINK-21707: - I have a fix for it already so I will take this ticket. I opened 2

[jira] [Created] (FLINK-21735) Harden JobMaster#updateTaskExecutionState()

2021-03-11 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-21735: --- Summary: Harden JobMaster#updateTaskExecutionState() Key: FLINK-21735 URL: https://issues.apache.org/jira/browse/FLINK-21735 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-21734) Allow BLOCKING result partition to be individually consumable

2021-03-11 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-21734: --- Summary: Allow BLOCKING result partition to be individually consumable Key: FLINK-21734 URL: https://issues.apache.org/jira/browse/FLINK-21734 Project: Flink Issue

[jira] [Commented] (FLINK-21707) Job is possible to hang when restarting a FINISHED task with POINTWISE BLOCKING consumers

2021-03-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299293#comment-17299293 ] Zhu Zhu commented on FLINK-21707: - Agreed to "removing this logic from the
