[jira] [Comment Edited] (FLINK-14164) Add a metric to show failover count regarding fine grained recovery

2019-10-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956735#comment-16956735 ] Zhu Zhu edited comment on FLINK-14164 at 10/30/19 2:33 AM: --- Ma

[jira] [Comment Edited] (FLINK-14164) Add a metric to show failover count regarding fine grained recovery

2019-10-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956735#comment-16956735 ] Zhu Zhu edited comment on FLINK-14164 at 10/30/19 2:35 AM: --- Ma

[jira] [Updated] (FLINK-14375) Refactor task state updating to only notify scheduler about state changes that really happened

2019-10-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14375: Description: The DefaultScheduler triggers failover if a task is notified to be FAILED. However, in the c

[jira] [Created] (FLINK-14566) Enable ResourceSpec to get/set whether managed memory is used

2019-10-30 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14566: --- Summary: Enable ResourceSpec to get/set whether managed memory is used Key: FLINK-14566 URL: https://issues.apache.org/jira/browse/FLINK-14566 Project: Flink Issue Ty

[jira] [Updated] (FLINK-14566) Enable ResourceSpec to get/set whether managed memory is used

2019-10-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14566: Description: To calculate managed memory fraction for an operator with UNKNOWN resources, we need to know

[jira] [Updated] (FLINK-14566) Enable ResourceSpec to get/set whether managed memory is used

2019-10-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14566: Description: To calculate managed memory fraction for an operator with UNKNOWN resources, we need to know

[jira] [Updated] (FLINK-14566) Enable ResourceSpec to get/set whether managed memory is used

2019-10-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14566: Description: To calculate managed memory fraction for an operator with UNKNOWN resources, we need to know

[jira] [Updated] (FLINK-14439) RestartPipelinedRegionStrategy leverage tracked partition availability for better failover experience in DefaultScheduler

2019-10-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14439: Description: In current region failover when using DefaultScheduler, most of the input result partition s

[jira] [Updated] (FLINK-14439) RestartPipelinedRegionStrategy leverage tracked partition availability for better failover experience in DefaultScheduler

2019-10-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14439: Description: In current region failover when using DefaultScheduler, most of the input result partition s

[jira] [Updated] (FLINK-14164) Add a metric to show failover count regarding fine grained recovery

2019-10-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14164: Description: Previously Flink uses restart all strategy to recover jobs from failures. And the metric "fu

[jira] [Updated] (FLINK-14164) Add a metric to show failover count regarding fine grained recovery

2019-10-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14164: Description: Previously Flink uses restart all strategy to recover jobs from failures. And the metric "fu

[jira] [Commented] (FLINK-14536) Make clear the way to aggregate specified cpuCores resources

2019-10-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963744#comment-16963744 ] Zhu Zhu commented on FLINK-14536: - Thanks [~zjwang] for the explanation. Agreed that us

[jira] [Commented] (FLINK-14566) Enable ResourceSpec to get/set whether managed memory is used

2019-10-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963824#comment-16963824 ] Zhu Zhu commented on FLINK-14566: - I think we can do it if there is assumption that an o

[jira] [Commented] (FLINK-14566) Enable ResourceSpec to get/set whether managed memory is used

2019-10-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963844#comment-16963844 ] Zhu Zhu commented on FLINK-14566: - [~ykt836] When saying "little", I mean even "no" in t

[jira] [Comment Edited] (FLINK-14566) Enable ResourceSpec to get/set whether managed memory is used

2019-10-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963844#comment-16963844 ] Zhu Zhu edited comment on FLINK-14566 at 10/31/19 10:36 AM:

[jira] [Created] (FLINK-14594) Fix matching logics of ResourceSpec/ResourceProfile/Resource considering double values

2019-11-03 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14594: --- Summary: Fix matching logics of ResourceSpec/ResourceProfile/Resource considering double values Key: FLINK-14594 URL: https://issues.apache.org/jira/browse/FLINK-14594 Project:

[jira] [Commented] (FLINK-14575) Wrong (parent-first) class loader during serialization while submitting jobs

2019-11-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967184#comment-16967184 ] Zhu Zhu commented on FLINK-14575: - I guess it is related to FLINK-14037(the fix in progr

[jira] [Commented] (FLINK-14575) Wrong (parent-first) class loader during serialization while submitting jobs

2019-11-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967191#comment-16967191 ] Zhu Zhu commented on FLINK-14575: - And even more reported issues FLINK-14598, FLINK-1374

[jira] [Commented] (FLINK-14572) BlobsCleanupITCase failed in Travis stage core - scheduler_ng

2019-11-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967225#comment-16967225 ] Zhu Zhu commented on FLINK-14572: - From the root cause, the case fails when trying to g

[jira] [Created] (FLINK-14606) Simplify params of Execution#processFail

2019-11-04 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14606: --- Summary: Simplify params of Execution#processFail Key: FLINK-14606 URL: https://issues.apache.org/jira/browse/FLINK-14606 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's totally released

2019-11-04 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14607: --- Summary: SharedSlot cannot fulfill pending slot requests before it's totally released Key: FLINK-14607 URL: https://issues.apache.org/jira/browse/FLINK-14607 Project: Flink

[jira] [Updated] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14607: Summary: SharedSlot cannot fulfill pending slot requests before it's completely released (was: SharedSlot

[jira] [Updated] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14607: Description: Currently a pending request can only be fulfilled when a physical slot({{AllocatedSlot}}) be

[jira] [Updated] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14607: Description: Currently a pending request can only be fulfilled when a physical slot({{AllocatedSlot}}) be

[jira] [Updated] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14607: Description: Currently a pending request can only be fulfilled when a physical slot({{AllocatedSlot}}) be

[jira] [Updated] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14607: Description: Currently a pending request can only be fulfilled when a physical slot({{AllocatedSlot}}) be

[jira] [Commented] (FLINK-14059) Introduce option allVerticesInSameSlotSharingGroupByDefault in ExecutionConfig

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967381#comment-16967381 ] Zhu Zhu commented on FLINK-14059: - Agreed that we should put it into StreamGraph if it i

[jira] [Commented] (FLINK-14374) Enable RegionFailoverITCase to pass with scheduler NG

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967416#comment-16967416 ] Zhu Zhu commented on FLINK-14374: - If not using {{FailingRestartStrategy}}, I guess the

[jira] [Created] (FLINK-14611) Move allVerticesInSameSlotSharingGroupByDefault from ExecutionConfig to StreamGraph

2019-11-05 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14611: --- Summary: Move allVerticesInSameSlotSharingGroupByDefault from ExecutionConfig to StreamGraph Key: FLINK-14611 URL: https://issues.apache.org/jira/browse/FLINK-14611 Project: Fl

[jira] [Commented] (FLINK-14059) Introduce option allVerticesInSameSlotSharingGroupByDefault in ExecutionConfig

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967418#comment-16967418 ] Zhu Zhu commented on FLINK-14059: - Thanks for the suggestion [~dwysakowicz]. I think you

[jira] [Commented] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967449#comment-16967449 ] Zhu Zhu commented on FLINK-14607: - With FLIP-53, it may happen for a *logical region* in

[jira] [Comment Edited] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967449#comment-16967449 ] Zhu Zhu edited comment on FLINK-14607 at 11/5/19 11:53 AM: --- Wi

[jira] [Commented] (FLINK-14611) Move allVerticesInSameSlotSharingGroupByDefault from ExecutionConfig to StreamGraph

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967505#comment-16967505 ] Zhu Zhu commented on FLINK-14611: - That's fine. I will start the work soon. > Move allV

[jira] [Updated] (FLINK-14375) Refactor task state updating to only notify scheduler about state changes that really happened

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14375: Description: The DefaultScheduler triggers failover if a task is notified to be FAILED. However, in the c

[jira] [Updated] (FLINK-14375) Refactor task state updating to only notify scheduler about state changes that really happened

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14375: Description: The DefaultScheduler triggers failover if a task is notified to be FAILED. However, in the c

[jira] [Updated] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14607: Priority: Minor (was: Major) > SharedSlot cannot fulfill pending slot requests before it's completely >

[jira] [Updated] (FLINK-14375) Avoid to notify ineffective state updates to scheduler

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14375: Summary: Avoid to notify ineffective state updates to scheduler (was: Refactor task state updating to onl

[jira] [Commented] (FLINK-14607) SharedSlot cannot fulfill pending slot requests before it's completely released

2019-11-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968042#comment-16968042 ] Zhu Zhu commented on FLINK-14607: - Thanks for the explanation. The slot sharing can be s

[jira] [Commented] (FLINK-14611) Move allVerticesInSameSlotSharingGroupByDefault from ExecutionConfig to StreamGraph

2019-11-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968240#comment-16968240 ] Zhu Zhu commented on FLINK-14611: - [~dwysakowicz] would you assign this ticket to me and

[jira] [Created] (FLINK-14641) Fix description of metric `fullRestarts`

2019-11-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14641: --- Summary: Fix description of metric `fullRestarts` Key: FLINK-14641 URL: https://issues.apache.org/jira/browse/FLINK-14641 Project: Flink Issue Type: Bug Comp

[jira] [Created] (FLINK-14643) Deprecate metric `fullRestarts`

2019-11-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14643: --- Summary: Deprecate metric `fullRestarts` Key: FLINK-14643 URL: https://issues.apache.org/jira/browse/FLINK-14643 Project: Flink Issue Type: Bug Components: R

[jira] [Updated] (FLINK-14643) Deprecate metric `fullRestarts`

2019-11-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14643: Issue Type: Improvement (was: Bug) > Deprecate metric `fullRestarts` > --- >

[jira] [Commented] (FLINK-14643) Deprecate metric `fullRestarts`

2019-11-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969276#comment-16969276 ] Zhu Zhu commented on FLINK-14643: - Yes. Would you assign it to me [~chesnay]? > Depreca

[jira] [Commented] (FLINK-14641) Fix description of metric `fullRestarts`

2019-11-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969275#comment-16969275 ] Zhu Zhu commented on FLINK-14641: - Yes. Would you assign it to me [~chesnay]? > Fix des

[jira] [Updated] (FLINK-14641) Fix description of metric `fullRestarts`

2019-11-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14641: Affects Version/s: 1.10.0 > Fix description of metric `fullRestarts` > ---

[jira] [Updated] (FLINK-14641) Fix description of metric `fullRestarts`

2019-11-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14641: Affects Version/s: (was: 1.10.0) > Fix description of metric `fullRestarts` >

[jira] [Updated] (FLINK-14641) Fix description of metric `fullRestarts`

2019-11-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14641: Fix Version/s: (was: 1.10.0) > Fix description of metric `fullRestarts` >

[jira] [Updated] (FLINK-14641) Fix description of metric `fullRestarts`

2019-11-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14641: Affects Version/s: 1.10.0 > Fix description of metric `fullRestarts` > ---

[jira] [Updated] (FLINK-14641) Fix description of metric `fullRestarts`

2019-11-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14641: Fix Version/s: 1.10.0 > Fix description of metric `fullRestarts` > ---

[jira] [Commented] (FLINK-14641) Fix description of metric `fullRestarts`

2019-11-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969877#comment-16969877 ] Zhu Zhu commented on FLINK-14641: - [~chesnay] not quite sure whether we can apply this

[jira] [Commented] (FLINK-14606) Simplify params of Execution#processFail

2019-11-08 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969932#comment-16969932 ] Zhu Zhu commented on FLINK-14606: - Thanks [~chesnay] for sharing the thoughts. My bad t

[jira] [Comment Edited] (FLINK-14606) Simplify params of Execution#processFail

2019-11-08 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969932#comment-16969932 ] Zhu Zhu edited comment on FLINK-14606 at 11/8/19 8:15 AM: -- Than

[jira] [Commented] (FLINK-14164) Add a metric to show failover count regarding fine grained recovery

2019-11-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16970998#comment-16970998 ] Zhu Zhu commented on FLINK-14164: - Compared to {{numberOfRestarts}} metric, {{fullRestar

[jira] [Created] (FLINK-14701) Slot leaks if SharedSlotOversubscribedException happens

2019-11-10 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14701: --- Summary: Slot leaks if SharedSlotOversubscribedException happens Key: FLINK-14701 URL: https://issues.apache.org/jira/browse/FLINK-14701 Project: Flink Issue Type: Bug

[jira] [Commented] (FLINK-14701) Slot leaks if SharedSlotOversubscribedException happens

2019-11-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971359#comment-16971359 ] Zhu Zhu commented on FLINK-14701: - [~chesnay], what do you think of the issue and the pr

[jira] [Commented] (FLINK-14674) some tpc-ds query hang in scheduled stage for long time

2019-11-10 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971363#comment-16971363 ] Zhu Zhu commented on FLINK-14674: - The job hangs due to FLINK-14701, which happens if {{

[jira] [Updated] (FLINK-14701) Slot leaks if SharedSlotOversubscribedException happens

2019-11-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14701: Priority: Critical (was: Blocker) > Slot leaks if SharedSlotOversubscribedException happens > ---

[jira] [Commented] (FLINK-14164) Add a metric to show failover count regarding fine grained recovery

2019-11-11 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971413#comment-16971413 ] Zhu Zhu commented on FLINK-14164: - I misunderstood the new problem. [~gjy] You are right

[jira] [Created] (FLINK-14708) Introduce full restarts failover strategy for NG scheduler

2019-11-11 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14708: --- Summary: Introduce full restarts failover strategy for NG scheduler Key: FLINK-14708 URL: https://issues.apache.org/jira/browse/FLINK-14708 Project: Flink Issue Type:

[jira] [Created] (FLINK-14733) Introduce ResourceProfile builder to enable flexible building

2019-11-12 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14733: --- Summary: Introduce ResourceProfile builder to enable flexible building Key: FLINK-14733 URL: https://issues.apache.org/jira/browse/FLINK-14733 Project: Flink Issue Ty

[jira] [Updated] (FLINK-14733) Introduce ResourceProfile builder to enable flexible building

2019-11-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14733: Description: The ResourceProfile constructors may accept values in raw types (double/int) or advanced typ

[jira] [Created] (FLINK-14734) Add a ResourceSpec in SlotSharingGroup to describe its overall resources

2019-11-12 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14734: --- Summary: Add a ResourceSpec in SlotSharingGroup to describe its overall resources Key: FLINK-14734 URL: https://issues.apache.org/jira/browse/FLINK-14734 Project: Flink

[jira] [Updated] (FLINK-14734) Add a ResourceSpec in SlotSharingGroup to describe its overall resources

2019-11-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14734: Description: To enable FLINK-14314 to allocate task slot regarding the share slot resources. We need a R

[jira] [Updated] (FLINK-14314) Allocate shared slot resources respecting the resources of all vertices in the group

2019-11-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14314: Description: With FLINK-14058, it is assumed that a shared slot should be large enough to be used by one

[jira] [Updated] (FLINK-14734) Add a ResourceSpec in SlotSharingGroup to describe its overall resources

2019-11-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14734: Description: To enable FLINK-14314 to allocate task slot regarding the share slot resources. We need a R

[jira] [Updated] (FLINK-14062) Set managed memory fractions according to slot sharing groups

2019-11-12 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14062: Description: * For operators with specified {{ResourceSpecs}}, calculate fractions according to operators

[jira] [Updated] (FLINK-14131) Support configurable failover strategy for scheduler NG

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14131: Labels: pull-request-available (was: ) > Support configurable failover strategy for scheduler NG > --

[jira] [Updated] (FLINK-14682) Enable AbstractTaskManagerProcessFailureRecoveryTest to pass with new DefaultScheduler

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14682: Labels: pull-request-available (was: ) > Enable AbstractTaskManagerProcessFailureRecoveryTest to pass wit

[jira] [Commented] (FLINK-14735) Improve batch schedule check input consumable performance

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973508#comment-16973508 ] Zhu Zhu commented on FLINK-14735: - This issue happens because of the high computing comp

[jira] [Comment Edited] (FLINK-14735) Improve batch schedule check input consumable performance

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973508#comment-16973508 ] Zhu Zhu edited comment on FLINK-14735 at 11/13/19 4:47 PM: --- Th

[jira] [Comment Edited] (FLINK-14735) Improve batch schedule check input consumable performance

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973508#comment-16973508 ] Zhu Zhu edited comment on FLINK-14735 at 11/13/19 4:47 PM: --- Th

[jira] [Comment Edited] (FLINK-14735) Improve batch schedule check input consumable performance

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973508#comment-16973508 ] Zhu Zhu edited comment on FLINK-14735 at 11/14/19 2:49 AM: --- Th

[jira] [Comment Edited] (FLINK-14735) Improve batch schedule check input consumable performance

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973508#comment-16973508 ] Zhu Zhu edited comment on FLINK-14735 at 11/14/19 2:52 AM: --- Th

[jira] [Commented] (FLINK-14733) Introduce ResourceProfile builder to enable flexible building

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973911#comment-16973911 ] Zhu Zhu commented on FLINK-14733: - Thanks [~AT-Fieldless] for the offering! However, we

[jira] [Commented] (FLINK-14766) Remove volatile variable in ExecutionVertex

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973986#comment-16973986 ] Zhu Zhu commented on FLINK-14766: - Sounds good to me. cc [~gjy] > Remove volatile varia

[jira] [Commented] (FLINK-14594) Fix matching logics of ResourceSpec/ResourceProfile/Resource considering double values

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973994#comment-16973994 ] Zhu Zhu commented on FLINK-14594: - Thanks for the proposal [~azagrebin]. With your propo

[jira] [Commented] (FLINK-14766) Remove volatile variable in ExecutionVertex

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973996#comment-16973996 ] Zhu Zhu commented on FLINK-14766: - [~wind_ljy] According to Flink bylaw, please wait unt

[jira] [Comment Edited] (FLINK-14766) Remove volatile variable in ExecutionVertex

2019-11-13 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973996#comment-16973996 ] Zhu Zhu edited comment on FLINK-14766 at 11/14/19 7:53 AM: --- [~

[jira] [Commented] (FLINK-14735) Improve batch schedule check input consumable performance

2019-11-14 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974164#comment-16974164 ] Zhu Zhu commented on FLINK-14735: - [~sewen], one individual RPC can take very long. If t

[jira] [Comment Edited] (FLINK-14735) Improve batch schedule check input consumable performance

2019-11-14 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974164#comment-16974164 ] Zhu Zhu edited comment on FLINK-14735 at 11/14/19 11:39 AM:

[jira] [Commented] (FLINK-14735) Improve batch schedule check input consumable performance

2019-11-14 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974209#comment-16974209 ] Zhu Zhu commented on FLINK-14735: - Here's a sample test case([testInputConstraintALLPerf

[jira] [Commented] (FLINK-16728) Taskmanager dies after job got stuck and canceling fails

2020-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067346#comment-17067346 ] Zhu Zhu commented on FLINK-16728: - Hi [~lilyevsky], it is intended to shut down a Task

[jira] [Commented] (FLINK-16560) StreamExecutionEnvironment configuration is empty when building program via PackagedProgramUtils#createJobGraph

2020-03-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072363#comment-17072363 ] Zhu Zhu commented on FLINK-16560: - [~aljoscha] do you think this is a blocker issue to f

[jira] [Commented] (FLINK-15639) Support to set toleration for jobmanager and taskmanger

2020-04-01 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072657#comment-17072657 ] Zhu Zhu commented on FLINK-15639: - I have assigned the ticket to you [~fly_in_gis]. > S

[jira] [Assigned] (FLINK-15639) Support to set toleration for jobmanager and taskmanger

2020-04-01 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-15639: --- Assignee: Yang Wang > Support to set toleration for jobmanager and taskmanger > ---

[jira] [Commented] (FLINK-16728) Taskmanager dies after job got stuck and canceling fails

2020-04-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076949#comment-17076949 ] Zhu Zhu commented on FLINK-16728: - Flink forces the TM to shutdown because some tasks on

[jira] [Updated] (FLINK-16960) Add PipelinedRegion Interface to Topology

2020-04-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16960: Description: {code} interface Topology { ... Iterable getAllPipelinedRegions(); PipelinedRegion getPi

[jira] [Created] (FLINK-17014) Implement PipelinedRegionSchedulingStrategy

2020-04-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17014: --- Summary: Implement PipelinedRegionSchedulingStrategy Key: FLINK-17014 URL: https://issues.apache.org/jira/browse/FLINK-17014 Project: Flink Issue Type: Sub-task

[jira] [Updated] (FLINK-16960) Add PipelinedRegion Interface to Topology

2020-04-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16960: Component/s: Runtime / Coordination > Add PipelinedRegion Interface to Topology >

[jira] [Updated] (FLINK-16960) Add PipelinedRegion Interface to Topology

2020-04-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16960: Affects Version/s: 1.11.0 > Add PipelinedRegion Interface to Topology > --

[jira] [Created] (FLINK-17016) Use PipelinedRegionSchedulingStrategy in DefaultScheduler (for Blink Planner)

2020-04-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17016: --- Summary: Use PipelinedRegionSchedulingStrategy in DefaultScheduler (for Blink Planner) Key: FLINK-17016 URL: https://issues.apache.org/jira/browse/FLINK-17016 Project: Flink

[jira] [Created] (FLINK-17017) Implement Bulk Slot Allocation in SchedulerImpl

2020-04-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17017: --- Summary: Implement Bulk Slot Allocation in SchedulerImpl Key: FLINK-17017 URL: https://issues.apache.org/jira/browse/FLINK-17017 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-17018) Use Bulk Slot Allocation in DefaultExecutionSlotAllocator

2020-04-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17018: --- Summary: Use Bulk Slot Allocation in DefaultExecutionSlotAllocator Key: FLINK-17018 URL: https://issues.apache.org/jira/browse/FLINK-17018 Project: Flink Issue Type: S

[jira] [Created] (FLINK-17019) Implement FIFO Physical Slot Assignment in SlotPoolImpl

2020-04-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17019: --- Summary: Implement FIFO Physical Slot Assignment in SlotPoolImpl Key: FLINK-17019 URL: https://issues.apache.org/jira/browse/FLINK-17019 Project: Flink Issue Type: Sub

[jira] [Created] (FLINK-17020) Introduce GlobalDataExchangeMode for JobGraph Generation

2020-04-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17020: --- Summary: Introduce GlobalDataExchangeMode for JobGraph Generation Key: FLINK-17020 URL: https://issues.apache.org/jira/browse/FLINK-17020 Project: Flink Issue Type: Su

[jira] [Created] (FLINK-17021) Blink Planner set GlobalDataExchangeMode

2020-04-06 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17021: --- Summary: Blink Planner set GlobalDataExchangeMode Key: FLINK-17021 URL: https://issues.apache.org/jira/browse/FLINK-17021 Project: Flink Issue Type: Sub-task

[jira] [Updated] (FLINK-16430) FLIP-119 Pipelined Region Scheduling

2020-04-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16430: Summary: FLIP-119 Pipelined Region Scheduling (was: Pipelined Region Scheduling) > FLIP-119 Pipelined Re

[jira] [Updated] (FLINK-16430) FLIP-119 Pipelined Region Scheduling

2020-04-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16430: Description: Pipelined region scheduling is targeting to allow batch jobs with PIPELINED data exchanges t

[jira] [Assigned] (FLINK-14162) Unify SchedulerOperations#allocateSlotsAndDeploy implementation for all scheduling strategies

2020-04-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-14162: --- Assignee: Zhu Zhu > Unify SchedulerOperations#allocateSlotsAndDeploy implementation for all > sche

[jira] [Assigned] (FLINK-14234) All partition consumable events should be notified to SchedulingStrategy (SchedulerNG)

2020-04-07 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-14234: --- Assignee: Zhu Zhu > All partition consumable events should be notified to SchedulingStrategy > (Sc
