[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679215#comment-16679215 ] ASF GitHub Bot commented on FLINK-9869: --- TisonKun commented on issue #6345: [FLINK-9869][runtime] Send PartitionInfo in batch to Improve perfornance URL: https://github.com/apache/flink/pull/6345#issuecomment-436854489 close for no longer interest This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: TisonKun >Assignee: TisonKun >Priority: Major > Labels: pull-request-available > Fix For: 1.5.6 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679216#comment-16679216 ] ASF GitHub Bot commented on FLINK-9869: --- TisonKun closed pull request #6345: [FLINK-9869][runtime] Send PartitionInfo in batch to Improve perfornance URL: https://github.com/apache/flink/pull/6345 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/docs/_includes/generated/job_manager_configuration.html b/docs/_includes/generated/job_manager_configuration.html index 0458af24c06..83d8abb7d27 100644 --- a/docs/_includes/generated/job_manager_configuration.html +++ b/docs/_includes/generated/job_manager_configuration.html @@ -42,6 +42,11 @@ 6123 The config parameter defining the network port to connect to for communication with the job manager. Like jobmanager.rpc.address, this value is only interpreted in setups where a single JobManager with static name/address and port exists (simple standalone setups, or container setups with dynamic service name resolution). This config option is not used in many high-availability setups, when a leader-election service (like ZooKeeper) is used to elect and discover the JobManager leader from potentially multiple standby JobManagers. + +jobmanager.update-partition-info.send-interval +10 +The interval of send update-partition-info message. + jobstore.cache-size 52428800 diff --git a/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java index 1666f213d18..43091a256b2 100644 --- a/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java +++ b/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java @@ -154,6 +154,11 @@ .defaultValue(60L * 60L) .withDescription("The time in seconds after which a completed job expires and is purged from the job store."); + public static final ConfigOption UPDATE_PARTITION_INFO_SEND_INTERVAL = + key("jobmanager.update-partition-info.send-interval") + .defaultValue(10L) + .withDescription("The interval of send update-partition-info message."); + /** * The timeout in milliseconds for requesting a slot from Slot Pool. */ diff --git a/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java b/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java index 801f35a41dc..4a157f9cb60 100644 --- a/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java +++ b/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java @@ -27,18 +27,13 @@ import org.apache.flink.runtime.checkpoint.CheckpointOptions; import org.apache.flink.runtime.checkpoint.JobManagerTaskRestore; import org.apache.flink.runtime.clusterframework.types.AllocationID; -import org.apache.flink.runtime.clusterframework.types.ResourceID; import org.apache.flink.runtime.clusterframework.types.ResourceProfile; import org.apache.flink.runtime.clusterframework.types.SlotProfile; import org.apache.flink.runtime.concurrent.FutureUtils; -import org.apache.flink.runtime.deployment.InputChannelDeploymentDescriptor; import org.apache.flink.runtime.deployment.PartialInputChannelDeploymentDescriptor; -import org.apache.flink.runtime.deployment.ResultPartitionLocation; import org.apache.flink.runtime.deployment.TaskDeploymentDescriptor; import org.apache.flink.runtime.execution.ExecutionState; import org.apache.flink.runtime.instance.SlotSharingGroupId; -import org.apache.flink.runtime.io.network.ConnectionID; -import org.apache.flink.runtime.io.network.partition.ResultPartitionID; import org.apache.flink.runtime.jobmanager.scheduler.CoLocationConstraint; import org.apache.flink.runtime.jobmanager.scheduler.LocationPreferenceConstraint; import org.apache.flink.runtime.jobmanager.scheduler.ScheduledUnit; @@ -69,6 +64,8 @@ import java.util.concurrent.CompletionException; import java.util.concurrent.ConcurrentLinkedQueue; import java.util.concurrent.Executor; +import java.util.concurrent.ScheduledFuture; +import java.util.concurrent.TimeUnit; import java.util.concurrent.TimeoutException; import java.util.concurrent.atomic.AtomicReferenceFieldUpdater; import java.util.stream.Collectors; @@ -178,6 +175,10 @@ // + private final Object updatePartitionLock =
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582149#comment-16582149 ] ASF GitHub Bot commented on FLINK-9869: --- TisonKun commented on issue #6345: [FLINK-9869][runtime] Send PartitionInfo in batch to Improve perfornance URL: https://github.com/apache/flink/pull/6345#issuecomment-413463126 ping @tillrohrmann FYI, travis fails on `BucketingSinkFaultToleranceITCase`, but I run ~1000 times locally and could not reproduce it. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.3 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574240#comment-16574240 ] ASF GitHub Bot commented on FLINK-9869: --- TisonKun commented on issue #6345: [FLINK-9869][runtime] Send PartitionInfo in batch to Improve perfornance URL: https://github.com/apache/flink/pull/6345#issuecomment-411627672 cc @GJL @twalthr This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.3 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561354#comment-16561354 ] ASF GitHub Bot commented on FLINK-9869: --- TisonKun commented on issue #6345: [FLINK-9869][runtime] Send PartitionInfo in batch to Improve perfornance URL: https://github.com/apache/flink/pull/6345#issuecomment-408727224 @tillrohrmann so i am here again. please review when you are free, thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.3 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561355#comment-16561355 ] ASF GitHub Bot commented on FLINK-9869: --- TisonKun edited a comment on issue #6345: [FLINK-9869][runtime] Send PartitionInfo in batch to Improve perfornance URL: https://github.com/apache/flink/pull/6345#issuecomment-408727224 @tillrohrmann so i am here again and do the rebase. please review when you are free, thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.3 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547929#comment-16547929 ] ASF GitHub Bot commented on FLINK-9869: --- Github user tison1 commented on the issue: https://github.com/apache/flink/pull/6345 OK. This PR is about performance improvement. I will try to give out a benchmark, but since it is inspired by our own batch table tasks, it might take time to give one. Though since this PR concurrently send partition info and deploy task in another thread, it theoretically does good. Keep on on Flink 1.6! I will nudge you guys to review this one, though(laughed) > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.2 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547735#comment-16547735 ] ASF GitHub Bot commented on FLINK-9869: --- Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/6345 Thanks for opening this PR @tison1. The Flink community is currently preparing the Flink 1.6 release and, thus, it could take a bit longer until someone reviews your PR. Please bear with us until then. Thanks a lot! > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.2 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546628#comment-16546628 ] ASF GitHub Bot commented on FLINK-9869: --- Github user tison1 commented on the issue: https://github.com/apache/flink/pull/6345 cc @sihuazhou > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.2 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546030#comment-16546030 ] ASF GitHub Bot commented on FLINK-9869: --- Github user tison1 commented on the issue: https://github.com/apache/flink/pull/6345 cc @tillrohrmann @fhueske > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.2 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to Improve perfornance
[ https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546007#comment-16546007 ] ASF GitHub Bot commented on FLINK-9869: --- GitHub user tison1 opened a pull request: https://github.com/apache/flink/pull/6345 [FLINK-9869] Send PartitionInfo in batch to Improve perfornance ## What is the purpose of the change Current we send partition info as soon as one arrive. we could `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve performance. ... also improve task deployment ## Brief change log - `Execution` - now deploy task in another thread - as describe above, now we first `cachePartitionInfo` and then `sendPartitionInfoAsync` - add a config option `JobManagerOptions#UPDATE_PARTITION_INFO_SEND_INTERVAL`, which config the time window for cachePartitionInfo - update `ExecutionGraphDeploymentTest` and `ExecutionVertexDeploymentTest`, which also tests changes above ## Verifying this change This change is already covered by existing tests, such as `ExecutionGraphDeploymentTest` and `ExecutionVertexDeploymentTest` ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no, it's internal) You can merge this pull request into a Git repository by running: $ git pull https://github.com/tison1/flink partition-improve Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/6345.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6345 commit ca9ffbb99e91a8415d7469cba4bf2075615edc0d Author: 陈梓立 Date: 2018-07-17T04:11:36Z [FLINK-9869] Send PartitionInfo in batch to Improve perfornance > Send PartitionInfo in batch to Improve perfornance > -- > > Key: FLINK-9869 > URL: https://issues.apache.org/jira/browse/FLINK-9869 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Affects Versions: 1.5.1 >Reporter: 陈梓立 >Assignee: 陈梓立 >Priority: Major > Labels: pull-request-available > Fix For: 1.5.2 > > > ... current we send partition info as soon as one arrive. we could > `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)