[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991889#comment-16991889 ]

Jonathan Hung commented on MAPREDUCE-7208:

Removing 2.11.0 fix version after branch-2 -> branch-2.10 rename

> Tuning TaskRuntimeEstimator
>
> Key: MAPREDUCE-7208
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7208
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: MAPREDUCE-7208-branch-2.10.001.patch,
> MAPREDUCE-7208-branch-2.10.002.patch, MAPREDUCE-7208.001.patch,
> MAPREDUCE-7208.002.patch, MAPREDUCE-7208.003.patch, MAPREDUCE-7208.004.patch,
> smoothing-exponential.md
>
> By default, MR uses LegacyTaskRuntimeEstimator to get an estimate of the
> runtime. The estimator does not adjust dynamically to the progress rate of
> the tasks. On the other hand, the behavior of the existing alternative,
> "ExponentiallySmoothedTaskRuntimeEstimator", is unpredictable.
>
> There are several dimensions along which to improve the exponential implementation:
> # Exponential smoothing needs a warmup period. Otherwise, the estimate will
> be affected by the initial values.
> # Using a single smoothing factor (Lambda) does not work well for all the
> tasks. To increase the level of smoothing across the majority of tasks, we
> need the flexibility to adjust the smoothing factor dynamically, based on
> the history of the task progress.
> # Design-wise, it is better to separate the statistical model from
> the MR interface. We need a way to evaluate estimators statistically,
> without the need to run MR. For example, an estimator can be evaluated as a
> black box by using a stream of raw data as input and testing the accuracy of
> the generated stream of estimates.
> # The exponential estimator speculates frequently, yet it fails to detect
> slowing tasks. As a result, a taskAttempt that makes no progress won't
> trigger a new speculation.
>
> The file [^smoothing-exponential.md] describes how the simple exponential
> smoothing factor works.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
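To make the points in the issue description above concrete, here is a minimal Java sketch of a smoothed progress-rate forecaster with a warmup period and a stagnation check. It is an illustration under stated assumptions, not the code from the attached patches: the class `SmoothedRateSketch`, its method names, the time-based weight `1 - exp(-dt / lambdaMs)`, and every parameter value are hypothetical. Because the class has no dependency on MapReduce, it can be driven by a plain stream of (timestamp, progress) pairs, which is the kind of black-box evaluation the description asks for.

```java
/**
 * Hypothetical sketch of a progress-rate forecaster. All names are
 * illustrative and are not taken from the MAPREDUCE-7208 patch.
 */
public class SmoothedRateSketch {
    private final double lambdaMs;      // assumed time constant that controls smoothing
    private final int warmupSamples;    // observations required before estimating
    private int seen = 0;               // number of accepted observations
    private double smoothedRate = 0.0;  // progress fraction per millisecond
    private double lastProgress = 0.0;
    private long lastTimeMs;
    private long lastChangeMs;          // last time the progress value increased

    public SmoothedRateSketch(double lambdaMs, int warmupSamples, long startTimeMs) {
        this.lambdaMs = lambdaMs;
        this.warmupSamples = warmupSamples;
        this.lastTimeMs = startTimeMs;
        this.lastChangeMs = startTimeMs;
    }

    /** Feeds one (timestamp, progress in [0, 1]) observation. */
    public void update(long timeMs, double progress) {
        long dt = timeMs - lastTimeMs;
        if (dt <= 0) {
            return; // ignore out-of-order or duplicate timestamps
        }
        double rate = (progress - lastProgress) / dt;
        // The weight grows with the elapsed time, so a sample that arrives
        // after a long gap counts for more than one that arrives quickly.
        double alpha = 1.0 - Math.exp(-dt / lambdaMs);
        smoothedRate = (seen == 0) ? rate : alpha * rate + (1.0 - alpha) * smoothedRate;
        if (progress > lastProgress) {
            lastChangeMs = timeMs;
        }
        seen++;
        lastProgress = progress;
        lastTimeMs = timeMs;
    }

    /** Returns the estimated remaining time in ms, or -1 during warmup or with no positive rate. */
    public long estimateRemainingMs() {
        if (seen < warmupSamples || smoothedRate <= 0.0) {
            return -1L;
        }
        return Math.round((1.0 - lastProgress) / smoothedRate);
    }

    /** Returns true if progress has not increased for at least thresholdMs. */
    public boolean isStagnated(long nowMs, long thresholdMs) {
        return nowMs - lastChangeMs >= thresholdMs;
    }
}
```

A caller would feed each progress report to `update`, ignore the estimate while it is `-1`, and treat `isStagnated` as a separate signal, since a smoothed rate only decays toward zero and never reports a stall on its own.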
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967913#comment-16967913 ]

Ahmed Hussein commented on MAPREDUCE-7208:

Thanks [~jeagles]. I reviewed the errors from the 2.10 patch run. They are unrelated unit-test timeouts.

> Tuning TaskRuntimeEstimator
>
> Key: MAPREDUCE-7208
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1, 2.11.0
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967872#comment-16967872 ]

Hudson commented on MAPREDUCE-7208:

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17610 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17610/])
MAPREDUCE-7208. Tuning TaskRuntimeEstimator. (Ahmed Hussein via jeagles) (jeagles: rev ed302f1fed6d124d682486d24dae958946dba9be)
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/speculate/DefaultSpeculator.java
* (add) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/speculate/forecast/TestSimpleExponentialForecast.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/speculate/DataStatistics.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestSpeculativeExecutionWithMRApp.java
* (add) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/speculate/forecast/SimpleExponentialSmoothing.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/speculate/StartEndTimesBase.java
* (add) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestSpeculativeExecOnCluster.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* (add) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/speculate/SimpleExponentialTaskRuntimeEstimator.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/speculate/TaskRuntimeEstimator.java

> Tuning TaskRuntimeEstimator
>
> Key: MAPREDUCE-7208
> Fix For: 3.3.0
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967819#comment-16967819 ]

Hadoop QA commented on MAPREDUCE-7208:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 18m 42s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || branch-2.10 Compile Tests ||
| 0 | mvndep | 0m 29s | Maven dependency ordering for branch |
| +1 | mvninstall | 9m 3s | branch-2.10 passed |
| +1 | compile | 1m 30s | branch-2.10 passed with JDK v1.7.0_95 |
| +1 | compile | 1m 19s | branch-2.10 passed with JDK v1.8.0_222 |
| +1 | checkstyle | 0m 42s | branch-2.10 passed |
| +1 | mvnsite | 1m 54s | branch-2.10 passed |
| +1 | findbugs | 2m 21s | branch-2.10 passed |
| +1 | javadoc | 1m 23s | branch-2.10 passed with JDK v1.7.0_95 |
| +1 | javadoc | 1m 10s | branch-2.10 passed with JDK v1.8.0_222 |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 39s | the patch passed |
| +1 | compile | 1m 40s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 1m 40s | the patch passed |
| +1 | compile | 1m 17s | the patch passed with JDK v1.8.0_222 |
| +1 | javac | 1m 17s | the patch passed |
| -0 | checkstyle | 0m 42s | hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 3 new + 679 unchanged - 2 fixed = 682 total (was 681) |
| +1 | mvnsite | 1m 39s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 38s | the patch passed |
| +1 | javadoc | 1m 7s | the patch passed with JDK v1.7.0_95 |
| +1 | javadoc | 0m 57s | the patch passed with JDK v1.8.0_222 |
|| || || || Other Tests ||
| +1 | unit | 2m 58s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | unit | 9m 9s | hadoop-mapreduce-client-app in the patch passed. |
| -1 | unit | 116m 46s | hadoop-mapreduce-client-jobclient in the patch failed. |
| -1 | asflicense | 0m 34s | The patch generated 1 ASF License warnings. |
| | | 182m 3s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:1c7ae55d7d3 |
| JIRA
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967017#comment-16967017 ]

Hadoop QA commented on MAPREDUCE-7208:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 15m 26s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || branch-2.10 Compile Tests ||
| 0 | mvndep | 0m 55s | Maven dependency ordering for branch |
| +1 | mvninstall | 12m 1s | branch-2.10 passed |
| +1 | compile | 1m 23s | branch-2.10 passed with JDK v1.7.0_95 |
| +1 | compile | 1m 12s | branch-2.10 passed with JDK v1.8.0_222 |
| +1 | checkstyle | 0m 41s | branch-2.10 passed |
| +1 | mvnsite | 1m 42s | branch-2.10 passed |
| +1 | findbugs | 2m 9s | branch-2.10 passed |
| +1 | javadoc | 1m 11s | branch-2.10 passed with JDK v1.7.0_95 |
| +1 | javadoc | 1m 3s | branch-2.10 passed with JDK v1.8.0_222 |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| -1 | mvninstall | 0m 16s | hadoop-mapreduce-client-app in the patch failed. |
| -1 | mvninstall | 0m 23s | hadoop-mapreduce-client-jobclient in the patch failed. |
| -1 | compile | 0m 49s | hadoop-mapreduce-client in the patch failed with JDK v1.7.0_95. |
| -1 | javac | 0m 49s | hadoop-mapreduce-client in the patch failed with JDK v1.7.0_95. |
| -1 | compile | 0m 42s | hadoop-mapreduce-client in the patch failed with JDK v1.8.0_222. |
| -1 | javac | 0m 42s | hadoop-mapreduce-client in the patch failed with JDK v1.8.0_222. |
| -0 | checkstyle | 0m 37s | hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 2 new + 678 unchanged - 2 fixed = 680 total (was 680) |
| -1 | mvnsite | 0m 18s | hadoop-mapreduce-client-app in the patch failed. |
| -1 | mvnsite | 0m 24s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 0m 16s | hadoop-mapreduce-client-app in the patch failed. |
| -1 | findbugs | 0m 21s | hadoop-mapreduce-client-jobclient in the patch failed. |
| -1 | javadoc | 0m 14s | hadoop-mapreduce-client-app in the patch failed with JDK v1.7.0_95. |
| +1 | javadoc | 0m 52s | the patch passed with JDK v1.8.0_222 |
|| || || || Other Tests ||
| +1 | unit | 2m 52s | hadoop-mapreduce-client-core in the patch
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16966790#comment-16966790 ]

Jonathan Turner Eagles commented on MAPREDUCE-7208:

+1. I will plan to commit this back to branch-2.10. Thanks for this contribution, [~ahussein].
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16966707#comment-16966707 ]

Ahmed Hussein commented on MAPREDUCE-7208:

The failed test case is not related to the patch.
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16965057#comment-16965057 ]

Ahmed Hussein commented on MAPREDUCE-7208:

The {{TestJobSplitWriterWithEC}} failure seems unrelated to the patch. I will investigate further before confirming that it is a flaky test.
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16965052#comment-16965052 ]

Hadoop QA commented on MAPREDUCE-7208:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 32s | Maven dependency ordering for branch |
| +1 | mvninstall | 18m 56s | trunk passed |
| +1 | compile | 1m 49s | trunk passed |
| +1 | checkstyle | 0m 54s | trunk passed |
| +1 | mvnsite | 1m 46s | trunk passed |
| +1 | shadedclient | 14m 58s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 18s | trunk passed |
| +1 | javadoc | 1m 16s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 38s | the patch passed |
| +1 | compile | 1m 43s | the patch passed |
| +1 | javac | 1m 43s | the patch passed |
| -0 | checkstyle | 0m 52s | hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 2 new + 702 unchanged - 2 fixed = 704 total (was 704) |
| +1 | mvnsite | 1m 35s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 49s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 33s | the patch passed |
| +1 | javadoc | 0m 58s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 20m 13s | hadoop-mapreduce-client-core in the patch failed. |
| -1 | unit | 10m 32s | hadoop-mapreduce-client-app in the patch failed. |
| -1 | unit | 126m 55s | hadoop-mapreduce-client-jobclient in the patch failed. |
| -1 | asflicense | 0m 45s | The patch generated 1 ASF License warnings. |
| | | 223m 46s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.mapreduce.split.TestJobSplitWriterWithEC |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | MAPREDUCE-7208 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984392/MAPREDUCE-7208.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d3013037aefe 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964937#comment-16964937 ]

Hadoop QA commented on MAPREDUCE-7208:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 7m 3s | Docker failed to build yetus/hadoop:104ccca9169. |

|| Subsystem || Report/Notes ||
| JIRA Issue | MAPREDUCE-7208 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984392/MAPREDUCE-7208.004.patch |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7683/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963414#comment-16963414 ]

Hadoop QA commented on MAPREDUCE-7208:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 48s{color} | {color:orange} hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 2 new + 702 unchanged - 2 fixed = 704 total (was 704) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 30s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 8s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 13s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}125m 28s{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 40s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}202m 0s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | MAPREDUCE-7208 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984392/MAPREDUCE-7208.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux af8bb51752c8 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963182#comment-16963182 ]

Hadoop QA commented on MAPREDUCE-7208:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} MAPREDUCE-7208 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | MAPREDUCE-7208 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984384/MAPREDUCE-7208.003.patch |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7680/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Tuning TaskRuntimeEstimator
>
> Key: MAPREDUCE-7208
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7208
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: MAPREDUCE-7208.001.patch, MAPREDUCE-7208.002.patch, MAPREDUCE-7208.003.patch, smoothing-exponential.md
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962665#comment-16962665 ]

Hadoop QA commented on MAPREDUCE-7208:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 6m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 0s{color} | {color:orange} root: The patch generated 19 new + 1195 unchanged - 8 fixed = 1214 total (was 1203) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 7m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 39s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 10m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 37s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 59s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 4s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 37s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 25s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 32s{color} |
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962437#comment-16962437 ]

Ahmed Hussein commented on MAPREDUCE-7208:

Thanks [~jeagles]. I looked at the test cases:
* {{hadoop.mapreduce.v2.TestSpeculativeExecutionWithMRApp}} is a related test case. It was failing because I changed the threshold of the estimate that triggers a new speculative task. I fixed that default behavior in the new patch.
* {{hadoop.mapred.TestLocalMRNotification}} and {{hadoop.mapreduce.v2.TestMROldApiJobs}} seem to be random failures. They pass on a local machine.

> Tuning TaskRuntimeEstimator
>
> Key: MAPREDUCE-7208
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7208
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: MAPREDUCE-7208.001.patch, MAPREDUCE-7208.002.patch, smoothing-exponential.md
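The effect of a changed speculation threshold on a test such as {{TestSpeculativeExecutionWithMRApp}} can be illustrated with a minimal Java sketch. It assumes, following the original proposal comment on this issue, that the new-attempt estimate moved from the plain mean of completed runtimes to the mean plus a small delta. The class name, the method names, the absolute {{deltaMs}} margin and the simple comparison are all hypothetical; this is not the actual speculator code.

```java
/** Hypothetical sketch of a speculation threshold; not Hadoop's speculator. */
final class SpeculationThresholdSketch {

    /**
     * Estimated runtime of a fresh attempt: mean of completed runtimes plus a margin.
     * deltaMs = 0 reproduces the "plain mean" behavior described in the proposal.
     */
    static double newAttemptRuntimeMs(double[] completedRuntimesMs, double deltaMs) {
        if (completedRuntimesMs.length == 0) {
            throw new IllegalArgumentException("no completed attempts to average");
        }
        double sum = 0.0;
        for (double runtime : completedRuntimesMs) {
            sum += runtime;
        }
        return sum / completedRuntimesMs.length + deltaMs;
    }

    /** Simplified rule: speculate only if the running attempt looks slower than a new one. */
    static boolean shouldSpeculate(double estimatedRuntimeMs, double newAttemptRuntimeMs) {
        return estimatedRuntimeMs > newAttemptRuntimeMs;
    }
}
```

With a zero margin, an attempt estimated only slightly above the mean already triggers speculation; with a positive margin it does not, so a test that expects a fixed number of speculative attempts can change outcome when the margin changes.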
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961092#comment-16961092 ]

Jonathan Turner Eagles commented on MAPREDUCE-7208:

[~ahussein], could you take a look at the test failures? Also, some of the checkstyle warnings seem relevant, but others do not.

> Tuning TaskRuntimeEstimator
>
> Key: MAPREDUCE-7208
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7208
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: MAPREDUCE-7208.001.patch, smoothing-exponential.md
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960231#comment-16960231 ]

Hadoop QA commented on MAPREDUCE-7208:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 59s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 53s{color} | {color:orange} hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 101 new + 686 unchanged - 6 fixed = 787 total (was 692) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 45s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 43s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 51s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}139m 57s{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 44s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}229m 44s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.mapreduce.v2.TestMROldApiJobs |
| | hadoop.mapred.TestLocalMRNotification |
| | hadoop.mapreduce.v2.TestSpeculativeExecutionWithMRApp |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | MAPREDUCE-7208 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12970329/MAPREDUCE-7208.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux
[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator
[ https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852022#comment-16852022 ]

Ahmed Hussein commented on MAPREDUCE-7208:

[~jeagles], [~tgraves], [~vinodkv], [~nroberts]

I had some issues using {{ExponentiallySmoothedTaskRuntimeEstimator}}. I investigated and implemented a new estimator that addresses some issues with the existing smoothing factor estimator. Do you mind taking a look at the suggested fixes and implementations?

*{{SimpleExponentialTaskRuntimeEstimator}} (new) vs. {{ExponentiallySmoothedTaskRuntimeEstimator}} (old)*
# The new estimator follows basic exponential smoothing.
# The new estimator does not return an estimate for the first few cycles. This increases the accuracy of the estimate, especially for long-running tasks.
# The new estimator detects tasks that are slowing down. The old estimator fails to detect such scenarios.
# The new estimator detects stalled tasks. The old estimator will not launch any speculative attempts when an attempt has a sharp slowdown.

*Is the default speculator affected?*
* The speculator still uses the {{LegacyTaskRuntimeEstimator}} by default.
* The existing implementation uses the statistics.mean to get an {{estimatedNewAttemptRuntime()}}. This causes frequent speculation, because even the smallest difference between the {{estimatedRuntime}} and the mean creates a new speculative attempt. I changed the implementation of {{estimatedNewAttemptRuntime()}} so that it uses (mean + a small delta).
* I created a JUnit test, {{TestSpeculativeExecOnCluster}}, that verifies the speculator running on {{MiniMRYarnCluster}}. The test case can also be used for the old estimators.

*Tuning parameters:*
* {{job.task.estimator.simple.exponential.smooth.lambda-ms}}: The lambda value in the smoothing function of the task estimator.
* {{job.task.estimator.simple.exponential.smooth.stagnated-ms}}: The window length after which the simple exponential smoothing considers the task attempt stagnated. This allows the speculator to detect stalled progress.
* {{job.task.estimator.simple.exponential.smooth.skip-initials}}: The number of initial readings that the estimator ignores before giving a prediction. Simple smoothing needs several iterations before it adjusts and returns good estimates. The skip-initials parameter instructs the estimator to return "no-information" until the number of progress updates reaches that value.

> Tuning TaskRuntimeEstimator
>
> Key: MAPREDUCE-7208
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7208
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: MAPREDUCE-7208.001.patch, smoothing-exponential.md
>
> By default, MR uses LegacyTaskRuntimeEstimator to get an estimate of the runtime. The estimator does not adjust dynamically to the progress rate of the tasks. On the other hand, the behavior of the existing alternative, "ExponentiallySmoothedTaskRuntimeEstimator", is unpredictable.
>
> There are several dimensions along which to improve the exponential implementation:
> # Exponential smoothing needs a warmup period. Otherwise, the estimate will be affected by the initial values.
> # Using a single smoothing factor (Lambda) does not work well for all the tasks. To increase the level of smoothing across the majority of tasks, we need a range of flexibility to dynamically adjust the smoothing factor based on the history of the task progress.
> # Design-wise, it is better to separate the statistical model from the MR interface. We need a way to evaluate estimators statistically, without the need to run MR. For example, an estimator can be evaluated as a black box by using a stream of raw data as input and testing the accuracy of the generated stream of estimates.
> # The exponential estimator speculates frequently and fails to detect slowing tasks. As a result, a taskAttempt that does not make any progress won't trigger a new speculation.
>
> The file [^smoothing-exponential.md] describes how the simple exponential smoothing factor works.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
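The ideas in the comment above (smoothing, a warm-up period, stall detection, and black-box evaluation on a stream of raw readings) can be sketched in plain Java without running MR. This is only an illustration under stated assumptions: the class and method names are hypothetical, the time-aware weight alpha = 1 - exp(-dt / lambda) is one common way to apply a lambda given in milliseconds and may differ from the patch, and the three constructor arguments merely mirror the meaning of the lambda-ms, skip-initials and stagnated-ms parameters.

```java
import java.util.OptionalDouble;

/** Hypothetical sketch; not the actual SimpleExponentialTaskRuntimeEstimator. */
final class SmoothedProgressSketch {
    private final double lambdaMs;   // smoothing time constant (assumed meaning of lambda-ms)
    private final int skipInitials;  // readings to ignore before giving an estimate
    private final long stagnatedMs;  // window without progress that counts as a stall

    private int readings;
    private double smoothedRate;     // progress units per millisecond
    private double lastProgress;
    private long lastTimeMs;
    private long lastChangeMs;

    SmoothedProgressSketch(double lambdaMs, int skipInitials, long stagnatedMs, long startMs) {
        this.lambdaMs = lambdaMs;
        this.skipInitials = skipInitials;
        this.stagnatedMs = stagnatedMs;
        this.lastTimeMs = startMs;
        this.lastChangeMs = startMs;
    }

    /** Feeds one raw reading: progress in [0, 1] observed at time nowMs. */
    void update(long nowMs, double progress) {
        long dt = nowMs - lastTimeMs;
        if (dt <= 0) {
            return; // ignore out-of-order or duplicate timestamps
        }
        double rawRate = (progress - lastProgress) / dt;
        // Time-aware weight: a longer gap gives the new reading more influence.
        double alpha = 1.0 - Math.exp(-dt / lambdaMs);
        smoothedRate = (readings == 0) ? rawRate : alpha * rawRate + (1.0 - alpha) * smoothedRate;
        if (progress > lastProgress) {
            lastChangeMs = nowMs; // remember when progress last moved forward
        }
        lastProgress = progress;
        lastTimeMs = nowMs;
        readings++;
    }

    /** True when no forward progress has been seen for at least stagnatedMs. */
    boolean isStagnated(long nowMs) {
        return nowMs - lastChangeMs >= stagnatedMs;
    }

    /** Remaining runtime in ms, or empty ("no information") during warm-up. */
    OptionalDouble estimateRemainingMs() {
        if (readings < skipInitials || smoothedRate <= 0.0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of((1.0 - lastProgress) / smoothedRate);
    }

    /** Black-box check: mean absolute error of the estimates against a known finish time. */
    static double meanAbsoluteErrorMs(SmoothedProgressSketch est, long[] timesMs,
                                      double[] progress, long finishMs) {
        double total = 0.0;
        int count = 0;
        for (int i = 0; i < timesMs.length; i++) {
            est.update(timesMs[i], progress[i]);
            OptionalDouble remaining = est.estimateRemainingMs();
            if (remaining.isPresent()) {
                total += Math.abs(remaining.getAsDouble() - (finishMs - timesMs[i]));
                count++;
            }
        }
        return count == 0 ? Double.NaN : total / count;
    }
}
```

Because the sketch has no dependency on MR classes, a recorded progress trace can be replayed through {{meanAbsoluteErrorMs}} to compare different lambda or skip-initials values offline, which is the kind of statistical evaluation the issue description asks for.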