[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982742#comment-16982742 ]

TezQA commented on TEZ-4067:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 31s | master passed |
| +1 | compile | 0m 55s | master passed |
| +1 | checkstyle | 0m 59s | master passed |
| +1 | findbugs | 1m 39s | master passed |
| +1 | javadoc | 1m 1s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 57s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| -0 | checkstyle | 0m 36s | tez-dag: The patch generated 18 new + 380 unchanged - 3 fixed = 398 total (was 383) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 55s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 49s | tez-api in the patch passed. |
| +1 | unit | 3m 51s | tez-dag in the patch passed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 22m 27s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12986826/TEZ-4067.008.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux d422d992da1c 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 47f0f35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| checkstyle | https://builds.apache.org/job/PreCommit-TEZ-Build/205/artifact/out/diff-checkstyle-tez-dag.txt |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/205/testReport/ |
| Max. process+thread count | 235 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/205/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type:
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982678#comment-16982678 ]

TezQA commented on TEZ-4067:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 10m 4s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 26s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 33s | master passed |
| +1 | compile | 0m 56s | master passed |
| +1 | checkstyle | 1m 0s | master passed |
| +1 | findbugs | 1m 42s | master passed |
| +1 | javadoc | 1m 1s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 57s | the patch passed |
| +1 | compile | 0m 56s | the patch passed |
| +1 | javac | 0m 56s | the patch passed |
| -0 | checkstyle | 0m 36s | tez-dag: The patch generated 18 new + 380 unchanged - 3 fixed = 398 total (was 383) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 56s | the patch passed |
| +1 | javadoc | 1m 3s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 50s | tez-api in the patch passed. |
| -1 | unit | 3m 59s | tez-dag in the patch failed. |
| +1 | asflicense | 0m 39s | The patch does not generate ASF License warnings. |
| | | 32m 43s | |

|| Reason || Tests ||
| Failed junit tests | tez.dag.app.TestSpeculation |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12986821/TEZ-4067.007.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux ba5ee4bcb160 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 47f0f35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| checkstyle | https://builds.apache.org/job/PreCommit-TEZ-Build/204/artifact/out/diff-checkstyle-tez-dag.txt |
| unit | https://builds.apache.org/job/PreCommit-TEZ-Build/204/artifact/out/patch-unit-tez-dag.txt |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/204/testReport/ |
| Max. process+thread count | 230 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/204/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Tez Speculation decision is calculated on each update by the dispatcher
>
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981932#comment-16981932 ]

TezQA commented on TEZ-4067:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 9m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 48s | master passed |
| +1 | compile | 0m 48s | master passed |
| +1 | checkstyle | 0m 47s | master passed |
| +1 | findbugs | 1m 38s | master passed |
| +1 | javadoc | 0m 54s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 50s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| -0 | checkstyle | 0m 29s | tez-dag: The patch generated 18 new + 380 unchanged - 3 fixed = 398 total (was 383) |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 54s | the patch passed |
| +1 | javadoc | 0m 54s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 43s | tez-api in the patch passed. |
| -1 | unit | 3m 40s | tez-dag in the patch failed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 29m 59s | |

|| Reason || Tests ||
| Failed junit tests | tez.dag.app.TestSpeculation |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12986704/TEZ-4067.006.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux 5103a1af28bb 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 47f0f35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| checkstyle | https://builds.apache.org/job/PreCommit-TEZ-Build/203/artifact/out/diff-checkstyle-tez-dag.txt |
| unit | https://builds.apache.org/job/PreCommit-TEZ-Build/203/artifact/out/patch-unit-tez-dag.txt |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/203/testReport/ |
| Max. process+thread count | 220 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/203/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Tez Speculation decision is calculated on each update by the dispatcher
>
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981924#comment-16981924 ]

Jonathan Turner Eagles commented on TEZ-4067:

+1. LGTM, [~ahussein]. I'm going to wait for the Tez QA bot to reply, but all my comments have been addressed. Thanks for this great improvement.

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: TEZ-4067.001.patch, TEZ-4067.002.patch, TEZ-4067.003.patch, TEZ-4067.004.patch, TEZ-4067.005.patch, TEZ-4067.006.patch
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are handled synchronously by the caller (dispatcher). This implies the following:
> # The dispatcher spends a long time executing updateStatus, as it needs to check the runtime estimation of the tezAttempts within the vertex.
> # The speculator is per stage: launching a speculation may not be the optimum decision. Ideally, based on resources, speculated tasks should be the ones with the slowest progress.
> # The time between speculations is skewed because there is a big delay for the dispatcher to complete a full cycle. Also, speculation will be more aggressive compared to MR because MR waits for "soonest.retry.after.speculate" whenever a task is speculated. On the other hand, Tez speculates more tasks as it processes stages in parallel.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
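Editor's illustration of the problem described in the issue: a minimal sketch, assuming the speculator is moved onto its own thread so that the dispatcher only enqueues status updates. The class and method names (`BackgroundSpeculatorSketch`, `notifyStatusUpdate`, `awaitProcessed`) are hypothetical and are not taken from the Tez code base or from any of the attached patches.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names do not come from the Tez code base.
class BackgroundSpeculatorSketch {
  private final BlockingQueue<String> updates = new LinkedBlockingQueue<>();
  private final AtomicInteger processed = new AtomicInteger();
  private final Thread worker = new Thread(this::runLoop, "speculator-sketch");
  private volatile boolean stopped = false;

  void start() {
    worker.setDaemon(true);
    worker.start();
  }

  // Called on the dispatcher thread: only enqueues, so it returns quickly.
  void notifyStatusUpdate(String attemptId) {
    updates.offer(attemptId);
  }

  // Runs on the speculator's own thread.
  private void runLoop() {
    while (!stopped) {
      try {
        String attemptId = updates.poll(50, TimeUnit.MILLISECONDS);
        if (attemptId != null) {
          // The expensive runtime estimation would go here.
          processed.incrementAndGet();
        }
      } catch (InterruptedException e) {
        return; // stop() interrupts the worker to end the loop
      }
    }
  }

  // Waits until at least n updates were processed or the timeout expires.
  boolean awaitProcessed(int n, long timeoutMillis) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (processed.get() < n) {
      if (System.currentTimeMillis() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(5);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  void stop() {
    stopped = true;
    worker.interrupt();
  }
}
```

In this sketch the cost of a status update on the dispatcher side is a single queue insertion; how the real patch divides the work is not shown in this thread.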
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16978739#comment-16978739 ]

Ahmed Hussein commented on TEZ-4067:

[~jeagles], I tried to refresh my memory a little bit. There was a check on the service state to prevent starting the service more than once. The workflow of the {{DAGAppMaster}} is as follows; correct me if I am wrong:
* {{DAGAppMaster}} is created.
* Services get initialized. This is the phase when the services are added to the "{{DAGAppMaster.services}}" map.
* All the services are started inside {{serviceStart.startServices()}}. Note that the {{DAG}} is not created yet.
* {{startDag()}} and {{startDagExecution}} finally create the DAG "{{currentDAG}}" and its vertices.

This workflow requires that speculators are initialized and started separately, after the DAG is created. Although we can still add them to the services map, we cannot assume that they will start automatically in {{DAGAppMaster.serviceStart()}}. The same holds for {{DAGAppMaster.serviceStop()}}, which is called at the end of the execution. Therefore, a service in the "{{DAGAppMaster.services}}" map will stay around until the whole DAG is completed. Given that a vertex can complete earlier, the speculator service related to that vertex will hang around until the {{DAGAppMaster}} is completed. If we add the speculators to "{{DAGAppMaster.services}}", we won't be able to remove the service when a vertex is completed, since a {{Vertex/DAGImpl}} does not have access to "{{DAGAppMaster.services}}".

I am almost done with implementing the code based on your suggestions. If you think that having speculators stay alive until the DAG is completed is acceptable, then I will go ahead and upload the patch. Otherwise, I will work on a few changes to remove the speculator of a completed vertex. Let me know WDYT.
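Editor's illustration of the lifecycle question raised in the comment above: a minimal sketch, assuming per-vertex services are tracked in a separate registry so that a service can be started once its vertex exists and stopped and dropped as soon as that vertex completes. `LifecycleService`, `CountingService`, and `VertexServiceRegistry` are hypothetical names; nothing here is the `DAGAppMaster` API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a service with a start/stop lifecycle.
interface LifecycleService {
  void start();
  void stop();
  boolean isRunning();
}

class CountingService implements LifecycleService {
  private boolean running = false;

  @Override
  public void start() {
    running = true;
  }

  @Override
  public void stop() {
    running = false;
  }

  @Override
  public boolean isRunning() {
    return running;
  }
}

// Hypothetical registry keyed by vertex name.
class VertexServiceRegistry {
  private final Map<String, LifecycleService> byVertex = new LinkedHashMap<>();

  // Called after the DAG and its vertices have been created.
  void onVertexCreated(String vertexName, LifecycleService service) {
    byVertex.put(vertexName, service);
    service.start();
  }

  // Called when a vertex completes: the service does not linger until the DAG ends.
  void onVertexCompleted(String vertexName) {
    LifecycleService service = byVertex.remove(vertexName);
    if (service != null) {
      service.stop();
    }
  }

  // Called at the end of execution for whatever is still registered.
  void stopAll() {
    for (LifecycleService service : byVertex.values()) {
      service.stop();
    }
    byVertex.clear();
  }

  int liveCount() {
    return byVertex.size();
  }
}
```

The trade-off discussed in the thread is whether this extra bookkeeping is worth it, or whether letting speculators live until the DAG completes is good enough.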
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977859#comment-16977859 ]

Jonathan Turner Eagles commented on TEZ-4067:

Closer, as the DAGAppMaster no longer has knowledge about the LegacySpeculator. There are still a few things to fix to get full encapsulation.
* All references to speculators need to be abstracted away.
{code}
// Stop speculators if any
stopSpeculators(currentDAG);
{code}
should be something like this:
{code}
// Stop dependent services
stopDependentServices(currentDAG);
{code}
Similarly, the following code should change its references from speculators to dependent services:
{code}
+// If we reach here, then we have recoverable DAG and we need to reinitialize the speculators.
+// start speculators of the recovered DAG
+startSpeculators(currentDAG);
{code}
We need to avoid calling isSpeculationEnabled(), getSpeculator(), and startSpeculator(). Instead, use List getDependentServices. The vertex can include the speculator in the dependent services if speculation is enabled.

Do we need to call startSpeculator at all? As a dependent service, startService will be called automatically. Similarly, do we need a launch function at all? I'm a little worried that launch will start a thread, and then startService will be called and launch another thread. Perhaps the state of the service will prevent this. Could you explain the reasoning for calling launch manually instead of relying on startServices to be called automatically?
{code}
+  private void startSpeculators(DAG dag) {
+    for (Vertex v : dag.getVertices().values()) {
+      if (!v.isSpeculationEnabled()) {
+        continue;
+      }
+      if (v.startSpeculator()) {
+        addIfService(v.getSpeculator(), false);
+      }
+    }
+  }
+
+  private Exception stopSpeculators(DAG dag) {
+    Exception firstException = null;
+    for (Vertex v : dag.getVertices().values()) {
+      if (!v.isSpeculationEnabled()) {
+        continue;
+      }
+
+      Exception ex = v.stopSpeculator();
+      if (ex != null && firstException == null) {
+        firstException = ex;
+        continue;
+      }
+      // remove the speculator service from the list of services
+      services.remove(v.getSpeculator());
+    }
+    return firstException;
+  }
{code}
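Editor's illustration of the double-start concern raised in the review above: a minimal sketch, assuming a state flag guards the start path so that a manual launch followed by an automatic start does not spawn a second thread. `GuardedServiceSketch` is a hypothetical class; it does not use Hadoop's service classes, and the counter stands in for the thread a real service would create.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of an idempotent start; not the LegacySpeculator code.
class GuardedServiceSketch {
  private final AtomicBoolean started = new AtomicBoolean(false);
  private final AtomicInteger launches = new AtomicInteger();

  // Returns true only for the call that actually performed the start.
  boolean start() {
    if (!started.compareAndSet(false, true)) {
      return false; // already started: do nothing
    }
    // A real service would create and start its worker thread here.
    launches.incrementAndGet();
    return true;
  }

  int launchCount() {
    return launches.get();
  }
}
```

With a guard like this, it does not matter whether the first call comes from a manual launch or from the container's automatic start.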
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977645#comment-16977645 ]

TezQA commented on TEZ-4067:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 14m 10s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 54s | master passed |
| +1 | compile | 0m 48s | master passed |
| +1 | checkstyle | 0m 49s | master passed |
| +1 | findbugs | 1m 39s | master passed |
| +1 | javadoc | 0m 55s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 50s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| +1 | checkstyle | 0m 18s | The patch passed checkstyle in tez-api |
| +1 | checkstyle | 0m 30s | tez-dag: The patch generated 0 new + 449 unchanged - 1 fixed = 449 total (was 450) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 54s | the patch passed |
| +1 | javadoc | 0m 54s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 43s | tez-api in the patch passed. |
| +1 | unit | 3m 40s | tez-dag in the patch passed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 35m 17s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12986260/TEZ-4067.005.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux 236c298c8f84 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 47f0f35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/200/testReport/ |
| Max. process+thread count | 225 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/200/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16975416#comment-16975416 ]

Ahmed Hussein commented on TEZ-4067:

Thanks Jon! Sure, I will change that and create a new patch.
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971719#comment-16971719 ]

Jonathan Turner Eagles commented on TEZ-4067:

[~ahussein], overall this will be a great feature for speculative execution. Thanks for the patch. Overall the code looks good.

As to object design, I would like to suggest a change and see if you agree with it. Before the patch, the DAGAppMaster knew about services and the Vertex class. After the patch, the DAGAppMaster adds knowledge about the VertexImpl and LegacySpeculator classes. Could we abstract that knowledge away to improve the design? For example, would it be better if Vertex (or perhaps VertexImpl if needed) added a "getDependentServices" API? This would allow the DAGAppMaster to add the dependent services and keep the knowledge that the service is a LegacySpeculator class out of the DAGAppMaster. This would also allow for other dependent services in the future. Let me know if this is possible or what prevents this from being possible.
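Editor's illustration of the design suggested above: a minimal sketch, assuming a vertex exposes its dependent services through a single accessor so that the application master never refers to a speculator type. Only the accessor name `getDependentServices` comes from the comment; `DependentService`, `SketchVertex`, `SketchVertexImpl`, `SketchSpeculator`, and `SketchAppMaster` are hypothetical stand-ins, not Tez or Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical minimal service abstraction.
interface DependentService {
  void start();
  boolean isStarted();
}

// Hypothetical vertex abstraction exposing only generic services.
interface SketchVertex {
  List<DependentService> getDependentServices();
}

class SketchSpeculator implements DependentService {
  private boolean started = false;

  @Override
  public void start() {
    started = true;
  }

  @Override
  public boolean isStarted() {
    return started;
  }
}

class SketchVertexImpl implements SketchVertex {
  private final boolean speculationEnabled;
  private final SketchSpeculator speculator = new SketchSpeculator();

  SketchVertexImpl(boolean speculationEnabled) {
    this.speculationEnabled = speculationEnabled;
  }

  // The vertex decides what to expose; callers never see the concrete type.
  @Override
  public List<DependentService> getDependentServices() {
    if (!speculationEnabled) {
      return Collections.emptyList();
    }
    return Collections.<DependentService>singletonList(speculator);
  }
}

class SketchAppMaster {
  private final List<DependentService> services = new ArrayList<>();

  // Registers and starts whatever the vertices expose, with no speculator knowledge.
  void startDependentServices(List<SketchVertex> vertices) {
    for (SketchVertex vertex : vertices) {
      for (DependentService service : vertex.getDependentServices()) {
        services.add(service);
        service.start();
      }
    }
  }

  int serviceCount() {
    return services.size();
  }
}
```

A further kind of per-vertex service could later be returned from the same accessor without touching the application master.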
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969622#comment-16969622 ]

TezQA commented on TEZ-4067:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 36s | master passed |
| +1 | compile | 0m 55s | master passed |
| +1 | checkstyle | 0m 53s | master passed |
| +1 | findbugs | 1m 40s | master passed |
| +1 | javadoc | 1m 0s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 0m 56s | the patch passed |
| +1 | javac | 0m 56s | the patch passed |
| +1 | checkstyle | 0m 23s | The patch passed checkstyle in tez-api |
| +1 | checkstyle | 0m 29s | tez-dag: The patch generated 0 new + 142 unchanged - 1 fixed = 142 total (was 143) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 57s | the patch passed |
| +1 | javadoc | 1m 3s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 51s | tez-api in the patch passed. |
| +1 | unit | 3m 51s | tez-dag in the patch passed. |
| +1 | asflicense | 0m 39s | The patch does not generate ASF License warnings. |
| | | 22m 34s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985279/TEZ-4067.004.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux 92b8321e0241 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / b99c7ce |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/196/testReport/ |
| Max. process+thread count | 241 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/196/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project:
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969574#comment-16969574 ]

TezQA commented on TEZ-4067:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 51s | Maven dependency ordering for branch |
| +1 | mvninstall | 5m 17s | master passed |
| +1 | compile | 0m 57s | master passed |
| +1 | checkstyle | 0m 53s | master passed |
| +1 | findbugs | 2m 2s | master passed |
| +1 | javadoc | 1m 2s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| +1 | checkstyle | 0m 23s | The patch passed checkstyle in tez-api |
| +1 | checkstyle | 0m 30s | tez-dag: The patch generated 0 new + 142 unchanged - 1 fixed = 142 total (was 143) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 56s | the patch passed |
| +1 | javadoc | 1m 3s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 54s | tez-api in the patch passed. |
| -1 | unit | 3m 49s | tez-dag in the patch failed. |
| +1 | asflicense | 0m 35s | The patch does not generate ASF License warnings. |
| | | 25m 5s | |

|| Reason || Tests ||
| Failed junit tests | tez.dag.app.TestSpeculation |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985272/TEZ-4067.003.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux 97af557c6346 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / b99c7ce |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| unit | https://builds.apache.org/job/PreCommit-TEZ-Build/195/artifact/out/patch-unit-tez-dag.txt |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/195/testReport/ |
| Max. process+thread count | 230 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/195/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Tez Speculation decision is calculated on each update by the dispatcher
>
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969556#comment-16969556 ]

Ahmed Hussein commented on TEZ-4067:

Uploaded a new patch to fix the errors reported by checkstyle and findbugs.

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: TEZ-4067.001.patch, TEZ-4067.002.patch,
> TEZ-4067.003.patch
>
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are
> handled synchronously by the caller (dispatcher). This implies the following:
> # the dispatcher spends a long time executing updateStatus as it needs to
> check the runtime estimation of the tezAttempts within the vertex.
> # the speculator is per stage: launching a speculation may not be the optimum
> decision. Ideally, based on resources, speculated tasks should be the ones
> with slowest progress.
> # the time between speculation is skewed because there is a big delay for
> the dispatcher to complete a full cycle. Also, speculation will be more
> aggressive compared to MR because MR waits for
> "soonest.retry.after.speculate" whenever a task is speculated. On the other
> hand, Tez speculates more tasks as it processes stages in parallel.
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969381#comment-16969381 ]

TezQA commented on TEZ-4067:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 36s | master passed |
| +1 | compile | 0m 53s | master passed |
| +1 | checkstyle | 0m 52s | master passed |
| +1 | findbugs | 1m 39s | master passed |
| +1 | javadoc | 1m 2s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| -0 | checkstyle | 0m 31s | tez-dag: The patch generated 8 new + 142 unchanged - 1 fixed = 150 total (was 143) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 1m 3s | tez-dag generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | javadoc | 1m 2s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 51s | tez-api in the patch passed. |
| +1 | unit | 3m 48s | tez-dag in the patch passed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 22m 24s | |

|| Reason || Tests ||
| FindBugs | module:tez-dag |
| | org.apache.tez.dag.app.dag.speculation.legacy.LegacySpeculator.isStarted() does not release lock on all exception paths At LegacySpeculator.java:on all exception paths At LegacySpeculator.java:[line 182] |
| | org.apache.tez.dag.app.dag.speculation.legacy.LegacySpeculator.serviceStop() does not release lock on all exception paths At LegacySpeculator.java:on all exception paths At LegacySpeculator.java:[line 229] |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12970227/TEZ-4067.002.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux c4b218af8abb 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / b99c7ce |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| checkstyle | https://builds.apache.org/job/PreCommit-TEZ-Build/194/artifact/out/diff-checkstyle-tez-dag.txt |
| findbugs | https://builds.apache.org/job/PreCommit-TEZ-Build/194/artifact/out/new-findbugs-tez-dag.html |
| Test Results |
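The two FindBugs findings in the report above ("does not release lock on all exception paths") point to a lock that is acquired without a guaranteed release. The usual remedy is to pair every lock() with an unlock() in a finally block. The sketch below shows that pattern only; LockedLifecycle is a hypothetical class written for illustration, and it is not the LegacySpeculator source, whose lines 182 and 229 are not shown in this thread.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class for illustration; this is not the LegacySpeculator code.
class LockedLifecycle {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean started;

    void start() {
        lock.lock();
        try {
            started = true;
        } finally {
            lock.unlock();
        }
    }

    boolean isStarted() {
        lock.lock();
        try {
            return started;
        } finally {
            // Runs on normal return and on any exception.
            lock.unlock();
        }
    }

    void stop(Runnable cleanup) {
        lock.lock();
        try {
            started = false;
            cleanup.run(); // may throw; the lock is still released below
        } finally {
            lock.unlock();
        }
    }

    // Exposed only so the behaviour can be checked from outside.
    boolean lockIsHeld() {
        return lock.isLocked();
    }
}
```

If the cleanup passed to stop() throws, the exception still propagates to the caller, but the lock is no longer left held.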
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852045#comment-16852045 ]

Ahmed Hussein commented on TEZ-4067:

TEZ-1897

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: TEZ-4067.001.patch, TEZ-4067.002.patch
>
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are
> handled synchronously by the caller (dispatcher). This implies the following:
> # the dispatcher spends a long time executing updateStatus as it needs to
> check the runtime estimation of the tezAttempts within the vertex.
> # the speculator is per stage: launching a speculation may not be the optimum
> decision. Ideally, based on resources, speculated tasks should be the ones
> with slowest progress.
> # the time between speculation is skewed because there is a big delay for
> the dispatcher to complete a full cycle. Also, speculation will be more
> aggressive compared to MR because MR waits for
> "soonest.retry.after.speculate" whenever a task is speculated. On the other
> hand, Tez speculates more tasks as it processes stages in parallel.
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847860#comment-16847860 ]

Ahmed Hussein commented on TEZ-4067:

An old [TEZ-3934|https://issues.apache.org/jira/browse/TEZ-3934] reported the race condition in the speculator code. When two taskAttempts update their progress simultaneously, the speculator may create two speculative attempts for the same task. The jira was closed after adding two more checks on the hashes to verify that no attempt was speculated while the current thread was busy with the calculation. This does not solve the root problem, which is caused by calling maybeSpeculate() after updating the progress.

A proper fix would be:
* The event handler returns after updating the taskAttempt status.
* A separate "speculator" thread runs periodically to scan the tasks within a vertex and calculate the speculation.

Re-implementing the speculator as a service requires the following changes:
# Add each vertex's speculator to the list of services in the application master (i.e., DAGAppMaster).
# api/DAG needs to support creating the vertex speculator as a service.
# Test cases (TestSpeculation) may need to be rewritten because they were designed for a single-threaded implementation.

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are
> handled synchronously by the caller (dispatcher). This implies the following:
> # the dispatcher spends a long time executing updateStatus as it needs to
> check the runtime estimation of the tezAttempts within the vertex.
> # the speculator is per stage: launching a speculation may not be the optimum
> decision. Ideally, based on resources, speculated tasks should be the ones
> with slowest progress.
> # the time between speculation is skewed because there is a big delay for
> the dispatcher to complete a full cycle. Also, speculation will be more
> aggressive compared to MR because MR waits for
> "soonest.retry.after.speculate" whenever a task is speculated. On the other
> hand, Tez speculates more tasks as it processes stages in parallel.
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
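The proposed split between a cheap event handler and a periodic scan can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch: PeriodicSpeculationScanner and its methods are hypothetical names, and "lowest recorded progress" is used as a crude stand-in for the runtime estimation that the real speculator performs. Because only one scan runs at a time and each chosen task is remembered, the same task cannot be speculated twice.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; class and method names are not taken from Tez.
class PeriodicSpeculationScanner {
    private final Map<String, Double> progressByTask = new ConcurrentHashMap<>();
    private final Set<String> speculatedTasks = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService scanner =
            Executors.newSingleThreadScheduledExecutor();

    // Event-handler path: record the progress and return immediately.
    void updateProgress(String taskId, double progress) {
        progressByTask.put(taskId, progress);
    }

    // Scanner path: pick the slowest unfinished task that has not been speculated yet.
    synchronized Optional<String> scanOnce() {
        String slowest = null;
        double slowestProgress = 1.0; // tasks at 1.0 (finished) are never candidates
        for (Map.Entry<String, Double> entry : progressByTask.entrySet()) {
            boolean candidate = !speculatedTasks.contains(entry.getKey())
                    && entry.getValue() < slowestProgress;
            if (candidate) {
                slowest = entry.getKey();
                slowestProgress = entry.getValue();
            }
        }
        if (slowest == null) {
            return Optional.empty();
        }
        speculatedTasks.add(slowest); // remember it so it is never speculated twice
        return Optional.of(slowest);
    }

    // Run the scan on a single background thread at a fixed delay.
    void start(long periodMillis) {
        scanner.scheduleWithFixedDelay(
                () -> { scanOnce(); }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scanner.shutdownNow();
    }
}
```

In this arrangement the dispatcher thread only pays for a map write per status update, and the cost of scanning the vertex moves to the scanner thread.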
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844220#comment-16844220 ]

Ahmed Hussein commented on TEZ-4067:

A concurrent AsyncDispatcher was added in TEZ-1897. By default this concurrent dispatcher is disabled. To enable it, the TezConfiguration needs to pass
{noformat}
-Dtez.am.use.concurrent-dispatcher=true
{noformat}
# The AsyncDispatcher may not be ideal for production because each Task/TaskAttempt implies a notify event on the blocking queue. For status updates it may be faster to do the update within one thread rather than passing a new event between two threads.
# The frequency of events could overwhelm the pool workers, so events would not be processed on time.
# For both the synchronous and asynchronous dispatchers, there is no mechanism to prevent two different workers from scanning the vertex tasks. In that case, the workers would duplicate the work with no benefit.

Suggested fix:
# Keep the asyncDispatcher disabled.
# In LegacySpeculator, remove "maybeSpeculate" from "notifyAttemptStatusUpdate()". This will prevent the event handler from executing the main speculation loop.
# Create a thread per speculator to execute "maybeSpeculate" every "soonestRetryAfterSpeculate/soonestRetryAfterNoSpeculate".

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are
> handled synchronously by the caller (dispatcher). This implies the following:
> # the dispatcher spends a long time executing updateStatus as it needs to
> check the runtime estimation of the tezAttempts within the vertex.
> # the speculator is per stage: launching a speculation may not be the optimum
> decision. Ideally, based on resources, speculated tasks should be the ones
> with slowest progress.
> # the time between speculation is skewed because there is a big delay for
> the dispatcher to complete a full cycle. Also, speculation will be more
> aggressive compared to MR because MR waits for
> "soonest.retry.after.speculate" whenever a task is speculated. On the other
> hand, Tez speculates more tasks as it processes stages in parallel.
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
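The third item of the suggested fix, a per-speculator thread that runs "maybeSpeculate" and then waits for one of two intervals, might look roughly like the loop below. This is a sketch under assumptions: SpeculatorLoop is a hypothetical class, the maybeSpeculate callback is assumed to return the number of speculative attempts it launched, and the two delay fields simply mirror the names quoted in the comment. The interval values used by a caller are the caller's choice; none are implied here.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch of the per-speculator thread body; not the Tez implementation.
class SpeculatorLoop implements Runnable {
    private final IntSupplier maybeSpeculate; // assumed to return the number of launched speculations
    private final long soonestRetryAfterSpeculate;
    private final long soonestRetryAfterNoSpeculate;
    private volatile boolean stopped;

    SpeculatorLoop(IntSupplier maybeSpeculate,
                   long soonestRetryAfterSpeculate,
                   long soonestRetryAfterNoSpeculate) {
        this.maybeSpeculate = maybeSpeculate;
        this.soonestRetryAfterSpeculate = soonestRetryAfterSpeculate;
        this.soonestRetryAfterNoSpeculate = soonestRetryAfterNoSpeculate;
    }

    // Back off for the "after speculate" interval when something was launched,
    // otherwise retry after the "no speculate" interval.
    long nextDelayMillis(int launched) {
        return launched > 0 ? soonestRetryAfterSpeculate : soonestRetryAfterNoSpeculate;
    }

    @Override
    public void run() {
        while (!stopped && !Thread.currentThread().isInterrupted()) {
            int launched = maybeSpeculate.getAsInt();
            try {
                Thread.sleep(nextDelayMillis(launched));
            } catch (InterruptedException e) {
                // Restore the interrupt flag and leave the loop.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    void stop() {
        stopped = true;
    }
}
```

A caller would wrap this in a thread, for example new Thread(loop, "speculator").start(), and on shutdown call stop() followed by interrupting that thread so that a sleeping loop exits promptly.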