[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977859#comment-16977859 ]

Jonathan Turner Eagles commented on TEZ-4067:
-

Closer, as the DAGAppMaster no longer has knowledge of the LegacySpeculator. There are still a few things to fix to get full encapsulation.

* All references to speculators need to be abstracted away.
{code}
// Stop speculators if any
stopSpeculators(currentDAG);
{code}
Should be something like this:
{code}
// Stop dependent services
stopDependentServices(currentDAG);
{code}
Similarly, the following code should change its references from speculators to dependent services:
{code}
+// If we reach here, then we have recoverable DAG and we need to reinitialize the speculators.
+// start speculators of the recovered DAG
+startSpeculators(currentDAG);
{code}
We need to avoid calling isSpeculationEnabled(), getSpeculator(), and startSpeculator(). Instead, use List getDependentServices. The vertex can include the speculator in the dependent services if speculation is enabled.

Do we need to call startSpeculator at all? As a dependent service, startService will be called automatically. Similarly, do we need a launch function at all? I'm a little worried that launch will start a thread, and then startService will be called and launch another thread. Perhaps the state of the service will prevent this. Could you explain the reasoning for calling launch manually instead of relying on startServices to be called automatically?
{code}
+  private void startSpeculators(DAG dag) {
+    for (Vertex v : dag.getVertices().values()) {
+      if (!v.isSpeculationEnabled()) {
+        continue;
+      }
+      if (v.startSpeculator()) {
+        addIfService(v.getSpeculator(), false);
+      }
+    }
+  }
+
+  private Exception stopSpeculators(DAG dag) {
+    Exception firstException = null;
+    for (Vertex v : dag.getVertices().values()) {
+      if (!v.isSpeculationEnabled()) {
+        continue;
+      }
+
+      Exception ex = v.stopSpeculator();
+      if (ex != null && firstException == null) {
+        firstException = ex;
+        continue;
+      }
+      // remove the speculator service from the list of services
+      services.remove(v.getSpeculator());
+    }
+    return firstException;
+  }
{code}

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: TEZ-4067.001.patch, TEZ-4067.002.patch, TEZ-4067.003.patch, TEZ-4067.004.patch, TEZ-4067.005.patch
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are handled synchronously by the caller (dispatcher). This implies the following:
> # the dispatcher spends a long time executing updateStatus, as it needs to check the runtime estimation of the tezAttempts within the vertex.
> # the speculator is per stage: launching a speculation may not be the optimum decision. Ideally, based on resources, speculated tasks should be the ones with the slowest progress.
> # the time between speculations is skewed because there is a big delay for the dispatcher to complete a full cycle. Also, speculation will be more aggressive compared to MR because MR waits for "soonest.retry.after.speculate" whenever a task is speculated. On the other hand, Tez speculates more tasks as it processes stages in parallel.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
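
The encapsulation suggested in the review comment above could look roughly like the sketch below. It is a self-contained illustration under stated assumptions, not code from any TEZ-4067 patch: SimpleService, SketchSpeculator, SketchVertex, and SketchAppMaster are hypothetical stand-ins for Hadoop's service classes, the speculator, Vertex, and DAGAppMaster. Only the names getDependentServices and stopDependentServices come from the review. The start guard shows one way the service state could prevent a second thread from being launched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a Hadoop-style service interface (assumption, not the real API).
interface SimpleService {
  void start();
  void stop() throws Exception;
  boolean isStarted();
}

// Hypothetical speculator whose start() is idempotent.
class SketchSpeculator implements SimpleService {
  private boolean started;
  private int launchCount;

  @Override
  public synchronized void start() {
    if (started) {
      return; // already running: do not launch a second thread
    }
    started = true;
    launchCount++; // a real speculator would start its worker thread here
  }

  @Override
  public synchronized void stop() {
    started = false;
  }

  @Override
  public synchronized boolean isStarted() {
    return started;
  }

  synchronized int launchCount() {
    return launchCount;
  }
}

// Hypothetical vertex: callers see only generic services, never the speculator type.
class SketchVertex {
  private final SimpleService speculator; // null when speculation is disabled

  SketchVertex(boolean speculationEnabled) {
    this.speculator = speculationEnabled ? new SketchSpeculator() : null;
  }

  List<SimpleService> getDependentServices() {
    if (speculator == null) {
      return Collections.<SimpleService>emptyList();
    }
    return Collections.singletonList(speculator);
  }
}

// Hypothetical app master: no isSpeculationEnabled(), getSpeculator(), or startSpeculator() calls.
class SketchAppMaster {
  private final List<SimpleService> services = new ArrayList<>();

  void startDependentServices(Iterable<SketchVertex> vertices) {
    for (SketchVertex v : vertices) {
      for (SimpleService s : v.getDependentServices()) {
        s.start();
        if (!services.contains(s)) {
          services.add(s);
        }
      }
    }
  }

  // Stops every dependent service and returns the first failure, if any.
  Exception stopDependentServices(Iterable<SketchVertex> vertices) {
    Exception firstException = null;
    for (SketchVertex v : vertices) {
      for (SimpleService s : v.getDependentServices()) {
        try {
          s.stop();
        } catch (Exception ex) {
          if (firstException == null) {
            firstException = ex;
          }
        } finally {
          services.remove(s);
        }
      }
    }
    return firstException;
  }

  int serviceCount() {
    return services.size();
  }
}
```

Unlike the quoted stopSpeculators, this sketch removes a service from the list even when stopping it fails. Whether that is the desired behavior is a design choice the patch author would need to confirm.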
[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977645#comment-16977645 ]

TezQA commented on TEZ-4067:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 14m 10s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 54s | master passed |
| +1 | compile | 0m 48s | master passed |
| +1 | checkstyle | 0m 49s | master passed |
| +1 | findbugs | 1m 39s | master passed |
| +1 | javadoc | 0m 55s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 50s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| +1 | checkstyle | 0m 18s | The patch passed checkstyle in tez-api |
| +1 | checkstyle | 0m 30s | tez-dag: The patch generated 0 new + 449 unchanged - 1 fixed = 449 total (was 450) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 54s | the patch passed |
| +1 | javadoc | 0m 54s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 43s | tez-api in the patch passed. |
| +1 | unit | 3m 40s | tez-dag in the patch passed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 35m 17s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12986260/TEZ-4067.005.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux 236c298c8f84 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 47f0f35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/200/testReport/ |
| Max. process+thread count | 225 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/200/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (TEZ-4100) Upgrade to hadoop 3.1.3
[ https://issues.apache.org/jira/browse/TEZ-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977623#comment-16977623 ]

László Bodor commented on TEZ-4100:
---

I think OOZIE-3488 could be a good example for getting rid of some Guava dependencies.

> Upgrade to hadoop 3.1.3
> ---
>
> Key: TEZ-4100
> URL: https://issues.apache.org/jira/browse/TEZ-4100
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4100.01.patch
[jira] [Updated] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmed Hussein updated TEZ-4067:
---
Attachment: TEZ-4067.005.patch
[jira] [Commented] (TEZ-4100) Upgrade to hadoop 3.1.3
[ https://issues.apache.org/jira/browse/TEZ-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977605#comment-16977605 ]

Jonathan Turner Eagles commented on TEZ-4100:
-

It's a little more complicated the way I see this. To increase compatibility, we can neither simply upgrade nor stay the same. If Tez upgrades, then users on older versions of Guava will no longer work. If Tez stays the same, Hadoop 3+ continues to break. If we upgrade, we need a second step to help compatibility: we can either replace the Preconditions.check* APIs with some equivalent, or we can shade Guava to ensure separation between what Tez depends on and what Tez users depend on.
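
As an illustration of the first option (replacing the Guava precondition calls with an equivalent), a JDK-only helper might look like the sketch below. ChecksSketch is a hypothetical name, not an existing Tez class, and the method set is an assumption about which checks would need replacing. Unlike Guava, this sketch takes a plain message string rather than a format template.

```java
import java.util.Objects;

// Hypothetical JDK-only replacement for a few Guava Preconditions-style checks.
final class ChecksSketch {
  private ChecksSketch() {
    // static utility, not instantiable
  }

  // Throws IllegalArgumentException when the caller passed a bad argument.
  static void checkArgument(boolean expression, String message) {
    if (!expression) {
      throw new IllegalArgumentException(message);
    }
  }

  // Throws IllegalStateException when the object is in the wrong state.
  static void checkState(boolean expression, String message) {
    if (!expression) {
      throw new IllegalStateException(message);
    }
  }

  // Returns the reference unchanged, or throws NullPointerException with the message.
  static <T> T checkNotNull(T reference, String message) {
    return Objects.requireNonNull(reference, message);
  }
}
```

Call sites would then use, for example, ChecksSketch.checkArgument(count >= 0, "count must be non-negative"), so the check no longer depends on whichever Guava version is on the classpath.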