[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher

2019-11-19 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977859#comment-16977859
 ] 

Jonathan Turner Eagles commented on TEZ-4067:
-

Closer, as the DAGAppMaster no longer has knowledge about the LegacySpeculator. 
There are still a few things to fix to get full encapsulation.
* All references to speculators need to be abstracted away.

{code}
// Stop speculators if any
stopSpeculators(currentDAG);
{code}

Should be something like this
{code}
// Stop dependent services
stopDependentServices(currentDAG);
{code}

Similarly, the following code should change its references to speculators to 
dependent services:
{code}
+// If we reach here, then we have recoverable DAG and we need to reinitialize the speculators.
+// start speculators of the recovered DAG
+startSpeculators(currentDAG);
{code}

We need to avoid calling isSpeculationEnabled(), getSpeculator(), and 
startSpeculator(). Instead, the vertex could expose something like 
List getDependentServices(). The vertex can include the speculator in the 
dependent services if speculation is enabled. 
Do we need to call startSpeculator at all? As a dependent service, startService 
will be called automatically. Similarly, do we need a launch function at all? 
I'm a little worried that launch will start a thread, and then startService 
will be called and launch another thread. Perhaps the state of the service will 
prevent this. Could you explain the reasoning for calling launch manually 
instead of relying on startServices to be called automatically?
{code}
+  private void startSpeculators(DAG dag) {
+    for (Vertex v : dag.getVertices().values()) {
+      if (!v.isSpeculationEnabled()) {
+        continue;
+      }
+      if (v.startSpeculator()) {
+        addIfService(v.getSpeculator(), false);
+      }
+    }
+  }
+
+  private Exception stopSpeculators(DAG dag) {
+    Exception firstException = null;
+    for (Vertex v : dag.getVertices().values()) {
+      if (!v.isSpeculationEnabled()) {
+        continue;
+      }
+
+      Exception ex = v.stopSpeculator();
+      if (ex != null && firstException == null) {
+        firstException = ex;
+        continue;
+      }
+      // remove the speculator service from the list of services
+      services.remove(v.getSpeculator());
+    }
+    return firstException;
+  }
{code}
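
To make the suggestion concrete, here is a minimal sketch of what the dependent-services shape could look like, including a state guard that keeps a second start call from launching a second thread. Every name in it (DependentService, SketchSpeculator, SketchVertex, SketchAppMaster) is hypothetical and stands in for the real classes; it does not use the Tez or Hadoop service APIs and is not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a service lifecycle interface (not the Hadoop one).
interface DependentService {
  void start();
  void stop();
}

// Hypothetical speculator: start() is guarded by its own state, so calling it
// twice does not launch a second worker thread.
class SketchSpeculator implements DependentService {
  private Thread worker;      // null while the service is stopped
  private int launches = 0;   // counts how many threads were actually launched

  @Override
  public synchronized void start() {
    if (worker != null) {
      return; // already started, nothing to do
    }
    launches++;
    worker = new Thread(() -> {
      try {
        Thread.sleep(Long.MAX_VALUE); // placeholder for the speculation loop
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "sketch-speculator");
    worker.setDaemon(true);
    worker.start();
  }

  @Override
  public synchronized void stop() {
    if (worker != null) {
      worker.interrupt();
      worker = null;
    }
  }

  synchronized int launchCount() {
    return launches;
  }

  synchronized boolean isRunning() {
    return worker != null;
  }
}

// Hypothetical vertex: it exposes a list of services and hides the speculator.
class SketchVertex {
  private final SketchSpeculator speculator; // null when speculation is disabled

  SketchVertex(boolean speculationEnabled) {
    this.speculator = speculationEnabled ? new SketchSpeculator() : null;
  }

  List<DependentService> getDependentServices() {
    List<DependentService> services = new ArrayList<>();
    if (speculator != null) {
      services.add(speculator);
    }
    return services;
  }
}

// Hypothetical app-master side: no speculator-specific calls remain.
class SketchAppMaster {
  static void startDependentServices(List<SketchVertex> vertices) {
    for (SketchVertex v : vertices) {
      for (DependentService s : v.getDependentServices()) {
        s.start();
      }
    }
  }

  static void stopDependentServices(List<SketchVertex> vertices) {
    for (SketchVertex v : vertices) {
      for (DependentService s : v.getDependentServices()) {
        s.stop();
      }
    }
  }
}
```

With this shape the caller never asks whether speculation is enabled: a vertex without speculation simply returns an empty list. Whether the real service state machine already gives the same double-start protection is the open question above.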

> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: TEZ-4067.001.patch, TEZ-4067.002.patch, 
> TEZ-4067.003.patch, TEZ-4067.004.patch, TEZ-4067.005.patch
>
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are 
> handled synchronously by the caller (dispatcher). This implies the following:
>  # the dispatcher spends a long time executing updateStatus as it needs to 
> check the runtime estimation of the tezAttempts within the vertex.
>  # the speculator is per stage: launching a speculation may not be the optimum 
> decision. Ideally, based on resources, speculated tasks should be the ones 
> with the slowest progress.
>  # the time between speculations is skewed because there is a big delay for 
> the dispatcher to complete a full cycle. Also, speculation will be more 
> aggressive compared to MR because MR waits for 
> "soonest.retry.after.speculate" whenever a task is speculated. On the other 
> hand, Tez speculates more tasks as it processes stages in parallel.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher

2019-11-19 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977645#comment-16977645
 ] 

TezQA commented on TEZ-4067:


| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 14m 10s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 54s | master passed |
| +1 | compile | 0m 48s | master passed |
| +1 | checkstyle | 0m 49s | master passed |
| +1 | findbugs | 1m 39s | master passed |
| +1 | javadoc | 0m 55s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 50s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| +1 | checkstyle | 0m 18s | The patch passed checkstyle in tez-api |
| +1 | checkstyle | 0m 30s | tez-dag: The patch generated 0 new + 449 unchanged - 1 fixed = 449 total (was 450) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 54s | the patch passed |
| +1 | javadoc | 0m 54s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 43s | tez-api in the patch passed. |
| +1 | unit | 3m 40s | tez-dag in the patch passed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 35m 17s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4067 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12986260/TEZ-4067.005.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux 236c298c8f84 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 47f0f35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/200/testReport/ |
| Max. process+thread count | 225 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/200/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (TEZ-4100) Upgrade to hadoop 3.1.3

2019-11-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977623#comment-16977623
 ] 

László Bodor commented on TEZ-4100:
---

I think OOZIE-3488 could be a good example for getting rid of some Guava 
dependencies.

> Upgrade to hadoop 3.1.3
> ---
>
> Key: TEZ-4100
> URL: https://issues.apache.org/jira/browse/TEZ-4100
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: TEZ-4100.01.patch
>
>






[jira] [Updated] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher

2019-11-19 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated TEZ-4067:
---
Attachment: TEZ-4067.005.patch



[jira] [Commented] (TEZ-4100) Upgrade to hadoop 3.1.3

2019-11-19 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977605#comment-16977605
 ] 

Jonathan Turner Eagles commented on TEZ-4100:
-

The way I see it, this is a little more complicated. To increase compatibility, 
we can neither simply upgrade nor stay the same. If Tez upgrades, users on 
older versions of Guava will break. If Tez stays the same, Hadoop 3+ 
continues to break. If we upgrade, we need a second step to help 
compatibility: we can either replace the Preconditions.check* APIs with some 
equivalent, or we can shade Guava to ensure separation between what Tez depends 
on and what Tez users depend on.
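
As a rough illustration of the first option, the sketch below shows a small local replacement for the commonly used Guava Preconditions checks, built only on the JDK. The class name SketchPreconditions is hypothetical and is not part of Tez or of any patch on this issue. The exception types mirror what Guava throws for the same checks.

```java
import java.util.Objects;

// Hypothetical JDK-only helper standing in for a few Guava Preconditions calls.
final class SketchPreconditions {
  private SketchPreconditions() {
    // static helpers only
  }

  // Same contract as Guava's checkArgument: IllegalArgumentException on failure.
  static void checkArgument(boolean expression, String message) {
    if (!expression) {
      throw new IllegalArgumentException(message);
    }
  }

  // Same contract as Guava's checkState: IllegalStateException on failure.
  static void checkState(boolean expression, String message) {
    if (!expression) {
      throw new IllegalStateException(message);
    }
  }

  // Same contract as Guava's checkNotNull: NullPointerException on null,
  // otherwise the reference is returned so the call can be used inline.
  static <T> T checkNotNull(T reference, String message) {
    return Objects.requireNonNull(reference, message);
  }
}
```

Call sites would then only need their import changed, for example `SketchPreconditions.checkArgument(n > 0, "n must be positive")`. Shading Guava remains the alternative if the other Guava usages are too many to replace.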
