[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher

2019-11-20 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978739#comment-16978739
 ] 

Ahmed Hussein commented on TEZ-4067:


[~jeagles], I tried to refresh my memory a little bit. There was a check on the 
service state to prevent starting the service more than once.

The workflow of the {{DAGAppMaster}} is as follows; correct me if I am 
wrong:

* {{DAGAppMaster}} is created
* Services get initialized. This is the phase when the services are added to 
the "{{DAGAppMaster.services}}" map.
* All the services are started inside {{serviceStart.startServices()}}. Note 
that the {{DAG}} is not created yet.
* {{startDag()}} and {{startDagExecution}} finally create the DAG 
"{{currentDAG}}" and its vertices.

This workflow requires that speculators be initialized and started separately, 
after the DAG is created. Although we can still add them to the services map, 
we cannot assume that they will start automatically in 
{{DAGAppMaster.serviceStart()}}.

The same applies to {{DAGAppMaster.serviceStop()}}, which is called at the end 
of the execution. Therefore, a service in the "{{DAGAppMaster.services}}" map 
will stay around until the whole DAG is completed. Since a vertex can complete 
earlier, the speculator service related to that vertex will hang around until 
the {{DAGAppMaster}} is completed.
If we add the speculators to "{{DAGAppMaster.services}}", we won't be able to 
remove the service when a vertex is completed, since a {{Vertex/DAGImpl}} does 
not have access to "{{DAGAppMaster.services}}".
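
To make the lifecycle concern concrete, here is a minimal Java sketch of the two options. It is only an illustration under stated assumptions: {{SimpleService}}, {{SpeculatorService}}, {{VertexSketch}} and {{AppMasterSketch}} are hypothetical names invented for this example and are not the Tez or Hadoop classes. The sketch lets each vertex own its speculator, so the speculator starts after the DAG is created and stops as soon as the vertex completes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Hadoop-style service; not the real service API.
interface SimpleService {
  void start();
  void stop();
  boolean isRunning();
}

// Hypothetical per-vertex speculator service.
class SpeculatorService implements SimpleService {
  private boolean running;

  @Override
  public void start() {
    // Check the service state so that a second start is a no-op.
    if (running) {
      return;
    }
    running = true;
  }

  @Override
  public void stop() {
    running = false;
  }

  @Override
  public boolean isRunning() {
    return running;
  }
}

// Hypothetical vertex that owns its speculator instead of registering it
// in the application master's services map.
class VertexSketch {
  private final SpeculatorService speculator = new SpeculatorService();

  void start() {
    speculator.start();
  }

  void complete() {
    // The vertex can stop its own speculator without access to the
    // application master's services map.
    speculator.stop();
  }

  SpeculatorService speculator() {
    return speculator;
  }
}

// Hypothetical application master with the workflow described above.
class AppMasterSketch {
  private final List<SimpleService> services = new ArrayList<>();
  private final List<VertexSketch> vertices = new ArrayList<>();

  void addService(SimpleService service) {
    services.add(service);
  }

  // Starts only the services registered so far; no DAG exists yet.
  void serviceStart() {
    for (SimpleService service : services) {
      service.start();
    }
  }

  // Creates the vertices and starts their speculators separately.
  void startDag(String... vertexNames) {
    for (int i = 0; i < vertexNames.length; i++) {
      VertexSketch vertex = new VertexSketch();
      vertex.start();
      vertices.add(vertex);
    }
  }

  // Called at the end of the execution; stops whatever is still running.
  void serviceStop() {
    for (VertexSketch vertex : vertices) {
      vertex.complete();
    }
    for (SimpleService service : services) {
      service.stop();
    }
  }

  List<VertexSketch> vertices() {
    return vertices;
  }
}
```

With this shape, completing one vertex stops only its own speculator, while the speculators of the other vertices keep running until they complete or until {{serviceStop()}} is called. Registering the speculators in the services map instead would keep all of them alive until {{serviceStop()}}.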

I am almost done with implementing the code based on your suggestions. If you 
think that having speculators stay alive until the DAG is completed is 
acceptable, then I will go ahead and upload the patch. Otherwise, I will work 
on a few changes to remove the speculator of a completed vertex.

Let me know what you think.


> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: TEZ-4067.001.patch, TEZ-4067.002.patch, 
> TEZ-4067.003.patch, TEZ-4067.004.patch, TEZ-4067.005.patch
>
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are 
> handled synchronously by the caller (dispatcher). This implies the following:
>  # the dispatcher spends a long time executing updateStatus as it needs to 
> check the runtime estimation of the tezAttempts within the vertex.
>  # the speculator is per stage: launching a speculation may not be the optimal 
> decision. Ideally, based on resources, speculated tasks should be the ones 
> with the slowest progress.
>  # the time between speculations is skewed because there is a big delay for 
> the dispatcher to complete a full cycle. Also, speculation will be more 
> aggressive compared to MR because MR waits for 
> "soonest.retry.after.speculate" whenever a task is speculated. On the other 
> hand, Tez speculates more tasks as it processes stages in parallel.
>  
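
One way to picture the first point of the description is the sketch below, which keeps the dispatcher-side update cheap and moves the scan for slow attempts to a separate thread. It is an illustration only, not the approach taken in the attached patches: {{AsyncSpeculatorSketch}} and all of its methods are hypothetical names, and "slowest" is simplified to "lowest reported progress" rather than a runtime estimate.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical speculator that does not do its scan on the dispatcher thread.
class AsyncSpeculatorSketch {
  private final Map<String, Float> progressByAttempt = new ConcurrentHashMap<>();
  private final ScheduledExecutorService scanner =
      Executors.newSingleThreadScheduledExecutor();
  private volatile String lastCandidate;

  // Called by the dispatcher: only records the latest progress value.
  void updateStatus(String attemptId, float progress) {
    progressByAttempt.put(attemptId, progress);
  }

  // The potentially expensive scan: finds the unfinished attempt with the
  // lowest progress (a simplification of a real runtime estimate).
  Optional<String> slowestAttempt() {
    return progressByAttempt.entrySet().stream()
        .filter(e -> e.getValue() < 1.0f)
        .min(Map.Entry.comparingByValue())
        .map(Map.Entry::getKey);
  }

  // Runs the scan periodically on the scanner thread, so the interval between
  // scans no longer depends on how long a dispatcher cycle takes.
  void start(long periodMillis) {
    scanner.scheduleWithFixedDelay(
        () -> lastCandidate = slowestAttempt().orElse(null),
        periodMillis, periodMillis, TimeUnit.MILLISECONDS);
  }

  void stop() {
    scanner.shutdownNow();
  }

  // Most recent candidate chosen by the background scan, or null if none.
  String lastCandidate() {
    return lastCandidate;
  }
}
```

A caller would invoke {{start(periodMillis)}} once, let the dispatcher call {{updateStatus}} for each status event, and call {{stop()}} when the vertex or DAG finishes.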



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (TEZ-4100) Upgrade to hadoop 3.1.3

2019-11-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977623#comment-16977623
 ] 

László Bodor edited comment on TEZ-4100 at 11/20/19 5:25 PM:
-

[~jeagles]
I think OOZIE-3488 could be a good example for getting rid of some Guava 
dependencies.

"If Tez upgrades, then users using older versions of guava will no longer work"
I understand this scenario. However, IMO there should be a point where Tez 
follows Hadoop even if that breaks users on older Guava. Does that make sense? 
For example, this could be a future upstream Tez release that officially 
supports Hadoop 3.3.x (the Hadoop community already upgraded Guava in the scope 
of HADOOP-16210).


was (Author: abstractdog):
I think OOZIE-3488 could be a good example for getting rid of some Guava 
dependencies.

> Upgrade to hadoop 3.1.3
> ---
>
> Key: TEZ-4100
> URL: https://issues.apache.org/jira/browse/TEZ-4100
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4100.01.patch
>
>






[jira] [Comment Edited] (TEZ-4100) Upgrade to hadoop 3.1.3

2019-11-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977623#comment-16977623
 ] 

László Bodor edited comment on TEZ-4100 at 11/20/19 5:25 PM:
-

[~jeagles]
I think OOZIE-3488 could be a good example for getting rid of some Guava 
dependencies (doing TEZ-4101 about that).

"If Tez upgrades, then users using older versions of guava will no longer work"
I understand this scenario. However, IMO there should be a point where Tez 
follows Hadoop even if that breaks users on older Guava. Does that make sense? 
For example, this could be a future upstream Tez release that officially 
supports Hadoop 3.3.x (the Hadoop community already upgraded Guava in the scope 
of HADOOP-16210).


was (Author: abstractdog):
[~jeagles]
I think OOZIE-3488 could be a good example for getting rid of some Guava 
dependencies.

"If Tez upgrades, then users using older versions of guava will no longer work"
I understand this scenario. However, IMO there should be a point where Tez 
follows Hadoop even if that breaks users on older Guava. Does that make sense? 
For example, this could be a future upstream Tez release that officially 
supports Hadoop 3.3.x (the Hadoop community already upgraded Guava in the scope 
of HADOOP-16210).

> Upgrade to hadoop 3.1.3
> ---
>
> Key: TEZ-4100
> URL: https://issues.apache.org/jira/browse/TEZ-4100
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4100.01.patch
>
>






[jira] [Assigned] (TEZ-4101) Eliminate some guava dependencies by Java8+ features

2019-11-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4101:
-

Assignee: László Bodor

> Eliminate some guava dependencies by Java8+ features
> 
>
> Key: TEZ-4101
> URL: https://issues.apache.org/jira/browse/TEZ-4101
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
>






[jira] [Created] (TEZ-4101) Eliminate some guava dependencies by Java8+ features

2019-11-20 Thread Jira
László Bodor created TEZ-4101:
-

 Summary: Eliminate some guava dependencies by Java8+ features
 Key: TEZ-4101
 URL: https://issues.apache.org/jira/browse/TEZ-4101
 Project: Apache Tez
 Issue Type: Improvement
 Reporter: László Bodor
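
The ticket does not list which Guava usages are in scope, so the following is only a sketch of the general idea, using a few common Guava utilities that have direct JDK 8 equivalents. The class {{JdkInsteadOfGuavaSketch}} and its method names are hypothetical and chosen for illustration; the Guava calls named in the comments are the ones each JDK call would stand in for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical helper showing JDK 8 replacements for some Guava utilities.
class JdkInsteadOfGuavaSketch {

  // Guava: Preconditions.checkNotNull(name, "name")
  static String requireName(String name) {
    return Objects.requireNonNull(name, "name");
  }

  // Guava: Joiner.on(",").join(parts)
  static String joinWithCommas(List<String> parts) {
    return String.join(",", parts);
  }

  // Guava: com.google.common.base.Optional.fromNullable(value).or(fallback)
  static String orDefault(String value, String fallback) {
    return Optional.ofNullable(value).orElse(fallback);
  }

  // Guava: Lists.newArrayList()
  static <T> List<T> newList() {
    return new ArrayList<>();
  }
}
```

The replacements are not always drop-in. For example, {{Joiner.join}} rejects null elements by default, while {{String.join}} renders them as the text "null", so each call site would still need a review.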





