[jira] [Commented] (TEZ-4067) Tez Speculation decision is calculated on each update by the dispatcher
[ https://issues.apache.org/jira/browse/TEZ-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978739#comment-16978739 ]

Ahmed Hussein commented on TEZ-4067:

[~jeagles], I tried to refresh my memory a little bit. There was a check on the service state to prevent starting the service more than once.

The workflow of the {{DAGAppMaster}} is as follows; correct me if I am wrong:
* {{DAGAppMaster}} is created.
* Services get initialized. This is the phase when the services are added to the "{{DAGAppMaster.services}}" map.
* All the services are started inside {{serviceStart.startServices()}}. Note that the {{DAG}} is not created yet.
* {{startDag()}} and {{startDagExecution}} finally create the DAG "{{currentDAG}}" and its vertices.

This workflow requires that speculators are initialized and started separately, after the DAG is created. Although we can still add them to the services map, we cannot assume that they will start automatically in {{DAGAppMaster.serviceStart()}}. The same holds for {{DAGAppMaster.serviceStop()}}, which is called at the end of the execution. Therefore, a service in the "{{DAGAppMaster.services}}" map will stay around until the whole DAG is completed. Since a vertex can complete earlier than that, the speculator service related to that vertex will hang around until the {{DAGAppMaster}} is completed.

If we add the speculators to "{{DAGAppMaster.services}}", we won't be able to remove the service when a vertex is completed, since a {{Vertex/DAGImpl}} does not have access to "{{DAGAppMaster.services}}".

I am almost done with implementing the code based on your suggestions. If you think that having speculators stay alive until the DAG is completed is acceptable, then I will go ahead and upload the patch. Otherwise, I will work on a few changes to remove the speculator of a completed vertex. Let me know WDYT.
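The lifecycle concern in the comment above can be illustrated with a minimal sketch. It does not use the Hadoop or Tez service classes; {{SpeculatorService}}, {{VertexSketch}}, and the {{State}} enum are hypothetical stand-ins invented for illustration. The sketch assumes the alternative the comment mentions: the vertex owns its speculator, starts it once the vertex exists, and stops it when the vertex completes, with a state check that prevents a second start.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: none of these types are Tez or Hadoop classes.
class SpeculatorLifecycleSketch {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    /** Stand-in for a per-vertex speculator service with a guarded state machine. */
    static final class SpeculatorService {
        private final AtomicReference<State> state =
                new AtomicReference<>(State.NOTINITED);

        /** Moves NOTINITED -> INITED; any other state is left unchanged. */
        void init() {
            state.compareAndSet(State.NOTINITED, State.INITED);
        }

        /** Returns true only for the single call that actually starts the service. */
        boolean start() {
            return state.compareAndSet(State.INITED, State.STARTED);
        }

        /** Returns true only for the single call that actually stops the service. */
        boolean stop() {
            return state.compareAndSet(State.STARTED, State.STOPPED);
        }

        State getState() {
            return state.get();
        }
    }

    /** Stand-in for a vertex that owns its speculator instead of a global services map. */
    static final class VertexSketch {
        final SpeculatorService speculator = new SpeculatorService();

        /** Called after the DAG and its vertices exist, not at app-master start. */
        void onVertexStarted() {
            speculator.init();
            speculator.start();
        }

        /** Releases the speculator as soon as the vertex is done. */
        void onVertexCompleted() {
            speculator.stop();
        }
    }

    public static void main(String[] args) {
        VertexSketch vertex = new VertexSketch();
        vertex.onVertexStarted();
        System.out.println("after start: " + vertex.speculator.getState());
        System.out.println("second start accepted: " + vertex.speculator.start());
        vertex.onVertexCompleted();
        System.out.println("after completion: " + vertex.speculator.getState());
    }
}
```

Because the vertex holds the only reference to its speculator, nothing has to be removed from a shared map when the vertex finishes. Whether the actual patch takes this route is not stated in the thread.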
> Tez Speculation decision is calculated on each update by the dispatcher
> ---
>
> Key: TEZ-4067
> URL: https://issues.apache.org/jira/browse/TEZ-4067
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: TEZ-4067.001.patch, TEZ-4067.002.patch, TEZ-4067.003.patch, TEZ-4067.004.patch, TEZ-4067.005.patch
>
> LegacySpeculator is an object field in VertexImpl. Therefore, all events are handled synchronously by the caller (dispatcher). This implies the following:
> # The dispatcher spends a long time executing updateStatus, as it needs to check the runtime estimation of the tezAttempts within the vertex.
> # The speculator is per stage: launching a speculation may not be the optimal decision. Ideally, based on resources, the speculated tasks should be the ones with the slowest progress.
> # The time between speculations is skewed because there is a big delay for the dispatcher to complete a full cycle. Also, speculation will be more aggressive compared to MR, because MR waits for "soonest.retry.after.speculate" whenever a task is speculated. On the other hand, Tez speculates more tasks as it processes stages in parallel.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
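The first point of the issue description (the dispatcher doing the expensive estimation on every status update) suggests decoupling the two. The following is a minimal sketch under stated assumptions, not the TEZ-4067 patch: the class {{AsyncSpeculatorSketch}} and all of its methods are hypothetical, and "lowest reported progress" is used as a simplified stand-in for the real runtime estimation. The dispatcher-side call only records a value; a background thread scans at a fixed interval.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not the Tez LegacySpeculator, and not its API.
class AsyncSpeculatorSketch {
    // Latest reported progress per attempt id, in the range [0, 1].
    private final Map<String, Float> progress = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scanner =
            Executors.newSingleThreadScheduledExecutor();
    private volatile String lastCandidate;

    /** Called on the dispatcher thread: a constant-time map write, no estimation. */
    public void updateStatus(String attemptId, float attemptProgress) {
        progress.put(attemptId, attemptProgress);
    }

    /** Simplified selection: the unfinished attempt with the lowest progress. */
    public Optional<String> slowestAttempt() {
        return progress.entrySet().stream()
                .filter(e -> e.getValue() < 1.0f)
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    /** Starts the periodic scan on its own thread, off the dispatcher. */
    public void start(long intervalMs) {
        scanner.scheduleWithFixedDelay(
                () -> lastCandidate = slowestAttempt().orElse(null),
                intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    /** Stops the background scan. */
    public void stop() {
        scanner.shutdownNow();
    }

    public String lastCandidate() {
        return lastCandidate;
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncSpeculatorSketch speculator = new AsyncSpeculatorSketch();
        speculator.start(50);
        speculator.updateStatus("attempt_1", 0.9f);
        speculator.updateStatus("attempt_2", 0.2f);
        Thread.sleep(200); // give the background scan time to run at least once
        System.out.println("speculation candidate: " + speculator.lastCandidate());
        speculator.stop();
    }
}
```

A fixed delay between scans also addresses the third point in spirit: the interval between speculation decisions no longer depends on how long the dispatcher takes to complete a cycle.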
[jira] [Comment Edited] (TEZ-4100) Upgrade to hadoop 3.1.3
[ https://issues.apache.org/jira/browse/TEZ-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977623#comment-16977623 ]

László Bodor edited comment on TEZ-4100 at 11/20/19 5:25 PM:

[~jeagles] I think OOZIE-3488 could be a good example for getting rid of some Guava dependencies.

"If Tez upgrades, then users using older versions of guava will no longer work"

I understand this scenario; however, IMO there should be a point where Tez follows Hadoop even if that breaks users on an older Guava. Does that make sense? I mean, for example, some future upstream Tez release that officially supports Hadoop 3.3.x (as the Hadoop community already upgraded Guava in the scope of HADOOP-16210).

was (Author: abstractdog):
i think OOZIE-3488 could be a good example for getting rid of some guava dependencies

> Upgrade to hadoop 3.1.3
> ---
>
> Key: TEZ-4100
> URL: https://issues.apache.org/jira/browse/TEZ-4100
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4100.01.patch
[jira] [Comment Edited] (TEZ-4100) Upgrade to hadoop 3.1.3
[ https://issues.apache.org/jira/browse/TEZ-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977623#comment-16977623 ]

László Bodor edited comment on TEZ-4100 at 11/20/19 5:25 PM:

[~jeagles] I think OOZIE-3488 could be a good example for getting rid of some Guava dependencies (doing TEZ-4101 about that).

"If Tez upgrades, then users using older versions of guava will no longer work"

I understand this scenario; however, IMO there should be a point where Tez follows Hadoop even if that breaks users on an older Guava. Does that make sense? I mean, for example, some future upstream Tez release that officially supports Hadoop 3.3.x (as the Hadoop community already upgraded Guava in the scope of HADOOP-16210).

was (Author: abstractdog):
[~jeagles] i think OOZIE-3488 could be a good example for getting rid of some guava dependencies

"If Tez upgrades, then users using older versions of guava will no longer work"

I understand this scenario, however IMO there should be a point where tez will follow hadoop even if it breaks users using older guava, does it make sense? I mean, let's say some next tez upstream release, which officially supports hadoop 3.3.x (as Hadoop community already upgraded guava in the scope of HADOOP-16210)

> Upgrade to hadoop 3.1.3
> ---
>
> Key: TEZ-4100
> URL: https://issues.apache.org/jira/browse/TEZ-4100
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4100.01.patch
[jira] [Assigned] (TEZ-4101) Eliminate some guava dependencies by Java8+ features
[ https://issues.apache.org/jira/browse/TEZ-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor reassigned TEZ-4101:

Assignee: László Bodor

> Eliminate some guava dependencies by Java8+ features
> ---
>
> Key: TEZ-4101
> URL: https://issues.apache.org/jira/browse/TEZ-4101
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
[jira] [Created] (TEZ-4101) Eliminate some guava dependencies by Java8+ features
László Bodor created TEZ-4101:

Summary: Eliminate some guava dependencies by Java8+ features
Key: TEZ-4101
URL: https://issues.apache.org/jira/browse/TEZ-4101
Project: Apache Tez
Issue Type: Improvement
Reporter: László Bodor
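As an illustration of what "eliminating Guava dependencies by Java 8+ features" can look like, here is a minimal sketch. The issue does not list which Guava usages are targeted, so the pairs below are assumptions chosen because they are common, not the content of any TEZ-4101 patch. The class {{GuavaFreeSketch}} and its method names are hypothetical; the Guava calls appear only in comments, so the code needs nothing beyond the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical sketch: the chosen replacements are illustrative assumptions.
class GuavaFreeSketch {

    // Guava: Preconditions.checkNotNull(name, "name")
    static String requireName(String name) {
        return Objects.requireNonNull(name, "name");
    }

    // Guava: Joiner.on(",").join(parts)
    static String join(List<String> parts) {
        return String.join(",", parts);
    }

    // Guava: Lists.newArrayList(Iterables.filter(in, predicate))
    static List<String> nonEmpty(List<String> in) {
        return in.stream()
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // Guava: com.google.common.base.Optional.fromNullable(value).or(fallback)
    static String orDefault(String value, String fallback) {
        return Optional.ofNullable(value).orElse(fallback);
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "", "b");
        System.out.println(join(nonEmpty(parts)));
        System.out.println(orDefault(null, "fallback"));
        System.out.println(requireName("tez"));
    }
}
```

Replacements of this kind only touch internal call sites. They do not by themselves resolve the compatibility question raised in TEZ-4100, where Guava types or versions leak through Hadoop's own classpath.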