[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated TEZ-3274:
-----------------------------
    Attachment: TEZ-3274.009.patch

There's a test failure that came about from fixing a javac error. I talked with [~jeagles] offline and we agreed to leave it the way it was. I also removed an unnecessary pom dependency that was left over from some previous versions of the patch.

> Vertex with MRInput and broadcast input does not respect slow start
> -------------------------------------------------------------------
>
>                 Key: TEZ-3274
>                 URL: https://issues.apache.org/jira/browse/TEZ-3274
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Eric Badger
>         Attachments: TEZ-3274.001.patch, TEZ-3274.002.patch,
> TEZ-3274.003.patch, TEZ-3274.004.patch, TEZ-3274.005.patch,
> TEZ-3274.006.patch, TEZ-3274.007.patch, TEZ-3274.008.patch, TEZ-3274.009.patch
>
>
> Vertices with shuffle input and MRInput choose RootInputVertexManager (and
> not ShuffleVertexManager) and start containers and tasks immediately. In this
> scenario, resources can be wasted since they do not respect
> tez.shuffle-vertex-manager.min-src-fraction and
> tez.shuffle-vertex-manager.max-src-fraction.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated TEZ-3274:
-----------------------------
    Attachment: TEZ-3274.008.patch

[~jeagles], I fixed the javac and findbugs warnings in this patch. Both were bugs from the initial implementation in the ShuffleVertexManager code.
[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated TEZ-3274:
-----------------------------
    Attachment: TEZ-3274.007.patch

New patch that removes the deprecated key. I spoke with [~jeagles] offline, and we think the right call is to move forward with this patch and then address the broken deprecated-keys functionality in follow-up JIRAs. The deprecated keys are not an easy fix because you can't map a single key to multiple values, which is what we would be doing here.
[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated TEZ-3274:
-----------------------------
    Attachment: TEZ-3274.006.patch

Thanks for the review, [~sseth]! Addressed your comments in this new patch.
[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated TEZ-3274:
-----------------------------
    Attachment: TEZ-3274.005.patch

Thanks for the review, [~jeagles]! I addressed your comments in this new patch. I agree that we should address some style and coding-practice issues in a follow-up JIRA. That way we can fix them in both ShuffleVertexManager and RootInputVertexManager in a single patch.
[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated TEZ-3274:
-----------------------------
    Attachment: TEZ-3274.004.patch

Attaching a completely new patch that adds the relevant slow start code from ShuffleVertexManager and ShuffleVertexManagerBase into RootInputVertexManager. It’s quite cumbersome and adds a lot of redundant code, but it allows MRInput + Broadcast Input vertices to benefit from slow start via separate configs. Additionally, I ported over the slow start unit test from TestShuffleVertexManager and fixed up some other tests that broke because of the new feature.

I tested the change on a 15-node cluster. It preserves the previous behavior by default (i.e. no slow start) and is tunable through 3 configs: tez.root-input-vertex-manager.enable.slow-start, tez.root-input-vertex-manager.min-src-fraction, and tez.root-input-vertex-manager.max-src-fraction. Slow start is disabled by default, which sets both the min and max to 0, causing all tasks to start immediately, just as they would in the previous ImmediateStart case. When slow start is enabled, it behaves just like the ShuffleVertexManager case, scheduling tasks linearly between the min/max values.

I tested this with a script that creates a DAG with an MRInput + Broadcast input downstream vertex.

{noformat}
-- Tab-separated values in the input files
A = LOAD '/tmp/data1' as (a, b, c);
B = LOAD '/tmp/data2' as (x, y, z);
C = GROUP A BY a;
D = JOIN B by x, C by group using 'replicated';
STORE D into '/tmp/output';
{noformat}

[~sseth], [~jeagles], [~rohini], [~jlowe], I would appreciate any and all comments!
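
For readers unfamiliar with slow start, the linear scheduling described in the comment above can be sketched as follows. This is a minimal illustration, not the code from the patch: the class SlowStartSketch and the method tasksToSchedule are hypothetical names, and the rounding choice (ceiling) is an assumption. Only the three config key names come from the comment itself.

```java
// Illustrative sketch only; not taken from TEZ-3274.004.patch.
class SlowStartSketch {
    // Config keys as named in the comment above (expanded from its shorthand).
    static final String ENABLE_SLOW_START = "tez.root-input-vertex-manager.enable.slow-start";
    static final String MIN_SRC_FRACTION = "tez.root-input-vertex-manager.min-src-fraction";
    static final String MAX_SRC_FRACTION = "tez.root-input-vertex-manager.max-src-fraction";

    /**
     * Hypothetical helper: how many of this vertex's tasks may be scheduled
     * once completedSrc of totalSrc source tasks have finished.
     * Below min nothing is scheduled, at or above max everything is, and
     * in between the count grows linearly (rounded up, by assumption).
     */
    static int tasksToSchedule(int totalTasks, int completedSrc, int totalSrc,
                               float min, float max) {
        if (totalSrc == 0) {
            return totalTasks; // no sources to wait for
        }
        float done = (float) completedSrc / totalSrc;
        if (done < min) {
            return 0;
        }
        // min == max == 0 models "slow start disabled": start everything immediately.
        if (max <= min || done >= max) {
            return totalTasks;
        }
        float fraction = (done - min) / (max - min);
        return (int) Math.ceil(totalTasks * fraction);
    }

    public static void main(String[] args) {
        // Print the schedule for a 10-task vertex with 100 source tasks.
        for (int completed = 0; completed <= 100; completed += 25) {
            System.out.println(completed + " sources done -> "
                + tasksToSchedule(10, completed, 100, 0.25f, 0.75f) + " tasks");
        }
    }
}
```

With both fractions set to 0 the helper returns every task as soon as it is called, which matches the described default of starting all tasks immediately.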
[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-3274:
--------------------------------
    Summary: Vertex with MRInput and broadcast input does not respect slow start  (was: Vertex with MRInput and shuffle input does not respect slow start)