[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start

2017-06-30 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated TEZ-3274:
----------------------------
Attachment: TEZ-3274.009.patch

The fix for a javac error introduced a test failure. I talked with 
[~jeagles] offline and we agreed to leave the code the way it was. I also 
removed an unnecessary pom dependency left over from earlier versions of the 
patch. 

> Vertex with MRInput and broadcast input does not respect slow start
> -------------------------------------------------------------------
>
> Key: TEZ-3274
> URL: https://issues.apache.org/jira/browse/TEZ-3274
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Eric Badger
> Attachments: TEZ-3274.001.patch, TEZ-3274.002.patch, 
> TEZ-3274.003.patch, TEZ-3274.004.patch, TEZ-3274.005.patch, 
> TEZ-3274.006.patch, TEZ-3274.007.patch, TEZ-3274.008.patch, TEZ-3274.009.patch
>
>
> Vertices with shuffle input and MRInput choose RootInputVertexManager (and 
> not ShuffleVertexManager) and start containers and tasks immediately. In this 
> scenario, resources can be wasted since these vertices do not respect 
> tez.shuffle-vertex-manager.min-src-fraction and 
> tez.shuffle-vertex-manager.max-src-fraction. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start

2017-06-29 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated TEZ-3274:
----------------------------
Attachment: TEZ-3274.008.patch

[~jeagles], I fixed the javac and findbugs issues in this patch. Both came 
from the initial implementation in the ShuffleVertexManager code.



[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start

2017-06-28 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated TEZ-3274:
----------------------------
Attachment: TEZ-3274.007.patch

This new patch removes the deprecated key. I spoke with [~jeagles] offline, 
and we think the best course is to move forward with this patch and address 
the broken deprecated-keys functionality in follow-up JIRAs. The deprecated 
keys are not an easy fix because you can't map a single key to multiple 
values, which is what we would be doing here. 



[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start

2017-06-27 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated TEZ-3274:
----------------------------
Attachment: TEZ-3274.006.patch

Thanks for the review, [~sseth]! Addressed your comments in this new patch. 



[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start

2017-06-27 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated TEZ-3274:
----------------------------
Attachment: TEZ-3274.005.patch

Thanks for the review, [~jeagles]! I addressed your comments in this new patch. 
I agree that we should address some style and coding practice issues in a 
follow-up JIRA. That way we can fix them in both ShuffleVertexManager and 
RootInputVertexManager in a single patch.



[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start

2017-05-25 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated TEZ-3274:
----------------------------
Attachment: TEZ-3274.004.patch

I'm attaching a completely new patch that adds the relevant slow start code 
from ShuffleVertexManager and ShuffleVertexManagerBase to 
RootInputVertexManager. It's quite cumbersome and adds a lot of redundant 
code, but it allows MRInput + Broadcast Input vertices to benefit from slow 
start via separate configs. Additionally, I ported over the slow start unit 
test from TestShuffleVertexManager and fixed up some other tests that broke 
because of the new feature. 

I tested the change on a 15-node cluster. It preserves the previous behavior 
by default (i.e. no slow start) and is tunable by three configs: 
tez.root-input-vertex-manager.enable.slow-start, 
tez.root-input-vertex-manager.min-src-fraction, and 
tez.root-input-vertex-manager.max-src-fraction. 
Slow start is disabled by default, which sets both the min and max to 0, 
causing all tasks to start immediately, just as they would in the previous 
ImmediateStart case. When slow start is enabled, it performs just like the 
ShuffleVertexManager case, scheduling tasks linearly between the min/max 
values. I tested this with the following script, which creates a DAG with an 
MRInput + Broadcast input downstream vertex.

{noformat}
-- Tab-separated values in the input files
A = LOAD '/tmp/data1' as (a, b, c);
B = LOAD '/tmp/data2' as (x, y, z);
C = GROUP A BY a;
D = JOIN B by x, C by group using 'replicated';
STORE D into '/tmp/output';
{noformat}
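
To make the "scheduling tasks linearly between the min/max values" behavior 
concrete, here is a minimal sketch of the ramp. It is not the code from the 
patch: the class and method names (SlowStartSketch, fractionToSchedule, 
tasksToSchedule) are hypothetical, and rounding up with ceil is an assumption 
made for illustration.

```java
// Illustrative sketch only; names and rounding are assumptions, not Tez code.
final class SlowStartSketch {
    private SlowStartSketch() {}

    /**
     * Fraction of this vertex's tasks that may be scheduled, given the
     * fraction of source tasks that have completed and the min/max
     * source fractions.
     */
    static double fractionToSchedule(double completedSrcFraction,
                                     double min, double max) {
        if (min < 0.0 || max > 1.0 || min > max) {
            throw new IllegalArgumentException("need 0 <= min <= max <= 1");
        }
        if (completedSrcFraction < min) {
            return 0.0; // below the min threshold: schedule nothing yet
        }
        double range = max - min;
        if (range <= 0.0) {
            return 1.0; // min == max: schedule everything once min is reached
        }
        // Linear ramp from 0 at min to 1 at max, clamped to [0, 1].
        double f = (completedSrcFraction - min) / range;
        return Math.max(0.0, Math.min(1.0, f));
    }

    /** Number of tasks (out of totalTasks) that may be scheduled so far. */
    static int tasksToSchedule(int totalTasks, int completedSrcTasks,
                               int totalSrcTasks, double min, double max) {
        // A vertex with no source tasks has nothing to wait for.
        double completed = totalSrcTasks == 0
                ? 1.0
                : (double) completedSrcTasks / totalSrcTasks;
        return (int) Math.ceil(totalTasks * fractionToSchedule(completed, min, max));
    }

    public static void main(String[] args) {
        // min = max = 0 (slow start disabled): every task may start at once.
        System.out.println(tasksToSchedule(10, 0, 8, 0.0, 0.0));
        // Half of the sources done, halfway between min and max.
        System.out.println(tasksToSchedule(10, 4, 8, 0.25, 0.75));
    }
}
```

With min and max both at 0, the sketch releases every task before any source 
task has finished, which matches the default described above.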

[~sseth], [~jeagles], [~rohini], [~jlowe], I would appreciate any and all 
comments!



[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start

2017-04-21 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-3274:

Summary: Vertex with MRInput and broadcast input does not respect slow 
start  (was: Vertex with MRInput and shuffle input does not respect slow start)
