[jira] [Created] (TEZ-3318) Tez UI: Polling is not restarted after RM recovery

2016-06-30 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-3318:
---

 Summary: Tez UI: Polling is not restarted after RM recovery
 Key: TEZ-3318
 URL: https://issues.apache.org/jira/browse/TEZ-3318
 Project: Apache Tez
  Issue Type: Bug
Reporter: Sreenath Somarajapuram
Assignee: Sreenath Somarajapuram


For a running DAG, we poll the AM to get progress and other realtime 
information. This communication happens via the RM. If the RM goes down, 
polling is not re-established even after the RM recovers.

Steps to reproduce:
1. Run a job.
2. Go to the DAG details page, and ensure that the progress is getting updated.
3. Stop the RM, and ensure that the error bar is displayed in the UI.
4. Start the RM.
5. Expected: as soon as the RM is online, the progress bar must get updated.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3318) Tez UI: Polling is not restarted after RM recovery

2016-06-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357256#comment-15357256
 ] 

Hitesh Shah commented on TEZ-3318:
--

I think there should be a limit on how many consecutive retries are done. Maybe 
10 minutes in total at the very most? I.e., if polling every 10 seconds, allow 
at most 60 retries. This counter should obviously be reset to 0 on 
the first successful call. 
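
A minimal sketch of the cap suggested above, in Python for illustration only (the Tez UI is not written in Python, and `RetryBudget` and its method names are hypothetical). It assumes a 10-second polling interval and a 10-minute ceiling, which gives the 60 retries mentioned in the comment, and it resets the counter on any successful call.

```python
class RetryBudget:
    """Counts consecutive failed polls and stops retrying once a cap is hit."""

    def __init__(self, poll_interval_s=10, max_total_s=600):
        # 600 s / 10 s = 60 consecutive retries, the figures from the comment.
        self.max_retries = max_total_s // poll_interval_s
        self.failures = 0

    def record_success(self):
        # Any successful call resets the counter to 0.
        self.failures = 0

    def record_failure(self):
        """Record one failed poll; return True if another retry is allowed."""
        self.failures += 1
        return self.failures <= self.max_retries
```

With these defaults, the 61st consecutive failure is the first one for which `record_failure()` returns False.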



[jira] [Updated] (TEZ-3286) Allow clients to set processor reserved memory per vertex (instead of per container)

2016-06-30 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-3286:
-
Attachment: TEZ-3286.2.patch

Addressing some of the comments. 

> Allow clients to set processor reserved memory per vertex (instead of per 
> container)
> 
>
> Key: TEZ-3286
> URL: https://issues.apache.org/jira/browse/TEZ-3286
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.3
>Reporter: Wei Zheng
>Assignee: Hitesh Shah
> Attachments: TEZ-3286.1.patch, TEZ-3286.2.patch
>
>
> tez.task.scale.memory.reserve-fraction can be set by clients to control how 
> much memory is available to the processor. This value applies at the container 
> level, though, instead of at the vertex level.
> In the case of a hash join, the processor typically needs more memory. In the 
> case of a shuffle join, the processor may not need as much. In DAGs with a mix 
> of map joins and shuffle joins, setting this at the container level is 
> sub-optimal.
> To a large extent this comes down to propagating vertex configs to the 
> container / task.
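
One way to picture the requested behaviour is a lookup that prefers a vertex-level value over the container-level one. This is only a sketch and not the code in the attached patches: the key `tez.task.scale.memory.reserve-fraction` comes from the issue, while the function, the dictionaries, and all numeric values are made up for illustration.

```python
RESERVE_KEY = "tez.task.scale.memory.reserve-fraction"

def reserve_fraction(vertex_conf, container_conf, fallback):
    """Prefer the vertex-level setting, then the container-level one, then the fallback."""
    if RESERVE_KEY in vertex_conf:
        return float(vertex_conf[RESERVE_KEY])
    return float(container_conf.get(RESERVE_KEY, fallback))

# Illustrative values only: a hash-join vertex asks for more processor memory
# than the container-wide setting; a shuffle-join vertex sets nothing.
container_conf = {RESERVE_KEY: "0.3"}
hash_join_vertex = {RESERVE_KEY: "0.6"}
shuffle_join_vertex = {}

hash_join_fraction = reserve_fraction(hash_join_vertex, container_conf, 0.3)
shuffle_join_fraction = reserve_fraction(shuffle_join_vertex, container_conf, 0.3)
```

In this sketch the hash-join vertex gets its own value while the shuffle-join vertex falls back to the container-wide one.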





[jira] [Updated] (TEZ-3286) Allow clients to set processor reserved memory per vertex (instead of per container)

2016-06-30 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-3286:
-
Attachment: TEZ-3286.3.patch

Missed the config scope changes. 



[jira] [Created] (TEZ-3319) tez-history-parser should not have its own Version class

2016-06-30 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-3319:


 Summary: tez-history-parser should not have its own Version class
 Key: TEZ-3319
 URL: https://issues.apache.org/jira/browse/TEZ-3319
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Rajesh Balamohan
Priority: Critical


This will hopefully restrict problems such as TEZ-3313 to a single 
implementation  

\cc [~rajesh.balamohan] [~ozawa]





[jira] [Commented] (TEZ-3319) tez-history-parser should not have its own Version class

2016-06-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357433#comment-15357433
 ] 

Hitesh Shah commented on TEZ-3319:
--

[~rajesh.balamohan], would you mind creating JIRAs for any other code 
duplication in place today?



[jira] [Commented] (TEZ-3319) tez-history-parser should not have its own Version class

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357458#comment-15357458
 ] 

Tsuyoshi Ozawa commented on TEZ-3319:
-

I agree with the solution :-)



Success: TEZ-3286 PreCommit Build #1820

2016-06-30 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3286
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1820/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 4136 lines...]
[INFO] Tez ... SUCCESS [  0.022 s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 58:41 min
[INFO] Finished at: 2016-06-30T17:57:34+00:00
[INFO] Final Memory: 74M/1181M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12815501/TEZ-3286.3.patch
  against master revision 540eab0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1820//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1820//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
dcc42afb2da240f23adbe8e7bda434af3093275d logged out


==
==
Finished build.
==
==


Archiving artifacts
[description-setter] Description set: TEZ-3286
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-3286) Allow clients to set processor reserved memory per vertex (instead of per container)

2016-06-30 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357562#comment-15357562
 ] 

TezQA commented on TEZ-3286:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12815501/TEZ-3286.3.patch
  against master revision 540eab0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1820//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1820//console

This message is automatically generated.



[jira] [Updated] (TEZ-3318) Tez UI: Polling is not restarted after RM recovery

2016-06-30 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-3318:

Attachment: TEZ-3318.1.patch



[jira] [Updated] (TEZ-3318) Tez UI: Polling is not restarted after RM recovery

2016-06-30 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-3318:

Target Version/s: 0.9.0



[jira] [Commented] (TEZ-3318) Tez UI: Polling is not restarted after RM recovery

2016-06-30 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357636#comment-15357636
 ] 

Sreenath Somarajapuram commented on TEZ-3318:
-

[~hitesh]
When polling fails, we don't retry the poll (from the AM). Instead, we reload 
the page at double the interval; i.e., the polling delay is 3 seconds, and if 
the RM is not reachable we reload the page (from ATS) every 6 seconds until 
either 1. the RM is reachable, or 2. the application is complete.

Considering that, do we need this retry limit? Adding the limit is a small 
change, though.
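
The behaviour described above can be sketched as a small decision function. It is written in Python purely for illustration and is not the Tez UI code; `next_step` and the action names are hypothetical, and only the 3-second and 6-second delays come from the comment.

```python
POLL_DELAY_S = 3  # polling delay quoted in the comment

def next_step(rm_reachable, app_complete, poll_delay_s=POLL_DELAY_S):
    """Return (action, delay_seconds) for the next timer tick."""
    if app_complete:
        # Nothing left to refresh once the application has finished.
        return ("stop", 0)
    if rm_reachable:
        # Normal case: keep polling the AM through the RM.
        return ("poll_am", poll_delay_s)
    # RM unreachable: reload the page from ATS at double the polling delay.
    return ("reload_from_ats", 2 * poll_delay_s)
```

Because the reload branch repeats until the RM is reachable or the application completes, there is no retry counter in this flow, which is the point behind the question about the limit.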




[jira] [Assigned] (TEZ-3320) Java implementation of bitonic merge sort

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa reassigned TEZ-3320:
---

Assignee: Tsuyoshi Ozawa

> Java implementation of bitonic merge sort
> -
>
> Key: TEZ-3320
> URL: https://issues.apache.org/jira/browse/TEZ-3320
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Tsuyoshi Ozawa
>Assignee: Tsuyoshi Ozawa
>
> Pure java cache-aware bitonic merge sort without JNI can solve the bottleneck 
> of sort. 





[jira] [Created] (TEZ-3320) Java implementation of bitonic merge sort

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)
Tsuyoshi Ozawa created TEZ-3320:
---

 Summary: Java implementation of bitonic merge sort
 Key: TEZ-3320
 URL: https://issues.apache.org/jira/browse/TEZ-3320
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Tsuyoshi Ozawa


Pure java cache-aware bitonic merge sort without JNI can solve the bottleneck 
of sort. 





[jira] [Updated] (TEZ-3314) Double counting input bytes in MultiMRInput

2016-06-30 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-3314:

Fix Version/s: 0.8.4

> Double counting input bytes in MultiMRInput
> ---
>
> Key: TEZ-3314
> URL: https://issues.apache.org/jira/browse/TEZ-3314
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Fix For: 0.9.0, 0.8.4
>
> Attachments: TEZ-3314.0.patch
>
>
> TEZ_INPUT_SPLIT_LENGTH is incremented twice if useNewAPI is set.





[jira] [Updated] (TEZ-3291) Optimize splits grouping when locality information is not available

2016-06-30 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-3291:

Fix Version/s: 0.8.4
   0.9.0

> Optimize splits grouping when locality information is not available
> ---
>
> Key: TEZ-3291
> URL: https://issues.apache.org/jira/browse/TEZ-3291
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 0.9.0, 0.8.4
>
> Attachments: TEZ-3291.004.patch, TEZ-3291.2.patch, TEZ-3291.3.patch, 
> TEZ-3291.4.patch, TEZ-3291.5.patch, TEZ-3291.WIP.patch
>
>
> There are scenarios where splits might not contain location details. S3 
> is an example, where all splits would have "localhost" for the location 
> details. In such cases, the current split computation does not go through the 
> rack-local and allow-small-groups optimizations and ends up creating a small 
> number of splits. Depending on the cluster, this can end up creating 
> long-running map jobs.
> Example with hive:
> ==
> 1. The inventory table in the tpc-ds dataset is partitioned and is a 
> relatively small table.
> 2. With query-22, hive requests splits with an original split count of 52, and 
> the overall length of the splits is around 12061817 bytes. 
> {{tez.grouping.min-size}} was set to 16 MB.
> 3. In tez splits grouping, this ends up creating a single split with 52+ 
> files to be processed in the split.  In clusters with split locations, this 
> would have ended up with multiple splits since {{allowSmallGroups}} would 
> have kicked in.
> But in S3, since everything would have "localhost", all splits get added to a 
> single group. This makes things a lot worse.
> 4. Depending on the dataset and the format, this can be problematic. For 
> instance, file open calls and random seeks can be expensive in S3.
> 5. In this case, 52 files have to be opened and processed by a single task in 
> sequential fashion. Had they been processed by multiple tasks, response time 
> would have been drastically reduced.
> E.g. log details
> {noformat}
> 2016-06-01 13:48:08,353 [INFO] [InputInitializer {Map 2} #0] 
> |split.TezMapredSplitsGrouper|: Grouping splits in Tez
> 2016-06-01 13:48:08,353 [INFO] [InputInitializer {Map 2} #0] 
> |split.TezMapredSplitsGrouper|: Desired splits: 110 too large.  Desired 
> splitLength: 109652 Min splitLength: 16777216 New desired splits: 1 Total 
> length: 12061817 Original splits: 52
> 2016-06-01 13:48:08,354 [INFO] [InputInitializer {Map 2} #0] 
> |split.TezMapredSplitsGrouper|: Desired numSplits: 1 lengthPerGroup: 12061817 
> numLocations: 1 numSplitsPerLocation: 52 numSplitsInGroup: 52 totalLength: 
> 12061817 numOriginalSplits: 52 . Grouping by length: true count: false
> 2016-06-01 13:48:08,354 [INFO] [InputInitializer {Map 2} #0] 
> |split.TezMapredSplitsGrouper|: Number of splits desired: 1 created: 1 
> splitsProcessed: 52
> {noformat}
> Alternate options:
> ==
> 1. Force Hadoop to provide bogus locations for S3. But it is not clear that 
> this would be accepted anytime soon. Ref: HADOOP-12878
> 2. Set {{tez.grouping.min-size}} to a very low value. But should the end 
> user always be doing this on a query-by-query basis?
> 3. When {{(lengthPerGroup < "tez.grouping.min-size")}}, recompute 
> desiredNumSplits only when the number of distinct locations in the splits is > 1. 
> This would force more splits to be generated.
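
The numbers in the log above can be reproduced with a few lines of arithmetic. This is a sketch in Python, not the TezMapredSplitsGrouper code: the function name is hypothetical, and the "+ 1" used when recomputing the split count is an assumption chosen to match the logged "New desired splits: 1".

```python
MIN_SPLIT_LENGTH = 16 * 1024 * 1024  # 16 MB, i.e. the logged 16777216

def regroup(total_length, desired_splits, min_split_length=MIN_SPLIT_LENGTH):
    """Return (desired_splits, length_per_group) after applying the min-size check."""
    length_per_group = total_length // desired_splits
    if length_per_group < min_split_length:
        # Groups would be too small, so the split count is recomputed from the
        # minimum size (assumed formula; see the note above).
        desired_splits = total_length // min_split_length + 1
        length_per_group = total_length // desired_splits
    return desired_splits, length_per_group

# Values taken from the log: total length 12061817 bytes, 110 desired splits.
initial_length = 12061817 // 110
final_splits, final_length = regroup(12061817, 110)
```

Here `initial_length` matches the logged "Desired splitLength", and since it is far below the 16 MB minimum, the count collapses to a single group holding all 52 original splits.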





[jira] [Commented] (TEZ-3314) Double counting input bytes in MultiMRInput

2016-06-30 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357941#comment-15357941
 ] 

Siddharth Seth commented on TEZ-3314:
-

Pulled into branch-0.8 as well.



[jira] [Assigned] (TEZ-3303) Have ShuffleVertexManager consume more precise partition stats

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa reassigned TEZ-3303:
---

Assignee: Tsuyoshi Ozawa

> Have ShuffleVertexManager consume more precise partition stats
> --
>
> Key: TEZ-3303
> URL: https://issues.apache.org/jira/browse/TEZ-3303
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Tsuyoshi Ozawa
>
> TEZ-3216 adds the support for more precise partition stats. 
> ShuffleVertexManager should be updated to consume the more precise partition 
> stats.





[jira] [Updated] (TEZ-3303) Have ShuffleVertexManager consume more precise partition stats

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated TEZ-3303:

Attachment: TEZ-3303.001.patch

Attaching a patch to consume more precise partition stats.



[jira] [Commented] (TEZ-2962) Use per partition stats in shuffle vertex manager auto parallelism

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357961#comment-15357961
 ] 

Tsuyoshi Ozawa commented on TEZ-2962:
-

This can be done after TEZ-3303.

> Use per partition stats in shuffle vertex manager auto parallelism
> --
>
> Key: TEZ-2962
> URL: https://issues.apache.org/jira/browse/TEZ-2962
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Priority: Critical
>
> The original code used output size sent by completed tasks. Recently per 
> partition stats have been added that provide granular information. Using 
> partition stats may be more accurate and also remove the duplicate counting 
> of data size in partition stats and per task overall.





[jira] [Commented] (TEZ-3293) Fetch failures can cause a shuffle hang waiting for memory merge that never starts

2016-06-30 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357988#comment-15357988
 ] 

Siddharth Seth commented on TEZ-3293:
-

+1. Looks good. Thanks [~jlowe]. Apologies for the delay in the review.

> Fetch failures can cause a shuffle hang waiting for memory merge that never 
> starts
> --
>
> Key: TEZ-3293
> URL: https://issues.apache.org/jira/browse/TEZ-3293
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.1, 0.8.3
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Attachments: TEZ-3293.001.patch
>
>
> Tez jobs can hang in shuffle waiting for a memory merge that never starts.  
> When a MapOutput is reserved it increments usedMemory but when it is 
> unreserved it decrements usedMemory _and_ commitMemory.  If enough shuffle 
> failures occur of sufficient size then commitMemory may never reach the merge 
> threshold even after all outstanding transfers have committed and thus hang 
> the shuffle.
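
The accounting problem described above can be illustrated with a toy model. This is a Python sketch under stated assumptions, not the Tez merge manager: the class, its methods, and the sizes are hypothetical, and only the usedMemory / commitMemory bookkeeping follows the description. The `buggy` flag switches between decrementing both counters on unreserve and decrementing usedMemory only.

```python
class MergeAccounting:
    """Toy model of the usedMemory / commitMemory counters from the description."""

    def __init__(self, merge_threshold, buggy):
        self.merge_threshold = merge_threshold
        self.buggy = buggy
        self.used_memory = 0
        self.commit_memory = 0

    def reserve(self, size):
        # Reserving a map output only increments usedMemory.
        self.used_memory += size

    def unreserve(self, size):
        # Called when a fetch fails before its output was committed.
        self.used_memory -= size
        if self.buggy:
            # The reported behaviour: commitMemory is decremented as well,
            # although this output never contributed to it.
            self.commit_memory -= size

    def commit(self, size):
        """Commit a fetched output; return True if the merge threshold is reached."""
        self.commit_memory += size
        return self.commit_memory >= self.merge_threshold


def merge_starts(buggy):
    acct = MergeAccounting(merge_threshold=100, buggy=buggy)
    acct.reserve(60)
    acct.commit(60)      # output A committed
    acct.reserve(60)
    acct.unreserve(60)   # output B fails and is unreserved
    acct.reserve(40)
    return acct.commit(40)  # output C committed; A + C should reach 100
```

In this scenario `merge_starts(True)` stays below the threshold even though 100 units have been committed, while `merge_starts(False)` reaches it.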





[jira] [Commented] (TEZ-3286) Allow clients to set processor reserved memory per vertex (instead of per container)

2016-06-30 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358029#comment-15358029
 ] 

Siddharth Seth commented on TEZ-3286:
-

+1. Looks good. Thanks [~hitesh]. Will commit in a bit with a small change to 
add a timeout to the new tests.



[jira] [Updated] (TEZ-3286) Allow clients to set processor reserved memory per vertex (instead of per container)

2016-06-30 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-3286:

Attachment: TEZ-3286.3.withTestTimeout.txt



[jira] [Updated] (TEZ-3293) Fetch failures can cause a shuffle hang waiting for memory merge that never starts

2016-06-30 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-3293:

Attachment: TEZ-3293.001-branch-0.7.patch

Patch for branch-0.7



[jira] [Created] (TEZ-3321) Changes for 0.8.4 release

2016-06-30 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-3321:
---

 Summary: Changes for 0.8.4 release
 Key: TEZ-3321
 URL: https://issues.apache.org/jira/browse/TEZ-3321
 Project: Apache Tez
  Issue Type: Task
Reporter: Siddharth Seth
Assignee: Siddharth Seth








Failed: TEZ-3293 PreCommit Build #1823

2016-06-30 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3293
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1823/

###
## LAST 60 LINES OF THE CONSOLE 
###
Started by remote host 127.0.0.1
[EnvInject] - Loading node environment variables.
Building remotely on H5 (Mapreduce Falcon Hadoop Pig Zookeeper Tez Hdfs 
yahoo-not-h2) in workspace 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://git-wip-us.apache.org/repos/asf/tez.git 
 > # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/tez.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/tez.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision 3b08cbf907784de463c9e3c05147b5c6d681251d 
(refs/remotes/origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3b08cbf907784de463c9e3c05147b5c6d681251d
 > git rev-list 71bb2defe97e55e3bf7dbb299fe33ab8a667e7a1 # timeout=10
No emails were triggered.
[PreCommit-TEZ-Build] $ /bin/bash /tmp/hudson2317049414711291726.sh
Running in Jenkins mode


==
==
Testing patch for TEZ-3293.
==
==


HEAD is now at 3b08cbf TEZ-3286. Allow clients to set processor reserved memory 
per vertex (instead of per container). Contributed by Hitesh Shah.
Previous HEAD position was 3b08cbf... TEZ-3286. Allow clients to set processor 
reserved memory per vertex (instead of per container). Contributed by Hitesh 
Shah.
Switched to branch 'master'
Your branch is behind 'origin/master' by 7 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
First, rewinding head to replay your work on top of it...
Fast-forwarded master to 3b08cbf907784de463c9e3c05147b5c6d681251d.
TEZ-3293 is not "Patch Available".  Exiting.


==
==
Finished build.
==
==


Archiving artifacts
ERROR: No artifacts found that match the file pattern "patchprocess/*.*". 
Configuration error?
ERROR: 'patchprocess/*.*' doesn't match anything, but '*.*' does. Perhaps 
that's what you mean?
Build step 'Archive the artifacts' changed build result to FAILURE
[description-setter] Could not determine description.
Recording test results
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Updated] (TEZ-3321) Changes for 0.8.4 release

2016-06-30 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-3321:

Attachment: TEZ-3321.part1.txt

Patch for branch-0.8 to update version and add section to CHANGES.txt

> Changes for 0.8.4 release
> -
>
> Key: TEZ-3321
> URL: https://issues.apache.org/jira/browse/TEZ-3321
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-3321.part1.txt
>
>






[jira] [Commented] (TEZ-3287) Have UnorderedPartitionedKVWriter honor tez.runtime.empty.partitions.info-via-events.enabled

2016-06-30 Thread Lalitha Viswanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358060#comment-15358060
 ] 

Lalitha Viswanathan commented on TEZ-3287:
--

Which version of the Tez source code does the patch target? 
I applied the patch manually to version 0.8.3, compiled it, and deployed it 
in my cluster, but I couldn't get the "hive.tez.auto.reducer.parallelism=true" 
optimization working with shuffle hash join. 
Am I missing something?

> Have UnorderedPartitionedKVWriter honor 
> tez.runtime.empty.partitions.info-via-events.enabled
> 
>
> Key: TEZ-3287
> URL: https://issues.apache.org/jira/browse/TEZ-3287
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Tsuyoshi Ozawa
> Attachments: TEZ-3287.001.patch
>
>
> The ordered partitioned output allows applications to specify if empty 
> partition stats should be included as part of DataMovementEvent via a 
> configuration. It seems unordered partitioned output should honor that 
> configuration as well.





Success: TEZ-3303 PreCommit Build #1821

2016-06-30 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3303
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1821/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 4116 lines...]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 56:30 min
[INFO] Finished at: 2016-06-30T23:47:20+00:00
[INFO] Final Memory: 72M/882M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12815554/TEZ-3303.001.patch
  against master revision ac9cfb9.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1821//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1821//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
c34874fa138f25f5e30f4ae9950b579317b08081 logged out


==
==
Finished build.
==
==


Archiving artifacts
Compressed 3.20 MB of artifacts by 27.4% relative to #1820
[description-setter] Description set: TEZ-3303
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-3303) Have ShuffleVertexManager consume more precise partition stats

2016-06-30 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358062#comment-15358062
 ] 

TezQA commented on TEZ-3303:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12815554/TEZ-3303.001.patch
  against master revision ac9cfb9.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1821//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1821//console

This message is automatically generated.

> Have ShuffleVertexManager consume more precise partition stats
> --
>
> Key: TEZ-3303
> URL: https://issues.apache.org/jira/browse/TEZ-3303
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Tsuyoshi Ozawa
> Attachments: TEZ-3303.001.patch
>
>
> TEZ-3216 adds the support for more precise partition stats. 
> ShuffleVertexManager should be updated to consume the more precise partition 
> stats.





[jira] [Commented] (TEZ-3287) Have UnorderedPartitionedKVWriter honor tez.runtime.empty.partitions.info-via-events.enabled

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358068#comment-15358068
 ] 

Tsuyoshi Ozawa commented on TEZ-3287:
-

[~lmv] thanks for taking a look! The patch targets master, not 
branch-0.8. After it is merged into master, I can backport it to branch-0.8.

> Have UnorderedPartitionedKVWriter honor 
> tez.runtime.empty.partitions.info-via-events.enabled
> 
>
> Key: TEZ-3287
> URL: https://issues.apache.org/jira/browse/TEZ-3287
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Tsuyoshi Ozawa
> Attachments: TEZ-3287.001.patch
>
>
> The ordered partitioned output allows applications to specify if empty 
> partition stats should be included as part of DataMovementEvent via a 
> configuration. It seems unordered partitioned output should honor that 
> configuration as well.





[jira] [Comment Edited] (TEZ-3287) Have UnorderedPartitionedKVWriter honor tez.runtime.empty.partitions.info-via-events.enabled

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358068#comment-15358068
 ] 

Tsuyoshi Ozawa edited comment on TEZ-3287 at 6/30/16 11:57 PM:
---

[~lmv] thanks for taking a look! The patch targets master, not 
branch-0.8. After it is merged into master, I can backport it to branch-0.8.

BTW, I don't think "hive.tez.auto.reducer.parallelism=true" is related to 
this JIRA. Let me know if I'm wrong.


was (Author: ozawa):
[~lmv] thanks for taking a look! The patch targets master, not 
branch-0.8. After it is merged into master, I can backport it to branch-0.8.

> Have UnorderedPartitionedKVWriter honor 
> tez.runtime.empty.partitions.info-via-events.enabled
> 
>
> Key: TEZ-3287
> URL: https://issues.apache.org/jira/browse/TEZ-3287
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Tsuyoshi Ozawa
> Attachments: TEZ-3287.001.patch
>
>
> The ordered partitioned output allows applications to specify if empty 
> partition stats should be included as part of DataMovementEvent via a 
> configuration. It seems unordered partitioned output should honor that 
> configuration as well.





Success: TEZ-3286 PreCommit Build #1822

2016-06-30 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3286
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1822/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 4136 lines...]
[INFO] Tez ... SUCCESS [  0.031 s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 55:12 min
[INFO] Finished at: 2016-07-01T00:41:14+00:00
[INFO] Final Memory: 85M/1069M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  
http://issues.apache.org/jira/secure/attachment/12815571/TEZ-3286.3.withTestTimeout.txt
  against master revision 71bb2de.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1822//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1822//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
affaacd3419fd9fe75fec70d4bf26b324fc171da logged out


==
==
Finished build.
==
==


Archiving artifacts
[description-setter] Description set: TEZ-3286
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-3286) Allow clients to set processor reserved memory per vertex (instead of per container)

2016-06-30 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358113#comment-15358113
 ] 

TezQA commented on TEZ-3286:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  
http://issues.apache.org/jira/secure/attachment/12815571/TEZ-3286.3.withTestTimeout.txt
  against master revision 71bb2de.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1822//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1822//console

This message is automatically generated.

> Allow clients to set processor reserved memory per vertex (instead of per 
> container)
> 
>
> Key: TEZ-3286
> URL: https://issues.apache.org/jira/browse/TEZ-3286
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.3
>Reporter: Wei Zheng
>Assignee: Hitesh Shah
> Fix For: 0.9.0, 0.8.4
>
> Attachments: TEZ-3286.1.patch, TEZ-3286.2.patch, TEZ-3286.3.patch, 
> TEZ-3286.3.withTestTimeout.txt
>
>
> tez.task.scale.memory.reserve-fraction can be set by clients to control how 
> much memory is available to the processor. This value applies at the container 
> level, though, instead of at the vertex level.
> In the case of a hash join, the processor typically needs more memory. In the 
> case of a shuffle join, the processor may not need as much. In DAGs with a mix 
> of map joins and shuffle joins, setting this at the container level is 
> sub-optimal.
> To a large extent this comes down to propagating vertex configs to the 
> container / task.
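
The per-vertex override described above can be pictured with a small sketch. 
This is not the Tez implementation: the function name, the dictionary-based 
configs, and the fallback value are hypothetical and exist only to illustrate 
a vertex-level setting taking precedence over the DAG-wide one. Only the 
property name comes from the issue.

```python
# Illustrative only: plain dictionaries stand in for the real configuration objects.
RESERVE_FRACTION_KEY = "tez.task.scale.memory.reserve-fraction"

# Placeholder fallback used when neither level sets the key (an assumption,
# not necessarily the real Tez default).
FALLBACK_FRACTION = 0.3


def resolve_reserve_fraction(dag_conf, vertex_conf):
    """Return the reserve fraction for one vertex.

    Values set on the vertex win over values set for the whole DAG,
    which is the "propagate vertex configs to the task" idea.
    """
    merged = {**dag_conf, **vertex_conf}  # later dict overrides earlier one
    return float(merged.get(RESERVE_FRACTION_KEY, FALLBACK_FRACTION))


if __name__ == "__main__":
    dag_conf = {RESERVE_FRACTION_KEY: "0.3"}
    hash_join_vertex = {RESERVE_FRACTION_KEY: "0.5"}  # asks for its own value
    shuffle_join_vertex = {}                          # inherits the DAG value

    print(resolve_reserve_fraction(dag_conf, hash_join_vertex))
    print(resolve_reserve_fraction(dag_conf, shuffle_join_vertex))
```

With this shape, a DAG that mixes map joins and shuffle joins can give each 
vertex its own value instead of sharing one container-level setting.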





[jira] [Commented] (TEZ-3303) Have ShuffleVertexManager consume more precise partition stats

2016-06-30 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358196#comment-15358196
 ] 

Tsuyoshi Ozawa commented on TEZ-3303:
-

[~sseth] could you take a look?

> Have ShuffleVertexManager consume more precise partition stats
> --
>
> Key: TEZ-3303
> URL: https://issues.apache.org/jira/browse/TEZ-3303
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Tsuyoshi Ozawa
> Attachments: TEZ-3303.001.patch
>
>
> TEZ-3216 adds the support for more precise partition stats. 
> ShuffleVertexManager should be updated to consume the more precise partition 
> stats.





[jira] [Created] (TEZ-3322) Add Apache license to the generated tez-configuration-template files

2016-06-30 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-3322:
---

 Summary: Add Apache license to the generated 
tez-configuration-template files
 Key: TEZ-3322
 URL: https://issues.apache.org/jira/browse/TEZ-3322
 Project: Apache Tez
  Issue Type: Task
Reporter: Siddharth Seth








[jira] [Commented] (TEZ-3318) Tez UI: Polling is not restarted after RM recovery

2016-06-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358255#comment-15358255
 ] 

Hitesh Shah commented on TEZ-3318:
--

Does the polling interval reset back to 3 on any successful call? 

> Tez UI: Polling is not restarted after RM recovery
> --
>
> Key: TEZ-3318
> URL: https://issues.apache.org/jira/browse/TEZ-3318
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-3318.1.patch
>
>
> For a running DAG, we poll the AM to get progress and other realtime 
> information. This communication happens via the RM. If the RM goes down, 
> polling is not re-established even after the RM recovers.
> Steps to reproduce:
> 1. Run a job
> 2. Go to DAG details page, and ensure that the progress is getting updated.
> 3. Stop RM, and ensure that error bar is getting displayed in the UI.
> 4. Start RM.
> 5. As soon as RM is online, the progress bar must get updated.
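
The behaviour discussed in this thread (keep polling while the RM is down, cap 
the number of consecutive retries, and reset on the first successful call) can 
be sketched as a small state tracker. This is not the code in 
TEZ-3318.1.patch: the class name, the doubling back-off, and the numeric 
limits are assumptions chosen for illustration. Only the idea of a 3-unit base 
interval and a reset on success comes from the comments.

```python
class PollState:
    """Tracks the polling interval and the count of consecutive failures."""

    def __init__(self, base_interval=3, max_interval=60, max_failures=60):
        self.base_interval = base_interval  # interval used while calls succeed
        self.max_interval = max_interval    # upper bound while backing off
        self.max_failures = max_failures    # stop retrying after this many in a row
        self.interval = base_interval
        self.failures = 0

    def on_success(self):
        # Any successful call resets both the interval and the retry counter.
        self.interval = self.base_interval
        self.failures = 0

    def on_failure(self):
        # Doubling is an assumed back-off policy; the interval never exceeds the cap.
        self.failures += 1
        self.interval = min(self.interval * 2, self.max_interval)

    def should_continue(self):
        # Polling goes on until too many consecutive calls have failed.
        return self.failures < self.max_failures


if __name__ == "__main__":
    state = PollState()
    state.on_failure()          # RM is down: back off
    state.on_failure()
    print(state.interval, state.failures)
    state.on_success()          # RM is back: return to the base interval
    print(state.interval, state.failures)
```

A caller would schedule the next request after state.interval and stop once 
should_continue() returns False.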





[jira] [Commented] (TEZ-3303) Have ShuffleVertexManager consume more precise partition stats

2016-06-30 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358300#comment-15358300
 ] 

Siddharth Seth commented on TEZ-3303:
-

[~ozawa] - I don't think the patch actually makes use of the stats. It needs to 
check which stats are set - and use that set appropriately. If I'm not mistaken 
the current patch only checks and reads detailed stats, but does nothing with 
them. cc [~mingma] - in case you'd like to review the patch when it's updated.
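
The review point above (check which kind of stats is set, then actually use 
that set) can be sketched roughly as follows. This is not 
ShuffleVertexManager code: the function names and the list-based stats are 
hypothetical, and summing the sizes is just one example of consuming them.

```python
def select_partition_stats(coarse_stats, detailed_stats):
    """Pick the stats to use, preferring the detailed ones when they are set.

    Returns a (kind, sizes) pair so the caller knows which set it got.
    """
    if detailed_stats is not None:
        return "detailed", list(detailed_stats)
    return "coarse", list(coarse_stats)


def total_output_size(coarse_stats, detailed_stats):
    """Consume the selected stats rather than only reading them."""
    _, sizes = select_partition_stats(coarse_stats, detailed_stats)
    return sum(sizes)


if __name__ == "__main__":
    coarse = [10, 100]   # rough per-partition estimates (made-up numbers)
    detailed = [7, 42]   # more precise per-partition sizes (made-up numbers)
    print(total_output_size(coarse, detailed))
    print(total_output_size(coarse, None))
```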

> Have ShuffleVertexManager consume more precise partition stats
> --
>
> Key: TEZ-3303
> URL: https://issues.apache.org/jira/browse/TEZ-3303
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Tsuyoshi Ozawa
> Attachments: TEZ-3303.001.patch
>
>
> TEZ-3216 adds the support for more precise partition stats. 
> ShuffleVertexManager should be updated to consume the more precise partition 
> stats.


