[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-24 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421423#comment-13421423
 ] 

Tom White commented on MAPREDUCE-4400:
--

+1

> Fix performance regression for small jobs/workflows
> ---
>
> Key: MAPREDUCE-4400
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4400
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: performance, task
> Affects Versions: 0.20.203.0, 1.0.3
> Reporter: Luke Lu
> Assignee: Luke Lu
> Attachments: mapreduce-4400-branch-1.patch
>
>
> There is a significant performance regression for small jobs/workflows (vs 
> 0.20.2) in the Hadoop 1.x series. Most noticeable with Hive and Pig jobs. 
> PigMix has an average 40% regression against 0.20.2.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-19 Thread Shrinivas Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418913#comment-13418913
 ] 

Shrinivas Joshi commented on MAPREDUCE-4400:


As I said earlier, I verified that this patch works as expected. It does 
reduce synchronization compared with MAPREDUCE-3809. The patch looks good to 
me to commit.





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-19 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418699#comment-13418699
 ] 

Tom White commented on MAPREDUCE-4400:
--

Luke - yes, please do.





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-19 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418539#comment-13418539
 ] 

Luke Lu commented on MAPREDUCE-4400:


Yes. The speedup is more pronounced with outofband heartbeat, which has a 
similar effect to MAPREDUCE-1906 (which is not in branch-1). MRv2 doesn't need 
this patch, as the issue was addressed there by MAPREDUCE-3809. Tom, can we 
file a separate jira to improve the change in trunk?

Shrinivas, you're encouraged to review and +1 on the patch :)





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-19 Thread Shrinivas Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418476#comment-13418476
 ] 

Shrinivas Joshi commented on MAPREDUCE-4400:


Can I request a code review and commit of this patch so that it gets integrated 
into the MRv1 branch in the meantime, while it is being ported to MRv2? Thanks.





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-19 Thread Shrinivas Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418473#comment-13418473
 ] 

Shrinivas Joshi commented on MAPREDUCE-4400:


@Luke: I have not tried it with the outofband heartbeat property. Do you expect 
it to show more perf gains along with your patch?





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-18 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417774#comment-13417774
 ] 

Luke Lu commented on MAPREDUCE-4400:


@Shrinivas: have you tried this with 
mapreduce.tasktracker.outofband.heartbeat=true? (It needs a cluster restart, 
of course.)
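
For reference, a minimal sketch of how this property might be set. It assumes 
the usual Hadoop 1.x layout in which TaskTracker settings live in 
conf/mapred-site.xml; the file location is an assumption, since the thread only 
gives the property name and value.

```xml
<!-- conf/mapred-site.xml (assumed location; only the property name and
     value come from the comment above) -->
<configuration>
  <property>
    <name>mapreduce.tasktracker.outofband.heartbeat</name>
    <value>true</value>
  </property>
</configuration>
```

The setting would only take effect after the restart mentioned above.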





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-16 Thread Shrinivas Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415838#comment-13415838
 ] 

Shrinivas Joshi commented on MAPREDUCE-4400:


Hi Luke - In our experiments, your patch achieved the same performance effect 
that MAPREDUCE-4381 was aiming for. We saw good performance gains on the 
Mahout KMeans clustering workload (~4%). It would be nice if we could get the 
branch-1 version of your change reviewed and checked in in the meantime. 
Thanks.





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-09 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409752#comment-13409752
 ] 

Tom White commented on MAPREDUCE-4400:
--

Maybe have a patch for trunk/branch-2 to bring the two into line then? I think 
it's good to minimize the number of differences where possible.





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-09 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409740#comment-13409740
 ] 

Luke Lu commented on MAPREDUCE-4400:


Thanks for the pointer to MAPREDUCE-3809, Tom. IMO, this patch is slightly 
better as it minimizes synchronization. 





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-09 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409627#comment-13409627
 ] 

Tom White commented on MAPREDUCE-4400:
--

The same issue was fixed in trunk and branch-2 in MAPREDUCE-3809 in much the 
same way. How about backporting that code to branch-1?





[jira] [Commented] (MAPREDUCE-4400) Fix performance regression for small jobs/workflows

2012-07-05 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406969#comment-13406969
 ] 

Luke Lu commented on MAPREDUCE-4400:


Thanks to John Poelman and Shreyas Subramanya of IBM BigInsights performance QA 
for noticing the issue and verifying my fix.
