[jira] [Created] (MAPREDUCE-6079) Renaming JobImpl#username to reporterUserName

2014-09-09 Thread Tsuyoshi OZAWA (JIRA)
Tsuyoshi OZAWA created MAPREDUCE-6079:
-

 Summary: Renaming JobImpl#username to reporterUserName
 Key: MAPREDUCE-6079
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6079
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Tsuyoshi OZAWA


In MAPREDUCE-6033, we found a bug caused by the confusing field names 
{{userName}} and {{username}}. We should rename them so they are easy to 
distinguish. 
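
To illustrate the kind of mistake the rename is meant to prevent, here is a minimal sketch. It is not the actual JobImpl source: the class names, the meaning assigned to each field, and the owner() method are all assumptions made for illustration.

```java
// Hypothetical sketch only; none of this is taken from JobImpl.
final class JobNamesSketch {

    // Before: two fields that differ only in the case of one letter.
    static final class Before {
        final String userName; // assumed meaning: user who submitted the job
        final String username; // assumed meaning: user the reporting process runs as

        Before(String submitter, String processUser) {
            this.userName = submitter;
            this.username = processUser;
        }

        // Easy-to-miss mistake: the wrong field is returned.
        String owner() {
            return username;
        }
    }

    // After: the second field carries its role in its name.
    static final class After {
        final String userName;
        final String reporterUserName;

        After(String submitter, String processUser) {
            this.userName = submitter;
            this.reporterUserName = processUser;
        }

        String owner() {
            return userName;
        }
    }
}
```

With the longer name, a mix-up like the one in Before.owner() would stand out in review.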



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6079) Renaming JobImpl#username to reporterUserName

2014-09-09 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-6079:
--
Attachment: MAPREDUCE-6079.1.patch

 Renaming JobImpl#username to reporterUserName
 -

 Key: MAPREDUCE-6079
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6079
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-6079.1.patch


 In MAPREDUCE-6033, we found a bug caused by the confusing field names 
 {{userName}} and {{username}}. We should rename them so they are easy to 
 distinguish. 





[jira] [Updated] (MAPREDUCE-6079) Renaming JobImpl#username to reporterUserName

2014-09-09 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-6079:
--
Assignee: Tsuyoshi OZAWA
  Status: Patch Available  (was: Open)

 Renaming JobImpl#username to reporterUserName
 -

 Key: MAPREDUCE-6079
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6079
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-6079.1.patch


 In MAPREDUCE-6033, we found a bug caused by the confusing field names 
 {{userName}} and {{username}}. We should rename them so they are easy to 
 distinguish. 





[jira] [Commented] (MAPREDUCE-6079) Renaming JobImpl#username to reporterUserName

2014-09-09 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126747#comment-14126747
 ] 

Akira AJISAKA commented on MAPREDUCE-6079:
--

Thanks for the report and the patch. +1 (non-binding) pending Jenkins.

 Renaming JobImpl#username to reporterUserName
 -

 Key: MAPREDUCE-6079
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6079
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-6079.1.patch


 In MAPREDUCE-6033, we found a bug caused by the confusing field names 
 {{userName}} and {{username}}. We should rename them so they are easy to 
 distinguish. 





[jira] [Commented] (MAPREDUCE-6079) Renaming JobImpl#username to reporterUserName

2014-09-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126751#comment-14126751
 ] 

Hadoop QA commented on MAPREDUCE-6079:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12667367/MAPREDUCE-6079.1.patch
  against trunk revision 90c8ece.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4864//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4864//console

This message is automatically generated.

 Renaming JobImpl#username to reporterUserName
 -

 Key: MAPREDUCE-6079
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6079
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-6079.1.patch


 In MAPREDUCE-6033, we found a bug caused by the confusing field names 
 {{userName}} and {{username}}. We should rename them so they are easy to 
 distinguish. 





[jira] [Commented] (MAPREDUCE-5972) Fix typo 'programatically' in job.xml (and a few other places)

2014-09-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126889#comment-14126889
 ] 

Hudson commented on MAPREDUCE-5972:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #675 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/675/])
MAPREDUCE-5972. Fix typo 'programatically' in job.xml (and a few other places) 
(Akira AJISAKA via aw) (aw: rev d989ac04449dc33da5e2c32a7f24d59cc92de536)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java
* hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/jquery.js
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfServlet.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/site/apt/HistoryServerRest.apt.vm
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredAppMasterRest.apt.vm


 Fix typo 'programatically' in job.xml (and a few other places)
 --

 Key: MAPREDUCE-5972
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5972
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5972.patch


 In job.xml, the source of a property that is set through the program is 
 misspelled as 'programatically', as shown below.
 {code}
 <property>
   <name>mapreduce.job.map.class</name>
   <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
   <source>programatically</source>
 </property>
 {code}
 It should be 'programmatically'.





[jira] [Commented] (MAPREDUCE-5972) Fix typo 'programatically' in job.xml (and a few other places)

2014-09-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126977#comment-14126977
 ] 

Hudson commented on MAPREDUCE-5972:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1891 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1891/])
MAPREDUCE-5972. Fix typo 'programatically' in job.xml (and a few other places) 
(Akira AJISAKA via aw) (aw: rev d989ac04449dc33da5e2c32a7f24d59cc92de536)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java
* hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/jquery.js
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfServlet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/site/apt/HistoryServerRest.apt.vm
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredAppMasterRest.apt.vm
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java


 Fix typo 'programatically' in job.xml (and a few other places)
 --

 Key: MAPREDUCE-5972
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5972
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5972.patch


 In job.xml, the source of a property that is set through the program is 
 misspelled as 'programatically', as shown below.
 {code}
 <property>
   <name>mapreduce.job.map.class</name>
   <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
   <source>programatically</source>
 </property>
 {code}
 It should be 'programmatically'.





[jira] [Commented] (MAPREDUCE-5972) Fix typo 'programatically' in job.xml (and a few other places)

2014-09-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127005#comment-14127005
 ] 

Hudson commented on MAPREDUCE-5972:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1866 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1866/])
MAPREDUCE-5972. Fix typo 'programatically' in job.xml (and a few other places) 
(Akira AJISAKA via aw) (aw: rev d989ac04449dc33da5e2c32a7f24d59cc92de536)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredAppMasterRest.apt.vm
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfServlet.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/site/apt/HistoryServerRest.apt.vm
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
* hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/jquery.js


 Fix typo 'programatically' in job.xml (and a few other places)
 --

 Key: MAPREDUCE-5972
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5972
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5972.patch


 In job.xml, the source of a property that is set through the program is 
 misspelled as 'programatically', as shown below.
 {code}
 <property>
   <name>mapreduce.job.map.class</name>
   <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
   <source>programatically</source>
 </property>
 {code}
 It should be 'programmatically'.





[jira] [Commented] (MAPREDUCE-5891) Improved shuffle error handling across NM restarts

2014-09-09 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127043#comment-14127043
 ] 

Jason Lowe commented on MAPREDUCE-5891:
---

Thanks for updating the patch, Junping, and sorry for the delay in re-review.   
 The fixes all look fine.

I agree with Ming that we should be consistent about the default state of this 
feature and NM restart, although I'm not a fan of adding a YARN API to query NM 
restart.  Task containers currently don't talk with the NM, and IMHO this is 
not a good enough reason to change that.  I'm OK with adding it to the shuffle 
protocol if we can do it in a backwards-compatible way, although I don't know 
offhand how that would be accomplished.  Another approach is to try to tie the 
two properties together and have the default value of 
mapreduce.reduce.shuffle.fetch.retry.enabled in mapred-default.xml be 
${yarn.nodemanager.recovery.enabled}, so they could still be set 
independently but by default the NM restart setting drives the fetch retry 
setting.
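
The chained default described above relies on ${...} variable expansion in property values, which Hadoop's Configuration class performs. The sketch below is a standalone imitation of that behaviour, not Hadoop code: the class DefaultChainSketch, its get() and defaults() methods, and the "false" default chosen for the NM setting are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for property lookup with ${var} expansion.
final class DefaultChainSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Returns the value for key, expanding ${other.key} references. */
    static String get(Map<String, String> props, String key) {
        String value = props.get(key);
        // Bound the number of substitutions so a cyclic reference cannot loop forever.
        for (int depth = 0; value != null && depth < 20; depth++) {
            Matcher m = VAR.matcher(value);
            if (!m.find()) {
                return value;
            }
            String replacement = props.get(m.group(1));
            if (replacement == null) {
                return value; // leave unresolved references untouched
            }
            value = value.substring(0, m.start()) + replacement + value.substring(m.end());
        }
        return value;
    }

    /** Assumed defaults: the fetch retry flag points at the NM recovery flag. */
    static Map<String, String> defaults() {
        Map<String, String> p = new HashMap<>();
        p.put("yarn.nodemanager.recovery.enabled", "false");
        p.put("mapreduce.reduce.shuffle.fetch.retry.enabled",
              "${yarn.nodemanager.recovery.enabled}");
        return p;
    }
}
```

In this sketch, turning on the NM recovery property turns on fetch retry as well, while an explicit value for mapreduce.reduce.shuffle.fetch.retry.enabled still takes precedence.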

 Improved shuffle error handling across NM restarts
 --

 Key: MAPREDUCE-5891
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5891
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Junping Du
 Attachments: MAPREDUCE-5891-demo.patch, MAPREDUCE-5891-v2.patch, 
 MAPREDUCE-5891-v3.patch, MAPREDUCE-5891-v4.patch, MAPREDUCE-5891.patch


 To minimize the number of map fetch failures reported by reducers across an 
 NM restart it would be nice if reducers only reported a fetch failure after 
 trying for a specified period of time to retrieve the data.





[jira] [Created] (MAPREDUCE-6080) JHS checks YARN application ACLs to determine user's access to aggregated logs

2014-09-09 Thread Zhijie Shen (JIRA)
Zhijie Shen created MAPREDUCE-6080:
--

 Summary: JHS checks YARN application ACLs to determine user's 
access to aggregated logs
 Key: MAPREDUCE-6080
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6080
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 2.5.0, 3.0.0
Reporter: Zhijie Shen


While JHS uses JobACLsManager to check a user's access to the job history 
information, it uses ApplicationACLsManager to decide whether the user has 
access to the aggregated logs, because it directly imports AggregatedLogsBlock 
into the log web page.

In most cases, the two managers enforce consistent access control. However, we 
observed a case where YARN ACLs are enabled while MR cluster ACLs are not. 
As a result, the user can view all the job information but cannot access the 
aggregated logs from JHS, which is confusing. 





[jira] [Commented] (MAPREDUCE-2841) Task level native optimization

2014-09-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127197#comment-14127197
 ] 

Todd Lipcon commented on MAPREDUCE-2841:


bq. -1 javac. The applied patch generated 1265 javac compiler warnings (more 
than the trunk's current 1264 warnings).

This is due to needing to import the deprecated UTF8 class to provide support 
for that type.

Aside from that, seems like Jenkins is happy with the patch. The merge vote is 
already started on mapreduce-dev and is set to close 9/12 EOD PST.

 Task level native optimization
 --

 Key: MAPREDUCE-2841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2841
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: task
 Environment: x86-64 Linux/Unix
Reporter: Binglin Chang
Assignee: Sean Zhong
 Attachments: DESIGN.html, MAPREDUCE-2841.v1.patch, 
 MAPREDUCE-2841.v2.patch, MR-2841benchmarks.pdf, dualpivot-0.patch, 
 dualpivotv20-0.patch, fb-shuffle.patch, 
 hadoop-3.0-mapreduce-2841-2014-7-17.patch, micro-benchmark.txt, 
 mr-2841-merge-2.txt, mr-2841-merge-3.patch, mr-2841-merge-4.patch, 
 mr-2841-merge.txt


 I'm recently working on native optimization for MapTask based on JNI. 
 The basic idea is that, add a NativeMapOutputCollector to handle k/v pairs 
 emitted by mapper, therefore sort, spill, IFile serialization can all be done 
 in native code, preliminary test(on Xeon E5410, jdk6u24) showed promising 
 results:
 1. Sort is about 3x-10x as fast as java(only binary string compare is 
 supported)
 2. IFile serialization speed is about 3x of java, about 500MB/s, if hardware 
 CRC32C is used, things can get much faster(1G/
 3. Merge code is not completed yet, so the test use enough io.sort.mb to 
 prevent mid-spill
 This leads to a total speed up of 2x~3x for the whole MapTask, if 
 IdentityMapper(mapper does nothing) is used
 There are limitations of course: currently only Text and BytesWritable are 
 supported, and I have not thought through many things yet, such as how to 
 support map-side combine. I had some discussion with somebody familiar with 
 Hive, and it seems that these limitations won't be much of a problem for Hive 
 to benefit from those optimizations, at least. Advice or discussion about 
 improving compatibility is most welcome :) 
 Currently NativeMapOutputCollector has a static method called canEnable(), 
 which checks if key/value type, comparator type, combiner are all compatible, 
 then MapTask can choose to enable NativeMapOutputCollector.
 This is only a preliminary test; more work needs to be done. I expect better 
 final results, and I believe similar optimizations can be applied to the 
 reduce task and shuffle too. 





[jira] [Commented] (MAPREDUCE-5891) Improved shuffle error handling across NM restarts

2014-09-09 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127212#comment-14127212
 ] 

Ming Ma commented on MAPREDUCE-5891:


The patch looks good. I like Jason's idea to have 
mapreduce.reduce.shuffle.fetch.retry.enabled use 
${yarn.nodemanager.recovery.enabled} as default value. As for the other 
approaches,

a) dynamic MR to YARN query, given NM recovery flag is a global cluster level 
setting ( although it is possible to config it on per NM basis ), can we derive 
the value of mapreduce.reduce.shuffle.fetch.retry.enabled at job submission 
time from some YARN API call to RM?

b) shuffle protocol change. It seems Fetcher and ShuffleHandler check http 
header via property key names. So if we add a new property to indicate if 
recovery is supported and continue to keep the same http version property, 
new version of fetcher might be able to work with old version of 
shufflehandler, and vice versa.

 Improved shuffle error handling across NM restarts
 --

 Key: MAPREDUCE-5891
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5891
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Junping Du
 Attachments: MAPREDUCE-5891-demo.patch, MAPREDUCE-5891-v2.patch, 
 MAPREDUCE-5891-v3.patch, MAPREDUCE-5891-v4.patch, MAPREDUCE-5891.patch


 To minimize the number of map fetch failures reported by reducers across an 
 NM restart it would be nice if reducers only reported a fetch failure after 
 trying for a specified period of time to retrieve the data.





[jira] [Commented] (MAPREDUCE-6078) native-task: fix gtest build on macosx

2014-09-09 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127269#comment-14127269
 ] 

Allen Wittenauer commented on MAPREDUCE-6078:
-

Um, should the conditional try to match on the same thing?

 native-task: fix gtest build on macosx
 --

 Key: MAPREDUCE-6078
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6078
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: MAPREDUCE-6078.v1.patch


 Compiling the HEAD code on Mac OS X failed. It looks like MAPREDUCE-5977 
 separated the gtest compile from nttest in order to suppress compile warnings, 
 but forgot that the additional compile flags added to nttest are also required 
 for the gtest build. This patch fixes this. 





[jira] [Commented] (MAPREDUCE-5891) Improved shuffle error handling across NM restarts

2014-09-09 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127309#comment-14127309
 ] 

Jason Lowe commented on MAPREDUCE-5891:
---

bq. a) dynamic MR to YARN query, given NM recovery flag is a global cluster 
level setting ( although it is possible to config it on per NM basis ), can we 
derive the value of mapreduce.reduce.shuffle.fetch.retry.enabled at job 
submission time from some YARN API call to RM?

The RM is unaware of whether the NM supports work-preserving restart, and I'd 
rather not add that coupling just for this.

bq. b) shuffle protocol change. It seems Fetcher and ShuffleHandler check http 
header via property key names. So if we add a new property to indicate if 
recovery is supported and continue to keep the same http version property, 
new version of fetcher might be able to work with old version of 
shufflehandler, and vice versa.

True, we could add a new HTTP header that new Fetchers could query.
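
As a rough sketch of how an optional header keeps old and new versions compatible: a new fetcher treats a missing header as "recovery not supported", and an old fetcher simply ignores a header it does not know. The header name and the class below are invented for illustration and are not part of the shuffle protocol.

```java
import java.util.Map;

// Hypothetical fetcher-side check; not actual Fetcher code.
final class ShuffleHeaderSketch {
    // Invented header name, used only in this sketch.
    static final String RECOVERY_HEADER = "X-Shuffle-Recovery-Supported";

    /**
     * Decides whether to retry fetches, based on the response headers.
     * An old ShuffleHandler never sends the header, so the lookup yields
     * null and the fetcher falls back to the old behaviour (no retry).
     * Simplification: real HTTP header names are case-insensitive, while
     * this map lookup is not.
     */
    static boolean shouldRetryFetch(Map<String, String> responseHeaders) {
        String value = responseHeaders.get(RECOVERY_HEADER);
        return value != null && Boolean.parseBoolean(value);
    }
}
```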

 Improved shuffle error handling across NM restarts
 --

 Key: MAPREDUCE-5891
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5891
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Junping Du
 Attachments: MAPREDUCE-5891-demo.patch, MAPREDUCE-5891-v2.patch, 
 MAPREDUCE-5891-v3.patch, MAPREDUCE-5891-v4.patch, MAPREDUCE-5891.patch


 To minimize the number of map fetch failures reported by reducers across an 
 NM restart it would be nice if reducers only reported a fetch failure after 
 trying for a specified period of time to retrieve the data.





[jira] [Commented] (MAPREDUCE-6075) HistoryServerFileSystemStateStore can create zero-length files

2014-09-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127432#comment-14127432
 ] 

Daryn Sharp commented on MAPREDUCE-6075:


I'm +1 on the change.  The close/null/cleanup is a rather common pattern in 
hadoop.  Using flush isn't a substitute for a close for all filesystems.  Close 
must always be allowed to throw an exception and only swallowed when another 
exception occurred.

In java, close() is supposed to be idempotent so double close is fine.  Double 
closing a fd is bad because the fd may have already been recycled by another 
thread.
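
For readers unfamiliar with the close/null/cleanup pattern mentioned above, here is a minimal standalone sketch. It uses a local cleanup() helper in place of Hadoop's IOUtils.cleanup(); the class and method names are assumptions, and this is not the code from the patch.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical illustration of the close/null/cleanup pattern.
final class CloseSketch {

    /** Closes quietly; meant only for the failure path. */
    static void cleanup(Closeable c) {
        if (c != null) {
            try {
                c.close();
            } catch (IOException ignored) {
                // Swallowed on purpose: another exception is already propagating.
            }
        }
    }

    static void writeAll(OutputStream out, byte[] data) throws IOException {
        OutputStream pending = out;
        try {
            pending.write(data);
            pending.close(); // a failure here propagates to the caller
            pending = null;  // success: nothing left for finally to clean up
        } finally {
            cleanup(pending); // non-null only if write() or close() threw
        }
    }
}
```

On the success path close() is called once and may throw. On the failure path the stream is closed again quietly, which relies on close() being idempotent as noted in the comment above.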

 HistoryServerFileSystemStateStore can create zero-length files
 --

 Key: MAPREDUCE-6075
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6075
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-6075.patch


 When the history server state store writes a token file it uses 
 IOUtils.cleanup() to close the file which will silently ignore errors.  This 
 can lead to empty token files in the state store.





[jira] [Updated] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5821:

Fix Version/s: (was: 2.5.0)
   (was: 3.0.0)

 IFile merge allocates new byte array for every value
 

 Key: MAPREDUCE-5821
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5821
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Affects Versions: 2.4.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 2.4.1

 Attachments: after-patch.png, before-patch.png, mapreduce-5821.txt, 
 mapreduce-5821.txt


 I wrote a standalone benchmark of the MapOutputBuffer and found that it did a 
 lot of allocations during the merge phase. After looking at an allocation 
 profile, I found that IFile.Reader.nextRawValue() would always allocate a new 
 byte array for every value, so the allocation rate goes way up during the 
 merge phase of the mapper. I imagine this also affects the reducer input, 
 though I didn't profile that.
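
 One common way to avoid a fresh allocation per value is to keep a single buffer and grow it only when a larger value arrives. The sketch below shows that idea in isolation. It is not IFile.Reader and not the attached patch; the class and method names are hypothetical.

```java
import java.io.DataInput;
import java.io.IOException;

// Hypothetical reusable value buffer; not Hadoop code.
final class ReusableValueBuffer {
    private byte[] buf = new byte[0];
    private int length;

    /** Reads valueLength bytes, reallocating only if the buffer is too small. */
    void read(DataInput in, int valueLength) throws IOException {
        if (buf.length < valueLength) {
            // Grow geometrically so repeated small increases stay cheap.
            buf = new byte[Math.max(valueLength, buf.length * 2)];
        }
        in.readFully(buf, 0, valueLength);
        length = valueLength;
    }

    /** Backing array; only the first length() bytes are valid. */
    byte[] data() {
        return buf;
    }

    int length() {
        return length;
    }
}
```

 The trade-off is that callers must copy the bytes if they need them after the next read, since the backing array is overwritten.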





[jira] [Updated] (MAPREDUCE-6063) In sortAndSpill of MapTask.java, size is calculated wrongly when bufend < bufstart.

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-6063:

Fix Version/s: (was: 3.0.0)

 In sortAndSpill of MapTask.java, size is calculated wrongly when bufend < 
 bufstart.
 ---

 Key: MAPREDUCE-6063
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6063
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.6.0

 Attachments: MAPREDUCE-6063.000.patch, MAPREDUCE-6063.branch-1.patch


 In sortAndSpill of MapTask.java, size is calculated wrongly when bufend < 
 bufstart.  We should change (bufvoid - bufend) + bufstart to (bufvoid - 
 bufstart) + bufend.
 Should change
 {code}
  long size = (bufend >= bufstart
   ? bufend - bufstart
   : (bufvoid - bufend) + bufstart) +
   partitions * APPROX_HEADER_LENGTH;
 {code}
 to:
 {code}
  long size = (bufend >= bufstart
   ? bufend - bufstart
   : (bufvoid - bufstart) + bufend) +
   partitions * APPROX_HEADER_LENGTH;
 {code}
 This is because when wraparound happens (bufend < bufstart), the size should 
 be bufvoid - bufstart (the bigger one) + bufend (the smaller one).
 You can find a similar implementation in MapTask.java:
 {code}
 mapOutputByteCounter.increment(valend >= keystart
 ? valend - keystart
 : (bufvoid - keystart) + valend);
 {code}
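
 A small worked example makes the difference concrete. The sketch below isolates the two formulas for the wraparound case (bufend < bufstart) in a hypothetical helper class; the names SpillSizeSketch, usedBytes and buggyUsedBytes are not from MapTask.java, and the header term is left out.

```java
// Hypothetical helper isolating the circular-buffer arithmetic.
final class SpillSizeSketch {

    /** Bytes occupied between bufstart and bufend in a buffer of bufvoid bytes. */
    static long usedBytes(int bufstart, int bufend, int bufvoid) {
        return bufend >= bufstart
            ? bufend - bufstart
            : (bufvoid - bufstart) + bufend; // tail of the buffer plus the wrapped head
    }

    /** The formula before the fix, kept for comparison. */
    static long buggyUsedBytes(int bufstart, int bufend, int bufvoid) {
        return bufend >= bufstart
            ? bufend - bufstart
            : (bufvoid - bufend) + bufstart;
    }
}
```

 With bufvoid = 100, bufstart = 90 and bufend = 10, the data spans bytes 90..99 and 0..9, so the corrected formula gives (100 - 90) + 10 = 20, while the old one gives (100 - 10) + 90 = 180, which is larger than the buffer itself.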





[jira] [Updated] (MAPREDUCE-5513) ConcurrentModificationException in JobControl

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5513:

Fix Version/s: (was: 3.0.0)

 ConcurrentModificationException in JobControl
 -

 Key: MAPREDUCE-5513
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5513
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Robert Parker
 Fix For: 0.23.10, 2.2.0

 Attachments: MAPREDUCE-5513-1.patch


 JobControl.toList is locking individual lists to iterate them, but those 
 lists can be modified elsewhere without holding the list lock.  The locking 
 approaches are mismatched, with toList holding the lock on the actual list 
 object while other methods hold the JobControl lock when modifying the lists.
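
 The mismatch can be avoided by having readers and writers synchronize on the same monitor. The sketch below shows one way to do that, copying the list under the owning object's lock. It is a simplified, hypothetical class and not the attached patch or the real JobControl.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for a JobControl-like holder of job lists.
final class JobListSketch {
    private final List<String> jobs = new LinkedList<>();

    /** Writers hold the lock on this object while modifying the list. */
    synchronized void add(String job) {
        jobs.add(job);
    }

    /**
     * Readers take the same lock and return a copy, so iteration over the
     * result can never race with a concurrent add().
     */
    synchronized List<String> snapshot() {
        return new ArrayList<>(jobs);
    }
}
```

 Locking the list object itself in the reader, while writers lock the enclosing object, gives no mutual exclusion at all, which is how the ConcurrentModificationException can arise.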





[jira] [Updated] (MAPREDUCE-4868) Allow multiple iteration for map

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-4868:

Fix Version/s: (was: 2.4.0)
   (was: 3.0.0)

 Allow multiple iteration for map
 

 Key: MAPREDUCE-4868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4868
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Jerry Chen
   Original Estimate: 168h
  Remaining Estimate: 168h

 Currently, the Mapper class allows advanced users to override the public void 
 run(Context context) method for more control over the execution of the 
 mapper, while the Context interface limits the operations over the data, which 
 is the foundation of that control.
 One use case: when I was considering a Hive optimization problem, I 
 wanted to make two passes over the input data instead of using another job or 
 task (which may slow down the whole process). Each pass does the same thing but 
 with different parameters.
 This is a new paradigm of MapReduce usage and can be achieved easily by 
 extending the Context interface a little to give more control over the data, 
 such as resetting the input.





[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-3191:

Fix Version/s: (was: 2.5.0)
   (was: 3.0.0)

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Chen He
Priority: Trivial
  Labels: documentation, noob
 Fix For: 0.23.11, 2.4.1

 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.





[jira] [Reopened] (MAPREDUCE-3168) [Gridmix] TestCompressionEmulationUtils fails after MR-3158

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer reopened MAPREDUCE-3168:
-

 [Gridmix] TestCompressionEmulationUtils fails after MR-3158
 ---

 Key: MAPREDUCE-3168
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3168
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix
Affects Versions: 0.24.0
Reporter: Amar Kamat
Assignee: Amar Kamat
  Labels: compression-emulation, gridmix, local-job-runner

 TestCompressionEmulationUtils fails after MAPREDUCE-3158 as it uses local 
 job-runner to run jobs.





[jira] [Resolved] (MAPREDUCE-3168) [Gridmix] TestCompressionEmulationUtils fails after MR-3158

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-3168.
-
   Resolution: Duplicate
Fix Version/s: (was: 3.0.0)

 [Gridmix] TestCompressionEmulationUtils fails after MR-3158
 ---

 Key: MAPREDUCE-3168
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3168
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix
Affects Versions: 0.24.0
Reporter: Amar Kamat
Assignee: Amar Kamat
  Labels: compression-emulation, gridmix, local-job-runner

 TestCompressionEmulationUtils fails after MAPREDUCE-3158 as it uses local 
 job-runner to run jobs.





[jira] [Updated] (MAPREDUCE-2806) [Gridmix] Load job fails with timeout errors when resource emulation is turned on

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-2806:

Fix Version/s: (was: 3.0.0)
   1.1.0

 [Gridmix] Load job fails with timeout errors when resource emulation is 
 turned on
 -

 Key: MAPREDUCE-2806
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2806
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Amar Kamat
  Labels: gridmix, loadjob, timeout
 Fix For: 1.1.0


 When the Load job's tasks are emulating CPU/memory usage, the task-tracker 
 kills the emulating task due to a lack of status updates. The Load job has its 
 own status reporter, which dies too soon.





[jira] [Updated] (MAPREDUCE-3024) Make all poms to have hadoop-project POM as common parent

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-3024:

Fix Version/s: (was: 3.0.0)

 Make all poms to have hadoop-project POM as common parent
 -

 Key: MAPREDUCE-3024
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3024
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.0-alpha


 In order to effectively use the Maven 'versions' plugin to update version 
 numbers, all POMs should have the hadoop-project POM as their common parent.





[jira] [Updated] (MAPREDUCE-3024) Make all poms to have hadoop-project POM as common parent

2014-09-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-3024:

Fix Version/s: 2.0.0-alpha

 Make all poms to have hadoop-project POM as common parent
 -

 Key: MAPREDUCE-3024
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3024
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.0-alpha


 In order to effectively use the Maven 'versions' plugin to update version 
 numbers, all POMs should have the hadoop-project POM as their common parent.





[jira] [Commented] (MAPREDUCE-6075) HistoryServerFileSystemStateStore can create zero-length files

2014-09-09 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127748#comment-14127748
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-6075:
---

[~daryn], thanks for pointing that out; you're right. +1 (non-binding) for 
Jason's change.

http://docs.oracle.com/javase/7/docs/api/java/io/Closeable.html

 HistoryServerFileSystemStateStore can create zero-length files
 --

 Key: MAPREDUCE-6075
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6075
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-6075.patch


 When the history server state store writes a token file, it uses 
 IOUtils.cleanup() to close the file, which silently ignores errors.  This 
 can lead to empty token files in the state store.
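
 The failure mode described above can be illustrated with a minimal, 
 self-contained sketch. It does not use Hadoop: closeQuietly() is only a 
 stand-in for a cleanup helper that swallows errors, and FailingOnCloseStream 
 is a hypothetical stream whose close() always fails. The names are 
 assumptions made for illustration and are not taken from the patch.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;

class CloseErrorDemo {
    // Hypothetical stream: data is only "buffered" on write, and close()
    // fails, as it might if the underlying file system became unavailable.
    static class FailingOnCloseStream extends OutputStream {
        @Override
        public void write(int b) {
            // Pretend the byte is buffered; nothing reaches storage yet.
        }

        @Override
        public void close() throws IOException {
            throw new IOException("simulated failure while flushing on close");
        }
    }

    // Stand-in for a quiet cleanup helper: close and swallow any error.
    static void closeQuietly(Closeable c) {
        try {
            if (c != null) {
                c.close();
            }
        } catch (IOException ignored) {
            // The error is dropped here, so the caller never sees it.
        }
    }

    // Closes only through the quiet helper; returns true if the caller
    // is told that the write succeeded.
    static boolean writeWithQuietClose(byte[] token) {
        OutputStream out = new FailingOnCloseStream();
        try {
            out.write(token);
        } catch (IOException e) {
            return false;
        } finally {
            closeQuietly(out);
        }
        return true; // reported as success although close() failed
    }

    // Closes explicitly on the success path so that close() errors surface;
    // the quiet helper remains only as a safety net.
    static boolean writeWithCheckedClose(byte[] token) {
        OutputStream out = new FailingOnCloseStream();
        try {
            out.write(token);
            out.close();
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            closeQuietly(out);
        }
    }

    public static void main(String[] args) {
        byte[] token = {1, 2, 3};
        System.out.println("quiet close reported success: " + writeWithQuietClose(token));
        System.out.println("checked close reported success: " + writeWithCheckedClose(token));
    }
}
```

 With the quiet close the caller is told the write succeeded even though the 
 data may never have been flushed. With the explicit close the failure is 
 visible and can be handled.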





[jira] [Commented] (MAPREDUCE-6048) TestJavaSerialization fails in trunk build

2014-09-09 Thread Bruno P. Kinoshita (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127853#comment-14127853
 ] 

Bruno P. Kinoshita commented on MAPREDUCE-6048:
---

Hi, I think the builds were removed from Jenkins, but I could **not** reproduce 
the failure with the following settings:

Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 
2014-02-14T15:37:52-03:00)
Maven home: 
/home/kinow/java/tupilabs/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/EMBEDDED
Java version: 1.7.0_65, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-7-openjdk-amd64/jre
Default locale: en_US, platform encoding: UTF-8
OS name: linux, version: 3.13.0-35-generic, arch: amd64, family: unix

 TestJavaSerialization fails in trunk build
 --

 Key: MAPREDUCE-6048
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6048
 Project: Hadoop Map/Reduce
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor

 This happened in builds #1871 and #1872
 {code}
 testMapReduceJob(org.apache.hadoop.mapred.TestJavaSerialization)  Time elapsed: 2.784 sec   FAILURE!
 junit.framework.ComparisonFailure: expected:[a   ]1 but was:[0 1]1
   at junit.framework.Assert.assertEquals(Assert.java:100)
   at junit.framework.Assert.assertEquals(Assert.java:107)
   at junit.framework.TestCase.assertEquals(TestCase.java:269)
   at org.apache.hadoop.mapred.TestJavaSerialization.testMapReduceJob(TestJavaSerialization.java:127)
 {code}





[jira] [Commented] (MAPREDUCE-6078) native-task: fix gtest build on macosx

2014-09-09 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127974#comment-14127974
 ] 

Binglin Chang commented on MAPREDUCE-6078:
--

What do you mean? I guess that's the weird cmake syntax.

 native-task: fix gtest build on macosx
 --

 Key: MAPREDUCE-6078
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6078
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: MAPREDUCE-6078.v1.patch


 Compiling the HEAD code on Mac OS X fails. It looks like MAPREDUCE-5977 
 separated the gtest compile from nttest in order to suppress compile warnings, 
 but forgot that the additional compile flags added to nttest are also required 
 for the gtest build. This patch fixes that.





[jira] [Updated] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-09-09 Thread Bruno P. Kinoshita (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno P. Kinoshita updated MAPREDUCE-5911:
--
Attachment: HADOOP-5911.patch

Hi, this is my first time writing a patch for Hadoop. It is based on the 
description provided by Ivan. I couldn't find any tests referencing this class, 
but no tests failed in Maven.

HTH, Bruno

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Ivan Mitic
Priority: Minor
 Attachments: HADOOP-5911.patch


 The enforcement that the directory must not yet exist is implemented in 
 {{FileOutputFormat#checkOutputSpecs}} by throwing 
 {{FileAlreadyExistsException}}.  However, terasort uses a specialized output 
 format, {{TeraOutputFormat}}, which is a subclass of {{FileOutputFormat}}.  
 The subclass overrides {{checkOutputSpecs}}, but does not re-implement the 
 existence check and throw {{FileAlreadyExistsException}}.
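
 The override problem described above can be sketched without Hadoop. In the 
 sketch below, BaseOutputFormat, LenientSortOutputFormat and 
 StrictSortOutputFormat are hypothetical stand-ins for {{FileOutputFormat}} 
 and {{TeraOutputFormat}}, and java.nio.file.FileAlreadyExistsException stands 
 in for Hadoop's exception of the same name. It illustrates the pattern only 
 and is not the attached patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class OutputSpecDemo {
    // Stand-in for the base output format: rejects an existing output directory.
    static class BaseOutputFormat {
        void checkOutputSpecs(Path outputDir) throws IOException {
            if (outputDir == null) {
                throw new IOException("Output directory not set");
            }
            if (Files.exists(outputDir)) {
                throw new FileAlreadyExistsException(outputDir.toString());
            }
        }
    }

    // Override that replaces the base method but omits the existence check.
    static class LenientSortOutputFormat extends BaseOutputFormat {
        @Override
        void checkOutputSpecs(Path outputDir) throws IOException {
            if (outputDir == null) {
                throw new IOException("Output directory not set");
            }
            // Missing: nothing here rejects an output directory that exists.
        }
    }

    // Override that re-implements the existence check alongside its own logic.
    static class StrictSortOutputFormat extends BaseOutputFormat {
        @Override
        void checkOutputSpecs(Path outputDir) throws IOException {
            if (outputDir == null) {
                throw new IOException("Output directory not set");
            }
            if (Files.exists(outputDir)) {
                throw new FileAlreadyExistsException(outputDir.toString());
            }
        }
    }

    // Returns true if the format refuses to write into the given directory
    // because it already exists.
    static boolean rejectsExisting(BaseOutputFormat format, Path dir) {
        try {
            format.checkOutputSpecs(dir);
            return false;
        } catch (FileAlreadyExistsException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The JVM temp directory is used as an example of an existing directory.
        Path existing = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println("lenient rejects existing dir: "
                + rejectsExisting(new LenientSortOutputFormat(), existing));
        System.out.println("strict rejects existing dir: "
                + rejectsExisting(new StrictSortOutputFormat(), existing));
    }
}
```

 The lenient override accepts a directory that already exists, while the 
 strict one rejects it in the same way as the base class.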


