date:20120809


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431607#comment-13431607
 ] 

Ilya Katsov commented on MAPREDUCE-4474:


This patch is for branch 0.23. It cannot be applied to trunk.

 TestDistributedShell.testDSShell fails on CentOS 6 because of high virtual 
 memory usage
 ---

 Key: MAPREDUCE-4474
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4474
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.3
 Environment: CentOS 6
Reporter: Ilya Katsov
  Labels: test
 Attachments: MAPREDUCE-4474-branch-0.23.patch


 TestDistributedShell.testDSShell fails on CentOS 6 because of high virtual 
 memory usage: 
 {code}
 2012-07-24 04:50:46,563 INFO  [AsyncDispatcher event handler] rmapp.RMAppImpl 
 (RMAppImpl.java:transition(559)) - Application application_1343091034814_0001 
 failed 1 times due to AM Container for appattempt_1343091034814_0001_01 
 exited with  exitCode: 143 due to: Container 
 [pid=6146,containerID=container_1343091034814_0001_01_01] is running 
 beyond virtual memory limits. Current usage: 82.4mb of 512.0mb physical 
 memory used; 1.1gb of 1.0gb virtual memory used. Killing container.
 Dump of the process-tree for container_1343091034814_0001_01_01 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 6146 5773 6146 6146 (bash) 2 0 108613632 340 /bin/bash -c 
 /usr/java/jdk1.6.0_33/jre/bin/java -Xmx512m 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
 --container_memory 128 --
 {code}
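The limits in the message above are consistent with the NodeManager deriving the virtual memory limit from the physical limit by a fixed ratio. The sketch below assumes a ratio of 2.1 (the usual default of yarn.nodemanager.vmem-pmem-ratio; the log does not state the configured value) and assumes "mb" means 2^20 bytes. The class and method names are illustrative only and are not Hadoop APIs.

```java
// Sketch: relate the physical and virtual memory limits shown in the log.
// Assumption: the virtual limit is physicalLimit * ratio, with ratio = 2.1.
class VmemLimitSketch {
    static final double ASSUMED_VMEM_PMEM_RATIO = 2.1;

    // Virtual memory limit in bytes for a given physical limit.
    static long vmemLimitBytes(long pmemLimitBytes, double ratio) {
        return (long) (pmemLimitBytes * ratio);
    }

    // True when the measured virtual memory exceeds the derived limit.
    static boolean overLimit(long vmemUsedBytes, long pmemLimitBytes, double ratio) {
        return vmemUsedBytes > vmemLimitBytes(pmemLimitBytes, ratio);
    }

    public static void main(String[] args) {
        long pmemLimit = 512L * 1024 * 1024;   // "512.0mb physical memory"
        long limit = vmemLimitBytes(pmemLimit, ASSUMED_VMEM_PMEM_RATIO);
        // About 1.05 * 2^30 bytes, which the log rounds to "1.0gb".
        System.out.println(limit + " bytes = " + limit / (double) (1L << 30) + " gb");
    }
}
```

Under these assumptions a container with a 512 MB physical limit is killed once its process tree maps more than roughly 1.05 GB of virtual address space, even though only 82.4 MB is resident.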

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4535) Test failures with Container .. is running beyond virtual memory limits


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated MAPREDUCE-4535:
---

Attachment: MAPREDUCE-4535-branch-0.23.patch

 Test failures with Container .. is running beyond virtual memory limits
 -

 Key: MAPREDUCE-4535
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4535
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.3
Reporter: Ilya Katsov
 Attachments: MAPREDUCE-4535-branch-0.23.patch


 Tests 
 org.apache.hadoop.tools.TestHadoopArchives.{testRelativePath,testPathWithSpaces}
  fail with the following message:
 {code}
 Container [pid=7785,containerID=container_1342495768864_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 143.6mb of 1.5gb 
 physical memory used; 3.4gb of 3.1gb virtual memory used. Killing container.
 Dump of the process-tree for container_1342495768864_0001_01_01 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 7797 7785 7785 7785 (java) 573 38 3517018112 36421 
 /usr/java/jdk1.6.0_33/jre/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01
  -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
 {code}
 The failure does not reproduce reliably, but setting MALLOC_ARENA_MAX 
 resolves it.
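MALLOC_ARENA_MAX caps the number of malloc arenas glibc creates; newer glibc versions reserve a large block of virtual address space per arena, which inflates virtual memory without increasing resident memory. The description does not say where the variable is set or to which value, so the following is only a sketch of one way to pass it to a child process. The value 4 and the class and method names are assumptions for illustration.

```java
import java.util.Map;

// Sketch: launch a child process with MALLOC_ARENA_MAX set in its environment.
// Assumption: a cap of 4 arenas; the patch may use a different value or mechanism.
class MallocArenaSketch {
    static final String ASSUMED_ARENA_MAX = "4";

    // Builds a ProcessBuilder whose environment caps the glibc arena count,
    // unless the caller's environment already defines the variable.
    static ProcessBuilder withArenaCap(String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        Map<String, String> env = pb.environment();
        if (!env.containsKey("MALLOC_ARENA_MAX")) {
            env.put("MALLOC_ARENA_MAX", ASSUMED_ARENA_MAX);
        }
        return pb;
    }

    public static void main(String[] args) {
        ProcessBuilder pb = withArenaCap("java", "-version");
        System.out.println("MALLOC_ARENA_MAX=" + pb.environment().get("MALLOC_ARENA_MAX"));
    }
}
```

Leaving an existing value untouched lets an operator override the cap from the outer environment.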


[jira] [Commented] (MAPREDUCE-4534) Test failures with Container .. is running beyond virtual memory limits


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431615#comment-13431615
 ] 

Ilya Katsov commented on MAPREDUCE-4534:


Accidentally created a duplicate of MAPREDUCE-4533. It should be closed.

 Test failures with Container .. is running beyond virtual memory limits
 -

 Key: MAPREDUCE-4534
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4534
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.3
Reporter: Ilya Katsov

 Tests 
 org.apache.hadoop.tools.TestHadoopArchives.{testRelativePath,testPathWithSpaces}
  fail with the following message:
 {code}
 Container [pid=7785,containerID=container_1342495768864_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 143.6mb of 1.5gb 
 physical memory used; 3.4gb of 3.1gb virtual memory used. Killing container.
 Dump of the process-tree for container_1342495768864_0001_01_01 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 7797 7785 7785 7785 (java) 573 38 3517018112 36421 
 /usr/java/jdk1.6.0_33/jre/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01
  -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
   |- 7785 7101 7785 7785 (bash) 1 1 108605440 332 /bin/bash -c 
 /usr/java/jdk1.6.0_33/jre/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01
  -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
 1>/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01/stdout
 2>/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01/stderr
 {code}
 The failure does not reproduce reliably, but setting MALLOC_ARENA_MAX 
 resolves it.


[jira] [Commented] (MAPREDUCE-4533) Test failures with Container .. is running beyond virtual memory limits


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431614#comment-13431614
 ] 

Ilya Katsov commented on MAPREDUCE-4533:


Accidentally created a duplicate of MAPREDUCE-4533. It should be closed.

 Test failures with Container .. is running beyond virtual memory limits
 -

 Key: MAPREDUCE-4533
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4533
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.3
Reporter: Ilya Katsov

 Tests 
 org.apache.hadoop.tools.TestHadoopArchives.{testRelativePath,testPathWithSpaces}
  fail with the following message:
 {code}
 Container [pid=7785,containerID=container_1342495768864_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 143.6mb of 1.5gb 
 physical memory used; 3.4gb of 3.1gb virtual memory used. Killing container.
 Dump of the process-tree for container_1342495768864_0001_01_01 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 7797 7785 7785 7785 (java) 573 38 3517018112 36421 
 /usr/java/jdk1.6.0_33/jre/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01
  -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
   |- 7785 7101 7785 7785 (bash) 1 1 108605440 332 /bin/bash -c 
 /usr/java/jdk1.6.0_33/jre/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01
  -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
 1>/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01/stdout
 2>/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01/stderr
 {code}
 The failure does not reproduce reliably, but setting MALLOC_ARENA_MAX 
 resolves it.


[jira] [Commented] (MAPREDUCE-4469) Resource calculation in child tasks is CPU-heavy

2012-08-09 Thread Ahmed Radwan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431639#comment-13431639
 ] 

Ahmed Radwan commented on MAPREDUCE-4469:
-

Here is a draft patch implementing what I described in my previous comment.

 Resource calculation in child tasks is CPU-heavy
 

 Key: MAPREDUCE-4469
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Affects Versions: 1.0.3
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4469.patch


 In doing some benchmarking on a hadoop-1 derived codebase, I noticed that 
 each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed 
 that it's spending a lot of time looping through all the files in /proc to 
 calculate resource usage.
 As a test, I added a flag to disable use of the ResourceCalculatorPlugin 
 within the tasks. On a CPU-bound 500G-sort workload, this improved total job 
 runtime by about 10% (map slot-seconds by 14%, reduce slot-seconds by 8%).


[jira] [Updated] (MAPREDUCE-4469) Resource calculation in child tasks is CPU-heavy

2012-08-09 Thread Ahmed Radwan (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4469:


Attachment: MAPREDUCE-4469_rev2.patch

 Resource calculation in child tasks is CPU-heavy
 

 Key: MAPREDUCE-4469
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Affects Versions: 1.0.3
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4469.patch, MAPREDUCE-4469_rev2.patch


 In doing some benchmarking on a hadoop-1 derived codebase, I noticed that 
 each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed 
 that it's spending a lot of time looping through all the files in /proc to 
 calculate resource usage.
 As a test, I added a flag to disable use of the ResourceCalculatorPlugin 
 within the tasks. On a CPU-bound 500G-sort workload, this improved total job 
 runtime by about 10% (map slot-seconds by 14%, reduce slot-seconds by 8%).


[jira] [Updated] (MAPREDUCE-4470) Fix TestCombineFileInputFormat.testForEmptyFile


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated MAPREDUCE-4470:
---

Status: Patch Available  (was: Open)

 Fix TestCombineFileInputFormat.testForEmptyFile
 ---

 Key: MAPREDUCE-4470
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4470
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Kihwal Lee
 Fix For: 2.1.0-alpha, 3.0.0

 Attachments: MAPREDUCE-4470.patch


 TestCombineFileInputFormat.testForEmptyFile started failing after 
 HADOOP-8599. 
 It expects one split on an empty input file, but with HADOOP-8599 it gets 
 zero. The new behavior seems correct, but is it breaking anything else?


[jira] [Updated] (MAPREDUCE-4470) Fix TestCombineFileInputFormat.testForEmptyFile


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated MAPREDUCE-4470:
---

Attachment: MAPREDUCE-4470.patch

 Fix TestCombineFileInputFormat.testForEmptyFile
 ---

 Key: MAPREDUCE-4470
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4470
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Kihwal Lee
 Fix For: 2.1.0-alpha, 3.0.0

 Attachments: MAPREDUCE-4470.patch


 TestCombineFileInputFormat.testForEmptyFile started failing after 
 HADOOP-8599. 
 It expects one split on an empty input file, but with HADOOP-8599 it gets 
 zero. The new behavior seems correct, but is it breaking anything else?


[jira] [Commented] (MAPREDUCE-4470) Fix TestCombineFileInputFormat.testForEmptyFile

[
https://issues.apache.org/jira/browse/MAPREDUCE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431744#comment-13431744
]

Hadoop QA commented on MAPREDUCE-4470:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540006/MAPREDUCE-4470.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2719//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2719//console

This message is automatically generated.

Fix TestCombineFileInputFormat.testForEmptyFile
---

Key: MAPREDUCE-4470
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4470
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Affects Versions: 2.0.0-alpha
Reporter: Kihwal Lee
Fix For: 2.1.0-alpha, 3.0.0

Attachments: MAPREDUCE-4470.patch

TestCombineFileInputFormat.testForEmptyFile started failing after
HADOOP-8599.
It expects one split on an empty input file, but with HADOOP-8599 it gets
zero. The new behavior seems correct, but is it breaking anything else?


[jira] [Updated] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation

2012-08-09 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4518:


Attachment: trunk-MR-4518.patch

Uploading patch for trunk.

I couldn't think of a way to test the patch. Can someone suggest a way to test 
this? 


 FairScheduler: PoolSchedulable#updateDemand() - potential redundant 
 aggregation
 ---

 Key: MAPREDUCE-4518
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch


 In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only 
 after iterating through all the pools and computing the final demand. 
 By checking if the demand has reached maxTasks in every iteration, we can 
 avoid redundant work, at the expense of one condition check every iteration.
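The idea in the description can be sketched as follows. This is not the FairScheduler code: the list of integers is a hypothetical stand-in for the per-child demand values, the class and method names are invented for illustration, and demands are assumed to be non-negative.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: capping an aggregated demand at maxTasks, with and without early exit.
class DemandSketch {
    // Current shape as described: sum everything, then cap once at the end.
    static int aggregateThenCap(List<Integer> demands, int maxTasks) {
        int demand = 0;
        for (int d : demands) {
            demand += d;
        }
        return Math.min(demand, maxTasks);
    }

    // Proposed shape: stop as soon as the running total reaches the cap.
    static int aggregateWithEarlyExit(List<Integer> demands, int maxTasks) {
        int demand = 0;
        for (int d : demands) {
            demand += d;
            if (demand >= maxTasks) {
                return maxTasks;   // further children cannot change the result
            }
        }
        return demand;
    }

    public static void main(String[] args) {
        List<Integer> demands = Arrays.asList(3, 4, 5, 6);
        System.out.println(aggregateThenCap(demands, 10));
        System.out.println(aggregateWithEarlyExit(demands, 10));
    }
}
```

Both methods return the same value for non-negative demands. One point to check before adopting the early exit: if computing each child's demand also refreshes state that other code reads later, skipping the remaining children would leave that state stale.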


[jira] [Updated] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-08-09 Thread Mariappan Asokan (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-2454:


Status: Open  (was: Patch Available)

 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0, 2.2.0-alpha
Reporter: Mariappan Asokan
Assignee: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, 
 MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, 
 MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, 
 mapreduce-2454.patch, mr-2454-on-mr-279-build82.patch.gz


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.


[jira] [Commented] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation

[
https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431891#comment-13431891
]

Hadoop QA commented on MAPREDUCE-4518:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540039/trunk-MR-4518.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2720//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2720//console

This message is automatically generated.

FairScheduler: PoolSchedulable#updateDemand() - potential redundant
aggregation
---

Key: MAPREDUCE-4518
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch

In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only
after iterating through all the pools and computing the final demand.
By checking if the demand has reached maxTasks in every iteration, we can
avoid redundant work, at the expense of one condition check every iteration.

[jira] [Updated] (MAPREDUCE-4538) add Legacy Counter support to getGroupNames


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4538:
---

Attachment: MR-4538.txt

 add Legacy Counter support to getGroupNames
 ---

 Key: MAPREDUCE-4538
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4538
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4538.txt


 Oozie loops through counters using getGroupNames(). The returned names do not 
 include legacy counter names, so those counters are missed, which can cause a 
 backwards-compatibility issue in the Oozie counter API.
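One way to picture the requested behavior is a group-name listing that also reports the legacy alias of every group that is present. The sketch below is an illustration under assumptions, not the attached patch: the class, the method, and the alias table are hypothetical, and the two alias entries are given only as examples of old-style and new-style group names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Sketch: include legacy aliases when listing counter group names.
class LegacyGroupNamesSketch {
    // Assumed alias table (legacy name -> current name), for illustration only.
    static final Map<String, String> ASSUMED_LEGACY_TO_NEW = new LinkedHashMap<String, String>();
    static {
        ASSUMED_LEGACY_TO_NEW.put("org.apache.hadoop.mapred.Task$Counter",
                                  "org.apache.hadoop.mapreduce.TaskCounter");
        ASSUMED_LEGACY_TO_NEW.put("org.apache.hadoop.mapred.JobInProgress$Counter",
                                  "org.apache.hadoop.mapreduce.JobCounter");
    }

    // Returns the current group names plus the legacy alias of each group that exists.
    static Set<String> groupNamesWithLegacy(Set<String> currentGroups,
                                            Map<String, String> legacyToNew) {
        Set<String> names = new LinkedHashSet<String>(currentGroups);
        for (Map.Entry<String, String> e : legacyToNew.entrySet()) {
            if (currentGroups.contains(e.getValue())) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Set<String> current = new LinkedHashSet<String>(
            Arrays.asList("org.apache.hadoop.mapreduce.TaskCounter"));
        System.out.println(groupNamesWithLegacy(current, ASSUMED_LEGACY_TO_NEW));
    }
}
```

A caller that iterates over the returned names, as Oozie is described as doing, would then see both the current and the legacy name for each group that exists.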


[jira] [Commented] (MAPREDUCE-4538) add Legacy Counter support to getGroupNames


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431912#comment-13431912
 ] 

Robert Joseph Evans commented on MAPREDUCE-4538:


Something seems to be wrong with this JIRA.  I cannot mark it as Patch 
Available. Hopefully JIRA fixes itself soon, or I will refile this.

 add Legacy Counter support to getGroupNames
 ---

 Key: MAPREDUCE-4538
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4538
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4538.txt


 Oozie loops through counters using getGroupNames(). The returned names do not 
 include legacy counter names, so those counters are missed, which can cause a 
 backwards-compatibility issue in the Oozie counter API.


[jira] [Updated] (MAPREDUCE-4538) add Legacy Counter support to getGroupNames


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4538:
---

Issue Type: Improvement  (was: Bug)

 add Legacy Counter support to getGroupNames
 ---

 Key: MAPREDUCE-4538
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4538
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4538.txt


 Oozie loops through counters using getGroupNames(). The returned names do not 
 include legacy counter names, so those counters are missed, which can cause a 
 backwards-compatibility issue in the Oozie counter API.


[jira] [Updated] (MAPREDUCE-4538) add Legacy Counter support to getGroupNames


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4538:
---

Issue Type: Bug  (was: Improvement)

 add Legacy Counter support to getGroupNames
 ---

 Key: MAPREDUCE-4538
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4538
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4538.txt


 Oozie loops through counters using getGroupNames(). The returned names do not 
 include legacy counter names, so those counters are missed, which can cause a 
 backwards-compatibility issue in the Oozie counter API.


[jira] [Updated] (MAPREDUCE-4470) Fix TestCombineFileInputFormat.testForEmptyFile

2012-08-09 Thread Mariappan Asokan (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-4470:


Attachment: TestFileInputFormat.java

I think a proper fix should address all InputFormat implementations.  Tests for 
empty input should be added for all input formats.  For example, I added a test 
in TestFileInputFormat.java to test for empty input.  It is also failing.


 Fix TestCombineFileInputFormat.testForEmptyFile
 ---

 Key: MAPREDUCE-4470
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4470
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Kihwal Lee
 Fix For: 2.1.0-alpha, 3.0.0

 Attachments: MAPREDUCE-4470.patch, TestFileInputFormat.java


 TestCombineFileInputFormat.testForEmptyFile started failing after 
 HADOOP-8599. 
 It expects one split on an empty input file, but with HADOOP-8599 it gets 
 zero. The new behavior seems correct, but is it breaking anything else?


[jira] [Commented] (MAPREDUCE-4470) Fix TestCombineFileInputFormat.testForEmptyFile


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431996#comment-13431996
 ] 

Hadoop QA commented on MAPREDUCE-4470:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12540068/TestFileInputFormat.java
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2721//console

This message is automatically generated.

 Fix TestCombineFileInputFormat.testForEmptyFile
 ---

 Key: MAPREDUCE-4470
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4470
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Kihwal Lee
 Fix For: 2.1.0-alpha, 3.0.0

 Attachments: MAPREDUCE-4470.patch, TestFileInputFormat.java


 TestCombineFileInputFormat.testForEmptyFile started failing after 
 HADOOP-8599. 
 It expects one split on an empty input file, but with HADOOP-8599 it gets 
 zero. The new behavior seems correct, but is it breaking anything else?


[jira] [Commented] (MAPREDUCE-3782) teragen terasort jobs fail when using webhdfs://


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432007#comment-13432007
 ] 

Robert Joseph Evans commented on MAPREDUCE-3782:


I am +1 too, I'll check this in.

 teragen terasort jobs fail when using webhdfs:// 
 -

 Key: MAPREDUCE-3782
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3782
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Arpit Gupta
Assignee: Jason Lowe
Priority: Critical
 Attachments: MAPREDUCE-3782.patch


 When running a teragen job with a webhdfs:// URL, the delegation token that is 
 retrieved is an hdfs delegation token. 
 The subsequent terasort job on the output then fails with a Java I/O exception.


[jira] [Commented] (MAPREDUCE-3782) teragen terasort jobs fail when using webhdfs://

2012-08-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432012#comment-13432012
 ] 

Hudson commented on MAPREDUCE-3782:
---

Integrated in Hadoop-Common-trunk-Commit #2567 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2567/])
MAPREDUCE-3782. teragen terasort jobs fail when using webhdfs:// (Jason 
Lowe via bobby) (Revision 1371325)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1371325
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/terasort/TeraOutputFormat.java


 teragen terasort jobs fail when using webhdfs:// 
 -

 Key: MAPREDUCE-3782
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3782
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Arpit Gupta
Assignee: Jason Lowe
Priority: Critical
 Attachments: MAPREDUCE-3782.patch


 When running a teragen job with a webhdfs:// url the delegation token that is 
 retrieved is an hdfs delegation token. 
 And the subsequent terasort job on the output fails with java io exception


[jira] [Commented] (MAPREDUCE-3782) teragen terasort jobs fail when using webhdfs://

2012-08-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432013#comment-13432013
 ] 

Hudson commented on MAPREDUCE-3782:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2632 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2632/])
MAPREDUCE-3782. teragen terasort jobs fail when using webhdfs:// (Jason 
Lowe via bobby) (Revision 1371325)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1371325
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/terasort/TeraOutputFormat.java


 teragen terasort jobs fail when using webhdfs:// 
 -

 Key: MAPREDUCE-3782
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3782
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Arpit Gupta
Assignee: Jason Lowe
Priority: Critical
 Attachments: MAPREDUCE-3782.patch


 When running a teragen job with a webhdfs:// url the delegation token that is 
 retrieved is an hdfs delegation token. 
 And the subsequent terasort job on the output fails with java io exception


[jira] [Updated] (MAPREDUCE-3782) teragen terasort jobs fail when using webhdfs://


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3782:
---

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
   3.0.0
   2.1.0-alpha
   0.23.3
   Status: Resolved  (was: Patch Available)

Thanks Jason, I put this into trunk, branch-2, branch-2.1.0-alpha and 
branch-0.23

 teragen terasort jobs fail when using webhdfs:// 
 -

 Key: MAPREDUCE-3782
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3782
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Arpit Gupta
Assignee: Jason Lowe
Priority: Critical
 Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha

 Attachments: MAPREDUCE-3782.patch


 When running a teragen job with a webhdfs:// url the delegation token that is 
 retrieved is an hdfs delegation token. 
 And the subsequent terasort job on the output fails with java io exception


[jira] [Updated] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation

2012-08-09 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4518:


Status: In Progress  (was: Patch Available)

 FairScheduler: PoolSchedulable#updateDemand() - potential redundant 
 aggregation
 ---

 Key: MAPREDUCE-4518
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch


 In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only 
 after iterating through all the pools and computing the final demand. 
 By checking if the demand has reached maxTasks in every iteration, we can 
 avoid redundant work, at the expense of one condition check every iteration.
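
 The early exit described above can be sketched roughly as follows. This is 
 an illustration only, not the FairScheduler code: the class name, the method 
 name, and the list of per-schedulable demands are hypothetical, and demands 
 are assumed to be non-negative.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the aggregation done in updateDemand();
// none of these names come from the FairScheduler source.
class DemandSketch {
    // Sums the individual demands, but stops as soon as the cap is reached,
    // because later entries can no longer change the capped result.
    static int cappedDemand(List<Integer> demands, int maxTasks) {
        int demand = 0;
        for (int d : demands) {
            demand += d;
            if (demand >= maxTasks) {
                return maxTasks; // one extra comparison per iteration
            }
        }
        return demand;
    }

    public static void main(String[] args) {
        // The third entry is never visited because the cap is hit first.
        System.out.println(cappedDemand(Arrays.asList(3, 4, 5), 6));
        // The cap is never reached, so the plain sum is returned.
        System.out.println(cappedDemand(Arrays.asList(1, 2), 6));
    }
}
```

 The trade-off is the one named in the description: a comparison on every 
 iteration in exchange for skipping the remaining entries once the cap is hit.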


[jira] [Commented] (MAPREDUCE-3782) teragen terasort jobs fail when using webhdfs://

2012-08-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432034#comment-13432034
 ] 

Hudson commented on MAPREDUCE-3782:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2587 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2587/])
MAPREDUCE-3782. teragen terasort jobs fail when using webhdfs:// (Jason 
Lowe via bobby) (Revision 1371325)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1371325
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/terasort/TeraOutputFormat.java


 teragen terasort jobs fail when using webhdfs:// 
 -

 Key: MAPREDUCE-3782
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3782
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Arpit Gupta
Assignee: Jason Lowe
Priority: Critical
 Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha

 Attachments: MAPREDUCE-3782.patch


 When running a teragen job with a webhdfs:// url the delegation token that is 
 retrieved is an hdfs delegation token. 
 And the subsequent terasort job on the output fails with java io exception


[jira] [Commented] (MAPREDUCE-4470) Fix TestCombineFileInputFormat.testForEmptyFile

2012-08-09 Thread Mariappan Asokan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432088#comment-13432088
 ] 

Mariappan Asokan commented on MAPREDUCE-4470:
-

Sorry about the file upload.  I did not mean it to be a patch:(


 Fix TestCombineFileInputFormat.testForEmptyFile
 ---

 Key: MAPREDUCE-4470
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4470
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Kihwal Lee
 Fix For: 2.1.0-alpha, 3.0.0

 Attachments: MAPREDUCE-4470.patch, TestFileInputFormat.java


 TestCombineFileInputFormat.testForEmptyFile started failing after 
 HADOOP-8599. 
 It expects one split on an empty input file, but with HADOOP-8599 it gets 
 zero. The new behavior seems correct, but is it breaking anything else?


[jira] [Commented] (MAPREDUCE-4490) JVM reuse is incompatible with LinuxTaskController (and therefore incompatible with Security)

2012-08-09 Thread Evert Lammerts (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432093#comment-13432093
 ] 

Evert Lammerts commented on MAPREDUCE-4490:
---

We ran into this same issue on 0.20.205 - I'll add it as an affected version.

 JVM reuse is incompatible with LinuxTaskController (and therefore 
 incompatible with Security)
 -

 Key: MAPREDUCE-4490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4490
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task-controller, tasktracker
Affects Versions: 1.0.3
Reporter: George Datskos

 When using LinuxTaskController, JVM reuse (mapred.job.reuse.jvm.num.tasks > 
 1) with more map tasks in a job than there are map slots in the cluster will 
 result in immediate task failures for the second task in each JVM (and then 
 the JVM exits). We have investigated this bug and the root cause is as 
 follows. When using LinuxTaskController, the userlog directory for a task 
 attempt (../userlogs/job/task-attempt) is created only on the first 
 invocation (when the JVM is launched) because userlogs directories are 
 created by the task-controller binary which only runs *once* per JVM. 
 Therefore, attempting to create log.index is guaranteed to fail with ENOENT 
 leading to immediate task failure and child JVM exit.
 {quote}
 2012-07-24 14:29:11,914 INFO org.apache.hadoop.mapred.TaskLog: Starting 
 logging for a new task attempt_201207241401_0013_m_27_0 in the same JVM 
 as that of the first task 
 /var/log/hadoop/mapred/userlogs/job_201207241401_0013/attempt_201207241401_0013_m_06_0
 2012-07-24 14:29:11,915 WARN org.apache.hadoop.mapred.Child: Error running 
 child
 ENOENT: No such file or directory
 at org.apache.hadoop.io.nativeio.NativeIO.open(Native Method)
 at 
 org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:161)
 at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:296)
 at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:369)
 at org.apache.hadoop.mapred.Child.main(Child.java:229)
 {quote}
 The above error occurs in a JVM which runs tasks 6 and 27.  Task6 goes 
 smoothly. Then Task27 starts. The directory 
 /var/log/hadoop/mapred/userlogs/job_201207241401_0013/attempt_201207241401_0013_m_027_0
  is never created so when mapred.Child tries to write the log.index file for 
 Task27, it fails with ENOENT because the 
 attempt_201207241401_0013_m_027_0 directory does not exist. Therefore, 
 the second task in each JVM is guaranteed to fail (and then the JVM exits) 
 every time when using LinuxTaskController. Note that this problem does not 
 occur when using the DefaultTaskController because the userlogs directories 
 are created for each task (not just for each JVM as with LinuxTaskController).
 For each task, the TaskRunner calls the TaskController's createLogDir method 
 before attempting to write out an index file.
 * DefaultTaskController#createLogDir: creates log directory for each task
 * LinuxTaskController#createLogDir: does nothing
 ** task-controller binary creates log directory [create_attempt_directories] 
 (but only for the first task)
 Possible Solution: add a new command to task-controller *initialize task* to 
 create attempt directories.  Call that command, with ShellCommandExecutor, in 
 the LinuxTaskController#createLogDir method
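
 The failure mode above can be reproduced outside Hadoop with plain java.nio. 
 The sketch below is only an illustration under assumptions: the class, the 
 method names, and the directory names are hypothetical, and it uses 
 java.nio.file rather than Hadoop's NativeIO or the task-controller binary. 
 It shows that writing an index file into an attempt directory that was never 
 created fails with a "no such file" error, and succeeds once the directory is 
 created for that attempt.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical illustration; not Hadoop's TaskLog or TaskController code.
class LogDirSketch {
    // Tries to write log.index inside attemptDir.
    // Returns false when the directory does not exist (the ENOENT case).
    static boolean writeIndex(Path attemptDir) {
        try {
            Files.write(attemptDir.resolve("log.index"),
                        "placeholder".getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (NoSuchFileException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Models one reused JVM: the first attempt's directory exists,
    // the second attempt's directory is created only on demand.
    static boolean[] run() {
        try {
            Path userlogs = Files.createTempDirectory("userlogs");
            Path first = Files.createDirectories(userlogs.resolve("attempt_first"));
            Path second = userlogs.resolve("attempt_second"); // never created
            boolean firstOk = writeIndex(first);
            boolean secondBefore = writeIndex(second);
            Files.createDirectories(second); // the per-task step that is missing
            boolean secondAfter = writeIndex(second);
            return new boolean[] {firstOk, secondBefore, secondAfter};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```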


[jira] [Updated] (MAPREDUCE-4490) JVM reuse is incompatible with LinuxTaskController (and therefore incompatible with Security)

2012-08-09 Thread Evert Lammerts (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evert Lammerts updated MAPREDUCE-4490:
--

Affects Version/s: 0.20.205.0

 JVM reuse is incompatible with LinuxTaskController (and therefore 
 incompatible with Security)
 -

 Key: MAPREDUCE-4490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4490
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task-controller, tasktracker
Affects Versions: 0.20.205.0, 1.0.3
Reporter: George Datskos

 When using LinuxTaskController, JVM reuse (mapred.job.reuse.jvm.num.tasks > 
 1) with more map tasks in a job than there are map slots in the cluster will 
 result in immediate task failures for the second task in each JVM (and then 
 the JVM exits). We have investigated this bug and the root cause is as 
 follows. When using LinuxTaskController, the userlog directory for a task 
 attempt (../userlogs/job/task-attempt) is created only on the first 
 invocation (when the JVM is launched) because userlogs directories are 
 created by the task-controller binary which only runs *once* per JVM. 
 Therefore, attempting to create log.index is guaranteed to fail with ENOENT 
 leading to immediate task failure and child JVM exit.
 {quote}
 2012-07-24 14:29:11,914 INFO org.apache.hadoop.mapred.TaskLog: Starting 
 logging for a new task attempt_201207241401_0013_m_27_0 in the same JVM 
 as that of the first task 
 /var/log/hadoop/mapred/userlogs/job_201207241401_0013/attempt_201207241401_0013_m_06_0
 2012-07-24 14:29:11,915 WARN org.apache.hadoop.mapred.Child: Error running 
 child
 ENOENT: No such file or directory
 at org.apache.hadoop.io.nativeio.NativeIO.open(Native Method)
 at 
 org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:161)
 at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:296)
 at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:369)
 at org.apache.hadoop.mapred.Child.main(Child.java:229)
 {quote}
 The above error occurs in a JVM which runs tasks 6 and 27.  Task6 goes 
 smoothly. Then Task27 starts. The directory 
 /var/log/hadoop/mapred/userlogs/job_201207241401_0013/attempt_201207241401_0013_m_027_0
  is never created so when mapred.Child tries to write the log.index file for 
 Task27, it fails with ENOENT because the 
 attempt_201207241401_0013_m_027_0 directory does not exist. Therefore, 
 the second task in each JVM is guaranteed to fail (and then the JVM exits) 
 every time when using LinuxTaskController. Note that this problem does not 
 occur when using the DefaultTaskController because the userlogs directories 
 are created for each task (not just for each JVM as with LinuxTaskController).
 For each task, the TaskRunner calls the TaskController's createLogDir method 
 before attempting to write out an index file.
 * DefaultTaskController#createLogDir: creates log directory for each task
 * LinuxTaskController#createLogDir: does nothing
 ** task-controller binary creates log directory [create_attempt_directories] 
 (but only for the first task)
 Possible Solution: add a new command to task-controller *initialize task* to 
 create attempt directories.  Call that command, with ShellCommandExecutor, in 
 the LinuxTaskController#createLogDir method


[jira] [Commented] (MAPREDUCE-4044) YarnClientProtocolProvider does not honor mapred.job.tracker property

2012-08-09 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432104#comment-13432104
 ] 

Jason Lowe commented on MAPREDUCE-4044:
---

I think this isn't as straightforward as adding a deprecation, primarily 
because this is a deprecation across configuration files.  yarn-default.xml and 
yarn-site.xml load before mapred-default.xml and mapred-site.xml.  If 
yarn-site.xml sets yarn.resourcemanager.address to an appropriate value, it 
will be later smashed to local by mapred-default.xml if we tie 
yarn.resourcemanager.address to mapreduce.jobtracker.address.

In addition I think we'd need to update Configuration deprecation support to 
handle multiple deprecated values mapped to the same new key (i.e.: 
mapred.job.tracker *and* mapreduce.jobtracker.address would both need to map to 
yarn.resourcemanager.address).  I don't think the deprecation code currently 
handles a many-to-one mapping, although oddly it appears to support one-to-many.

Bottom line is that this change smells pretty risky, certainly not as easy as a 
one-line Configuration.addDeprecation() call.  Would it make more sense from a 
risk-mitigation standpoint to have Oozie set both mapred.job.tracker and 
yarn.resourcemanager.address so it can work with both 1.x and 2.x?
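
The load-order problem can be illustrated with a toy model. The sketch below 
is not Hadoop's Configuration class; it only assumes that resources are 
applied in load order and that a deprecated key is stored under its 
replacement. The class and method names and the rm.example.com:8032 value are 
made up for illustration; the property names and the "local" value are the 
ones mentioned above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is not Hadoop's Configuration class.
class DeprecationOrderSketch {
    // Applies resources in load order. A key listed in deprecations is
    // stored under its replacement key, so a later resource can overwrite
    // a value that an earlier resource set under the new name.
    static Map<String, String> load(Map<String, String> deprecations,
                                    List<Map<String, String>> resources) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> resource : resources) {
            for (Map.Entry<String, String> e : resource.entrySet()) {
                String key = deprecations.getOrDefault(e.getKey(), e.getKey());
                effective.put(key, e.getValue());
            }
        }
        return effective;
    }

    // Runs the scenario from the comment and returns the resulting address.
    static String resourceManagerAddressAfterLoad() {
        Map<String, String> deprecations = Collections.singletonMap(
            "mapreduce.jobtracker.address", "yarn.resourcemanager.address");
        // rm.example.com:8032 is a made-up value for illustration.
        Map<String, String> yarnSite = Collections.singletonMap(
            "yarn.resourcemanager.address", "rm.example.com:8032");
        Map<String, String> mapredDefault = Collections.singletonMap(
            "mapreduce.jobtracker.address", "local");
        // yarn-site.xml is applied first, mapred-default.xml afterwards.
        return load(deprecations, Arrays.asList(yarnSite, mapredDefault))
            .get("yarn.resourcemanager.address");
    }

    public static void main(String[] args) {
        System.out.println(resourceManagerAddressAfterLoad());
    }
}
```

In this model the value from yarn-site.xml is lost, which is the overwrite 
described above.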

 YarnClientProtocolProvider does not honor mapred.job.tracker property
 -

 Key: MAPREDUCE-4044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4044
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur

 The YarnClientProtocolProvider/YARNRunner/ResourceMgrDelegate bootstrap only 
 looks for 'yarn.resourcemanager.address', they ignore 'mapred.job.tracker'
 This breaks backward compatibility and creates issues in Oozie.


[jira] [Commented] (MAPREDUCE-4044) YarnClientProtocolProvider does not honor mapred.job.tracker property

2012-08-09 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432110#comment-13432110
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4044:
---

@jason, what you suggest is exactly what Oozie is currently doing. Agree the 
deprecation thingy is not that simple. Still the problem impacts apps in 
general outside of Oozie that set 'mapred.*' values.


 YarnClientProtocolProvider does not honor mapred.job.tracker property
 -

 Key: MAPREDUCE-4044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4044
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur

 The YarnClientProtocolProvider/YARNRunner/ResourceMgrDelegate bootstrap only 
 looks for 'yarn.resourcemanager.address', they ignore 'mapred.job.tracker'
 This breaks backward compatibility and creates issues in Oozie.


[jira] [Assigned] (MAPREDUCE-4367) mapred job -kill tries to connect to history server


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal reassigned MAPREDUCE-4367:


Assignee: Mayank Bansal

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Mayank Bansal
Priority: Minor

 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.


[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432139#comment-13432139
 ] 

Mayank Bansal commented on MAPREDUCE-4367:
--

As reported, if a history server is configured but not running, the user cannot 
kill the job.
The history server plays no part in killing a job, so my patch short-circuits 
the history server for the kill case.
I am also adding a test case that covers this scenario with the history server 
both up and down.
Thanks,
Mayank

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Mayank Bansal
Priority: Minor
 Attachments: MAPREDUCE-4367-trunk-v1.patch


 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.


[jira] [Updated] (MAPREDUCE-4367) mapred job -kill tries to connect to history server


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4367:
-

Attachment: MAPREDUCE-4367-trunk-v1.patch

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Mayank Bansal
Priority: Minor
 Attachments: MAPREDUCE-4367-trunk-v1.patch


 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.


[jira] [Commented] (MAPREDUCE-4538) add Legacy Counter support to getGroupNames

2012-08-09 Thread Virag Kothari (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432183#comment-13432183
 ] 

Virag Kothari commented on MAPREDUCE-4538:
--

Bobby, MAPREDUCE-4053 will also be fixed by this patch, correct?

 add Legacy Counter support to getGroupNames
 ---

 Key: MAPREDUCE-4538
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4538
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4538.txt


 Oozie loops through counters using getGroupNames().  This does not include 
 legacy counter names, so they get missed, which can result in a 
 backwards-compatibility issue in the Oozie counter API.


[jira] [Updated] (MAPREDUCE-4367) mapred job -kill tries to connect to history server


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4367:
-

Fix Version/s: trunk
   Status: Patch Available  (was: Open)

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Mayank Bansal
Priority: Minor
 Fix For: trunk

 Attachments: MAPREDUCE-4367-trunk-v1.patch


 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.


[jira] [Commented] (MAPREDUCE-4044) YarnClientProtocolProvider does not honor mapred.job.tracker property


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432203#comment-13432203
 ] 

Arun C Murthy commented on MAPREDUCE-4044:
--

I'm confused. If someone set mapreduce.framework.name to yarn, why should we 
support mapred.job.tracker? 

 YarnClientProtocolProvider does not honor mapred.job.tracker property
 -

 Key: MAPREDUCE-4044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4044
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur

 The YarnClientProtocolProvider/YARNRunner/ResourceMgrDelegate bootstrap only 
 looks for 'yarn.resourcemanager.address', they ignore 'mapred.job.tracker'
 This breaks backward compatibility and creates issues in Oozie.


[jira] [Assigned] (MAPREDUCE-4535) Test failures with Container .. is running beyond virtual memory limits


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reassigned MAPREDUCE-4535:


Assignee: Ilya Katsov

 Test failures with Container .. is running beyond virtual memory limits
 -

 Key: MAPREDUCE-4535
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4535
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.3
Reporter: Ilya Katsov
Assignee: Ilya Katsov
 Attachments: MAPREDUCE-4535-branch-0.23.patch


 Tests 
 org.apache.hadoop.tools.TestHadoopArchives.{testRelativePath,testPathWithSpaces}
  fail with the following message:
 {code}
 Container [pid=7785,containerID=container_1342495768864_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 143.6mb of 1.5gb 
 physical memory used; 3.4gb of 3.1gb virtual memory used. Killing container.
 Dump of the process-tree for container_1342495768864_0001_01_01 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 7797 7785 7785 7785 (java) 573 38 3517018112 36421 
 /usr/java/jdk1.6.0_33/jre/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01
  -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
 {code}
 The problem is not reliably reproducible, but setting MALLOC_ARENA_MAX 
 resolves it.
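
 The limit check behind the message above can be worked through with the 
 numbers from the log. The sketch below is an illustration under assumptions, 
 not the NodeManager code: the class and method names are hypothetical, the 
 1.5gb physical limit is taken to be exactly 1536 MiB, and the 
 virtual-to-physical ratio of 2.1 is inferred from the reported limits 
 (1.5gb physical, 3.1gb virtual) rather than read from any configuration.

```java
// Hypothetical illustration of a virtual-memory limit check;
// not the NodeManager's container monitor.
class VmemLimitSketch {
    // Assumption: ratio inferred from the log (1.5gb * 2.1 is about 3.1gb).
    static final double ASSUMED_VMEM_PMEM_RATIO = 2.1;

    // Virtual memory limit derived from the physical memory limit.
    static long vmemLimitBytes(long pmemLimitBytes, double ratio) {
        return (long) (pmemLimitBytes * ratio);
    }

    // True when the process tree uses more virtual memory than allowed.
    static boolean overLimit(long vmemUsedBytes, long pmemLimitBytes, double ratio) {
        return vmemUsedBytes > vmemLimitBytes(pmemLimitBytes, ratio);
    }

    public static void main(String[] args) {
        long pmemLimit = 1536L * 1024 * 1024; // assumed value of "1.5gb"
        long javaVmem = 3517018112L;          // VMEM_USAGE(BYTES) of pid 7797 in the dump
        System.out.println(vmemLimitBytes(pmemLimit, ASSUMED_VMEM_PMEM_RATIO));
        System.out.println(overLimit(javaVmem, pmemLimit, ASSUMED_VMEM_PMEM_RATIO));
    }
}
```

 Under these assumptions the java process alone exceeds the virtual limit 
 while using only a small part of its physical allowance, which matches the 
 pattern in the log.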


[jira] [Updated] (MAPREDUCE-4474) TestDistributedShell.testDSShell fails on CentOS 6 because of high virtual memory usage


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4474:
-

Assignee: Ilya Katsov
  Status: Open  (was: Patch Available)

 TestDistributedShell.testDSShell fails on CentOS 6 because of high virtual 
 memory usage
 ---

 Key: MAPREDUCE-4474
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4474
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.3
 Environment: CentOS 6
Reporter: Ilya Katsov
Assignee: Ilya Katsov
  Labels: test
 Attachments: MAPREDUCE-4474-branch-0.23.patch


 TestDistributedShell.testDSShell fails on CentOS 6 because of high virtual 
 memory usage: 
 {code}
 2012-07-24 04:50:46,563 INFO  [AsyncDispatcher event handler] rmapp.RMAppImpl 
 (RMAppImpl.java:transition(559)) - Application application_1343091034814_0001 
 failed 1 times due to AM Container for appattempt_1343091034814_0001_01 
 exited with  exitCode: 143 due to: Container 
 [pid=6146,containerID=container_1343091034814_0001_01_01] is running 
 beyond virtual memory limits. Current usage: 82.4mb of 512.0mb physical 
 memory used; 1.1gb of 1.0gb virtual memory used. Killing container.
 Dump of the process-tree for container_1343091034814_0001_01_01 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 6146 5773 6146 6146 (bash) 2 0 108613632 340 /bin/bash -c 
 /usr/java/jdk1.6.0_33/jre/bin/java -Xmx512m 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
 --container_memory 128 --
 {code}


[jira] [Commented] (MAPREDUCE-4474) TestDistributedShell.testDSShell fails on CentOS 6 because of high virtual memory usage


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432216#comment-13432216
 ] 

Arun C Murthy commented on MAPREDUCE-4474:
--

Ilya, can you please rebase your patch after YARN-1? Thanks!


[jira] [Updated] (MAPREDUCE-4466) Using URI for yarn.nodemanager log dirs fails

2012-08-09 Thread Siddharth Seth (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Siddharth Seth updated MAPREDUCE-4466:
--

Fix Version/s: (was: trunk)
Status: Open (was: Patch Available)

Thanks for the updated patch Mayank. Mostly looks good.
The findbugs warning can be avoided by catching individual exceptions instead
of a generic catchAll.
The unit test has some issues. It refers to absolute paths (file:///target/) -
which will break on most systems. Also TestNMWebServers isn't the best place to
test this. A simple verification of getContainerLogDirs on a path with and
without file:// should be sufficient.
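
To make the two review points concrete, here is a minimal sketch of handling a configured directory with and without the file:// scheme while catching only the specific exception involved. It is not the code from the patch: the class and the method toLocalPath are hypothetical, and it uses only java.net.URI rather than any Hadoop API.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; not part of the NodeManager code or the attached patches.
class LogDirPaths {

  /**
   * Returns the local filesystem path for a configured directory, accepting
   * either a plain path or a file: URI.
   */
  static String toLocalPath(String configured) {
    try {
      URI uri = new URI(configured);
      if ("file".equals(uri.getScheme())) {
        return uri.getPath();  // drops the scheme and the empty authority
      }
    } catch (URISyntaxException e) {
      // Catch the specific exception rather than a generic catch-all:
      // the value is not a valid URI, so fall through and treat it as a path.
    }
    return configured;
  }
}
```

A unit test along these lines only needs to check that "file:///home/eli/hadoop/dirs" and "/home/eli/hadoop/dirs" resolve to the same path, without touching absolute locations on the build machine.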

Unsetting the Fix Version - that needs to be set only after the change is
committed.

Using URI for yarn.nodemanager log dirs fails
-

Key: MAPREDUCE-4466
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4466
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Eli Collins
Assignee: Mayank Bansal
Priority: Minor
Attachments: MAPREDUCE-4466-trunk-v1.patch,
MAPREDUCE-4466-trunk-v2.patch, MAPREDUCE-4466-trunk-v3.patch

If I use URIs (eg file:///home/eli/hadoop/dirs) for yarn.nodemanager.log-dirs
or yarn.nodemanager.remote-app-log-dir the container log servlet fails with
an NPE (works if I remove the file scheme). Using a URI for
yarn.nodemanager.local-dirs works.

[jira] [Commented] (MAPREDUCE-3881) building fail under Windows

2012-08-09 Thread Trevor Robinson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13432274#comment-13432274
 ] 

Trevor Robinson commented on MAPREDUCE-3881:


This patch fixes the issue for me. Note that it uses tab characters on the 
newly added line though.

 building fail under Windows
 ---

 Key: MAPREDUCE-3881
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3881
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
 Environment: D:\os\hadoopcommon>mvn --version
 Apache Maven 3.0.4 (r1232337; 2012-01-17 16:44:56+0800)
 Maven home: C:\portable\maven\bin\..
 Java version: 1.7.0_02, vendor: Oracle Corporation
 Java home: C:\Program Files (x86)\Java\jdk1.7.0_02\jre
 Default locale: zh_CN, platform encoding: GBK
 OS name: windows 7, version: 6.1, arch: x86, family: windows
Reporter: Changming Sun
Priority: Minor
 Attachments: pom.xml.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-common\pom.xml is not 
 portable:
   <execution>
     <id>generate-version</id>
     <phase>generate-sources</phase>
     <configuration>
       <executable>scripts/saveVersion.sh</executable>
       <arguments>
         <argument>${project.version}</argument>
         <argument>${project.build.directory}</argument>
       </arguments>
     </configuration>
     <goals>
       <goal>exec</goal>
     </goals>
   </execution>
 When I built it under Windows, I got the following error:
 [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec 
 (generate-version) on project hadoop-yarn-common: Command execution failed. 
 Cannot run program scripts\saveVersion.sh (in directory 
 D:\os\hadoopcommon\hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-common): 
 CreateProcess error=2, ? -> [Help 1]
 We should modify it like this (copied from 
 hadoop-common-project\hadoop-common\pom.xml):
   <configuration>
     <target>
       <mkdir dir="${project.build.directory}/generated-sources/java"/>
       <exec executable="sh">
         <arg line="${basedir}/dev-support/saveVersion.sh ${project.version} ${project.build.directory}/generated-sources/java"/>
       </exec>
     </target>
   </configuration>
   </execution>


[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-09 Thread Benoy Antony (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4491:


Attachment: Hadoop_Encryption.pdf
MR_4491_1.1.patch
MR_4491_trunk.patch

Attaching the initial patches for trunk and branch-1.1. Please review and let 
me know the comments. 

Did minor updates in the design document.

One of the test cases in the patch depends on a test class which will be part 
of another jira (yet to be filed due to the ASF Jira problem)

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf, 
 MR_4491_1.1.patch, MR_4491_trunk.patch


 When dealing with sensitive data, the data must be kept encrypted wherever 
 it is stored. A common use case is to pull encrypted data out of a 
 datasource and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 
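
As a rough illustration of the "read keys from keystores" step described above, the sketch below stores a secret key in a Java KeyStore and reads it back using only standard JDK classes. It is not taken from the attached patches or the design document: the class name, method names, the alias, and the choice of the JCEKS store type and AES are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.KeyStore;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical example class; not part of the MAPREDUCE-4491 patches.
class KeyStoreRoundTrip {

  /** Generates a fresh 128-bit AES key to have something to store. */
  static SecretKey newAesKey() throws Exception {
    KeyGenerator gen = KeyGenerator.getInstance("AES");
    gen.init(128);
    return gen.generateKey();
  }

  /** Writes one secret key under the given alias and returns the serialized keystore. */
  static byte[] writeStore(String alias, SecretKey key, char[] password) throws Exception {
    KeyStore ks = KeyStore.getInstance("JCEKS");  // store type that can hold secret keys
    ks.load(null, password);                      // null stream initializes an empty store
    ks.setEntry(alias, new KeyStore.SecretKeyEntry(key),
        new KeyStore.PasswordProtection(password));
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    ks.store(out, password);
    return out.toByteArray();
  }

  /** Loads the serialized keystore and returns the key stored under the alias. */
  static SecretKey readKey(byte[] storeBytes, String alias, char[] password) throws Exception {
    KeyStore ks = KeyStore.getInstance("JCEKS");
    ks.load(new ByteArrayInputStream(storeBytes), password);
    return (SecretKey) ks.getKey(alias, password);
  }
}
```

In a real deployment the keystore would live in a file or external service rather than a byte array, and how the key then travels from JobClient to Tasks is the part the design document covers; this sketch does not attempt it.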


[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server