[jira] [Created] (MAPREDUCE-4149) Rumen fails to parse certain counter strings

2012-04-13 Thread Ravi Gummadi (Created) (JIRA)
Rumen fails to parse certain counter strings


 Key: MAPREDUCE-4149
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi


If a counter name contains { or }, Rumen is not able to parse it and throws 
ParseException.
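
One common way to make delimiter characters safe inside a counter name is to backslash-escape them when writing and unescape them when parsing. The sketch below illustrates that idea only; the class name CounterNameEscaper and the set of special characters are assumptions for illustration and are not taken from the attached patch or from Rumen's code.

```java
// Hypothetical helper, not part of Hadoop or Rumen: backslash-escapes
// characters that are assumed to act as delimiters in a counter string.
final class CounterNameEscaper {
    // Assumed delimiter set; the real format may use a different one.
    private static final String SPECIAL = "{}()[]\\";

    // Prefix every special character with a backslash.
    static String escape(String name) {
        StringBuilder sb = new StringBuilder(name.length() + 8);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Reverse of escape(): a backslash makes the next character literal.
    static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder(escaped.length());
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < escaped.length()) {
                c = escaped.charAt(++i);
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With this convention a name such as my{counter} round-trips through escape() and unescape() unchanged, so a parser that honors the escapes no longer mistakes the braces for delimiters.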

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-4149) Rumen fails to parse certain counter strings

2012-04-13 Thread Ravi Gummadi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-4149:


Attachment: 4149.patch

Attaching patch for trunk with the fix.

 Rumen fails to parse certain counter strings
 

 Key: MAPREDUCE-4149
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Attachments: 4149.patch


 If a counter name contains { or }, Rumen is not able to parse it and 
 throws ParseException.





[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs

2012-04-13 Thread Bhallamudi Venkata Siva Kamesh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102:
--

Attachment: MAPREDUCE-4102.patch

For this, I *think* we should display the counters of all successfully completed 
map/reduce tasks as the overall job counters for killed/failed jobs. Also, when a 
job has no successfully completed map/reduce tasks, it displays
{noformat}Sorry it looks like job_clusterid_jobNum has no 
counters.{noformat}

Attaching the patch for the same. Please review.
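
The aggregation proposed above can be sketched as follows, assuming each task's counters are available as a simple name-to-value map and that tasks without counters show up as null. The class CounterTotals and its use of plain maps are illustrative assumptions; this is not the code in MAPREDUCE-4102.patch and it does not use Hadoop's Counters classes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: sums per-task counters into job-level totals while
// skipping tasks that have no counters, instead of failing on null.
final class CounterTotals {
    static Map<String, Long> sum(List<Map<String, Long>> perTaskCounters) {
        Map<String, Long> total = new HashMap<>();
        if (perTaskCounters == null) {
            return total; // no tasks at all: empty totals, no exception
        }
        for (Map<String, Long> task : perTaskCounters) {
            if (task == null) {
                continue; // e.g. a task that never completed successfully
            }
            for (Map.Entry<String, Long> e : task.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }
}
```

An empty result then corresponds to the "has no counters" message, rather than to a NullPointerException while rendering the page.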

 job counters not available in Jobhistory webui for killed jobs
 --

 Key: MAPREDUCE-4102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: webapps
Affects Versions: 0.23.2
Reporter: Thomas Graves
 Attachments: MAPREDUCE-4102.patch


 Run a simple wordcount or sleep, and kill the job before it finishes.  Go to 
 the job history web ui and click the Counters link for that job. It 
 displays 500 error.
 The job history log has:
 Caused by: com.google.inject.ProvisionException: Guice provision errors:
 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
 handling URI: /jobhistory/jobcounters/job_1333482028750_0001
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 ...
 ..
 ...
 1) Error injecting constructor, java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56)
   while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock
 ...
 ..
 ...
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 There are task counters available if you drill down into successful tasks 
 though.





[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs

2012-04-13 Thread Bhallamudi Venkata Siva Kamesh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102:
--

Affects Version/s: 2.0.0
   Status: Patch Available  (was: Open)

 job counters not available in Jobhistory webui for killed jobs
 --

 Key: MAPREDUCE-4102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: webapps
Affects Versions: 0.23.2, 2.0.0
Reporter: Thomas Graves
 Attachments: MAPREDUCE-4102.patch


 Run a simple wordcount or sleep, and kill the job before it finishes.  Go to 
 the job history web ui and click the Counters link for that job. It 
 displays 500 error.
 The job history log has:
 Caused by: com.google.inject.ProvisionException: Guice provision errors:
 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
 handling URI: /jobhistory/jobcounters/job_1333482028750_0001
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 ...
 ..
 ...
 1) Error injecting constructor, java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56)
   while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock
 ...
 ..
 ...
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 There are task counters available if you drill down into successful tasks 
 though.





[jira] [Created] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2

2012-04-13 Thread Ahmed Radwan (Created) (JIRA)
Versioning and rolling upgrades for Yarn/MR2


 Key: MAPREDUCE-4150
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan


It doesn't seem that Yarn components, for example the ResourceManager or 
NodeManager, do build/package version checking before trying to communicate 
with each other. 

The objective of this ticket is to support the following requirements / use 
cases:

- New versions can be marked incompatible with old versions, and services 
should be prevented from communicating with each other in such case. This will 
avoid non-deterministic behavior/problems resulting from incompatible 
components trying to communicate with each other.

- Permitting a policy for running different - but compatible - versions on the 
same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 for 
the corresponding HDFS implementation.
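
A minimal sketch of the kind of check described above, assuming a policy where two components are compatible when their major and minor version numbers match. The class VersionPolicy, its method names, and the policy itself are hypothetical; neither YARN nor HDFS-2983 is claimed to work this way.

```java
// Hypothetical version gate: a service would call checkPeer() before
// accepting a connection from another component.
final class VersionPolicy {
    // Assumed policy: same major.minor means compatible (e.g. 0.23.1 and 0.23.2).
    static boolean isCompatible(String local, String remote) {
        String[] l = local.split("\\.");
        String[] r = remote.split("\\.");
        if (l.length < 2 || r.length < 2) {
            return false; // unparseable versions are treated as incompatible
        }
        return l[0].equals(r[0]) && l[1].equals(r[1]);
    }

    // Refuse to talk to an incompatible peer instead of failing unpredictably later.
    static void checkPeer(String local, String remote) {
        if (!isCompatible(local, remote)) {
            throw new IllegalStateException(
                "Incompatible versions: local=" + local + ", remote=" + remote);
        }
    }
}
```

Under this assumed policy a rolling upgrade from 0.23.1 to 0.23.2 would be permitted node by node, while a 0.23.1 NodeManager would be rejected by a 2.0.0 ResourceManager at connection time.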





[jira] [Commented] (MAPREDUCE-3927) Shuffle hang when set map.failures.percent

2012-04-13 Thread Bhallamudi Venkata Siva Kamesh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253225#comment-13253225
 ] 

Bhallamudi Venkata Siva Kamesh commented on MAPREDUCE-3927:
---

Hi MengWang,
 Any update on the patch?

 Shuffle hang when set map.failures.percent
 --

 Key: MAPREDUCE-3927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.21.0, 0.23.0
Reporter: MengWang
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-3927.patch, MAPREDUCE-3927.patch


 When mapred.max.map.failures.percent is set and some maps have failed, 
 the shuffle hangs.





[jira] [Commented] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2

2012-04-13 Thread Ahmed Radwan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253226#comment-13253226
 ] 

Ahmed Radwan commented on MAPREDUCE-4150:
-

I see the code has a YarnVersionInfo class, but as far as I can see, it is only 
used in displaying version info in the web UI.

 Versioning and rolling upgrades for Yarn/MR2
 

 Key: MAPREDUCE-4150
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan

 It doesn't seem that Yarn components, for example the ResourceManager or 
 NodeManager, do build/package version checking before trying to communicate 
 with each other. 
 The objective of this ticket is to support the following requirements / use 
 cases:
 - New versions can be marked incompatible with old versions, and services 
 should be prevented from communicating with each other in such case. This 
 will avoid non-deterministic behavior/problems resulting from incompatible 
 components trying to communicate with each other.
 - Permitting a policy for running different - but compatible - versions on 
 the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 
 for the corresponding HDFS implementation.





[jira] [Updated] (MAPREDUCE-4149) Rumen fails to parse certain counter strings

2012-04-13 Thread Ravi Gummadi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-4149:


Attachment: 4149.branch-1.v1.patch

Attaching patch for branch-1 with the fix. Also added testcase that fails 
without this fix and passes with the fix.

 Rumen fails to parse certain counter strings
 

 Key: MAPREDUCE-4149
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Attachments: 4149.branch-1.v1.patch, 4149.patch


 If a counter name contains { or }, Rumen is not able to parse it and 
 throws ParseException.





[jira] [Commented] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs

2012-04-13 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253252#comment-13253252
 ] 

Hadoop QA commented on MAPREDUCE-4102:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12522540/MAPREDUCE-4102.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
  org.apache.hadoop.yarn.server.TestDiskFailures
  org.apache.hadoop.yarn.server.TestContainerManagerSecurity
  org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
  org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry
  org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs
  org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobs
  org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesTasks
  org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesTasks
  org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebApp
  org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs
  org.apache.hadoop.mapred.TestMiniMRClasspath
  org.apache.hadoop.mapreduce.v2.TestMRJobs
  org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers
  org.apache.hadoop.mapred.TestMiniMRBringup
  org.apache.hadoop.mapred.TestMiniMRChildTask
  org.apache.hadoop.mapred.TestReduceFetch
  org.apache.hadoop.mapred.TestClusterMRNotification
  org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
  org.apache.hadoop.mapred.TestJobCounters
  org.apache.hadoop.mapreduce.TestChild
  org.apache.hadoop.mapred.TestMiniMRClientCluster
  org.apache.hadoop.ipc.TestSocketFactory
  org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
  org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
  org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
  org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
  org.apache.hadoop.mapred.TestClientRedirect
  org.apache.hadoop.mapred.TestLazyOutput
  org.apache.hadoop.mapred.TestJobCleanup
  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
  org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
  org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner
  org.apache.hadoop.conf.TestNoDefaultsJobConf
  org.apache.hadoop.mapreduce.v2.TestRMNMInfo
  org.apache.hadoop.mapred.TestClusterMapReduceTestCase
  org.apache.hadoop.mapreduce.v2.TestNonExistentJob
  org.apache.hadoop.mapred.TestJobSysDirWithDFS
  org.apache.hadoop.mapreduce.v2.TestUberAM
  org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser
  org.apache.hadoop.mapred.TestJobName
  org.apache.hadoop.mapreduce.security.TestJHSSecurity

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2218//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2218//console

This message is automatically generated.

 job counters not available in Jobhistory webui for killed jobs
 --

 Key: MAPREDUCE-4102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: webapps
Affects Versions: 0.23.2, 2.0.0
Reporter: Thomas Graves
 Attachments: MAPREDUCE-4102.patch


 Run a simple wordcount or sleep, and kill the job before it finishes.  Go to 
 the job history web ui and click the Counters link for that job. 

[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs

2012-04-13 Thread Bhallamudi Venkata Siva Kamesh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102:
--

Status: Open  (was: Patch Available)

Cancelling the patch to address the test failures

 job counters not available in Jobhistory webui for killed jobs
 --

 Key: MAPREDUCE-4102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: webapps
Affects Versions: 0.23.2, 2.0.0
Reporter: Thomas Graves
 Attachments: MAPREDUCE-4102.patch


 Run a simple wordcount or sleep, and kill the job before it finishes.  Go to 
 the job history web ui and click the Counters link for that job. It 
 displays 500 error.
 The job history log has:
 Caused by: com.google.inject.ProvisionException: Guice provision errors:
 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
 handling URI: /jobhistory/jobcounters/job_1333482028750_0001
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 ...
 ..
 ...
 1) Error injecting constructor, java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56)
   while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock
 ...
 ..
 ...
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 There are task counters available if you drill down into successful tasks 
 though.





[jira] [Commented] (MAPREDUCE-4149) Rumen fails to parse certain counter strings

2012-04-13 Thread Amar Kamat (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253264#comment-13253264
 ] 

Amar Kamat commented on MAPREDUCE-4149:
---

Ravi, let 'UserCounterMapper' extend 'IdentityMapper'. This way, you can set 
our special counters in UserCounterMapper's map api and then invoke 
IdentityMapper's map() api. Thoughts?

 Rumen fails to parse certain counter strings
 

 Key: MAPREDUCE-4149
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Attachments: 4149.branch-1.v1.patch, 4149.patch


 If a counter name contains { or }, Rumen is not able to parse it and 
 throws ParseException.





[jira] [Commented] (MAPREDUCE-4050) Invalid node link

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253332#comment-13253332
 ] 

Hudson commented on MAPREDUCE-4050:
---

Integrated in Hadoop-Hdfs-0.23-Build #226 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/226/])
Merge MAPREDUCE-4050 from trunk. For tasks without assigned containers, 
changes the node text on the UI to N/A instead of a link to null. (Contributed 
by Bhallamudi Venkata Siva Kamesh) (Revision 1325437)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325437
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java


 Invalid node link
 -

 Key: MAPREDUCE-4050
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4050
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Bhallamudi Venkata Siva Kamesh
Assignee: Bhallamudi Venkata Siva Kamesh
 Fix For: 0.23.3

 Attachments: MAPREDUCE-4050.patch, MAPREDUCE-4050.png


 When a task is in *UNASSIGNED* state, node link is displayed as +null+.
 But I think it is better to display the link as *N/A* rather than +null+.





[jira] [Commented] (MAPREDUCE-4140) mapreduce classes incorrectly importing clover.org.apache.* classes

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253331#comment-13253331
 ] 

Hudson commented on MAPREDUCE-4140:
---

Integrated in Hadoop-Hdfs-0.23-Build #226 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/226/])
Merge -r 1325351:1325352 from trunk to branch-0.23. Fixes: MAPREDUCE-4140 
(Revision 1325362)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325362
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetDelegationTokenRequest.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java


 mapreduce classes incorrectly importing clover.org.apache.* classes
 -

 Key: MAPREDUCE-4140
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4140
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Fix For: 0.23.3

 Attachments: MAPREDUCE-4140.patch, MAPREDUCE-4140.patch, 
 MAPREDUCE-4140.patch


 A number of classes in mapreduce are importing clover.org.apache.* classes
 e.g. 
 hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java





[jira] [Commented] (MAPREDUCE-4050) Invalid node link

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253349#comment-13253349
 ] 

Hudson commented on MAPREDUCE-4050:
---

Integrated in Hadoop-Hdfs-trunk #1013 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1013/])
MAPREDUCE-4050. For tasks without assigned containers, changes the node 
text on the UI to N/A instead of a link to null. (Contributed by Bhallamudi 
Venkata Siva Kamesh) (Revision 1325435)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325435
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java


 Invalid node link
 -

 Key: MAPREDUCE-4050
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4050
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Bhallamudi Venkata Siva Kamesh
Assignee: Bhallamudi Venkata Siva Kamesh
 Fix For: 0.23.3

 Attachments: MAPREDUCE-4050.patch, MAPREDUCE-4050.png


 When a task is in *UNASSIGNED* state, node link is displayed as +null+.
 But I think it is better to display the link as *N/A* rather than +null+.





[jira] [Commented] (MAPREDUCE-4147) YARN should not have a compile-time dependency on HDFS

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253351#comment-13253351
 ] 

Hudson commented on MAPREDUCE-4147:
---

Integrated in Hadoop-Hdfs-trunk #1013 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1013/])
MAPREDUCE-4147. YARN should not have a compile-time dependency on HDFS. 
(Revision 1325573)

 Result = FAILURE
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325573
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/pom.xml


 YARN should not have a compile-time dependency on HDFS
 --

 Key: MAPREDUCE-4147
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4147
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.1
Reporter: Tom White
Assignee: Tom White
 Fix For: 2.0.0

 Attachments: MAPREDUCE-4147.patch


 YARN doesn't (and shouldn't) use any HDFS-specific APIs, so it should not 
 declare HDFS as a compile-time dependency.





[jira] [Updated] (MAPREDUCE-4139) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped

2012-04-13 Thread Jason Lowe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4139:
--

Status: Open  (was: Patch Available)

 Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped
 ---

 Key: MAPREDUCE-4139
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4139
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4139.patch


 When the main thread calls ResourceManager$SchedulerEventDispatcher.stop() it 
 grabs a lock on the object, kicks the event processor thread, and then waits 
 for the thread to exit.  However the interrupted event processor thread can 
 end up trying to call the synchronized getConfig() method which results in 
 deadlock.
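
The locking pattern described above, and one common way to avoid it, can be sketched as follows. The class EventDispatcherSketch is a hypothetical stand-in, not the ResourceManager code, and joining outside the monitor is only one possible remedy; the attached patch may take a different approach.

```java
// Hypothetical stand-in for a dispatcher with a worker thread and a
// synchronized accessor, mirroring the shape of the reported problem.
final class EventDispatcherSketch {
    private final Thread worker;
    private volatile boolean stopped = false;

    EventDispatcherSketch() {
        worker = new Thread(() -> {
            while (!stopped) {
                try {
                    Thread.sleep(1000); // stands in for waiting on an event queue
                } catch (InterruptedException e) {
                    // The interrupted worker calls a synchronized method; this
                    // blocks for as long as another thread holds the monitor.
                    getConfig();
                }
            }
        });
    }

    synchronized String getConfig() {
        return "config";
    }

    void start() {
        worker.start();
    }

    void stop() {
        synchronized (this) {
            stopped = true;
            worker.interrupt();
        }
        // Join only after releasing the monitor. Calling join() inside the
        // synchronized block would deadlock: the worker would wait for the
        // monitor in getConfig() while this thread waits for the worker.
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isWorkerAlive() {
        return worker.isAlive();
    }
}
```

Moving worker.join() inside the synchronized block reproduces the hang described in the report, which is why the wait is kept outside the lock here.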





[jira] [Resolved] (MAPREDUCE-4128) AM Recovery expects all attempts of a completed task to also be completed.

2012-04-13 Thread Robert Joseph Evans (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans resolved MAPREDUCE-4128.


   Resolution: Fixed
Fix Version/s: (was: 3.0.0)
   2.0.0
   0.23.3

Thanks for your work on this Bikas.  I have put this into trunk, branch-2 and 
branch-0.23

 AM Recovery expects all attempts of a completed task to also be completed.
 --

 Key: MAPREDUCE-4128
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4128
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.0.0
Reporter: Bikas Saha
Assignee: Bikas Saha
 Fix For: 0.23.3, 2.0.0

 Attachments: MAPREDUCE-4128-1.patch, MAPREDUCE-4128-2.patch, 
 MAPREDUCE-4128.patch


 The AM seems to assume that all attempts of a completed task (from a previous 
 AM incarnation) would also be completed. There is at least one case in which 
 this does not hold. Case being cancellation of a completed task resulting in 
 a new running attempt.





[jira] [Updated] (MAPREDUCE-4139) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped

2012-04-13 Thread Jason Lowe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4139:
--

Attachment: MAPREDUCE-4139.patch

Updated patch to fix a Findbugs warning.

Test failures are unrelated to this patch.  They are the same failures that 
were recently reported for MAPREDUCE-4144 and others.

 Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped
 ---

 Key: MAPREDUCE-4139
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4139
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4139.patch, MAPREDUCE-4139.patch


 When the main thread calls ResourceManager$SchedulerEventDispatcher.stop() it 
 grabs a lock on the object, kicks the event processor thread, and then waits 
 for the thread to exit.  However the interrupted event processor thread can 
 end up trying to call the synchronized getConfig() method which results in 
 deadlock.





[jira] [Updated] (MAPREDUCE-4139) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped

2012-04-13 Thread Jason Lowe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4139:
--

Status: Patch Available  (was: Open)

 Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped
 ---

 Key: MAPREDUCE-4139
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4139
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4139.patch, MAPREDUCE-4139.patch


 When the main thread calls ResourceManager$SchedulerEventDispatcher.stop() it 
 grabs a lock on the object, kicks the event processor thread, and then waits 
 for the thread to exit.  However the interrupted event processor thread can 
 end up trying to call the synchronized getConfig() method which results in 
 deadlock.





[jira] [Commented] (MAPREDUCE-4128) AM Recovery expects all attempts of a completed task to also be completed.

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253379#comment-13253379
 ] 

Hudson commented on MAPREDUCE-4128:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2143 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2143/])
MAPREDUCE-4128. AM Recovery expects all attempts of a completed task to 
also be completed. (Bikas Saha via bobby) (Revision 1325765)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325765
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Task20LineHistoryEventEmitter.java


 AM Recovery expects all attempts of a completed task to also be completed.
 --

 Key: MAPREDUCE-4128
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4128
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.0.0
Reporter: Bikas Saha
Assignee: Bikas Saha
 Fix For: 0.23.3, 2.0.0

 Attachments: MAPREDUCE-4128-1.patch, MAPREDUCE-4128-2.patch, 
 MAPREDUCE-4128.patch


 The AM seems to assume that all attempts of a completed task (from a previous 
 AM incarnation) would also be completed. There is at least one case in which 
 this does not hold. Case being cancellation of a completed task resulting in 
 a new running attempt.
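
The assumption questioned above can be illustrated with a small Java sketch. It is not the committed fix and does not use the real AM classes: `RecoverySketch`, `Attempt`, and `AttemptState` are hypothetical, and the idea shown (recover only the attempts that actually finished, instead of assuming every attempt of a completed task did) is just one way to handle the case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these types are not the Hadoop AM recovery classes.
final class RecoverySketch {
  enum AttemptState { SUCCEEDED, FAILED, KILLED, RUNNING }

  static final class Attempt {
    final String id;
    final AttemptState state;

    Attempt(String id, AttemptState state) {
      this.id = id;
      this.state = state;
    }

    boolean isFinished() {
      return state != AttemptState.RUNNING;
    }
  }

  // Returns the ids of attempts that finished in the previous AM incarnation.
  // An attempt that was still running is skipped rather than treated as an error.
  static List<String> recoverableAttemptIds(List<Attempt> attempts) {
    List<String> ids = new ArrayList<>();
    for (Attempt attempt : attempts) {
      if (attempt.isFinished()) {
        ids.add(attempt.id);
      }
    }
    return ids;
  }
}
```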





[jira] [Commented] (MAPREDUCE-4128) AM Recovery expects all attempts of a completed task to also be completed.

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253380#comment-13253380
 ] 

Hudson commented on MAPREDUCE-4128:
---

Integrated in Hadoop-Common-trunk-Commit #2070 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2070/])
MAPREDUCE-4128. AM Recovery expects all attempts of a completed task to 
also be completed. (Bikas Saha via bobby) (Revision 1325765)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325765
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Task20LineHistoryEventEmitter.java


 AM Recovery expects all attempts of a completed task to also be completed.
 --

 Key: MAPREDUCE-4128
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4128
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.0.0
Reporter: Bikas Saha
Assignee: Bikas Saha
 Fix For: 0.23.3, 2.0.0

 Attachments: MAPREDUCE-4128-1.patch, MAPREDUCE-4128-2.patch, 
 MAPREDUCE-4128.patch


 The AM seems to assume that all attempts of a completed task (from a previous 
 AM incarnation) would also be completed. There is at least one case in which 
 this does not hold. Case being cancellation of a completed task resulting in 
 a new running attempt.





[jira] [Commented] (MAPREDUCE-4140) mapreduce classes incorrectly importing clover.org.apache.* classes

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253385#comment-13253385
 ] 

Hudson commented on MAPREDUCE-4140:
---

Integrated in Hadoop-Mapreduce-trunk #1048 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1048/])
MAPREDUCE-4140. mapreduce classes incorrectly importing 
clover.org.apache.* classes. Contributed by Patrick Hunt (Revision 1325352)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325352
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetDelegationTokenRequest.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java


 mapreduce classes incorrectly importing clover.org.apache.* classes
 -

 Key: MAPREDUCE-4140
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4140
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Fix For: 0.23.3

 Attachments: MAPREDUCE-4140.patch, MAPREDUCE-4140.patch, 
 MAPREDUCE-4140.patch


 A number of classes in mapreduce are importing clover.org.apache.* classes
 e.g. 
 hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java





[jira] [Commented] (MAPREDUCE-4128) AM Recovery expects all attempts of a completed task to also be completed.

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253394#comment-13253394
 ] 

Hudson commented on MAPREDUCE-4128:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2084 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2084/])
MAPREDUCE-4128. AM Recovery expects all attempts of a completed task to 
also be completed. (Bikas Saha via bobby) (Revision 1325765)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325765
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Task20LineHistoryEventEmitter.java


 AM Recovery expects all attempts of a completed task to also be completed.
 --

 Key: MAPREDUCE-4128
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4128
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.0.0
Reporter: Bikas Saha
Assignee: Bikas Saha
 Fix For: 0.23.3, 2.0.0

 Attachments: MAPREDUCE-4128-1.patch, MAPREDUCE-4128-2.patch, 
 MAPREDUCE-4128.patch


 The AM seems to assume that all attempts of a completed task (from a previous 
 AM incarnation) would also be completed. There is at least one case in which 
 this does not hold. Case being cancellation of a completed task resulting in 
 a new running attempt.





[jira] [Commented] (MAPREDUCE-4050) Invalid node link

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253389#comment-13253389
 ] 

Hudson commented on MAPREDUCE-4050:
---

Integrated in Hadoop-Mapreduce-trunk #1048 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1048/])
MAPREDUCE-4050. For tasks without assigned containers, changes the node 
text on the UI to N/A instead of a link to null. (Contributed by Bhallamudi 
Venkata Siva Kamesh) (Revision 1325435)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325435
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java


 Invalid node link
 -

 Key: MAPREDUCE-4050
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4050
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Bhallamudi Venkata Siva Kamesh
Assignee: Bhallamudi Venkata Siva Kamesh
 Fix For: 0.23.3

 Attachments: MAPREDUCE-4050.patch, MAPREDUCE-4050.png


 When a task is in *UNASSIGNED* state, node link is displayed as +null+.
 But I think it is better to display the link as *N/A* rather than +null+.
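
The display rule proposed above can be sketched as a small helper. This is a hedged illustration, not the code in TaskPage.java: the class `NodeLinkSketch`, the method `nodeText`, and the parameter `nodeHttpAddress` are hypothetical names, and the HTML shape is an assumption.

```java
// Hypothetical sketch; not the actual TaskPage rendering code.
final class NodeLinkSketch {
  // Returns plain "N/A" when no node is assigned, otherwise a link to the node.
  static String nodeText(String nodeHttpAddress) {
    if (nodeHttpAddress == null || nodeHttpAddress.isEmpty()) {
      return "N/A";                  // task has no container yet
    }
    return "<a href=\"http://" + nodeHttpAddress + "\">" + nodeHttpAddress + "</a>";
  }
}
```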





[jira] [Commented] (MAPREDUCE-4147) YARN should not have a compile-time dependency on HDFS

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253391#comment-13253391
 ] 

Hudson commented on MAPREDUCE-4147:
---

Integrated in Hadoop-Mapreduce-trunk #1048 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1048/])
MAPREDUCE-4147. YARN should not have a compile-time dependency on HDFS. 
(Revision 1325573)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325573
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/pom.xml


 YARN should not have a compile-time dependency on HDFS
 --

 Key: MAPREDUCE-4147
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4147
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.1
Reporter: Tom White
Assignee: Tom White
 Fix For: 2.0.0

 Attachments: MAPREDUCE-4147.patch


 YARN doesn't (and shouldn't) use any HDFS-specific APIs, so it should not 
 declare HDFS as a compile-time dependency.





[jira] [Updated] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook

2012-04-13 Thread Bhallamudi Venkata Siva Kamesh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4071:
--

Status: Patch Available  (was: Open)

 NPE while executing MRAppMaster shutdown hook
 -

 Key: MAPREDUCE-4071
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.3, 2.0.0
Reporter: Bhallamudi Venkata Siva Kamesh
 Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, 
 MAPREDUCE-4071.patch


 While running the shutdown hook of MRAppMaster, hit NPE
 {noformat}
 Exception in thread Thread-1 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004)
 {noformat}
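
The stack trace above does not say which reference was null, so the following Java sketch rests on an assumption: that the shutdown hook can run before the router's delegate has been set. All names here (`ShutdownHookSketch`, `Router`, `Allocator`) are hypothetical, and the null guard is one possible defensive pattern rather than the approach taken in the attached patches.

```java
// Hypothetical sketch; these are not the MRAppMaster classes.
final class ShutdownHookSketch {
  interface Allocator {
    void setSignalled(boolean isSignalled);
  }

  static final class Router {
    // Assumed to stay null until start() has been called.
    private volatile Allocator delegate;

    void start(Allocator allocator) {
      delegate = allocator;
    }

    // Returns false instead of throwing when there is nothing to signal yet.
    boolean setSignalled(boolean isSignalled) {
      Allocator current = delegate;  // read the field once
      if (current == null) {
        return false;
      }
      current.setSignalled(isSignalled);
      return true;
    }
  }
}
```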





[jira] [Updated] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook

2012-04-13 Thread Bhallamudi Venkata Siva Kamesh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4071:
--

Attachment: MAPREDUCE-4071-2.patch

To fix this issue, I refactored the existing code. After refactoring, I 
successfully ran the *wordcount* and *terasort* examples. 

Please review the patch and provide your comments.

 NPE while executing MRAppMaster shutdown hook
 -

 Key: MAPREDUCE-4071
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.3, 2.0.0
Reporter: Bhallamudi Venkata Siva Kamesh
 Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, 
 MAPREDUCE-4071.patch


 While running the shutdown hook of MRAppMaster, hit NPE
 {noformat}
 Exception in thread Thread-1 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004)
 {noformat}





[jira] [Created] (MAPREDUCE-4151) RM scheduler web page should filter apps to those that are relevant to scheduling

2012-04-13 Thread Jason Lowe (Created) (JIRA)
RM scheduler web page should filter apps to those that are relevant to 
scheduling
-

 Key: MAPREDUCE-4151
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4151
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2, webapps
Reporter: Jason Lowe


On the ResourceManager's scheduler web page, the bottom of the page shows the 
apps block.  When the cluster has run a lot of applications (e.g., 10,000+), 
loading the apps table can take a long time, and that delays the plotting of 
the queue status, which is the most interesting portion of the page.

If the user is bothering to go to the scheduler page, they're probably not 
interested in apps that are not affecting what the scheduler is doing (e.g., 
FINISHED, FAILED, KILLED, etc.).  Having the RM filter the apps for this page 
should significantly reduce the time it takes to load this page on the client, 
and it also reduces the number of apps the user has to sift through when 
looking for the apps that are affecting what the scheduler is doing.
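
A minimal Java sketch of the proposed server-side filtering follows. It is an illustration only: `SchedulerPageFilterSketch`, `App`, and the non-terminal state names are hypothetical, and only FINISHED, FAILED, and KILLED are taken from the description above.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the ResourceManager web app code.
final class SchedulerPageFilterSketch {
  // NEW, SUBMITTED, ACCEPTED and RUNNING are assumed names for live states.
  enum AppState { NEW, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

  // States that no longer influence scheduling decisions.
  static final Set<AppState> TERMINAL =
      EnumSet.of(AppState.FINISHED, AppState.FAILED, AppState.KILLED);

  static final class App {
    final String id;
    final AppState state;

    App(String id, AppState state) {
      this.id = id;
      this.state = state;
    }
  }

  // Returns the ids of apps still relevant to the scheduler, in input order.
  static List<String> relevantIds(List<App> allApps) {
    List<String> ids = new ArrayList<>();
    for (App app : allApps) {
      if (!TERMINAL.contains(app.state)) {
        ids.add(app.id);
      }
    }
    return ids;
  }
}
```

Filtering before the table is rendered means the client never receives rows for the terminal apps, which is where the load-time saving would come from.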





[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs

2012-04-13 Thread Bhallamudi Venkata Siva Kamesh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102:
--

Attachment: MAPREDUCE-4102-1.patch

Not all of the test failures were caused by this patch. I fixed the relevant 
test failures and ran all the test cases shown above.
All the tests passed in my environment.

I did not modify the *src* changes, only the *test* changes.
The patch contains a test case that fails without the src changes and passes 
with them.
Please review.

 job counters not available in Jobhistory webui for killed jobs
 --

 Key: MAPREDUCE-4102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: webapps
Affects Versions: 0.23.2, 2.0.0
Reporter: Thomas Graves
 Attachments: MAPREDUCE-4102-1.patch, MAPREDUCE-4102.patch


 Run a simple wordcount or sleep, and kill the job before it finishes.  Go to 
 the job history web ui and click the Counters link for that job. It 
 displays a 500 error.
 The job history log has:
 Caused by: com.google.inject.ProvisionException: Guice provision errors:
 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
 handling URI: /jobhistory/jobcounters/job_1333482028750_0001
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 ...
 ..
 ...
 1) Error injecting constructor, java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56)
   while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock
 ...
 ..
 ...
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 There are task counters available if you drill down into successful tasks 
 though.
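
The trace above shows a NullPointerException while counters are being summed. A null-tolerant aggregation can be sketched as follows, under the assumption that some tasks of a killed job report no counters at all. The class `CounterTotalsSketch` and its map-based representation are hypothetical and stand in for the real Counters classes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; plain maps stand in for Hadoop's Counters type.
final class CounterTotalsSketch {
  // Sums counter values across tasks, skipping tasks with no counters.
  static Map<String, Long> total(List<Map<String, Long>> perTaskCounters) {
    Map<String, Long> totals = new HashMap<>();
    for (Map<String, Long> counters : perTaskCounters) {
      if (counters == null) {
        continue;                    // assumed case: task never reported counters
      }
      for (Map.Entry<String, Long> entry : counters.entrySet()) {
        totals.merge(entry.getKey(), entry.getValue(), Long::sum);
      }
    }
    return totals;
  }
}
```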





[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs

2012-04-13 Thread Bhallamudi Venkata Siva Kamesh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102:
--

Status: Patch Available  (was: Open)

 job counters not available in Jobhistory webui for killed jobs
 --

 Key: MAPREDUCE-4102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: webapps
Affects Versions: 0.23.2, 2.0.0
Reporter: Thomas Graves
 Attachments: MAPREDUCE-4102-1.patch, MAPREDUCE-4102.patch


 Run a simple wordcount or sleep, and kill the job before it finishes.  Go to 
 the job history web ui and click the Counters link for that job. It 
 displays a 500 error.
 The job history log has:
 Caused by: com.google.inject.ProvisionException: Guice provision errors:
 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
 handling URI: /jobhistory/jobcounters/job_1333482028750_0001
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 ...
 ..
 ...
 1) Error injecting constructor, java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56)
   while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock
 ...
 ..
 ...
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 There are task counters available if you drill down into successful tasks 
 though.





[jira] [Commented] (MAPREDUCE-4139) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped

2012-04-13 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253439#comment-13253439
 ] 

Hadoop QA commented on MAPREDUCE-4139:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12522569/MAPREDUCE-4139.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2219//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2219//console

This message is automatically generated.

 Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped
 ---

 Key: MAPREDUCE-4139
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4139
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4139.patch, MAPREDUCE-4139.patch


 When the main thread calls ResourceManager$SchedulerEventDispatcher.stop() it 
 grabs a lock on the object, kicks the event processor thread, and then waits 
 for the thread to exit.  However the interrupted event processor thread can 
 end up trying to call the synchronized getConfig() method which results in 
 deadlock.





[jira] [Commented] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2

2012-04-13 Thread Robert Joseph Evans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253450#comment-13253450
 ] 

Robert Joseph Evans commented on MAPREDUCE-4150:


As part of this I would like to see something like HDFS-3245 so that if we do 
support multiple versions/rolling upgrades we can display a breakdown of what 
versions the cluster is currently running with.

 Versioning and rolling upgrades for Yarn/MR2
 

 Key: MAPREDUCE-4150
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan

 It doesn't seem that Yarn components, for example the ResourceManager or 
 NodeManager, do build/package version checking before trying to communicate 
 with each other. 
 The objective of this ticket is to support the following requirements / use 
 cases:
 - New versions can be marked incompatible with old versions, and services 
 should be prevented from communicating with each other in such case. This 
 will avoid non-deterministic behavior/problems resulting from incompatible 
 components trying to communicate with each other.
 - Permitting a policy for running different - but compatible - versions on 
 the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 
 for the corresponding HDFS implementation.
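
The two requirements above can be sketched as a version check performed before components talk to each other. Everything in this example is an assumption made for illustration: the class `VersionCheckSketch` is hypothetical, and the policy "same major version means compatible" is invented here, not taken from this ticket or from HDFS-2983.

```java
// Hypothetical sketch; the compatibility policy is an invented example.
final class VersionCheckSketch {
  // Extracts the leading numeric component, e.g. "0.23.1" gives 0.
  static int major(String version) {
    int dot = version.indexOf('.');
    return Integer.parseInt(dot < 0 ? version : version.substring(0, dot));
  }

  // Assumed policy: versions are compatible when their major numbers match.
  static boolean isCompatible(String localVersion, String remoteVersion) {
    return major(localVersion) == major(remoteVersion);
  }

  // Refuses to talk to an incompatible peer instead of failing unpredictably later.
  static void checkPeer(String localVersion, String remoteVersion) {
    if (!isCompatible(localVersion, remoteVersion)) {
      throw new IllegalStateException(
          "Incompatible versions: local=" + localVersion + ", remote=" + remoteVersion);
    }
  }
}
```

Under this invented policy, a rolling upgrade from 0.23.1 to 0.23.3 would be allowed to mix nodes, while a 0.23.x node would be rejected by a 2.0.0 peer.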





[jira] [Created] (MAPREDUCE-4152) map task left hanging after AM dies trying to connect to RM

2012-04-13 Thread Thomas Graves (Created) (JIRA)
map task left hanging after AM dies trying to connect to RM
---

 Key: MAPREDUCE-4152
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4152
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.2
Reporter: Thomas Graves
Assignee: Thomas Graves


We had an instance where the RM went down for more than an hour.  The 
application master exited with Could not contact RM after 36 milliseconds

2012-04-11 10:43:36,040 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1333003059741_15999Job 
Transitioned from RUNNING to ERROR








[jira] [Commented] (MAPREDUCE-4152) map task left hanging after AM dies trying to connect to RM

2012-04-13 Thread Thomas Graves (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253460#comment-13253460
 ] 

Thomas Graves commented on MAPREDUCE-4152:
--

The Job did not kill off the map task that it had running before exiting.  In 
JobImpl, when it moves from RUNNING to ERROR, all it does is send the 
JobUnsuccessfulCompletion event.  I would think it would at least try to kill 
any tasks it has.

There might also be another issue with the NM as to why it didn't kill the task.  
The NM was also not able to connect to the RM, and I saw one of the threads 
restart. I'm guessing that when it restarted it lost that container, but I need 
to investigate that further.
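
The behavior suggested in this comment can be sketched as a transition hook that kills running tasks before reporting the failure. This is a hedged illustration, not JobImpl code: `JobErrorTransitionSketch` and the event strings `T_KILL` and `JOB_UNSUCCESSFUL_COMPLETION` are hypothetical placeholders for whatever events the real state machine would dispatch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; strings stand in for real AM events.
final class JobErrorTransitionSketch {
  // Builds the events to emit when a job moves from RUNNING to ERROR:
  // one kill request per running task, then the unsuccessful-completion event.
  static List<String> onRunningToError(List<String> runningTaskIds) {
    List<String> events = new ArrayList<>();
    for (String taskId : runningTaskIds) {
      events.add("T_KILL " + taskId);
    }
    events.add("JOB_UNSUCCESSFUL_COMPLETION");
    return events;
  }
}
```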



 map task left hanging after AM dies trying to connect to RM
 ---

 Key: MAPREDUCE-4152
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4152
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.2
Reporter: Thomas Graves
Assignee: Thomas Graves

 We had an instance where the RM went down for more than an hour.  The 
 application master exited with Could not contact RM after 36 
 milliseconds
 2012-04-11 10:43:36,040 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: 
 job_1333003059741_15999Job Transitioned from RUNNING to ERROR





[jira] [Commented] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs

2012-04-13 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253481#comment-13253481
 ] 

Hadoop QA commented on MAPREDUCE-4102:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12522573/MAPREDUCE-4102-1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2221//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2221//console

This message is automatically generated.

 job counters not available in Jobhistory webui for killed jobs
 --

 Key: MAPREDUCE-4102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: webapps
Affects Versions: 0.23.2, 2.0.0
Reporter: Thomas Graves
 Attachments: MAPREDUCE-4102-1.patch, MAPREDUCE-4102.patch


 Run a simple wordcount or sleep job, and kill it before it finishes.  Go to 
 the job history web UI and click the Counters link for that job.  It 
 displays a 500 error.
 The job history log has:
 Caused by: com.google.inject.ProvisionException: Guice provision errors:
 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
 handling URI: /jobhistory/jobcounters/job_1333482028750_0001
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 ...
 ..
 ...
 1) Error injecting constructor, java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56)
   while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock
 ...
 ..
 ...
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 There are task counters available if you drill down into successful tasks 
 though.
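The trace above shows a null reaching AbstractCounters.incrAllCounters from CountersBlock.getCounters. As a rough sketch of the kind of guard that avoids this, the code below sums per-task counters and skips any task whose counters are null. It assumes the null comes from a task that never reported counters, which the report does not confirm, and the class, the method, and the map-based counter representation are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; counters are modeled as plain name -> value maps.
final class NullSafeCounterSum {
    // Sums counters across tasks, ignoring tasks that have no counters.
    static Map<String, Long> sum(List<Map<String, Long>> perTaskCounters) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> counters : perTaskCounters) {
            if (counters == null) {
                // Assumption: a killed task may never have reported counters.
                continue;
            }
            for (Map.Entry<String, Long> entry : counters.entrySet()) {
                total.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return total;
    }
}
```

With a guard like this the page could still show the counters of the tasks that did report, which matches the observation that task counters are available for successful tasks.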





[jira] [Updated] (MAPREDUCE-4008) ResourceManager throws MetricsException on start up saying QueueMetrics MBean already exists

2012-04-13 Thread Devaraj K (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4008:
-

Component/s: (was: resourcemanager)

 ResourceManager throws MetricsException on start up saying QueueMetrics MBean 
 already exists
 

 Key: MAPREDUCE-4008
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4008
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, scheduler
Affects Versions: 2.0.0, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
 Attachments: MAPREDUCE-4008.patch


 {code:xml}
 2012-03-14 15:22:23,089 WARN org.apache.hadoop.metrics2.util.MBeans: Error 
 creating MBean object name: 
 Hadoop:service=ResourceManager,name=QueueMetrics,q0=default
 org.apache.hadoop.metrics2.MetricsException: 
 org.apache.hadoop.metrics2.MetricsException: 
 Hadoop:service=ResourceManager,name=QueueMetrics,q0=default already exists!
   at 
 org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:117)
   at 
 org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newMBeanName(DefaultMetricsSystem.java:102)
   at org.apache.hadoop.metrics2.util.MBeans.getMBeanName(MBeans.java:91)
   at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:55)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:218)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:93)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:243)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:227)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:288)
   at $Proxy6.postStart(Unknown Source)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:183)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:155)
   at 
 org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:54)
   at 
 org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.start(ResourceManager.java:454)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:588)
 Caused by: org.apache.hadoop.metrics2.MetricsException: 
 Hadoop:service=ResourceManager,name=QueueMetrics,q0=default already exists!
   at 
 org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:113)
   ... 19 more
 2012-03-14 15:22:23,090 WARN org.apache.hadoop.metrics2.util.MBeans: Failed 
 to register MBean null
 javax.management.RuntimeOperationsException: Exception occurred trying to 
 register the MBean
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:969)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:917)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:312)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:482)
   at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:57)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:218)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:93)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:243)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:227)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:288)
   at $Proxy6.postStart(Unknown Source)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:183)
   at 
 

[jira] [Updated] (MAPREDUCE-4048) NullPointerException exception while accessing the Application Master UI

2012-04-13 Thread Devaraj K (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4048:
-

Component/s: mrv2

 NullPointerException exception while accessing the Application Master UI
 

 Key: MAPREDUCE-4048
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4048
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
 Attachments: MAPREDUCE-4048.patch


 {code:xml}
 2012-03-21 10:21:31,838 ERROR [2145015588@qtp-957250718-801] 
 org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: 
 /mapreduce/attempts/job_1332261815858_2_8/m/KILLED
 java.lang.reflect.InvocationTargetException
 at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:150)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at 
 com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
 at 
 com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
 at 
 com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
 at 
 com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
 at 
 com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
 at 
 com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
 ...
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
 at 
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
 at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
 Caused by: java.lang.NullPointerException
 at com.google.common.base.Joiner.toString(Joiner.java:317)
 at com.google.common.base.Joiner.appendTo(Joiner.java:97)
 at com.google.common.base.Joiner.appendTo(Joiner.java:127)
 at com.google.common.base.Joiner.join(Joiner.java:158)
 at com.google.common.base.Joiner.join(Joiner.java:166)
 at 
 org.apache.hadoop.yarn.util.StringHelper.join(StringHelper.java:102)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.AppController.badRequest(AppController.java:319)
 at 
 org.apache.hadoop.mapreduce.v2.app.webapp.AppController.attempts(AppController.java:286)
 ... 36 more
 {code}
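The trace ends in Guava's Joiner, which throws a NullPointerException when one of the joined elements is null unless it is configured to skip or replace nulls. A minimal sketch of a null-tolerant join, using only the JDK, could look like the following. The class and method names are hypothetical, and this is not necessarily how the attached patch addresses the problem.

```java
// Hypothetical helper; not the real StringHelper.
final class NullSafeJoin {
    // Concatenates the parts in order, silently skipping null elements.
    static String join(Object... parts) {
        if (parts == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (Object part : parts) {
            if (part != null) {
                sb.append(part);
            }
        }
        return sb.toString();
    }
}
```

For example, `NullSafeJoin.join("Bad request: ", null)` yields `"Bad request: "` instead of failing while the error page is being built.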





[jira] [Updated] (MAPREDUCE-3494) The RM should handle the graceful shutdown of the NM.

2012-04-13 Thread Devaraj K (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-3494:
-

Component/s: (was: resourcemanager)

 The RM should handle the graceful shutdown of the NM.
 -

 Key: MAPREDUCE-3494
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3494
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 2.0.0, 3.0.0
Reporter: Ravi Teja Ch N V
Assignee: Devaraj K
 Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
 MAPREDUCE-3494.patch


 Instead of waiting for the NM expiry, the RM should remove and handle an NM 
 that is shut down gracefully.





[jira] [Commented] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook

2012-04-13 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253503#comment-13253503
 ] 

Hadoop QA commented on MAPREDUCE-4071:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12522570/MAPREDUCE-4071-2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
  org.apache.hadoop.yarn.server.TestDiskFailures
  org.apache.hadoop.yarn.server.TestContainerManagerSecurity
  
org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
  
org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry
  
org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
  
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs
  org.apache.hadoop.mapreduce.v2.app.TestRecovery
  org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator
  org.apache.hadoop.mapred.TestMiniMRClasspath
  org.apache.hadoop.mapreduce.v2.TestMRJobs
  org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers
  org.apache.hadoop.mapred.TestMiniMRBringup
  org.apache.hadoop.mapred.TestMiniMRChildTask
  org.apache.hadoop.mapred.TestReduceFetch
  org.apache.hadoop.mapred.TestClusterMRNotification
  org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
  org.apache.hadoop.mapred.TestJobCounters
  org.apache.hadoop.mapreduce.TestChild
  org.apache.hadoop.mapred.TestMiniMRClientCluster
  org.apache.hadoop.ipc.TestSocketFactory
  org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
  org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
  org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
  org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
  org.apache.hadoop.mapred.TestClientRedirect
  org.apache.hadoop.mapred.TestLazyOutput
  org.apache.hadoop.mapred.TestJobCleanup
  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
  org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
  org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner
  org.apache.hadoop.conf.TestNoDefaultsJobConf
  org.apache.hadoop.mapreduce.v2.TestRMNMInfo
  org.apache.hadoop.mapred.TestClusterMapReduceTestCase
  org.apache.hadoop.mapreduce.v2.TestNonExistentJob
  org.apache.hadoop.mapred.TestJobSysDirWithDFS
  org.apache.hadoop.mapreduce.v2.TestUberAM
  org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser
  org.apache.hadoop.mapred.TestJobName
  org.apache.hadoop.mapreduce.security.TestJHSSecurity

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2220//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2220//console

This message is automatically generated.

 NPE while executing MRAppMaster shutdown hook
 -

 Key: MAPREDUCE-4071
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.3, 2.0.0
Reporter: Bhallamudi Venkata Siva Kamesh
 Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, 
 MAPREDUCE-4071.patch


 While running the shutdown hook of MRAppMaster, hit NPE
 {noformat}
 Exception in thread Thread-1 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004)
 

[jira] [Updated] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy

2012-04-13 Thread Bikas Saha (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-3921:
--

Status: Open  (was: Patch Available)

 MR AM should act on the nodes liveliness information when nodes go 
 up/down/unhealthy
 

 Key: MAPREDUCE-3921
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, 
 MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, 
 MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch








[jira] [Updated] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy

2012-04-13 Thread Bikas Saha (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-3921:
--

Attachment: MAPREDUCE-3921-3.patch

Attaching patch after pulling latest changes and improved test for AM recovery.

 MR AM should act on the nodes liveliness information when nodes go 
 up/down/unhealthy
 

 Key: MAPREDUCE-3921
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, 
 MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, 
 MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch








[jira] [Updated] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy

2012-04-13 Thread Bikas Saha (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-3921:
--

Status: Patch Available  (was: Open)

 MR AM should act on the nodes liveliness information when nodes go 
 up/down/unhealthy
 

 Key: MAPREDUCE-3921
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, 
 MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, 
 MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch








[jira] [Created] (MAPREDUCE-4153) NM Application invalid state transition on reboot command from RM

2012-04-13 Thread Thomas Graves (Created) (JIRA)
NM Application invalid state transition on reboot command from RM
-

 Key: MAPREDUCE-4153
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4153
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.2
Reporter: Thomas Graves


If the RM goes down and comes back up, it tells the NM to reboot.  When the NM 
reboots, if it has any applications, it aggregates the logs for those 
applications and then transitions each app to APPLICATION_LOG_HANDLING_FINISHED. 
I saw a case where an app that was in the RUNNING state tried to 
transition to APPLICATION_LOG_HANDLING_FINISHED and got an invalid state 
transition.

 [DeletionService #1]2012-04-11 15:12:40,476 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
 Can't handle this event at current state
 [AsyncDispatcher event 
handler]org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid 
event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:382)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:517)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:509)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74)
at java.lang.Thread.run(Thread.java:619)
2012-04-11 15:12:40,476 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
 Application application_1333003059741_15999 transitioned from RUNNING to null
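To illustrate how this kind of failure arises, here is a small table-driven state machine sketch. An event is only accepted if a transition is registered for the current state, so delivering APPLICATION_LOG_HANDLING_FINISHED while still RUNNING is rejected. The state AGGREGATING_LOGS, the event FINISH_APPLICATION, and the class itself are hypothetical and do not mirror the real NM state machine or StateMachineFactory.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified state machine; not the real ApplicationImpl.
final class AppStateMachineSketch {
    enum State { RUNNING, AGGREGATING_LOGS, FINISHED }
    enum Event { FINISH_APPLICATION, APPLICATION_LOG_HANDLING_FINISHED }

    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.RUNNING;

    AppStateMachineSketch() {
        // Only these two transitions are legal in this sketch.
        addTransition(State.RUNNING, Event.FINISH_APPLICATION, State.AGGREGATING_LOGS);
        addTransition(State.AGGREGATING_LOGS,
                Event.APPLICATION_LOG_HANDLING_FINISHED, State.FINISHED);
    }

    private void addTransition(State from, Event on, State to) {
        table.computeIfAbsent(from, k -> new EnumMap<Event, State>(Event.class)).put(on, to);
    }

    // Applies the event, or throws if no transition is registered for it.
    State handle(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
        return current;
    }

    State current() { return current; }
}
```

In this sketch the state is left unchanged when an event is rejected. Whether the right fix is to finish the application first or to register a transition for the RUNNING state is a design question the sketch does not answer.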





[jira] [Commented] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy

2012-04-13 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253678#comment-13253678
 ] 

Hadoop QA commented on MAPREDUCE-3921:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12522596/MAPREDUCE-3921-3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 508 javac compiler warnings (more 
than the trunk's current 506 warnings).

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
  org.apache.hadoop.yarn.server.TestDiskFailures
  org.apache.hadoop.yarn.server.TestContainerManagerSecurity
  
org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
  
org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry
  
org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
  
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs
  org.apache.hadoop.mapred.TestMiniMRClasspath
  org.apache.hadoop.mapreduce.v2.TestMRJobs
  org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers
  org.apache.hadoop.mapred.TestMiniMRBringup
  org.apache.hadoop.mapred.TestMiniMRChildTask
  org.apache.hadoop.mapred.TestReduceFetch
  org.apache.hadoop.mapred.TestClusterMRNotification
  org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
  org.apache.hadoop.mapred.TestJobCounters
  org.apache.hadoop.mapreduce.TestChild
  org.apache.hadoop.mapred.TestMiniMRClientCluster
  org.apache.hadoop.ipc.TestSocketFactory
  org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
  org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
  org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
  org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
  org.apache.hadoop.mapred.TestClientRedirect
  org.apache.hadoop.mapred.TestLazyOutput
  org.apache.hadoop.mapred.TestJobCleanup
  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
  org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
  org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner
  org.apache.hadoop.conf.TestNoDefaultsJobConf
  org.apache.hadoop.mapreduce.v2.TestRMNMInfo
  org.apache.hadoop.mapred.TestClusterMapReduceTestCase
  org.apache.hadoop.mapreduce.v2.TestNonExistentJob
  org.apache.hadoop.mapred.TestJobSysDirWithDFS
  org.apache.hadoop.mapreduce.v2.TestUberAM
  org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser
  org.apache.hadoop.mapred.TestJobName
  org.apache.hadoop.mapreduce.security.TestJHSSecurity

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build///testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build///console

This message is automatically generated.

 MR AM should act on the nodes liveliness information when nodes go 
 up/down/unhealthy
 

 Key: MAPREDUCE-3921
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, 
 MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, 
 MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch





[jira] [Updated] (MAPREDUCE-3972) Locking and exception issues in JobHistory Server.

2012-04-13 Thread Robert Joseph Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3972:
---

Attachment: MR-3972.txt

This patch should address all of the review comments so far.

The JobListCache is now a ConcurrentSkipListMap with no locking around it.  To 
do this I declared it safe that we may delete a few more jobs from the cache 
than expected.

The HistoryStorage class is no longer informed about items being removed from 
HDFS.  The CachedHistoryStorage tries to preemptively know that they were 
removed, but it is not that critical if it does not happen.

Please take a look: now that there is no locking around the JobListCache, 
some extra checks were needed to avoid NPEs.
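As a rough illustration of the approach described above, the sketch below keeps a bounded cache in a ConcurrentSkipListMap without any external lock. Eviction is best effort, so concurrent writers may remove a few more entries than strictly necessary, and readers must tolerate a null result for an entry that has just been evicted. The class name, the String value type, and the size limit are assumptions for illustration and are not taken from the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of a lock-free, bounded job list cache.
final class JobListCacheSketch {
    private final ConcurrentSkipListMap<String, String> cache = new ConcurrentSkipListMap<>();
    private final int maxSize;

    JobListCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    void put(String jobId, String info) {
        cache.put(jobId, info);
        // No lock is held here: two writers may both see the cache as too
        // large and both evict, so slightly more than needed can be removed.
        // Note that size() on a ConcurrentSkipListMap is not constant time.
        while (cache.size() > maxSize) {
            Map.Entry<String, String> oldest = cache.pollFirstEntry();
            if (oldest == null) {
                break; // another thread emptied the cache in the meantime
            }
        }
    }

    // May return null if the entry was never cached or was evicted concurrently.
    String get(String jobId) {
        return cache.get(jobId);
    }

    // Callers check for null instead of assuming the entry is still present.
    String describe(String jobId) {
        String info = cache.get(jobId);
        return info == null ? "unknown" : info;
    }

    int size() { return cache.size(); }
}
```

The null check in `describe` is the kind of extra check that becomes necessary once nothing prevents an entry from disappearing between a lookup and its use.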

 Locking and exception issues in JobHistory Server.
 --

 Key: MAPREDUCE-3972
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3972
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt, 
 MR-3972.txt


 The JobHistory server's locking is inconsistent and wrong in some cases.  
 This is not super critical because the issues would only show up if a job is 
 being cleaned up or moved from intermediate done to done, at the same time it 
 is being parsed into a CompletedJob.  However the locking is slowing down the 
 server in some cases, and is a ticking time bomb that needs to be addressed.
 As part of this too we need to be sure that the Cleaner and Intermediate to 
 Done migration threads handle exceptions properly.  Now it appears that the 
 exception is logged, and the thread just shuts down.  This means that the 
 history server could still be up and running for weeks and never remove old 
 jobs.  
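One common way to keep a periodic background task alive after a failure is to catch exceptions inside the task body, so that a single bad run is logged and the next run still happens. The sketch below shows such a wrapper. It assumes the Cleaner and migration work can be expressed as Runnables, and the class and method names are hypothetical.

```java
import java.util.function.Consumer;

// Hypothetical wrapper for periodic background work.
final class GuardedPeriodicTask {
    // Returns a Runnable that reports unchecked exceptions to onError
    // instead of letting them escape and end the periodic schedule.
    static Runnable guarded(Runnable body, Consumer<RuntimeException> onError) {
        return () -> {
            try {
                body.run();
            } catch (RuntimeException e) {
                onError.accept(e); // e.g. log it, then wait for the next run
            }
        };
    }
}
```

This matters in particular with `ScheduledExecutorService.scheduleAtFixedRate`, which suppresses all subsequent executions once one execution throws. Passing the guarded Runnable instead of the raw one keeps the schedule going.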





[jira] [Updated] (MAPREDUCE-3972) Locking and exception issues in JobHistory Server.

2012-04-13 Thread Robert Joseph Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3972:
---

Status: Patch Available  (was: Open)

 Locking and exception issues in JobHistory Server.
 --

 Key: MAPREDUCE-3972
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3972
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt, 
 MR-3972.txt


 The JobHistory server's locking is inconsistent and wrong in some cases.  
 This is not super critical because the issues would only show up if a job is 
 being cleaned up or moved from intermediate done to done, at the same time it 
 is being parsed into a CompletedJob.  However the locking is slowing down the 
 server in some cases, and is a ticking time bomb that needs to be addressed.
 As part of this too we need to be sure that the Cleaner and Intermediate to 
 Done migration threads handle exceptions properly.  Now it appears that the 
 exception is logged, and the thread just shuts down.  This means that the 
 history server could still be up and running for weeks and never remove old 
 jobs.  





[jira] [Commented] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2

2012-04-13 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253738#comment-13253738
 ] 

Aaron T. Myers commented on MAPREDUCE-4150:
---

One of the first steps in implementing this should probably be to move 
VersionUtil into Common from HDFS.

 Versioning and rolling upgrades for Yarn/MR2
 

 Key: MAPREDUCE-4150
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan

 It doesn't seem that Yarn components, for example the ResourceManager or 
 NodeManager, do build/package version checking before trying to communicate 
 with each other. 
 The objective of this ticket is to support the following requirements / use 
 cases:
 - New versions can be marked incompatible with old versions, and services 
 should be prevented from communicating with each other in such a case. This 
 will avoid non-deterministic behavior/problems resulting from incompatible 
 components trying to communicate with each other.
 - Permitting a policy for running different - but compatible - versions on 
 the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 
 for the corresponding HDFS implementation.
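
The two requirements above can be sketched as a version check performed before two services talk to each other. This is a minimal illustration under stated assumptions: the policy shown (accept any peer at or above a configured minimum version) is only one possible compatibility rule, and VersionCheck with its methods is a hypothetical helper, not the Hadoop VersionUtil API mentioned in the comment.

```java
/** Hypothetical helper for illustration; not the Hadoop VersionUtil API. */
final class VersionCheck {
    private VersionCheck() {}

    /** Compares dotted numeric versions such as "0.23.1"; missing parts count as 0. */
    static int compare(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Assumed policy: a peer is accepted if it is at least the minimum version. */
    static boolean isCompatible(String peerVersion, String minimumSupported) {
        return compare(peerVersion, minimumSupported) >= 0;
    }

    /** Rejects an incompatible peer up front instead of failing unpredictably later. */
    static void checkPeer(String peerVersion, String minimumSupported) {
        if (!isCompatible(peerVersion, minimumSupported)) {
            throw new IllegalStateException("Incompatible peer version " + peerVersion
                + "; minimum supported is " + minimumSupported);
        }
    }
}
```

With a minimum of 0.23.1, a peer reporting 0.23.2 would be admitted (the rolling-upgrade case) while a peer reporting 0.22.0 would be refused. The sketch handles purely numeric components only; suffixes such as -SNAPSHOT would need extra parsing.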





[jira] [Assigned] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2

2012-04-13 Thread Ahmed Radwan (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan reassigned MAPREDUCE-4150:
---

Assignee: Ahmed Radwan

 Versioning and rolling upgrades for Yarn/MR2
 

 Key: MAPREDUCE-4150
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan

 It doesn't seem that Yarn components, for example the ResourceManager or 
 NodeManager, do build/package version checking before trying to communicate 
 with each other. 
 The objective of this ticket is to support the following requirements / use 
 cases:
 - New versions can be marked incompatible with old versions, and services 
 should be prevented from communicating with each other in such a case. This 
 will avoid non-deterministic behavior/problems resulting from incompatible 
 components trying to communicate with each other.
 - Permitting a policy for running different - but compatible - versions on 
 the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 
 for the corresponding HDFS implementation.





[jira] [Commented] (MAPREDUCE-3972) Locking and exception issues in JobHistory Server.

2012-04-13 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253792#comment-13253792
 ] 

Hadoop QA commented on MAPREDUCE-3972:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12522611/MR-3972.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
  org.apache.hadoop.yarn.server.TestDiskFailures
  org.apache.hadoop.yarn.server.TestContainerManagerSecurity
  org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
  org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry
  org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs
  org.apache.hadoop.mapred.TestMiniMRClasspath
  org.apache.hadoop.mapreduce.v2.TestMRJobs
  org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers
  org.apache.hadoop.mapred.TestMiniMRBringup
  org.apache.hadoop.mapred.TestMiniMRChildTask
  org.apache.hadoop.mapred.TestReduceFetch
  org.apache.hadoop.mapred.TestClusterMRNotification
  org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
  org.apache.hadoop.mapred.TestJobCounters
  org.apache.hadoop.mapreduce.TestChild
  org.apache.hadoop.mapred.TestMiniMRClientCluster
  org.apache.hadoop.ipc.TestSocketFactory
  org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
  org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
  org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
  org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
  org.apache.hadoop.mapred.TestClientRedirect
  org.apache.hadoop.mapred.TestLazyOutput
  org.apache.hadoop.mapred.TestJobCleanup
  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
  org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
  org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner
  org.apache.hadoop.conf.TestNoDefaultsJobConf
  org.apache.hadoop.mapreduce.v2.TestRMNMInfo
  org.apache.hadoop.mapred.TestClusterMapReduceTestCase
  org.apache.hadoop.mapreduce.v2.TestNonExistentJob
  org.apache.hadoop.mapred.TestJobSysDirWithDFS
  org.apache.hadoop.mapreduce.v2.TestUberAM
  org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser
  org.apache.hadoop.mapred.TestJobName
  org.apache.hadoop.mapreduce.security.TestJHSSecurity

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2223//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2223//console

This message is automatically generated.

 Locking and exception issues in JobHistory Server.
 --

 Key: MAPREDUCE-3972
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3972
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt, 
 MR-3972.txt


 The JobHistory server's locking is inconsistent and wrong in some cases.  
 This is not super critical because the issues would only show up if a job is 
 being cleaned up or moved from intermediate done to done, at the same time it 
 is being parsed into a CompletedJob.  However the locking is slowing down the 
 server in some cases, and is a ticking time bomb that needs to be addressed.
 As part of this too we need to be sure that the Cleaner and Intermediate to 
 Done migration threads 

[jira] [Commented] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE

2012-04-13 Thread Siddharth Seth (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253805#comment-13253805
 ] 

Siddharth Seth commented on MAPREDUCE-4144:
---

+1. lgtm. Thanks Jason.
The same NPE won't occur with off-switch requests, since * seems to be a 
mandatory request and is never removed from the request table.

 ResourceManager NPE while handling NODE_UPDATE
 --

 Key: MAPREDUCE-4144
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4144
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: MAPREDUCE-4144-testcase.patch, MAPREDUCE-4144.patch


 The RM on one of our clusters has exited twice in the past few days because 
 of an NPE while trying to handle a NODE_UPDATE:
 {noformat}
 2012-04-12 02:09:01,672 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type NODE_UPDATE to the scheduler
  [ResourceManager Event Processor]java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:261)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:223)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.allocate(SchedulerApp.java:246)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1229)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignNodeLocalContainers(LeafQueue.java:1078)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1048)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:859)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:756)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:573)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:622)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:78)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:302)
 at java.lang.Thread.run(Thread.java:619)
 {noformat}
 This is very similar to the failure reported in MAPREDUCE-3005.
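
The failure mode discussed here, a lookup in a request table returning null for an entry that has already been removed, can be sketched in a few lines of Java. This is an illustration only and not the committed patch: the real AppSchedulingInfo and LeafQueue code is not shown in this thread, and RequestTable with all of its methods is a hypothetical stand-in.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a scheduler's per-application request table. */
final class RequestTable {
    // Maps a location (host name, rack name, or "*" for any node) to the
    // number of containers still wanted there.
    private final Map<String, Integer> outstanding = new HashMap<>();

    /** Records the wanted count; an entry is dropped once nothing is wanted. */
    void request(String location, int containers) {
        if (containers <= 0) {
            outstanding.remove(location);
        } else {
            outstanding.put(location, containers);
        }
    }

    /** Unsafe: auto-unboxing a missing entry throws NullPointerException. */
    int pendingUnsafe(String location) {
        return outstanding.get(location);
    }

    /** Safe: a missing entry means nothing is pending at that location. */
    int pending(String location) {
        Integer count = outstanding.get(location);
        return count == null ? 0 : count;
    }

    /** Allocates one container if any are pending; returns false otherwise. */
    boolean allocate(String location) {
        int count = pending(location);
        if (count == 0) {
            return false;
        }
        request(location, count - 1);
        return true;
    }
}
```

In this sketch a second allocate call for the same host simply returns false after the entry has been removed, whereas pendingUnsafe would throw. An entry that is never removed, as the comment suggests is the case for *, cannot hit the null path at all.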





[jira] [Commented] (MAPREDUCE-3972) Locking and exception issues in JobHistory Server.

2012-04-13 Thread Robert Joseph Evans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253810#comment-13253810
 ] 

Robert Joseph Evans commented on MAPREDUCE-3972:


These test failures are not related to the patch.

 Locking and exception issues in JobHistory Server.
 --

 Key: MAPREDUCE-3972
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3972
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt, 
 MR-3972.txt


 The JobHistory server's locking is inconsistent and wrong in some cases.  
 This is not super critical because the issues would only show up if a job is 
 being cleaned up or moved from intermediate done to done, at the same time it 
 is being parsed into a CompletedJob.  However the locking is slowing down the 
 server in some cases, and is a ticking time bomb that needs to be addressed.
 As part of this, we also need to be sure that the Cleaner and Intermediate to 
 Done migration threads handle exceptions properly.  Currently it appears that 
 the exception is logged and the thread just shuts down.  This means that the 
 history server could stay up and running for weeks and never remove old 
 jobs.





[jira] [Updated] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE

2012-04-13 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4144:
--

   Resolution: Fixed
Fix Version/s: 0.23.3
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2 and branch-0.23.

 ResourceManager NPE while handling NODE_UPDATE
 --

 Key: MAPREDUCE-4144
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4144
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Fix For: 0.23.3

 Attachments: MAPREDUCE-4144-testcase.patch, MAPREDUCE-4144.patch


 The RM on one of our clusters has exited twice in the past few days because 
 of an NPE while trying to handle a NODE_UPDATE:
 {noformat}
 2012-04-12 02:09:01,672 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type NODE_UPDATE to the scheduler
  [ResourceManager Event Processor]java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:261)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:223)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.allocate(SchedulerApp.java:246)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1229)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignNodeLocalContainers(LeafQueue.java:1078)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1048)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:859)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:756)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:573)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:622)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:78)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:302)
 at java.lang.Thread.run(Thread.java:619)
 {noformat}
 This is very similar to the failure reported in MAPREDUCE-3005.





[jira] [Commented] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253820#comment-13253820
 ] 

Hudson commented on MAPREDUCE-4144:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2145 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2145/])
MAPREDUCE-4144. Fix a NPE in the ResourceManager when handling node 
updates. (Contributed by Jason Lowe) (Revision 1325991)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325991
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java







[jira] [Commented] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253823#comment-13253823
 ] 

Hudson commented on MAPREDUCE-4144:
---

Integrated in Hadoop-Common-trunk-Commit #2072 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2072/])
MAPREDUCE-4144. Fix a NPE in the ResourceManager when handling node 
updates. (Contributed by Jason Lowe) (Revision 1325991)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325991
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java







[jira] [Commented] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253835#comment-13253835
 ] 

Hudson commented on MAPREDUCE-4144:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2086 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2086/])
MAPREDUCE-4144. Fix a NPE in the ResourceManager when handling node 
updates. (Contributed by Jason Lowe) (Revision 1325991)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325991
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java







[jira] [Created] (MAPREDUCE-4154) streaming MR job succeeds even if the streaming command fails

2012-04-13 Thread Thejas M Nair (Created) (JIRA)
streaming MR job succeeds even if the streaming command fails
-

 Key: MAPREDUCE-4154
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4154
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.0.2
Reporter: Thejas M Nair


Hadoop 1.0.1 behaves as expected: the task fails for a streaming MR job if the 
streaming command fails. In Hadoop 1.0.2, however, the task succeeds.






[jira] [Commented] (MAPREDUCE-4154) streaming MR job succeeds even if the streaming command fails

2012-04-13 Thread Thejas M Nair (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253853#comment-13253853
 ] 

Thejas M Nair commented on MAPREDUCE-4154:
--

Example of a streaming job that succeeds with 1.0.2 even though the mapper 
command fails:

sudo -u templeton hadoop fs -rmr /tmp/t.out
sudo -u templeton /usr/bin/hadoop jar \
  /usr/lib/hadoop/contrib/streaming/hadoop-streaming-1.0.2.jar \
  -input /tmp/nums.txt -output /tmp/t.out \
  -mapper '/bin/ls no_such-file-12e3' \
  -reducer /usr/bin/wc
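
The expected behavior, where a failing streaming command fails the task, comes down to checking the child process's exit status. The following Java sketch shows that check in isolation. It is not the Hadoop streaming implementation: StreamingCommand, checkExit and run are hypothetical names, and the error message text is made up for illustration.

```java
import java.io.IOException;

/** Hypothetical runner for illustration; not the Hadoop streaming code. */
final class StreamingCommand {
    private StreamingCommand() {}

    /** Throws if the child exited with a non-zero status. */
    static void checkExit(int exitCode) {
        if (exitCode != 0) {
            throw new IllegalStateException("command exited with status " + exitCode);
        }
    }

    /** Runs the command, waits for it to finish, and fails on non-zero exit. */
    static void run(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        checkExit(process.waitFor());
    }
}
```

Calling run with the mapper from the example above (/bin/ls on a file that does not exist) would raise the exception, because ls exits with a non-zero status when its argument is missing. A task that skips this check would report success regardless of what the command did, which matches the behavior reported for 1.0.2.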


 streaming MR job succeeds even if the streaming command fails
 -

 Key: MAPREDUCE-4154
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4154
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.0.2
Reporter: Thejas M Nair

 Hadoop 1.0.1 behaves as expected: the task fails for a streaming MR job if the 
 streaming command fails. In Hadoop 1.0.2, however, the task succeeds.





[jira] [Resolved] (MAPREDUCE-3378) Create a single 'hadoop-mapreduce' Maven artifact

2012-04-13 Thread Tom White (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White resolved MAPREDUCE-3378.
--

Resolution: Won't Fix

I've opened HADOOP-8278 to track item 1, and HADOOP-8009 addressed item 2, so 
I'm closing this JIRA now.



 Create a single 'hadoop-mapreduce' Maven artifact
 -

 Key: MAPREDUCE-3378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3378
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.0
Reporter: Tom White
 Attachments: MAPREDUCE-3378.patch


 In 0.23.0 there are multiple artifacts (hadoop-mapreduce-client-app, 
 hadoop-mapreduce-client-common, hadoop-mapreduce-client-core, etc). It would 
 be simpler for users to declare a dependency on hadoop-mapreduce (much like 
 there's hadoop-common and hadoop-hdfs). (This would also be a step towards 
 MAPREDUCE-2600.)





[jira] [Commented] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2

2012-04-13 Thread Ahmed Radwan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253963#comment-13253963
 ] 

Ahmed Radwan commented on MAPREDUCE-4150:
-

@Robert: Thanks for the suggestion! I agree that displaying such information 
on the web UI would be a very helpful next step. Let's plan on this after we 
are done with this ticket so we know exactly what to display.

@ATM: Good point, I filed HADOOP-8280 to take care of that. Thanks!

 Versioning and rolling upgrades for Yarn/MR2
 

 Key: MAPREDUCE-4150
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan

 It doesn't seem that Yarn components, for example the ResourceManager or 
 NodeManager, do build/package version checking before trying to communicate 
 with each other. 
 The objective of this ticket is to support the following requirements / use 
 cases:
 - New versions can be marked incompatible with old versions, and services 
 should be prevented from communicating with each other in such a case. This 
 will avoid non-deterministic behavior/problems resulting from incompatible 
 components trying to communicate with each other.
 - Permitting a policy for running different - but compatible - versions on 
 the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 
 for the corresponding HDFS implementation.
