date:20120910


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451831#comment-13451831
 ] 

Hadoop QA commented on MAPREDUCE-4517:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/1252/MAPREDUCE-4517.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.  Please justify why no new tests are needed for this patch.  Also 
please list what manual steps were performed to verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2833//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2833//console

This message is automatically generated.

 Too many INFO messages written out during AM to RM heartbeat
 

 Key: MAPREDUCE-4517
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4517
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster
Reporter: James Kinley
Priority: Minor
  Labels: patch
 Fix For: trunk

 Attachments: MAPREDUCE-4517.patch


 Too many INFO log messages written out during AM to RM heartbeat. Based on 
 default frequency of 1000ms (scheduler.heartbeat.interval-ms) either 2 or 4 
 INFO messages are written out per second:
 LOG.info("Before Scheduling: " + getStat());
 List<Container> allocatedContainers = getResources();
 LOG.info("After Scheduling: " + getStat());
 if (allocatedContainers.size() > 0) {
   LOG.info("Before Assign: " + getStat());
   scheduledRequests.assign(allocatedContainers);
   LOG.info("After Assign: " + getStat());
 }
 These should probably be changed to DEBUG messages to keep the log from 
 growing too quickly.
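
A minimal sketch of the change proposed above, assuming a level guard around each message. It uses java.util.logging (FINE level) so that it runs without Hadoop on the classpath; with a commons-logging Log, the equivalent calls would be LOG.isDebugEnabled() and LOG.debug(...). The class HeartbeatLoggingSketch and its counters are hypothetical stand-ins, not the real allocator.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the AM's container allocator; not Hadoop code.
class HeartbeatLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(HeartbeatLoggingSketch.class.getName());

    // Made-up counters that play the role of the real getStat() output.
    private int pending = 3;
    private int assigned = 0;

    String getStat() {
        return "pending=" + pending + " assigned=" + assigned;
    }

    // One heartbeat. The four messages are emitted at FINE (debug) level and
    // the guard skips building the strings when that level is disabled.
    void heartbeat(int offered) {
        boolean debug = LOG.isLoggable(Level.FINE);
        if (debug) {
            LOG.fine("Before Scheduling: " + getStat());
        }
        int allocated = Math.min(offered, pending);
        if (debug) {
            LOG.fine("After Scheduling: " + getStat());
        }
        if (allocated > 0) {
            if (debug) {
                LOG.fine("Before Assign: " + getStat());
            }
            pending -= allocated;
            assigned += allocated;
            if (debug) {
                LOG.fine("After Assign: " + getStat());
            }
        }
    }

    public static void main(String[] args) {
        HeartbeatLoggingSketch am = new HeartbeatLoggingSketch();
        am.heartbeat(2); // the FINE messages are suppressed at the default level
        System.out.println(am.getStat());
    }
}
```

With the guard in place, a one-second heartbeat adds nothing to the log unless debug logging is switched on for that logger.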

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4642) MiniMRClientClusterFactory should not use job.setJar()


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451948#comment-13451948
 ] 

Arun C Murthy commented on MAPREDUCE-4642:
--

Sorry to come in late. 

I'm confused - setJar is a standard user-api - this seems like a bug in Oozie 
if it can't deal with it?

In the worst case - we can fix MR1 to use setJar too, but this fix seems off?

 MiniMRClientClusterFactory should not use job.setJar()
 --

 Key: MAPREDUCE-4642
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4642
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.0.2-alpha

 Attachments: MAPREDUCE-4642.patch, MAPREDUCE-4642.patch


 Currently, MiniMRClientClusterFactory does {{job.setJar(callerJar)}} so that 
 the {{callerJar}} is added to the cache in MR2.  However, this makes the 
 resulting configuration inconsistent between MR1 and MR2 as in MR1 the job 
 jar is not set and in MR2 it's set to the {{callerJar}}.  This difference can 
 also cause some tests to fail in Oozie.  We should instead use the 
 {{job.addCacheFile()}} method.  


[jira] [Commented] (MAPREDUCE-199) Locality hints for Reduce


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451952#comment-13451952
 ] 

Arun C Murthy commented on MAPREDUCE-199:
-

I'm not sure I see the value in user-specifying 'hints', the way to get this to 
work is to 'figure' where the map-outputs are (the AM knows it) and then try to 
pick the right hosts/racks. 


 Locality hints for Reduce
 -

 Key: MAPREDUCE-199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-199
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: applicationmaster, mrv2
Reporter: Benjamin Reed
Assignee: Harsh J
 Attachments: MAPREDUCE-199.patch, MAPREDUCE-199.patch


 It would be nice if we could add a method to OutputFormat that would allow a 
 job to indicate where a reducer for a given partition should run. This 
 is similar to the getSplits() method on InputFormat. In our application the 
 reducer is using other data in addition to the map outputs during processing 
 and data accesses could be made more efficient if the JobTracker scheduled 
 the reducers to run on specific hosts.


[jira] [Updated] (MAPREDUCE-4645) Providing a random seed to Slive should make the sequence of filenames completely deterministic

2012-09-10 Thread Ravi Prakash (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-4645:


Status: Patch Available  (was: Open)

 Providing a random seed to Slive should make the sequence of filenames 
 completely deterministic
 ---

 Key: MAPREDUCE-4645
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4645
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: performance, test
Affects Versions: 2.0.0-alpha, 0.23.1
Reporter: Ravi Prakash
Assignee: Ravi Prakash
  Labels: performance, test
 Attachments: MAPREDUCE-4645.branch-0.23.patch


 Using the -random seed option still doesn't produce a deterministic sequence 
 of filenames. Hence there's no way to replicate the performance test. If I'm 
 providing a seed, it's obvious that I want the test to be reproducible.
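
To illustrate the property being asked for, here is a small sketch, assuming every file name is drawn from a single java.util.Random created once from the user's seed. The class SeededFileNames and the "sl_file_" prefix are made up for the example; this is not Slive's code and says nothing about where Slive actually loses determinism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper; the name pattern is invented for illustration.
class SeededFileNames {
    // Draws `count` file names from one Random that is seeded exactly once.
    static List<String> generate(long seed, int count) {
        Random rng = new Random(seed);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add("sl_file_" + rng.nextInt(1000));
        }
        return names;
    }

    public static void main(String[] args) {
        // Two runs with the same seed yield the same sequence of names.
        System.out.println(generate(42L, 5).equals(generate(42L, 5)));
    }
}
```

The sequence stays reproducible only as long as no name depends on an unseeded source such as a second Random created without a seed or the current time.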


[jira] [Created] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.

Robert Joseph Evans created MAPREDUCE-4647:
--

 Summary: We should only unjar jobjar if there is a lib directory 
in it.
 Key: MAPREDUCE-4647
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.3
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans


For backwards compatibility we recently made it so that we would unjar the 
job.jar and add anything to the classpath in the lib directory of that jar.  
But this also slows job startup down a lot if the jar is large.  We should only 
unjar it if actually doing so would add something new to the classpath.
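
A rough sketch of the check described above, assuming the client can open the jar before submission and that only entries under lib/ matter. JobJarInspector and its method names are hypothetical; the real decision (cache file versus cache archive) would live in the job submission code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper; not part of Hadoop.
class JobJarInspector {
    // True if at least one non-directory entry lives under lib/.
    static boolean hasLibEntries(Iterable<String> entryNames) {
        for (String name : entryNames) {
            if (name.startsWith("lib/") && !name.endsWith("/")) {
                return true;
            }
        }
        return false;
    }

    // Reads only the entry names of the jar; nothing is unpacked.
    static boolean jarHasLibEntries(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements();) {
                names.add(e.nextElement().getName());
            }
        }
        return hasLibEntries(names);
    }
}
```

Listing entry names only touches the jar's directory, so the check should stay cheap even for a large jar.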


[jira] [Commented] (MAPREDUCE-4645) Providing a random seed to Slive should make the sequence of filenames completely deterministic

[
https://issues.apache.org/jira/browse/MAPREDUCE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452003#comment-13452003
]

Hadoop QA commented on MAPREDUCE-4645:
--

+1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12544273/MAPREDUCE-4645.branch-0.23.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2834//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2834//console


[jira] [Resolved] (MAPREDUCE-4564) Shell timeout mechanism does not work for processes spawned using winutils


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy resolved MAPREDUCE-4564.
--

   Resolution: Fixed
Fix Version/s: 1-win

I just committed this. Thanks Bikas (and Chuan for the review).

 Shell timeout mechanism does not work for processes spawned using winutils
 --

 Key: MAPREDUCE-4564
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4564
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Fix For: 1-win

 Attachments: MAPREDUCE-4564.branch-1-win.1.patch, 
 MAPREDUCE-4564.branch-1-win.2.patch


 Upon timeout, Shell calls Java process.destroy() to terminate the spawned 
 process. This would destroy the winutils process but not the real process 
 spawned by winutils.


[jira] [Commented] (MAPREDUCE-4645) Providing a random seed to Slive should make the sequence of filenames completely deterministic

2012-09-10 Thread Ravi Prakash (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452050#comment-13452050
 ] 

Ravi Prakash commented on MAPREDUCE-4645:
-

The same patch applies to branch-2 and trunk.


[jira] [Commented] (MAPREDUCE-199) Locality hints for Reduce

2012-09-10 Thread Harsh J (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452070#comment-13452070
]

Harsh J commented on MAPREDUCE-199:
---

bq. I'm not sure I see the value in user-specifying 'hints', the way to get
this to work is to 'figure' where the map-outputs are (the AM knows it) and
then try to pick the right hosts/racks.

This is good too, as an auto-optimization of regular MR apps. However, in
HBase-land we would benefit by direct control if the reducer can be scheduled
directly onto the RegionServer that hosts the sorted area of keys the reducer
is going to process, or even fetch.

Seems like we can go for two things:

# Auto-optimize by default, so that all users benefit somehow.
# Provide a way to override the automation via API supplied partition-host
mappings, to allow those who want to control for other odd purposes.

Arun - Would this be good?


[jira] [Commented] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.

2012-09-10 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452090#comment-13452090
 ] 

Todd Lipcon commented on MAPREDUCE-4647:


Is this a regression of MAPREDUCE-967, more or less?


[jira] [Commented] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.

[
https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452105#comment-13452105
]

Robert Joseph Evans commented on MAPREDUCE-4647:

I think that it is a regression. It looks like Yarn/MRV2 changed the handling
of job.jar to be through the distributed cache. Originally we only added the
jar to the classpath as a cache file. MAPREDUCE-4068 then changed it to be a
cache archive so that the classes and lib directories could be added to the
classpath. So both of those together effectively undid MAPREDUCE-967. I am
not sure how simple it will be to try and fully implement MAPREDUCE-967 again,
because the job.jar is going through the distributed cache and this really is
behavior that seems to be MAPREDUCE specific, unless we want to somehow try
and make it generic for everyone on YARN.

That is why I wanted to have the client look at the jar and decide if it
should be a cache file, or a cache archive. From what I have seen very few
jars actually rely on this behavior. If you would prefer to make it generic
for YARN I can look into that instead. It should not be too hard, but it might
require a small change to the container launch context.


[jira] [Commented] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452111#comment-13452111
 ] 

Robert Joseph Evans commented on MAPREDUCE-4647:


Actually I do think I will do it in a generic way through YARN.  I have taken a 
closer look at MAPREDUCE-967, and I see that mapreduce.job.jar.unpack.pattern 
is a user visible config that is now unused, so that is a regression.  I am not 
really sure how big of a change this is going to require though, but hopefully 
it will not be too large.


[jira] [Commented] (MAPREDUCE-4642) MiniMRClientClusterFactory should not use job.setJar()

2012-09-10 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452115#comment-13452115
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4642:
---

Arun, MR1 is just fine. The problem was in the MiniMRClientClusterFactory.java 
(part of the adapter in YARN to mimic MR1 API) which was using the setJar() to 
set up minicluster classes in the classpath. Thus if the job running against 
the minicluster was trying to use setJar(), it would not work. With this fix 
we avoid using setJar(), thus allowing testcases submitting jobs using setJar() to 
work as expected. Hope this clarifies.




[jira] [Updated] (MAPREDUCE-4607) Race condition in ReduceTask completion can result in Task being incorrectly failed

2012-09-10 Thread Bikas Saha (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4607:
--

Status: Patch Available  (was: Open)

 Race condition in ReduceTask completion can result in Task being incorrectly 
 failed
 ---

 Key: MAPREDUCE-4607
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4607
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: MAPREDUCE-4607.1.patch, MAPREDUCE-4607.2.patch, 
 MAPREDUCE-4607.3.patch, MAPREDUCE-4607.4.patch, MAPREDUCE-4607.patch


 Problem reported by chackaravarthy in MAPREDUCE-4252.
 This problem has already been handled for the case where a speculative 
 attempt is launched for a map task and the other attempt fails (not killed).
 Can a similar scenario happen for a reduce task?
 Consider the following scenario for a reduce task with speculation (one 
 attempt gets killed):
 1. A task attempt is started.
 2. A speculative task attempt for the same task is started.
 3. The first task attempt completes and causes the task to transition to 
 SUCCEEDED.
 4. The speculative task attempt is then killed because the first attempt 
 completed.
 As a result, an internal error is raised for this attempt 
 (TaskImpl.MapRetroactiveKilledTransition), and the task attempt failure 
 leads to job failure.
 TaskImpl.MapRetroactiveKilledTransition:
 if (!TaskType.MAP.equals(task.getType())) {
   LOG.error("Unexpected event for REDUCE task " + event.getType());
   task.internalError(event.getType());
 }
 So, do we need the following code in MapRetroactiveKilledTransition as well, 
 just as in MapRetroactiveFailureTransition?
 if (event instanceof TaskTAttemptEvent) {
   TaskTAttemptEvent castEvent = (TaskTAttemptEvent) event;
   if (task.getState() == TaskState.SUCCEEDED &&
       !castEvent.getTaskAttemptID().equals(task.successfulAttempt)) {
     // don't allow a different task attempt to override a previous
     // succeeded state
     return TaskState.SUCCEEDED;
   }
 }
 Please check whether this is a valid case and give your suggestion.
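
The guard discussed above can be sketched in isolation as follows. This is a heavily simplified, hypothetical model (RetroactiveKillSketch, its Task class and the string attempt IDs are all made up); it only shows the rule that a kill event for a different attempt must not override a SUCCEEDED task, and it does not reproduce the TaskImpl state machine.

```java
// Hypothetical, simplified model; not the Hadoop TaskImpl state machine.
class RetroactiveKillSketch {
    enum TaskState { RUNNING, SUCCEEDED, KILLED }

    static final class Task {
        TaskState state = TaskState.RUNNING;
        String successfulAttempt;

        // The first attempt to finish wins and is remembered.
        void attemptSucceeded(String attemptId) {
            state = TaskState.SUCCEEDED;
            successfulAttempt = attemptId;
        }

        // Handles a kill event for one attempt and returns the resulting state.
        TaskState attemptKilled(String attemptId) {
            if (state == TaskState.SUCCEEDED
                    && !attemptId.equals(successfulAttempt)) {
                // A different attempt must not override the succeeded state.
                return state;
            }
            // Simplification: any other kill marks the whole task as KILLED.
            state = TaskState.KILLED;
            return state;
        }
    }

    public static void main(String[] args) {
        Task task = new Task();
        task.attemptSucceeded("attempt_0");
        // The speculative attempt is killed after the first one succeeded.
        System.out.println(task.attemptKilled("attempt_1"));
    }
}
```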


[jira] [Updated] (MAPREDUCE-4607) Race condition in ReduceTask completion can result in Task being incorrectly failed

2012-09-10 Thread Bikas Saha (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4607:
--

Attachment: MAPREDUCE-4607.4.patch

Attaching patch with mock task changed.


[jira] [Commented] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452241#comment-13452241
 ] 

Robert Joseph Evans commented on MAPREDUCE-4647:


I wanted to add in too that it looks like MAPREDUCE-3018 removed part of 
MAPREDUCE-967 from streaming too, because YARN did not support this.


[jira] [Commented] (MAPREDUCE-4607) Race condition in ReduceTask completion can result in Task being incorrectly failed

[
https://issues.apache.org/jira/browse/MAPREDUCE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452256#comment-13452256
]

Hadoop QA commented on MAPREDUCE-4607:
--

+1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12544492/MAPREDUCE-4607.4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2835//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2835//console


[jira] [Commented] (MAPREDUCE-199) Locality hints for Reduce

[
https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452276#comment-13452276
]

Arun C Murthy commented on MAPREDUCE-199:
-

Harsh - I'm not familiar with the HBase case; can you please add more colour?

bq. in HBase-land we would benefit by direct control if the reducer can be
scheduled directly onto the RegionServer that hosts the sorted area of keys the
reducer is going to process, or even fetch.

In this case, won't it be sufficient to schedule maps on the RS? If the data is
already sorted, why would you try to schedule reduces instead?

My concern with adding APIs/config is that it becomes part of the user interface,
and I'd like to think through its implications, and whether it's really necessary,
before we commit to it. Makes sense?


[jira] [Commented] (MAPREDUCE-199) Locality hints for Reduce

2012-09-10 Thread Karthik Kambatla (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452299#comment-13452299
]

Karthik Kambatla commented on MAPREDUCE-199:

This might not be the use case Harsh was thinking of, but here is a use case
from my summer internship a couple of years ago:

Our use case: We were building a topic-based pub/sub system. The published
events were in one HBase table, and the subscriptions were in another table.
While the published events were stored by their published time-stamp, the
subscriptions were stored by Topic ID: Subscription ID as the key. Matching
the published events to subscriptions required a join of the two tables on the
topic.

Approach: The map phase reads all the published events and emits (topic, event)
pairs. The reduce's input essentially is all events for a topic - the reduce
reads all the subscriptions of that topic and matches. Now, it would save a lot
of communication if the reduce (for topic A) were scheduled on the same node
that had the subscriptions for the same topic A. Hence, the need for reduce
data-locality.

We achieved this data locality through ugly hacks to the JT to store an HBase
region (key-range) to host mapping, and by overloading the partitioner to push
each key-value pair to the appropriate reducer. I don't remember the exact
speedup, but it was quite significant (roughly 2x, if I remember correctly).
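
To illustrate the partitioner part of the approach above, here is a minimal
sketch, not the code we actually used. It assumes the sorted region start keys
are already known to the job; RegionRangePartitioner and the topic keys are
hypothetical names, and it is plain Java rather than the Hadoop Partitioner
API. Reducer i would then ideally be scheduled on the host serving region i.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch: routes a key to the reducer that corresponds to the
 * HBase region whose key range contains it. The sorted region start keys are
 * an assumed input; this is neither Hadoop nor HBase API.
 */
final class RegionRangePartitioner {
    // Sorted region start keys; element 0 is the lowest start key.
    private final String[] regionStartKeys;

    RegionRangePartitioner(String[] sortedRegionStartKeys) {
        this.regionStartKeys = sortedRegionStartKeys.clone();
    }

    /** Returns the index of the region (and hence the reducer) that owns the key. */
    int getPartition(String key) {
        int pos = Arrays.binarySearch(regionStartKeys, key);
        if (pos >= 0) {
            return pos; // the key is exactly a region start key
        }
        int insertionPoint = -pos - 1; // index of the first start key greater than the key
        // The owning region is the one that starts just before the insertion point.
        return Math.max(0, insertionPoint - 1);
    }

    public static void main(String[] args) {
        RegionRangePartitioner partitioner =
            new RegionRangePartitioner(new String[] {"", "topicG", "topicP"});
        for (String topic : new String[] {"topicA", "topicG", "topicH", "topicZ"}) {
            System.out.println(topic + " -> reducer " + partitioner.getPartition(topic));
        }
    }
}
```

The binary search keeps the lookup logarithmic in the number of regions, which
matters because the partitioner runs once per emitted pair.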

Locality hints for Reduce
-

[jira] [Commented] (MAPREDUCE-4639) CombineFileInputFormat#getSplits should throw IOException when input paths contain a directory

2012-09-10 Thread Harsh J (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452311#comment-13452311
 ] 

Harsh J commented on MAPREDUCE-4639:


Jim,

Nice patch. I may sound greedy, but can we have recursive input directory 
support in CFIP too, at the same time? There's code in FIP you can reuse.

 CombineFileInputFormat#getSplits should throw IOException when input paths 
 contain a directory
 --

 Key: MAPREDUCE-4639
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4639
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Reporter: Jim Donofrio
Priority: Minor
 Attachments: MAPREDUCE-4639.patch


 FileInputFormat#getSplits throws an IOException when the input paths contain 
 a directory. CombineFileInputFormat should do the same; otherwise the job will 
 not fail until the record reader is initialized, when FileSystem#open will say 
 that the directory does not exist.
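
The fail-fast behaviour described above could look roughly like the sketch
below. It is an illustration only: InputPathCheck and rejectDirectories are
hypothetical names, the exception message is an assumption, and the check uses
java.nio.file instead of Hadoop's FileSystem so that it stays self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch of rejecting directories while splits are computed,
 * rather than waiting for the record reader to fail on open.
 */
final class InputPathCheck {
    /** Throws IOException as soon as one of the input paths is a directory. */
    static void rejectDirectories(List<Path> inputPaths) throws IOException {
        for (Path path : inputPaths) {
            if (Files.isDirectory(path)) {
                // Assumed message text; the real input formats may word this differently.
                throw new IOException("Not a file: " + path);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("input");
        Path file = Files.createTempFile(dir, "part", ".txt");
        rejectDirectories(Arrays.asList(file)); // a plain file passes the check
        try {
            rejectDirectories(Arrays.asList(file, dir));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```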

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4646:
--

Assignee: Jason Lowe
Target Version/s: 2.0.2-alpha, 0.23.4
  Status: Patch Available  (was: Open)

 client does not receive job diagnostics for failed jobs
 ---

 Key: MAPREDUCE-4646
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.1-alpha, 0.23.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4646.patch


 When a job fails the client is not showing any diagnostics.  For example, 
 running a fail job results in this not-so-helpful message from the client:
 {noformat}
 2012-09-07 21:12:00,649 INFO  [main] mapreduce.Job 
 (Job.java:monitorAndPrintJob(1308)) - Job job_1347052207658_0001 failed with 
 state FAILED due to:
 {noformat}
 ...and nothing else to go with it indicating what went wrong.  The job 
 diagnostics are apparently not making it back to the client.


[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4646:
--

Attachment: MAPREDUCE-4646.patch

Patch to add diagnostics to the job report.  Manually tested with a fail job, 
and the client now sees something useful, e.g.:

{noformat}
2012-09-10 18:55:52,506 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1308)) - Job job_1347303315737_0001 failed with 
state FAILED due to: Task failed task_1347303315737_0001_m_01
{noformat}

 client does not receive job diagnostics for failed jobs
 ---

 Key: MAPREDUCE-4646
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0, 2.0.1-alpha
Reporter: Jason Lowe
 Attachments: MAPREDUCE-4646.patch



[jira] [Commented] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs

[
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452339#comment-13452339
]

Hadoop QA commented on MAPREDUCE-4646:
--

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12544508/MAPREDUCE-4646.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 1 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2836//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2836//console

This message is automatically generated.

client does not receive job diagnostics for failed jobs
---

Key: MAPREDUCE-4646
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
Attachments: MAPREDUCE-4646.patch


[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-4646:
---

Status: Open  (was: Patch Available)

Good catch, Jason.

Can you fix {{MRBuilderUtils.newJobReport()}} itself to also take in the 
diagnostics, instead of doing an explicit setDiagnostics?

Also, you can send a {{JobDiagnosticsUpdateEvent}} itself instead of calling 
addDiagnostics() in the test-case.
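
A minimal sketch of the first suggestion, under loud assumptions:
SimpleJobReport below is a hypothetical stand-in, not the real Hadoop JobReport
record, and its newJobReport signature is not the actual MRBuilderUtils one. It
only shows the shape of a factory that accepts diagnostics together with the
other fields, so no caller can forget a separate setter call.

```java
/** Hypothetical stand-in for a job report record; not the real Hadoop JobReport. */
final class SimpleJobReport {
    private final String jobId;
    private final String state;
    private final String diagnostics;

    private SimpleJobReport(String jobId, String state, String diagnostics) {
        this.jobId = jobId;
        this.state = state;
        this.diagnostics = diagnostics;
    }

    /** Factory that takes the diagnostics up front instead of relying on a later setter. */
    static SimpleJobReport newJobReport(String jobId, String state, String diagnostics) {
        // Normalize null to an empty string so readers never see "null" in a message.
        return new SimpleJobReport(jobId, state, diagnostics == null ? "" : diagnostics);
    }

    String getJobId() { return jobId; }

    String getState() { return state; }

    String getDiagnostics() { return diagnostics; }

    public static void main(String[] args) {
        SimpleJobReport report =
            newJobReport("job_0001", "FAILED", "Task failed (example text)");
        System.out.println("Job " + report.getJobId() + " failed with state "
            + report.getState() + " due to: " + report.getDiagnostics());
    }
}
```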

 client does not receive job diagnostics for failed jobs
 ---

 Key: MAPREDUCE-4646
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.1-alpha, 0.23.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4646.patch



[jira] [Updated] (MAPREDUCE-4637) Killing an unassigned task attempt causes the job to fail

2012-09-10 Thread Mayank Bansal (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4637:
-

Attachment: MAPREDUCE-4637-trunk-v3.patch

Based on Tom's suggestion, I've extended the patch to cover the new state as well.

Arun,

Did you get a chance to take a look?

Thanks,
Mayank

 Killing an unassigned task attempt causes the job to fail
 -

 Key: MAPREDUCE-4637
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4637
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Tom White
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4637-trunk.patch, 
 MAPREDUCE-4637-trunk-v2.patch, MAPREDUCE-4637-trunk-v3.patch, MapReduce.png


 Attempting to kill a task attempt that has been scheduled but is not running 
 causes an invalid state transition and the AM to stop with an error. 
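
As a generic illustration of this kind of failure (not the actual AM state
machine), the sketch below uses a small transition table. The state and event
names are hypothetical. An event with no arc registered for the current state
is reported as an invalid transition, so handling a kill of a not-yet-running
attempt amounts to registering an arc for it.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical, simplified attempt state machine; names are illustrative only. */
final class AttemptStateMachine {
    enum State { UNASSIGNED, RUNNING, KILLED }

    enum Event { ASSIGNED, KILL }

    private static final Map<State, Map<Event, State>> TRANSITIONS =
        new EnumMap<State, Map<Event, State>>(State.class);

    static {
        Map<Event, State> fromUnassigned = new EnumMap<Event, State>(Event.class);
        fromUnassigned.put(Event.ASSIGNED, State.RUNNING);
        // Without this arc, a KILL arriving before assignment would be invalid.
        fromUnassigned.put(Event.KILL, State.KILLED);
        TRANSITIONS.put(State.UNASSIGNED, fromUnassigned);

        Map<Event, State> fromRunning = new EnumMap<Event, State>(Event.class);
        fromRunning.put(Event.KILL, State.KILLED);
        TRANSITIONS.put(State.RUNNING, fromRunning);
    }

    /** Returns the next state, or throws if no arc is registered for the event. */
    static State transition(State current, Event event) {
        Map<Event, State> arcs = TRANSITIONS.get(current);
        State next = (arcs == null) ? null : arcs.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event " + event + " at " + current);
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(transition(State.UNASSIGNED, Event.KILL));
    }
}
```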


[jira] [Commented] (MAPREDUCE-4637) Killing an unassigned task attempt causes the job to fail

[
https://issues.apache.org/jira/browse/MAPREDUCE-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452431#comment-13452431
]

Hadoop QA commented on MAPREDUCE-4637:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12544522/MAPREDUCE-4637-trunk-v3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 1 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app:

org.apache.hadoop.mapreduce.v2.app.TestMRClientService

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2837//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2837//console

This message is automatically generated.

Killing an unassigned task attempt causes the job to fail
-

Key: MAPREDUCE-4637
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4637
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Tom White
Assignee: Mayank Bansal
Attachments: MAPREDUCE-4637-trunk.patch,
MAPREDUCE-4637-trunk-v2.patch, MAPREDUCE-4637-trunk-v3.patch, MapReduce.png


[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4646:
--

Attachment: MAPREDUCE-4646.patch

Thanks for the review, Vinod.  I've updated the patch accordingly.

 client does not receive job diagnostics for failed jobs
 ---

 Key: MAPREDUCE-4646
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch



[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4646:
--

Status: Patch Available  (was: Open)

 client does not receive job diagnostics for failed jobs
 ---

 Key: MAPREDUCE-4646
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.1-alpha, 0.23.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch



[jira] [Commented] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs

[
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452459#comment-13452459
]

Hadoop QA commented on MAPREDUCE-4646:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12544528/MAPREDUCE-4646.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified test
files.

-1 javac. The patch appears to cause the build to fail.

Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2838//console

This message is automatically generated.

client does not receive job diagnostics for failed jobs
---

[jira] [Created] (MAPREDUCE-4648) Diagnostics from AM are missing from job history

Jason Lowe created MAPREDUCE-4648:
-

 Summary: Diagnostics from AM are missing from job history
 Key: MAPREDUCE-4648
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4648
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 0.23.0
Reporter: Jason Lowe


When a job fails during setup or commit, any diagnostics from the MapReduce 
ApplicationMaster are not available in the job history.  Currently the 
diagnostics for the job are collected from the diagnostics of tasks run for the 
job, but the AM has no corresponding task record in the job history.


[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4646:
--

Status: Open  (was: Patch Available)

 client does not receive job diagnostics for failed jobs
 ---

 Key: MAPREDUCE-4646
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.1-alpha, 0.23.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch



[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4646:
--

Attachment: MAPREDUCE-4646.patch

Accidentally missed a change in the last patch.

 client does not receive job diagnostics for failed jobs
 ---

 Key: MAPREDUCE-4646
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch, 
 MAPREDUCE-4646.patch



[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4646:
--

Status: Patch Available  (was: Open)

 client does not receive job diagnostics for failed jobs
 ---

 Key: MAPREDUCE-4646
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.1-alpha, 0.23.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch, 
 MAPREDUCE-4646.patch



[jira] [Commented] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs

[
https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452550#comment-13452550
]

Hadoop QA commented on MAPREDUCE-4646:
--

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12544537/MAPREDUCE-4646.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2839//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2839//console

This message is automatically generated.

client does not receive job diagnostics for failed jobs
---

[jira] [Updated] (MAPREDUCE-4637) Killing an unassigned task attempt causes the job to fail

2012-09-10 Thread Mayank Bansal (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4637:
-

Attachment: MAPREDUCE-4637-trunk-v4.patch

Fixing the test.

Thanks,
Mayank

 Killing an unassigned task attempt causes the job to fail
 -

 Key: MAPREDUCE-4637
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4637
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Tom White
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4637-trunk.patch, 
 MAPREDUCE-4637-trunk-v2.patch, MAPREDUCE-4637-trunk-v3.patch, 
 MAPREDUCE-4637-trunk-v4.patch, MapReduce.png



[jira] [Updated] (MAPREDUCE-4315) jobhistory.jsp throws 500 when a .txt file is found in /done

2012-09-10 Thread Sandy Ryza (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-4315:
--

Attachment: MAPREDUCE-4315.patch

 jobhistory.jsp throws 500 when a .txt file is found in /done
 

 Key: MAPREDUCE-4315
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4315
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.20.2
Reporter: Alexander Alten-Lorenz
 Attachments: MAPREDUCE-4315.patch


 If a .txt file is located in /done, the parser throws a 500 error.
 Trace:
 java.lang.ArrayIndexOutOfBoundsException: 1
     at org.apache.hadoop.mapred.jobhistory_jsp$2.compare(jobhistory_jsp.java:295)
     at org.apache.hadoop.mapred.jobhistory_jsp$2.compare(jobhistory_jsp.java:279)
     at java.util.Arrays.mergeSort(Arrays.java:1270)
     at java.util.Arrays.mergeSort(Arrays.java:1282)
     at java.util.Arrays.mergeSort(Arrays.java:1282)
     at java.util.Arrays.mergeSort(Arrays.java:1282)
     at java.util.Arrays.mergeSort(Arrays.java:1281)
     at java.util.Arrays.mergeSort(Arrays.java:1281)
     at java.util.Arrays.mergeSort(Arrays.java:1281)
     at java.util.Arrays.mergeSort(Arrays.java:1281)
     at java.util.Arrays.mergeSort(Arrays.java:1281)
     at java.util.Arrays.mergeSort(Arrays.java:1282)
     at java.util.Arrays.mergeSort(Arrays.java:1281)
     at java.util.Arrays.sort(Arrays.java:1210)
     at org.apache.hadoop.mapred.jobhistory_jsp._jspService(jobhistory_jsp.java:279)
     at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
     at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:864)
     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
     at org.mortbay.jetty.Server.handle(Server.java:326)
     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
     at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
 Reproduce:
 cd ../done
 touch test.txt
 reload jobhistory


[jira] [Updated] (MAPREDUCE-4315) jobhistory.jsp throws 500 when a .txt file is found in /done

2012-09-10 Thread Sandy Ryza (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-4315:
--

Status: Patch Available  (was: Open)

 jobhistory.jsp throws 500 when a .txt file is found in /done
 

 Key: MAPREDUCE-4315
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4315
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.20.2
Reporter: Alexander Alten-Lorenz
 Attachments: MAPREDUCE-4315.patch



[jira] [Commented] (MAPREDUCE-4315) jobhistory.jsp throws 500 when a .txt file is found in /done

2012-09-10 Thread Sandy Ryza (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452600#comment-13452600
 ] 

Sandy Ryza commented on MAPREDUCE-4315:
---

The problem occurs not because the file ends with .txt, but because it 
doesn't have a prefix like job_201209101405_0001_1347311220402.
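
One defensive way to handle such names is sketched below. This is not the
attached patch, and it is not the jobhistory.jsp code: HistoryFileNames, the
regular expression, and the sample file names are assumptions based only on
the example prefix above. Names without the job prefix are dropped before
sorting, so the comparator never indexes into a name it cannot parse.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that ignores files lacking the job prefix. */
final class HistoryFileNames {
    // Assumed name shape: job_<cluster start time>_<job sequence>_<anything else>
    private static final Pattern JOB_PREFIX = Pattern.compile("^job_(\\d+)_(\\d+)_.*$");

    /** Orders well-formed names by cluster start time, then by job sequence. */
    static final Comparator<String> BY_JOB_ID = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            long[] left = parse(a);
            long[] right = parse(b);
            int byCluster = Long.compare(left[0], right[0]);
            return byCluster != 0 ? byCluster : Long.compare(left[1], right[1]);
        }
    };

    /** Returns {clusterStartTime, jobSequence}; rejects names without the prefix. */
    private static long[] parse(String name) {
        Matcher m = JOB_PREFIX.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a job history file name: " + name);
        }
        return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))};
    }

    /** Drops names without the job prefix, then sorts the remaining ones. */
    static List<String> sortedJobFiles(List<String> names) {
        List<String> kept = new ArrayList<String>();
        for (String name : names) {
            if (JOB_PREFIX.matcher(name).matches()) {
                kept.add(name);
            }
        }
        kept.sort(BY_JOB_ID);
        return kept;
    }

    public static void main(String[] args) {
        // The user and job-name suffixes here are made up for the example.
        List<String> names = Arrays.asList(
            "job_201209101405_0002_1347311220402_user_sort",
            "test.txt",
            "job_201209101405_0001_1347311220402_user_wordcount");
        System.out.println(sortedJobFiles(names));
    }
}
```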


 jobhistory.jsp throws 500 when a .txt file is found in /done
 

 Key: MAPREDUCE-4315
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4315
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.20.2
Reporter: Alexander Alten-Lorenz
 Attachments: MAPREDUCE-4315.patch




[jira] [Commented] (MAPREDUCE-4637) Killing an unassigned task attempt causes the job to fail

[
https://issues.apache.org/jira/browse/MAPREDUCE-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452605#comment-13452605
]

Hadoop QA commented on MAPREDUCE-4637:
--

+1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12544559/MAPREDUCE-4637-trunk-v4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2840//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2840//console

This message is automatically generated.

Killing an unassigned task attempt causes the job to fail
-

Attempting to kill a task attempt that has been scheduled but is not running
causes an invalid state transition and the AM to stop with an error.

[jira] [Created] (MAPREDUCE-4649) mr-jobhistory-daemon.sh needs to be updated post YARN-1

Vinod Kumar Vavilapalli created MAPREDUCE-4649:
--

 Summary: mr-jobhistory-daemon.sh needs to be updated post YARN-1
 Key: MAPREDUCE-4649
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4649
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli


Even today, the JHS assumes that YARN_HOME is the same as HADOOP_MAPRED_HOME,
among other such assumptions. We need to fix this.


[jira] [Updated] (MAPREDUCE-4649) mr-jobhistory-daemon.sh needs to be updated post YARN-1


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-4649:
---

Affects Version/s: 2.0.2-alpha
   0.23.3

 mr-jobhistory-daemon.sh needs to be updated post YARN-1
 ---

 Key: MAPREDUCE-4649
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4649
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.3, 2.0.2-alpha
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

 Even today, the JHS assumes that YARN_HOME is the same as HADOOP_MAPRED_HOME,
 among other such assumptions. We need to fix this.


[jira] [Commented] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

2012-09-10 Thread Tsuyoshi OZAWA (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452640#comment-13452640
]

Tsuyoshi OZAWA commented on MAPREDUCE-4502:
---

Karthik, thanks for your comment.

bq. Is the local aggregation done asynchronously as the mappers process
respective input?

Partially, yes. Local aggregation can start once at least two map tasks have
finished.

bq. Will there be one LocalAggregator and one ShuffleHandler per each reducer?
Or, is it a single LocalAggregator/ShuffleHandler daemon with relevant
thread(pool)s per container?

The latter.

Ideally we minimize code changes while maximizing performance. In the current
MR implementation, a ShuffleHandler is launched per container; keeping it that
way avoids extra code changes.
BTW, a multi-threaded LocalAggregator would be very effective for performance,
but its implementation would be more complex than a single-threaded one. As a
first step, it's reasonable to implement a single-threaded version.

bq. The current design doc seems to be aimed at aggregation per container. The
bigger goal being aggregation and node/rack levels, does the same design
extend/apply to the final goal?

I thought node-level aggregation and container-level aggregation in MR 2.0 were
exactly the same.

To make this design generic enough to support rack-level aggregation, we need a
special task, similar to a Reducer, that fetches outputs and reduces them but
writes its output to local disk instead of HDFS. With such a task, the design
can be used for rack-level aggregation by extending the new APIs between mappers
and reducers to launch special tasks and delegate the aggregation to them.

Please ask me if you have any questions.
- Tsuyoshi

Multi-level aggregation with combining the result of maps per node/rack
---

Key: MAPREDUCE-4502
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster, mrv2
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Attachments: speculative_draft.pdf

Shuffle is expensive in Hadoop despite the existence of combiners, because the
scope of combining is limited to a single MapTask. A good way to solve this
problem is to aggregate the results of maps per node/rack by launching a
combiner.
This JIRA is to implement the multi-level aggregation infrastructure,
including per-container combining (MAPREDUCE-3902 is related) and coordination
of containers by the application master without breaking the fault tolerance
of jobs.
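
As a rough illustration of why widening the combining scope reduces shuffle volume, the sketch below merges the already-combined outputs of several map tasks on one node. It is not the design from the attached document; the class and method names are hypothetical, and a word-count-style summing combiner is assumed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NodeLevelCombineSketch {
    /**
     * Merges the per-task combined outputs from one node, summing the values
     * for each key (assumes a sum-style combiner, as in word count).
     */
    static Map<String, Integer> combineOnNode(List<Map<String, Integer>> perTaskOutputs) {
        Map<String, Integer> merged = new HashMap<String, Integer>();
        for (Map<String, Integer> taskOutput : perTaskOutputs) {
            for (Map.Entry<String, Integer> e : taskOutput.entrySet()) {
                Integer current = merged.get(e.getKey());
                merged.put(e.getKey(),
                           current == null ? e.getValue() : current + e.getValue());
            }
        }
        return merged;
    }
}
```

For example, if one task on a node emits (a, 2) and another emits (a, 3) and (b, 1), per-task combining ships three records to the reducers, while merging on the node ships two: (a, 5) and (b, 1).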

[jira] [Updated] (MAPREDUCE-4525) Combiner per node

2012-09-10 Thread Tsuyoshi OZAWA (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4525:
--

Summary: Combiner per node  (was: Combiner per container)

 Combiner per node
 -

 Key: MAPREDUCE-4525
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4525
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster, mrv2
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA

 This JIRA is to implement per-container combining (MAPREDUCE-3902 is 
 related), with the application master coordinating containers without 
 breaking the fault tolerance of jobs.


[jira] [Commented] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

2012-09-10 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452642#comment-13452642
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-4502:
---

Ah, I'm sorry for confusing you, Karthik. This document describes node-level 
aggregation, not container-level aggregation. I'll fix the document. 
Thanks.


[jira] [Commented] (MAPREDUCE-4315) jobhistory.jsp throws 500 when a .txt file is found in /done


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452648#comment-13452648
 ] 

Hadoop QA commented on MAPREDUCE-4315:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12544563/MAPREDUCE-4315.patch
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2841//console

This message is automatically generated.


[jira] [Updated] (MAPREDUCE-199) Locality hints for Reduce

2012-09-10 Thread eric baldeschwieler (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

eric baldeschwieler updated MAPREDUCE-199:
--

I can see the value of matching reduce outputs to region servers. This does
seem like a compelling use case.

That said, the MR interface is already very broad. Let's let any extensions to
the API bake for a while to make sure we are doing the right thing. It's a lot
easier to add things to the config or API than to take them out. Using the same
abstractions / API as the Map would be nice if doable.

Locality hints for Reduce
-

[jira] [Resolved] (MAPREDUCE-4338) NodeManager daemon is failing to start.


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-4338.


Resolution: Not A Problem

Please check connectivity from your NM machine to the RM. You can log in to the 
NM node and telnet to the RM host on port 8025. Closing this as 
not-a-problem. Please file any new NM issues at 
https://issues.apache.org/jira/browse/YARN. Thanks.
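
If telnet is not available on the NM node, the same connectivity check can be sketched in Java. This is only an illustration, not part of Hadoop: the class name and the 3-second timeout are hypothetical choices, and the default host mvm4 is simply the RM host name that appears in the quoted error for this issue.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class RmPortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, unknown host, or timeout: the port is not reachable.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close fails.
            }
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the error quoted in this issue; adjust for your cluster.
        String host = args.length > 0 ? args[0] : "mvm4";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8025;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

A result of false from the NM node points to the same condition as the "Connection refused" error: nothing is accepting connections on that host and port, or the network path is blocked.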

 NodeManager daemon is failing to start.
 ---

 Key: MAPREDUCE-4338
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4338
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0
 Environment: Ubuntu Server 11.04, 
Reporter: srikanth ayalasomayajulu
  Labels: features, hadoop
 Fix For: 0.23.0

   Original Estimate: 4h
  Remaining Estimate: 4h

 NodeManager daemons are not starting on the slave machines and give the 
 error shown below.
 2012-06-12 19:05:56,172 FATAL nodemanager.NodeManager (NodeManager.java:main(233)) - Error starting NodeManager
 org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.NodeManager
 at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
 at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:163)
 at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:231)
 Caused by: org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException
 at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:132)
 at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
 ... 2 more
 Caused by: java.lang.reflect.UndeclaredThrowableException
 at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
 at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:161)
 at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:128)
 ... 3 more
 Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From mvm5/192.168.100.177 to mvm4:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
 at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
 at $Proxy14.registerNodeManager(Unknown Source)
 at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
 ... 5 more
 Caused by: java.net.ConnectException: Call From mvm5/192.168.100.177 to mvm4:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:617)
 at org.apache.hadoop.ipc.Client.call(Client.java:1089)
 at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136)
 ... 7 more
 Caused by: java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:419)
 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:460)
 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:557)
 at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:205)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1195)
 at org.apache.hadoop.ipc.Client.call(Client.java:1065)
 ... 8 more
 2012-06-12 19:05:56,184 INFO  ipc.Server (Server.java:stop(1709)) - Stopping server on 47645
 2012-06-12 19:05:56,184 INFO  ipc.Server (Server.java:stop(1709)) - Stopping server on 4344
 2012-06-12 19:05:56,190 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(199)) - Stopping NodeManager metrics system...
 2012-06-12 19:05:56,190 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stopSources(408)) - Stopping metrics source JvmMetrics
 2012-06-12 19:05:56,191 INFO  nodemanager.NodeManager (StringUtils.java:run(605)) - SHUTDOWN_MSG:


[jira] [Updated] (MAPREDUCE-4338) NodeManager daemon is failing to start.