date:20121025

[jira] [Resolved] (MAPREDUCE-4745) Application Master is hanging when the TaskImpl gets T_KILL event and completes attempts by the time

2012-10-25 Thread Vinod Kumar Vavilapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-4745.


Resolution: Duplicate

I am fixing all the issues related to TA_KILL at MAPREDUCE-4751. Closing this 
as duplicate.

Appreciate help validating the fix though.

> Application Master is hanging when the TaskImpl gets T_KILL event and 
> completes attempts by the time  
> --
>
> Key: MAPREDUCE-4745
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4745
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Devaraj K
>Assignee: Devaraj K
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (MAPREDUCE-4744) Application Master is running forever when the TaskAttempt gets TA_KILL event at the state SUCCESS_CONTAINER_CLEANUP

2012-10-25 Thread Vinod Kumar Vavilapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-4744.


Resolution: Duplicate

I am fixing all the issues related to TA_KILL at MAPREDUCE-4751. Closing this 
as duplicate.

Appreciate help validating the fix though.

> Application Master is running forever when the TaskAttempt gets TA_KILL event 
> at the state SUCCESS_CONTAINER_CLEANUP
> 
>
> Key: MAPREDUCE-4744
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4744
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.2-alpha
>Reporter: Devaraj K
>Assignee: Devaraj K
>
> When the Task issues a KILL event to a TaskAttempt, it expects to get an event 
> back from the TaskAttempt. If the TaskAttempt is in the 
> SUCCESS_CONTAINER_CLEANUP state, it ignores the KILL event and the Task keeps 
> waiting.
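
The hang described in this report can be modeled with a minimal sketch in plain Java. The names here (KillWaitSketch, AttemptState, pendingReplies) are hypothetical and are not the actual TaskImpl or TaskAttemptImpl code. The sketch only assumes what the report states: the Task waits for an event back after issuing a kill, and an attempt in SUCCESS_CONTAINER_CLEANUP ignores the kill. Treating every other state as one that replies is a simplification.

```java
import java.util.EnumSet;

// Hypothetical model, not Hadoop code: shows why a Task can wait forever
// when an attempt silently drops a kill request.
class KillWaitSketch {
    enum AttemptState { RUNNING, SUCCESS_CONTAINER_CLEANUP, SUCCEEDED, KILLED }

    // States in which this model's attempt ignores a kill (taken from the report).
    static final EnumSet<AttemptState> IGNORES_KILL =
        EnumSet.of(AttemptState.SUCCESS_CONTAINER_CLEANUP);

    /** Returns true if the attempt answers the kill with an event back to the Task. */
    static boolean attemptRepliesToKill(AttemptState state) {
        return !IGNORES_KILL.contains(state);
    }

    /** Counts the replies the Task would wait for but never receive. */
    static int pendingReplies(AttemptState... attempts) {
        int pending = 0;
        for (AttemptState s : attempts) {
            if (!attemptRepliesToKill(s)) {
                pending++; // no event comes back, so the Task keeps waiting
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        int stuck = pendingReplies(AttemptState.RUNNING,
                                   AttemptState.SUCCESS_CONTAINER_CLEANUP);
        System.out.println("replies that never arrive: " + stuck);
    }
}
```

Any non-zero count means the Task never leaves its waiting state in this model, which is the symptom reported here.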


[jira] [Commented] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED

2012-10-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484616#comment-13484616
 ] 

Hadoop QA commented on MAPREDUCE-4748:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550896/MAPREDUCE-4748.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2967//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2967//console

This message is automatically generated.

> Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED
> ---
>
> Key: MAPREDUCE-4748
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4748
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Robert Joseph Evans
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4748.patch
>
>
> We saw this happen when running a large pig script.
> {noformat}
> 2012-10-23 22:45:24,986 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event 
> at current state for task_1350837501057_21978_m_040453
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> T_ATTEMPT_SUCCEEDED at SUCCEEDED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:604)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:89)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:914)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:908)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
> at java.lang.Thread.run(Thread.java:619)
> {noformat}
> Speculative execution was enabled, and that task did speculate so it looks 
> like this is an error in the state machine either between the task attempts 
> or just within that single task.
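
The failure mode in the trace can be illustrated with a minimal table-driven state machine in plain Java. This is a sketch under stated assumptions: InvalidTransitionSketch, its enums, and its doTransition method are hypothetical stand-ins and do not use the YARN StateMachineFactory API. The only behavior taken from the report is that an event with no registered transition for the current state is rejected.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for a table-driven state machine; not YARN code.
class InvalidTransitionSketch {
    enum State { RUNNING, SUCCEEDED }
    enum Event { T_ATTEMPT_SUCCEEDED }

    // Transition table: current state -> (event -> next state).
    static final Map<State, Map<Event, State>> TABLE = new EnumMap<>(State.class);
    static {
        Map<Event, State> fromRunning = new EnumMap<>(Event.class);
        fromRunning.put(Event.T_ATTEMPT_SUCCEEDED, State.SUCCEEDED);
        TABLE.put(State.RUNNING, fromRunning);
        // Deliberately no row for SUCCEEDED, mirroring the missing transition.
    }

    static State doTransition(State current, Event event) {
        Map<Event, State> row = TABLE.get(current);
        if (row == null || !row.containsKey(event)) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        return row.get(event);
    }

    public static void main(String[] args) {
        // The first attempt succeeds and moves the task to SUCCEEDED.
        State s = doTransition(State.RUNNING, Event.T_ATTEMPT_SUCCEEDED);
        try {
            // A speculative attempt of the same task also reports success.
            doTransition(s, Event.T_ATTEMPT_SUCCEEDED);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With speculation enabled, two attempts of one task can both finish, so the second success arrives when the task is already in SUCCEEDED and finds no matching row.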


[jira] [Moved] (MAPREDUCE-4751) AM stuck in KILL_WAIT for days

2012-10-25 Thread Vinod Kumar Vavilapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli moved YARN-167 to MAPREDUCE-4751:
-

Affects Version/s: (was: 0.23.3)
   0.23.3
   2.0.2-alpha
  Key: MAPREDUCE-4751  (was: YARN-167)
  Project: Hadoop Map/Reduce  (was: Hadoop YARN)

> AM stuck in KILL_WAIT for days
> --
>
> Key: MAPREDUCE-4751
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4751
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha, 0.23.3
>Reporter: Ravi Prakash
>Assignee: Vinod Kumar Vavilapalli
> Attachments: TaskAttemptStateGraph.jpg
>
>
> We found some jobs stuck in KILL_WAIT for days on end. The RM shows them as 
> RUNNING. The AM shows the job in the KILL_WAIT state with a few maps still 
> running. All of these maps were scheduled on nodes that are now in the RM's 
> Lost nodes list. The running maps are in the FAIL_CONTAINER_CLEANUP state.


[jira] [Updated] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED

2012-10-25 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4748:
--

Target Version/s: 2.0.3-alpha, 0.23.5
  Status: Patch Available  (was: Open)


[jira] [Updated] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED

2012-10-25 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4748:
--

Attachment: MAPREDUCE-4748.patch

Simple patch to ignore T_ATTEMPT_SUCCEEDED, T_KILL, and 
T_ATTEMPT_COMMIT_PENDING at SUCCEEDED and keep the job from abruptly ending in 
error.

I'm a bit worried about the bookkeeping with respect to task.finishedAttempts 
and task.numberUncompletedAttempts.  The current patch matches the bookkeeping 
behavior for T_ATTEMPT_KILLED or T_ATTEMPT_FAILED when we're effectively 
ignoring the event.  However, I'm wondering if this could lead to corner cases 
during KILL_WAIT like those reported in MAPREDUCE-4745.

It looks like TaskAttempt will report T_ATTEMPT_KILLED after it succeeded, but 
only for map tasks.  We don't want to double-count in that case.  However, if a 
killed TaskAttempt doesn't report that it was killed, it seems we could miss 
some bookkeeping if we skip it whenever we see an attempt redundantly succeed.  
Thoughts?
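
To make the double-counting concern concrete, here is a minimal sketch in plain Java of accepting and ignoring the three events at SUCCEEDED. Everything in it is hypothetical: the class, the string attempt IDs, and in particular the choice to track finished attempts as a set of IDs instead of a counter. That choice is one possible way to keep the bookkeeping idempotent, and it is an assumption of this sketch, not what the attached patch does.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a task that tolerates late events after success.
class IgnoreAtSucceededSketch {
    enum State { RUNNING, SUCCEEDED }
    enum Event { T_ATTEMPT_SUCCEEDED, T_KILL, T_ATTEMPT_COMMIT_PENDING }

    // Events accepted and ignored once the task has already succeeded.
    static final EnumSet<Event> IGNORED_AT_SUCCEEDED = EnumSet.of(
        Event.T_ATTEMPT_SUCCEEDED, Event.T_KILL, Event.T_ATTEMPT_COMMIT_PENDING);

    State state = State.RUNNING;
    // Assumption: a set of attempt IDs, so counting the same attempt twice is harmless.
    final Set<String> finishedAttempts = new HashSet<>();

    void handle(Event event, String attemptId) {
        if (state == State.SUCCEEDED && IGNORED_AT_SUCCEEDED.contains(event)) {
            if (event == Event.T_ATTEMPT_SUCCEEDED) {
                finishedAttempts.add(attemptId); // record it, but stay in SUCCEEDED
            }
            return;
        }
        if (state == State.RUNNING && event == Event.T_ATTEMPT_SUCCEEDED) {
            finishedAttempts.add(attemptId);
            state = State.SUCCEEDED;
        }
        // Other (state, event) pairs are out of scope for this sketch.
    }

    public static void main(String[] args) {
        IgnoreAtSucceededSketch task = new IgnoreAtSucceededSketch();
        task.handle(Event.T_ATTEMPT_SUCCEEDED, "attempt_1");
        task.handle(Event.T_ATTEMPT_SUCCEEDED, "attempt_0"); // redundant success
        task.handle(Event.T_ATTEMPT_SUCCEEDED, "attempt_0"); // duplicate delivery
        System.out.println(task.state + " finished=" + task.finishedAttempts.size());
    }
}
```

With a set, a later kill report for an attempt that was already recorded would not change the count, while an attempt that never reports anything is still missed. That second case is the open question raised above.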


[jira] [Commented] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED

2012-10-25 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484566#comment-13484566
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-4748:


Genuine bug. TaskImpl needs to accept T_ATTEMPT_SUCCEEDED at SUCCEEDED. Not 
sure how we missed something as basic as this. We should also accept-and-ignore 
T_KILL and T_ATTEMPT_COMMIT_PENDING.


[jira] [Commented] (MAPREDUCE-4630) API for setting dfs.block.size

2012-10-25 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1348#comment-1348
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-4630:


I agree with others here. What is the use case? Are dfs.block.size and 
mapreduce.input.fileinputformat.split.minsize not enough? It seems like you are 
looking for {{void setMinInputSplitSize(Job job, long size)}}?
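
For context on how the two settings named above interact: FileInputFormat derives the split size as max(minSize, min(maxSize, blockSize)). The sketch below reproduces only that arithmetic in plain Java, without a Hadoop dependency. The class name and the sample sizes are illustrative assumptions.

```java
// Illustrative arithmetic only; no Hadoop classes are used.
class SplitSizeSketch {
    /** Split size as max(minSize, min(maxSize, blockSize)). */
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb;       // sample block size, chosen for illustration
        long maxSize = Long.MAX_VALUE;  // no upper bound configured
        // With a minimum of 1 byte, the split follows the block size.
        System.out.println(computeSplitSize(blockSize, 1L, maxSize) / mb);
        // Raising the minimum above the block size yields larger splits.
        System.out.println(computeSplitSize(blockSize, 256 * mb, maxSize) / mb);
    }
}
```

The two lines printed are 64 and 256, in megabytes. The minimum split size therefore changes how input is divided among map tasks without touching the block size of any file.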

> API for setting dfs.block.size
> --
>
> Key: MAPREDUCE-4630
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4630
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
> Environment: Hadoop 2
>Reporter: Radim Kolar
>Priority: Minor
>
> Add API for setting block size in Tool while creating MR job.
> I propose
> FileOutputFormat.setBlockSize(Job job, int blocksize);
> which sets dfs.block.size


[jira] [Commented] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED

2012-10-25 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484442#comment-13484442
 ] 

Jason Lowe commented on MAPREDUCE-4748:
---

Here's a log from another case showing we have a race between two attempts from 
the same task that succeed almost simultaneously:

{noformat}
2012-10-24 11:31:40,751 INFO [IPC Server handler 1 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from 
attempt_1350066773975_116662_m_032327_1
2012-10-24 11:31:40,751 INFO [IPC Server handler 1 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1350066773975_116662_m_032327_1 is : 1.0
2012-10-24 11:31:40,751 INFO [IPC Server handler 21 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from 
attempt_1350066773975_116662_m_032327_1
2012-10-24 11:31:40,751 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1350066773975_116662_m_032327_1 TaskAttempt Transitioned from RUNNING 
to SUCCESS_CONTAINER_CLEANUP
2012-10-24 11:31:40,752 INFO [ContainerLauncher #55] 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing 
the event EventType: CONTAINER_REMOTE_CLEANUP for container 
container_1350066773975_116662_01_051566 taskAttempt 
attempt_1350066773975_116662_m_032327_1
2012-10-24 11:31:40,752 INFO [ContainerLauncher #55] 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING 
attempt_1350066773975_116662_m_032327_1
2012-10-24 11:31:40,754 INFO [IPC Server handler 7 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from 
attempt_1350066773975_116662_r_03_0
2012-10-24 11:31:40,754 INFO [IPC Server handler 7 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1350066773975_116662_r_03_0 is : 0.072
2012-10-24 11:31:40,755 INFO [IPC Server handler 25 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from 
attempt_1350066773975_116662_m_032327_0
2012-10-24 11:31:40,755 INFO [IPC Server handler 25 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1350066773975_116662_m_032327_0 is : 1.0
2012-10-24 11:31:40,756 INFO [IPC Server handler 20 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from 
attempt_1350066773975_116662_m_032327_0
2012-10-24 11:31:40,756 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1350066773975_116662_m_032327_0 TaskAttempt Transitioned from RUNNING 
to SUCCESS_CONTAINER_CLEANUP
2012-10-24 11:31:40,756 INFO [ContainerLauncher #484] 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing 
the event EventType: CONTAINER_REMOTE_CLEANUP for container 
container_1350066773975_116662_01_037193 taskAttempt 
attempt_1350066773975_116662_m_032327_0
2012-10-24 11:31:40,756 INFO [ContainerLauncher #484] 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING 
attempt_1350066773975_116662_m_032327_0
2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1350066773975_116662_m_032327_1 TaskAttempt Transitioned from 
SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with 
attempt attempt_1350066773975_116662_m_032327_1
2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Issuing kill to other 
attempt attempt_1350066773975_116662_m_032327_0
2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1350066773975_116662_m_032327 Task Transitioned from RUNNING to SUCCEEDED
2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 51029
2012-10-24 11:31:40,780 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1350066773975_116662_m_032327_0 TaskAttempt Transitioned from 
SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2012-10-24 11:31:40,814 ERROR [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event 
at current state for task_1350066773975_116662_m_032327
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
T_ATTEMPT_SUCCEEDED at SUCCEEDED
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine

[jira] [Assigned] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED

2012-10-25 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reassigned MAPREDUCE-4748:
-

Assignee: Jason Lowe


[jira] [Assigned] (MAPREDUCE-4747) Fancy graphs for visualizing task progress

2012-10-25 Thread Ravi Prakash (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash reassigned MAPREDUCE-4747:
---

Assignee: Ravi Prakash

> Fancy graphs for visualizing task progress
> --
>
> Key: MAPREDUCE-4747
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4747
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 0.23.4
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
>
> We should think about what kind of map / reduce graphs we want to see in MRv2 
> to visualize all the task progress / completion information we have.


[jira] [Commented] (MAPREDUCE-4747) Fancy graphs for visualizing task progress

2012-10-25 Thread Robert Joseph Evans (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484162#comment-13484162
 ] 

Robert Joseph Evans commented on MAPREDUCE-4747:


Figure 1 of #2, Terabyte sort on Hadoop, looks like an improved version of what 
we had originally. I like it, and it looks like it would scale fairly well with 
the number of tasks, but it does not give you the same information that the 
swim lanes would.  It gives more of a view of resource utilization, and it is 
hard to see the long-tail tasks and what might have caused them.  It would be 
great if we could support both, but I would want us to concentrate on the swim 
lanes first.  We do need to be careful, though, and verify that this will work 
for jobs with 60,000+ tasks, even if we have to reduce the granularity of the 
images produced.

> Fancy graphs for visualizing task progress
> --
>
> Key: MAPREDUCE-4747
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4747
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 0.23.4
>Reporter: Ravi Prakash
>
> We should think about what kind of map / reduce graphs we want to see in MRv2 
> to visualize all the task progress / completion information we have.


[jira] [Commented] (MAPREDUCE-4746) The MR Application Master does not have a config to set environment variables

2012-10-25 Thread Robert Joseph Evans (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484157#comment-13484157
 ] 

Robert Joseph Evans commented on MAPREDUCE-4746:


My biggest comment is that this needs to be added to mapred-default.xml so that 
there is documentation for it.

> The MR Application Master does not have a config to set environment variables
> -
>
> Key: MAPREDUCE-4746
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4746
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 0.23.4
>Reporter: Robert Parker
>Assignee: Robert Parker
> Fix For: 3.0.0, 2.0.3-alpha, 0.23.5
>
> Attachments: MAPREDUCE-4746.patch
>
>
> There is no mechanism for defining environment variables (e.g. 
> LD_LIBRARY_PATH) for the MRAppMaster.


[jira] [Commented] (MAPREDUCE-4730) AM crashes due to OOM while serving up map task completion events

2012-10-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484136#comment-13484136
 ] 

Hudson commented on MAPREDUCE-4730:
---

Integrated in Hadoop-Mapreduce-trunk #1236 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1236/])
MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion 
requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe. (Revision 
1401941)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401941
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java


> AM crashes due to OOM while serving up map task completion events
> -
>
> Key: MAPREDUCE-4730
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4730
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Blocker
> Fix For: 2.0.3-alpha, 0.23.5
>
> Attachments: MAPREDUCE-4730.patch, MAPREDUCE-4730.patch, 
> MAPREDUCE-4730.patch
>
>
> We're seeing a repeatable OOM crash in the AM for a task with around 3 
> maps and 3000 reducers.  Details to follow.


[jira] [Commented] (MAPREDUCE-4741) WARN and ERROR messages logged during normal AM shutdown

2012-10-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484133#comment-13484133
 ] 

Hudson commented on MAPREDUCE-4741:
---

Integrated in Hadoop-Mapreduce-trunk #1236 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1236/])
MAPREDUCE-4741. WARN and ERROR messages logged during normal AM shutdown. 
Contributed by Vinod Kumar Vavilapalli (Revision 1401738)

 Result = FAILURE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401738
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/taskclean/TaskCleanerImpl.java


> WARN and ERROR messages logged during normal AM shutdown
> 
>
> Key: MAPREDUCE-4741
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4741
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
> Fix For: 2.0.3-alpha, 0.23.5
>
> Attachments: MAPREDUCE-4741-20121023.txt
>
>
> The ApplicationMaster is logging WARN and ERROR messages during normal 
> shutdown, and some users are misinterpreting these as serious problems.  For 
> example:
> {noformat}
> 2012-10-02 13:58:50,247 ERROR [ContainerLauncher Event Handler] 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Returning, 
> interrupted : java.lang.InterruptedException
> [...]
> 2012-10-02 13:58:50,248 ERROR [Thread-47] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Returning, 
> interrupted : java.lang.InterruptedException
> 2012-10-02 13:58:50,248 WARN [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Allocated thread 
> interrupted. Returning.
> [...]
> 2012-10-02 13:58:50,367 ERROR [TaskCleaner Event Handler] 
> org.apache.hadoop.mapreduce.v2.app.taskclean.TaskCleanerImpl: Returning, 
> interrupted : java.lang.InterruptedException
> {noformat}
> Warnings or errors should not be logged if everything is working as intended.
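
One way to keep such messages out of the log during an orderly shutdown is to record that a stop was requested before interrupting the worker thread, and to log the interruption only when that flag is not set. The following is a minimal sketch under that assumption, not the committed patch: the class QuietShutdownSketch, its EventHandler, the errorLog list (a stand-in for LOG.error) and the run helper are all hypothetical names, and only JDK classes are used.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not the code committed for MAPREDUCE-4741.
public class QuietShutdownSketch {

    /** A queue-draining worker, similar in shape to an AM event handler thread. */
    static class EventHandler implements Runnable {
        private final BlockingQueue<String> events = new LinkedBlockingQueue<>();
        private final AtomicBoolean stopped = new AtomicBoolean(false);
        // Stand-in for LOG.error(...): messages are collected so they can be inspected.
        final List<String> errorLog = new CopyOnWriteArrayList<>();
        private final Thread thread = new Thread(this, "Event Handler");

        void start() {
            thread.start();
        }

        /** Orderly shutdown: set the flag BEFORE interrupting the worker. */
        void stop() throws InterruptedException {
            stopped.set(true);
            thread.interrupt();
            thread.join();
        }

        /** Simulates an interrupt that was not part of a requested shutdown. */
        void interruptUnexpectedly() throws InterruptedException {
            thread.interrupt();
            thread.join();
        }

        @Override
        public void run() {
            while (!stopped.get()) {
                try {
                    // Blocks until an event arrives or the thread is interrupted.
                    events.take();
                } catch (InterruptedException e) {
                    // Log only when nobody asked this thread to stop.
                    if (!stopped.get()) {
                        errorLog.add("Returning, interrupted : " + e);
                    }
                    return;
                }
            }
        }
    }

    /** Runs one handler and returns the error messages it recorded. */
    static List<String> run(boolean orderly) {
        EventHandler handler = new EventHandler();
        handler.start();
        try {
            if (orderly) {
                handler.stop();
            } else {
                handler.interruptUnexpectedly();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException("caller interrupted while waiting", e);
        }
        return handler.errorLog;
    }

    public static void main(String[] args) {
        System.out.println("errors after orderly stop: " + run(true).size());
        System.out.println("errors after unexpected interrupt: " + run(false).size());
    }
}
```

Setting the flag before calling interrupt() matters: the worker reads the flag only after it has observed the interrupt, so a requested stop is never mistaken for a failure.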


[jira] [Commented] (MAPREDUCE-4730) AM crashes due to OOM while serving up map task completion events

2012-10-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484095#comment-13484095
 ] 

Hudson commented on MAPREDUCE-4730:
---

Integrated in Hadoop-Hdfs-trunk #1206 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1206/])
MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion 
requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe. (Revision 
1401941)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401941
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java


> AM crashes due to OOM while serving up map task completion events
> -
>
> Key: MAPREDUCE-4730
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4730
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster, mrv2
> Affects Versions: 0.23.3
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Blocker
> Fix For: 2.0.3-alpha, 0.23.5
>
> Attachments: MAPREDUCE-4730.patch, MAPREDUCE-4730.patch, 
> MAPREDUCE-4730.patch
>
>
> We're seeing a repeatable OOM crash in the AM for a task with around 3 
> maps and 3000 reducers.  Details to follow.


[jira] [Commented] (MAPREDUCE-4741) WARN and ERROR messages logged during normal AM shutdown

2012-10-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484092#comment-13484092
 ] 

Hudson commented on MAPREDUCE-4741:
---

Integrated in Hadoop-Hdfs-trunk #1206 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1206/])
MAPREDUCE-4741. WARN and ERROR messages logged during normal AM shutdown. 
Contributed by Vinod Kumar Vavilapalli (Revision 1401738)

 Result = SUCCESS
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401738
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/taskclean/TaskCleanerImpl.java




[jira] [Commented] (MAPREDUCE-4730) AM crashes due to OOM while serving up map task completion events

2012-10-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484079#comment-13484079
 ] 

Hudson commented on MAPREDUCE-4730:
---

Integrated in Hadoop-Hdfs-0.23-Build #415 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/415/])
MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion 
requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe.
svn merge --ignore-ancestry -c 1401941 ../../trunk/ (Revision 1401943)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401943
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java




[jira] [Commented] (MAPREDUCE-4741) WARN and ERROR messages logged during normal AM shutdown

2012-10-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484075#comment-13484075
 ] 

Hudson commented on MAPREDUCE-4741:
---

Integrated in Hadoop-Hdfs-0.23-Build #415 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/415/])
svn merge -c 1401738 FIXES: MAPREDUCE-4741. WARN and ERROR messages logged 
during normal AM shutdown. Contributed by Vinod Kumar Vavilapalli (Revision 
1401743)

 Result = SUCCESS
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401743
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/taskclean/TaskCleanerImpl.java




[jira] [Created] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver

2012-10-25 Thread liang xie (JIRA)

liang xie created MAPREDUCE-4750:


Summary: Enable NNBenchWithoutMR in MapredTestDriver
Key: MAPREDUCE-4750
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client, test
Affects Versions: 3.0.0
Reporter: liang xie
Attachments: MAPREDUCE-4750.txt

Right now, only nnbench can be run from MapredTestDriver; there is no entry 
for NNBenchWithoutMR. It would be better to enable it explicitly, so that we 
can benchmark the namenode with fewer influencing factors.


[jira] [Updated] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver

2012-10-25 Thread liang xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated MAPREDUCE-4750:
-

Attachment: MAPREDUCE-4750.txt

Minor change; it should be safe :)

> Enable NNBenchWithoutMR in MapredTestDriver
> ---
>
> Key: MAPREDUCE-4750
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client, test
> Affects Versions: 3.0.0
> Reporter: liang xie
> Attachments: MAPREDUCE-4750.txt
>
>
> Right now, only nnbench can be run from MapredTestDriver; there is no entry 
> for NNBenchWithoutMR. It would be better to enable it explicitly, so that we 
> can benchmark the namenode with fewer influencing factors.


[jira] [Commented] (MAPREDUCE-4741) WARN and ERROR messages logged during normal AM shutdown

2012-10-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484033#comment-13484033
 ] 

Hudson commented on MAPREDUCE-4741:
---

Integrated in Hadoop-Yarn-trunk #16 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/16/])
MAPREDUCE-4741. WARN and ERROR messages logged during normal AM shutdown. 
Contributed by Vinod Kumar Vavilapalli (Revision 1401738)

 Result = SUCCESS
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401738
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/taskclean/TaskCleanerImpl.java




[jira] [Commented] (MAPREDUCE-4730) AM crashes due to OOM while serving up map task completion events

2012-10-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484036#comment-13484036
 ] 

Hudson commented on MAPREDUCE-4730:
---

Integrated in Hadoop-Yarn-trunk #16 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/16/])
MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion 
requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe. (Revision 
1401941)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401941
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java




[jira] [Updated] (MAPREDUCE-4734) The history server should link back to NM logs if aggregation is incomplete / disabled

2012-10-25 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4734:
--

Attachment: MR4734_WIP.txt

Partial patch to create the links between the history server and the node 
manager. I won't be looking at this for several days, so if someone wants to 
take over meanwhile, go ahead.

> The history server should link back to NM logs if aggregation is incomplete / 
> disabled
> --
>
> Key: MAPREDUCE-4734
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4734
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobhistoryserver, mrv2
> Affects Versions: 0.23.4
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: MR4734_WIP.txt
>
>



[jira] [Commented] (MAPREDUCE-4630) API for setting dfs.block.size

2012-10-25 Thread alex gemini (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483967#comment-13483967
 ] 

alex gemini commented on MAPREDUCE-4630:


FileSystem.create(Path, overwrite, bufferSize, replication, blockSize, progress)
appears in the Hadoop FAQ, section 3.8.

> API for setting dfs.block.size
> --
>
> Key: MAPREDUCE-4630
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4630
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Environment: Hadoop 2
> Reporter: Radim Kolar
> Priority: Minor
>
> Add API for setting block size in Tool while creating MR job.
> I propose
> FileOutputFormat.setBlockSize(Job job, int blocksize);
> which sets dfs.block.size
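
The proposed helper amounts to writing one configuration key on behalf of the caller. The sketch below illustrates that idea only; it is not Hadoop code. BlockSizeSketch and both of its methods are hypothetical, java.util.Properties stands in for the job configuration, and the size is taken as a long (an assumption that differs from the int in the proposal) to match the blockSize parameter of FileSystem.create.

```java
import java.util.Properties;

// Hypothetical illustration of the proposed helper; Properties stands in
// for the job configuration and no Hadoop classes are used.
public class BlockSizeSketch {

    /** Configuration key named in the issue. */
    static final String BLOCK_SIZE_KEY = "dfs.block.size";

    /** Stores the block size, in bytes, under the key named in the issue. */
    static void setBlockSize(Properties jobConf, long blockSizeBytes) {
        if (blockSizeBytes <= 0) {
            throw new IllegalArgumentException("block size must be positive: " + blockSizeBytes);
        }
        jobConf.setProperty(BLOCK_SIZE_KEY, Long.toString(blockSizeBytes));
    }

    /** Reads the block size back, falling back to a caller-supplied default. */
    static long getBlockSize(Properties jobConf, long defaultBytes) {
        String value = jobConf.getProperty(BLOCK_SIZE_KEY);
        return value == null ? defaultBytes : Long.parseLong(value);
    }

    public static void main(String[] args) {
        Properties jobConf = new Properties();
        setBlockSize(jobConf, 256L * 1024 * 1024); // 256 MiB
        System.out.println(BLOCK_SIZE_KEY + " = " + getBlockSize(jobConf, 0L));
    }
}
```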

