[jira] [Resolved] (MAPREDUCE-4745) Application Master is hanging when the TaskImpl gets T_KILL event and completes attempts by the time
[ https://issues.apache.org/jira/browse/MAPREDUCE-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved MAPREDUCE-4745. Resolution: Duplicate I am fixing all the issues related to TA_KILL at MAPREDUCE-4751. Closing this as duplicate. Appreciate help validating the fix though. > Application Master is hanging when the TaskImpl gets T_KILL event and > completes attempts by the time > -- > > Key: MAPREDUCE-4745 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4745 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Devaraj K >Assignee: Devaraj K > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-4744) Application Master is running forever when the TaskAttempt gets TA_KILL event at the state SUCCESS_CONTAINER_CLEANUP
[ https://issues.apache.org/jira/browse/MAPREDUCE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved MAPREDUCE-4744. Resolution: Duplicate I am fixing all the issues related to TA_KILL at MAPREDUCE-4751. Closing this as duplicate. Appreciate help validating the fix though. > Application Master is running forever when the TaskAttempt gets TA_KILL event > at the state SUCCESS_CONTAINER_CLEANUP > > > Key: MAPREDUCE-4744 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4744 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.0.2-alpha >Reporter: Devaraj K >Assignee: Devaraj K > > When the Task issues a KILL event to a TaskAttempt, it expects to get an event > back from the TaskAttempt. If the TaskAttempt is in the > SUCCESS_CONTAINER_CLEANUP state, it ignores the KILL event and the Task keeps waiting.
[jira] [Commented] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED
[ https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484616#comment-13484616 ] Hadoop QA commented on MAPREDUCE-4748: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550896/MAPREDUCE-4748.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2967//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2967//console This message is automatically generated. > Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED > --- > > Key: MAPREDUCE-4748 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4748 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Robert Joseph Evans >Assignee: Jason Lowe > Attachments: MAPREDUCE-4748.patch > > > We saw this happen when running a large pig script. 
> {noformat} > 2012-10-23 22:45:24,986 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event > at current state for task_1350837501057_21978_m_040453 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > T_ATTEMPT_SUCCEEDED at SUCCEEDED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:604) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:89) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:914) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:908) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) > at java.lang.Thread.run(Thread.java:619) > {noformat} > Speculative execution was enabled, and that task did speculate so it looks > like this is an error in the state machine either between the task attempts > or just within that single task.
[jira] [Moved] (MAPREDUCE-4751) AM stuck in KILL_WAIT for days
[ https://issues.apache.org/jira/browse/MAPREDUCE-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli moved YARN-167 to MAPREDUCE-4751: - Affects Version/s: (was: 0.23.3) 0.23.3 2.0.2-alpha Key: MAPREDUCE-4751 (was: YARN-167) Project: Hadoop Map/Reduce (was: Hadoop YARN) > AM stuck in KILL_WAIT for days > -- > > Key: MAPREDUCE-4751 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4751 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.0.2-alpha, 0.23.3 >Reporter: Ravi Prakash >Assignee: Vinod Kumar Vavilapalli > Attachments: TaskAttemptStateGraph.jpg > > > We found some jobs were stuck in KILL_WAIT for days on end. The RM shows them > as RUNNING. When you go to the AM, it shows it in the KILL_WAIT state, and a > few maps running. All these maps were scheduled on nodes which are now in the > RM's Lost nodes list. The running maps are in the FAIL_CONTAINER_CLEANUP state
[jira] [Updated] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED
[ https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-4748: -- Target Version/s: 2.0.3-alpha, 0.23.5 Status: Patch Available (was: Open) > Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED > --- > > Key: MAPREDUCE-4748 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4748 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Robert Joseph Evans >Assignee: Jason Lowe > Attachments: MAPREDUCE-4748.patch > > > We saw this happen when running a large pig script. > {noformat} > 2012-10-23 22:45:24,986 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event > at current state for task_1350837501057_21978_m_040453 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > T_ATTEMPT_SUCCEEDED at SUCCEEDED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:604) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:89) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:914) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:908) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) > at java.lang.Thread.run(Thread.java:619) > {noformat} > Speculative execution was enabled, and that task did speculate so it looks > like this is an error in the state machine either between the 
task attempts > or just within that single task.
[jira] [Updated] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED
[ https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-4748: -- Attachment: MAPREDUCE-4748.patch Simple patch to ignore T_ATTEMPT_SUCCEEDED, T_KILL, and T_ATTEMPT_COMMIT_PENDING at SUCCEEDED and keep the job from abruptly ending in error. I'm a bit worried about the bookkeeping wrt. task.finishedAttempts and task.numberUncompletedAttempts. The current patch matches the bookkeeping behavior for T_ATTEMPT_KILLED or T_ATTEMPT_FAILED when we're effectively ignoring the event. However, I'm wondering if this could lead to corner cases during KILL_WAIT like those reported in MAPREDUCE-4745. It looks like TaskAttempt will report T_ATTEMPT_KILLED after it succeeded, but only for map tasks. We don't want to double-count in that case, but if a kill of the TaskAttempt doesn't report it was killed, it seems like we could miss some bookkeeping if we just ignore bookkeeping when we see an attempt redundantly succeeded. Thoughts? > Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED > --- > > Key: MAPREDUCE-4748 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4748 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Robert Joseph Evans >Assignee: Jason Lowe > Attachments: MAPREDUCE-4748.patch > > > We saw this happen when running a large pig script. 
> {noformat} > 2012-10-23 22:45:24,986 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event > at current state for task_1350837501057_21978_m_040453 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > T_ATTEMPT_SUCCEEDED at SUCCEEDED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:604) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:89) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:914) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:908) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) > at java.lang.Thread.run(Thread.java:619) > {noformat} > Speculative execution was enabled, and that task did speculate so it looks > like this is an error in the state machine either between the task attempts > or just within that single task.
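The double-counting concern in the comment above can be illustrated with a small sketch. This is not the TaskImpl code or the attached patch: the class `AttemptBookkeeping` and its methods are hypothetical, and only the counter names echo task.finishedAttempts and task.numberUncompletedAttempts from the comment. It shows one possible approach, keying the counts on the attempt ID so that a second terminal report for the same attempt changes nothing.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of Hadoop.
class AttemptBookkeeping {
    // Attempt IDs that have already reported a terminal event.
    private final Set<String> finishedAttempts = new HashSet<String>();
    // Attempts launched but not yet known to be finished.
    private int numberUncompletedAttempts;

    AttemptBookkeeping(int launchedAttempts) {
        this.numberUncompletedAttempts = launchedAttempts;
    }

    // Records a terminal event (succeeded, killed, or failed) for an attempt.
    // Returns true only the first time the attempt is seen, so a redundant
    // report (e.g. killed after succeeded) is not counted twice.
    boolean recordTerminalEvent(String attemptId) {
        if (!finishedAttempts.add(attemptId)) {
            return false; // already counted
        }
        numberUncompletedAttempts--;
        return true;
    }

    int finishedCount() {
        return finishedAttempts.size();
    }

    int uncompletedCount() {
        return numberUncompletedAttempts;
    }
}
```

With two speculative attempts, reporting success for both and then a late kill for one of them leaves the counts at two finished and zero uncompleted. Whether the real state machine can rely on every attempt eventually reporting a terminal event is exactly the open question raised in the comment.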
[jira] [Commented] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED
[ https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484566#comment-13484566 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-4748: Genuine bug. TaskImpl needs to accept T_ATTEMPT_SUCCEEDED at SUCCEEDED. Not sure how we missed something as basic as this. We should also accept-and-ignore T_KILL and T_ATTEMPT_COMMIT_PENDING. > Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED > --- > > Key: MAPREDUCE-4748 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4748 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Robert Joseph Evans >Assignee: Jason Lowe > > We saw this happen when running a large pig script. > {noformat} > 2012-10-23 22:45:24,986 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event > at current state for task_1350837501057_21978_m_040453 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > T_ATTEMPT_SUCCEEDED at SUCCEEDED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:604) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:89) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:914) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:908) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) > at java.lang.Thread.run(Thread.java:619) > 
{noformat} > Speculative execution was enabled, and that task did speculate so it looks > like this is an error in the state machine either between the task attempts > or just within that single task.
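The accept-and-ignore idea from the comment above can be sketched with a table-driven state machine. This is a simplified model under stated assumptions, not the org.apache.hadoop.yarn.state.StateMachineFactory API: `SketchTask`, its constructor flag, and the use of IllegalStateException are all hypothetical, and only the state and event names come from the thread.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Simplified, hypothetical model; only the names are taken from the thread.
enum TaskState { RUNNING, SUCCEEDED }

enum TaskEvent { T_ATTEMPT_SUCCEEDED, T_KILL, T_ATTEMPT_COMMIT_PENDING }

class SketchTask {
    private final Map<TaskState, Map<TaskEvent, TaskState>> table =
        new EnumMap<TaskState, Map<TaskEvent, TaskState>>(TaskState.class);
    private TaskState state = TaskState.RUNNING;

    // acceptLateEvents = false models the reported behavior;
    // acceptLateEvents = true models the proposed accept-and-ignore fix.
    SketchTask(boolean acceptLateEvents) {
        for (TaskState s : TaskState.values()) {
            table.put(s, new EnumMap<TaskEvent, TaskState>(TaskEvent.class));
        }
        table.get(TaskState.RUNNING).put(TaskEvent.T_ATTEMPT_SUCCEEDED, TaskState.SUCCEEDED);
        table.get(TaskState.RUNNING).put(TaskEvent.T_ATTEMPT_COMMIT_PENDING, TaskState.RUNNING);
        if (acceptLateEvents) {
            // Self-transitions: the event is legal at SUCCEEDED but changes nothing.
            for (TaskEvent e : EnumSet.of(TaskEvent.T_ATTEMPT_SUCCEEDED,
                                          TaskEvent.T_KILL,
                                          TaskEvent.T_ATTEMPT_COMMIT_PENDING)) {
                table.get(TaskState.SUCCEEDED).put(e, TaskState.SUCCEEDED);
            }
        }
    }

    // Applies an event; an event with no table entry is an invalid transition.
    TaskState handle(TaskEvent event) {
        TaskState next = table.get(state).get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + state);
        }
        state = next;
        return state;
    }
}
```

When two speculative attempts both succeed, the second T_ATTEMPT_SUCCEEDED arrives after the task is already SUCCEEDED. `new SketchTask(false)` throws on that second event, mirroring the stack trace in the report, while `new SketchTask(true)` stays in SUCCEEDED.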
[jira] [Commented] (MAPREDUCE-4630) API for setting dfs.block.size
[ https://issues.apache.org/jira/browse/MAPREDUCE-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1348#comment-1348 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-4630: I agree with others here. What is the use-case? Are dfs.block.size and mapreduce.input.fileinputformat.split.minsize not enough? Seems like you are looking for {{void setMinInputSplitSize(Job job, long size)}} ? > API for setting dfs.block.size > -- > > Key: MAPREDUCE-4630 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4630 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Environment: Hadoop 2 >Reporter: Radim Kolar >Priority: Minor > > Add API for setting block size in Tool while creating MR job. > I propose > FileOutputFormat.setBlockSize(Job job, int blocksize); > which sets dfs.block.size
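For readers wondering how the two settings named in the comment above interact on the input side, here is a standalone sketch. It reproduces the formula used by FileInputFormat.computeSplitSize in the new-API FileInputFormat, max(minSize, min(maxSize, blockSize)); the class name `SplitSizeSketch` and the sample sizes are illustrative assumptions. It does not address the block size of output files, which is what the issue itself asks for.

```java
// Standalone illustration; SplitSizeSketch is a hypothetical class name.
class SplitSizeSketch {
    // Same formula as FileInputFormat.computeSplitSize:
    // the block size, clamped to the range [minSize, maxSize].
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // With the minimum left at 1 byte, the split size follows the block size.
        System.out.println(computeSplitSize(64 * mb, 1L, Long.MAX_VALUE) / mb);
        // Raising the minimum above the block size makes splits span several blocks.
        System.out.println(computeSplitSize(64 * mb, 256 * mb, Long.MAX_VALUE) / mb);
    }
}
```

So raising mapreduce.input.fileinputformat.split.minsize (for example through setMinInputSplitSize) changes how much input each map task reads without touching the block size of the stored files.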
[jira] [Commented] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED
[ https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484442#comment-13484442 ] Jason Lowe commented on MAPREDUCE-4748: --- Here's a log from another case showing we have a race between two attempts from the same task that succeed almost simultaneously: {noformat} 2012-10-24 11:31:40,751 INFO [IPC Server handler 1 on 52922] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1350066773975_116662_m_032327_1 2012-10-24 11:31:40,751 INFO [IPC Server handler 1 on 52922] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1350066773975_116662_m_032327_1 is : 1.0 2012-10-24 11:31:40,751 INFO [IPC Server handler 21 on 52922] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1350066773975_116662_m_032327_1 2012-10-24 11:31:40,751 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1350066773975_116662_m_032327_1 TaskAttempt Transitioned from RUNNING to SUCCESS_CONTAINER_CLEANUP 2012-10-24 11:31:40,752 INFO [ContainerLauncher #55] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1350066773975_116662_01_051566 taskAttempt attempt_1350066773975_116662_m_032327_1 2012-10-24 11:31:40,752 INFO [ContainerLauncher #55] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1350066773975_116662_m_032327_1 2012-10-24 11:31:40,754 INFO [IPC Server handler 7 on 52922] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1350066773975_116662_r_03_0 2012-10-24 11:31:40,754 INFO [IPC Server handler 7 on 52922] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1350066773975_116662_r_03_0 is : 0.072 2012-10-24 11:31:40,755 INFO [IPC Server handler 25 on 52922] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1350066773975_116662_m_032327_0 2012-10-24 11:31:40,755 INFO [IPC Server handler 25 on 52922] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1350066773975_116662_m_032327_0 is : 1.0 2012-10-24 11:31:40,756 INFO [IPC Server handler 20 on 52922] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1350066773975_116662_m_032327_0 2012-10-24 11:31:40,756 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1350066773975_116662_m_032327_0 TaskAttempt Transitioned from RUNNING to SUCCESS_CONTAINER_CLEANUP 2012-10-24 11:31:40,756 INFO [ContainerLauncher #484] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1350066773975_116662_01_037193 taskAttempt attempt_1350066773975_116662_m_032327_0 2012-10-24 11:31:40,756 INFO [ContainerLauncher #484] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1350066773975_116662_m_032327_0 2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1350066773975_116662_m_032327_1 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED 2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1350066773975_116662_m_032327_1 2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Issuing kill to other attempt attempt_1350066773975_116662_m_032327_0 2012-10-24 11:31:40,757 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1350066773975_116662_m_032327 Task Transitioned from RUNNING to SUCCEEDED 2012-10-24 11:31:40,757 INFO [AsyncDispatcher event 
handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 51029 2012-10-24 11:31:40,780 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1350066773975_116662_m_032327_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED 2012-10-24 11:31:40,814 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event at current state for task_1350066773975_116662_m_032327 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine
[jira] [Assigned] (MAPREDUCE-4748) Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED
[ https://issues.apache.org/jira/browse/MAPREDUCE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned MAPREDUCE-4748: - Assignee: Jason Lowe > Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED > --- > > Key: MAPREDUCE-4748 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4748 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Robert Joseph Evans >Assignee: Jason Lowe > > We saw this happen when running a large pig script. > {noformat} > 2012-10-23 22:45:24,986 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event > at current state for task_1350837501057_21978_m_040453 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > T_ATTEMPT_SUCCEEDED at SUCCEEDED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:604) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:89) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:914) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:908) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) > at java.lang.Thread.run(Thread.java:619) > {noformat} > Speculative execution was enabled, and that task did speculate so it looks > like this is an error in the state machine either between the task attempts > or just within that single task. 
[jira] [Assigned] (MAPREDUCE-4747) Fancy graphs for visualizing task progress
[ https://issues.apache.org/jira/browse/MAPREDUCE-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash reassigned MAPREDUCE-4747: --- Assignee: Ravi Prakash > Fancy graphs for visualizing task progress > -- > > Key: MAPREDUCE-4747 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4747 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2 >Affects Versions: 0.23.4 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > > We should think about what kind of map / reduce graphs we want to see in MRv2 > to visualize all the task progress / completion information we have.
[jira] [Commented] (MAPREDUCE-4747) Fancy graphs for visualizing task progress
[ https://issues.apache.org/jira/browse/MAPREDUCE-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484162#comment-13484162 ] Robert Joseph Evans commented on MAPREDUCE-4747: Figure 1 of #2, Terabyte sort on Hadoop, looks like an improved version of what we had originally. I like it, and it looks like it would scale fairly well with the number of tasks, but it does not give you the same information that the swim lanes would. It gives more of a view of resource utilization; it is hard to see the long-tail tasks and what might have caused them. It would be great if we could support both, but I would want us to concentrate on the swim lanes first. We do need to be careful, though, and verify that this will work for jobs with 60,000+ tasks, even if we have to reduce the granularity of the images produced. > Fancy graphs for visualizing task progress > -- > > Key: MAPREDUCE-4747 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4747 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2 >Affects Versions: 0.23.4 >Reporter: Ravi Prakash > > We should think about what kind of map / reduce graphs we want to see in MRv2 > to visualize all the task progress / completion information we have.
[jira] [Commented] (MAPREDUCE-4746) The MR Application Master does not have a config to set environment variables
[ https://issues.apache.org/jira/browse/MAPREDUCE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484157#comment-13484157 ] Robert Joseph Evans commented on MAPREDUCE-4746: My biggest comment is that this needs to be added to mapred-default.xml so that there is documentation for it. > The MR Application Master does not have a config to set environment variables > - > > Key: MAPREDUCE-4746 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4746 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster >Affects Versions: 0.23.4 >Reporter: Robert Parker >Assignee: Robert Parker > Fix For: 3.0.0, 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4746.patch > > > There is no mechanism for defining environment variables (e.g. > LD_LIBRARY_PATH) for the MRAppMaster.
[jira] [Commented] (MAPREDUCE-4730) AM crashes due to OOM while serving up map task completion events
[ https://issues.apache.org/jira/browse/MAPREDUCE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484136#comment-13484136 ] Hudson commented on MAPREDUCE-4730: --- Integrated in Hadoop-Mapreduce-trunk #1236 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1236/]) MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe. (Revision 1401941) Result = FAILURE vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401941 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java > AM crashes due to OOM while serving up map task completion events > - > > Key: MAPREDUCE-4730 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4730 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 0.23.3 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Fix For: 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4730.patch, MAPREDUCE-4730.patch, > MAPREDUCE-4730.patch > > > We're seeing a repeatable OOM crash in the AM for a task with around 3 > maps and 3000 reducers. Details to follow.
[jira] [Commented] (MAPREDUCE-4741) WARN and ERROR messages logged during normal AM shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484133#comment-13484133 ] Hudson commented on MAPREDUCE-4741: --- Integrated in Hadoop-Mapreduce-trunk #1236 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1236/]) MAPREDUCE-4741. WARN and ERROR messages logged during normal AM shutdown. Contributed by Vinod Kumar Vavilapalli (Revision 1401738) Result = FAILURE jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401738 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/taskclean/TaskCleanerImpl.java > WARN and ERROR messages logged during normal AM shutdown > > > Key: MAPREDUCE-4741 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4741 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: Jason Lowe >Assignee: Vinod Kumar Vavilapalli >Priority: Minor > Fix For: 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4741-20121023.txt > > > The ApplicationMaster is logging WARN and ERROR messages during normal > shutdown, and some users are misinterpreting these as serious problems. 
For > example: > {noformat} > 2012-10-02 13:58:50,247 ERROR [ContainerLauncher Event Handler] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Returning, > interrupted : java.lang.InterruptedException > [...] > 2012-10-02 13:58:50,248 ERROR [Thread-47] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Returning, > interrupted : java.lang.InterruptedException > 2012-10-02 13:58:50,248 WARN [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Allocated thread > interrupted. Returning. > [...] > 2012-10-02 13:58:50,367 ERROR [TaskCleaner Event Handler] > org.apache.hadoop.mapreduce.v2.app.taskclean.TaskCleanerImpl: Returning, > interrupted : java.lang.InterruptedException > {noformat} > Warnings or errors should not be logged if everything is working as intended.
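The committed patch is not shown in this thread, so the following is only a sketch of one common way to keep an expected interrupt out of the ERROR log: record that shutdown was requested before interrupting the worker, and choose the log level from that flag. `QuietShutdownWorker`, its fields, and the string log levels are hypothetical stand-ins for the event-handling threads named in the log excerpt.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical worker, for illustration only; not the Hadoop classes above.
class QuietShutdownWorker implements Runnable {
    private final BlockingQueue<String> events = new LinkedBlockingQueue<String>();
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    // Stand-in for a logger call: the level the worker would have logged at.
    volatile String lastLogLevel = "NONE";

    @Override
    public void run() {
        while (!stopped.get()) {
            try {
                events.take(); // blocks until an event arrives or we are interrupted
            } catch (InterruptedException e) {
                // An interrupt during an orderly stop is expected; otherwise it is an error.
                lastLogLevel = stopped.get() ? "DEBUG" : "ERROR";
                return;
            }
        }
    }

    // Orderly shutdown: set the flag first, then interrupt the blocked thread.
    void stop(Thread worker) {
        stopped.set(true);
        worker.interrupt();
    }

    // Runs one worker and ends it either by an orderly stop or a bare interrupt.
    static String runOnce(boolean orderlyShutdown) {
        QuietShutdownWorker w = new QuietShutdownWorker();
        Thread t = new Thread(w, "sketch-worker");
        t.start();
        if (orderlyShutdown) {
            w.stop(t);
        } else {
            t.interrupt();
        }
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return w.lastLogLevel;
    }
}
```

Setting the flag before calling interrupt matters: if the order were reversed, the worker could observe the interrupt before the flag and still log at ERROR during a normal shutdown.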
[jira] [Commented] (MAPREDUCE-4730) AM crashes due to OOM while serving up map task completion events
[ https://issues.apache.org/jira/browse/MAPREDUCE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484095#comment-13484095 ] Hudson commented on MAPREDUCE-4730: --- Integrated in Hadoop-Hdfs-trunk #1206 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1206/]) MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe. (Revision 1401941) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401941 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java > AM crashes due to OOM while serving up map task completion events > - > > Key: MAPREDUCE-4730 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4730 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 0.23.3 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Fix For: 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4730.patch, MAPREDUCE-4730.patch, > MAPREDUCE-4730.patch > > > We're seeing a repeatable OOM crash in the AM for a task with around 3 > maps and 3000 reducers. Details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4741) WARN and ERROR messages logged during normal AM shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484092#comment-13484092 ] Hudson commented on MAPREDUCE-4741: --- Integrated in Hadoop-Hdfs-trunk #1206 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1206/]) MAPREDUCE-4741. WARN and ERROR messages logged during normal AM shutdown. Contributed by Vinod Kumar Vavilapalli (Revision 1401738) Result = SUCCESS jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401738 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/taskclean/TaskCleanerImpl.java > WARN and ERROR messages logged during normal AM shutdown > > > Key: MAPREDUCE-4741 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4741 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: Jason Lowe >Assignee: Vinod Kumar Vavilapalli >Priority: Minor > Fix For: 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4741-20121023.txt > > > The ApplicationMaster is logging WARN and ERROR messages during normal > shutdown, and some users are misinterpreting these as serious problems. 
[jira] [Commented] (MAPREDUCE-4730) AM crashes due to OOM while serving up map task completion events
[ https://issues.apache.org/jira/browse/MAPREDUCE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484079#comment-13484079 ] Hudson commented on MAPREDUCE-4730: --- Integrated in Hadoop-Hdfs-0.23-Build #415 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/415/]) MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe. svn merge --ignore-ancestry -c 1401941 ../../trunk/ (Revision 1401943) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401943 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java > AM crashes due to OOM while serving up map task completion events > - > > Key: MAPREDUCE-4730 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4730 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 0.23.3 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Fix For: 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4730.patch, MAPREDUCE-4730.patch, > MAPREDUCE-4730.patch > > > We're seeing a repeatable OOM crash in the AM for a task with around 3 > maps and 3000 reducers. Details to follow. -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-4741) WARN and ERROR messages logged during normal AM shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484075#comment-13484075 ] Hudson commented on MAPREDUCE-4741: --- Integrated in Hadoop-Hdfs-0.23-Build #415 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/415/]) svn merge -c 1401738 FIXES: MAPREDUCE-4741. WARN and ERROR messages logged during normal AM shutdown. Contributed by Vinod Kumar Vavilapalli (Revision 1401743) Result = SUCCESS jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401743 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/taskclean/TaskCleanerImpl.java > WARN and ERROR messages logged during normal AM shutdown > > > Key: MAPREDUCE-4741 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4741 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: Jason Lowe >Assignee: Vinod Kumar Vavilapalli >Priority: Minor > Fix For: 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4741-20121023.txt > > > The ApplicationMaster is logging WARN and ERROR messages during normal > shutdown, and some users are misinterpreting these as serious 
problems.
[jira] [Created] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver
liang xie created MAPREDUCE-4750:

Summary: Enable NNBenchWithoutMR in MapredTestDriver
Key: MAPREDUCE-4750
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client, test
Affects Versions: 3.0.0
Reporter: liang xie
Attachments: MAPREDUCE-4750.txt

Right now, MapredTestDriver exposes only nnbench; there is no entry for NNBenchWithoutMR. It would be better to enable it explicitly, so that we can benchmark the NameNode with fewer influencing factors.
[jira] [Updated] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver
[ https://issues.apache.org/jira/browse/MAPREDUCE-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liang xie updated MAPREDUCE-4750:

Attachment: MAPREDUCE-4750.txt

Minor change; it should be safe :)

> Enable NNBenchWithoutMR in MapredTestDriver
>
> Key: MAPREDUCE-4750
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client, test
> Affects Versions: 3.0.0
> Reporter: liang xie
> Attachments: MAPREDUCE-4750.txt
>
> Right now, MapredTestDriver exposes only nnbench; there is no entry for NNBenchWithoutMR. It would be better to enable it explicitly, so that we can benchmark the NameNode with fewer influencing factors.
[jira] [Commented] (MAPREDUCE-4741) WARN and ERROR messages logged during normal AM shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484033#comment-13484033 ] Hudson commented on MAPREDUCE-4741: --- Integrated in Hadoop-Yarn-trunk #16 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/16/]) MAPREDUCE-4741. WARN and ERROR messages logged during normal AM shutdown. Contributed by Vinod Kumar Vavilapalli (Revision 1401738) Result = SUCCESS jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401738 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/taskclean/TaskCleanerImpl.java > WARN and ERROR messages logged during normal AM shutdown > > > Key: MAPREDUCE-4741 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4741 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: Jason Lowe >Assignee: Vinod Kumar Vavilapalli >Priority: Minor > Fix For: 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4741-20121023.txt > > > The ApplicationMaster is logging WARN and ERROR messages during normal > shutdown, and some users are misinterpreting these as serious problems. 
[jira] [Commented] (MAPREDUCE-4730) AM crashes due to OOM while serving up map task completion events
[ https://issues.apache.org/jira/browse/MAPREDUCE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484036#comment-13484036 ] Hudson commented on MAPREDUCE-4730: --- Integrated in Hadoop-Yarn-trunk #16 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/16/]) MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe. (Revision 1401941) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401941 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java > AM crashes due to OOM while serving up map task completion events > - > > Key: MAPREDUCE-4730 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4730 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 0.23.3 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Fix For: 2.0.3-alpha, 0.23.5 > > Attachments: MAPREDUCE-4730.patch, MAPREDUCE-4730.patch, > MAPREDUCE-4730.patch > > > We're seeing a repeatable OOM crash in the AM for a task with around 3 > maps and 3000 reducers. Details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4734) The history server should link back to NM logs if aggregation is incomplete / disabled
[ https://issues.apache.org/jira/browse/MAPREDUCE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated MAPREDUCE-4734: -- Attachment: MR4734_WIP.txt Partial patch to create the links between the history server and the node manager. I won't be looking at this for several days, so if someone wants to take over meanwhile, go ahead. > The history server should link back to NM logs if aggregation is incomplete / > disabled > -- > > Key: MAPREDUCE-4734 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4734 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobhistoryserver, mrv2 >Affects Versions: 0.23.4 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: MR4734_WIP.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4630) API for setting dfs.block.size
[ https://issues.apache.org/jira/browse/MAPREDUCE-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483967#comment-13483967 ]

alex gemini commented on MAPREDUCE-4630:

FileSystem.create(Path, overwrite, bufferSize, replication, blockSize, progress) appears in the Hadoop FAQ, section 3.8.

> API for setting dfs.block.size
>
> Key: MAPREDUCE-4630
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4630
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Environment: Hadoop 2
> Reporter: Radim Kolar
> Priority: Minor
>
> Add an API for setting the block size in a Tool while creating an MR job.
> I propose
> FileOutputFormat.setBlockSize(Job job, int blocksize);
> which sets dfs.block.size.