[jira] [Updated] (MAPREDUCE-6002) MR task should prevent report error to AM when process is shutting down

2014-07-26 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-6002:
---

   Resolution: Fixed
Fix Version/s: 2.5.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2 and branch-2.5. Thanks [~leftnoteasy] for the 
patch, and [~jlowe] for the feedback!

 MR task should prevent report error to AM when process is shutting down
 ---

 Key: MAPREDUCE-6002
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6002
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 2.5.0
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.5.0

 Attachments: MR-6002.patch


 With MAPREDUCE-5900, a preempted MR task should not be treated as failed. 
 However, an MR task can still fail and report the failure to the AM after 
 preemption takes effect but before the AM has received the completed 
 container from the RM. The task attempt is then marked failed instead of 
 preempted.
 For example, FileSystem has a shutdown hook that closes all FileSystem 
 instances. If a FileSystem is in use at that moment (for example, while 
 reading split details from HDFS), the MR task fails and reports the fatal 
 error to the MR AM, which logs an exception like this:
 {code}
 2014-07-22 01:46:19,613 FATAL [IPC Server handler 10 on 56903] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1405985051088_0018_m_25_0 - exited : java.io.IOException: Filesystem closed
   at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
   at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:776)
   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:837)
   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:645)
   at java.io.DataInputStream.readByte(DataInputStream.java:265)
   at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
   at org.apache.hadoop.io.WritableUtils.readVIntInRange(WritableUtils.java:348)
   at org.apache.hadoop.io.Text.readString(Text.java:464)
   at org.apache.hadoop.io.Text.readString(Text.java:457)
   at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:357)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
 {code}
 We should prevent this. Other exceptions can also occur while the process 
 is shutting down, and none of them should be reported to the AM.
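
The idea of suppressing error reports during shutdown can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the attached patch: the class `ShutdownAwareReporter` and its methods are hypothetical names, and the in-memory list stands in for the call that reports a failure to the AM. In Hadoop itself, `ShutdownHookManager.get().isShutdownInProgress()` could plausibly play the role of the flag used here.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: drop error reports once shutdown has begun. */
public class ShutdownAwareReporter {
    // Set to true when the process starts shutting down.
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    // Stand-in for errors that would be sent to the AM.
    private final List<String> reported = new CopyOnWriteArrayList<>();

    /** Intended to be called from a shutdown hook. */
    public void markShuttingDown() {
        shuttingDown.set(true);
    }

    /** Returns true if the error was recorded, false if it was suppressed. */
    public boolean reportError(Throwable t) {
        if (shuttingDown.get()) {
            // Errors seen during shutdown (e.g. "Filesystem closed") are
            // likely side effects of the shutdown itself, so skip them.
            return false;
        }
        reported.add(String.valueOf(t));
        return true;
    }

    public List<String> reported() {
        return reported;
    }

    public static void main(String[] args) {
        ShutdownAwareReporter reporter = new ShutdownAwareReporter();
        // Flip the flag when the JVM begins shutting down.
        Runtime.getRuntime().addShutdownHook(new Thread(reporter::markShuttingDown));
        boolean sent = reporter.reportError(new IOException("disk error"));
        System.out.println("reported before shutdown: " + sent);
    }
}
```

One caveat with this sketch: the JVM runs plain shutdown hooks in an unspecified order, so the flag may be set after another hook has already closed a resource. A real fix would need a check that becomes true as soon as shutdown starts.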



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-6002) MR task should prevent report error to AM when process is shutting down

2014-07-24 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated MAPREDUCE-6002:
--

Status: Patch Available  (was: Open)



[jira] [Updated] (MAPREDUCE-6002) MR task should prevent report error to AM when process is shutting down

2014-07-24 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated MAPREDUCE-6002:
--

Attachment: MR-6002.patch

Attached a patch for review.
