[jira] Updated: (MAPREDUCE-1861) Raid should rearrange the replicas while raiding
[ https://issues.apache.org/jira/browse/MAPREDUCE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1861:
----------------------------------
        Fix Version/s:     (was: 0.22.0)
                           0.23.0
    Affects Version/s:     (was: 0.22.0)
                           0.23.0
               Status: Patch Available  (was: Open)

> Raid should rearrange the replicas while raiding
> ------------------------------------------------
>
>                 Key: MAPREDUCE-1861
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1861
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.23.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-1861-v2.txt, MAPREDUCE-1861.txt, MAPREDUCE-1861.txt
>
>
> A raided file introduces extra dependencies among the blocks on the same stripe,
> so we need a new way to place the blocks.
> It is desirable that a raided file satisfy the following two conditions:
> a. Replicas on the same stripe should be on different machines (or racks)
> b. Replicas of the same block should be on different racks
> MAPREDUCE-1831 will try to delete the replicas that are on the same stripe and
> the same machine (a), but in the meantime it will try to maintain the number of
> distinct racks for each block (b).
> We cannot satisfy (a) and (b) at the same time with the current logic in
> BlockPlacementPolicyDefault.chooseTarget().
> One choice is to change BlockPlacementPolicyDefault.chooseTarget(). However,
> that placement is in general good for all files, including the unraided ones,
> and it is not clear that we can make it good for both raided and unraided files.
> So we propose the following: when raiding the file, we create one more off-rack
> replica (so replication=4 now), and then delete two replicas using the policy
> in MAPREDUCE-1831 (so replication=2 now).
> This way we can rearrange the replicas to satisfy (a) and (b) at the same time.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
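The two-phase trick described above (raise replication to 4 with an extra off-rack replica, then delete two replicas with the MAPREDUCE-1831 policy) is easiest to see with a toy checker for conditions (a) and (b). The class below is an illustrative sketch, not code from the patch; the "rack/node" location strings are invented for the example.

```java
import java.util.*;

// Toy model of the placement constraints for one RAID stripe:
// (a) no two replicas on the stripe share a node;
// (b) each block's replicas span at least two distinct racks.
// A replica location is encoded as "rack/node".
public class StripeRearrangeSketch {

    static String rackOf(String loc) { return loc.split("/")[0]; }

    // stripe maps block id -> list of replica locations.
    static boolean satisfies(Map<String, List<String>> stripe) {
        Set<String> nodes = new HashSet<>();
        for (List<String> replicas : stripe.values()) {
            Set<String> racks = new HashSet<>();
            for (String loc : replicas) {
                racks.add(rackOf(loc));
            }
            if (racks.size() < 2) return false;       // violates (b)
            for (String loc : replicas) {
                if (!nodes.add(loc)) return false;    // shared node: violates (a)
            }
        }
        return true;
    }
}
```

With a checker like this it is easy to see why the intermediate replication=4 state helps: the extra off-rack replica guarantees there is always a replica that can be deleted for (a) without dropping the rack count below the threshold in (b).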
[jira] Commented: (MAPREDUCE-1861) Raid should rearrange the replicas while raiding
[ https://issues.apache.org/jira/browse/MAPREDUCE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969621#action_12969621 ]

Scott Chen commented on MAPREDUCE-1861:
---------------------------------------

Rebased the patch.

> Raid should rearrange the replicas while raiding
> ------------------------------------------------
>
>                 Key: MAPREDUCE-1861
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1861
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.23.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-1861-v2.txt, MAPREDUCE-1861.txt, MAPREDUCE-1861.txt

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1861) Raid should rearrange the replicas while raiding
[ https://issues.apache.org/jira/browse/MAPREDUCE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1861:
----------------------------------
    Attachment: MAPREDUCE-1861-v2.txt

> Raid should rearrange the replicas while raiding
> ------------------------------------------------
>
>                 Key: MAPREDUCE-1861
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1861
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.23.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-1861-v2.txt, MAPREDUCE-1861.txt, MAPREDUCE-1861.txt

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-2215) A more elegant FileSystem#listCorruptFileBlocks API (RAID changes)
A more elegant FileSystem#listCorruptFileBlocks API (RAID changes)
------------------------------------------------------------------

                 Key: MAPREDUCE-2215
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2215
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: contrib/raid
            Reporter: Patrick Kling
            Assignee: Patrick Kling

Map/reduce changes related to HADOOP-7060 and HDFS-1533.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1831) BlockPlacement policy for RAID
[ https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969589#action_12969589 ]

Scott Chen commented on MAPREDUCE-1831:
---------------------------------------

I have removed the dependency on MAPREDUCE-1861. This patch does chooseTarget
for parity files and does not need MAPREDUCE-1861. I will also rebase the patch
in MAPREDUCE-1861.

> BlockPlacement policy for RAID
> ------------------------------
>
>                 Key: MAPREDUCE-1831
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.23.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-1831-v2.txt, MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt
>
>
> Raid introduces new dependencies between blocks within a file: the blocks help
> decode each other, so we should avoid placing them on the same machine.
> The proposed BlockPlacementPolicy does the following:
> 1. When writing parity blocks, it avoids placing parity blocks and source
>    blocks together.
> 2. When reducing the replication number, it deletes the replicas that sit with
>    other dependent blocks.
> 3. It does not change the way we write normal files; it only behaves
>    differently when processing raid files.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1831) BlockPlacement policy for RAID
[ https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1831:
----------------------------------
        Fix Version/s:     (was: 0.22.0)
                           0.23.0
    Affects Version/s:     (was: 0.22.0)
                           0.23.0
               Status: Patch Available  (was: Open)

> BlockPlacement policy for RAID
> ------------------------------
>
>                 Key: MAPREDUCE-1831
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.23.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-1831-v2.txt, MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1831) BlockPlacement policy for RAID
[ https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1831:
----------------------------------
    Attachment: MAPREDUCE-1831-v2.txt

I have changed the summary of this issue: it now covers not only deleteReplica
but also chooseTarget. The patch is uploaded.

> BlockPlacement policy for RAID
> ------------------------------
>
>                 Key: MAPREDUCE-1831
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1831-v2.txt, MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1831) BlockPlacement policy for RAID
[ https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1831:
----------------------------------
    Description:
Raid introduces new dependencies between blocks within a file: the blocks help
decode each other, so we should avoid placing them on the same machine.
The proposed BlockPlacementPolicy does the following:
1. When writing parity blocks, it avoids placing parity blocks and source
   blocks together.
2. When reducing the replication number, it deletes the replicas that sit with
   other dependent blocks.
3. It does not change the way we write normal files; it only behaves
   differently when processing raid files.

    was:
In raid, it is good to have the blocks on the same stripe located on different
machines. That way, when one machine is down, it does not break two blocks on
the stripe. By doing this, we can decrease the block error probability in raid
from O(p^3) to O(p^4), which can be a huge improvement (where p is the
replica-missing probability).
One way to do this is to add a new BlockPlacementPolicy which deletes the
replicas that are co-located. So when raiding the file, we can make the
remaining replicas live on different machines.

    Summary: BlockPlacement policy for RAID  (was: Delete the co-located replicas when raiding file)

> BlockPlacement policy for RAID
> ------------------------------
>
>                 Key: MAPREDUCE-1831
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
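Rule 2 of the policy (delete the replica that sits with other dependent blocks when shrinking replication) can be sketched as a small selection function. This is an illustrative model, not the actual patch; the method name, "rack/node" encoding, and the two-rack floor are assumptions for the example.

```java
import java.util.*;

// Sketch of the delete-replica rule for RAID: when the replication number
// is reduced, prefer deleting a replica whose node also hosts another
// block of the same stripe, but never one whose removal would leave the
// block on fewer than two distinct racks.
public class RaidDeleteChoiceSketch {

    static String rackOf(String loc) { return loc.split("/")[0]; }

    // replicas: locations ("rack/node") of the block being shrunk.
    // stripeNodes: locations hosting other blocks of the same stripe.
    static String chooseReplicaToDelete(List<String> replicas,
                                        Set<String> stripeNodes) {
        for (String loc : replicas) {
            if (!stripeNodes.contains(loc)) continue;   // not co-located
            Set<String> racksLeft = new HashSet<>();
            for (String other : replicas) {
                if (!other.equals(loc)) racksLeft.add(rackOf(other));
            }
            if (racksLeft.size() >= 2) return loc;      // safe to delete
        }
        return replicas.get(0);  // fall back to default-policy behaviour
    }
}
```

The point of the rack check is exactly the tension the MAPREDUCE-1861 description calls out: deleting co-located replicas (a) can conflict with keeping distinct racks per block (b) unless an extra replica is available.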
[jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched
[ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969572#action_12969572 ]

Joydeep Sen Sarma commented on MAPREDUCE-2214:
----------------------------------------------

I think what happened in our case was something like this:
# the task was requested to be killed
# the TT performed the kill action and reported back to the JT
# but the task reported back as done - at which point the TT promptly moved it into the SUCCEEDED state
# meanwhile the JT scheduled a cleanup, and the cleanup failed to launch without returning the slot

The criss-crossing of #2 and #3 was what was unexpected, I think (something the
code doesn't anticipate). We don't hit this problem with speculation because we
never request speculation when the task is about to complete: there's a check on
the remaining time of the task, and if the remaining time is less than N
minutes, we don't speculate (there's a JIRA for this - I don't remember which).

> TaskTracker should release slot if task is not launched
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-2214
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2214
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>
> TaskTracker.TaskInProgress.launchTask() does not launch a task if it is not
> in an expected state. However, in the case where the task is not launched,
> the slot is not released. We have observed this in production - the task was
> in the SUCCEEDED state by the time launchTask() got to it, and the slot was
> never released. It is not clear how the task got into that state, but it is
> better to handle the case.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched
[ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969485#action_12969485 ]

Dick King commented on MAPREDUCE-2214:
--------------------------------------

Speculative execution is a legitimate way a task can become {{SUCCEEDED}} while
an attempt on that task is waiting to get launched.

> TaskTracker should release slot if task is not launched
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-2214
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2214
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched
TaskTracker should release slot if task is not launched
-------------------------------------------------------

                 Key: MAPREDUCE-2214
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2214
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.20.1
            Reporter: Ramkumar Vadali
            Assignee: Ramkumar Vadali

TaskTracker.TaskInProgress.launchTask() does not launch a task if it is not in
an expected state. However, in the case where the task is not launched, the
slot is not released. We have observed this in production - the task was in the
SUCCEEDED state by the time launchTask() got to it, and the slot was never
released. It is not clear how the task got into that state, but it is better to
handle the case.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
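The fix the issue asks for has a simple shape: if the launch is skipped because the task is no longer in a launchable state, the slot must still be returned. The class below is a simplified, hypothetical model of that pattern (the real TaskTracker code is considerably more involved; the names and the slot counter here are invented for illustration).

```java
// Simplified model of "release the slot when the task is not launched":
// the slot is claimed up front, and a finally-block guarantees it is
// returned whenever the launch does not actually happen.
public class SlotReleaseSketch {
    enum State { UNASSIGNED, SUCCEEDED, KILLED }

    int freeSlots;

    SlotReleaseSketch(int slots) { this.freeSlots = slots; }

    // Returns true if the task was actually launched.
    boolean launchTask(State taskState) {
        freeSlots--;                       // slot claimed for this task
        boolean launched = false;
        try {
            if (taskState == State.UNASSIGNED) {
                // ... fork the child JVM here ...
                launched = true;
            }
            return launched;
        } finally {
            if (!launched) {
                freeSlots++;               // the fix: give the slot back
            }
        }
    }
}
```

Without the finally-block, a task that reached SUCCEEDED (or KILLED) before launchTask() ran would leak the slot permanently, which is exactly the production symptom described above.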
[jira] Updated: (MAPREDUCE-2209) TaskTracker's heartbeat hangs for several minutes when copying large job.jar from HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Liang updated MAPREDUCE-2209:
-----------------------------------
    Description:
If a job's jar file is very large, e.g. 200MB+, the TaskTracker's heartbeat
hangs for several minutes while localizing the job. The jstacks of the related
threads are as follows:
{code:borderStyle=solid}
"TaskLauncher for task" daemon prio=10 tid=0x002b05ee5000 nid=0x1adf runnable [0x42e56000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0x002afc892ec8> (a sun.nio.ch.Util$1)
        - locked <0x002afc892eb0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x002afc8927d8> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        - locked <0x002afce26158> (a java.io.BufferedInputStream)
        at java.io.DataInputStream.readShort(DataInputStream.java:295)
        at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1304)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1556)
        - locked <0x002afce26218> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1673)
        - locked <0x002afce26218> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
        at java.io.DataInputStream.read(DataInputStream.java:83)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:209)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1214)
        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1195)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:824)
        - locked <0x002afce2d260> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1745)
        at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:103)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1710)

"Map-events fetcher for all reduce tasks on tracker_r01a08025:localhost/127.0.0.1:50050" daemon prio=10 tid=0x002b05ef8000 nid=0x1ada waiting for monitor entry [0x42d55000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.reducesInShuffle(TaskTracker.java:582)
        - waiting to lock <0x002afce2d260> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
        at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:617)
        - locked <0x002a9eefe1f8> (a java.util.TreeMap)

"IPC Server handler 2 on 50050" daemon prio=10 tid=0x002b050eb000 nid=0x1ab0 waiting for monitor entry [0x4234b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.mapred.TaskTracker.getMapCompletionEvents(TaskTracker.java:2684)
        - waiting to lock <0x002a9eefe1f8> (a java.util.TreeMap)
        - locked <0x002a9eac1de8> (a org.apache.hadoop.mapred.TaskTracker)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)

"main" prio=10 tid=0x40113800 nid=0x197d waiting for monitor entry [0x4022a000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1196)
        - waiting to lock <0x002a9eac1de8> (a org.apache.hadoop.mapred.TaskTracker)
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1068)
        at org.apache.hadoop.
[jira] Commented: (MAPREDUCE-2209) TaskTracker's heartbeat hangs for several minutes when copying large job.jar from HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969238#action_12969238 ]

Liyin Liang commented on MAPREDUCE-2209:
----------------------------------------

I set up a cluster with the latest version, 0.21.0. To simulate the large
job.jar problem, I let the TaskLauncher thread sleep for 100 seconds just
before downloading job.jar in the localizeJobJarFile function. The heartbeat of
some TTs then hangs for almost 100 seconds. Basically, the jstack is the same
as on 0.19:
{code:borderStyle=solid}
"TaskLauncher for MAP tasks" daemon prio=10 tid=0x2aab3145a800 nid=0x3fe8 waiting on condition [0x440b3000..0x440b3a10]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.TaskTracker.localizeJobJarFile(TaskTracker.java:1150)
        at org.apache.hadoop.mapred.TaskTracker.localizeJobFiles(TaskTracker.java:1074)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:977)
        - locked <0x2aaab3a86f10> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2248)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2213)

"Map-events fetcher for all reduce tasks on tracker_hd2:localhost.localdomain/127.0.0.1:36128" daemon prio=10 tid=0x2aab31451c00 nid=0x3fde waiting for monitor entry [0x41a40..0x41a40d90]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.reducesInShuffle(TaskTracker.java:800)
        - waiting to lock <0x2aaab3a86f10> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
        at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:834)
        - locked <0x2aaab38ee1b8> (a java.util.TreeMap)

"IPC Server handler 0 on 36128" daemon prio=10 tid=0x4368ac00 nid=0x3fc8 waiting for monitor entry [0x425f6000..0x425f7c90]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.mapred.TaskTracker.getMapCompletionEvents(TaskTracker.java:3254)
        - waiting to lock <0x2aaab38ee1b8> (a java.util.TreeMap)
        - locked <0x2aaab37f1708> (a org.apache.hadoop.mapred.TaskTracker)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)

"main" prio=10 tid=0x42fff400 nid=0x3f91 waiting for monitor entry [0x41ef0..0x41ef0ed0]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1535)
        - waiting to lock <0x2aaab37f1708> (a org.apache.hadoop.mapred.TaskTracker)
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1433)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2330)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3462)
{code}

Lock order of the related threads:
# TaskLauncher (localizeJobJarFile): locked RunningJob
# Map-events fetcher: locked runningJobs, waiting to lock RunningJob
# IPC Server handler (getMapCompletionEvents): locked TaskTracker, waiting to lock runningJobs
# main (transmitHeartBeat): waiting to lock TaskTracker

So the TaskTracker is locked indirectly while downloading job.jar.
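The lock chain shows how a slow HDFS copy transitively blocks the heartbeat thread. A common mitigation for this pattern is to do the slow I/O before taking the contended monitor and only publish the result under the lock. The class below is a hedged sketch of that idea under invented names; it is not the actual TaskTracker code or a proposed patch.

```java
import java.nio.file.*;
import java.util.*;

// Sketch of "copy outside the lock": the large job.jar download happens
// with no monitor held, so threads contending on the same locks (here a
// stand-in runningJobLock) are only blocked for the cheap bookkeeping.
public class LocalizeSketch {
    private final Object runningJobLock = new Object();

    // Slow I/O: performed outside any shared monitor.
    Path downloadJar(Path src, Path dst) throws Exception {
        return Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    // Cheap bookkeeping: the only part done under the lock.
    void publishLocalizedJob(Path localJar, Map<String, Path> runningJobs,
                             String jobId) {
        synchronized (runningJobLock) {
            runningJobs.put(jobId, localJar);
        }
    }
}
```

With this split, the Map-events fetcher and IPC handler threads in the dump above would contend only on the brief publish step, not on the multi-minute copy.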
> TaskTracker's heartbeat hangs for several minutes when copying large job.jar
> from HDFS
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2209
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2209
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>         Environment: hadoop version: 0.19.1
>            Reporter: Liyin Liang
>            Priority: Blocker
>
> If a job's jar file is very large, e.g. 200MB+, the TaskTracker's heartbeat
> hangs for several minutes while localizing the job. The jstack