[jira] Updated: (MAPREDUCE-1861) Raid should rearrange the replicas while raiding

2010-12-08 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1861:
--

Fix Version/s: (was: 0.22.0)
   0.23.0
Affects Version/s: (was: 0.22.0)
   0.23.0
   Status: Patch Available  (was: Open)

> Raid should rearrange the replicas while raiding
> 
>
> Key: MAPREDUCE-1861
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1861
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.23.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-1861-v2.txt, MAPREDUCE-1861.txt, 
> MAPREDUCE-1861.txt
>
>
> Raiding a file introduces extra dependencies among the blocks on the same 
> stripe. Therefore we need a new way to place the blocks.
> It is desirable that a raided file satisfy the following two conditions:
> a. Replicas on the same stripe should be on different machines (or racks).
> b. Replicas of the same block should be on different racks.
> MAPREDUCE-1831 will try to delete replicas that are on the same stripe and 
> on the same machine (a).
> At the same time, it will try to maintain the number of distinct racks 
> holding each block (b).
> We cannot satisfy (a) and (b) simultaneously with the current logic in 
> BlockPlacementPolicyDefault.chooseTarget().
> One option is to change BlockPlacementPolicyDefault.chooseTarget().
> However, that placement is in general good for all files, including 
> unraided ones, and it is not clear that we can make it good for both 
> raided and unraided files.
> So we propose the following: when raiding a file, we first create one more 
> off-rack replica (so replication=4).
> Then we delete two replicas using the policy from MAPREDUCE-1831 
> (so replication=2).
> This way we can rearrange the replicas to satisfy (a) and (b) at the same 
> time.
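
As a rough illustration of the proposed raise-then-shrink sequence (a sketch 
only, not the attached patch; waitForFullReplication() is a hypothetical 
helper, and victim selection relies on the RAID-aware placement policy from 
MAPREDUCE-1831):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RaidReplicaRearranger {
  /**
   * Step 1: raise replication 3 -> 4 so chooseTarget() places one extra
   * off-rack replica (helps condition b).
   * Step 2: shrink 4 -> 2 so the MAPREDUCE-1831 policy deletes the two
   * replicas that are co-located with other blocks of the same stripe
   * (helps condition a).
   */
  public static void rearrange(FileSystem fs, Path source) throws IOException {
    fs.setReplication(source, (short) 4);   // add one off-rack replica
    waitForFullReplication(fs, source, 4);  // hypothetical: poll until done
    fs.setReplication(source, (short) 2);   // RAID policy picks the victims
  }

  private static void waitForFullReplication(FileSystem fs, Path p, int r)
      throws IOException {
    // Hypothetical helper: a real implementation would poll the file's
    // block locations until every block has r replicas.
  }
}
{code}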

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1861) Raid should rearrange the replicas while raiding

2010-12-08 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969621#action_12969621
 ] 

Scott Chen commented on MAPREDUCE-1861:
---

Rebase the patch.

> Raid should rearrange the replicas while raiding
> 
>
> Key: MAPREDUCE-1861
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1861
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.23.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-1861-v2.txt, MAPREDUCE-1861.txt, 
> MAPREDUCE-1861.txt
>
>
> Raiding a file introduces extra dependencies among the blocks on the same 
> stripe. Therefore we need a new way to place the blocks.
> It is desirable that a raided file satisfy the following two conditions:
> a. Replicas on the same stripe should be on different machines (or racks).
> b. Replicas of the same block should be on different racks.
> MAPREDUCE-1831 will try to delete replicas that are on the same stripe and 
> on the same machine (a).
> At the same time, it will try to maintain the number of distinct racks 
> holding each block (b).
> We cannot satisfy (a) and (b) simultaneously with the current logic in 
> BlockPlacementPolicyDefault.chooseTarget().
> One option is to change BlockPlacementPolicyDefault.chooseTarget().
> However, that placement is in general good for all files, including 
> unraided ones, and it is not clear that we can make it good for both 
> raided and unraided files.
> So we propose the following: when raiding a file, we first create one more 
> off-rack replica (so replication=4).
> Then we delete two replicas using the policy from MAPREDUCE-1831 
> (so replication=2).
> This way we can rearrange the replicas to satisfy (a) and (b) at the same 
> time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1861) Raid should rearrange the replicas while raiding

2010-12-08 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1861:
--

Attachment: MAPREDUCE-1861-v2.txt

> Raid should rearrange the replicas while raiding
> 
>
> Key: MAPREDUCE-1861
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1861
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.23.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-1861-v2.txt, MAPREDUCE-1861.txt, 
> MAPREDUCE-1861.txt
>
>
> Raiding a file introduces extra dependencies among the blocks on the same 
> stripe. Therefore we need a new way to place the blocks.
> It is desirable that a raided file satisfy the following two conditions:
> a. Replicas on the same stripe should be on different machines (or racks).
> b. Replicas of the same block should be on different racks.
> MAPREDUCE-1831 will try to delete replicas that are on the same stripe and 
> on the same machine (a).
> At the same time, it will try to maintain the number of distinct racks 
> holding each block (b).
> We cannot satisfy (a) and (b) simultaneously with the current logic in 
> BlockPlacementPolicyDefault.chooseTarget().
> One option is to change BlockPlacementPolicyDefault.chooseTarget().
> However, that placement is in general good for all files, including 
> unraided ones, and it is not clear that we can make it good for both 
> raided and unraided files.
> So we propose the following: when raiding a file, we first create one more 
> off-rack replica (so replication=4).
> Then we delete two replicas using the policy from MAPREDUCE-1831 
> (so replication=2).
> This way we can rearrange the replicas to satisfy (a) and (b) at the same 
> time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-2215) A more elegant FileSystem#listCorruptFileBlocks API (RAID changes)

2010-12-08 Thread Patrick Kling (JIRA)
A more elegant FileSystem#listCorruptFileBlocks API (RAID changes)
--

 Key: MAPREDUCE-2215
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2215
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Patrick Kling
Assignee: Patrick Kling


Map/reduce changes related to HADOOP-7060 and HDFS-1533.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1831) BlockPlacement policy for RAID

2010-12-08 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969589#action_12969589
 ] 

Scott Chen commented on MAPREDUCE-1831:
---

I have removed the dependency on MAPREDUCE-1861. This patch implements 
chooseTarget for parity files and does not need MAPREDUCE-1861.
I will also rebase the patch in MAPREDUCE-1861.

> BlockPlacement policy for RAID
> --
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.23.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-1831-v2.txt, MAPREDUCE-1831.20100610.txt, 
> MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt
>
>
> Raid introduces new dependencies between blocks within a file.
> The blocks help decode each other, so we should avoid putting them on the 
> same machine.
> The proposed BlockPlacementPolicy does the following:
> 1. When writing parity blocks, it avoids placing parity blocks and source 
> blocks together.
> 2. When reducing the replication factor, it deletes replicas that sit with 
> other dependent blocks.
> 3. It does not change the way we write normal files; it only behaves 
> differently when processing raided files.
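
To make the three behaviors concrete, here is a schematic sketch (not the 
attached patch; the real BlockPlacementPolicy hooks take more arguments, 
datanodes are reduced to plain strings, and isParityFile() / 
nodesHoldingStripe() are hypothetical helpers standing in for NameNode 
lookups):

{code}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RaidPlacementSketch {
  // 1 + 3: when writing a parity block, exclude nodes that already hold
  // source blocks of the same stripe; normal files pass through unchanged.
  List<String> chooseTarget(String path, int replicas, Set<String> excluded) {
    if (isParityFile(path)) {
      excluded.addAll(nodesHoldingStripe(path));
    }
    return defaultChooseTarget(path, replicas, excluded);
  }

  // 2: when shrinking replication, delete a replica that shares a machine
  // with other blocks of the stripe before falling back to the default rule.
  String chooseReplicaToDelete(String path, List<String> replicaNodes) {
    for (String node : replicaNodes) {
      if (nodesHoldingStripe(path).contains(node)) {
        return node;  // co-located replica: delete it first
      }
    }
    return replicaNodes.get(0);  // fall back to the default choice
  }

  // Hypothetical helpers standing in for NameNode state.
  boolean isParityFile(String path) { return path.contains("/raid/"); }
  Set<String> nodesHoldingStripe(String path) { return new HashSet<String>(); }
  List<String> defaultChooseTarget(String p, int n, Set<String> ex) {
    return new ArrayList<String>();
  }
}
{code}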

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1831) BlockPlacement policy for RAID

2010-12-08 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1831:
--

Fix Version/s: (was: 0.22.0)
   0.23.0
Affects Version/s: (was: 0.22.0)
   0.23.0
   Status: Patch Available  (was: Open)

> BlockPlacement policy for RAID
> --
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.23.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-1831-v2.txt, MAPREDUCE-1831.20100610.txt, 
> MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt
>
>
> Raid introduces new dependencies between blocks within a file.
> The blocks help decode each other, so we should avoid putting them on the 
> same machine.
> The proposed BlockPlacementPolicy does the following:
> 1. When writing parity blocks, it avoids placing parity blocks and source 
> blocks together.
> 2. When reducing the replication factor, it deletes replicas that sit with 
> other dependent blocks.
> 3. It does not change the way we write normal files; it only behaves 
> differently when processing raided files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1831) BlockPlacement policy for RAID

2010-12-08 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1831:
--

Attachment: MAPREDUCE-1831-v2.txt

I have changed the summary of this issue: it now covers not only 
deleteReplica but also chooseTarget. The patch is uploaded.

> BlockPlacement policy for RAID
> --
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1831-v2.txt, MAPREDUCE-1831.20100610.txt, 
> MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt
>
>
> Raid introduces new dependencies between blocks within a file.
> The blocks help decode each other, so we should avoid putting them on the 
> same machine.
> The proposed BlockPlacementPolicy does the following:
> 1. When writing parity blocks, it avoids placing parity blocks and source 
> blocks together.
> 2. When reducing the replication factor, it deletes replicas that sit with 
> other dependent blocks.
> 3. It does not change the way we write normal files; it only behaves 
> differently when processing raided files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1831) BlockPlacement policy for RAID

2010-12-08 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1831:
--

Description: 
Raid introduces new dependencies between blocks within a file.
The blocks help decode each other, so we should avoid putting them on the 
same machine.

The proposed BlockPlacementPolicy does the following:
1. When writing parity blocks, it avoids placing parity blocks and source 
blocks together.
2. When reducing the replication factor, it deletes replicas that sit with 
other dependent blocks.
3. It does not change the way we write normal files; it only behaves 
differently when processing raided files.

  was:
In raid, it is good to have the blocks on the same stripe located on 
different machines. This way, when one machine goes down, it does not 
break two blocks on the same stripe.
By doing this, we can decrease the block error probability in raid from 
O(p^3) to O(p^4), which can be a huge improvement (where p is the 
probability that a single replica is missing).

One way to do this is to add a new BlockPlacementPolicy that deletes 
co-located replicas, so that when raiding a file the remaining replicas 
end up on different machines.
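
For intuition, here is the back-of-the-envelope count behind O(p^3) vs. 
O(p^4), assuming replication 2 after raiding and a stripe that tolerates 
the loss of one block (these assumptions are mine, not stated in the issue):

\[
\Pr[\text{block lost}] = p^{2} \quad \text{(both replicas of the block missing)}
\]
\[
\Pr[\text{stripe lost}] =
\begin{cases}
O(p^{4}) & \text{independent placement: two blocks must each lose both replicas,} \\
O(p^{3}) & \text{two blocks share a machine: that machine fails ($p$) and each block's remaining replica fails ($p \cdot p$).}
\end{cases}
\]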

Summary: BlockPlacement policy for RAID  (was: Delete the co-located 
replicas when raiding file)

> BlockPlacement policy for RAID
> --
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, 
> MAPREDUCE-1831.v1.1.txt
>
>
> Raid introduce the new dependency between blocks within a file.
> The blocks help decode each other. Therefore we should avoid put them on the 
> same machine.
> The proposed BlockPlacementPolicy does the following
> 1. When writing parity blocks, it avoid the parity blocks and source blocks 
> sit together.
> 2. When reducing replication number, it deletes the blocks that sits with 
> other dependent blocks.
> 3. It does not change the way we write normal files. It only has different 
> behavior when processing raid files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

2010-12-08 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969572#action_12969572
 ] 

Joydeep Sen Sarma commented on MAPREDUCE-2214:
--

I think what happened in our case was something like this:
# The task was requested to be killed.
# The TT performed the kill action and reported back to the JT.
# But the task reported back as done, at which point the TT promptly moved 
it into the SUCCEEDED state.
# Meanwhile, the JT scheduled a cleanup, and the cleanup failed to launch 
without returning the slot.

The criss-crossing of #2 and #3 was what was unexpected, I think (something 
the code doesn't anticipate).

We don't hit this problem with speculation because we never request 
speculation when a task is about to complete: there is a check on the 
task's remaining time, and if it is less than N minutes we don't speculate 
(there's a JIRA for this; I don't remember which).

> TaskTracker should release slot if task is not launched
> ---
>
> Key: MAPREDUCE-2214
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2214
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Ramkumar Vadali
>Assignee: Ramkumar Vadali
>
> TaskTracker.TaskInProgress.launchTask() does not launch a task if it is 
> not in an expected state. However, when the task is not launched, the 
> slot is not released. We have observed this in production: the task was 
> already in the SUCCEEDED state by the time launchTask() got to it, and 
> the slot was never released. It is not clear how the task got into that 
> state, but it is better to handle the case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

2010-12-08 Thread Dick King (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969485#action_12969485
 ] 

Dick King commented on MAPREDUCE-2214:
--

Speculative execution is a legitimate way a task can become {{SUCCEEDED}} while 
an attempt on that task is waiting to get launched.

> TaskTracker should release slot if task is not launched
> ---
>
> Key: MAPREDUCE-2214
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2214
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Ramkumar Vadali
>Assignee: Ramkumar Vadali
>
> TaskTracker.TaskInProgress.launchTask() does not launch a task if it is 
> not in an expected state. However, when the task is not launched, the 
> slot is not released. We have observed this in production: the task was 
> already in the SUCCEEDED state by the time launchTask() got to it, and 
> the slot was never released. It is not clear how the task got into that 
> state, but it is better to handle the case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

2010-12-08 Thread Ramkumar Vadali (JIRA)
TaskTracker should release slot if task is not launched
---

 Key: MAPREDUCE-2214
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2214
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Ramkumar Vadali
Assignee: Ramkumar Vadali


TaskTracker.TaskInProgress.launchTask() does not launch a task if it is not 
in an expected state. However, when the task is not launched, the slot is 
not released. We have observed this in production: the task was already in 
the SUCCEEDED state by the time launchTask() got to it, and the slot was 
never released. It is not clear how the task got into that state, but it is 
better to handle the case.
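
A minimal sketch of the suggested handling, assuming simplified TaskTracker 
internals (the state values and the releaseSlot() helper are illustrative, 
not the actual patch):

{code}
// Simplified model of TaskTracker.TaskInProgress.launchTask().
enum State { UNASSIGNED, FAILED_UNCLEAN, KILLED_UNCLEAN, RUNNING, SUCCEEDED }

class TaskInProgressSketch {
  State state = State.SUCCEEDED;

  synchronized void launchTask() {
    if (state == State.UNASSIGNED
        || state == State.FAILED_UNCLEAN
        || state == State.KILLED_UNCLEAN) {
      startTask();  // normal path: localize and fork the attempt
    } else {
      // Task already moved on (e.g. SUCCEEDED before we got here):
      // do not launch, but return the reserved slot so it is not leaked.
      System.out.println("Not launching task in state " + state
          + "; releasing slot");
      releaseSlot();  // illustrative helper
    }
  }

  void startTask() { /* localize the job and start the attempt */ }
  void releaseSlot() { /* give the map/reduce slot back to the launcher */ }
}
{code}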

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-2209) TaskTracker's heartbeat hangs for several minutes when copying large job.jar from HDFS

2010-12-08 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated MAPREDUCE-2209:
---

Description: 
If a job's jar file is very large, e.g. 200 MB+, the TaskTracker's 
heartbeat hangs for several minutes while the job is being localized. The 
jstacks of the related threads are as follows:
{code:borderStyle=solid}
"TaskLauncher for task" daemon prio=10 tid=0x002b05ee5000 nid=0x1adf 
runnable [0x42e56000]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
- locked <0x002afc892ec8> (a sun.nio.ch.Util$1)
- locked <0x002afc892eb0> (a java.util.Collections$UnmodifiableSet)
- locked <0x002afc8927d8> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at 
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
- locked <0x002afce26158> (a java.io.BufferedInputStream)
at java.io.DataInputStream.readShort(DataInputStream.java:295)
at 
org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1304)
at 
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1556)
- locked <0x002afce26218> (a 
org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
at 
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1673)
- locked <0x002afce26218> (a 
org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
at java.io.DataInputStream.read(DataInputStream.java:83)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:209)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1214)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1195)
at 
org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:824)
- locked <0x002afce2d260> (a 
org.apache.hadoop.mapred.TaskTracker$RunningJob)
at 
org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1745)
at 
org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:103)
at 
org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1710)

"Map-events fetcher for all reduce tasks on 
tracker_r01a08025:localhost/127.0.0.1:50050" daemon prio=10 
tid=0x002b05ef8000 
nid=0x1ada waiting for monitor entry [0x42d55000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.reducesInShuffle(TaskTracker.java:582)
- waiting to lock <0x002afce2d260> (a 
org.apache.hadoop.mapred.TaskTracker$RunningJob)
at 
org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:617)
- locked <0x002a9eefe1f8> (a java.util.TreeMap)


"IPC Server handler 2 on 50050" daemon prio=10 tid=0x002b050eb000 
nid=0x1ab0 waiting for monitor entry [0x4234b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.mapred.TaskTracker.getMapCompletionEvents(TaskTracker.java:2684)
- waiting to lock <0x002a9eefe1f8> (a java.util.TreeMap)
- locked <0x002a9eac1de8> (a org.apache.hadoop.mapred.TaskTracker)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)

"main" prio=10 tid=0x40113800 nid=0x197d waiting for monitor entry 
[0x4022a000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1196)
- waiting to lock <0x002a9eac1de8> (a 
org.apache.hadoop.mapred.TaskTracker)
at 
org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1068)
at org.apache.hadoop.

[jira] Commented: (MAPREDUCE-2209) TaskTracker's heartbeat hangs for several minutes when copying large job.jar from HDFS

2010-12-08 Thread Liyin Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969238#action_12969238
 ] 

Liyin Liang commented on MAPREDUCE-2209:


I set up a cluster with the latest version, 0.21.0. To simulate the large 
job.jar problem, I made the TaskLauncher thread sleep 100 seconds just 
before downloading job.jar in the localizeJobJarFile function. The 
heartbeat of some TaskTrackers then hangs for almost 100 seconds. The 
jstack is basically the same as on 0.19:
{code:borderStyle=solid}
"TaskLauncher for MAP tasks" daemon prio=10 tid=0x2aab3145a800 nid=0x3fe8 
waiting on condition [0x440b3000..0x440b3a10]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at 
org.apache.hadoop.mapred.TaskTracker.localizeJobJarFile(TaskTracker.java:1150)
at 
org.apache.hadoop.mapred.TaskTracker.localizeJobFiles(TaskTracker.java:1074)
at 
org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:977)
- locked <0x2aaab3a86f10> (a 
org.apache.hadoop.mapred.TaskTracker$RunningJob)
at 
org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2248)
at 
org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2213)

"Map-events fetcher for all reduce tasks on 
tracker_hd2:localhost.localdomain/127.0.0.1:36128" daemon prio=10 tid=0x2aab
31451c00 nid=0x3fde waiting for monitor entry 
[0x41a4..0x41a40d90]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.reducesInShuffle(TaskTracker.java:800)
- waiting to lock <0x2aaab3a86f10> (a 
org.apache.hadoop.mapred.TaskTracker$RunningJob)
at 
org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:834)
- locked <0x2aaab38ee1b8> (a java.util.TreeMap)

"IPC Server handler 0 on 36128" daemon prio=10 tid=0x4368ac00 
nid=0x3fc8 waiting for monitor entry [0x425f6000..0x425
f7c90]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.mapred.TaskTracker.getMapCompletionEvents(TaskTracker.java:3254)
- waiting to lock <0x2aaab38ee1b8> (a java.util.TreeMap)
- locked <0x2aaab37f1708> (a org.apache.hadoop.mapred.TaskTracker)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)

"main" prio=10 tid=0x42fff400 nid=0x3f91 waiting for monitor entry 
[0x41ef..0x41ef0ed0]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1535)
- waiting to lock <0x2aaab37f1708> (a 
org.apache.hadoop.mapred.TaskTracker)
at 
org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1433)
at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2330)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3462)
{code}
Lock order of the threads involved:
# TaskLauncher (localizeJobJarFile): holds RunningJob
# Map-events fetcher: holds runningJobs; waiting to lock RunningJob
# IPC Server handler (getMapCompletionEvents): holds TaskTracker; waiting 
to lock runningJobs
# main (transmitHeartBeat): waiting to lock TaskTracker
So the TaskTracker lock is held indirectly while job.jar is being 
downloaded.
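
One way to break this chain, sketched under the assumption that the jar 
copy can happen before the RunningJob monitor is taken (an illustration, 
not a committed fix; copyJobJarFromHdfs() is a hypothetical helper):

{code}
import java.io.IOException;

class LocalizeSketch {
  static class RunningJob {
    volatile boolean localized;
    volatile String localJarPath;
  }

  // Do the long HDFS download with no TaskTracker/RunningJob locks held,
  // then take the monitor only for the brief bookkeeping update. The
  // Map-events fetcher (and, transitively, the heartbeat) is then no
  // longer blocked for the duration of the copy.
  void localizeJob(RunningJob rjob, String jobJarOnHdfs) throws IOException {
    String localJar = copyJobJarFromHdfs(jobJarOnHdfs);  // hypothetical
    synchronized (rjob) {
      rjob.localJarPath = localJar;
      rjob.localized = true;
    }
  }

  String copyJobJarFromHdfs(String src) throws IOException {
    return "/local/taskTracker/jobcache/job.jar";  // stand-in for the copy
  }
}
{code}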

> TaskTracker's heartbeat hangs for several minutes when copying large 
> job.jar from HDFS
> -
>
> Key: MAPREDUCE-2209
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2209
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: hadoop version: 0.19.1
>Reporter: Liyin Liang
>Priority: Blocker
>
> If a job's jar file is very large, e.g. 200 MB+, the TaskTracker's 
> heartbeat hangs for several minutes while the job is being localized. The 
> jstack