[jira] Created: (MAPREDUCE-2308) Sort buffer size (io.sort.mb) is limited to 2 GB
Sort buffer size (io.sort.mb) is limited to 2 GB

Key: MAPREDUCE-2308
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2308
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.21.0, 0.20.2, 0.20.1
Environment: Cloudera CDH3b3 (0.20.2+)
Reporter: Jay Hacker
Priority: Minor

I have MapReduce jobs that use a large amount of per-task memory, because the algorithm I'm using converges faster if more data is together on a node. I have my JVM heap size set at 3200 MB, and if I use the popular rule of thumb that io.sort.mb should be ~70% of that, I get 2240 MB. I rounded this down to 2048 MB, but map tasks crash with:

{noformat}
java.io.IOException: Invalid io.sort.mb: 2048
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:790)
	...
{noformat}

MapTask.MapOutputBuffer implements its buffer with a byte[] of size io.sort.mb (in bytes), and sanity-checks the size before allocating the array. The problem is that Java arrays can't have more than 2^31 - 1 elements (even with a 64-bit JVM), and this is a limitation of the Java language specification itself. As memory and data sizes grow, this would seem to be a crippling limitation of Java.

It would be nice if this ceiling were documented, and an error issued sooner, e.g. in jobtracker startup upon reading the config.

Going forward, we may need to implement some array of arrays hack for large buffers. :(

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
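The arithmetic behind the report can be checked with a small, self-contained sketch. It is not the code in MapTask; the class name, the constant, and the early range check are hypothetical and only illustrate why 2048 MB cannot be expressed as a byte[] length while 2047 MB can.

```java
// Sketch only: shows why a 2048 MB byte[] cannot exist in Java.
// SortBufferLimit, MAX_SORT_MB and bufferBytes are hypothetical names, not Hadoop APIs.
class SortBufferLimit {
    /** Largest whole-megabyte size whose byte count still fits in an int. */
    static final int MAX_SORT_MB = Integer.MAX_VALUE >> 20; // 2047

    /** Hypothetical early check: returns the buffer size in bytes or rejects the setting. */
    static int bufferBytes(int sortMb) {
        if (sortMb <= 0 || sortMb > MAX_SORT_MB) {
            throw new IllegalArgumentException(
                "io.sort.mb must be between 1 and " + MAX_SORT_MB + ", got " + sortMb);
        }
        return sortMb << 20; // cannot overflow after the range check
    }

    public static void main(String[] args) {
        System.out.println(2048 << 20);        // -2147483648 (int overflow)
        System.out.println(2048L << 20);       // 2147483648, one more than Integer.MAX_VALUE
        System.out.println(bufferBytes(2047)); // 2146435072
    }
}
```

A check of this kind could in principle run wherever the configuration is first read, which is what the report asks for; where exactly that would live in Hadoop is left open here.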
[jira] Created: (MAPREDUCE-2307) Exception thrown in Jobtracker logs, when the Scheduler configured is FairScheduler.
Exception thrown in Jobtracker logs, when the Scheduler configured is FairScheduler.

Key: MAPREDUCE-2307
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2307
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.23.0
Reporter: Devaraj K
Priority: Minor

If we start the job tracker with the fair scheduler using the default configuration, it logs the exception below.

{code:xml}
2010-07-03 10:18:27,142 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9001: starting
2010-07-03 10:18:28,037 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/linux172.site
2010-07-03 10:18:28,090 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/linux177.site
2010-07-03 10:18:40,074 ERROR org.apache.hadoop.mapred.PoolManager: Failed to reload allocations file - will use existing allocations.
java.lang.NullPointerException
	at java.io.File.<init>(File.java:222)
	at org.apache.hadoop.mapred.PoolManager.reloadAllocsIfNecessary(PoolManager.java:127)
	at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:234)
	at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2785)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:513)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:984)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:980)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:978)
{code}
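The top frame of the trace is the java.io.File constructor, which throws NullPointerException when given a null path. A plausible reading (an assumption, not confirmed by the report) is that no allocation file path is set under the default configuration. The sketch below shows the failure and one possible guard; AllocFileGuard and resolveAllocFile are hypothetical names and do not reflect the actual PoolManager code.

```java
import java.io.File;

// Sketch only: a null guard in front of new File(path).
// AllocFileGuard and resolveAllocFile are hypothetical, not part of PoolManager.
class AllocFileGuard {
    /** Returns the allocation file, or null when no path is configured. */
    static File resolveAllocFile(String allocFilePath) {
        if (allocFilePath == null) {
            return null; // nothing configured: caller skips the reload
        }
        return new File(allocFilePath);
    }

    public static void main(String[] args) {
        try {
            new File((String) null); // what the stack trace suggests happened
        } catch (NullPointerException e) {
            System.out.println("new File(null) throws NullPointerException");
        }
        System.out.println(resolveAllocFile(null) == null); // true
    }
}
```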
[jira] Created: (MAPREDUCE-2309) While querying the Job Statics from the command-line, if we give wrong status name then there is no warning or response.
While querying the Job Statics from the command-line, if we give wrong status name then there is no warning or response.

Key: MAPREDUCE-2309
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2309
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.23.0
Reporter: Devaraj K
Priority: Minor

If we request job information from the command-line interface with a wrong status name, no warning or response is given.
[jira] Created: (MAPREDUCE-2310) If we stop Job Tracker, Task Tracker is also getting stopped.
If we stop Job Tracker, Task Tracker is also getting stopped.

Key: MAPREDUCE-2310
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2310
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Affects Versions: 0.20.2
Reporter: Devaraj K
Priority: Minor

If we run stop-jobtracker.sh to stop the Job Tracker, the Task Tracker stops as well. This does not apply to the latest (trunk) code because the stop-jobtracker.sh file is not included there.
[jira] Commented: (MAPREDUCE-2308) Sort buffer size (io.sort.mb) is limited to 2 GB
[ https://issues.apache.org/jira/browse/MAPREDUCE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992070#comment-12992070 ]

Arun C Murthy commented on MAPREDUCE-2308:

You are hitting the JVM limit on the size of an array... we'll need to change the io.sort.mb to use multiple buffers...

Sort buffer size (io.sort.mb) is limited to 2 GB

Key: MAPREDUCE-2308
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2308
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.1, 0.20.2, 0.21.0
Environment: Cloudera CDH3b3 (0.20.2+)
Reporter: Jay Hacker
Priority: Minor
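The "multiple buffers" idea in the comment could look roughly like the following: one logical buffer addressed by a long index and backed by several byte[] chunks. This is a minimal sketch under stated assumptions, not a proposed patch; ChunkedByteBuffer and its methods are hypothetical names, and a real sort buffer would also have to handle records that straddle chunk boundaries, spilling, and bulk copies.

```java
// Sketch only: a logical byte buffer that may exceed 2^31 - 1 bytes,
// backed by several byte[] chunks. ChunkedByteBuffer is a hypothetical name.
class ChunkedByteBuffer {
    private final long capacity;
    private final int chunkSize;
    private final byte[][] chunks;

    ChunkedByteBuffer(long capacity, int chunkSize) {
        if (capacity < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("capacity must be >= 0 and chunkSize > 0");
        }
        long chunkCount = (capacity + chunkSize - 1) / chunkSize; // ceiling division
        if (chunkCount > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("too many chunks: " + chunkCount);
        }
        this.capacity = capacity;
        this.chunkSize = chunkSize;
        this.chunks = new byte[(int) chunkCount][];
        long remaining = capacity;
        for (int i = 0; i < chunks.length; i++) {
            chunks[i] = new byte[(int) Math.min(chunkSize, remaining)]; // last chunk may be short
            remaining -= chunks[i].length;
        }
    }

    long capacity() {
        return capacity;
    }

    byte get(long index) {
        checkIndex(index);
        return chunks[(int) (index / chunkSize)][(int) (index % chunkSize)];
    }

    void put(long index, byte value) {
        checkIndex(index);
        chunks[(int) (index / chunkSize)][(int) (index % chunkSize)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= capacity) {
            throw new IndexOutOfBoundsException("index " + index + ", capacity " + capacity);
        }
    }

    public static void main(String[] args) {
        ChunkedByteBuffer buf = new ChunkedByteBuffer(10, 4); // chunks of 4, 4 and 2 bytes
        buf.put(9, (byte) 7);
        System.out.println(buf.get(9)); // 7
    }
}
```

With a chunk size of, say, 1 GB, a 3 GB logical buffer would need three chunks; the per-access division and modulo are the price of getting past the single-array limit.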
[jira] Created: (MAPREDUCE-2311) TestFairScheduler failing on trunk
TestFairScheduler failing on trunk

Key: MAPREDUCE-2311
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2311
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Priority: Blocker
Fix For: 0.22.0

Most of the test cases in this test are failing on trunk. It is unclear for how long, since the contrib tests weren't running while the core tests were failing.
[jira] Created: (MAPREDUCE-2312) Better error handling in RaidShell
Better error handling in RaidShell

Key: MAPREDUCE-2312
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2312
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Ramkumar Vadali
Assignee: Ramkumar Vadali
Priority: Minor

If there is an error trying to find the parity information for a corrupt file, RaidShell should print it as corrupt, instead of bailing.
[jira] Updated: (MAPREDUCE-2311) TestFairScheduler failing on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-2311:

Attachment: MAPREDUCE-2311.txt

TestFairScheduler failing on trunk

Key: MAPREDUCE-2311
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2311
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Scott Chen
Priority: Blocker
Fix For: 0.22.0
Attachments: MAPREDUCE-2311.txt
[jira] Assigned: (MAPREDUCE-2311) TestFairScheduler failing on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen reassigned MAPREDUCE-2311:

Assignee: Scott Chen

TestFairScheduler failing on trunk

Key: MAPREDUCE-2311
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2311
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Scott Chen
Priority: Blocker
Fix For: 0.22.0
Attachments: MAPREDUCE-2311.txt
[jira] Commented: (MAPREDUCE-2311) TestFairScheduler failing on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992306#comment-12992306 ]

Scott Chen commented on MAPREDUCE-2311:

Sorry. This was my bad.

TestFairScheduler failing on trunk

Key: MAPREDUCE-2311
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2311
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Scott Chen
Priority: Blocker
Fix For: 0.22.0
Attachments: MAPREDUCE-2311.txt
[jira] Commented: (MAPREDUCE-2311) TestFairScheduler failing on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992307#comment-12992307 ]

Scott Chen commented on MAPREDUCE-2311:

Sorry. This was my bad.

TestFairScheduler failing on trunk

Key: MAPREDUCE-2311
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2311
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Scott Chen
Priority: Blocker
Fix For: 0.22.0
Attachments: MAPREDUCE-2311.txt
[jira] Updated: (MAPREDUCE-2311) TestFairScheduler failing on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-2311:

Attachment: (was: MAPREDUCE-2311.txt)

TestFairScheduler failing on trunk

Key: MAPREDUCE-2311
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2311
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Scott Chen
Priority: Blocker
Fix For: 0.22.0
Attachments: MAPREDUCE-2311.txt
[jira] Updated: (MAPREDUCE-2311) TestFairScheduler failing on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-2311:

Attachment: MAPREDUCE-2311.txt

TestFairScheduler failing on trunk

Key: MAPREDUCE-2311
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2311
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Scott Chen
Priority: Blocker
Fix For: 0.22.0
Attachments: MAPREDUCE-2311.txt
[jira] Commented: (MAPREDUCE-2311) TestFairScheduler failing on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992329#comment-12992329 ]

Priyo Mustafi commented on MAPREDUCE-2311:

Hi Scott, thanks for the patch. Can you explain the code change in the FairScheduler updateRunnability method?

TestFairScheduler failing on trunk

Key: MAPREDUCE-2311
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2311
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Scott Chen
Priority: Blocker
Fix For: 0.22.0
Attachments: MAPREDUCE-2311.txt
[jira] Updated: (MAPREDUCE-2074) Task should fail when symlink creation fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyo Mustafi updated MAPREDUCE-2074:

Attachment: MAPREDUCE-2074.txt

Fixed a test failure

Task should fail when symlink creation fail

Key: MAPREDUCE-2074
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2074
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: distributed-cache
Affects Versions: 0.20.2
Reporter: Koji Noguchi
Assignee: Priyo Mustafi
Priority: Minor
Fix For: 0.22.0
Attachments: MAPREDUCE-2074.txt, MAPREDUCE-2074.txt, MAPREDUCE-2074.txt

If I pass an invalid symlink as -Dmapred.cache.files=/user/knoguchi/onerecord.txt#abc/abc, the task only reports a WARN and goes on.

{noformat}
2010-09-16 21:38:49,782 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /0/tmp/mapred-local/taskTracker/knoguchi/distcache/-5031501808205559510_-128488332_1354038698/abc-nn1.def.com/user/knoguchi/onerecord.txt <- /0/tmp/mapred-local/taskTracker/knoguchi/jobcache/job_201008310107_15105/attempt_201008310107_15105_m_00_0/work/./abc/abc
2010-09-16 21:38:49,789 WARN org.apache.hadoop.mapred.TaskRunner: Failed to create symlink: /0/tmp/mapred-local/taskTracker/knoguchi/distcache/-5031501808205559510_-128488332_1354038698/abc-nn1.def.com/user/knoguchi/onerecord.txt <- /0/tmp/mapred-local/taskTracker/knoguchi/jobcache/job_201008310107_15105/attempt_201008310107_15105_m_00_0/work/./abc/abc
{noformat}

I believe we should fail the task at this point.
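The behaviour requested here, failing instead of warning, can be sketched in plain Java. This is not the attached patch and does not use Hadoop's own symlink helper; it swaps in java.nio.file.Files.createSymbolicLink (Java 7+) purely for illustration, and StrictSymlink and createOrFail are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: propagate a symlink failure instead of logging a warning.
// StrictSymlink and createOrFail are hypothetical names, not Hadoop APIs.
class StrictSymlink {
    /** Creates link -> target, or throws so that the caller can fail the task. */
    static void createOrFail(Path target, Path link) throws IOException {
        try {
            Files.createSymbolicLink(link, target);
        } catch (IOException | UnsupportedOperationException e) {
            throw new IOException("Failed to create symlink: " + target + " <- " + link, e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-sketch");
        Path target = Files.createFile(dir.resolve("onerecord.txt"));
        try {
            // The parent directory "abc" does not exist, mirroring the #abc/abc case.
            createOrFail(target, dir.resolve("abc").resolve("abc"));
        } catch (IOException e) {
            System.out.println("task would fail here: " + e.getMessage());
        }
    }
}
```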
[jira] Updated: (MAPREDUCE-2074) Task should fail when symlink creation fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyo Mustafi updated MAPREDUCE-2074:

Status: Patch Available (was: Open)

Task should fail when symlink creation fail

Key: MAPREDUCE-2074
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2074
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: distributed-cache
Affects Versions: 0.20.2
Reporter: Koji Noguchi
Assignee: Priyo Mustafi
Priority: Minor
Fix For: 0.22.0
Attachments: MAPREDUCE-2074.txt, MAPREDUCE-2074.txt, MAPREDUCE-2074.txt
[jira] Updated: (MAPREDUCE-2074) Task should fail when symlink creation fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyo Mustafi updated MAPREDUCE-2074:

Status: Open (was: Patch Available)

Task should fail when symlink creation fail

Key: MAPREDUCE-2074
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2074
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: distributed-cache
Affects Versions: 0.20.2
Reporter: Koji Noguchi
Assignee: Priyo Mustafi
Priority: Minor
Fix For: 0.22.0
Attachments: MAPREDUCE-2074.txt, MAPREDUCE-2074.txt, MAPREDUCE-2074.txt