Fwd: HDFS: file is not distributed after upload
Hi, folks!

I've deployed Hadoop (0.20.203.0rc1) on an 8-node cluster. After uploading a file onto HDFS, I find the file on only one of the nodes instead of being distributed uniformly across all of them. What could be the issue?

$HADOOP_HOME/bin/hadoop dfs -copyFromLocal ../data/rmat-20.0 /user/frolo/input/rmat-20.0

$HADOOP_HOME/bin/hadoop dfs -stat "%b %o %r %n" /user/frolo/input/rmat-*
1220222968 67108864 1 rmat-20.0

$HADOOP_HOME/bin/hadoop dfsadmin -report
Configured Capacity: 2536563998720 (2.31 TB)
Present Capacity: 1642543419392 (1.49 TB)
DFS Remaining: 1641312030720 (1.49 TB)
DFS Used: 1231388672 (1.15 GB)
DFS Used%: 0.07%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 8 (8 total, 0 dead)

Name: 10.10.1.15:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131536928768 (122.5 GB)
DFS Remaining: 185533546496 (172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014

Name: 10.10.1.13:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131533377536 (122.5 GB)
DFS Remaining: 185537097728 (172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.52%
Last contact: Fri Feb 07 12:10:27 MSK 2014

Name: 10.10.1.17:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 120023924736 (111.78 GB)
DFS Remaining: 197046550528 (183.51 GB)
DFS Used%: 0%
DFS Remaining%: 62.15%
Last contact: Fri Feb 07 12:10:27 MSK 2014

Name: 10.10.1.18:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 78510628864 (73.12 GB)
DFS Remaining: 238559846400 (222.18 GB)
DFS Used%: 0%
DFS Remaining%: 75.24%
Last contact: Fri Feb 07 12:10:24 MSK 2014

Name: 10.10.1.14:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131537530880 (122.5 GB)
DFS Remaining: 185532944384 (172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014

Name: 10.10.1.11:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 1231216640 (1.15 GB)
Non DFS Used: 84698116096 (78.88 GB)
DFS Remaining: 231141167104 (215.27 GB)
DFS Used%: 0.39%
DFS Remaining%: 72.9%
Last contact: Fri Feb 07 12:10:24 MSK 2014

Name: 10.10.1.16:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131537494016 (122.5 GB)
DFS Remaining: 185532981248 (172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014

Name: 10.10.1.12:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 84642578432 (78.83 GB)
DFS Remaining: 232427896832 (216.47 GB)
DFS Used%: 0%
DFS Remaining%: 73.3%
Last contact: Fri Feb 07 12:10:27 MSK 2014

Best,
Alex
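For reference, the third field of the -stat output above (%r) is the replication factor, and it reads 1 here, so each block exists as a single replica. Where those replicas actually landed can be verified with fsck, which is part of the stock 0.20-era shell (the path matches the upload above):

  $HADOOP_HOME/bin/hadoop fsck /user/frolo/input/rmat-20.0 -files -blocks -locations

The -locations flag lists the datanode holding each block replica.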
Re: HDFS: file is not distributed after upload
Hi,

0.20.203.0rc1 is a very old version at this point. Why not use a more current release if you're deploying a new cluster?

On to your issue: your configuration XML files (core-site.xml, hdfs-site.xml or mapred-site.xml) most likely have a dfs.replication value set to 1, causing only that many replicas to be written out by default.

On Fri, Feb 7, 2014 at 2:11 PM, Alexander Frolov alexndr.fro...@gmail.com wrote:
[...]

--
Harsh J
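If that is the cause, the fix is a one-line change in hdfs-site.xml. A sketch using the usual default of 3 (the snippet is illustrative, not taken from the poster's actual configs):

  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>

Note that dfs.replication only affects files created after the change; files already in HDFS keep the replication factor they were written with.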
Re: HDFS: file is not distributed after upload
Hi Alex,

You should issue the copyFromLocal command from the namenode, or from any machine that is not a datanode, to get the file distributed. When the HDFS client runs on a datanode, the first replica of each block is written to that local datanode, and with a replication factor of 1 that first replica is the only copy, so the whole file stays on the node you uploaded from.

On Fri, Feb 7, 2014 at 10:53 AM, Harsh J ha...@cloudera.com wrote:
[...]
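For the file that is already in HDFS, the replication factor can also be raised in place with setrep (standard FsShell usage; -w waits until the target replication is reached):

  $HADOOP_HOME/bin/hadoop dfs -setrep -w 3 /user/frolo/input/rmat-20.0

The namenode then schedules the extra replicas on other datanodes, spreading the existing blocks without re-uploading the file.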
Can we avoid restarting of AM when it fails?
Hi,

I have some failure test cases where my Application Master is supposed to fail. But when it fails, it is started again as appID_02. Is there a way for me to avoid the second instance of the Application Master getting started? Is it restarted automatically by the RM after the first one fails?

Thanks,
Kishore
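Yes, the RM restarts a failed AM automatically, up to a configurable number of attempts. Capping it at 1 in yarn-site.xml disables the retry. A sketch, assuming an early-2.x release where the property is named yarn.resourcemanager.am.max-retries (later releases renamed it yarn.resourcemanager.am.max-attempts):

  <property>
    <name>yarn.resourcemanager.am.max-retries</name>
    <value>1</value>
  </property>

An individual application can also lower its own cap through the ApplicationSubmissionContext it passes to the RM at submission time.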
Re: java.lang.OutOfMemoryError: Java heap space
Thanks Park for sharing the above configs.

But I am wondering if the above config changes would make any huge difference in my case. As per my logs, I am very worried about this line:

  INFO org.apache.hadoop.mapred.MapTask: Record too large for in-memory buffer: 644245358 bytes

If I am understanding it properly, one of my records is too large to fit into the in-memory buffer, which is causing the issue. Any of the above changes wouldn't make a huge impact; please correct me if I am taking it totally wrong.

Adding the hadoop user group here as well, hoping for some valuable inputs on the above question. Since I am doing a join on a grouped bag, do you think that might be the cause? But if that is the issue, as far as I understand bags in Pig are spillable, so it shouldn't have given this error. I can't get rid of the group-by; grouping first should ideally improve my join. But if this is the root cause, do you think I should get rid of the group-by? My question in that case would be: what happens if I do the group-by later, after the join? It would result in a much bigger bag, because there would be more records after the join. Am I thinking here correctly?

Regards
Prav

On Fri, Feb 7, 2014 at 3:11 AM, Cheolsoo Park piaozhe...@gmail.com wrote:

Looks like you're running out of space in MapOutputBuffer. Two suggestions:

1) You said that io.sort.mb is already set to 768 MB, but did you try to lower io.sort.spill.percent in order to spill earlier and more often? See page 12:
http://www.slideshare.net/Hadoop_Summit/optimizing-mapreduce-job-performance

2) Can't you increase the parallelism of mappers so that each mapper has a smaller amount of data to handle? Pig determines the number of mappers by total input size / pig.maxCombinedSplitSize (128 MB by default), so you can try lowering pig.maxCombinedSplitSize.

But I admit Pig internal data types are not memory-efficient, and that is an optimization opportunity. Contribute!

On Thu, Feb 6, 2014 at 2:54 PM, praveenesh kumar praveen...@gmail.com wrote:

It's a normal join. I can't use a replicated join, as the data is very large.

Regards
Prav

On Thu, Feb 6, 2014 at 7:52 PM, abhishek abhishek.dod...@gmail.com wrote:

Hi Praveenesh,

Did you use a replicated join in your pig script, or is it a regular join?

Regards
Abhishek

Sent from my iPhone

On Feb 6, 2014, at 11:25 AM, praveenesh kumar praveen...@gmail.com wrote:

Hi all,

I am running a Pig script which runs fine for small data, but when I scale the data up I get the following error at my map stage. Please refer to the map logs below. My Pig script does a group-by first, followed by a join on the grouped data. Any clues on where I should look or how I should deal with this situation? I don't want to just go by increasing the heap space; my map JVM heap is already 3 GB with io.sort.mb = 768 MB.

2014-02-06 19:15:12,243 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-02-06 19:15:15,025 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2014-02-06 19:15:15,123 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2bd9e282
2014-02-06 19:15:15,546 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 768
2014-02-06 19:15:19,846 INFO org.apache.hadoop.mapred.MapTask: data buffer = 612032832/644245088
2014-02-06 19:15:19,846 INFO org.apache.hadoop.mapred.MapTask: record buffer = 9563013/10066330
2014-02-06 19:15:20,037 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2014-02-06 19:15:21,083 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: Created input record counter: Input records from _1_tmp1327641329
2014-02-06 19:15:52,894 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full = true
2014-02-06 19:15:52,895 INFO org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 611949600; bufvoid = 644245088
2014-02-06 19:15:52,895 INFO org.apache.hadoop.mapred.MapTask: kvstart = 0; kvend = 576; length = 10066330
2014-02-06 19:16:06,182 INFO org.apache.hadoop.mapred.MapTask: Finished spill 0
2014-02-06 19:16:16,169 INFO org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call - Collection threshold init = 328728576(321024K) used = 1175055104(1147514K) committed = 1770848256(1729344K) max = 2097152000(2048000K)
2014-02-06 19:16:20,446 INFO org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate of 308540402 bytes from 1 objects. init = 328728576(321024K) used = 1175055104(1147514K) committed
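Both knobs Cheolsoo mentions can be set from within the Pig script itself via set, which passes them into the job configuration. A sketch with illustrative, untuned values:

  -- spill at 60% of io.sort.mb instead of the default 80%
  set io.sort.spill.percent 0.60;
  -- cap combined splits at 64 MB (default 128 MB) to get more mappers
  set pig.maxCombinedSplitSize 67108864;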
meaning or usage of reserved containers in YARN Capacity scheduler
Hi all,

I have a question about reserved containers in the YARN capacity scheduler. Even after reading the source code and the related documentation, it is not very clear to me: what is the purpose or practical usage of a reserved container?

thx.
Re: java.lang.OutOfMemoryError: Java heap space
Hi Park,

Your explanation makes perfect sense in my case. Thanks for explaining what is happening behind the scenes. I am wondering whether you used plain Java compression/decompression, or whether there is a UDF already available for this, or some kind of property we can enable to tell Pig to compress bags before spilling.

Regards
Prav

On Fri, Feb 7, 2014 at 4:37 PM, Cheolsoo Park piaozhe...@gmail.com wrote:
[...]
Re: java.lang.OutOfMemoryError: Java heap space
Hi Prav,

You're thinking correctly, and it's true that Pig bags are spillable. However, spilling is no magic, meaning you can still run into OOM with huge bags like you have here.

Pig runs the Spillable Memory Manager (SMM) in a separate thread. When spilling is triggered, the SMM locks the bags that it's trying to spill to disk. After the spilling is finished, GC frees up the memory. The problem is that more bags may be loaded into memory while the spilling is in progress. Now the JVM triggers GC, but GC cannot free up memory because the SMM is locking the bags, resulting in an OOM error. This happens quite often.

It sounds like you do the group-by to reduce the number of rows before the join and don't immediately run any aggregation function on the grouped bags. If that's the case, can you compress those bags? For example, you could add a foreach after the group-by and run a UDF that compresses a bag and returns it as a bytearray. From there, you're moving around small blobs rather than big bags. Of course, you will need to decompress them when you restore data out of those bags at some point. This trick saved me several times in the past, particularly when I dealt with bags of large chararrays. Just a thought. Hope this is helpful.

Thanks,
Cheolsoo

On Fri, Feb 7, 2014 at 7:37 AM, praveenesh kumar praveen...@gmail.com wrote:
[...]
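A minimal sketch of the compress-a-bag trick described above. The class name and packaging are hypothetical (this is not an existing built-in Pig UDF), and it assumes the Pig 0.x Java UDF API:

  import java.io.ByteArrayOutputStream;
  import java.io.DataOutputStream;
  import java.io.IOException;
  import java.util.zip.GZIPOutputStream;

  import org.apache.pig.EvalFunc;
  import org.apache.pig.data.DataBag;
  import org.apache.pig.data.DataByteArray;
  import org.apache.pig.data.Tuple;

  // Hypothetical UDF: gzip a bag into a single bytearray so downstream
  // operators move a small blob instead of a large spillable bag.
  public class CompressBag extends EvalFunc<DataByteArray> {
      @Override
      public DataByteArray exec(Tuple input) throws IOException {
          if (input == null || input.size() == 0 || input.get(0) == null) {
              return null;
          }
          DataBag bag = (DataBag) input.get(0);
          ByteArrayOutputStream buffer = new ByteArrayOutputStream();
          DataOutputStream out = new DataOutputStream(new GZIPOutputStream(buffer));
          bag.write(out);   // DataBag is a Writable and serializes itself
          out.close();      // flushes and finishes the gzip stream
          return new DataByteArray(buffer.toByteArray());
      }
  }

Used from a script along these lines (names hypothetical), with a matching decompressing UDF applied when the data is restored after the join:

  DEFINE CompressBag com.example.pig.CompressBag();
  grouped = GROUP data BY key;
  packed  = FOREACH grouped GENERATE group, CompressBag(data) AS blob;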
Re: Problems building hadoop 2.2.0 from source
Thanks, I built 2.3 yesterday (checked out from the link suggested in an earlier post of this thread) without problems, apart from the VM running out of memory, which was fixed with:

  export MAVEN_OPTS=-Xmx2048m

At least, I got a message saying the build was successful. Thanks for your help.

On 8 February 2014 10:53, Ted Yu yuzhih...@gmail.com wrote:

In the output for a passing test, I saw:

  2014-02-06 16:48:49,722 ERROR [Thread[Thread-71,5,main]] delegation.AbstractDelegationTokenSecretManager (AbstractDelegationTokenSecretManager.java:run(557)) - InterruptedExcpetion recieved for ExpiredTokenRemover thread java.lang.InterruptedException: sleep interrupted

meaning the above was not critical. branch-2.3 is receiving attention now; discovering a test failure there would be more helpful.

Cheers

On Thu, Feb 6, 2014 at 9:25 PM, Christopher Thomas christophermauricetho...@gmail.com wrote:

I guess the ERROR lines in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports/org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService-output.txt led me to believe that the problem was with TestMRJobsWithHistoryService. If that's not the case, then what do these messages indicate? As I say, I am a complete novice and the learning curve is very steep.

On 7 February 2014 14:47, Ted Yu yuzhih...@gmail.com wrote:

I checked out source code from http://svn.apache.org/repos/asf/hadoop/common/branches/branch-2.3 and it builds. From TestMRJobsWithHistoryService.txt, the test passed. What led to this test being singled out among the 454 tests?

Thanks

On Thu, Feb 6, 2014 at 7:26 PM, Christopher Thomas christophermauricetho...@gmail.com wrote:

Yes, well, I tried 2.3, but I found a number of problems building it. I had to resort to manually applying patches that I found in the bug-tracking lists, which did not seem to have made it into all branches. So for the moment I am sticking with 2.2.0, which is advertised as being stable. I apologise for the confusion. Here are the contents of ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports/org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService.txt, though perhaps not that illuminating:

  ---
  Test set: org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
  ---
  Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 52.669 sec

On 7 February 2014 14:12, Ted Yu yuzhih...@gmail.com wrote:

The output was from hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports/org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService-output.txt. Can you show us the contents of ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports/org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService.txt?

BTW the hadoop 2.3 release candidate is coming up. You may consider paying more attention to hadoop 2.3.

Cheers

On Thu, Feb 6, 2014 at 5:33 PM, Christopher Thomas christophermauricetho...@gmail.com wrote:

I included the last part of hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports/org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService.txt in the second half of my initial posting, calling it the output from TestMRJobsWithHistoryService. Sloppy terminology, I know; sorry if I wasn't very clear.

Regards
Chris

On 7 February 2014 11:53, Ted Yu yuzhih...@gmail.com wrote:

There isn't a System.exit call in TestMRJobsWithHistoryService.java. What did hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports/org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService.txt say?

Cheers

On Thu, Feb 6, 2014 at 4:41 PM, Christopher Thomas christophermauricetho...@gmail.com wrote:

Hi, I am a complete beginner to Hadoop, trying to build 2.2.0 from source on a MacBook Pro running OS X Mavericks. I am following the 'instructions' at http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html such as they are. I get the following test failure:

  Forking command line: /bin/sh -c cd /Users/hadoop/hadoop-2.2.0-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java -Xmx1024m -XX:+HeapDumpOnOutOfMemoryError -jar /Users/hadoop/hadoop-2.2.0-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire/surefirebooter1837947962445626736.jar
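For anyone who just wants a working distribution rather than a full test run, Hadoop's BUILDING.txt describes a build that skips the tests entirely. Combined with the memory setting mentioned above, a typical invocation from the source root looks like:

  export MAVEN_OPTS=-Xmx2048m
  mvn package -Pdist -DskipTests -Dtar

The resulting tarball typically lands under hadoop-dist/target/.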