Re: hadoop/hive data loading
Hi,

What is the meaning of 'union' here? Is there a Hadoop job with one (or a few) reducers that combines all the data together? Have you tried external (dynamic) partitions for combining data?

-amit

----- Original Message -----
From: hadoopman hadoop...@gmail.com
To: common-user@hadoop.apache.org
Sent: Tuesday, 10 May 2011 11:26 PM
Subject: hadoop/hive data loading

When we load data into Hive, we sometimes run into situations where the load fails and the logs show a heap out-of-memory error. If I load just a few days (or months) of data, there's no problem. But if I try to load two years of data, for example, I've seen it fail -- not with every feed, but with certain ones. Sometimes I've been able to split the data and get it to load.

One type of feed I'm working on is Apache web server access logs. Generally it works, but there are times when I need to load more than a few months of data and I get heap out-of-memory errors in the task logs.

How do people generally load their data into Hive? We have a process where we first copy the data to HDFS, then run a staging process from there to get it into Hive. Once that completes, we perform a UNION ALL and then overwrite the table partition. It's usually during the UNION ALL stage that we see these errors appear.

Also, is there a log that tells you which file the load fails on? I can see which task/job failed, but I can't find which file it's complaining about. I figure that might help a bit.

Thanks!
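For reference, the dynamic-partition route looks roughly like the sketch below (table and column names are hypothetical, and it assumes a Hive version with dynamic partition support, 0.6 or later). It replaces the UNION ALL plus whole-partition overwrite with a single insert that routes rows to partitions by value:

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Route staged rows straight into per-date partitions instead of
-- running UNION ALL and overwriting an entire partition in one job.
INSERT OVERWRITE TABLE access_logs PARTITION (log_date)
SELECT host, request, status, log_date
FROM access_logs_staging;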
how do I keep reduce tmp files in mapred.local.dir
hello all,

I am trying to keep the output and the copied files of reduce tasks after a job finishes. I commented out a lot of the remove/delete code in TaskTracker, Task, etc., but I still cannot keep them. Any idea?
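For what it's worth, the 0.20-era JobConf already exposes knobs for keeping task-attempt files, so patching the daemons may not be necessary. A minimal sketch, with a hypothetical driver class:

import org.apache.hadoop.mapred.JobConf;

public class KeepTaskFiles {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // Keep the intermediate files of failed task attempts on local disk.
    conf.setKeepFailedTaskFiles(true);
    // Keep the files of *every* attempt whose ID matches this regex,
    // successful ones included.
    conf.setKeepTaskFilesPattern(".*");
    // ... set input/output paths, mapper, reducer, and submit as usual ...
  }
}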
Bad connection to FS. command aborted
Hi all!

I have been trying to figure out why I'm getting this error. All that I did was:

1) Use a single-node cluster
2) Make some modifications in the core (in some MapRed modules); it compiled successfully
3) Try bin/start-dfs.sh alone

All the required daemons (NN and DN) are up, and the NameNode and DataNode logs aren't showing any errors/exceptions. The only interesting thing I found in the NameNode logs was:

WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 10.72.147.109:40048 got version 94 expected version 3

Someone please help me out with this!

Matthew
Re: Bad connection to FS. command aborted
Matthew - this normally means there's a version mismatch of Hadoop somewhere in your pipe. I'd suggest making sure that all your hadoop-core-x-y-z.jar files are the same, across the board.

EF

On Wed, May 11, 2011 at 7:26 AM, Matthew John tmatthewjohn1...@gmail.com wrote:
> [quoted message snipped]
Differentiate Reducer or Combiner
Hi,

I am using the same class for both the combiner and the reducer, since the code is almost the same. However, I just realized I need to do something slightly different when running as the reducer. Is there an easy way to know whether the process is running as a reducer or as a combiner? I'm hoping that I'm missing an obvious solution here.

Regards,
Arv
RE: Bad connection to FS. command aborted
The Hadoop IPC protocols are version-specific. That is done to prevent an older version from talking to a newer one; even if nothing has changed in the internal protocols, the version check is enforced. Make sure the new hadoop-core.jar from your modification is on the classpath used by the hadoop shell script.

Bill

-----Original Message-----
From: Matthew John [mailto:tmatthewjohn1...@gmail.com]
Sent: Wednesday, May 11, 2011 9:27 AM
To: common-user
Subject: Bad connection to FS. command aborted

> [quoted message snipped]
Re: Bad connection to FS. command aborted
I did an 'ant jar' after modifying two files in the mapred module. From what I understand, this creates a hadoop-*-core.jar in the build folder, and I assumed that jar would be used from then on for any execution. So how can this be a problem if I am running a single-node cluster? A version mismatch with whom?

On Wed, May 11, 2011 at 7:07 PM, Habermaas, William william.haberm...@fatwire.com wrote:
> [quoted message snipped]
RE: Bad connection to FS. command aborted
If the hadoop script is picking up a different hadoop-core jar, then the classes that IPC to the NN will be running a different version.

Bill

-----Original Message-----
From: Matthew John [mailto:tmatthewjohn1...@gmail.com]
Sent: Wednesday, May 11, 2011 9:41 AM
To: common-user@hadoop.apache.org
Subject: Re: Bad connection to FS. command aborted

> [quoted message snipped]
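A quick sanity check for stray core jars (the paths assume the stock 0.20 source layout; if more than one version shows up here, that is the mismatch):

ls -l $HADOOP_HOME/hadoop-*-core.jar $HADOOP_HOME/build/hadoop-*-core.jar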
Re: Differentiate Reducer or Combiner
Hello Arv,

On Wed, May 11, 2011 at 7:04 PM, Arv Mistry a...@kindsight.net wrote:
> Is there an easy way to know whether the process is running as a reducer
> or combiner? I'm hoping that I'm missing an obvious solution here.

There's no API way of telling whether a Reducer class is running in combiner or reducer mode. The recommended solution would be to use different classes for the combiner and the reducer (perhaps derived classes, to keep with the OOP philosophy, if you have to).

--
Harsh J
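A minimal sketch of that suggestion, using hypothetical word-count-style classes with the old mapred API: the combiner holds the shared logic and the reducer subclasses it to add the reducer-only behavior.

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Shared aggregation logic lives in the combiner.
public class SumCombiner extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {
  public void reduce(Text key, Iterator<IntWritable> values,
      OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
  }
}

// The reducer specializes the combiner.
class SumReducer extends SumCombiner {
  public void reduce(Text key, Iterator<IntWritable> values,
      OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    // Reducer-only behavior goes here, then delegate to the shared logic.
    super.reduce(key, values, output, reporter);
  }
}

Wiring them up is then conf.setCombinerClass(SumCombiner.class) and conf.setReducerClass(SumReducer.class), so each role unambiguously gets its own class.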
Re: Bad connection to FS. command aborted
I did the following:

1) Modified bin/hadoop:

for f in $HADOOP_HOME/hadoop-*-core.jar; do
  CLASSPATH=${CLASSPATH}:$f;
done

modified to:

for f in $HADOOP_HOME/build/hadoop-*-core.jar; do
  CLASSPATH=${CLASSPATH}:$f;
done

This takes care of the version-mismatch errors that used to come.

2) Then I got an error related to "Error register getProtocolVersion". I changed my hadoop-metrics back to NullContext settings.

3) I am still having a problem with the IPC server, and the same error: Bad connection to FS. Command aborted.

Below is my NameNode log. Someone please help me out, since I am not very familiar with the IPC server mechanisms! :(

2011-05-11 19:34:30,114 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = mattHDFS1set1/10.72.147.109
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.3-dev
STARTUP_MSG:   build = -r ; compiled by 'matthew' on Wed May 11 19:12:36 IST 2011
************************************************************/
2011-05-11 19:34:30,253 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
2011-05-11 19:34:30,261 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: mattHDFS1set1/10.72.147.109:54310
2011-05-11 19:34:30,271 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-05-11 19:34:30,272 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-05-11 19:34:30,385 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=matthew,matthew,adm,dialout,cdrom,plugdev,lpadmin,admin,sambashare
2011-05-11 19:34:30,385 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-05-11 19:34:30,385 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-05-11 19:34:30,395 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-05-11 19:34:30,397 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-05-11 19:34:30,449 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
2011-05-11 19:34:30,455 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
2011-05-11 19:34:30,455 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 97 loaded in 0 seconds.
2011-05-11 19:34:30,455 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /mount_xfs/tmpdir/hadoop-matthew/dfs/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
2011-05-11 19:34:30,488 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 97 saved in 0 seconds.
2011-05-11 19:34:30,535 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 204 msecs
2011-05-11 19:34:30,537 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of blocks = 0
2011-05-11 19:34:30,537 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of invalid blocks = 0
2011-05-11 19:34:30,537 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of under-replicated blocks = 0
2011-05-11 19:34:30,537 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of over-replicated blocks = 0
2011-05-11 19:34:30,537 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 0 secs.
2011-05-11 19:34:30,537 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
2011-05-11 19:34:30,538 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
2011-05-11 19:34:36,080 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-05-11 19:34:36,163 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
2011-05-11 19:34:36,165 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
2011-05-11 19:34:36,165 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
2011-05-11 19:34:36,165 INFO org.mortbay.log: jetty-6.1.14
2011-05-11 19:34:36,594 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50070
2011-05-11 19:34:36,594 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070
2011-05-11 19:34:36,604 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-05-11 19:34:36,609 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54310: starting
2011-05-11 19:34:36,619 INFO
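One quick way to confirm which build the launcher script is actually running: 'hadoop version' prints the version and build string compiled into whichever core jar won on the classpath, so it should show the 0.20.3-dev build if the modified jar is being picked up:

bin/hadoop version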
Suggestions for swapping issue
Hello Hadoop Gurus,

We are running a 4-node cluster. We just upgraded the RAM to 48 GB per node and have allocated around 33-34 GB per node for Hadoop processes, leaving the remaining 14-15 GB for the OS and as a buffer. There are no other processes running on these nodes.

Most of the lighter jobs run successfully, but one big job is destabilizing the cluster: one node starts swapping, runs out of swap space, and goes offline. We tracked the processes on that node and noticed that it ends up with more hadoop-java processes than expected. The other three nodes were running 10 or 11 processes, while this node ends up with 36. After killing the job we find these processes still show up, and we have to kill them manually.

We have tried reducing swappiness to 6 but saw the same results. It also looks like Hadoop stays well within its allocated memory limits and still starts swapping.

Some other suggestions we have seen are:
1) Increase the swap size. The current size is 6 GB. The most quoted size is 'tons of swap', but I'm not sure what that translates to in numbers. Should it be 16 or 24 GB?
2) Increase the overcommit ratio. Not sure if this helps, as a few blog comments mentioned it didn't.

Any other Hadoop or Linux config suggestions are welcome. Thanks.

-Adi
Re: Suggestions for swapping issue
You have to do the math... If you have 2 GB per mapper and run 10 mappers per node, that means 20 GB of memory. Then you have the TT and DN running, which also take memory... What did you set as the number of mappers/reducers per node? What do you see in Ganglia, or when you run top?

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 11, 2011, at 12:31 PM, Adi adi.pan...@gmail.com wrote:
> [quoted message snipped]
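To make that math concrete (the numbers are purely illustrative):

  10 child JVMs x 2 GB heap              = 20 GB
  + ~0.25 GB non-heap JVM overhead each  = ~2.5 GB
  + DataNode + TaskTracker, ~1 GB each   =  2 GB
  ------------------------------------------------
  total                                  ~ 24.5 GB committed before the OS page cache gets anything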
Re: Suggestions for swapping issue
By our calculations Hadoop should not exceed 70% of memory. Allocated per node: 48 map slots (24 GB), 12 reduce slots (6 GB), and 1 GB each for the DataNode/TaskTracker and one JobTracker, totalling 33-34 GB. The queues are capped at using only 90% of their allocated capacity, so generally 10% of the slots are always kept free.

The cluster was running 33 mappers and 1 reducer in total, so around 8-9 mappers per node with a 3 GB max limit, and they were utilizing around 2 GB each. Top was showing 100% memory utilized, which our sysadmin says is OK, as Linux uses the memory for file caching when processes aren't using it. There was no swapping on 3 nodes; then node4 just started swapping after the number of processes shot up unexpectedly.

The main mystery is the excess number of processes on the node that went down: 36 as opposed to the expected 11. The other 3 nodes were executing their mappers without any memory/swap issues.

-Adi

On Wed, May 11, 2011 at 1:40 PM, Michel Segel michael_se...@hotmail.com wrote:
> [quoted message snipped]
Re: Suggestions for swapping issue
How is it that 36 processes are not expected if you have configured 48 + 12 = 60 slots on the machine?

On Wed, May 11, 2011 at 11:11 AM, Adi adi.pan...@gmail.com wrote:
> [quoted message snipped]
Re: Suggestions for swapping issue
Actually it is 56 + 12 = 68 slots per node (slots, not mappers/reducers). With this job's configuration it was using 6 slots per mapper (resulting in 8-9 mappers) and 6 slots for the reducer (1 reducer). There was a mistake in my earlier mail: the map slots are 56, not 48, but the total memory allocation for Hadoop still comes to around 35-36 GB.

-Adi

On Wed, May 11, 2011 at 2:16 PM, Ted Dunning tdunn...@maprtech.com wrote:
> [quoted message snipped]
Re: Suggestions for swapping issue
On May 11, 2011, at 11:11 AM, Adi wrote:

> By our calculations hadoop should not exceed 70% of memory. Allocated per
> node - 48 map slots (24 GB), 12 reduce slots (6 GB), 1 GB each for
> DataNode/TaskTracker and one JobTracker, totalling 33/34 GB.

It sounds like you are only taking the heap size into consideration. There is more memory allocated than just the heap...

> The queues are capped at using only 90% of capacity allocated so generally
> 10% of slots are always kept free.

But that doesn't translate into how free the nodes are, as you've discovered. Individual nodes should be configured based on the assumption that *all* slots will be used.

> The cluster was running total 33 mappers and 1 reducer [...] Top was
> showing 100% memory utilized. Which our sys admin says is ok as the memory
> is used for file caching by linux if the processes are not using it.

Well, yes and no. What is the breakdown of that 100%? Is any of it actually allocated to the buffer cache, or is it all user space?

> No swapping on 3 nodes. Then node4 just started swapping after the number
> of processes shot up unexpectedly. The main mystery are these excess number
> of processes on the node which went down. 36 as opposed to expected 11.

Likely speculative execution or something else. But again: don't build machines on the assumption that only x% of the slots will get used. There is no guarantee in the system that free slots will be balanced across all nodes... especially when you take node failure into consideration.

> [rest of quoted thread snipped]
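As a purely hypothetical mapred-site.xml sizing for a 48 GB node, assuming every slot can fill at once: 14 map + 4 reduce slots at ~2 GB heap each is ~36 GB of child heap in the worst case, leaving headroom for DN/TT, non-heap JVM overhead, and the OS cache.

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>14</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>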
Re: Suggestions for swapping issue
Thanks for your comments, Allen. I have added mine inline.

> It sounds like you are only taking the heap size into consideration. There
> is more memory allocated than just the heap...

Our heap size for mappers is half of the memory allocated to each mapper. But you're right, we should account for more for the DN/TT/JT.

> Well, yes and no. What is the breakdown of that 100%? Is any of it actually
> allocated to the buffer cache, or is it all user space?

Here's the breakdown:

Tasks: 260 total, 1 running, 259 sleeping, 0 stopped, 0 zombie
Cpu(s): 12.7%us, 1.2%sy, 0.0%ni, 83.6%id, 2.2%wa, 0.0%hi, 0.3%si, 0.0%st
Mem:  49450772k total, 49143288k used, 307484k free, 16912k buffers
Swap:  5242872k total, 248k used, 5242624k free, 7076564k cached

> Likely speculative execution or something else. But again: don't build
> machines on the assumption that only x% of the slots will get used.

I will look into this. Meanwhile, I am now running the job again with three nodes and observing that completed tasks still show up in the process tree. In the earlier run I was allocating the max heap size as the initial heap size, which I have disabled now, so my hunch is that this job will run out of memory a little later. Hadoop is showing 8/10 tasks running per node, and each node right now has 25-30 java processes. I grepped for a completed attempt and it still shows up in the ps listing:

Non-Running Tasks
Task Attempts                           Status
attempt_201104280947_1266_m_33_0        SUCCEEDED

$ ps -ef | grep hadoop | grep attempt_201104280947_1266_m_33_0
hadoop 17315 5018 26 16:38 ? 00:13:39 /usr/java/jre1.6.0_21/bin/java -Djava.library.path=/usr/local/hadoop/bin/../lib/native/Linux-amd64-64:/usr/local/hadoop/cluster/mapred/local/taskTracker/auser/jobcache/job_201104280947_1266/attempt_201104280947_1266_m_33_0/work -Xmx1536M -Djava.io.tmpdir=/usr/local/hadoop/cluster/mapred/local/taskTracker/auser/jobcache/job_201104280947_1266/attempt_201104280947_1266_m_33_0/work/tmp -classpath

Any ideas?

-Adi
is it possible to concatenate output files under many reducers?
hi, all.

I have 60 reducers, which generate 60 output files, from output-r-00000 to output-r-00059. In this situation, I want to control the number of output files. For example, is it possible to concatenate all the output files down to 10, from output-r-00000 to output-r-00009?

thanks

--
Junyoung Kim (juneng...@gmail.com)
Re: is it possible to concatenate output files under many reducers?
Short, blind answer: you could just run 10 reducers. Otherwise, you'll have to run another job in which each mapper picks up a few files and merges them.

But having 60 files shouldn't really be a problem if they are sufficiently large (at least 80% of a block size, perhaps -- you can tune the number of reducers to achieve this).

On Thu, May 12, 2011 at 6:14 AM, Jun Young Kim juneng...@gmail.com wrote:
> [quoted message snipped]

--
Harsh J
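A minimal sketch of the first option, with a hypothetical driver class:

import org.apache.hadoop.mapred.JobConf;

public class TenOutputFiles {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // 10 reduce tasks -> 10 output files, one per reducer.
    conf.setNumReduceTasks(10);
    // ... configure mapper/reducer/paths and submit as usual ...
  }
}

If the files only need merging when they leave HDFS, 'hadoop fs -getmerge <hdfs-dir> <local-file>' concatenates the per-reducer files to the local filesystem without running another job.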