Re: No locks available
Edward Capriolo wrote: On Mon, Jan 17, 2011 at 8:13 AM, Adarsh Sharma wrote: Harsh J wrote: Could you re-check your permissions on the $(dfs.data.dir)s for your failing DataNode versus the user that runs it? On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma wrote: Can I know why it occurs? Thanks Harsh, I know this issue and I cross-checked the permissions of all dirs ( dfs.name.dir, dfs.data.dir, mapred.local.dir ) several times. They are 755 and are owned by the hadoop user and group. I found that the failed datanode is unable to create 5 files in dfs.data.dir, whereas a successful datanode creates the following files: current tmp storage in_use.lock Does that help? Thanks. No locks available can mean that you are trying to use Hadoop on a filesystem that does not support file-level locking. Are you trying to run your name node storage in NFS space? I am sorry, but my Namenode is on a separate machine outside the Cloud. The path is /home/hadoop/project/hadoop-0.20.2/name and it is running properly. I find it difficult because I followed the same steps in the other 2 VMs and they are running. How could I debug this one exceptional case where it is failing? Thanks & Regards Adarsh Sharma
BZip2Codec memory usage for map output compression?
Hi, How can memory usage be calculated in the case of BZip2Codec for map output? Cheers, Attila
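No answer appeared on the thread. For reference, enabling BZip2 compression of intermediate map output in Hadoop 0.20 uses the properties below (names from the 0.20 defaults; worth double-checking for your version):

```xml
<!-- mapred-site.xml: compress intermediate map output with bzip2 -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
```

As for the memory question: Hadoop's BZip2Codec is a pure-Java bzip2 implementation, and per the upstream bzip2 documentation, compression needs roughly 400 kB plus 8x the block size (block sizes run 100 kB to 900 kB) per open stream, with decompression needing considerably less. Treat that as a rule-of-thumb estimate, not a measured figure for the Hadoop port.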
Re: Question about Hadoop Default FCFS Job Scheduler
OK, I got your point: you mean why don't we put the for loop into obtainNewLocalMapTask(). Yes, I think we could do that, but the result would be the same as the current code, and I don't think it would bring many performance benefits; personally, I like the current style, :-) Best, Nan On Tue, Jan 18, 2011 at 12:24 AM, He Chen wrote: > Hi Nan, > > Thank you for the reply. I understand what you mean. What I concern is > inside the "obtainNewLocalMapTask(...)" method, it only assigns one tasks a > time. > > Now I understand why it only assigns one task at a time. It is because the > outside loop: > > for (i = 0; i < MapperCapacity; ++i){ > > (..) > > } > > I mean why this loop exists here. Why does the scheduler use this type of > loop. It imposes overhead to the task assigning process if only assign one > task at a time. It is obviously that a node can be assigned all available > local tasks it can in one "afford obtainNewLocalMapTask(..)" method > call. > > Bests > > Chen > > On Mon, Jan 17, 2011 at 8:28 AM, Nan Zhu wrote: > > > Hi, Chen > > > > How is it going recently? > > > > Actually I think you misundertand the code in assignTasks() in > > JobQueueTaskScheduler.java, see the following structure of the > interesting > > codes: > > > > //I'm sorry, I hacked the code so much, the name of the variables may be > > different from the original version > > > > for (i = 0; i < MapperCapacity; ++i){ > > ... > > for (JobInProgress job:jobQueue){ > > //try to shedule a node-local or rack-local map tasks > > //here is the interesting place > > t = job.obtainNewLocalMapTask(...); > > if (t != null){ > > ... 
> > break;//the break statement here will make the control flow back > > to "for (job:jobQueue)" which means that it will restart map tasks > > selection > > procedure from the first job, so , it is actually schedule all of the > first > > job's local mappers first until the map slots are full > > } > > } > > } > > > > BTW, we can only schedule a reduce task in a single heartbeat > > > > > > > > Best, > > Nan > > On Sat, Jan 15, 2011 at 1:45 PM, He Chen wrote: > > > > > Hey all > > > > > > Why does the FCFS scheduler only let a node chooses one task at a time > in > > > one job? In order to increase the data locality, > > > it is reasonable to let a node to choose all its local tasks (if it > can) > > > from a job at a time. > > > > > > Any reply will be appreciated. > > > > > > Thanks > > > > > > Chen > > > > > >
Re: Question about Hadoop Default FCFS Job Scheduler
Hi Chen, Actually it is not one task each time; see this statement: assignedTasks.add(t); assignedTasks is the return value of this method, and it is a collection of selected tasks; it will contain multiple tasks if the candidates are there. Best, Nan On Tue, Jan 18, 2011 at 12:24 AM, He Chen wrote: > Hi Nan, > > Thank you for the reply. I understand what you mean. What I concern is > inside the "obtainNewLocalMapTask(...)" method, it only assigns one tasks a > time. > > Now I understand why it only assigns one task at a time. It is because the > outside loop: > > for (i = 0; i < MapperCapacity; ++i){ > > (..) > > } > > I mean why this loop exists here. Why does the scheduler use this type of > loop. It imposes overhead to the task assigning process if only assign one > task at a time. It is obviously that a node can be assigned all available > local tasks it can in one "afford obtainNewLocalMapTask(..)" method > call. > > Bests > > Chen > > On Mon, Jan 17, 2011 at 8:28 AM, Nan Zhu wrote: > > > Hi, Chen > > > > How is it going recently? > > > > Actually I think you misundertand the code in assignTasks() in > > JobQueueTaskScheduler.java, see the following structure of the > interesting > > codes: > > > > //I'm sorry, I hacked the code so much, the name of the variables may be > > different from the original version > > > > for (i = 0; i < MapperCapacity; ++i){ > > ... > > for (JobInProgress job:jobQueue){ > > //try to shedule a node-local or rack-local map tasks > > //here is the interesting place > > t = job.obtainNewLocalMapTask(...); > > if (t != null){ > > ... 
> > break;//the break statement here will make the control flow back > > to "for (job:jobQueue)" which means that it will restart map tasks > > selection > > procedure from the first job, so , it is actually schedule all of the > first > > job's local mappers first until the map slots are full > > } > > } > > } > > > > BTW, we can only schedule a reduce task in a single heartbeat > > > > > > > > Best, > > Nan > > On Sat, Jan 15, 2011 at 1:45 PM, He Chen wrote: > > > > > Hey all > > > > > > Why does the FCFS scheduler only let a node chooses one task at a time > in > > > one job? In order to increase the data locality, > > > it is reasonable to let a node to choose all its local tasks (if it > can) > > > from a job at a time. > > > > > > Any reply will be appreciated. > > > > > > Thanks > > > > > > Chen > > > > > >
Re: Question about Hadoop Default FCFS Job Scheduler
Hi Nan, Thank you for the reply. I understand what you mean. What concerns me is that inside the "obtainNewLocalMapTask(...)" method, it only assigns one task at a time. Now I understand why it only assigns one task at a time. It is because of the outside loop: for (i = 0; i < MapperCapacity; ++i){ (..) } I mean, why does this loop exist here? Why does the scheduler use this type of loop? It imposes overhead on the task-assigning process to assign only one task at a time. It is obvious that a node could be assigned all the local tasks it can afford in one obtainNewLocalMapTask(..) method call. Bests Chen On Mon, Jan 17, 2011 at 8:28 AM, Nan Zhu wrote: > Hi, Chen > > How is it going recently? > > Actually I think you misundertand the code in assignTasks() in > JobQueueTaskScheduler.java, see the following structure of the interesting > codes: > > //I'm sorry, I hacked the code so much, the name of the variables may be > different from the original version > > for (i = 0; i < MapperCapacity; ++i){ > ... > for (JobInProgress job:jobQueue){ > //try to shedule a node-local or rack-local map tasks > //here is the interesting place > t = job.obtainNewLocalMapTask(...); > if (t != null){ > ... > break;//the break statement here will make the control flow back > to "for (job:jobQueue)" which means that it will restart map tasks > selection > procedure from the first job, so , it is actually schedule all of the first > job's local mappers first until the map slots are full > } > } > } > > BTW, we can only schedule a reduce task in a single heartbeat > > > > Best, > Nan > On Sat, Jan 15, 2011 at 1:45 PM, He Chen wrote: > > > Hey all > > > > Why does the FCFS scheduler only let a node chooses one task at a time in > > one job? In order to increase the data locality, > > it is reasonable to let a node to choose all its local tasks (if it can) > > from a job at a time. > > > > Any reply will be appreciated. > > > > Thanks > > > > Chen > > >
Re: question about Hadoop job conf
Set them to final if you don't want the default values being applied. A <final>true</final> addition should solve your problem (although it may generate some warnings when your job tries to override them with its defaults). (Default value xml files are in the Hadoop jars and are usually picked up when a JobConf/Configuration is created, unless your cluster's configuration is on the CLASSPATH.) Although I am wondering what you gain by changing fs.checkpoint.size for a Task? It isn't read by the MapReduce code base at all and is only read inside HDFS's Checkpointer/SN nodes at initialization. On Mon, Jan 17, 2011 at 8:11 PM, xiufeng liu wrote: > Hi, > > > The following is the setting of mapred-site where I have set the * > mapred.child.java.opts* to *-Xmx512 -Xincgc*, and *fs.checkpoint.size* to * > 268435456*. But in the runtime setting job.xml, I found that it is still > using the default value *mapred.child.java.opts*= *-Xmx200, and the > *fs.checkpoint.size=67108864, > instead of the values in mapred-site.xml ? Could anybody advise? Thanks! > > -afancy > > [xiliu@xiliu-fedora conf]$ cat mapred-site.xml > > > > > > > > mapred.job.tracker > xiliu-fedora:9001 > > > mapred.local.dir > /data1/hadoop-0.20.2/mapred/ > > > mapred.tasktracker.map.tasks.maximum > 4 > > > mapred.tasktracker.reduce.tasks.maximum > 4 > > > fs.checkpoint.size > 268435456 > > > mapred.child.java.opts > -Xmx51 -Xincgc > > > > > > ** > -- Harsh J www.harshj.com
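A cluster-side property marked final, as suggested above, looks like the fragment below. Note the -Xmx512m value is an assumption: the values quoted in the thread (-Xmx512 and -Xmx51) lack the size suffix and are likely typos, and an invalid -Xmx string is itself a plausible reason the JVM option silently falls back to the default.

```xml
<!-- mapred-site.xml on the cluster: <final>true</final> prevents
     job submissions from overriding this value -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m -Xincgc</value>
  <final>true</final>
</property>
```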
Re: Why Hadoop is slow in Cloud
On Mon, Jan 17, 2011 at 6:08 AM, Steve Loughran wrote: > On 17/01/11 04:11, Adarsh Sharma wrote: >> >> Dear all, >> >> Yesterday I performed a kind of testing between *Hadoop in Standalone >> Servers* & *Hadoop in Cloud. >> >> *I establish a Hadoop cluster of 4 nodes ( Standalone Machines ) in >> which one node act as Master ( Namenode , Jobtracker ) and the remaining >> nodes act as slaves ( Datanodes, Tasktracker ). >> On the other hand, for testing Hadoop in *Cloud* ( Euclayptus ), I made >> one Standalone Machine as *Hadoop Master* and the slaves are configured >> on the VM's in Cloud. >> >> I am confused about the stats obtained after the testing. What I >> concluded that the VM are giving half peformance as compared with >> Standalone Servers. > > Interesting stats, nothing that massively surprises me, especially as your > benchmarks are very much streaming through datasets. If you were doing > something more CPU intensive (graph work, for example), things wouldn't look > so bad > > I've done stuff in this area. > http://www.slideshare.net/steve_l/farming-hadoop-inthecloud > > > >> >> I am expected some slow down but at this level I never expect. Would >> this is genuine or there may be some configuration problem. >> >> I am using 1 GB (10-1000mb/s) LAN in VM machines and 100mb/s in >> Standalone Servers. >> >> Please have a look on the results and if interested comment on it. >> > > > The big killer here is File IO, with today's HDD controllers and virtual > filesystems, disk IO is way underpowered compared to physical disk IO. > Networking is reduced (but improving), and CPU can be pretty good, but disk > is bad. > > > Why? > > 1. Every access to a block in the VM is turned into virtual disk controller > operations which are then interpreted by the VDC and turned into > reads/writes in the virtual disk drive > > 2. which is turned into seeks, reads and writes in the physical hardware. 
> > Some workarounds > > -allocate physical disks for the HDFS filesystem, for the duration of the > VMs. > > -have the local hosts serve up a bit of their filesystem on a fast protocol > (like NFS), and have every VM mount the local physical NFS filestore as > their hadoop data dirs. > > Q: "Why is my Nintendo emulator slow on an 800 MHz computer made 10 years after the Nintendo?" A: Emulation. Everything you emulate cuts X% performance right off the top. Emulation is great when you want to run Mac on Windows, FreeBSD on Linux, or Nintendo on Linux. However, most people would do better with technologies that use kernel-level isolation, such as Linux Containers, Solaris Zones, Linux-VServer (my favorite) http://linux-vserver.org/, User Mode Linux, or similar technologies that ISOLATE rather than EMULATE. Sorry, list, I feel I rant about this bi-annually. I have just always been so shocked at how many people get lured into cloud and virtualized solutions by "better management" and "near native performance".
question about Hadoop job conf
Hi, The following is the setting of mapred-site.xml where I have set mapred.child.java.opts to -Xmx512 -Xincgc, and fs.checkpoint.size to 268435456. But in the runtime job.xml, I found that it is still using the default values mapred.child.java.opts = -Xmx200 and fs.checkpoint.size = 67108864, instead of the values in mapred-site.xml. Could anybody advise? Thanks! -afancy

[xiliu@xiliu-fedora conf]$ cat mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>xiliu-fedora:9001</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/data1/hadoop-0.20.2/mapred/</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>fs.checkpoint.size</name>
    <value>268435456</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx51 -Xincgc</value>
  </property>
</configuration>
Re: Question about Hadoop Default FCFS Job Scheduler
Hi Chen, How is it going recently? Actually I think you misunderstand the code in assignTasks() in JobQueueTaskScheduler.java; see the following structure of the interesting code:

// I'm sorry, I hacked the code so much; the names of the variables may be
// different from the original version
for (i = 0; i < MapperCapacity; ++i){
  ...
  for (JobInProgress job : jobQueue){
    // try to schedule a node-local or rack-local map task
    // here is the interesting place
    t = job.obtainNewLocalMapTask(...);
    if (t != null){
      ...
      break; // the break here sends control back to the outer loop, so the
             // next pass restarts the job selection from the first job; it
             // actually schedules all of the first job's local mappers
             // first, until the map slots are full
    }
  }
}

BTW, we can only schedule a reduce task in a single heartbeat. Best, Nan On Sat, Jan 15, 2011 at 1:45 PM, He Chen wrote: > Hey all > > Why does the FCFS scheduler only let a node chooses one task at a time in > one job? In order to increase the data locality, > it is reasonable to let a node to choose all its local tasks (if it can) > from a job at a time. > > Any reply will be appreciated. > > Thanks > > Chen >
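To make the shape of that loop concrete, here is a toy, self-contained sketch of the control flow described above. It is not the real Hadoop scheduler code (names and types are simplified): each pass of the outer loop fills one free map slot, and the job scan always restarts from the head of the queue, so the first job's tasks drain first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of JobQueueTaskScheduler.assignTasks(): the outer loop runs once
// per free map slot; each iteration scans the job queue from the start and
// takes at most one task before breaking back out to the outer loop.
public class FcfsSketch {
    static List<String> assignTasks(int freeSlots, List<Deque<String>> jobQueue) {
        List<String> assigned = new ArrayList<>();
        for (int i = 0; i < freeSlots; i++) {      // one task per slot per pass
            boolean found = false;
            for (Deque<String> job : jobQueue) {   // always restart from job #1
                String t = job.poll();             // stands in for obtainNewLocalMapTask()
                if (t != null) {
                    assigned.add(t);
                    found = true;
                    break;                         // back to the outer loop
                }
            }
            if (!found) break;                     // no runnable task anywhere
        }
        return assigned;
    }

    public static void main(String[] args) {
        Deque<String> job1 = new ArrayDeque<>(List.of("j1-t1", "j1-t2"));
        Deque<String> job2 = new ArrayDeque<>(List.of("j2-t1"));
        List<Deque<String>> queue = List.of(job1, job2);
        // Job 1's tasks drain before job 2's: FCFS behavior.
        System.out.println(assignTasks(4, queue));
    }
}
```

Running it shows job 1's tasks assigned before job 2's even though four slots are free, which is exactly the "first job's local mappers first" behavior the thread discusses.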
Re: HDFS and Input Splits
On Mon, Jan 17, 2011 at 6:11 PM, Marco Didonna wrote: > Hello everyone, > I am pretty new to hadoop and I have started learning it thanks to Tom > White's book. There is something I still do not understand though: it's > about the splitting of the input data in order to distribute the work > load to a cluster of machines. I would like to discuss two possible > scenarios and ask some question. > > For both scenario let's suppose the input data is a single 5GB text file. > > ::Scenario n.1:: > > The 5GB input file is put on HDFS. According to the default settings it > will be blindly split into 64MB chunks and sent to the various datanodes > in a redundant flavor (according to the replica factor). Let's suppose I > need to perform a line oriented analysis so my unit of analysis is a > single line. Let's suppose I use a TextInputFormat which allows to use > the entire line as value and the file offset (ignored) as key. > > On machine N I have this chunk of text[1]: > > "[...] MIDWAY upon the journey of our life > I found myself within a forest dark, > For the straightforward pathway had been lost. > Ah me! how hard a" > > And on machine K I have this other chunk: > > "thing it is to say > What was this forest savage, rough, and stern, > Which in the very thought renews the fear. [...]" > > How can the mapper running on machine N reconstruct the correct prase > "Ah me! how hard a thing it is to say". Maybe it will ask the namenode > the address of the datanode holding the next file block? The DFSInputStream can read across blocks (see the implementation of the read methods inside DFSInputStream, which is what is provided on an FS.open() call). LineReader, which is used to read lines off it, is only interested in reading till its designated end (in bytes) is reached (the first split may read beyond its last line in its block, and splits other than the first ignore the first line read). 
Also read the wiki on Hadoop's MapReduce: http://wiki.apache.org/hadoop/HadoopMapReduce which explains this behavior. > ::Scenario n.2:: > > The 5GB is stored on the machine (local filesystem) submitting the job > to the hadoop cluster. I suppose this machine will split the file evenly > (using line endings as possible split points) and send the chunks to the > node in the cluster. > Is that correct? How is replication performed in this scenario? Replication is not performed immediately. It is done on every DataNode 'heartbeat', wherein the NameNode identifies whether it needs to perform any replications of a new (or changed) block and assigns the task to the DN waiting for its heartbeat response (a.k.a. waiting for action). Splits are also not done at line boundaries (which is good, because not all files are text files), but at the specified block-size byte boundaries. The reader logic, as mentioned above, takes care of proper line reading across mappers' splits. (Please correct me if I'm wrong anywhere.) -- Harsh J www.harshj.com
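The rule Harsh describes can be sketched in miniature. This is an illustration of the convention, not the real LineRecordReader/DFSInputStream code: every split after the first skips forward past its first partial line (that line belongs to the previous split), and every split keeps reading past its end offset until it finishes the line it is on.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of how whole lines are recovered from byte-range splits.
public class SplitLines {
    static List<String> readSplit(String data, int start, int end) {
        int pos = start;
        if (start != 0) {   // not the first split: skip the partial first line
            while (pos < data.length() && data.charAt(pos - 1) != '\n') pos++;
        }
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        // read past 'end' if we are in the middle of a line
        while (pos < data.length() && (pos < end || line.length() > 0)) {
            char c = data.charAt(pos++);
            if (c == '\n') { lines.add(line.toString()); line.setLength(0); }
            else line.append(c);
        }
        if (line.length() > 0) lines.add(line.toString());
        return lines;
    }

    public static void main(String[] args) {
        String text = "Ah me! how hard a thing it is to say\nWhat was this forest savage\n";
        int boundary = 20;  // pretend the block boundary falls mid-line
        // Split 0 emits the full first line; split 1 skips it and starts at line 2.
        System.out.println(readSplit(text, 0, boundary));
        System.out.println(readSplit(text, boundary, text.length()));
    }
}
```

Together the two splits produce every line exactly once, even though the block boundary falls in the middle of "Ah me! how hard a thing it is to say", which is the answer to Marco's question.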
Re: No locks available
Edward Capriolo wrote: On Mon, Jan 17, 2011 at 8:13 AM, Adarsh Sharma wrote: Harsh J wrote: Could you re-check your permissions on the $(dfs.data.dir)s for your failing DataNode versus the user that runs it? On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma wrote: Can I know why it occurs? Thanks Harsh, I know this issue and I cross-checked the permissions of all dirs ( dfs.name.dir, dfs.data.dir, mapred.local.dir ) several times. They are 755 and are owned by the hadoop user and group. I found that the failed datanode is unable to create 5 files in dfs.data.dir, whereas a successful datanode creates the following files: current tmp storage in_use.lock Does that help? Thanks. No locks available can mean that you are trying to use Hadoop on a filesystem that does not support file-level locking. Are you trying to run your name node storage in NFS space? Yes Edward, you're absolutely right. I mounted a hard disk path into the Datanode ( VM ) dfs.data.path, but it causes no problem on the other nodes. Thanks
Re: When applying a patch, which attachment should I use?
Thanks a lot, Edward. This information is very helpful to me. With Best Regards Adarsh Sharma edward choi wrote: Dear Adarsh, I have a single machine running Namenode/JobTracker/HBase Master. There are 17 machines running Datanode/TaskTracker. Among those 17 machines, 14 are running HBase Regionservers. The other 3 machines are running Zookeeper. And about Zookeeper: HBase comes with its own Zookeeper, so you don't need to install a new Zookeeper (except for a special occasion, which I'll explain later). I assigned 14 machines as regionservers using "$HBASE_HOME/conf/regionservers". I assigned 3 machines as Zookeepers using the "hbase.zookeeper.quorum" property in "$HBASE_HOME/conf/hbase-site.xml". Don't forget to set "export HBASE_MANAGES_ZK=true" in "$HBASE_HOME/conf/hbase-env.sh". (This is where you announce that you will be using the Zookeeper that comes with HBase.) This way, when you execute "$HBASE_HOME/bin/start-hbase.sh", HBase will automatically start Zookeeper first, then start the HBase daemons. Alternatively, you can install your own Zookeeper and tell HBase to use it instead of its own. I read on the internet that the Zookeeper that comes with HBase does not work properly on Windows 7 64-bit ( http://alans.se/blog/2010/hadoop-hbase-cygwin-windows-7-x64/), so in that case you need to install your own Zookeeper, set it up properly, and tell HBase to use it instead of its own. All you need to do is configure zoo.cfg and add it to the HBase CLASSPATH. And don't forget to set "export HBASE_MANAGES_ZK=false" in "$HBASE_HOME/conf/hbase-env.sh"; this way, HBase will not start Zookeeper automatically. About the separation of Zookeepers from regionservers: yes, it is recommended to separate Zookeepers from regionservers, but that won't be necessary unless your clusters are very heavily loaded. They also suggest that you give Zookeeper its own hard disk, but I haven't done that myself yet. (Hard disks cost money, you know.) So I'd say your cluster seems fine. 
But when you want to expand your cluster, you'd need some changes. I suggest you take a look at "Hadoop: The Definitive Guide". Regards, Edward 2011/1/13 Adarsh Sharma Thanks Edward, Can you describe the architecture used in your configuration? For example, I have a cluster of 10 servers and 1 node acts as ( Namenode, Jobtracker, Hmaster ). The remaining 9 nodes act as ( Slaves, Datanodes, Tasktracker, Hregionservers ). Among these 9 nodes I also set 3 nodes in zookeeper.quorum.property. I want to know whether it is necessary to configure Zookeeper separately with the zookeeper-3.2.2 package, or whether just having some IPs listed in zookeeper.quorum.property is enough and HBase takes care of it. Can we specify IPs of Hregionservers used before as Zookeeper servers ( HQuorumPeer ), or do we need separate servers for it? My problem arises in running Zookeeper. My HBase is up and running in fully distributed mode too. With Best Regards Adarsh Sharma edward choi wrote: Dear Adarsh, My situation is somewhat different from yours as I am only running Hadoop and HBase (as opposed to Hadoop/Hive/HBase), but I hope my experience can be of help to you somehow. I applied the "hdfs-630-0.20-append.patch" to every single Hadoop node (including master and slaves). Then I followed exactly what they told me to do on http://hbase.apache.org/docs/current/api/overview-summary.html#overview_description . I didn't get a single error message and successfully started HBase in fully distributed mode. I am not using Hive, so I can't tell what caused the MasterNotRunningException, but the patch above is meant to let DFSClients pass the NameNode lists of known dead Datanodes. I doubt that the patch has anything to do with MasterNotRunningException. Hope this helps. Regards, Ed 2011/1/13 Adarsh Sharma I am also facing some issues and I think applying hdfs-630-0.20-append.patch< https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch would solve my problem. 
I am trying to run the Hadoop/Hive/HBase integration in fully distributed mode, but I am facing the MasterNotRunningException mentioned in http://wiki.apache.org/hadoop/Hive/HBaseIntegration. My Hadoop version = 0.20.2, Hive = 0.6.0, HBase = 0.20.6. What do you think, Edward? Thanks Adarsh edward choi wrote: I am not familiar with this whole svn and patch stuff, so please understand my asking. I was going to apply hdfs-630-0.20-append.patch< https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch only because I wanted to install HBase and the installation guide told me to. The append branch you mentioned, does that include hdfs-630-0.20-append.patch< https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch as well? Is it like the latest patch with all the good stuff packed in one? Regards, Ed 2011/1/12 Ted Dunning You may also be interested in the append branch: http://
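The Zookeeper setup Edward describes earlier in this thread boils down to two small fragments. The hostnames below are placeholders: in $HBASE_HOME/conf/hbase-env.sh, set `export HBASE_MANAGES_ZK=true` (or false if you run your own Zookeeper), and in hbase-site.xml list the quorum hosts:

```xml
<!-- $HBASE_HOME/conf/hbase-site.xml: the three Zookeeper quorum machines.
     Hostnames here are placeholders, not values from the thread. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
```

With HBASE_MANAGES_ZK=true, start-hbase.sh starts an HQuorumPeer on each listed host before the HBase daemons, which matches Adarsh's question about reusing regionserver IPs as quorum members.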
Re: No locks available
On Mon, Jan 17, 2011 at 8:13 AM, Adarsh Sharma wrote: > Harsh J wrote: >> >> Could you re-check your permissions on the $(dfs.data.dir)s for your >> failing DataNode versus the user that runs it? >> >> On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma >> wrote: >> >>> >>> Can i know why it occurs. >>> >> >> > > Thanx Harsh , I know this issue and I cross-check several times permissions > of of all dirs ( dfs.name.dir, dfs.data.dir, mapred.local.dir ). > > It is 755 and is owned by hadoop user and group. > > I found that in failed datanode dir , it is unable to create 5 files in > dfs.data.dir whereas on the other hand, it creates following files in > successsful datanode : > > curent > tmp > storage > in_use.lock > > Does it helps. > > Thanx > No locks available can mean that you are trying to use hadoop on a filesystem that does not support file level locking. Are you trying to run your name node storage in NFS space?
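For what it's worth, the lock Hadoop takes on a storage directory is an ordinary java.nio file lock, so Edward's NFS theory can be tested outside Hadoop. Below is a minimal probe sketching what Storage$StorageDirectory.tryLock() does (a simplified reproduction, not the Hadoop code): run it with the suspect mount as the argument; on a filesystem without lock support (e.g. an NFS mount with no lock daemon), the tryLock() call throws the same "No locks available" IOException.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

// Probe whether exclusive file locking works in a directory, mimicking the
// in_use.lock file that the DataNode creates in dfs.data.dir.
public class LockProbe {
    static boolean canLock(File dir) {
        File lockFile = new File(dir, "in_use.lock");
        try (RandomAccessFile raf = new RandomAccessFile(lockFile, "rws")) {
            FileLock lock = raf.getChannel().tryLock();
            if (lock != null) lock.release();
            return lock != null;   // null means another process holds the lock
        } catch (IOException e) {  // "No locks available" lands here
            return false;
        } finally {
            lockFile.delete();
        }
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(dir.getAbsolutePath() + " lockable: " + canLock(dir));
    }
}
```

If this prints "lockable: false" on the mounted /hdd2-1 path but "lockable: true" on a local disk, the mount (not Hadoop) is the problem.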
Re: No locks available
Harsh J wrote: Could you re-check your permissions on the $(dfs.data.dir)s for your failing DataNode versus the user that runs it? On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma wrote: Can I know why it occurs? Thanks Harsh, I know this issue and I cross-checked the permissions of all dirs ( dfs.name.dir, dfs.data.dir, mapred.local.dir ) several times. They are 755 and are owned by the hadoop user and group. I found that the failed datanode is unable to create 5 files in dfs.data.dir, whereas a successful datanode creates the following files: current tmp storage in_use.lock Does that help? Thanks
Re: No locks available
Could you re-check your permissions on the $(dfs.data.dir)s for your failing DataNode versus the user that runs it? On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma wrote: > Can i know why it occurs. -- Harsh J www.harshj.com
Re: No locks available
xiufeng liu wrote: did you format the namenode before you start? try to format it and start: 1) go to HADOOP_HOME/bin 2) ./hadoop namenode -format I formatted the namenode and then issued the command: bin/start-all.sh This results in 2 of my datanodes running properly but causes the below exception for one datanode. Can I know why it occurs? Thanks On Mon, Jan 17, 2011 at 1:43 PM, Adarsh Sharma wrote: Dear all, I know this is a silly mistake, but I am not able to find the reason for the exception that causes one datanode to fail to start. I mounted /hdd2-1 of a physical machine into this VM and started the datanode and tasktracker. The datanode fails after a few seconds. Can someone tell me the root cause? Below is the exception:

2011-01-17 18:01:08,199 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = hadoop7/172.16.1.8
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
2011-01-17 18:03:36,391 INFO org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: No locks available
        at sun.nio.ch.FileChannelImpl.lock0(Native Method)
        at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
        at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)
2011-01-17 18:03:36,393 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: No locks available
        at sun.nio.ch.FileChannelImpl.lock0(Native Method)
        at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
        at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)

"~/project/hadoop-0.20.2/logs/hadoop-hadoop-datanode-hadoop7.log" 42L, 3141C

Thanks Adarsh
Re: No locks available
did you format the namenode before you start? try to format it and start: 1) go to HADOOP_HOME/bin 2) ./hadoop namenode -format On Mon, Jan 17, 2011 at 1:43 PM, Adarsh Sharma wrote: > Dear all, > > > I know this a silly mistake but not able to find the reason of the > exception that causes one datanode to fail to start. > > I mount /hdd2-1 of a phsical machine into this VM and start > datanode,tasktracker. > > Datanode fails after few seconds. > > Can someone tell me the root cause. > > Below is the exception : > > 2011-01-17 18:01:08,199 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: > / > STARTUP_MSG: Starting DataNode > STARTUP_MSG: host = hadoop7/172.16.1.8 > STARTUP_MSG: args = [] > STARTUP_MSG: version = 0.20.2 > STARTUP_MSG: build = > https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r > 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010 > / > 2011-01-17 18:03:36,391 INFO org.apache.hadoop.hdfs.server.common.Storage: > java.io.IOException: No locks available > at sun.nio.ch.FileChannelImpl.lock0(Native Method) > at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881) > at java.nio.channels.FileChannel.tryLock(FileChannel.java:962) > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527) > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505) > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363) > at > org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:216) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238) > at > 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368) > > 2011-01-17 18:03:36,393 ERROR > org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: No > locks available > at sun.nio.ch.FileChannelImpl.lock0(Native Method) > at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881) > at java.nio.channels.FileChannel.tryLock(FileChannel.java:962) > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527) > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505) > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363) > at > org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:216) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238) > "~/project/hadoop-0.20.2/logs/hadoop-hadoop-datanode-hadoop7.log" 42L, > 3141C 1,1 Top > > > > > Thanks > > Adarsh >
HDFS and Input Splits
Hello everyone,

I am pretty new to Hadoop and have started learning it thanks to Tom White's book. There is something I still do not understand, though: the splitting of the input data in order to distribute the work load across a cluster of machines. I would like to discuss two possible scenarios and ask some questions. For both scenarios, let's suppose the input data is a single 5GB text file.

::Scenario n.1::

The 5GB input file is put on HDFS. According to the default settings it will be blindly split into 64MB chunks and sent to the various datanodes in a redundant flavor (according to the replication factor). Let's suppose I need to perform a line-oriented analysis, so my unit of analysis is a single line, and that I use a TextInputFormat, which uses the entire line as the value and the file offset (ignored) as the key.

On machine N I have this chunk of text[1]:

"[...] MIDWAY upon the journey of our life I found myself within a forest dark, For the straightforward pathway had been lost. Ah me! how hard a"

And on machine K I have this other chunk:

"thing it is to say What was this forest savage, rough, and stern, Which in the very thought renews the fear. [...]"

How can the mapper running on machine N reconstruct the correct phrase "Ah me! how hard a thing it is to say"? Maybe it will ask the namenode for the address of the datanode holding the next file block?

::Scenario n.2::

The 5GB file is stored on the local filesystem of the machine submitting the job to the Hadoop cluster. I suppose this machine will split the file evenly (using line endings as possible split points) and send the chunks to the nodes in the cluster. Is that correct? How is replication performed in this scenario?

Thank you.
MD
No locks available
Dear all,

I know this is a silly mistake, but I am not able to find the reason for the exception that causes one datanode to fail to start.

I mount /hdd2-1 of a physical machine into this VM and start the datanode and tasktracker. The datanode fails after a few seconds.

Can someone tell me the root cause? Below is the exception:

2011-01-17 18:01:08,199 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = hadoop7/172.16.1.8
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
/
2011-01-17 18:03:36,391 INFO org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: No locks available
        at sun.nio.ch.FileChannelImpl.lock0(Native Method)
        at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
        at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)

2011-01-17 18:03:36,393 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: No locks available
        at sun.nio.ch.FileChannelImpl.lock0(Native Method)
        ... (same stack trace as above, truncated in the log view)

(from "~/project/hadoop-0.20.2/logs/hadoop-hadoop-datanode-hadoop7.log")

Thanks
Adarsh
Re: How to replace Jetty-6.1.14 with Jetty 7 in Hadoop?
On 16/01/11 09:41, xiufeng liu wrote:
> Hi,
>
> In my cluster, Hadoop somehow cannot work, and I found that it was due to
> Jetty-6.1.14, which is not able to start up. However, Jetty 7 does work in
> my cluster. Does anybody know how to replace Jetty 6.1.14 with Jetty 7?
> Thanks
>
> afancy

The switch to Jetty 7 will not be easy, and I wouldn't encourage you to do it unless you want to get into editing the Hadoop source and retesting everything. Try moving up to v6.1.25 instead, which should be more straightforward: replace the JAR, then QA the cluster with some terasorting.
Re: Why Hadoop is slow in Cloud
On 17/01/11 04:11, Adarsh Sharma wrote:
> Dear all,
>
> Yesterday I performed a kind of testing between *Hadoop on standalone
> servers* and *Hadoop in the cloud*. I established a Hadoop cluster of 4
> nodes (standalone machines) in which one node acts as master (Namenode,
> Jobtracker) and the remaining nodes act as slaves (Datanodes,
> Tasktrackers). On the other hand, for testing Hadoop in the *cloud*
> (Eucalyptus), I made one standalone machine the *Hadoop master* and
> configured the slaves on VMs in the cloud.
>
> I am confused about the stats obtained after the testing. What I concluded
> is that the VMs give half the performance of the standalone servers.

Interesting stats, nothing that massively surprises me, especially as your benchmarks are very much streaming through datasets. If you were doing something more CPU-intensive (graph work, for example), things wouldn't look so bad.

I've done stuff in this area:
http://www.slideshare.net/steve_l/farming-hadoop-inthecloud

> I expected some slowdown, but never at this level. Is this genuine, or
> might there be a configuration problem? I am using a 1 GB (10-1000mb/s)
> LAN for the VM machines and 100mb/s for the standalone servers. Please
> have a look at the results and, if interested, comment on them.

The big killer here is file IO: with today's HDD controllers and virtual filesystems, disk IO is way underpowered compared to physical disk IO. Networking is reduced (but improving), and CPU can be pretty good, but disk is bad. Why?

1. Every access to a block in the VM is turned into virtual disk controller operations, which are then interpreted by the VDC and turned into reads/writes on the virtual disk drive,
2. which are in turn turned into seeks, reads and writes on the physical hardware.

Some workarounds:
- allocate physical disks for the HDFS filesystem, for the duration of the VMs.
- have the local hosts serve up a bit of their filesystem on a fast protocol (like NFS), and have every VM mount the local physical NFS filestore as their hadoop data dirs.