Incompatible namespaceIDs after formatting namenode
Hi,

We just started implementing Hadoop on our system for the first time (Cloudera CDH3u2). After reformatting the namenode a few times, the DataNodes are not coming up; they fail with the error "Incompatible namespaceIDs".

I found a note on this at http://pages.cs.brandeis.edu/~cs147a/lab/hadoop-troubleshooting/ but I'm really not sure about removing the datanode directories. How is it possible that data will not be lost? I would have to do it on all datanodes...

Please explain how this reformatting procedure preserves the user's data.
Re: Incompatible namespaceIDs after formatting namenode
In short, here is a script that may be useful for removing the HDFS directory on the DNs from your headnode:

    for dn in [list of DN hostnames]; do
        ssh root@$dn rm [your hdfs directory]/dfs/data/current/VERSION
    done

On Sun, Jan 15, 2012 at 7:22 AM, Uma Maheswara Rao G mahesw...@huawei.com wrote:

  Since you already formatted the NN, why do you think there will be data loss if you remove the storage directories of the DNs? Because you formatted the NN, a new namespaceID was generated. When the DNs register with it, they still carry the old namespaceID, so it reports incompatible namespaceIDs. So the current solution is to remove the storage directories of all DNs.

  Regards,
  Uma

  From: gdan2000 [gdan2...@gmail.com]
  Sent: Sunday, January 15, 2012 2:15 PM
  To: core-u...@hadoop.apache.org
  Subject: Incompatible namespaceIDs after formatting namenode
  [...]
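To see the mismatch for yourself, here is a minimal sketch (the bracketed paths are placeholders; substitute your dfs.name.dir and dfs.data.dir) that compares the namespaceID the reformatted NN generated with the one a DN still holds:

    # on the namenode
    grep namespaceID [dfs.name.dir]/current/VERSION
    # on a datanode (run from the headnode)
    ssh root@[DN hostname] grep namespaceID [your hdfs directory]/dfs/data/current/VERSION
    # if the two values differ, that DN will log "Incompatible namespaceIDs" and refuse to register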
Re: Incompatible namespaceIDs after formatting namenode
On 15/01/2012 09:45, gdan2000 wrote:

  [...] really not sure about removing the datanode directories. How is it possible that data will not be lost? I would have to do it on all datanodes... Please explain how this reformatting procedure preserves the user's data.

When you reformat the namenode, you erase and rebuild what is effectively the filesystem's allocation table. As with traditional filesystems, you are not deleting the actual blocks. Unlike the traditional scenario, though, the previously allocated space is not automatically reused: datanodes that still hold blocks from the previous allocation will refuse to connect to the namenode. I assume the main reason for this design choice is to protect data against name-resolution problems or configuration errors (i.e. a datanode trying to connect to the wrong namenode).

On 15/01/2012 21:16, Chen He wrote:

  In short, here is a script that may be useful for removing the HDFS directory on the DNs from your headnode. [...]

You can also use slaves.sh in $HADOOP_HOME/bin to accomplish that: it allows you to run a command on every slave node (see the sketch below).

Cheers,
Paolo
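A minimal sketch of that slaves.sh approach (the data directory is a placeholder; substitute your own dfs.data.dir, and note that this permanently deletes the old block data on every slave):

    # runs the given command, over ssh, on every host listed in conf/slaves
    $HADOOP_HOME/bin/slaves.sh rm -rf [your hdfs directory]/dfs/data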
Re: hadoop filesystem cache
There is some work being done in this area by some folks over at UC Berkeley's AMP Lab in coordination with Facebook. I don't believe it has been published quite yet, but the title of the project is PACMan -- I expect it will be published soon.

-Todd

On Sat, Jan 14, 2012 at 5:30 PM, Rita rmorgan...@gmail.com wrote:

  After reading this article, http://www.cloudera.com/blog/2012/01/caching-in-hbase-slabcache/ , I was wondering if there was a filesystem cache for HDFS. For example, if a large file (10 gigabytes) keeps getting accessed on the cluster, instead of repeatedly fetching it over the network, why not store the content of the file locally on the client itself? A use case on the client would look like this:

    <property>
      <name>dfs.client.cachedirectory</name>
      <value>/var/cache/hdfs/</value>
    </property>

    <property>
      <name>dfs.client.cachesize</name>
      <description>in megabytes</description>
      <value>10</value>
    </property>

  Any thoughts on a feature like this?

  --
  --- Get your facts first, then you can distort them as you please. --

--
Todd Lipcon
Software Engineer, Cloudera
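In the meantime, one can approximate such a client-side cache with existing tools; a minimal sketch (the cache directory mirrors the hypothetical dfs.client.cachedirectory above, there is no size limit or eviction, and the file argument is just an example):

    #!/bin/sh
    # read an HDFS file through a local cache: copy it down on first access only
    CACHE_DIR=/var/cache/hdfs                       # assumed cache location
    HDFS_FILE="$1"                                  # e.g. /data/big-10g.dat
    LOCAL="$CACHE_DIR/$(basename "$HDFS_FILE")"
    if [ ! -f "$LOCAL" ]; then
        hadoop fs -get "$HDFS_FILE" "$LOCAL"        # first access: pull from the cluster
    fi
    cat "$LOCAL"                                    # later accesses read the local copy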
Re: How do I run all the Hadoop unit tests?
I never did. Maybe I'll try again with version 1.0.0.

On Thu, Nov 17, 2011 at 1:12 AM, cheersyang wil...@yahoo.cn wrote:

  I met the same issue as you; curious to know whether you figured out a solution.