Incompatible namespaceIDs after formatting namenode

2012-01-15 Thread gdan2000

Hi

We just started implementing Hadoop on our system for the first time (Cloudera
CDH3u2).

After reformatting the namenode a few times, the DataNodes do not come up;
they fail with the error "Incompatible namespaceIDs".

I found a note on this
http://pages.cs.brandeis.edu/~cs147a/lab/hadoop-troubleshooting/ but I'm
really not sure about removing data node directories.

How is it possible that data will not be lost? I have to do it on all the
datanodes...

Please explain how this whole reformatting procedure preserves the users' data.




Re: Incompatible namespaceIDs after formatting namenode

2012-01-15 Thread Chen He
In short, here is a script that may be useful for you to remove the stale
VERSION file from the HDFS data directory on the DNs, run from your head node:

# run from the head node; substitute your DN hostnames and HDFS data directory
for host in [DN hostnames]
do
  ssh root@"$host" "rm [your hdfs directory]/dfs/data/current/VERSION"
done
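
For context, the namespaceID that has to match lives in that VERSION file. The
contents below are only an illustration (all values are made up and the
layoutVersion depends on your Hadoop release):

# illustrative contents of [your hdfs directory]/dfs/data/current/VERSION on a DN
#Sun Jan 15 09:30:00 UTC 2012
namespaceID=1234567890
storageID=DS-1234567890-10.0.0.11-50010-1326618600000
cTime=0
storageType=DATA_NODE
layoutVersion=-19

Removing this file (or the whole storage directory, as Uma suggests below)
should let the DN pick up the new namespaceID from the freshly formatted NN
the next time it starts.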

On Sun, Jan 15, 2012 at 7:22 AM, Uma Maheswara Rao G
mahesw...@huawei.com wrote:

 Since you have already formatted the NN, why do you expect data loss if you
 remove the storage directories of the DNs?
 Since you formatted the NN, a new namespaceID has been generated. When the DNs
 register with it, they still carry the old namespaceID, so registration fails
 with "incompatible namespaceIDs". So, currently the solution is to remove the
 storage directories of all the DNs.

 Regards,
 Uma

 




Re: Incompatible namespaceIDs after formatting namenode

2012-01-15 Thread Paolo Rodeghiero

On 15/01/2012 09:45, gdan2000 wrote:
[...]
 really not sure about removing data node directories.

 How is it possible that data will not be lost? I have to do it on all the
 datanodes...

 Please explain how this whole reformatting procedure preserves the users' data.


When you reformat the namenode, you are erasing and rebuilding what is 
effectively the filesystem's allocation table. As with traditional 
filesystems, you are not deleting the actual data blocks.


Unlike the traditional scenario, however, the previously allocated space is 
not automatically reused; instead, datanodes that still hold blocks from the 
previous namespace will refuse to connect to the namenode.


I assume the main reason for this design choice is to protect data from 
name-resolution problems or configuration errors (e.g. a datanode trying to 
connect to the wrong namenode).
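
As a rough illustration of the two IDs that have to agree (the paths below are
placeholders for whatever dfs.name.dir and dfs.data.dir point to on your
cluster):

# on the namenode: reformatting generates a fresh namespaceID
hadoop namenode -format
grep namespaceID [dfs.name.dir]/current/VERSION

# on a datanode: this value must match the namenode's,
# otherwise the datanode refuses to register
grep namespaceID [dfs.data.dir]/current/VERSION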


On 15/01/2012 21:16, Chen He wrote:
 In short, here is a script that may be useful for you to remove the stale
 VERSION file from the HDFS data directory on the DNs, run from your head node:
[...]

You can also use slaves.sh in $HADOOP_HOME/bin to accomplish that:
it allows you to run a command on every slave node.
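
For example (the data directory below is only an illustrative placeholder;
substitute your own dfs.data.dir):

# runs the given command over ssh on every host listed in the slaves file
$HADOOP_HOME/bin/slaves.sh rm [your hdfs directory]/dfs/data/current/VERSION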

Cheers,
Paolo





Re: hadoop filesystem cache

2012-01-15 Thread Todd Lipcon
There is some work being done in this area by some folks over at UC
Berkeley's AMP Lab in coordination with Facebook. I don't believe it
has been published quite yet, but the title of the project is PACMan
-- I expect it will be published soon.

-Todd

On Sat, Jan 14, 2012 at 5:30 PM, Rita rmorgan...@gmail.com wrote:
 After reading this article,
 http://www.cloudera.com/blog/2012/01/caching-in-hbase-slabcache/ , I was
 wondering if there is a filesystem cache for HDFS. For example, if a large
 file (10 gigabytes) keeps getting accessed on the cluster, then instead of
 repeatedly fetching it over the network, why not store the contents of the
 file locally on the client itself? A use case on the client would look like this:



 <property>
   <name>dfs.client.cachedirectory</name>
   <value>/var/cache/hdfs</value>
 </property>

 <property>
   <name>dfs.client.cachesize</name>
   <description>in megabytes</description>
   <value>10</value>
 </property>


 Any thoughts on a feature like this?


 --
 --- Get your facts first, then you can distort them as you please.--



-- 
Todd Lipcon
Software Engineer, Cloudera


Re: How do I run all the Hadoop unit tests?

2012-01-15 Thread W.P. McNeill
I never did. Maybe I'll try again with version 1.0.0.
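
If I remember right, with the Ant-based 1.0.0 build the whole suite can be
kicked off with the test target; the single-test property below is from memory,
so treat it as an assumption rather than gospel:

# assuming an Ant-based Hadoop 1.0.0 source checkout
cd hadoop-1.0.0
ant test                               # full unit test suite (takes hours)
ant -Dtestcase=TestFileCreation test   # limit the run to a single test class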

On Thu, Nov 17, 2011 at 1:12 AM, cheersyang wil...@yahoo.cn wrote:

 I met the same issue as you; I'm curious to know whether you figured out a solution.
