Hi Martinus,
Hadoop HA is available in Hadoop 2.0.0. This release is currently
being voted on in the community.
You can read more here:
http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/
-Todd
On Mon, May 21, 2012 at 11:24 PM, Martinus Martinus martinus...@gmail.com wrote:
Hi Todd,
Thanks for your answer. Will that have the same capability as the
commercial M5 edition of MapR: http://www.mapr.com/products/why-mapr ?
Thanks.
On Tue, May 22, 2012 at 2:26 PM, Todd Lipcon t...@cloudera.com wrote:
Hi Martinus,
Hadoop HA is available in Hadoop 2.0.0. This release is currently
being voted on in the community.
On Tue, May 22, 2012 at 12:08 AM, Martinus Martinus
martinus...@gmail.com wrote:
Hi Todd,
Thanks for your answer. Will that have the same capability as the
commercial M5 edition of MapR: http://www.mapr.com/products/why-mapr ?
I can't speak to a closed source product's feature set. But, the 2.0.0
Hello Brendan,
Do as suggested by Marcos. If you do not set these properties,
Hadoop uses the tmp directory by default. Apart from setting these
properties in your hdfs-site.xml file, add the following property in
your core-site.xml file (the value is just a placeholder path; point it
wherever suits your setup):

<property>
  <name>hadoop.tmp.dir</name>
  <value>/path/to/a/dir/with/enough/space</value>
</property>
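(For reference, and only as a sketch: the hdfs-site.xml properties being
discussed are presumably the storage directories, e.g. dfs.name.dir and
dfs.data.dir in Hadoop 1.x. The paths below are placeholders, not
recommendations.)

<!-- where the namenode keeps its metadata -->
<property>
  <name>dfs.name.dir</name>
  <value>/path/to/namenode/metadata</value>
</property>
<!-- where datanodes keep their block data -->
<property>
  <name>dfs.data.dir</name>
  <value>/path/to/datanode/blocks</value>
</property>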
No. 2.0.0 will not have the same level of HA as MapR. Specifically, the
JobTracker hasn't been addressed and the NameNode issues have only been
partially addressed.
On May 22, 2012, at 8:08 AM, Martinus Martinus martinus...@gmail.com wrote:
Hi Todd,
Thanks for your answer. Will that have the same capability as the
commercial M5 edition of MapR?
Thanks and it works!
I wonder where we can find all the settings. I checked the code for
hdfs-default.xml but it doesn't have the settings you mentioned.
Brendan
From: donta...@gmail.com
Date: Tue, 22 May 2012 13:03:17 +0530
Subject: Re: namenode
Hi there,
I am currently trying to get rid of bugs in my Hadoop program by
debugging it. Everything went fine until some point yesterday. I don't know
what exactly happened, but my program does not stop at breakpoints
within the Reducer, nor within the RawComparator for the values
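(A hedged suggestion, in case it helps: it may be worth double-checking
that the job really is running under the LocalJobRunner, in the same JVM
as the debugger, otherwise breakpoints in the Reducer will never be hit.
In Hadoop 1.x/0.20 the usual settings are roughly the following; adjust
to your setup.)

<!-- run the whole job in a single local JVM (LocalJobRunner) -->
<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
</property>
<!-- use the local filesystem instead of HDFS while debugging -->
<property>
  <name>fs.default.name</name>
  <value>file:///</value>
</property>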
That's great. The best way to get this kind of info is to ask
questions on the mailing list whenever we face any problem. There are
certain things that are not documented anywhere. I had faced a lot of
problems initially, but the community and the people are really
great. There are so many
Hi,
Hi Brendan,
The number of files that can be stored in HDFS is limited by the size of
the NameNode's RAM. The downside with storing small files is that you would
saturate the NameNode's RAM with a small data set (sum of the size of all
your small files). However, you can store around 100
Hi Brendan,
Every file, directory and block in HDFS is represented as an
object in the namenode's memory, each of which occupies roughly 150 bytes.
When we store many small files in HDFS, these small files occupy a
large portion of the namespace (a large overhead on the namenode). As a
consequence, the
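(A rough, illustrative calculation, assuming ~150 bytes per object as
above: 10 million small files that each fit in a single block means
about 10M file objects + 10M block objects = 20M objects, i.e. roughly
20,000,000 x 150 B, which is about 3 GB of namenode heap before even
counting directories.)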
From: Björn-Elmar Macek [mailto:ma...@cs.uni-kassel.de]
Sent: Tuesday, May 22, 2012 3:12 PM
To: hdfs-user@hadoop.apache.org
Subject: Hadoop Debugging in LocalMode (Breakpoints not reached)
Hi there,
I am currently trying to get rid of bugs in my Hadoop program
Brendan,
The issue with using lots of small files is that your processing
overhead increases (repeated, avoidable file open-read(little)-close
calls). HDFS is also used by those who wish to heavily process
the data they've stored, and with a huge number of files such a process
is not going to be
Brendan,
The hdfs-default.xml does have dfs.name.dir listed:
http://hadoop.apache.org/common/docs/current/hdfs-default.html. The
configuration is also mentioned on the official tutorial:
http://hadoop.apache.org/common/docs/current/cluster_setup.html#Configuration+Files
On Tue, May 22, 2012 at
In addition to the responses already provided, there is another downside to
using Hadoop with numerous files: it takes much longer to run a Hadoop job!
Starting a Hadoop job involves communication between the driver (which runs
on a client machine outside the cluster) and the namenode to
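(Just to put a hedged, back-of-the-envelope number on this: if split
computation needs at least one namenode round trip per input file and
each round trip costs on the order of a millisecond, then a million
small input files add on the order of 15-20 minutes to job submission
alone. The per-RPC cost here is an assumption for illustration, not a
measured figure.)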
Brendan, since you are looking for a distributed file system that can store many
millions of files, try out MapR. A few customers have actually crossed
over 1 trillion files without hitting problems. Small files and large files
are handled equally well.
Of course, if you are doing map-reduce, it is