Thanks a lot, Edward,

This information is very helpful to me.

With Best Regards

Adarsh Sharma

edward choi wrote:
Dear Adarsh,

I have a single machine running Namenode/JobTracker/Hbase Master.
There are 17 machines running Datanode/TaskTracker.
Among those 17 machines, 14 are running Hbase Regionservers.
The other 3 machines are running Zookeeper.

And about the Zookeeper,
Hbase comes with its own Zookeeper, so you don't need to install a new
Zookeeper (except in one special case, which I'll explain later).
I assigned 14 machines as regionservers using
I assigned 3 machines as Zookeepers using the "hbase.zookeeper.quorum" property
in "$HBASE_HOME/conf/hbase-site.xml".
Don't forget to set "export HBASE_MANAGES_ZK=true"
in "$HBASE_HOME/conf/". (This is where you declare that you
will be using the Zookeeper that comes with HBase.)
This way, when you execute "$HBASE_HOME/bin/", HBase will
automatically start Zookeeper first, then start HBase daemons.
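As a rough illustration of the quorum setting described above, here is a minimal Python sketch that builds an hbase-site.xml fragment containing the "hbase.zookeeper.quorum" property. The three hostnames and the helper name build_hbase_site are made up for the example; substitute the machines you actually want to run Zookeeper on. It only shows the shape of the property and is not a complete configuration.

```python
import xml.etree.ElementTree as ET


def build_hbase_site(properties):
    """Return hbase-site.xml text for a mapping of property name -> value."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # Pretty-print when the running Python supports it (3.9 and later).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Hypothetical hostnames: replace with your own three Zookeeper machines.
zookeeper_hosts = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]

# The quorum value is a comma-separated list of the Zookeeper hosts.
site_xml = build_hbase_site({"hbase.zookeeper.quorum": ",".join(zookeeper_hosts)})
print(site_xml)
```

The printed <property> element would be merged into the existing hbase-site.xml alongside whatever other properties the cluster already sets.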

Also, you can install your own Zookeeper and tell HBase to use it instead of
its own.
I read on the internet that the Zookeeper that comes with HBase does not work
properly on Windows 7 64-bit. (
So in that case you need to install your own Zookeeper, set it up properly,
and tell HBase to use it instead of its own.
All you need to do is configure zoo.cfg and add it to the HBase CLASSPATH.
And don't forget to set "export HBASE_MANAGES_ZK=false"
in "$HBASE_HOME/conf/".
This way, HBase will not start Zookeeper automatically.
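For the standalone case, a zoo.cfg for a three-node ensemble might look roughly like what the following Python sketch prints. The hostnames, the data directory, and the helper name render_zoo_cfg are assumptions for illustration only. The tickTime/initLimit/syncLimit values are the ones commonly shown in Zookeeper's sample configuration, and 2181, 2888 and 3888 are the conventional client, peer and election ports; adjust all of them for your own setup.

```python
def render_zoo_cfg(hosts, data_dir, client_port=2181):
    """Render zoo.cfg text for an ensemble made of the given hosts."""
    lines = [
        "tickTime=2000",   # basic time unit in milliseconds
        "initLimit=10",    # ticks a follower may take to connect and sync
        "syncLimit=5",     # ticks a follower may lag behind the leader
        f"dataDir={data_dir}",
        f"clientPort={client_port}",
    ]
    # One server.N line per ensemble member: host:peerPort:electionPort.
    for server_id, host in enumerate(hosts, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines) + "\n"


# Hypothetical hosts and data directory, used only for this example.
zoo_cfg = render_zoo_cfg(
    ["zk1.example.com", "zk2.example.com", "zk3.example.com"],
    "/var/lib/zookeeper",
)
print(zoo_cfg)
```

Each Zookeeper machine also needs a "myid" file in its data directory containing the number N from its own server.N line.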

About the separation of Zookeepers from regionservers:
Yes, it is recommended to separate Zookeepers from regionservers.
But that won't be necessary unless your cluster is very heavily loaded.
They also suggest that you give Zookeeper its own hard disk, but I haven't
done that myself yet. (Hard disks cost money, you know.)
So I'd say your cluster seems fine.
But when you want to expand your cluster, you will need to make some changes. I suggest
you take a look at "Hadoop: The Definitive Guide".


2011/1/13 Adarsh Sharma <>

Thanks Edward,

Can you describe the architecture used in your configuration?

For example, I have a cluster of 10 servers, and

1 node acts as (Namenode, Jobtracker, Hmaster).
The remaining 9 nodes act as (slaves, Datanodes, Tasktrackers, Hregionservers).
Among these 9 nodes I also set 3 nodes in

I want to know whether it is necessary to configure Zookeeper separately with
the zookeeper-3.2.2 package, or whether it is enough to have some IPs listed in and let Hbase take care of it.

Can we reuse the IPs of the Hregionservers as Zookeeper servers
(HQuorumPeer), or must we have separate servers for it?

My problem arises in running Zookeeper. My Hbase is up and running in
fully distributed mode too.

With Best Regards

Adarsh Sharma

edward choi wrote:

Dear Adarsh,

My situation is somewhat different from yours as I am only running Hadoop
and Hbase (as opposed to Hadoop/Hive/Hbase).

But I hope my experience could be of help to you somehow.

I applied the "hdfs-630-0.20-append.patch" to every single Hadoop node
(including master and slaves).
Then I followed exactly what they told me to do on

I didn't get a single error message and successfully started HBase in a
fully distributed mode.

I am not using Hive, so I can't tell what caused the
MasterNotRunningException, but the patch above is meant to allow
DFS clients to pass the NameNode lists of known dead Datanodes.
I doubt that the patch has anything to do with the MasterNotRunningException.

Hope this helps.


2011/1/13 Adarsh Sharma <>

I am also facing some issues, and I think applying

   would solve my problem.

I am trying to run the Hadoop/Hive/Hbase integration in fully distributed mode.

But I am facing the MasterNotRunningException mentioned in

My Hadoop version = 0.20.2, Hive = 0.6.0, Hbase = 0.20.6.

What do you think, Edward?

Thanks,
Adarsh

edward choi wrote:

I am not familiar with this whole svn and patch stuff, so please excuse
my asking.

I was going to apply
because I wanted to install HBase and the installation guide told me to.
The append branch you mentioned, does that include
Is it like the latest patch with all the good stuff packed in one?


2011/1/12 Ted Dunning <>

You may also be interested in the append branch:

On Tue, Jan 11, 2011 at 3:12 AM, edward choi <> wrote:

Thanks for the info.
I am currently using Hadoop 0.20.2, so I guess I only need to apply

I wasn't familiar with the term "trunk". I guess it means "the latest
Thanks again.

Best Regards,

2011/1/11 Konstantin Boudnik <>

Yeah, that's pretty crazy all right. In your case it looks like the 3
patches at the top are the latest for the 0.20-append branch, the 0.21 branch
and trunk (which is perhaps the 0.22 branch at the moment). It doesn't look
like you need to apply all of them; just try the latest for your
particular branch.

The mess is caused by the fact that people are using different names for
consecutive patches (as in file.1.patch, file.2.patch, etc.). This is
_very_ confusing indeed, especially when different contributors work
on the same fix/feature.
Take care,
Konstantin (Cos) Boudnik

On Mon, Jan 10, 2011 at 01:10, edward choi <> wrote:

For the first time I am about to apply a patch to HDFS.

Above is the one that I am trying to do.
But there are like 15 patches and I don't know which one to use.

Could anyone tell me if I need to apply them all or just the one at



The whole patching process is just so confusing :-(

