Sounds good, Pierre. FWIW, if you want a preview, here's how to get a 5-node HBase cluster running based on the master branch of HBase in about a minute:
1. Source the clusterdock.sh script that defines the clusterdock_ helper functions:

    source /dev/stdin <<< "$(curl -sL http://tiny.cloudera.com/clusterdock.sh)"

2. Start up a cluster:

    CLUSTERDOCK_TOPOLOGY_IMAGE=hbasejenkinsuser-docker-hbase.bintray.io/dev/clusterdock:apache_hbase_topology \
        clusterdock_run ./bin/start_cluster -r hbasejenkinsuser-docker-hbase.bintray.io --namespace dev \
        apache_hbase --hbase-version=master --hadoop-version=2.7.1 --secondary-nodes='node-{2..5}'

And that's it. Feel free to put a -h for help information (put it right after the ./bin/start_cluster for details about the function, or after the apache_hbase for details about the Apache HBase topology).

-Dima

On Mon, Sep 5, 2016 at 3:44 PM, Pierre Caserta <pierre.case...@gmail.com> wrote:

> Thanks for your answer.
> I will check the ticket https://issues.apache.org/jira/browse/HBASE-15961
> regularly and try clusterdock as soon as the documentation comes out.
> I will try to use hostnames with a domain, like master.hadoopnet.com, and a
> network named hadoopnet.com, to see whether this resolves the problem.
> Currently my hostnames are hadoop-master, hadoop-slave1 and hadoop-slave2;
> maybe that is the problem.
>
>
> On 5 Sep 2016, at 23:31, Dima Spivak <dimaspi...@apache.org> wrote:
> >
> > clusterdock uses --net=host for running the framework out of a container,
> > but each Hadoop/HBase cluster itself runs with its own bridge network. Just
> > suggesting clusterdock since it's what we now use for testing HBase
> > releases and it looks a bit more sophisticated than this other project
> > (e.g. no need to rebuild images for different cluster sizes).
> >
> > The error you're seeing is caused by not using the FQDN of the containers
> > when referring to them; Docker networks use the network name as the domain.
> >
> > On Monday, September 5, 2016, Pierre Caserta <pierre.case...@gmail.com>
> > wrote:
> >
> >> That is a good script, thanks, but I would like to understand exactly what
> >> the problem is with my config without adding another level of abstraction
> >> and just running the clusterdock command.
> >> In your script I can see that you are using --net=host. I think this is
> >> the main difference compared to what I am doing, which is creating a bridge
> >> network for the hadoop cluster.
> >> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
> >>
> >> Why do those strange hadoop-slave2.hadoopnet aliases appear in the web UI?
> >> It looks like the network name is used as part of the hostname.
> >> Any idea what is happening in my case?
> >>
> >> Pierre
> >>
> >>> On 5 Sep 2016, at 16:48, Dima Spivak <dimaspi...@apache.org> wrote:
> >>>
> >>> You should try the Apache HBase topology for clusterdock that was committed
> >>> a few months back. See HBASE-12721 for details.
> >>>
> >>> On Sunday, September 4, 2016, Pierre Caserta <pierre.case...@gmail.com>
> >>> wrote:
> >>>
> >>>> Hi,
> >>>> I am building a fully distributed hbase cluster with unmanaged zookeeper.
> >>>> I pretty much used this example and installed hbase on top of it:
> >>>> https://github.com/kiwenlau/hadoop-cluster-docker
> >>>>
> >>>> Hadoop and hdfs work fine but I get this exception with hbase:
> >>>>
> >>>> 2016-09-05 06:27:12,268 INFO [hadoop-master:16000.activeMasterManager]
> >>>> zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at
> >>>> address=hadoop-slave2,16020,1473052276351,
> >>>> exception=org.apache.hadoop.hbase.NotServingRegionException:
> >>>> Region hbase:meta,,1 is not online on
> >>>> hadoop-slave2.hadoopnet,16020,1473056813966
> >>>> at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
> >>>>
> >>>> This is blocking because any command I enter in the hbase shell
> >>>> returns the following error:
> >>>>
> >>>> ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is
> >>>> initializing
> >>>>
> >>>> The containers are run using --net=hadoopnet,
> >>>> which is a network created as such:
> >>>>
> >>>> docker network create --driver=bridge hadoopnet
> >>>>
> >>>> The hbase web UI is showing this:
> >>>>
> >>>> Region Servers
> >>>> ServerName                                   Start time                    Version  Requests Per Second  Num. Regions
> >>>> hadoop-slave1,16020,1473056814064            Mon Sep 05 06:26:54 UTC 2016  1.2.2    0                    0
> >>>> hadoop-slave1.hadoopnet,16020,1473056814064  Mon Sep 05 06:26:54 UTC 2016  Unknown  0                    0
> >>>> hadoop-slave2,16020,1473056813966            Mon Sep 05 06:26:53 UTC 2016  1.2.2    0                    0
> >>>> hadoop-slave2.hadoopnet,16020,1473056813966  Mon Sep 05 06:26:53 UTC 2016  Unknown  0                    0
> >>>> Total:4                                      2 nodes with inconsistent version      0                    0
> >>>>
> >>>> I should have only 2 regionservers, but 2 strange entries,
> >>>> hadoop-slave1.hadoopnet and hadoop-slave2.hadoopnet, are added to the list.
> >>>> When I look at zk using:
> >>>>
> >>>> /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
> >>>>
> >>>> I only see my 2 regionservers: hadoop-slave1,16020,1473056814064 and
> >>>> hadoop-slave2,16020,1473056813966
> >>>>
> >>>> Looking at the zookeeper.MetaTableLocator: Failed verification error, I see
> >>>> that hadoop-slave2,16020,1473052276351 and
> >>>> hadoop-slave2.hadoopnet,16020,1473056813966 get mixed up.
> >>>>
> >>>> Here is my config on all servers:
> >>>>
> >>>> <?xml version="1.0" encoding="UTF-8"?>
> >>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >>>>
> >>>> <configuration>
> >>>>   <property>
> >>>>     <name>hbase.rootdir</name>
> >>>>     <value>hdfs://hadoop-master:9000/hbase</value>
> >>>>     <description>The directory shared by region servers. Should
> >>>>     be fully-qualified to include the filesystem to use. E.g:
> >>>>     hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.master</name>
> >>>>     <value>hdfs://hadoop-master:60000</value>
> >>>>     <description>The host and port that the HBase master runs
> >>>>     at.</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.cluster.distributed</name>
> >>>>     <value>true</value>
> >>>>     <description>The mode the cluster will be in. Possible
> >>>>     values are
> >>>>     false: standalone and pseudo-distributed setups with managed
> >>>>     Zookeeper
> >>>>     true: fully-distributed with unmanaged Zookeeper Quorum (see
> >>>>     hbase-env.sh)</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.master.info.port</name>
> >>>>     <value>60010</value>
> >>>>     <description>The UI interface of HBase master
> >>>>     runs.</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.quorum</name>
> >>>>     <value>zk</value>
> >>>>     <description>string m_e_m_b_e_r_s is replaced by list of
> >>>>     hosts separated by comma. Its generated by configure-slaves.sh on master
> >>>>     node</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.property.maxClientCnxns</name>
> >>>>     <value>300</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.property.datadir</name>
> >>>>     <value>/tmp/zookeeper</value>
> >>>>     <description>location of storage of zookeeper
> >>>>     data</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.property.clientPort</name>
> >>>>     <value>2181</value>
> >>>>   </property>
> >>>>
> >>>> </configuration>
> >>>>
> >>>> I created a Stack Overflow question as well:
> >>>> http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
> >>>>
> >>>> Thanks,
> >>>> Pierre
> >>>
> >>>
> >>> --
> >>> -Dima
> >>
> >>
> >
> > --
> > -Dima
> >
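
For reference, the FQDN point made earlier in the thread (Docker networks use the network name as the domain) can be sketched as a small dry-run script. It only prints the docker commands and never runs them. The network name and node names are taken from the thread; the image name is a hypothetical placeholder, and whether giving each container a <name>.<network> hostname actually resolves the duplicate regionserver entries is an untested assumption, not something confirmed here.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) docker commands that give each
# container a hostname of the form <name>.<network>.
# NETWORK and the node names come from the thread.
# IMAGE is a hypothetical placeholder, not a real image.
NETWORK="hadoopnet"
IMAGE="my-hbase-image"

# Build the fully qualified name for a container on a given network.
fqdn() {
    printf '%s.%s\n' "$1" "$2"
}

# Emit the docker run command for one node without running it.
run_cmd() {
    node="$1"
    printf 'docker run -d --net=%s --name %s --hostname %s %s\n' \
        "$NETWORK" "$node" "$(fqdn "$node" "$NETWORK")" "$IMAGE"
}

# The bridge network itself, created as in the original report.
printf 'docker network create --driver=bridge %s\n' "$NETWORK"

for node in hadoop-master hadoop-slave1 hadoop-slave2; do
    run_cmd "$node"
done
```

If the printed commands look right for your setup, they can be pasted into a shell. The names used in hbase-site.xml and in the regionservers file would presumably need to use the same fully qualified form so that HBase sees one name per server.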