Thanks for your answer. I will check the ticket https://issues.apache.org/jira/browse/HBASE-15961 regularly and try clusterdock as soon as the documentation comes out. I will try using hostnames with a domain, such as master.hadoopnet.com, on a network named hadoopnet.com, to see whether this resolves the problem. Currently my hostnames are hadoop-master, hadoop-slave1 and hadoop-slave2; maybe that is the problem.
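For reference, here is a rough sketch of the naming scheme I have in mind. It only prints the docker commands rather than running them, and it is untested: the image name (my-hbase-image) and the short node names are placeholders I made up, and I am assuming, based on the explanation quoted below, that a container is reachable as <container name>.<network name> on a user-defined bridge network.

```shell
#!/bin/sh
# Sketch only: prints the docker commands instead of executing them.
# NETWORK, IMAGE and the node names are placeholders, not a tested setup.
NETWORK="hadoopnet.com"
IMAGE="my-hbase-image"   # hypothetical image name

# Build the qualified name <container name>.<network name>.
fqdn() {
    printf '%s.%s\n' "$1" "$NETWORK"
}

# Print one `docker run` line per node, with --hostname set to the
# qualified name built by fqdn.
print_run_commands() {
    for node in master slave1 slave2; do
        echo "docker run -d --net=$NETWORK --name=$node --hostname=$(fqdn "$node") $IMAGE"
    done
}

echo "docker network create --driver=bridge $NETWORK"
print_run_commands
```

If this works, I expect hbase.rootdir and the regionservers file would also have to use the same qualified names (for example master.hadoopnet.com), but I have not verified that yet.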
> On 5 Sep 2016, at 23:31, Dima Spivak <dimaspi...@apache.org> wrote:
>
> clusterdock uses --net=host for running the framework out of a container,
> but each Hadoop/HBase cluster itself runs with its own bridge network. Just
> suggesting clusterdock since it's what we now use for testing HBase
> releases and it looks a bit more sophisticated than this other project
> (e.g. no need to rebuild images for different cluster sizes).
>
> The error you're seeing is caused by not using the FQDN of the containers
> when referring to them; Docker networks use the network name as the domain.
>
> On Monday, September 5, 2016, Pierre Caserta <pierre.case...@gmail.com>
> wrote:
>
>> That is a good script, thanks, but I would like to understand exactly what
>> the problem is with my config without adding another level of abstraction
>> and just running the clusterdock command.
>> In your script I can see that you are using --net=host. I think this is
>> the main difference compared to what I am doing, which is creating a bridge
>> network for the hadoop cluster.
>> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
>>
>> Why do those strange hadoop-slave2.hadoopnet aliases appear in the web UI?
>> It looks like the network name is used as part of the hostname.
>> Any idea what is happening in my case?
>>
>> Pierre
>>
>>> On 5 Sep 2016, at 16:48, Dima Spivak <dimaspi...@apache.org> wrote:
>>>
>>> You should try the Apache HBase topology for clusterdock that was committed
>>> a few months back. See HBASE-12721 for details.
>>>
>>> On Sunday, September 4, 2016, Pierre Caserta <pierre.case...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>> I am building a fully distributed hbase cluster with unmanaged zookeeper.
>>>> I pretty much used this example and installed hbase on top of it:
>>>> https://github.com/kiwenlau/hadoop-cluster-docker
>>>>
>>>> Hadoop and hdfs work fine but I get this exception with hbase:
>>>>
>>>> 2016-09-05 06:27:12,268 INFO [hadoop-master:16000.activeMasterManager]
>>>> zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at
>>>> address=hadoop-slave2,16020,1473052276351,
>>>> exception=org.apache.hadoop.hbase.NotServingRegionException:
>>>> Region hbase:meta,,1 is not online on
>>>> hadoop-slave2.hadoopnet,16020,1473056813966
>>>> at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
>>>>
>>>> This is blocking because any command I enter in the hbase shell returns
>>>> the following error:
>>>>
>>>> ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>>>>
>>>> The containers are run using --net=hadoopnet,
>>>> which is a network created as follows:
>>>>
>>>> docker network create --driver=bridge hadoopnet
>>>>
>>>> The hbase web UI is showing this:
>>>>
>>>> Region Servers
>>>> ServerName                                   Start time                    Version  Requests Per Second  Num. Regions
>>>> hadoop-slave1,16020,1473056814064            Mon Sep 05 06:26:54 UTC 2016  1.2.2    0                    0
>>>> hadoop-slave1.hadoopnet,16020,1473056814064  Mon Sep 05 06:26:54 UTC 2016  Unknown  0                    0
>>>> hadoop-slave2,16020,1473056813966            Mon Sep 05 06:26:53 UTC 2016  1.2.2    0                    0
>>>> hadoop-slave2.hadoopnet,16020,1473056813966  Mon Sep 05 06:26:53 UTC 2016  Unknown  0                    0
>>>> Total:4 2 nodes with inconsistent version 0 0
>>>>
>>>> I should have only 2 regionservers, but 2 strange entries,
>>>> hadoop-slave1.hadoopnet and hadoop-slave2.hadoopnet, are added to the list.
>>>> When I look at zk using:
>>>>
>>>> /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
>>>>
>>>> I only see my 2 regionservers: hadoop-slave1,16020,1473056814064 and
>>>> hadoop-slave2,16020,1473056813966
>>>>
>>>> Looking at the zookeeper.MetaTableLocator: Failed verification error, I see
>>>> that hadoop-slave2,16020,1473052276351 and
>>>> hadoop-slave2.hadoopnet,16020,1473056813966 get mixed up.
>>>>
>>>> Here is my config on all servers:
>>>>
>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>
>>>> <configuration>
>>>>   <property>
>>>>     <name>hbase.rootdir</name>
>>>>     <value>hdfs://hadoop-master:9000/hbase</value>
>>>>     <description>The directory shared by region servers. Should
>>>>     be fully-qualified to include the filesystem to use. E.g:
>>>>     hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
>>>>   </property>
>>>>   <property>
>>>>     <name>hbase.master</name>
>>>>     <value>hdfs://hadoop-master:60000</value>
>>>>     <description>The host and port that the HBase master runs
>>>>     at.</description>
>>>>   </property>
>>>>   <property>
>>>>     <name>hbase.cluster.distributed</name>
>>>>     <value>true</value>
>>>>     <description>The mode the cluster will be in. Possible
>>>>     values are
>>>>     false: standalone and pseudo-distributed setups with managed
>>>>     Zookeeper
>>>>     true: fully-distributed with unmanaged Zookeeper Quorum (see
>>>>     hbase-env.sh)</description>
>>>>   </property>
>>>>   <property>
>>>>     <name>hbase.master.info.port</name>
>>>>     <value>60010</value>
>>>>     <description>The UI interface of HBase master
>>>>     runs.</description>
>>>>   </property>
>>>>   <property>
>>>>     <name>hbase.zookeeper.quorum</name>
>>>>     <value>zk</value>
>>>>     <description>string m_e_m_b_e_r_s is replaced by list of
>>>>     hosts separated by comma. Its generated by configure-slaves.sh on master
>>>>     node</description>
>>>>   </property>
>>>>   <property>
>>>>     <name>hbase.zookeeper.property.maxClientCnxns</name>
>>>>     <value>300</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>hbase.zookeeper.property.datadir</name>
>>>>     <value>/tmp/zookeeper</value>
>>>>     <description>location of storage of zookeeper
>>>>     data</description>
>>>>   </property>
>>>>   <property>
>>>>     <name>hbase.zookeeper.property.clientPort</name>
>>>>     <value>2181</value>
>>>>   </property>
>>>>
>>>> </configuration>
>>>>
>>>> I created a Stack Overflow question as well:
>>>> http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
>>>>
>>>> Thanks,
>>>> Pierre
>>>
>>> --
>>> -Dima
>
> --
> -Dima