That is a good script, thanks, but I would like to understand exactly what the problem is with my config, rather than adding another level of abstraction and just running the clusterdock command. In your script I can see that you are using --net=host. I think this is the main difference from what I am doing, which is creating a bridge network for the hadoop cluster. I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
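To compare the two network setups, something like the following could be run inside each container. It is only a minimal sketch, assuming Python is available in the images; `check_reverse_name` is a hypothetical helper, not part of HBase or clusterdock. It resolves a hostname to an IP and then resolves that IP back to a name, so any difference between the bridge network and --net=host would show up in the returned name.

```python
import socket


def check_reverse_name(hostname, forward=socket.gethostbyname,
                       reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, then resolve that IP back to a name.

    Returns a tuple (ip, reverse_name, matches), where matches is True
    only if the reverse lookup gives back exactly the name we started with.
    The forward/reverse parameters exist so the lookups can be swapped out.
    """
    ip = forward(hostname)
    # gethostbyaddr returns (primary_name, alias_list, ip_list)
    reverse_name = reverse(ip)[0]
    return ip, reverse_name, reverse_name == hostname
```

Calling `check_reverse_name("hadoop-slave2")` on hadoop-master and on the slaves themselves would show whether the forward and reverse names agree on each node.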
Why do those strange hadoop-slave2.hadoopnet aliases appear in the web UI? It looks like the network name is used as part of the hostname.

Any idea what is happening in my case?

Pierre

> On 5 Sep 2016, at 16:48, Dima Spivak <dimaspi...@apache.org> wrote:
>
> You should try the Apache HBase topology for clusterdock that was committed
> a few months back. See HBASE-12721 for details.
>
> On Sunday, September 4, 2016, Pierre Caserta <pierre.case...@gmail.com>
> wrote:
>
>> Hi,
>> I am building a fully distributed hbase cluster with unmanaged zookeeper.
>> I pretty much used this example and installed hbase on top of it:
>> https://github.com/kiwenlau/hadoop-cluster-docker
>>
>> Hadoop and hdfs work fine but I get this exception with hbase:
>>
>> 2016-09-05 06:27:12,268 INFO [hadoop-master:16000.activeMasterManager] zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at address=hadoop-slave2,16020,1473052276351, exception=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on hadoop-slave2.hadoopnet,16020,1473056813966
>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
>>
>> This is blocking because any command I enter in the hbase shell returns
>> the following error:
>>
>> ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>>
>> The containers are run using --net=hadoopnet,
>> which is a network created as such:
>>
>> docker network create --driver=bridge hadoopnet
>>
>> The hbase web UI is showing this:
>>
>> Region Servers
>> ServerName                                   Start time                    Version  Requests Per Second  Num. Regions
>> hadoop-slave1,16020,1473056814064            Mon Sep 05 06:26:54 UTC 2016  1.2.2    0                    0
>> hadoop-slave1.hadoopnet,16020,1473056814064  Mon Sep 05 06:26:54 UTC 2016  Unknown  0                    0
>> hadoop-slave2,16020,1473056813966            Mon Sep 05 06:26:53 UTC 2016  1.2.2    0                    0
>> hadoop-slave2.hadoopnet,16020,1473056813966  Mon Sep 05 06:26:53 UTC 2016  Unknown  0                    0
>> Total:4  2 nodes with inconsistent version  0  0
>>
>> I should have only 2 regionservers, but 2 strange entries,
>> hadoop-slave1.hadoopnet and hadoop-slave2.hadoopnet, are added to the list.
>> When I look at zk using:
>>
>> /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
>>
>> I only see my 2 regionservers: hadoop-slave1,16020,1473056814064 and
>> hadoop-slave2,16020,1473056813966
>>
>> Looking at the zookeeper.MetaTableLocator: Failed verification error, I see
>> that hadoop-slave2,16020,1473052276351 and
>> hadoop-slave2.hadoopnet,16020,1473056813966
>> get mixed up.
>>
>> Here is my config on all servers:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <configuration>
>>   <property>
>>     <name>hbase.rootdir</name>
>>     <value>hdfs://hadoop-master:9000/hbase</value>
>>     <description>The directory shared by region servers. Should
>>       be fully-qualified to include the filesystem to use. E.g:
>>       hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
>>   </property>
>>   <property>
>>     <name>hbase.master</name>
>>     <value>hdfs://hadoop-master:60000</value>
>>     <description>The host and port that the HBase master runs
>>       at.</description>
>>   </property>
>>   <property>
>>     <name>hbase.cluster.distributed</name>
>>     <value>true</value>
>>     <description>The mode the cluster will be in. Possible
>>       values are
>>       false: standalone and pseudo-distributed setups with managed
>>       Zookeeper
>>       true: fully-distributed with unmanaged Zookeeper Quorum (see
>>       hbase-env.sh)</description>
>>   </property>
>>   <property>
>>     <name>hbase.master.info.port</name>
>>     <value>60010</value>
>>     <description>The UI interface of HBase master
>>       runs.</description>
>>   </property>
>>   <property>
>>     <name>hbase.zookeeper.quorum</name>
>>     <value>zk</value>
>>     <description>string m_e_m_b_e_r_s is replaced by list of
>>       hosts separated by comma. Its generated by configure-slaves.sh on master
>>       node</description>
>>   </property>
>>   <property>
>>     <name>hbase.zookeeper.property.maxClientCnxns</name>
>>     <value>300</value>
>>   </property>
>>   <property>
>>     <name>hbase.zookeeper.property.datadir</name>
>>     <value>/tmp/zookeeper</value>
>>     <description>location of storage of zookeeper
>>       data</description>
>>   </property>
>>   <property>
>>     <name>hbase.zookeeper.property.clientPort</name>
>>     <value>2181</value>
>>   </property>
>>
>> </configuration>
>>
>> I created a stack overflow question as well:
>> http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
>>
>> Thanks,
>> Pierre
>
> --
> -Dima