Hi,

I am trying to set up replication for my HBase clusters. I have two small test clusters, each with 4 machines, and the setup for the two clusters is identical. Each machine runs a DataNode and an HRegionServer. Three of the machines run a ZK peer, and one machine runs the HMaster and NameNode. The master cluster machines have hostnames (ds1, ds2, ...) and the slave cluster machines are (bk1, bk2, ...).

I set the replication scope to 1 for my test table's column families and set the hbase.replication property to true on both clusters. Next I ran the add_peer.rb script with the following command on the ds1 machine:

    hbase org.jruby.Main /usr/lib/hbase/bin/replication/add_peer.rb ds1:2181:/hbase bk1:2181:/hbase

After the script finished, ZK for the master cluster had the replication znode with the children peers, master, and state. The slave ZK didn't have a replication znode. I fixed that problem by rerunning the script on the bk1 machine with the code that writes to the master ZK commented out. Now the slave ZK has the /hbase/replication/master znode with data (ds1:2181:/hbase).

Everything looked to be configured correctly, so I restarted the clusters. The logs of the master regionservers stated:

    This cluster (ds1:2181:/hbase) is a master for replication, compared with (ds1:2181:/hbase)

The logs on the slave cluster stated:

    This cluster (bk1:2181:/hbase) is a slave for replication, compared with (ds1:2181:/hbase)

Using the hbase shell I put a row into the test table. The regionserver for that table had a log statement like:

    Going to report log #192.168.1.166%3A60020.1291757445179 for position 15828 in hdfs://ds1:9000/hbase/.logs/ds1.internal,60020,1291757445059/192.168.1.166%3A60020.1291757445179

(192.168.1.166 is ds1)

I waited, and even after several minutes the row still does not appear in the slave cluster table.

Any help with what the problem might be is greatly appreciated. Both clusters are using CDH3b3. The HBase version is exactly 0.89.20100924+28.

-Nathaniel Cook
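
For anyone comparing values while reproducing this, here is a minimal sketch (plain Python, not part of HBase or of add_peer.rb) of how the strings in this post break down. It assumes the cluster keys follow the quorum:port:znode-parent layout seen in the add_peer.rb arguments, and that the regionserver log name is simply URL-encoded. The helper names parse_cluster_key and decode_log_name are hypothetical and exist only for illustration.

```python
from urllib.parse import unquote


def parse_cluster_key(key):
    """Split a key such as 'ds1:2181:/hbase' into its three parts.

    Assumption: the layout is quorum:client_port:znode_parent, as in the
    arguments passed to add_peer.rb in this post.
    """
    quorum, port, znode_parent = key.split(":", 2)
    return {
        "quorum": quorum,
        "client_port": int(port),
        "znode_parent": znode_parent,
    }


def decode_log_name(name):
    """Undo the URL-encoding in a log name (%3A stands for ':')."""
    return unquote(name)


# Values copied from the post above.
master = parse_cluster_key("ds1:2181:/hbase")
slave = parse_cluster_key("bk1:2181:/hbase")
print(master)
print(slave)
print(decode_log_name("192.168.1.166%3A60020.1291757445179"))
```

This only makes it easier to compare the value stored in /hbase/replication/master on the slave ZK against the key used on the master side; it does not talk to ZooKeeper or HBase.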
