Hi Roshan,

Are you seeing any replication-related exceptions in your RS logs?

On Tue, Apr 27, 2021 at 1:59 PM Roshan <jlks...@gmail.com> wrote:

> Hi,
>
> In the hbase-1.4.10, I have enabled replication for all tables and
> configured the peer_id. the list_peers provide the below result:
>
> hbase(main):001:0> list_peers
> > PEER_ID CLUSTER_KEY ENDPOINT_CLASSNAME STATE TABLE_CFS BANDWIDTH
> > 1 10.XX.221.XX,10.XX.234.XX,10.XX.212.XX:2171:/hbase nil ENABLED nil 0
> > 1 row(s) in 0.1430 seconds
>
> But the status_replication shows replication lag
>
> hbase(main):002:0> status 'replication'
> > version 1.4.10
> > 3 live servers
> > 10.XX.232.XX:
> > SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1,
> > TimeStampsOfLastShippedOp=Thu Jan 01 05:30:00 IST 1970, Replication Lag=
> > *1619545264329*
> > SINK : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Tue Apr 27
> > 23:09:23 IST 2021
> > 10.XX.118.XX:
> > SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1,
> > TimeStampsOfLastShippedOp=Thu Jan 01 05:30:00 IST 1970, Replication Lag=
> > *1619545264663*
> > SINK : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Tue Apr 27
> > 18:53:23 IST 2021
> > 10.XX.138.XX:
> > SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1,
> > TimeStampsOfLastShippedOp=Thu Jan 01 05:30:00 IST 1970, Replication Lag=
> > *1619545263509*
> > SINK : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Tue Apr 27
> > 10:31:05 IST 2021
>
> But all the data are replicated properly to the defined cluster. I have
> checked the table in both clusters.
>
> I have verified using VerifyReplication Mapreduce to check unreplicated
> rows. But there are no rows in the unreplicated one. All are good Rows.
>
> > ./hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication 1
> > tablename
> >
> > org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier$Counters
> > GOODROWS=45
> > File Input Format Counters
> > Bytes Read=0
> > File Output Format Counters
> > Bytes Written=0
>
> Due to this issue, I have Zknodes under replication is growing
> exponentially which causes issues in running ZK cluster which eventually
> affects the Hbase Connection too. Below exception occurs in ZK
>
> *ERROR java.io.IOException: Len error*
>
> Increasing jute.maxbuffer in ZK will not solve the problem as replication
> znode is increasing though the data are replicated properly to the given
> cluster Peer_id.
>
> I have enabled two-way replication between the cluster. It happens in both
> the cluster.
>
> hbase version - 1.4.10
> ZK Version - 3.4.10
> Hadoop version - 2.7.3
>
> Please help to fix this.
>
> Regards,
> Roshan
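
A note on reading the quoted status output: each Replication Lag value looks like a wall-clock time in epoch milliseconds rather than a real delay. TimeStampsOfLastShippedOp is reported as Thu Jan 01 05:30:00 IST 1970, which is epoch 0 in IST, so a lag derived as "now minus last shipped timestamp" would simply equal the current time. That derivation is an assumption on my part, not something confirmed in this thread. A minimal Python sketch to check the arithmetic (the function name is hypothetical, and IST is taken as UTC+05:30):

```python
from datetime import datetime, timedelta, timezone

# IST as a fixed offset of UTC+05:30 (matches the "IST" in the shell output).
IST = timezone(timedelta(hours=5, minutes=30))


def lag_as_wall_clock(lag_ms, last_shipped_ms=0):
    """Interpret last_shipped_ms + lag_ms as an epoch timestamp in IST.

    Assumption: the lag is "current time - TimeStampsOfLastShippedOp",
    so adding the two back together recovers the time the status was taken.
    last_shipped_ms defaults to 0 because the output shows the 1970 epoch.
    """
    total_seconds = (last_shipped_ms + lag_ms) // 1000
    return datetime.fromtimestamp(total_seconds, tz=IST)


if __name__ == "__main__":
    # Lag values copied from the three SOURCE lines in the quoted output.
    for lag in (1619545264329, 1619545264663, 1619545263509):
        print(lag, "->", lag_as_wall_clock(lag).strftime("%Y-%m-%d %H:%M:%S IST"))
```

For the first server this gives 2021-04-27 23:11:04 IST, roughly two minutes after that server's TimeStampsOfLastAppliedOp (23:09:23 IST). If this reading holds, the lag figure by itself would not indicate unreplicated rows, which is consistent with the VerifyReplication result. It does not explain SizeOfLogQueue=1 or the growing replication znodes, so the RS logs are still worth checking.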