[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.
[ https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-26195:
------------------------------
    Fix Version/s: 1.7.2
                   (was: 1.8.0)

> Data is present in replicated cluster but not present in primary cluster.
> --------------------------------------------------------------------------
>
>                 Key: HBASE-26195
>                 URL: https://issues.apache.org/jira/browse/HBASE-26195
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication, wal
>    Affects Versions: 1.7.0
>            Reporter: Rushabh Shah
>            Assignee: Rushabh Shah
>            Priority: Major
>             Fix For: 1.7.2
>
> We encountered a case where we see some rows (via Phoenix) in the replicated
> cluster that are not present in the source/active cluster. Triaging further,
> we found memstore rollback logs in a few of the region servers:
> {noformat}
> 2021-07-28 14:17:59,353 DEBUG [3,queue=3,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,353 DEBUG [,queue=25,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,354 DEBUG [3,queue=3,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,354 DEBUG [,queue=25,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,354 DEBUG [3,queue=3,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,354 DEBUG [,queue=25,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,355 DEBUG [3,queue=3,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,355 DEBUG [,queue=25,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,356 DEBUG [,queue=25,port=60020] regionserver.HRegion - rollbackMemstore rolled back 23
> {noformat}
> Looking further into the logs, we found HDFS-layer issues syncing the WAL
> to HDFS: it was taking around 6 minutes to sync the WAL. Logs below:
> {noformat}
> 2021-07-28 14:19:30,511 WARN [sync.0] hdfs.DataStreamer - Slow waitForAckedSeqno took 391210ms (threshold=3ms). File being written: /hbase/WALs/,60020,1626191371499/%2C60020%2C1626191371499.1627480615620, block: BP-958889176--1567030695029:blk_1689647875_616028364, Write pipeline datanodes: [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK], DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK], DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]].
> 2021-07-28 14:19:30,589 WARN [sync.1] hdfs.DataStreamer - Slow waitForAckedSeqno took 391148ms (threshold=3ms). File being written: /hbase/WALs/,60020,1626191371499/%2C60020%2C1626191371499.1627480615620, block: BP-958889176--1567030695029:blk_1689647875_616028364, Write pipeline datanodes: [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK], DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK], DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]].
> 2021-07-28 14:19:30,589 WARN [sync.2] hdfs.DataStreamer - Slow waitForAckedSeqno took 391147ms (threshold=3ms). File being written: /hbase/WALs/,60020,1626191371499/%2C60020%2C1626191371499.1627480615620, block: BP-958889176--1567030695029:blk_1689647875_616028364, Write pipeline datanodes: [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK], DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK], DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]].
> 2021-07-28 14:19:30,591 INFO [sync.0] wal.FSHLog - Slow sync cost: 391289 ms, current pipeline: [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK], DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK], DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]]
> 2021-07-28 14:19:30,591 INFO [sync.1] wal.FSHLog - Slow sync cost: 391227 ms, current pipeline: [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK], DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK], DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]]
> 2021-07-28 14:19:30,591 WARN [sync.1] wal.FSHLog - Requesting log roll because we exceeded slow sync threshold; time=391227 ms, threshold=1 ms, current pipeline: [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK], DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK], DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]]
> {noformat}
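The failure sequence these logs imply — an edit appended to the WAL file, a sync that times out, a memstore rollback on the primary, while the replication source independently tails the WAL file and ships the appended edit — can be sketched as a minimal, self-contained simulation. All class and method names below are illustrative placeholders, not HBase internals:

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Simulation of the sequence suggested by the logs above: an edit reaches
 * the WAL file, the HDFS sync times out, and the region rolls the edit back
 * out of the memstore -- but a replication reader tailing the WAL file may
 * already have shipped the appended edit to the peer cluster.
 */
public class WalRollbackSketch {

    /** Stand-in for a WAL file on HDFS; not HBase's actual WAL class. */
    static class Wal {
        final List<String> fileContents = new ArrayList<>(); // what a tailer can read
        boolean syncFails;                                    // simulate the slow/failed sync

        void append(String edit) { fileContents.add(edit); }

        void sync() throws IOException {
            if (syncFails) {
                throw new IOException("sync timed out (cf. waitForAckedSeqno took 391210ms)");
            }
        }
    }

    public static void main(String[] args) {
        Wal wal = new Wal();
        wal.syncFails = true;

        List<String> primaryMemstore = new ArrayList<>();
        List<String> peerCluster = new ArrayList<>();

        String edit = "row-1=value-1";    // placeholder edit
        wal.append(edit);                 // 1. edit written to the WAL file
        primaryMemstore.add(edit);        // 2. edit applied to the memstore

        // 3. replication source reads the WAL file, independent of sync outcome
        peerCluster.addAll(wal.fileContents);

        try {
            wal.sync();                   // 4. sync fails (~6 minutes in the incident)
        } catch (IOException e) {
            primaryMemstore.remove(edit); // 5. "rollbackMemstore rolled back N"
        }

        // Divergence: the peer now has a row the primary never made visible.
        System.out.println("primary memstore: " + primaryMemstore); // []
        System.out.println("peer cluster:     " + peerCluster);     // [row-1=value-1]
    }
}
{code}

The point the sketch isolates is ordering: the edit is in the WAL file (and therefore readable by the replication source) before the sync outcome is known, so a memstore rollback on the primary cannot retract what replication may already have shipped.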
[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.
[ https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rushabh Shah updated HBASE-26195:
---------------------------------
    Fix Version/s: 1.8.0
[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.
[ https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rushabh Shah updated HBASE-26195:
---------------------------------
    Affects Version/s: (was: 2.5.0)
                       (was: 3.0.0-alpha-1)
[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.
[ https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharath Vissapragada updated HBASE-26195:
-----------------------------------------
    Affects Version/s: 2.5.0
                       3.0.0-alpha-1
[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.
[ https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rushabh Shah updated HBASE-26195:
---------------------------------
    Summary: Data is present in replicated cluster but not present in primary cluster.
             (was: Data is present in replicated cluster but not visible on primary cluster.)
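A practical way to confirm this kind of divergence is a client-side read against both clusters (HBase's VerifyReplication MapReduce job performs the same comparison at scale). Below is a minimal sketch using the HBase 1.x client API; the ZooKeeper quorums, table name, and row key are placeholders:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CheckRowOnBothClusters {

    /** Returns true if the row exists on the cluster behind the given quorum. */
    static boolean rowExists(String zkQuorum, String table, String row) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", zkQuorum); // point the client at one cluster
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table t = conn.getTable(TableName.valueOf(table))) {
            return !t.get(new Get(Bytes.toBytes(row))).isEmpty();
        }
    }

    public static void main(String[] args) throws Exception {
        String row = "row-1"; // placeholder: a row key reported missing on the primary
        boolean onPrimary = rowExists("primary-zk", "T1", row); // placeholder quorum/table
        boolean onPeer    = rowExists("peer-zk", "T1", row);    // placeholder quorum/table
        if (!onPrimary && onPeer) {
            System.out.println(row + " diverged: present on peer, missing on primary");
        }
    }
}
{code}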