[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.

2022-08-02 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-26195:
--
Fix Version/s: 1.7.2
   (was: 1.8.0)

> Data is present in replicated cluster but not present in primary cluster.
> -
>
> Key: HBASE-26195
> URL: https://issues.apache.org/jira/browse/HBASE-26195
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, wal
>Affects Versions: 1.7.0
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
> Fix For: 1.7.2
>
>
> We encountered a case where we see some rows (via Phoenix) in the 
> replicated cluster, but they are not present in the source/active cluster.
> Triaging further, we found memstore rollback logs in a few of the region servers.
> {noformat}
> 2021-07-28 14:17:59,353 DEBUG [3,queue=3,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,353 DEBUG [,queue=25,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,354 DEBUG [3,queue=3,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,354 DEBUG [,queue=25,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,354 DEBUG [3,queue=3,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,354 DEBUG [,queue=25,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,355 DEBUG [3,queue=3,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,355 DEBUG [,queue=25,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> 2021-07-28 14:17:59,356 DEBUG [,queue=25,port=60020] regionserver.HRegion - 
> rollbackMemstore rolled back 23
> {noformat}
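> The rollbackMemstore messages above appear to come from the region server undoing a 
> batch whose WAL sync did not complete. As a heavily simplified, hypothetical sketch 
> (not the actual HRegion/FSHLog code; all class and method names are illustrative), 
> the write path looks roughly like this:
> {code:java}
> // Illustrative sketch only -- not the real HRegion/FSHLog code.
> import java.io.IOException;
> import java.util.List;
> 
> public class RollbackSketch {
> 
>   interface WriteAheadLog {
>     long append(List<String> edits) throws IOException; // returns a sequence id
>     void sync(long sequenceId) throws IOException;      // can block for minutes, as in the logs below
>   }
> 
>   interface MemStore {
>     void add(List<String> edits);
>     void rollback(List<String> edits);                  // analogous to "rollbackMemstore rolled back 23"
>   }
> 
>   /** Apply a batch: memstore first, then WAL append + sync; roll back the memstore if the sync fails. */
>   static void applyBatch(MemStore memstore, WriteAheadLog wal, List<String> edits) throws IOException {
>     memstore.add(edits);
>     long seqId = wal.append(edits);   // the edit is now in the WAL file on HDFS
>     try {
>       wal.sync(seqId);                // a slow or failed sync surfaces here after a long wait
>     } catch (IOException e) {
>       memstore.rollback(edits);       // the primary forgets the edit...
>       throw e;                        // ...but the appended WAL entry is not removed
>     }
>   }
> }
> {code}
> The point of the sketch is the ordering: by the time the sync fails, the edit has 
> already been appended to the WAL file, so rolling back the memstore does not remove 
> it from the WAL.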
> Looking further into the logs, we found that there were HDFS-layer issues syncing 
> the WAL to HDFS.
> It was taking around 6 minutes to sync the WAL. Logs below:
> {noformat}
> 2021-07-28 14:19:30,511 WARN  [sync.0] hdfs.DataStreamer - Slow 
> waitForAckedSeqno took 391210ms (threshold=3ms). File being written: 
> /hbase/WALs/,60020,1626191371499/%2C60020%2C1626191371499.1627480615620,
>  block: BP-958889176--1567030695029:blk_1689647875_616028364, Write 
> pipeline datanodes: 
> [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]].
> 2021-07-28 14:19:30,589 WARN  [sync.1] hdfs.DataStreamer - Slow 
> waitForAckedSeqno took 391148ms (threshold=3ms). File being written: 
> /hbase/WALs/,60020,1626191371499/%2C60020%2C1626191371499.1627480615620,
>  block: BP-958889176--1567030695029:blk_1689647875_616028364, Write 
> pipeline datanodes: 
> [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]].
> 2021-07-28 14:19:30,589 WARN  [sync.2] hdfs.DataStreamer - Slow 
> waitForAckedSeqno took 391147ms (threshold=3ms). File being written: 
> /hbase/WALs/,60020,1626191371499/%2C60020%2C1626191371499.1627480615620,
>  block: BP-958889176--1567030695029:blk_1689647875_616028364, Write 
> pipeline datanodes: 
> [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]].
> 2021-07-28 14:19:30,591 INFO  [sync.0] wal.FSHLog - Slow sync cost: 391289 
> ms, current pipeline: 
> [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]]
> 2021-07-28 14:19:30,591 INFO  [sync.1] wal.FSHLog - Slow sync cost: 391227 
> ms, current pipeline: 
> [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]]
> 2021-07-28 14:19:30,591 WARN  [sync.1] wal.FSHLog - Requesting log roll 
> because we exceeded slow sync threshold; time=391227 ms, threshold=1 ms, 
> current pipeline: 
> [DatanodeInfoWithStorage[:50010,DS-b5747702-8ab9-4a5e-916e-5fae6e305738,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-505dabb0-0fd6-42d9-b25d-f25e249fe504,DISK],
>  
> DatanodeInfoWithStorage[:50010,DS-6c585673-d4d0-4ec6-bafe-ad4cd861fb4b,DISK]]
> {noformat}
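> Since the appended WAL entry survives the rollback, the replication side can still 
> pick it up. As another hypothetical sketch (not the actual ReplicationSource 
> implementation; all names are illustrative), a WAL-tailing shipper only sees what was 
> appended to the WAL and cannot tell a rolled-back edit from a committed one:
> {code:java}
> // Illustrative sketch only -- not the real replication code.
> import java.util.ArrayDeque;
> import java.util.Queue;
> 
> public class ReplicationShipperSketch {
> 
>   /** One WAL entry as the shipper sees it: just the appended edit, with no
>    *  record of whether the primary later rolled its memstore back. */
>   record WalEntry(long sequenceId, String row) {}
> 
>   /** Tails a WAL and ships every appended entry to the peer cluster. */
>   static void shipLoop(Queue<WalEntry> walTail, Queue<WalEntry> peerCluster) {
>     WalEntry entry;
>     while ((entry = walTail.poll()) != null) {
>       // An entry whose memstore write was rolled back on the primary looks
>       // identical to a committed one, so it is replayed on the peer anyway.
>       peerCluster.add(entry);
>     }
>   }
> 
>   public static void main(String[] args) {
>     Queue<WalEntry> walTail = new ArrayDeque<>();
>     walTail.add(new WalEntry(1L, "row-rolled-back-on-primary"));
>     Queue<WalEntry> peer = new ArrayDeque<>();
>     shipLoop(walTail, peer);
>     System.out.println(peer); // the peer now holds an edit the primary never committed
>   }
> }
> {code}
> If this is what happened here, it would explain rows that exist only on the 
> replicated cluster.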

[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.

2021-09-08 Thread Rushabh Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh Shah updated HBASE-26195:
-
Fix Version/s: 1.8.0


[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.

2021-08-19 Thread Rushabh Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh Shah updated HBASE-26195:
-
Affects Version/s: (was: 2.5.0)
   (was: 3.0.0-alpha-1)


[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.

2021-08-12 Thread Bharath Vissapragada (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Vissapragada updated HBASE-26195:
-
Affects Version/s: 2.5.0
   3.0.0-alpha-1


[jira] [Updated] (HBASE-26195) Data is present in replicated cluster but not present in primary cluster.

2021-08-12 Thread Rushabh Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh Shah updated HBASE-26195:
-
Summary: Data is present in replicated cluster but not present in primary 
cluster.  (was: Data is present in replicated cluster but not visible on 
primary cluster.)
