Ashu Pachauri created HBASE-15001:
-------------------------------------
Summary: Thread Safety issues in ReplicationSinkManager and
HBaseInterClusterReplicationEndpoint
Key: HBASE-15001
URL: https://issues.apache.org/jira/browse/HBASE-15001
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 2.0.0, 1.2.0, 1.3.0, 1.2.1
Reporter: Ashu Pachauri
Assignee: Ashu Pachauri
Priority: Critical
ReplicationSinkManager is not thread-safe. This can cause problems in
HBaseInterClusterReplicationEndpoint when the walprovider is multiwal.
For example:
1. When multiple threads report bad sinks concurrently, the sink list can be
non-empty but report a negative size, because the underlying ArrayList is not
thread-safe.
2. HBaseInterClusterReplicationEndpoint uses the number of sinks to batch
edits for shipping. However, the following code can wrongly conclude that there
are no batches to process: the sink count is non-zero at the first check, but
it can drop to zero by the time the "batching" part runs.
{code}
if (replicationSinkMgr.getSinks().size() == 0) {
  return false;
}
...
int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
    replicationSinkMgr.getSinks().size());
{code}
This is very dangerous: if the endpoint assumes there are no batches to
process, it reports that replication succeeded even though nothing was
actually replicated.
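One way to avoid the mismatch between the two size checks is to read the sink
count once and derive both the early exit and the batch count from that single
value. The following is only a minimal sketch under that assumption:
BatchPlanner and numBatches are hypothetical names for illustration and are
not part of HBase.

```java
// Hypothetical helper, not HBase code. The caller reads the sink count once
// and passes it in, so the emptiness check and the batch computation cannot
// observe two different sizes.
final class BatchPlanner {
    private BatchPlanner() {}

    /**
     * Returns the number of batches to ship, or 0 when no sinks are available.
     * The formula mirrors the snippet in this issue.
     */
    static int numBatches(int maxThreads, int numEntries, int sinkCount) {
        if (sinkCount <= 0) {
            // The caller should treat 0 as "nothing shipped", never as success.
            return 0;
        }
        return Math.min(Math.min(maxThreads, numEntries / 100 + 1), sinkCount);
    }
}
```

With this shape, a result of 0 is an explicit signal that the caller can map to
a failed attempt instead of falling through to a success report.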
The proposed fix is to make all operations in ReplicationSinkManager
thread-safe and to verify the number of replicated edits before we report
success.
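As a rough illustration of that idea (not the actual patch), the sketch below
guards a sink list with a single lock, hands out copies instead of the live
list, and compares shipped edits against expected edits before reporting
success. SafeSinkList, ShipVerifier, and their methods are hypothetical names,
and sinks are represented as plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a sink manager; not the HBase API.
final class SafeSinkList {
    private final Object lock = new Object();
    private final List<String> sinks = new ArrayList<>();

    SafeSinkList(List<String> initial) {
        sinks.addAll(initial);
    }

    /** Removes a bad sink; every access to the list goes through the lock. */
    void reportBadSink(String sink) {
        synchronized (lock) {
            sinks.remove(sink);
        }
    }

    /** Returns a copy, so callers see one consistent view of the sinks. */
    List<String> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(sinks);
        }
    }
}

// Hypothetical check performed before reporting success.
final class ShipVerifier {
    private ShipVerifier() {}

    /** True only if every edit that was meant to be shipped was shipped. */
    static boolean replicated(int expectedEdits, int shippedEdits) {
        return expectedEdits == shippedEdits;
    }
}
```

A caller would take one snapshot per replicate attempt, size its batches from
that snapshot, and report success only when ShipVerifier.replicated returns
true.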