Github user adamjshook commented on a diff in the pull request:
https://github.com/apache/accumulo/pull/254#discussion_r114592878
--- Diff:
server/tserver/src/main/java/org/apache/accumulo/tserver/replication/AccumuloReplicaSystem.java
---
@@ -536,8 +570,21 @@ public ReplicationStats execute(Client client) throws Exception {
// If we have some edits to send
if (0 < edits.walEdits.getEditsSize()) {
+ // Check if we are interrupted before writing the edits
+ if (Thread.interrupted()) {
+ log.debug("Replication work interrupted before writing edits, returning empty replication stats");
+ return new ReplicationStats(0L, 0L, 0L);
+ }
+
log.debug("Sending {} edits", edits.walEdits.getEditsSize());
long entriesReplicated = client.replicateLog(remoteTableId, edits.walEdits, tcreds);
--- End diff --
I don't see a difference either way. What I am trying to address is a rare
block in the replication pipeline, not necessarily work that takes a long time
to do. I just need the RPC call to time out and bubble up so that the tserver
performing the replication work can release the lock and the work can be
retried. This does, however, complicate my testing, since I am now dealing
with Thrift timeouts.
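
To make the intent concrete, here is a minimal, self-contained sketch of the pattern I have in mind: a blocked call is bounded by a timeout, the timeout propagates to the caller, and the lock is released in a `finally` block so the work can be picked up again. This is not the Accumulo or Thrift code. `ReplicationTimeoutSketch`, `workLock`, `callWithTimeout` and `replicateOnce` are hypothetical names, and a `ReentrantLock` plus an executor stand in for the real replication work lock and the Thrift client timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

public class ReplicationTimeoutSketch {
  // Hypothetical stand-in for the lock held while replication work is in progress.
  static final ReentrantLock workLock = new ReentrantLock();

  // Runs the (simulated) RPC with a deadline. A TimeoutException is rethrown
  // so that it bubbles up to the caller instead of blocking forever.
  static long callWithTimeout(Callable<Long> rpc, long timeoutMs) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<Long> result = pool.submit(rpc);
      try {
        return result.get(timeoutMs, TimeUnit.MILLISECONDS);
      } catch (TimeoutException e) {
        result.cancel(true); // interrupt the stuck call
        throw e;
      }
    } finally {
      pool.shutdownNow();
    }
  }

  // Returns true if the work completed, false if it failed and should be retried.
  // The lock is always released, whether the call succeeds, fails, or times out.
  static boolean replicateOnce(Callable<Long> rpc, long timeoutMs) {
    workLock.lock();
    try {
      callWithTimeout(rpc, timeoutMs);
      return true;
    } catch (Exception e) {
      return false; // includes TimeoutException: leave the work for a retry
    } finally {
      workLock.unlock();
    }
  }

  public static void main(String[] args) {
    // Simulated blocked RPC: sleeps far longer than the timeout.
    boolean blocked = replicateOnce(() -> { Thread.sleep(5_000); return 1L; }, 100);
    // Simulated healthy RPC: returns immediately.
    boolean ok = replicateOnce(() -> 42L, 1_000);
    System.out.println("blocked call completed: " + blocked);
    System.out.println("healthy call completed: " + ok);
    System.out.println("lock still held: " + workLock.isLocked());
  }
}
```

In the real code the timeout would come from the Thrift transport rather than from `Future.get`, which is exactly the part that makes this harder to test.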
---