[ https://issues.apache.org/jira/browse/KUDU-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222418#comment-15222418 ]
Todd Lipcon commented on KUDU-1073:
-----------------------------------
[~jdcryans] have you seen any issues like this in recent months? From the
comments above, it seems likely this has been fixed.
> Single TS falling too far behind hung YCSB
> ------------------------------------------
>
> Key: KUDU-1073
> URL: https://issues.apache.org/jira/browse/KUDU-1073
> Project: Kudu
> Issue Type: Bug
> Components: client, consensus
> Affects Versions: Private Beta
> Reporter: Todd Lipcon
> Assignee: Jean-Daniel Cryans
> Priority: Critical
>
> This caused a YCSB job to fail:
> - a server fell behind for some reason (haven't root-caused why --
> maybe it was just a bit slow)
> - the leader GCed the logs needed to catch it up, and thus stopped sending
> it any heartbeats or other messages
> - the server had one write pending
> - the Java client apparently just kept retrying over and over against the
> same server
> The server with the pending txn may actually have been the leader at the time
> the write was issued - otherwise I'm not sure why the Java client keeps
> retrying it. Or perhaps the Java client got an error on the leader, failed
> over to try the follower, and RPCs to the follower are timing out.
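
The retry behavior described above (a client retrying indefinitely against one stuck server) can be contrasted with a bounded retry loop that rotates across replicas. The following is only a minimal sketch under stated assumptions: it is not Kudu's client code, and `write_with_failover`, `ReplicaTimeout`, `demo_send`, and the `ts-N` server names are all hypothetical names introduced for illustration.

```python
import itertools


class ReplicaTimeout(Exception):
    """Raised by a (hypothetical) replica that does not answer in time."""


def write_with_failover(replicas, send, max_attempts=6):
    """Try replicas in round-robin order, giving up after max_attempts tries.

    replicas: list of replica names (hypothetical identifiers).
    send: callable(replica) -> result; raises ReplicaTimeout on failure.
    Returns (result, attempts), where attempts lists the replicas tried.
    """
    attempts = []
    for replica in itertools.islice(itertools.cycle(replicas), max_attempts):
        attempts.append(replica)
        try:
            return send(replica), attempts
        except ReplicaTimeout:
            # Move on to the next replica rather than retrying this one.
            continue
    raise TimeoutError(f"write failed after {max_attempts} attempts: {attempts}")


def demo_send(replica):
    # "ts-1" stands in for the stuck server from the report; others are healthy.
    if replica == "ts-1":
        raise ReplicaTimeout(replica)
    return f"ok from {replica}"
```

With three replicas where only `ts-1` is stuck, the loop succeeds on the second attempt; with `ts-1` as the only replica, it raises `TimeoutError` after the attempt budget is spent instead of hanging the job.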