On Mon, Apr 25, 2011 at 12:21 PM, Jonathan Ellis wrote:
I bet the problem is with the other tasks on the executor that Gossip
heartbeat runs on.
I see at least two that could cause blocking: hint cleanup
post-delivery and flush-expired-memtables, both of which call
forceFlush which will block if the flush queue + threads are full.
We've run into this
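
To make the blocking scenario described above concrete, here is a minimal sketch in Python. It is not Cassandra code: the single-worker pool, `blocking_flush`, `heartbeat`, and `run_demo` are all hypothetical stand-ins, and the only assumption is that the heartbeat shares one executor thread with a task that can block.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_demo(wait_seconds=0.2):
    """Queue a heartbeat behind a blocking task on a one-thread executor.

    Returns (ran_while_blocked, ran_after_release).
    """
    flush_queue_has_room = threading.Event()  # stands in for a free flush slot
    heartbeat_sent = threading.Event()

    def blocking_flush():
        # Hypothetical stand-in for a task that calls forceFlush and
        # blocks until the flush queue has room again.
        flush_queue_has_room.wait()

    def heartbeat():
        # Hypothetical stand-in for the gossip heartbeat task.
        heartbeat_sent.set()

    executor = ThreadPoolExecutor(max_workers=1)  # one shared worker thread
    executor.submit(blocking_flush)
    executor.submit(heartbeat)  # queued behind the blocked task

    # While the flush task holds the only worker, the heartbeat cannot run.
    ran_while_blocked = heartbeat_sent.wait(timeout=wait_seconds)

    flush_queue_has_room.set()  # simulate the flush queue draining
    ran_after_release = heartbeat_sent.wait(timeout=5)
    executor.shutdown(wait=True)
    return ran_while_blocked, ran_after_release


if __name__ == "__main__":
    print(run_demo())  # prints (False, True)
```

If the heartbeat stays queued for longer than the failure detector tolerates, other nodes would mark this node as down even though it is still running, which is consistent with the behaviour reported in this thread.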
Got just enough time to look at this today to verify that:
Sometimes nodes (under pressure) fail to send heartbeats for long
enough to get marked as dead by other nodes (why is a good question,
which I need to check better; it does not seem to be GC).
The node does however start sending heart
World as seen from .81 in the below ring
.81 Up   Normal 85.55 GB 8.33% Token(bytes[30])
.82 Down Normal 83.23 GB 8.33% Token(bytes[313230])
.83 Up   Normal 70.43 GB 8.33% Token(bytes[313437])
.84 Up   Normal 81.7 GB  8.33% Token(bytes