[jira] [Commented] (CASSANDRA-19495) Hints not stored after node goes down for the second time

Stefan Miklosovic (Jira) Mon, 01 Apr 2024 06:55:05 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832849#comment-17832849
 ]


Stefan Miklosovic commented on CASSANDRA-19495:
-----------------------------------------------

I see 3 failing test runs in the multiplexer for trunk, but I do not know 
what causes them; this does not happen on lower branches. 

It times out at 

"at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:529)"

A ModificationStatement is either a DeleteStatement or an UpdateStatement. We 
are not deleting anything, only selecting and inserting, so it must be the 
insertion (it also throws WriteTimeoutException). I could probably just wrap 
the insertion in a try-catch and ignore WriteTimeoutException, but I am not 
sure that solves anything. Some writes might obviously time out as we 
populate the db. 

https://app.circleci.com/pipelines/github/driftx/cassandra/1549/workflows/e0e763fa-e847-45c0-82ed-34f394758cd9/jobs/80339/tests
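
For illustration, the try-catch idea mentioned above could look roughly like 
the sketch below. It is not the test's actual code: the Insert interface, the 
populate method and the local WriteTimeoutException class are hypothetical 
stand-ins (the real exception lives in org.apache.cassandra.exceptions) so 
that the example compiles on its own.

```java
// Sketch only: tolerate write timeouts while populating data in a test.
class TolerantInsertSketch {
    // Stand-in for Cassandra's WriteTimeoutException so the sketch is self-contained.
    static class WriteTimeoutException extends RuntimeException {
        WriteTimeoutException(String message) {
            super(message);
        }
    }

    // Hypothetical abstraction over "execute one INSERT" in the test.
    interface Insert {
        void execute(int key);
    }

    // Runs inserts for keys [0, count) and returns how many of them timed out.
    static int populate(Insert insert, int count) {
        int timedOut = 0;
        for (int key = 0; key < count; key++) {
            try {
                insert.execute(key);
            } catch (WriteTimeoutException e) {
                // Swallow the timeout but keep count, so the test can still bound it.
                timedOut++;
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        // Simulated insert that times out for every key ending in 3.
        Insert flaky = key -> {
            if (key % 10 == 3) {
                throw new WriteTimeoutException("simulated timeout for key " + key);
            }
        };
        System.out.println("timed out writes: " + populate(flaky, 100));
    }
}
```

Counting the ignored timeouts instead of silently dropping them would let the 
test assert an upper bound, which might help tell an occasional slow write 
from a real regression.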

> Hints not stored after node goes down for the second time
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-19495
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19495
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Hints
>            Reporter: Paul Chandler
>            Assignee: Stefan Miklosovic
>            Priority: Urgent
>             Fix For: 4.1.x, 5.0-rc, 5.x
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> I have a scenario where a node goes down, hints are recorded on the second 
> node and replayed, as expected. If the first node goes down for a second time 
> and the time span between the first time it stopped and the second time it 
> stopped is more than the max_hint_window, then the hint is not recorded, no 
> hint file is created, and the mutation never arrives at the node after it 
> comes up again.
> I have debugged this and it appears to be due to the way the hint window is 
> persisted after https://issues.apache.org/jira/browse/CASSANDRA-14309
> The code here: 
> [https://github.com/apache/cassandra/blame/cassandra-4.1/src/java/org/apache/cassandra/service/StorageProxy.java#L2402]
>  uses the time stored in the HintsBuffer.earliestHintByHost map.  This map is 
> based on the UUID of the host, but this does not seem to be cleared when the 
> node is back up, and I think this is what is causing the problem.
>  
> This is in cassandra 4.1.5
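
The behaviour described in the report can be reproduced with a toy model like 
the one below. This is a minimal sketch, not Cassandra's code: only the map 
name earliestHintByHost comes from the report, while the class, the 
shouldHint and onHostUp methods, and the millisecond clock are assumptions 
made for illustration. Clearing the entry when the host comes back is shown 
only to contrast the two outcomes; the ticket text above does not say that 
this is the actual fix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy model of the reported behaviour; NOT Cassandra's implementation.
class HintWindowSketch {
    // Named after the map mentioned in the report; everything else is hypothetical.
    private final Map<UUID, Long> earliestHintByHost = new HashMap<>();
    private final long maxHintWindowMillis;

    HintWindowSketch(long maxHintWindowMillis) {
        this.maxHintWindowMillis = maxHintWindowMillis;
    }

    // Returns true if a hint would be stored for a write to a down host at nowMillis.
    boolean shouldHint(UUID hostId, long nowMillis) {
        Long earliest = earliestHintByHost.get(hostId);
        if (earliest != null && nowMillis - earliest > maxHintWindowMillis) {
            // The window is measured from a timestamp that may belong to an old outage.
            return false;
        }
        earliestHintByHost.putIfAbsent(hostId, nowMillis);
        return true;
    }

    // Hypothetical hook: forget the timestamp once the host is reachable again.
    void onHostUp(UUID hostId) {
        earliestHintByHost.remove(hostId);
    }

    public static void main(String[] args) {
        long hour = 3_600_000L;
        UUID host = UUID.randomUUID();
        HintWindowSketch hints = new HintWindowSketch(3 * hour);

        // First outage starts at t = 0.
        System.out.println("first outage, hint stored: " + hints.shouldHint(host, 0L));
        // Second outage starts 4 hours later, with the old entry still in the map.
        System.out.println("second outage, stale entry: " + hints.shouldHint(host, 4 * hour));
        // Same second outage after the entry has been cleared.
        hints.onHostUp(host);
        System.out.println("second outage, entry cleared: " + hints.shouldHint(host, 4 * hour));
    }
}
```

In this model the second outage is refused a hint only because the timestamp 
of the first outage is still in the map, which matches the symptom in the 
report.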



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
