[ 
https://issues.apache.org/jira/browse/CASSANDRA-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523302#comment-16523302
 ] 

Kurt Greaves commented on CASSANDRA-14543:
------------------------------------------

{quote}But hints-handoff will only happen once, and we know the target node is 
missing that deletion. It may not need it if the tombstone is repaired from 
another node within GCGS, otherwise, it's the best (only) way to delete the 
data on that node.
{quote}
Well, it could cause more unnecessary read repair in the following scenario 
(which is the only case where HH of purgeable tombstones comes into play):
 3 nodes, A, B, C.
 # Insert as per your example
 # Node B goes down
 # Delete partition
 # GCGS passes
 # A and C compact away partition deletion
 # B comes back up
 # A/C HH tombstone to B
 
Any further read of that partition will now trigger a RR in which the 
tombstone is not propagated.
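
To make the sequence above concrete, here is a toy Python model of it. 
Everything in it (the Replica class, the timestamps, the compact and 
run_scenario helpers) is made up for illustration and is not Cassandra code. It 
only tracks which replica holds the row and the tombstone after each step, 
assuming hints are replayed regardless of GCGS as the ticket proposes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

GCGS = 30  # gc_grace_seconds, same value as the table in the issue description


@dataclass
class Replica:
    name: str
    up: bool = True
    row: Optional[Tuple[str, int]] = None  # (value, write time in seconds)
    tombstone: Optional[int] = None        # deletion time in seconds

    def compact(self, now: int) -> None:
        # Toy rule: once the tombstone is older than GCGS, drop it together
        # with any data it shadows.
        if self.tombstone is not None and now - self.tombstone >= GCGS:
            if self.row is not None and self.row[1] <= self.tombstone:
                self.row = None
            self.tombstone = None


def run_scenario():
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    nodes = [a, b, c]
    hints = []

    for n in nodes:              # 1. insert at t=0 on all replicas
        n.row = ("cstar", 0)
    b.up = False                 # 2. node B goes down
    for n in nodes:              # 3. delete at t=10, hint stored for B
        if n.up:
            n.tombstone = 10
        else:
            hints.append((n, 10))
    now = 50                     # 4. GCGS passes
    a.compact(now)               # 5. A and C compact away the deletion
    c.compact(now)
    b.up = True                  # 6. B comes back up
    for target, deleted_at in hints:
        target.tombstone = deleted_at  # 7. hint replayed although purgeable
    return {n.name: (n.row, n.tombstone) for n in nodes}


state = run_scenario()
for name, (row, tombstone) in state.items():
    print(name, "row:", row, "tombstone:", tombstone)
```

In this model B ends up holding the row plus an already-purgeable tombstone 
while A and C hold nothing, which is the state the reads in the paragraph above 
would run against.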

But really, we'd only be addressing the case where a deletion is performed 
within the HH window and the node then stays down until GCGS passes. This seems 
like a really narrow use case, especially because if a node is down for longer 
than GCGS you're going to have problems anyway (unless there's something I'm 
missing here).
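
As a rough sketch of how narrow that window is, the two conditions can be 
written as a predicate. The function and parameter names are hypothetical, and 
the example numbers assume the default gc_grace_seconds of 864000 (10 days) and 
a 3 hour hint window.

```python
def hint_would_carry_purgeable_tombstone(down_at, deleted_at, up_at,
                                         hint_window, gcgs):
    """All arguments are seconds on one shared clock (illustrative only)."""
    # A hint is only stored if the delete happens while the node is down
    # and still inside the hint window.
    hinted = down_at <= deleted_at < down_at + hint_window
    # The tombstone is purgeable once GCGS has elapsed since the deletion.
    purgeable_on_return = up_at >= deleted_at + gcgs
    return hinted and purgeable_on_return


HINT_WINDOW = 3 * 3600   # assumed 3 hour hint window
GCGS = 864000            # assumed default gc_grace_seconds (10 days)

# Deleted 1 hour after the node went down, node back after ~10.4 days.
print(hint_would_carry_purgeable_tombstone(0, 3600, 900_000, HINT_WINDOW, GCGS))
# Same deletion, but the node is back after 1 day: the tombstone is still live.
print(hint_would_carry_purgeable_tombstone(0, 3600, 86_400, HINT_WINDOW, GCGS))
# Deleted after the hint window closed: no hint exists at all.
print(hint_would_carry_purgeable_tombstone(0, 20_000, 900_000, HINT_WINDOW, GCGS))
```

Only the first call falls into the case the ticket would change.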

> Hinted handoff to replay purgeable tombstones 
> ----------------------------------------------
>
>                 Key: CASSANDRA-14543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14543
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jay Zhuang
>            Priority: Minor
>
> Hinted handoff currently dispatches and applies only the mutations that are 
> within GCGS: 
> [{{Hint.java:97}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hints/Hint.java#L97].
> This is to make sure it won't resurrect any deleted data.
> But replaying tombstones should be safe, and it could reduce the chance of 
> having [un-repairable inconsistent 
> data|https://lists.apache.org/thread.html/2d3d39d960143d4d2146ed2530821504ff855e832713dec7d0afd8ac@%3Cdev.cassandra.apache.org%3E].
> Here is the user scenario it tries to fix:
> {noformat}
> 1. Create a 3-node cluster
> 2. Create a table with a small gc_grace_seconds (to make reproduction easier):
> CREATE KEYSPACE foo WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': 3};
> CREATE TABLE foo.bar (
> id int PRIMARY KEY,
> name text
> ) WITH gc_grace_seconds=30;
> 3. Insert data with consistency all:
> INSERT INTO foo.bar (id, name) VALUES(1, 'cstar');
> 4. Stop one node:
> $ ccm node2 stop
> 5. Delete the data with consistency quorum:
> DELETE FROM foo.bar WHERE id=1;
> 6. Wait 30 seconds and then start node2:
> $ ccm node2 start
> {noformat}
> Now node2 has the data, while node1/node3 have the purgeable tombstone. Every 
> read triggers RR, which sends data from node2 to node1/node3 but repairs 
> nothing.
> With hinted handoff of purgeable tombstones, the tombstone will at least be 
> dispatched and the data deleted on node2. This won't fix the root cause, but 
> it reduces the chance of hitting this issue.


