[jira] [Comment Edited] (CASSANDRA-19018) An SAI-specific mechanism to ensure consistency isn't violated for multi-column (i.e. AND) queries at CL > ONE

Caleb Rackliffe (Jira) Fri, 02 Feb 2024 23:30:05 -0800


    [ https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813875#comment-17813875 ]


Caleb Rackliffe edited comment on CASSANDRA-19018 at 2/3/24 7:29 AM:
---------------------------------------------------------------------

I've been working in [the Harry 
PR|https://github.com/maedhroz/cassandra/pull/15], as it's just easier to 
iterate there. As things stabilize, I'll move them back to the 19018 patches. 
So far, Harry has pointed out two things:

1.) If we read at {{QUORUM}} or {{ALL}} on a single-node cluster, 
{{needsReconciliation()}} returns true, and we fall back to the non-strict 
filtering logic intended for use w/ multiple nodes. This is easy to avoid by 
setting a sensible read consistency level, and with this, 
{{SingleNodeSAITest}} and {{StaticsTortureTest}} make it through extended runs 
without problems. It might be better to additionally check the replication 
factor of the keyspace we're querying to make sure we still do the right thing 
for {{QUORUM}} and {{ALL}}. Making this change in 
{{StatementRestrictions#getRowFilter()}} also allows these two tests to run 
without issue.

2.) {{InJvmSutBase}} and therefore {{MultiNodeSAITest}} use paging w/ a fetch 
size of 1. (I've changed {{PartialUpdateHandlingTest}} to do this too, for 
reasons that will soon become apparent, although it still passes with flying 
colors.) In conjunction w/ the new non-strict filtering logic here, there are 
situations where the strict post-filtering that happens after replica filtering 
protection eliminates a non-strict match from a replica, but no short read 
protection occurs. I think this would also occur w/ a LIMIT of 1. I've made an 
attempt at adding short read protection in 
{{DataResolver#resolveWithReplicaFilteringProtection()}}, and this takes 
{{MultiNodeSAITest}} from failing almost immediately (about 10 seconds) to 
running happily for a minute or two, and eventually failing in a completely 
different way:

{noformat}
org.apache.cassandra.harry.model.Model$ValidationException: Found a row while 
model predicts statics only:
Expected:  rowStateRow(-4270507212538418608L, -9223372036854775808, 
values(-8292973307042192125L), lts(28972L), values(-8292973307042192125L), 
lts(28972L))
Actual: resultSetRow(-4270507212538418608L, 3338663292788043269L, 
statics(-8292973307042192125L), lts(28972L), 
values(-9223372036854775808L,-8292973307042192125L,-7423979211207825555L), 
lts(-9223372036854775808L,22376L,22376L))
Query: CompiledStatement{cql='SELECT pk1, pk2, pk3, ck1, ck2, ck3, s1, v1, v2, 
v3, writetime(s1), writetime(v1), writetime(v2), writetime(v3) FROM 
distributed_test_keyspace.tbl1 WHERE pk1 = ? AND pk2 = ? AND pk3 = ? AND v2 = ? 
AND v3 = ?;', 
bindings=1153178704L,"ZHHyABdiABdiABdiABdiABdiygANvtdA1417414080121120318513717714751110139374521458162113150301962217515912023323114324224572725422917410488118183642122461261341782131372397068109145207519610123518824536237251161062231134916218712465938595117240579750159172261719222439136",27216L,-8292973307042192125L,-7423979211207825555L}
{noformat}

Disabling static column indexing or increasing the fetch/page size to 10,000 
rows produces this:

{noformat}
org.apache.cassandra.harry.model.Model$ValidationException: Returned row state 
doesn't match the one predicted by the model:
Expected: -6688467811848818630-4962768465676381896, -4962768465676381896,  ( 
rowStateRow(5024781677423462809L, -2339751831653719635, 
values(-6688467811848818630L), lts(19921L), 
values(-6688467811848818630L,-4962768465676381896L,-4962768465676381896L), 
lts(264L,264L,7465L), 
clustering=[ZHHyABdiABdiABdiRwdFaXalaaANFfLo25295481012923054220223105571722471366830351971311472242226817725124121123136771432346020014173159421101258430141229231571772059511915411223222966681181131611075911168145115129138661402092155423321224244801012431697933235232,
 ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyaxSFhHosLPpEzbCrp, 57773], 
values=[GIzPsPrWJWiYMmxIzNckJdrhiUJHPEmagerzwhAVJyOVBlABXHfIIeyGqyovtyIOxvoNdWCExWChJsmxpsnwWJzRqYZNkjFPsIfJJitGAhNYiuVxGnifddZuxSdVBEqoJhugErHMAHXhvJVGKzOkwCPEQiWhEXDOPpNdszFFSNwoeOtYoNyCYigczgDARnqHVXlRuHDKEjxElPVmNeCUBgzyWKnBjhdHTBnsEDPkIYnGiKMcwhbzPMvmrIebFgKtgGOUcJopINGGKJgYvTkqlAmDMhWlpGqXPgfWWTvlMFeOITwGpsXQdBkpBVUCZxmd72114182,
 -4962768465676381896, -4962768465676381896]))
Actual:   -6688467811848818630-4962768465676381896, -7423979211207825555,  
(resultSetRow(5024781677423462809L, -2339751831653719635L, 
statics(-6688467811848818630L), lts(19921L), 
values(-6688467811848818630L,-4962768465676381896L,-7423979211207825555L), 
lts(264L,264L,264L))).
{noformat}
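
To make the short read described in item 2 concrete, here is a toy model, not Cassandra code. It assumes a single replica that returns up to {{limit}} non-strict matches per request, and a coordinator that applies the strict filter afterwards. All names ({{ShortReadSketch}}, {{replicaPage}}, {{readPage}}) are hypothetical, and rows are reduced to plain integers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of a short read after strict post-filtering; hypothetical names throughout. */
final class ShortReadSketch
{
    /** Replica side: return at most `limit` non-strict matches, starting at `offset`. */
    static List<Integer> replicaPage(List<Integer> candidates, int offset, int limit)
    {
        int end = Math.min(candidates.size(), offset + limit);
        return new ArrayList<>(candidates.subList(Math.min(offset, end), end));
    }

    /**
     * Coordinator side: fetch a page, apply the strict filter, and (optionally) keep
     * fetching while strict filtering has left the page short and the replica may have more.
     */
    static List<Integer> readPage(List<Integer> candidates, Predicate<Integer> strict,
                                  int limit, boolean shortReadProtection)
    {
        List<Integer> result = new ArrayList<>();
        int offset = 0;
        while (true)
        {
            List<Integer> page = replicaPage(candidates, offset, limit);
            offset += page.size();

            for (Integer row : page)
                if (result.size() < limit && strict.test(row))
                    result.add(row);

            // A replica page smaller than the limit means there is nothing left to fetch.
            boolean replicaExhausted = page.size() < limit;
            if (result.size() >= limit || replicaExhausted || !shortReadProtection)
                return result;
        }
    }

    public static void main(String[] args)
    {
        // Rows 1 and 3 are non-strict matches only; rows 2 and 4 also pass the strict filter.
        List<Integer> candidates = List.of(1, 2, 3, 4);
        Predicate<Integer> strict = row -> row % 2 == 0;

        System.out.println(readPage(candidates, strict, 1, false)); // empty page despite real matches
        System.out.println(readPage(candidates, strict, 1, true));  // keeps fetching until row 2 is found
    }
}
```

With a fetch size of 1, the unprotected read returns an empty page even though strict matches exist further along, which is the symptom described above. The real fix would have to work per replica and interact with replica filtering protection, which this sketch ignores.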

In any case, I've committed the state of my testing setup and [some preliminary 
fixes|https://github.com/maedhroz/cassandra/pull/15/commits/8d701c4f67bd7670fa1968fe4540259e4145000e]
 in the [WIP Harry branch|https://github.com/maedhroz/cassandra/pull/15]. It's 
possible we've identified a completely new failure case here. I'll have to pick 
it up Monday. The problems above should be easy to reproduce with that branch, 
though.
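
For the first point above, the extra replication-factor check might look roughly like the following. This is only a sketch under assumptions: the enum, {{blockFor}}, {{mayMergeMultipleReplicas}} and {{useStrictFiltering}} are hypothetical stand-ins, not the real {{needsReconciliation()}} or anything in the patch, and the quorum size is taken to be {{RF / 2 + 1}}.

```java
/** Hypothetical helper deciding between strict and non-strict filtering; not Cassandra code. */
final class StrictFilteringSketch
{
    /** Toy stand-in for the consistency levels mentioned in the comment. */
    enum ConsistencyLevel { ONE, QUORUM, ALL }

    /** Number of replica responses a read waits for at the given level and replication factor. */
    static int blockFor(ConsistencyLevel cl, int replicationFactor)
    {
        switch (cl)
        {
            case ONE:    return 1;
            case QUORUM: return replicationFactor / 2 + 1;
            case ALL:    return replicationFactor;
            default:     throw new IllegalArgumentException("Unknown level: " + cl);
        }
    }

    /** Reconciliation only matters when more than one replica response can be merged. */
    static boolean mayMergeMultipleReplicas(ConsistencyLevel cl, int replicationFactor)
    {
        return blockFor(cl, replicationFactor) > 1;
    }

    /** Strict filtering is safe when a single replica answers the whole query. */
    static boolean useStrictFiltering(ConsistencyLevel cl, int replicationFactor)
    {
        return !mayMergeMultipleReplicas(cl, replicationFactor);
    }

    public static void main(String[] args)
    {
        // RF = 1: QUORUM and ALL both resolve to a single replica, so strict filtering applies.
        System.out.println(useStrictFiltering(ConsistencyLevel.QUORUM, 1)); // true
        System.out.println(useStrictFiltering(ConsistencyLevel.ALL, 1));    // true
        // RF = 3: QUORUM waits for two replicas, so non-strict filtering is needed.
        System.out.println(useStrictFiltering(ConsistencyLevel.QUORUM, 3)); // false
    }
}
```

Deciding on the number of replicas actually contacted, rather than on the consistency level alone, keeps {{QUORUM}} and {{ALL}} on the strict path when the keyspace has a replication factor of 1.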


> An SAI-specific mechanism to ensure consistency isn't violated for 
> multi-column (i.e. AND) queries at CL > ONE
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19018
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19018
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0-rc, 5.x
>
>         Attachments: ci_summary-1.html, ci_summary.html, 
> result_details.tar-1.gz, result_details.tar.gz
>
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> CASSANDRA-19007 is going to be where we add a guardrail around 
> filtering/index queries that use intersection/AND over partially updated 
> non-key columns. (ex. Restricting one clustering column and one normal column 
> does not cause a consistency problem, as primary keys cannot be partially 
> updated.) This issue exists to attempt to fix this specifically for SAI in 
> 5.0.x, as Accord will (last I checked) not be available until the 5.1 release.
> The SAI-specific version of the originally reported issue is this:
> {noformat}
> try (Cluster cluster = init(Cluster.build(2).withConfig(config -> config.with(GOSSIP).with(NETWORK)).start()))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));
>     // insert a split row
>     cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>     cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>     // Uncomment this line and test succeeds w/ partial writes completed...
>     //cluster.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();
>     String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2");
>     Object[][] initialRows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
>     assertRows(initialRows, row(0, 1, 2)); // not found!!
> }
> {noformat}
> To make a long story short, the local SAI indexes are hiding local partial 
> matches from the coordinator that would combine there to form full matches. 
> Simple non-index filtering queries also suffer from this problem, but they 
> hide the partial matches in a different way. I'll outline a possible solution 
> for this in the comments that takes advantage of replica filtering protection 
> and the repaired/unrepaired datasets...and attempts to minimize the amount of 
> extra row data sent to the coordinator.
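
The way local filtering hides partial matches can be shown with a small in-memory model. This is a sketch only, not the SAI implementation: it assumes "non-strict" means that a row is sent to the coordinator as soon as any one restricted column matches, it merges row versions without timestamps, and every name in it ({{SplitRowSketch}}, {{strictMatch}}, {{nonStrictMatch}}, {{merge}}, {{read}}) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy model of the split-row problem for a single partition; none of these names exist in Cassandra. */
final class SplitRowSketch
{
    /** Strict: every restricted column must match locally (AND semantics). */
    static boolean strictMatch(Map<String, Integer> row, Map<String, Integer> restrictions)
    {
        return restrictions.entrySet().stream()
                           .allMatch(e -> Objects.equals(row.get(e.getKey()), e.getValue()));
    }

    /** Non-strict (assumed): one matching column is enough to send the row to the coordinator. */
    static boolean nonStrictMatch(Map<String, Integer> row, Map<String, Integer> restrictions)
    {
        return restrictions.entrySet().stream()
                           .anyMatch(e -> Objects.equals(row.get(e.getKey()), e.getValue()));
    }

    /** Merge versions of the same row; simplified to "any written value wins", no timestamps. */
    static Map<String, Integer> merge(List<Map<String, Integer>> versions)
    {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> version : versions)
            merged.putAll(version);
        return merged;
    }

    /** Coordinator read: filter on each replica, merge the responses, then re-check strictly. */
    static List<Map<String, Integer>> read(List<Map<String, Integer>> replicaRows,
                                           Map<String, Integer> restrictions,
                                           boolean strictOnReplicas)
    {
        List<Map<String, Integer>> responses = new ArrayList<>();
        for (Map<String, Integer> row : replicaRows)
            if (strictOnReplicas ? strictMatch(row, restrictions) : nonStrictMatch(row, restrictions))
                responses.add(row);

        List<Map<String, Integer>> result = new ArrayList<>();
        if (!responses.isEmpty())
        {
            Map<String, Integer> merged = merge(responses);
            if (strictMatch(merged, restrictions))
                result.add(merged);
        }
        return result;
    }

    public static void main(String[] args)
    {
        // The split row from the test above: node 1 only saw a = 1, node 2 only saw b = 2.
        List<Map<String, Integer>> replicas = List.of(Map.of("k", 0, "a", 1), Map.of("k", 0, "b", 2));
        Map<String, Integer> restrictions = Map.of("a", 1, "b", 2);

        System.out.println(read(replicas, restrictions, true));  // prints [] (neither replica matches alone)
        System.out.println(read(replicas, restrictions, false)); // prints the merged row
    }
}
```

Strict filtering on the replicas drops both halves of the row before the coordinator can combine them, while non-strict filtering followed by a strict check on the merged row finds the match, at the cost of sending more rows to the coordinator.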



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
