[ 
https://issues.apache.org/jira/browse/CASSANDRA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347781#comment-16347781
 ] 

Jordan West commented on CASSANDRA-14055:
-----------------------------------------

Hi [~lboutros],

One of the original authors of SASI here. I've been taking a look at this issue 
and your patch. Using the provided test against the {{cassandra-3.11}} branch 
(fc3357a00e2b6e56d399f07c5b81a82780c1e143), I see three different failure cases 
– two related directly to this issue and one tangentially related. More details 
on those below. With respect to this issue in particular, the three scenarios 
cause the test to fail because {{IndexSummaryManager}} ends up creating a new 
{{View}} where {{oldSSTables}} and {{newIndexes}} have overlapping values. This 
occurs because the {{IndexSummaryManager}} may "update" (re-open) an 
{{SSTableReader}} for an index already in the view. I believe this is unique to 
{{IndexSummaryManager}} and I am able to make your tests pass* without your 
patch by ensuring that there is no overlap between {{oldSStables}} and 
{{newIndexes}} (favoring {{newIndexes}}). Your patch looks to do this as well, 
though the approach is a bit different.

One thing I am curious about in your patch is the {{keepFile}} changes to 
{{SSTableIndex#release}}. Generally, this concerns me because it seems to be 
working around improper reference counting rather than correcting the reference 
counting itself. Also, while using the provided test, I am unable to hit a case 
where the condition {{obsolete.get() || sstableRef.globalCount() == 0}} is 
true. I see the file missing in the {{View}} but not on disk itself. Could you 
elaborate a bit more on the need for this change and your use of the 
{{keepFile}} flag?

The three failure scenarios I see using the provided test are:
h5. 8 keys returned - sequential case

In this scenario, at the time when the query that fails runs, the {{View}} is 
missing the most recently flushed sstable. As mentioned previously, this is 
because the intersection of {{oldSSTables}} and {{newIndexes}} is non-empty. 
This can be fixed* by ensuring nothing in {{newIndexes}} is in {{oldSSTables}}. 
I call this the sequential case because the compaction that occurs during the 
test completes before the index summary redistribution begins to create a new 
{{View}}. This is also addressed by your patch.
h5. 8 keys returned - race case

This scenario is similar to the previous one but has the additional issue of 
triggering improper {{SSTableIndex}} reference counting. From the perspective 
of the provided test, the failure scenario is the same and the fix* is as well. 
The issue occurs because of a race between compaction and index 
redistribution's creation of new {{View}} instances. This causes redistribution 
to create two {{View}} instances, the first of which is thrown away due to a 
failed compare and swap. The problem is the side-effects (calling 
{{SSTableIndex#release}}) have occurred already inside the creation of the 
garbage {{View}}, causing the reference count for the index to drop below 0. I 
see this issue as a separate one from this ticket and will file a separate 
JIRA. It is not fixed by the previously mentioned change and while I haven't 
checked in detail, I don't think the provided patch addresses this either.
h5. 0 keys returned

This scenario is similar to the first but there are three threads involved in 
the race: the compaction, the flushing of the last memtable, and the index 
redistribution. In this case, the end result is an empty {{View}}, which leads 
to no keys being returned since the system thinks there are no indexes to 
search. This is fixed* by what I mentioned previously and occurs because index 
redistribution re-opens both sstables in the original {{View}} instead of just 
one. It is also addressed by your patch. 

 

I am curious if you see any other failure scenarios besides these three and, in 
particular, if you can elaborate on and provide examples of the issues you see 
regarding the files being missing on disk and the need for the {{keepFile}} 
change.
 
\* While this fix makes the provided test pass I am still verifying its correct 
from the reference counting perspective.

> Index redistribution breaks SASI index
> --------------------------------------
>
>                 Key: CASSANDRA-14055
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14055
>             Project: Cassandra
>          Issue Type: Bug
>          Components: sasi
>            Reporter: Ludovic Boutros
>            Assignee: Ludovic Boutros
>            Priority: Major
>              Labels: patch
>             Fix For: 3.11.2
>
>         Attachments: CASSANDRA-14055.patch, CASSANDRA-14055.patch, 
> CASSANDRA-14055.patch
>
>
> During index redistribution process, a new view is created.
> During this creation, old indexes should be released.
> But, new indexes are "attached" to the same SSTable as the old indexes.
> This leads to the deletion of the last SASI index file and breaks the index.
> The issue is in this function : 
> [https://github.com/apache/cassandra/blob/9ee44db49b13d4b4c91c9d6332ce06a6e2abf944/src/java/org/apache/cassandra/index/sasi/conf/view/View.java#L62]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to