[jira] [Commented] (CASSANDRA-5732) Can not query secondary index

Sam Tunnicliffe (JIRA) Thu, 10 Oct 2013 07:03:58 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791509#comment-13791509
 ]


Sam Tunnicliffe commented on CASSANDRA-5732:
--------------------------------------------

The reason for the missing results is that in CFS.getColumnFamily() we look up 
the cfId from Schema to calculate the cache key. However, 2i CFSes are never 
loaded into the Schema, so Schema.instance.getId always returns null. Simply 
fixing this by calling Schema.instance.load() with the 2i CFMD when the index 
is initialized uncovers another issue. The cfId is now retrievable, but the 
deserialization of a cached 2i row fails, as it depends on the 2i CFMD being 
present in the enclosing KSMD for the eventual call to Schema.getCFMD(). Once 
we start adding index CFs to Schema, they become involved in schema 
migrations, which makes everything very messy. So rather than adding them 
directly to KSMD like regular CFs, I added a separate cfId->CFMD map to Schema. 
As far as most things are concerned nothing has changed; we just have one 
further place to look when retrieving the CFMD for a given cfId.
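
A minimal sketch of the separate-map idea described above. It is not the 
attached patch: TableMeta, SchemaSketch and every method name here are 
hypothetical stand-ins for CFMD and Schema, chosen only to show the lookup 
falling through to a second cfId->CFMD map that migrations never see.

```java
import java.util.Collection;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for CFMD; only the fields this sketch needs.
final class TableMeta {
    final UUID cfId;
    final String name;

    TableMeta(UUID cfId, String name) {
        this.cfId = cfId;
        this.name = name;
    }
}

// Hypothetical, simplified registry; not Cassandra's actual Schema class.
final class SchemaSketch {
    // Regular CFs: in the real code these are reachable through KSMD
    // and therefore take part in schema migrations.
    private final Map<UUID, TableMeta> regularCfs = new ConcurrentHashMap<>();

    // Index CFs: kept in a separate map so that migrations never see them.
    private final Map<UUID, TableMeta> indexCfs = new ConcurrentHashMap<>();

    void loadRegular(TableMeta meta) {
        regularCfs.put(meta.cfId, meta);
    }

    void loadIndex(TableMeta meta) {
        indexCfs.put(meta.cfId, meta);
    }

    // Check the regular CFs first, then the one further place to look.
    TableMeta getCFMetaData(UUID cfId) {
        TableMeta meta = regularCfs.get(cfId);
        return meta != null ? meta : indexCfs.get(cfId);
    }

    // What a migration would iterate over: regular CFs only.
    Collection<TableMeta> migratable() {
        return regularCfs.values();
    }
}
```

With this shape, a cached 2i row can be resolved by cfId during 
deserialization, while anything that walks the regular CFs is unaffected.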

The attached patch is against the 1.2 branch. CASSANDRA-4875 is a duplicate of 
this, but has a fixver of 1.1. [~jbellis], do you want me to submit a patch 
against 1.1 as well?

I wrote a dtest for this, pull request for that here: 
https://github.com/riptano/cassandra-dtest/pull/22

While looking at this, I also uncovered what I think is an issue with the 
setup of the 2i cache config. In AbstractSimplePerColumnSecondaryIndex (in 
1.2; the same code is in KeysIndex in 1.1), the estimated key and mean column 
counts are used to gauge the index's cardinality, which in turn decides 
whether or not to enable row caching. This calculation is first performed 
before the index is actually built, so there are no SSTables to provide the 
estimates. As a result, row caching is always disabled until the next time 
the index is initialized, when C* is restarted (this appears to be why the 
repro steps require a restart). If this is a genuine problem, I'll create a 
separate JIRA to address it.
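
To illustrate the suspected problem, here is a hedged sketch of a 
cardinality-based decision of this kind. The class and method names are 
hypothetical, and the exact rule (keys-to-columns ratio greater than 1) is an 
assumption for illustration rather than a quote of the Cassandra source. It 
also assumes that both estimates are zero when no SSTables exist yet.

```java
// Hypothetical sketch; not the code in AbstractSimplePerColumnSecondaryIndex.
final class IndexCacheSketch {

    // Decide whether the index CF should get a row cache.
    //   baseCachesRows - whether the base CF's caching setting includes rows
    //   estimatedKeys  - estimated number of keys in the index CF
    //   meanColumns    - mean number of columns per index row
    // Assumed rule: many keys relative to columns per key means high
    // cardinality, and only then is row caching switched on.
    static boolean shouldEnableRowCache(boolean baseCachesRows,
                                        long estimatedKeys,
                                        double meanColumns) {
        if (!baseCachesRows) {
            return false;
        }
        // Before the index is built there are no SSTables, so (by the
        // assumption above) both estimates are 0 and this check fails.
        return meanColumns > 0 && estimatedKeys / meanColumns > 1;
    }
}
```

Under these assumptions, a call made at index creation time sees 
estimatedKeys = 0 and meanColumns = 0 and returns false regardless of the 
base CF's setting, whereas the same call after a restart sees real estimates 
from the SSTables and may return true.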


> Can not query secondary index
> -----------------------------
>
>                 Key: CASSANDRA-5732
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5732
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.5
>         Environment: Windows 8, Jre 1.6.0_45 32-bit
>            Reporter: Tony Anecito
>            Assignee: Sam Tunnicliffe
>         Attachments: 5732-1.2
>
>
> Noticed that after taking a column family that already existed and assigning 
> index_type:KEYS to an IntegerType column, with caching already set to 'ALL', 
> the prepared statement did not return rows, nor did it throw an exception. 
> Here is the sequence.
> 1. Starting state: query running with caching off for a Column Family, with 
> the query using the secondary index for the WHERE clause.
> 2. Set Column Family caching to ALL using Cassandra-cli and update CQL. 
> Cassandra-cli Describe shows column family caching set to ALL
> 3. Rerun query and it works.
> 4. Restart Cassandra and run query and no rows returned. Cassandra-cli 
> Describe shows column family caching set to ALL
> 5. Set Column Family caching to NONE using Cassandra-cli and update CQL. 
> Rerun query and no rows returned. Cassandra-cli Describe for column family 
> shows caching set to NONE.
> 6. Restart Cassandra. Rerun query and it is working again. We are now back to 
> the starting state.
> Best Regards,
> -Tony



--
This message was sent by Atlassian JIRA
(v6.1#6144)
