[jira] [Commented] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys

Mathijs Vogelzang (JIRA) Thu, 04 Jun 2015 08:34:31 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572973#comment-14572973
 ]


Mathijs Vogelzang commented on CASSANDRA-9540:
----------------------------------------------

Another observation: when testing this bug from our java code with the datastax 
cql driver, we notice that while the written data is still in the memtable, 
everything works correctly. Once its flushed to an SSTable on disk, the 
unpredictable behavior starts. Maybe this bug is another argument for 
https://issues.apache.org/jira/browse/CASSANDRA-9161 ?

> Cql query doesn't return right information when using IN on columns for some 
> keys
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9540
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9540
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>         Environment: Cassandra 2.1.5
>            Reporter: Mathijs Vogelzang
>            Assignee: Carl Yeksigian
>             Fix For: 2.1.x
>
>
> We are investigating a weird issue where one of our clients doesn't get data 
> on his dashboard. It seems Cassandra is not returning data for a particular 
> key ("brokenkey" from now on).
> Some background:
> We have a row where we store a "metadata" column and data in columns 
> "bucket/0", "bucket/1", "bucket/2", etc. Depending on the date selection of 
> the UI, we know that we only need to retrieve bucket/0, bucket/0 and bucket/1 
> etc. (we always need to retrieve "metadata").
> A typical query may look like this (using SELECT column1 to just show what is 
> returned, normally we would of course do SELECT value):
> {noformat}
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
> key=textAsBlob('install/workingkey');
>  blobAsText(column1)
> ---------------------
>             bucket/0
>             metadata
> (2 rows)
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
> key=textAsBlob('install/brokenkey');
>  blobAsText(column1)
> ---------------------
>             bucket/0
>             metadata
> (2 rows)
> {noformat}
> These two queries work as expected, and return the information that we 
> actually stored.
> However, when we "filter" for certain columns, the brokenkey starts behaving 
> very weird:
> {noformat}
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
> key=textAsBlob('install/workingkey') and column1 IN 
> (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));
>  blobAsText(column1)
> ---------------------
>             bucket/0
>             metadata
> (2 rows)
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
> key=textAsBlob('install/workingkey') and column1 IN 
> (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));
>  blobAsText(column1)
> ---------------------
>             bucket/0
>             metadata
> (2 rows)
> ***  As expected, querying for more information doesn't really matter for the 
> working key ***
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
> key=textAsBlob('install/brokenkey') and column1 IN 
> (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));
>  blobAsText(column1)
> ---------------------
>             bucket/0
> (1 rows)
> *** Cassandra stops giving us the metadata column when asking for a few more 
> columns! ***
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
> key=textAsBlob('install/brokenkey') and column1 IN 
> (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));
>  key | column1 | value
> -----+---------+-------
> (0 rows)
> *** Adding the bogus column name even makes it return nothing from this row 
> anymore! ***
> {noformat}
> There are at least two rows that malfunction like this in our table (which is 
> quite old already and has gone through a bunch of Cassandra upgrades). I've 
> upgraded our whole cluster to 2.1.5 (we were on 2.1.2 when I discovered this 
> problem) and compacted, repaired and scrubbed this column family, which 
> hasn't helped.
> Our table structure is:
> {noformat}
> cqlsh:AppBrain> describe table "GroupedSeries";
> CREATE TABLE "AppBrain"."GroupedSeries" (
>     key blob,
>     column1 blob,
>     value blob,
>     PRIMARY KEY (key, column1)
> ) WITH COMPACT STORAGE
>     AND CLUSTERING ORDER BY (column1 ASC)
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = ''
>     AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
>     AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 1.0
>     AND speculative_retry = 'NONE';
> {noformat}
> Let me know if I can give more information that may be helpful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys

Reply via email to