[jira] [Created] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys

Mathijs Vogelzang (JIRA) Wed, 03 Jun 2015 08:16:13 -0700

Mathijs Vogelzang created CASSANDRA-9540:
--------------------------------------------


             Summary: Cql query doesn't return right information when using IN on columns for some keys
                 Key: CASSANDRA-9540
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9540
             Project: Cassandra
          Issue Type: Bug
          Components: API
         Environment: Cassandra 2.1.5
            Reporter: Mathijs Vogelzang


We are investigating a weird issue where one of our clients doesn't get data on 
his dashboard. It seems Cassandra is not returning data for a particular key 
("brokenkey" from now on).

Some background:
We have a row where we store a "metadata" column and data in columns 
"bucket/0", "bucket/1", "bucket/2", etc. Depending on the date selection in the 
UI, we know that we only need to retrieve bucket/0, or bucket/0 and bucket/1, 
etc. (we always need to retrieve "metadata").
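
For illustration only, the column selection described above can be sketched in Python. The function name and the idea of addressing buckets by an inclusive index range are assumptions made for this sketch; the actual mapping from a date selection to bucket numbers is not shown in this report.

```python
from typing import List


def columns_for_buckets(first_bucket: int, last_bucket: int) -> List[str]:
    """Return the column names to request for an inclusive bucket range.

    Hypothetical helper: "metadata" is always included, followed by one
    "bucket/<n>" column per bucket in the range.
    """
    if first_bucket < 0 or last_bucket < first_bucket:
        raise ValueError("expected 0 <= first_bucket <= last_bucket")
    buckets = [f"bucket/{i}" for i in range(first_bucket, last_bucket + 1)]
    return ["metadata"] + buckets


# Example: a selection that needs the first three buckets.
print(columns_for_buckets(0, 2))
```

The resulting names are the ones that end up inside the IN (...) list of the queries in this report.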

A typical query may look like this (using SELECT column1 just to show which 
columns are returned; normally we would of course do SELECT value):
{{noformat}}
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
key=textAsBlob('install/workingkey');

 blobAsText(column1)
---------------------
            bucket/0
            metadata

(2 rows)
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
key=textAsBlob('install/brokenkey');

 blobAsText(column1)
---------------------
            bucket/0
            metadata

(2 rows)
{{/noformat}}
These two queries work as expected, and return the information that we actually 
stored.
However, when we "filter" for certain columns, the brokenkey starts behaving 
very strangely:

{{noformat}}
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
key=textAsBlob('install/workingkey') and column1 IN 
(textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));

 blobAsText(column1)
---------------------
            bucket/0
            metadata

(2 rows)
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
key=textAsBlob('install/workingkey') and column1 IN 
(textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));

 blobAsText(column1)
---------------------
            bucket/0
            metadata

(2 rows)
***  As expected, querying for more information doesn't really matter for the 
working key ***

cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
key=textAsBlob('install/brokenkey') and column1 IN 
(textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));

 blobAsText(column1)
---------------------
            bucket/0

(1 rows)
*** Cassandra stops giving us the metadata column when asking for a few more 
columns! ***
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where 
key=textAsBlob('install/brokenkey') and column1 IN 
(textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));

 key | column1 | value
-----+---------+-------

(0 rows)
*** Adding the bogus column name even makes it return nothing from this row 
anymore! ***
{{/noformat}}
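
To make the expectation explicit, here is a minimal Python model of what an IN restriction on column1 should return for a single key. It is only a sketch of the expected semantics, not Cassandra's implementation; the function name is hypothetical, and the byte-wise ascending ordering is an assumption based on CLUSTERING ORDER BY (column1 ASC) and the order seen in the output above.

```python
def expected_in_result(stored, requested):
    """Model the expected result of `column1 IN (...)` for one key.

    Assumption: only columns that are both stored and requested come back,
    sorted by their raw bytes in ascending order.
    """
    stored_blobs = {name.encode("utf-8") for name in stored}
    requested_blobs = {name.encode("utf-8") for name in requested}
    matches = sorted(stored_blobs & requested_blobs)
    return [blob.decode("utf-8") for blob in matches]


# What the unfiltered SELECT returns for both workingkey and brokenkey.
stored = ["bucket/0", "metadata"]
narrow = ["metadata", "bucket/0", "bucket/1", "bucket/2"]
wide = narrow + ["asdfasdfasdf"]

print(expected_in_result(stored, narrow))
print(expected_in_result(stored, wide))
```

Under this model both calls yield bucket/0 and metadata, which is what workingkey returns. For brokenkey the same IN lists return only bucket/0, and then no rows at all.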

There are at least two rows that malfunction like this in our table (which is 
quite old already and has gone through a bunch of Cassandra upgrades). I've 
upgraded our whole cluster to 2.1.5 (we were on 2.1.2 when I discovered this 
problem) and compacted, repaired and scrubbed this column family, but none of 
that has helped.

Our table structure is:
{{noformat}}
cqlsh:AppBrain> describe table "GroupedSeries";

CREATE TABLE "AppBrain"."GroupedSeries" (
    key blob,
    column1 blob,
    value blob,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC)
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 1.0
    AND speculative_retry = 'NONE';
{{/noformat}}

Let me know if I can give more information that may be helpful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
