[jira] [Commented] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys

2015-06-04 Thread Mathijs Vogelzang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572973#comment-14572973
 ] 

Mathijs Vogelzang commented on CASSANDRA-9540:
--

Another observation: when testing this bug from our Java code with the DataStax 
CQL driver, we notice that everything works correctly while the written data is 
still in the memtable. Once it's flushed to an SSTable on disk, the 
unpredictable behavior starts. Maybe this bug is another argument for 
https://issues.apache.org/jira/browse/CASSANDRA-9161 ?

 Cql query doesn't return right information when using IN on columns for some 
 keys
 -

 Key: CASSANDRA-9540
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9540
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: Cassandra 2.1.5
Reporter: Mathijs Vogelzang
Assignee: Carl Yeksigian
 Fix For: 2.1.x


 We are investigating a weird issue where one of our clients doesn't get data 
 on their dashboard. It seems Cassandra is not returning data for a particular 
 key ("brokenkey" from now on).
 Some background:
 We have a row where we store a "metadata" column and data in columns 
 "bucket/0", "bucket/1", "bucket/2", etc. Depending on the date selection in 
 the UI, we know that we only need to retrieve "bucket/0", or "bucket/0" and 
 "bucket/1", etc. (we always need to retrieve "metadata").
 A typical query may look like this (using SELECT column1 just to show what is 
 returned; normally we would of course do SELECT value):
 {noformat}
 cqlsh:AppBrain> select blobAsText(column1) from GroupedSeries where key=textAsBlob('install/workingkey');

  blobAsText(column1)
 ---------------------
  bucket/0
  metadata

 (2 rows)
 cqlsh:AppBrain> select blobAsText(column1) from GroupedSeries where key=textAsBlob('install/brokenkey');

  blobAsText(column1)
 ---------------------
  bucket/0
  metadata

 (2 rows)
 {noformat}
 These two queries work as expected, and return the information that we 
 actually stored.
 However, when we filter for certain columns, the "brokenkey" row starts 
 behaving very strangely:
 {noformat}
 cqlsh:AppBrain> select blobAsText(column1) from GroupedSeries where key=textAsBlob('install/workingkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));

  blobAsText(column1)
 ---------------------
  bucket/0
  metadata

 (2 rows)
 cqlsh:AppBrain> select blobAsText(column1) from GroupedSeries where key=textAsBlob('install/workingkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));

  blobAsText(column1)
 ---------------------
  bucket/0
  metadata

 (2 rows)
 *** As expected, querying for more information doesn't really matter for the working key ***

 cqlsh:AppBrain> select blobAsText(column1) from GroupedSeries where key=textAsBlob('install/brokenkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));

  blobAsText(column1)
 ---------------------
  bucket/0

 (1 rows)
 *** Cassandra stops giving us the metadata column when asking for a few more columns! ***

 cqlsh:AppBrain> select blobAsText(column1) from GroupedSeries where key=textAsBlob('install/brokenkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));

  key | column1 | value
 -----+---------+-------

 (0 rows)
 *** Adding the bogus column name even makes it return nothing from this row anymore! ***
 {noformat}
 There are at least two rows that malfunction like this in our table (which is 
 quite old already and has gone through a bunch of Cassandra upgrades). I've 
 upgraded our whole cluster to 2.1.5 (we were on 2.1.2 when I discovered this 
 problem) and compacted, repaired and scrubbed this column family, which 
 hasn't helped.
 Our table structure is:
 {noformat}
 cqlsh:AppBrain> describe table GroupedSeries;
 CREATE TABLE AppBrain.GroupedSeries (
 key blob,
 column1 blob,
 value blob,
 PRIMARY KEY (key, column1)
 ) WITH COMPACT STORAGE
 AND CLUSTERING ORDER BY (column1 ASC)
 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
 AND comment = ''
 AND compaction = {'min_threshold': '4', 'class': 
 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
 'max_threshold': '32'}
 AND compression = {'sstable_compression': 
 'org.apache.cassandra.io.compress.LZ4Compressor'}
 AND dclocal_read_repair_chance = 0.1
 AND default_time_to_live = 0
 AND gc_grace_seconds = 864000
 AND max_index_interval = 2048
 AND memtable_flush_period_in_ms = 0
 AND min_index_interval = 128
 AND read_repair_chance = 1.0
 AND speculative_retry = 'NONE';
 {noformat}
 Let me know if I can give more information that may be helpful.

[jira] [Commented] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys

2015-06-04 Thread Mathijs Vogelzang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573093#comment-14573093
 ] 

Mathijs Vogelzang commented on CASSANDRA-9540:
--

And one final observation from our testing: it seems the bug happens as soon as 
a cell value exceeds 64 kB in size. (My earlier comment was wrong: there were 
144,000 2-byte hex chars in the json file, so the cell size is roughly 70 kB 
rather than 144 kB.)
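As a quick sanity check on that arithmetic (plain Python, nothing Cassandra-specific): 144,000 hex characters encode half as many bytes, which lands just above the 64 KiB mark:

```python
# Sanity check of the cell-size arithmetic from the comment above.
hex_chars = 144_000
cell_bytes = hex_chars // 2      # two hex characters encode one byte
threshold = 64 * 1024            # 64 KiB = 65,536 bytes

print(cell_bytes)                # 72000, i.e. roughly 70 kB
print(cell_bytes > threshold)    # True: just over the 64 KiB boundary
```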

This bug is very difficult for us to work around (we're using the IN query as a 
replacement for the former Thrift slice query to get specific column values), 
as we currently cannot trust Cassandra to return values that we wrote into it 
earlier, depending on whether the columns we query for are present or not. If 
there's any workaround that lets us query a set of columns using CQL, please 
let me know.
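One workaround sketch (untested against this bug; the helper below is hypothetical, not part of any driver API): issue a separate single-column equality query per wanted column instead of one IN query, and merge the results client-side. The snippet only builds the CQL strings:

```python
# Hypothetical helper: build one equality SELECT per column instead of
# a single IN query, so one oversized or missing column can't affect
# the results for the others. Merging the rows is left to the caller.
def build_column_queries(table, key, columns):
    template = ("SELECT value FROM {t} WHERE key = textAsBlob('{k}') "
                "AND column1 = textAsBlob('{c}');")
    return [template.format(t=table, k=key, c=col) for col in columns]

queries = build_column_queries("GroupedSeries", "install/brokenkey",
                               ["metadata", "bucket/0", "bucket/1"])
for q in queries:
    print(q)
```

Each statement would then be executed independently (e.g. asynchronously through the driver), at the cost of more round trips than the single IN query.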


[jira] [Updated] (CASSANDRA-9540) Cql IN query wrong on rows with values bigger than 64kb

2015-06-04 Thread Mathijs Vogelzang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mathijs Vogelzang updated CASSANDRA-9540:
-
Summary: Cql IN query wrong on rows with values bigger than 64kb  (was: Cql 
query doesn't return right information when using IN on columns for some keys)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys

2015-06-04 Thread Mathijs Vogelzang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572659#comment-14572659
 ] 

Mathijs Vogelzang commented on CASSANDRA-9540:
--

I was able to make this case reproducible by getting the key from our 
production database and saving it with sstable2json.
It seems that the broken keys have one somewhat bigger cell (144 kB in this 
case).

A testcase json file is available at 
https://www.dropbox.com/s/kzu2jmqmwz788k8/testcase.json?dl=0 (the SSTable that 
it creates is available at 
https://www.dropbox.com/s/ilwpjka5r70n2os/test-testbug-ka-2-Data.db?dl=0 )

The commands that I executed in cqlsh are:
{noformat}
create keyspace test with 
replication={'class':'SimpleStrategy','replication_factor':1};
use test;
CREATE TABLE test.testbug ( key blob, column1 blob, value blob, 
PRIMARY KEY (key, column1) ) WITH COMPACT STORAGE AND CLUSTERING ORDER BY 
(column1 ASC) AND bloom_filter_fp_chance = 0.01 AND caching = 
'{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = '' AND 
compaction = {'min_threshold': '4', 'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32'} AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'} AND 
dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND 
gc_grace_seconds = 864000 AND max_index_interval = 2048 AND 
memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND 
read_repair_chance = 1.0 AND speculative_retry = 'NONE';
{noformat}
Then I injected the JSON as follows on the command line:
{noformat}
json2sstable -K test -c testbug testcase.json 
DATADIR/test/testbug/test-testbug-ka-1-Data.db
nodetool refresh test testbug
{noformat}

The cqlsh then behaves as reported earlier: querying for row 'test' works 
as expected, but for row 'broken' we don't get any data when we ask for 
columns that don't exist in that row (depending on the exact set of requested 
columns, either 'metadata' is dropped or no data is returned at all):
{noformat}
cqlsh:test> select blobAsText(key),blobAsText(column1) from testbug where key=textAsBlob('test') and column1 in (textAsBlob('metadata'),textAsBlob('bucket/0'));

 blobAsText(key) | blobAsText(column1)
-+-
test |bucket/0
test |metadata

(2 rows)
cqlsh:test> select blobAsText(key),blobAsText(column1) from testbug where key=textAsBlob('broken') and column1 in (textAsBlob('metadata'),textAsBlob('bucket/0'));

 blobAsText(key) | blobAsText(column1)
-+-
  broken |bucket/0
  broken |metadata

(2 rows)
cqlsh:test> select blobAsText(key),blobAsText(column1) from testbug where key=textAsBlob('test') and column1 in (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdf'));

 blobAsText(key) | blobAsText(column1)
-+-
test |bucket/0
test |metadata

(2 rows)
cqlsh:test> select blobAsText(key),blobAsText(column1) from testbug where key=textAsBlob('broken') and column1 in (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdf'));

 key | column1 | value
-+-+---

(0 rows)
cqlsh:test> select blobAsText(key),blobAsText(column1) from testbug where key=textAsBlob('broken') and column1 in (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'));

 blobAsText(key) | blobAsText(column1)
-+-
  broken |bucket/0

(1 rows)
{noformat}


[jira] [Created] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys

2015-06-03 Thread Mathijs Vogelzang (JIRA)
Mathijs Vogelzang created CASSANDRA-9540:


 Summary: Cql query doesn't return right information when using IN 
on columns for some keys
 Key: CASSANDRA-9540
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9540
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: Cassandra 2.1.5
Reporter: Mathijs Vogelzang




[jira] [Updated] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys

2015-06-03 Thread Mathijs Vogelzang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mathijs Vogelzang updated CASSANDRA-9540:
-
[jira] [Commented] (CASSANDRA-8786) NullPointerException in ColumnDefinition.hasIndexOption

2015-02-12 Thread Mathijs Vogelzang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318416#comment-14318416
 ] 

Mathijs Vogelzang commented on CASSANDRA-8786:
--

The tables were probably originally created on version 1.x.
We've only tried with cqlsh.
I've just done a DESCRIBE in cassandra-cli on our production table and on the 
new one, and the most suspicious difference is that on the production 
(non-working) table, one column is listed like this:
{noformat}
(...)
  Column Name: account_id
  Validation Class: org.apache.cassandra.db.marshal.LongType
  Index Name: idx_accid
  Index Type: KEYS
  Column Name:  next_column_name
(...){noformat}

and on the newly created development table (the one without problems) it is
{noformat}
(...)
 Column Name: account_id
  Validation Class: org.apache.cassandra.db.marshal.LongType
  Index Name: idx_accid
  Index Type: KEYS
  Index Options: {}
  Column Name:  next_column_name
(...){noformat}

(Other minor differences are speculative retry NONE vs. 99.0PERCENTILE, read 
repair chance 0 vs. 1, and bloom filter FP chance default vs. 0.01, but the 
missing Index Options entry seems closest to where the problem lies to me.)
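A toy analogue of what the missing Index Options entry suggests (plain Python, not Cassandra's actual code): an older Thrift-created column definition lacks the options map entirely, and a lookup that assumes the map exists would fail, while a defensive lookup does not:

```python
# Toy model: a Thrift-era column definition without an index-options map
# versus a freshly created one that carries an empty map.
thrift_column = {"index_name": "idx_accid", "index_type": "KEYS"}
cql_column = {"index_name": "idx_accid", "index_type": "KEYS",
              "index_options": {}}

def has_index_option(column, option_name):
    # Defensive version: a missing options map means "no options set",
    # rather than a crash on the absent entry.
    options = column.get("index_options") or {}
    return option_name in options

print(has_index_option(thrift_column, "class_name"))  # False, no crash
print(has_index_option(cql_column, "class_name"))     # False
```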

 NullPointerException in ColumnDefinition.hasIndexOption
 ---

 Key: CASSANDRA-8786
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8786
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.1.2
Reporter: Mathijs Vogelzang
 Fix For: 2.1.4


 We have a Cassandra cluster that we've been using through many upgrades, and 
 thus most of our column families have originally been created by Thrift. We 
 are on Cassandra 2.1.2 now.
 We've now ported most of our code to use CQL, and our code occasionally tries 
 to recreate tables with IF NOT EXISTS to work properly on development / 
 testing environments.
 When we issue the CQL statement CREATE INDEX IF NOT EXISTS index ON 
 tableName (accountId) (this index does exist on that table already), we 
 get a {{DriverInternalError: An unexpected error occurred server side on 
 cass_host/xx.xxx.xxx.xxx:9042: java.lang.NullPointerException}}
 The error on the server is:
 {noformat}
  java.lang.NullPointerException: null
 at 
 org.apache.cassandra.config.ColumnDefinition.hasIndexOption(ColumnDefinition.java:489)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.cql3.statements.CreateIndexStatement.validate(CreateIndexStatement.java:87)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:224)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 {noformat}
 This happens every time we run this CQL statement. We've tried to reproduce 
 it in a test Cassandra cluster by creating the table according to the exact 
 DESCRIBE TABLE specification, but then this NullPointerException doesn't 
 happen upon the CREATE INDEX one. So it seems that the tables on our 
 production cluster (that were originally created through Thrift) are still 
 subtly different schema-wise than a freshly created table from the same 
 creation statement.





[jira] [Created] (CASSANDRA-8786) NullPointerException in ColumnDefinition.hasIndexOption

2015-02-11 Thread Mathijs Vogelzang (JIRA)
Mathijs Vogelzang created CASSANDRA-8786:


 Summary: NullPointerException in ColumnDefinition.hasIndexOption
 Key: CASSANDRA-8786
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8786
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.1.2
Reporter: Mathijs Vogelzang


We have a Cassandra cluster that we've been using through many upgrades, so 
most of our column families were originally created via Thrift. We are on 
Cassandra 2.1.2 now.
We've now ported most of our code to use CQL, and our code occasionally tries 
to recreate tables with IF NOT EXISTS so that it works properly on development / 
testing environments.
When we issue the CQL statement "CREATE INDEX IF NOT EXISTS index ON 
tableName (accountId)" (this index already exists on that table), we get a 
DriverInternalError: "An unexpected error occurred server side on 
cass_host/xx.xxx.xxx.xxx:9042: java.lang.NullPointerException".

The error on the server is java.lang.NullPointerException: null
at org.apache.cassandra.config.ColumnDefinition.hasIndexOption(ColumnDefinition.java:489) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.cql3.statements.CreateIndexStatement.validate(CreateIndexStatement.java:87) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:224) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119) ~[apache-cassandra-2.1.2.jar:2.1.2]

This happens every time we run this CQL statement. We've tried to reproduce it 
in a test Cassandra cluster by creating the table according to the exact 
DESCRIBE TABLE specification, but there the NullPointerException doesn't 
happen upon the CREATE INDEX statement. So it seems that the tables on our 
production cluster (that were originally created through Thrift) are still 
subtly different schema-wise from a freshly created table made with the same 
creation statement.
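To illustrate the failure mode, here is a purely hypothetical Python sketch (not Cassandra's actual code; all names are ours): if a legacy, Thrift-created column definition carries no index-options map at all, an option lookup that assumes the map is always present blows up, while a guarded lookup does not:

```python
# Hypothetical sketch of an index-option lookup. "opts" stands in for a
# column definition's index options: None for a legacy Thrift-created index,
# a (possibly empty) dict for a freshly CQL-created one.

def has_index_option_naive(opts, name):
    # Assumes an options map is always present -- raises on None.
    return name in opts.keys()

def has_index_option_guarded(opts, name):
    # Tolerates a missing options map.
    return opts is not None and name in opts

thrift_created = None                # legacy table: no options map stored
cql_created = {"class_name": "x"}    # fresh table: options map present

try:
    has_index_option_naive(thrift_created, "class_name")
except AttributeError:
    print("naive lookup raises on the Thrift-created definition")

print(has_index_option_guarded(thrift_created, "class_name"))  # False
print(has_index_option_guarded(cql_created, "class_name"))     # True
```

This matches the observation above that only tables created through Thrift trigger the error, while a freshly created table with the same schema does not.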







[jira] [Created] (CASSANDRA-6478) Importing sstables through sstableloader tombstoned data

2013-12-12 Thread Mathijs Vogelzang (JIRA)
Mathijs Vogelzang created CASSANDRA-6478:


 Summary: Importing sstables through sstableloader tombstoned data
 Key: CASSANDRA-6478
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6478
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Cassandra 2.0.3
Reporter: Mathijs Vogelzang


We've tried to import sstables from a snapshot of a 1.2.10 cluster into a 
running 2.0.3 cluster. When using sstableloader, for some reason we couldn't 
retrieve some of the data. When investigating further, it turned out that 
tombstones in the far future were created for some rows. (sstable2json returned 
the correct data, but with added metadata 
{"deletionInfo":{"markedForDeleteAt":1796952039620607,"localDeletionTime":0}} 
on the rows that seemed missing.)
This happened again exactly the same way when we cleared the new cluster and 
ran sstableloader again.
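For context: markedForDeleteAt is a timestamp in microseconds since the Unix epoch. A quick Python sketch (the variable names are ours, not Cassandra's) shows just how far in the future that tombstone lies:

```python
from datetime import datetime, timezone

# The markedForDeleteAt value from the sstable2json output above,
# interpreted as microseconds since the Unix epoch.
marked_for_delete_at_us = 1796952039620607

# Convert to a readable UTC date.
deletion_date = datetime.fromtimestamp(marked_for_delete_at_us / 1_000_000,
                                       tz=timezone.utc)
print(deletion_date.date())  # a date in December 2026 -- "the far future" for a 2013 cluster
```

So every affected row was marked as deleted with a timestamp roughly 13 years ahead of the cluster's clock, which is why the data became invisible to reads.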

The sstables themselves seemed fine: they were working on the old cluster, and 
upgradesstables reported there was nothing to upgrade. We were finally able to 
move our data correctly by copying the SSTables with scp into the right 
directory on the hosts of the new cluster (but naturally this required much 
more disk space than having sstableloader send only the relevant parts).






[jira] [Updated] (CASSANDRA-6478) Importing sstables through sstableloader tombstoned data

2013-12-12 Thread Mathijs Vogelzang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mathijs Vogelzang updated CASSANDRA-6478:
-

Since Version: 2.0.3
Fix Version/s: 2.0.3

 Importing sstables through sstableloader tombstoned data
 

 Key: CASSANDRA-6478
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6478
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Cassandra 2.0.3
Reporter: Mathijs Vogelzang
 Fix For: 2.0.3


 We've tried to import sstables from a snapshot of a 1.2.10 cluster into a 
 running 2.0.3 cluster. When using sstableloader, for some reason we couldn't 
 retrieve some of the data. When investigating further, it turned out that 
 tombstones in the far future were created for some rows. (sstable2json 
 returned the correct data, but with added metadata 
 {"deletionInfo":{"markedForDeleteAt":1796952039620607,"localDeletionTime":0}} 
 on the rows that seemed missing.)
 This happened again exactly the same way when we cleared the new cluster and 
 ran sstableloader again.
 The sstables themselves seemed fine: they were working on the old cluster, and 
 upgradesstables reported there was nothing to upgrade. We were finally able to 
 move our data correctly by copying the SSTables with scp into the right 
 directory on the hosts of the new cluster (but naturally this required much 
 more disk space than having sstableloader send only the relevant parts).





[jira] [Commented] (CASSANDRA-5381) java.io.EOFException exception while executing nodetool repair with compression enabled

2013-03-27 Thread Mathijs Vogelzang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13615171#comment-13615171
 ] 

Mathijs Vogelzang commented on CASSANDRA-5381:
--

We have the same issue, where all streaming between nodes fails with an 
EOFException and then too many retries. This started when we upgraded from 
1.1.7 to 1.2.2, and didn't go away on the subsequent upgrade to 1.2.3.

We tried running with and without internode compression and encryption, and 
found that when encryption is off, everything works fine (even WITH compression 
on). With encryption on it doesn't work, even with internode compression turned 
off, so for us it definitely has something to do with streaming while internode 
encryption is enabled.

 java.io.EOFException exception while executing nodetool repair with 
 compression enabled
 ---

 Key: CASSANDRA-5381
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5381
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.3
 Environment: Linux Virtual Machines, Red Hat Enterprise release 6.4, 
 kernel version  2.6.32-358.2.1.el6.x86_64. Each VM has 8GB memory and 4vCPUS.
Reporter: Neil Thomson
Priority: Minor

 Very similar to the issue reported in CASSANDRA-5105. I have 3 nodes 
 configured in a cluster, with compression enabled. When attempting a nodetool 
 repair on one node, I get exceptions on the other nodes in the cluster.
 Disabling compression on the column family allows nodetool repair to run 
 without error.
 Exception:
 INFO [Streaming to /3.69.211.179:2] 2013-03-25 12:30:27,874 StreamReplyVerbHandler.java (line 50) Need to re-stream file /var/lib/cassandra/data/rt/values/rt-values-ib-1-Data.db to /3.69.211.179
 INFO [Streaming to /3.69.211.179:2] 2013-03-25 12:30:27,991 StreamReplyVerbHandler.java (line 50) Need to re-stream file /var/lib/cassandra/data/rt/values/rt-values-ib-1-Data.db to /3.69.211.179
 ERROR [Streaming to /3.69.211.179:2] 2013-03-25 12:30:28,113 CassandraDaemon.java (line 164) Exception in thread Thread[Streaming to /3.69.211.179:2,5,main]
 java.lang.RuntimeException: java.io.EOFException
 at com.google.common.base.Throwables.propagate(Throwables.java:160)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
 Caused by: java.io.EOFException
 at java.io.DataInputStream.readInt(Unknown Source)
 at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:193)
 at org.apache.cassandra.streaming.compress.CompressedFileStreamTask.stream(CompressedFileStreamTask.java:114)
 at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 ... 3 more
 Keyspace configuration is as follows:
 Keyspace: rt:
   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
   Durable Writes: true
     Options: [replication_factor:3]
   Column Families:
     ColumnFamily: tagname
       Key Validation Class: org.apache.cassandra.db.marshal.BytesType
       Default column value validator: org.apache.cassandra.db.marshal.BytesType
       Columns sorted by: org.apache.cassandra.db.marshal.BytesType
       GC grace seconds: 864000
       Compaction min/max thresholds: 4/32
       Read repair chance: 0.1
       DC Local Read repair chance: 0.0
       Populate IO Cache on flush: false
       Replicate on write: true
       Caching: KEYS_ONLY
       Bloom Filter FP chance: default
       Built indexes: []
       Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
     ColumnFamily: values
       Key Validation Class: org.apache.cassandra.db.marshal.BytesType
       Default column value validator: org.apache.cassandra.db.marshal.BytesType
       Columns sorted by: org.apache.cassandra.db.marshal.BytesType
       GC grace seconds: 864000
       Compaction min/max thresholds: 4/32
       Read repair chance: 0.1
       DC Local Read repair chance: 0.0
       Populate IO Cache on flush: false
       Replicate on write: true
       Caching: KEYS_ONLY
       Bloom Filter FP chance: default
       Built indexes: []
       Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
