[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-04-01 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956721#comment-13956721
 ] 

Tyler Hobbs commented on CASSANDRA-6825:


[~slebresne] the logic is primarily broken because it continues checking later 
components after it already knows that the first component intersects.  For 
example, suppose you have a slice of {{((1, 1), "")}}, min column names of 
{{(0, 2)}}, and max column names of {{(2, 3)}}.  The first component of the 
slice start falls within the min/max range; the second component does not.  
Although the slice _starts_ outside of the min/max range for the second 
component, it should be considered intersecting, because for higher values of 
the first component we'll accept other values of the second component.  The 
current logic sees that the second component doesn't fall within min/max and 
considers the slice non-intersecting.
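
The example above can be checked with a small sketch.  This is only an 
illustration in Python, not the Cassandra code (which is Java) and not the 
actual patch: the names {{buggy_intersects}} and {{prefix_aware_intersects}} 
are hypothetical, components are plain integers, names are assumed to be 
full-length tuples, and an empty slice end is modelled as {{None}}.

```python
from typing import Optional, Sequence

Name = Sequence[int]


def buggy_intersects(start: Name, min_names: Name, max_names: Name) -> bool:
    """Simplified model of the behaviour described above: every component of
    the slice start must fall inside that component's [min, max] range."""
    return all(lo <= s <= hi for s, lo, hi in zip(start, min_names, max_names))


def prefix_aware_intersects(start: Name, end: Optional[Name],
                            min_names: Name, max_names: Name) -> bool:
    """Conservative test using lexicographic comparison, so later components
    are ignored once an earlier component has already decided the ordering."""
    starts_before_max = tuple(start) <= tuple(max_names)
    ends_after_min = end is None or tuple(end) >= tuple(min_names)
    return starts_before_max and ends_after_min


# Values from the comment: slice ((1, 1), ""), min (0, 2), max (2, 3).
slice_start, slice_end = (1, 1), None
min_names, max_names = (0, 2), (2, 3)

print(buggy_intersects(slice_start, min_names, max_names))
print(prefix_aware_intersects(slice_start, slice_end, min_names, max_names))
```

With these inputs the per-component check rejects the sstable, while the 
lexicographic check keeps it, because {{1 < 2}} on the first component already 
places the slice start before the largest possible name.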

> COUNT(*) with WHERE not finding all the matching rows
> -
>
> Key: CASSANDRA-6825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6825
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: quad core Windows7 x64, single node cluster
> Cassandra 2.0.5
>Reporter: Bill Mitchell
>Assignee: Tyler Hobbs
> Fix For: 2.0.7, 2.1 beta2
>
> Attachments: cassandra.log, selectpartitions.zip, 
> selectrowcounts.txt, testdb_1395372407904.zip, testdb_1395372407904.zip
>
>
> Investigating another problem, I needed to do COUNT(*) on the several 
> partitions of a table immediately after a test case ran, and I discovered 
> that count(*) on the full table and on each of the partitions returned 
> different counts.  
> In one particular case, SELECT COUNT(*) FROM sr LIMIT 100; returned the 
> expected count from the test 9 rows.  The composite primary key splits 
> the logical row into six distinct partitions, and when I issue a query asking 
> for the total across all six partitions, the returned result is only 83999.  
> Drilling down, I find that SELECT * from sr WHERE s = 5 AND l = 11 AND 
> partition = 0; returns 30,000 rows, but a SELECT COUNT(*) with the identical 
> WHERE predicate reports only 14,000. 
> This is failing immediately after running a single small test, such that 
> there are only two SSTables, sr-jb-1 and sr-jb-2.  Compaction never needed to 
> run.  
> In selectrowcounts.txt is a copy of the cqlsh output showing the incorrect 
> count(*) results.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-04-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956669#comment-13956669
 ] 

Sylvain Lebresne commented on CASSANDRA-6825:
-

[~thobbs] Any insights on what is off in the logic exactly?



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-28 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951437#comment-13951437
 ] 

Tyler Hobbs commented on CASSANDRA-6825:


It looks like CASSANDRA-6327 is the cause for this.  The logic for testing 
sstables for inclusion when there's a composite comparator and multiple 
components in the slice filter is off.  This showed up for {{count(\*)}} 
because counting queries are always paged internally; the second page was 
erroneously skipping an sstable.  If the {{select *}} query has the same page 
size (10k), it will also omit results.
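
To illustrate why paging exposes the problem, here is a toy model in Python.  
It is a sketch under several assumptions and not Cassandra's read path: 
sstables are plain lists of two-component names, each page restarts strictly 
after the last name returned, the page size is 2 instead of 10k, and 
{{buggy_include}}, {{lexicographic_include}}, and {{paged_count}} are 
hypothetical names.

```python
from typing import Callable, List, Optional, Tuple

Name = Tuple[int, int]
SSTable = List[Name]

SSTABLE_A: SSTable = [(0, 1), (1, 1)]
SSTABLE_B: SSTable = [(0, 3), (1, 2), (2, 2), (2, 3)]


def bounds(sstable: SSTable) -> Tuple[Name, Name]:
    """Per-component minimum and maximum of the names in one sstable."""
    firsts, seconds = zip(*sstable)
    return (min(firsts), min(seconds)), (max(firsts), max(seconds))


def buggy_include(start: Name, sstable: SSTable) -> bool:
    """Model of the faulty test: every start component must be in range."""
    lo, hi = bounds(sstable)
    return all(l <= s <= h for s, l, h in zip(start, lo, hi))


def lexicographic_include(start: Name, sstable: SSTable) -> bool:
    """Keep the sstable if its largest possible name is not before start."""
    _, hi = bounds(sstable)
    return start <= hi


def paged_count(sstables: List[SSTable], page_size: int,
                include: Callable[[Name, SSTable], bool]) -> int:
    """Count names page by page, re-testing sstable inclusion on each page."""
    total = 0
    last: Optional[Name] = None
    while True:
        consulted = [t for t in sstables if last is None or include(last, t)]
        names = sorted({n for t in consulted for n in t
                        if last is None or n > last})
        page = names[:page_size]
        total += len(page)
        if len(page) < page_size:
            return total
        last = page[-1]


print(paged_count([SSTABLE_A, SSTABLE_B], 2, buggy_include))
print(paged_count([SSTABLE_A, SSTABLE_B], 2, lexicographic_include))
print(paged_count([SSTABLE_A, SSTABLE_B], 100, buggy_include))
```

In this model the faulty test drops the first sstable on the second page and 
the count comes up one short, while a single large page returns every name 
because the first page never applies the test.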



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-28 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951200#comment-13951200
 ] 

Tyler Hobbs commented on CASSANDRA-6825:


[~wtmitchell3] thanks! I can reproduce the issue now, so I should be able to 
track down what's going on.



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-21 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943586#comment-13943586
 ] 

Bill Mitchell commented on CASSANDRA-6825:
--

As it happens, I have that info handy as my JUnit testcase includes it in the 
log4j output:


CREATE TABLE testdb_1395374703023.sr (
    siteid text,
    listid bigint,
    partition int,
    createdate timestamp,
    emailcrypt text,
    emailaddr text,
    properties text,
    removedate timestamp,
    PRIMARY KEY ((siteid, listid, partition), createdate, emailcrypt)
) WITH CLUSTERING ORDER BY (createdate DESC, emailcrypt ASC)
   AND read_repair_chance = 0.1
   AND dclocal_read_repair_chance = 0.0
   AND replicate_on_write = true
   AND gc_grace_seconds = 864000
   AND bloom_filter_fp_chance = 0.01
   AND caching = 'KEYS_ONLY'
   AND comment = ''
   AND compaction = { 'class' : 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
   AND compression = { 'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor' };

(siteID was a BIGINT until recently when the schema was changed to TEXT to 
match the use of siteID elsewhere in the product.  I had not thought to 
represent our Java String as a Cassandra UUID.)



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-21 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943492#comment-13943492
 ] 

Tyler Hobbs commented on CASSANDRA-6825:


[~wtmitchell3] what type is the siteid column supposed to be?  So far I've 
tried varint, uuid, and text and had problems with each.   Just pasting 
"DESCRIBE KEYSPACE testdb_" would also work.



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-21 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942881#comment-13942881
 ] 

Bill Mitchell commented on CASSANDRA-6825:
--

Tyler, you use an interesting word, "flush".  After running a test with a 
different database name, I went back and looked at the first keyspace, as I did 
not drain the node before zipping the file the first time.  A third SSTable had 
now been written.  See the larger .zip file I have attached.  When I try the 
same statements through cqlsh, a SELECT * FROM sr WHERE ... AND partition = 2 
now shows 2 rows, but SELECT COUNT(*) FROM sr WHERE ... AND partition=2 
still returns a count of 1.  So the count is still incorrect.  



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-20 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942757#comment-13942757
 ] 

Bill Mitchell commented on CASSANDRA-6825:
--

I've attached a testdb_1395372407904.zip of the data/testdb_1395372407904 
directory after the test ran.  After the test completed, I did select * from sr 
and it returned 10 rows:

cqlsh:testdb_1395372407904> select count(*) from sr limit 10;

 count

 10

(1 rows)

When I did a select count(*) for each of the six partitions, they total only 
9:
cqlsh:testdb_1395372407904> select count(*) from sr where siteID = '4CA4F79E-3AB2-41C5-AE42-C7009736F1D5' and listID = 24 and partition = 0 LIMIT 10;

 count
---
 2

(1 rows)

cqlsh:testdb_1395372407904> select count(*) from sr where siteID = '4CA4F79E-3AB2-41C5-AE42-C7009736F1D5' and listID = 24 and partition = 1 LIMIT 10;

 count
---
 2

(1 rows)

cqlsh:testdb_1395372407904> select count(*) from sr where siteID = '4CA4F79E-3AB2-41C5-AE42-C7009736F1D5' and listID = 24 and partition = 2 LIMIT 10;

 count
---
 1

(1 rows)

cqlsh:testdb_1395372407904> select count(*) from sr where siteID = '4CA4F79E-3AB2-41C5-AE42-C7009736F1D5' and listID = 24 and partition = 3 LIMIT 10;

 count
---
 1

(1 rows)

cqlsh:testdb_1395372407904> select count(*) from sr where siteID = '4CA4F79E-3AB2-41C5-AE42-C7009736F1D5' and listID = 24 and partition = 4 LIMIT 10;

 count
---
 1

(1 rows)

cqlsh:testdb_1395372407904> select count(*) from sr where siteID = '4CA4F79E-3AB2-41C5-AE42-C7009736F1D5' and listID = 24 and partition = 5 LIMIT 10;

 count
---
 2

(1 rows)

As it turns out, the 1 rows not counted were all from partition=2, and have 
a createDate identical except in the milliseconds to 1 rows that do appear. 
 The common key values of the presumably uncounted rows (as they are the rows 
that did not return on the SELECT query, CASSANDRA-6826) are 
siteID=4CA4F79E-3AB2-41C5-AE42-C7009736F1D5,listID=24,partition=2,createDate=2014-03-20T22:27:26.457-0500.
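
The mismatch in the cqlsh output above can be checked arithmetically.  The 
snippet below is a small Python sketch that only re-adds the counts as they 
appear in this message; the variable names are illustrative.

```python
# Counts copied from the cqlsh output above, keyed by partition number.
per_partition_counts = {0: 2, 1: 2, 2: 1, 3: 1, 4: 1, 5: 2}

# Count reported by the query over the whole table.
full_table_count = 10

partition_total = sum(per_partition_counts.values())
missing = full_table_count - partition_total

print(partition_total, missing)
```

The per-partition counts add up to one fewer than the full-table count, which 
matches the shortfall described for partition=2.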
 




[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-20 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942500#comment-13942500
 ] 

Tyler Hobbs commented on CASSANDRA-6825:


Scratch that, I made a mistake in my test case (facepalm).  After fixing that, 
I'm not able to reproduce.  

[~billmichell]  A zip of your sstables could be useful if you can still provide 
that.



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-20 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942437#comment-13942437
 ] 

Tyler Hobbs commented on CASSANDRA-6825:


The overcounting problem seems to be limited to overwrites that end up in 
different SSTables (or a memtable).   If you write once, flush, and then 
overwrite, your count will be exactly 2x.



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-20 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942232#comment-13942232
 ] 

Bill Mitchell commented on CASSANDRA-6825:
--

I can confirm the problem is still there in 2.0.6.  As I was verifying that I 
could still reproduce CASSANDRA-6826, I checked for the COUNT(*) issue too.  In 
one of the table's six partitions, a COUNT(*) reported 1 rows, but if I did 
a SELECT * in either ascending or descending order, cqlsh printed 2 rows.  
Would it help if I zipped up the data directory containing the table after the 
problem appeared?  Or would you need other information from the system 
directory, too, to see how the data is recorded? That might help in isolating 
how the problem arises.

> COUNT(*) with WHERE not finding all the matching rows
> -
>
> Key: CASSANDRA-6825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6825
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: quad core Windows7 x64, single node cluster
> Cassandra 2.0.5
>Reporter: Bill Mitchell
>Assignee: Russ Hatch
> Attachments: cassandra.log, selectpartitions.zip, selectrowcounts.txt
>


[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-20 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942193#comment-13942193
 ] 

Russ Hatch commented on CASSANDRA-6825:
---

Unfortunately I was not able to reproduce this issue. I tried Cassandra 2.0.5 
on Linux and also on Win7, checking the counts with both the Python driver and 
cqlsh. I was using Java 1.7.0_51.



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-10 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926446#comment-13926446
 ] 

Bill Mitchell commented on CASSANDRA-6825:
--

After shortening the column names, the schema is: CREATE TABLE sr (s bigint, l 
bigint, partition int, cd timestamp, ec text, ea text, properties text, rd 
timestamp, PRIMARY KEY ((s, l, partition), cd, ec)) WITH CLUSTERING ORDER BY 
(cd DESC, ec ASC).



[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-10 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926090#comment-13926090
 ] 

Russ Hatch commented on CASSANDRA-6825:
---

[~wtmitchell3] -- I'm working to reproduce this issue. To get as close to the 
mark as possible, can you provide me with the full schema for the table? 
(Mainly I'm interested in which columns are part of the primary key, aside 
from siteid, listid, and partition.)

> COUNT(*) with WHERE not finding all the matching rows
> -
>
> Key: CASSANDRA-6825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6825
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: quad core Windows7 x64, single node cluster
> Cassandra 2.0.5
>Reporter: Bill Mitchell
>Assignee: Russ Hatch
> Attachments: cassandra.log, selectpartitions.zip, selectrowcounts.txt





[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-08 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924860#comment-13924860
 ] 

Bill Mitchell commented on CASSANDRA-6825:
--

If it helps in reproducing it: unlike my earlier report in CASSANDRA-6736, this 
failure and that of CASSANDRA-6826 appear in a small-volume test of fewer than 
100,000 rows total.  The test ran as a JUnit test during a Maven build of a 
complete product, so the test keyspace and tables were created but row 
insertion did not begin until 9 minutes later.  Cassandra therefore is not 
treating this as high-volume activity, and the row width is not large enough to 
provoke incremental compaction, or in fact any compaction whatsoever.  

> COUNT(*) with WHERE not finding all the matching rows
> -
>
> Key: CASSANDRA-6825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6825
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: quad core Windows7 x64, single node cluster
> Cassandra 2.0.5
>Reporter: Bill Mitchell
>Assignee: Russ Hatch
> Attachments: cassandra.log, selectpartitions.zip, selectrowcounts.txt





[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-07 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924480#comment-13924480
 ] 

Bill Mitchell commented on CASSANDRA-6825:
--

Yes.  I've added that to the environment description.  

> COUNT(*) with WHERE not finding all the matching rows
> -
>
> Key: CASSANDRA-6825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6825
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: quad core Windows7 x64, single node cluster
> Cassandra 2.0.5
>Reporter: Bill Mitchell
> Attachments: cassandra.log, selectpartitions.zip, selectrowcounts.txt





[jira] [Commented] (CASSANDRA-6825) COUNT(*) with WHERE not finding all the matching rows

2014-03-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924454#comment-13924454
 ] 

Jonathan Ellis commented on CASSANDRA-6825:
---

Is this a single-node cluster?

> COUNT(*) with WHERE not finding all the matching rows
> -
>
> Key: CASSANDRA-6825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6825
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: quad core Windows7 x64
> Cassandra 2.0.5
>Reporter: Bill Mitchell
> Attachments: cassandra.log, selectpartitions.zip, selectrowcounts.txt


