[ https://issues.apache.org/jira/browse/CASSANDRA-10583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971593#comment-14971593 ]
Kai Wang edited comment on CASSANDRA-10583 at 10/23/15 7:36 PM:
----------------------------------------------------------------

This seems to be related to bulk loading. To reproduce:

1. Clone https://github.com/depend/issues/tree/master/CASSANDRA-10583. Build and run it; the application generates an sstable with 10 rows.

2. Load it into C* with sstableloader.

3.
{noformat}
cqlsh:timeseries_test> select * from double_daily;

 tag  | group | timestamp                | value
------+-------+--------------------------+-------
 TEST |     1 | 2002-05-01 04:00:00+0000 |     0
 TEST |     1 | 2002-05-02 04:00:00+0000 |     1
 TEST |     1 | 2002-05-03 04:00:00+0000 |     2
 TEST |     1 | 2002-05-04 04:00:00+0000 |     3
 TEST |     1 | 2002-05-05 04:00:00+0000 |     4
 TEST |     1 | 2002-05-06 04:00:00+0000 |     5
 TEST |     1 | 2002-05-07 04:00:00+0000 |     6
 TEST |     1 | 2002-05-08 04:00:00+0000 |     7
 TEST |     1 | 2002-05-09 04:00:00+0000 |     8
 TEST |     1 | 2002-05-10 04:00:00+0000 |     9

(10 rows)
{noformat}

4.
{noformat}
cqlsh:timeseries_test> select * from double_daily where tag='TEST' and group = 1 and timestamp > '2002-05-01 00:00:00-0400';

 tag  | group | timestamp                | value
------+-------+--------------------------+-------
 TEST |     1 | 2002-05-01 04:00:00+0000 |     0
 TEST |     1 | 2002-05-02 04:00:00+0000 |     1
 TEST |     1 | 2002-05-03 04:00:00+0000 |     2
 TEST |     1 | 2002-05-04 04:00:00+0000 |     3
 TEST |     1 | 2002-05-05 04:00:00+0000 |     4
 TEST |     1 | 2002-05-06 04:00:00+0000 |     5
 TEST |     1 | 2002-05-07 04:00:00+0000 |     6
 TEST |     1 | 2002-05-08 04:00:00+0000 |     7
 TEST |     1 | 2002-05-09 04:00:00+0000 |     8
 TEST |     1 | 2002-05-10 04:00:00+0000 |     9

(10 rows)
{noformat}

5.
{noformat}
cqlsh:timeseries_test> select * from double_daily where tag='TEST' and group = 1 and timestamp > '2002-05-02 00:00:00-0400';

 tag | group | timestamp | value
-----+-------+-----------+-------

(0 rows)
{noformat}

I wasn't able to find the "equal" condition that returns everything. But query #5 still shows that nothing is later than 2002/5/2, which is not true.
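
As a sanity check on what queries 4 and 5 ought to return, the comparison can be redone outside Cassandra. The following is only a minimal Python sketch, not part of the reproduction project: it assumes the stored timestamps are exactly the values cqlsh prints (no hidden sub-second component), and the helper names `parse_bound` and `rows_after` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# The ten rows as cqlsh displays them: 2002-05-01 .. 2002-05-10 at 04:00:00 UTC.
# Assumption: cqlsh is not hiding milliseconds in the stored values.
rows = [
    datetime(2002, 5, 1, 4, tzinfo=timezone.utc) + timedelta(days=i)
    for i in range(10)
]

def parse_bound(text):
    """Parse a literal like '2002-05-02 00:00:00-0400' into an aware datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S%z")

def rows_after(text):
    """Return the rows strictly later than the given bound (the `>` in the query)."""
    bound = parse_bound(text)
    return [r for r in rows if r > bound]

expected_q4 = rows_after("2002-05-01 00:00:00-0400")
expected_q5 = rows_after("2002-05-02 00:00:00-0400")

print(len(expected_q4), len(expected_q5))  # prints "9 8"
```

Under that assumption, `2002-05-02 00:00:00-0400` is the same instant as `2002-05-02 04:00:00+0000`, so a strict `>` should still match the eight rows from 2002-05-03 onward rather than zero.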

> After bulk loading CQL query on timestamp column returns wrong result
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-10583
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10583
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Datastax Community Edition 2.1.10, Windows 2008 R2, Java x64 1.8.0_60
>            Reporter: Kai Wang
>             Fix For: 3.x, 2.1.x, 2.2.x
>
>
> I have this table:
> {noformat}
> CREATE TABLE test (
>     tag text,
>     group int,
>     timestamp timestamp,
>     value double,
>     PRIMARY KEY (tag, group, timestamp)
> ) WITH CLUSTERING ORDER BY (group ASC, timestamp DESC)
> {noformat}
> First I used CQLSSTableWriter to bulk load a bunch of sstables. Then I ran
> this query:
> {noformat}
> cqlsh> select * from test where tag = 'MSFT' and group = 1 and timestamp ='2004-12-15 16:00:00-0500';
>  tag  | group | timestamp                | value
> ------+-------+--------------------------+-------
>  MSFT |     1 | 2004-12-15 21:00:00+0000 | 27.11
>  MSFT |     1 | 2004-12-16 21:00:00+0000 | 27.16
>  MSFT |     1 | 2004-12-17 21:00:00+0000 | 26.96
>  MSFT |     1 | 2004-12-20 21:00:00+0000 | 26.95
>  MSFT |     1 | 2004-12-21 21:00:00+0000 | 27.07
>  MSFT |     1 | 2004-12-22 21:00:00+0000 | 26.98
>  MSFT |     1 | 2004-12-23 21:00:00+0000 | 27.01
>  MSFT |     1 | 2004-12-27 21:00:00+0000 | 26.85
>  MSFT |     1 | 2004-12-28 21:00:00+0000 | 26.95
>  MSFT |     1 | 2004-12-29 21:00:00+0000 |  26.9
>  MSFT |     1 | 2004-12-30 21:00:00+0000 | 26.76
> (11 rows)
> {noformat}
> The result is obviously wrong.
> If I run this query:
> {noformat}
> cqlsh> select * from test where tag = 'MSFT' and group = 1 and timestamp ='2004-12-16 16:00:00-0500';
>  tag | group | timestamp | value
> -----+-------+-----------+-------
> (0 rows)
> {noformat}
> In DevCenter I tried to create a similar table and insert a few rows but
> couldn't reproduce this. This may have something to do with the bulk loading
> process. But still, the fact cqlsh returns data that doesn't match the query
> is concerning.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)