[ 
https://issues.apache.org/jira/browse/CASSANDRA-10583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971593#comment-14971593
 ] 

Kai Wang edited comment on CASSANDRA-10583 at 10/23/15 7:35 PM:
----------------------------------------------------------------

This seems to be related to bulk loading. 

To reproduce:

1. Clone https://github.com/depend/issues/tree/master/CASSANDRA-10583. Build 
and run it; the application generates an sstable with 10 rows.
2. Load it into C* with sstableloader.
3. Select all rows:
{noformat} 
cqlsh:timeseries_test> select * from double_daily;

 tag  | group | timestamp                | value
------+-------+--------------------------+-------
 TEST |     1 | 2002-05-01 04:00:00+0000 |     0
 TEST |     1 | 2002-05-02 04:00:00+0000 |     1
 TEST |     1 | 2002-05-03 04:00:00+0000 |     2
 TEST |     1 | 2002-05-04 04:00:00+0000 |     3
 TEST |     1 | 2002-05-05 04:00:00+0000 |     4
 TEST |     1 | 2002-05-06 04:00:00+0000 |     5
 TEST |     1 | 2002-05-07 04:00:00+0000 |     6
 TEST |     1 | 2002-05-08 04:00:00+0000 |     7
 TEST |     1 | 2002-05-09 04:00:00+0000 |     8
 TEST |     1 | 2002-05-10 04:00:00+0000 |     9

(10 rows)
{noformat} 

4. Select rows later than 2002-05-01:
{noformat} 
cqlsh:timeseries_test> select * from double_daily where tag='TEST' and group = 1 and timestamp > '2002-05-01 00:00:00-0400';

 tag  | group | timestamp                | value
------+-------+--------------------------+-------
 TEST |     1 | 2002-05-01 04:00:00+0000 |     0
 TEST |     1 | 2002-05-02 04:00:00+0000 |     1
 TEST |     1 | 2002-05-03 04:00:00+0000 |     2
 TEST |     1 | 2002-05-04 04:00:00+0000 |     3
 TEST |     1 | 2002-05-05 04:00:00+0000 |     4
 TEST |     1 | 2002-05-06 04:00:00+0000 |     5
 TEST |     1 | 2002-05-07 04:00:00+0000 |     6
 TEST |     1 | 2002-05-08 04:00:00+0000 |     7
 TEST |     1 | 2002-05-09 04:00:00+0000 |     8
 TEST |     1 | 2002-05-10 04:00:00+0000 |     9

(10 rows)
{noformat} 

5. Select rows later than 2002-05-02:
{noformat} 
cqlsh:timeseries_test> select * from double_daily where tag='TEST' and group = 1 and timestamp > '2002-05-02 00:00:00-0400';

 tag | group | timestamp | value
-----+-------+-----------+-------

(0 rows)
{noformat} 

I wasn't able to find an "equal" condition that returns everything, as in the 
original report. But query #5 still claims that no row is later than 
2002-05-02, which is not true: the table holds rows through 2002-05-10.
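
As a cross-check of what queries #4 and #5 should return, here is a minimal Python sketch that applies the same strict "later than" comparison to the ten timestamps listed in step 3. It assumes the stored timestamps are exactly what cqlsh displays (no hidden sub-second component); the names rows, parse_bound and later_than are illustrative only and are not part of the reproduction project.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# The ten timestamps shown by the unrestricted SELECT in step 3:
# 2002-05-01 .. 2002-05-10, each at 04:00:00 UTC, with values 0..9.
# Assumption: the stored values match the displayed ones exactly.
rows = [(datetime(2002, 5, 1, 4, tzinfo=UTC) + timedelta(days=i), i)
        for i in range(10)]


def parse_bound(text):
    """Parse a literal such as '2002-05-02 00:00:00-0400' into an aware datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S%z")


def later_than(bound_text):
    """Return the values of rows whose timestamp is strictly greater than the bound."""
    bound = parse_bound(bound_text)
    return [value for ts, value in rows if ts > bound]


query4 = later_than("2002-05-01 00:00:00-0400")
query5 = later_than("2002-05-02 00:00:00-0400")
print(len(query4), query4)
print(len(query5), query5)
```

Under that assumption, a strict comparison yields 9 rows for query #4 (cqlsh returned 10) and 8 rows for query #5 (cqlsh returned 0), because '00:00:00-0400' is the same instant as '04:00:00+0000'.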



> After bulk loading CQL query on timestamp column returns wrong result
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-10583
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10583
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Datastax Community Edition 2.1.10, Windows 2008 R2, Java 
> x64 1.8.0_60
>            Reporter: Kai Wang
>             Fix For: 3.x, 2.1.x, 2.2.x
>
>
> I have this table:
> {noformat}
> CREATE TABLE test (
>     tag text,
>     group int,
>     timestamp timestamp,
>     value double,
>     PRIMARY KEY (tag, group, timestamp)
> ) WITH CLUSTERING ORDER BY (group ASC, timestamp DESC)
> {noformat}
> First I used CQLSSTableWriter to bulk load a bunch of sstables. Then I ran 
> this query:
> {noformat}
> cqlsh> select * from test where tag = 'MSFT' and group = 1 and timestamp = '2004-12-15 16:00:00-0500';
>  tag  | group | timestamp                | value
> ------+-------+--------------------------+-------
>  MSFT |     1 | 2004-12-15 21:00:00+0000 | 27.11
>  MSFT |     1 | 2004-12-16 21:00:00+0000 | 27.16
>  MSFT |     1 | 2004-12-17 21:00:00+0000 | 26.96
>  MSFT |     1 | 2004-12-20 21:00:00+0000 | 26.95
>  MSFT |     1 | 2004-12-21 21:00:00+0000 | 27.07
>  MSFT |     1 | 2004-12-22 21:00:00+0000 | 26.98
>  MSFT |     1 | 2004-12-23 21:00:00+0000 | 27.01
>  MSFT |     1 | 2004-12-27 21:00:00+0000 | 26.85
>  MSFT |     1 | 2004-12-28 21:00:00+0000 | 26.95
>  MSFT |     1 | 2004-12-29 21:00:00+0000 |  26.9
>  MSFT |     1 | 2004-12-30 21:00:00+0000 | 26.76
> (11 rows)
> {noformat}
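> The result is obviously wrong: with equality on every primary key column, 
> the query should match at most one row, yet 11 rows come back.
> If I run this query, whose timestamp equals the 2004-12-16 21:00:00+0000 row 
> shown above, nothing is returned: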
> {noformat}
> cqlsh> select * from test where tag = 'MSFT' and group = 1 and timestamp = '2004-12-16 16:00:00-0500';
>  tag | group | timestamp | value
> -----+-------+-----------+-------
> (0 rows)
> {noformat}
> In DevCenter I tried to create a similar table and insert a few rows, but 
> couldn't reproduce this. This may have something to do with the bulk loading 
> process. Still, the fact that cqlsh returns data that doesn't match the query 
> is concerning.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
