[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-22 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802269#comment-13802269
 ] 

Constance Eustace commented on CASSANDRA-6220:
--

It may also require invalidating the key cache (nodetool invalidatekeycache) 
and possibly the other caches, perhaps with a flush in there as well...

> Unable to select multiple entries using In clause on clustering part of 
> compound key
> 
>
> Key: CASSANDRA-6220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6220
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Ashot Golovenko
> Attachments: inserts.zip
>
>
> I have the following table:
> CREATE TABLE rating (
> id bigint,
> mid int,
> hid int,
> r double,
> PRIMARY KEY ((id, mid), hid));
> And I get very strange result sets on the following queries:
> cqlsh:bm> SELECT hid, r FROM rating WHERE id = 755349113 and mid = 201310 
> and hid = 201329320;
>  hid       | r
> -----------+--------
>  201329320 | 45.476
> (1 rows)
> cqlsh:bm> SELECT hid, r FROM rating WHERE id = 755349113 and mid = 201310 
> and hid = 201329220;
>  hid       | r
> -----------+-------
>  201329220 | 53.62
> (1 rows)
> cqlsh:bm> SELECT hid, r FROM rating WHERE id = 755349113 and mid = 201310 
> and hid in (201329320, 201329220);
>  hid       | r
> -----------+--------
>  201329320 | 45.476
> (1 rows)  <-- WRONG - should be two records
> As you can see, although both records exist, I'm not able to fetch both of 
> them using an IN clause. For now I have to loop over my requests (about 30 
> of them), which I find highly inefficient given that I am querying 
> physically the same row. 
> What's more, it doesn't happen all the time! For different id values I 
> sometimes get the correct dataset.
> Ideally I'd like the following select to work:
> SELECT hid, r FROM rating WHERE id = 755349113 and mid in ? and hid in ?;
> Which doesn't work either.
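
The workaround the reporter describes (one equality query per hid instead of a single IN query) can be sketched roughly as below. This is an illustration only, not the reporter's code: `execute` is a hypothetical callable standing in for whatever driver session is in use, assumed to take a CQL string plus bound parameters and return rows as (hid, r) tuples, and `fake_execute` is an in-memory stand-in seeded with the two rows from the report so that the sketch runs without a cluster. `missing_hids` is likewise a made-up helper for spotting rows that an IN query dropped.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Row = Tuple[int, float]
# Hypothetical executor: takes a CQL string and bound parameters, returns rows.
Execute = Callable[[str, Sequence[object]], List[Row]]

SINGLE_HID_QUERY = "SELECT hid, r FROM rating WHERE id = ? AND mid = ? AND hid = ?"


def fetch_one_by_one(execute: Execute, id_: int, mid: int,
                     hids: Iterable[int]) -> Dict[int, float]:
    """Issue one equality query per hid and merge the results by hid."""
    ratings: Dict[int, float] = {}
    for hid in hids:
        for row_hid, r in execute(SINGLE_HID_QUERY, (id_, mid, hid)):
            ratings[row_hid] = r
    return ratings


def missing_hids(expected: Dict[int, float], in_rows: Iterable[Row]) -> List[int]:
    """Return hids found by the per-hid queries but absent from an IN result."""
    returned = {hid for hid, _ in in_rows}
    return sorted(hid for hid in expected if hid not in returned)


# In-memory stand-in for a real session, seeded with the two rows from the report.
_DATA = {
    (755349113, 201310, 201329320): 45.476,
    (755349113, 201310, 201329220): 53.62,
}


def fake_execute(query: str, params: Sequence[object]) -> List[Row]:
    """Look up a single (id, mid, hid) key; the query text is ignored here."""
    key = tuple(params)
    return [(key[2], _DATA[key])] if key in _DATA else []
```

Calling `fetch_one_by_one(fake_execute, 755349113, 201310, [201329320, 201329220])` and passing the result to `missing_hids` together with the single row returned by the IN query in the report, `[(201329320, 45.476)]`, flags hid 201329220 as the row that was dropped.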



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-22 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802237#comment-13802237
 ] 

Constance Eustace commented on CASSANDRA-6220:
--

If I do this sequence, the problem does not reproduce:

DROP SCHEMA
CREATE SCHEMA
CREATE INITIAL DATA (i.e. no updates to existing data)
NODETOOL COMPACT <-- magic sauce
MASSIVE INSERT + SIMULTANEOUS UPDATES to INITIAL DATA

The nodetool compact after the schema creation seems to 
reset/stabilize the database. I used to reproduce the problem very reliably 
after about 300,000 inserts / 2,000 updates. Now I do 1.75 million inserts 
with 20,000 updates and see no reproduction.
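
The ordering of the sequence above could be scripted along these lines. This is a sketch under assumptions, not the script used for the reproduction: the three callables (`recreate_schema`, `load_initial_data`, `run_workload`) are hypothetical hooks for the schema, initial-data and insert/update steps, the keyspace and table names are placeholders supplied by the caller, and `nodetool` is assumed to be on the PATH and pointed at the node under test.

```python
import subprocess
from typing import Callable, List, Sequence


def compact_command(keyspace: str, table: str) -> List[str]:
    """Build the argv for a major compaction of one table via nodetool."""
    return ["nodetool", "compact", keyspace, table]


def run_sequence(
    recreate_schema: Callable[[], None],    # hypothetical: DROP + CREATE SCHEMA
    load_initial_data: Callable[[], None],  # hypothetical: inserts only, no updates
    run_workload: Callable[[], None],       # hypothetical: massive insert + updates
    keyspace: str,
    table: str,
    run_command: Callable[[Sequence[str]], object] = (
        lambda argv: subprocess.run(argv, check=True)
    ),
) -> None:
    """Run the steps in the order listed, compacting before the heavy workload."""
    recreate_schema()
    load_initial_data()
    # The step described as the "magic sauce": compact before any updates arrive.
    run_command(compact_command(keyspace, table))
    run_workload()
```

Passing a different `run_command` (for example one that only records the argv) makes it possible to dry-run the ordering without touching a node.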






[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-22 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802209#comment-13802209
 ] 

Constance Eustace commented on CASSANDRA-6220:
--

My current thinking is that truncation / schema recreation disrupts the 
synchronization of compaction with the search data structures / SSTables, and 
that until a full compaction completes, updates made after the 
truncation / schema recreation are unreliable...



[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-21 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800984#comment-13800984
 ] 

Constance Eustace commented on CASSANDRA-6220:
--

Does nodetool compact fix the corruption? It did for me, 
but I don't think it stops the ongoing corruption...



[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-21 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800734#comment-13800734
 ] 

Constance Eustace commented on CASSANDRA-6220:
--

Thanks. I was going to write a Java driver reproduction in case cass-jdbc was 
somehow causing the problem, but if you've reproduced it that way I don't have 
to...




[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-21 Thread Ashot Golovenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800720#comment-13800720
 ] 

Ashot Golovenko commented on CASSANDRA-6220:


For the inserts I was using the DataStax Java driver 1.0.3 with Cassandra 2.0.1, 
on a single node running Mac OS X 10.8.5 with an SSD.
The wrong result sets can be seen through both the Java driver and cqlsh.




[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-21 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800710#comment-13800710
 ] 

Constance Eustace commented on CASSANDRA-6220:
--

What do you use? Cass-jdbc, binary protocol, or is this simply cqlsh scripts?



[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-21 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800709#comment-13800709
 ] 

Constance Eustace commented on CASSANDRA-6220:
--

One of the CASSANDRA-6137 comments links to a GitHub repository with a 
reproduction script, if you need to reproduce this reliably. It takes about 
400,000 inserts + 6,000 updates.



[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800239#comment-13800239
 ] 

Jonathan Ellis commented on CASSANDRA-6220:
---

Can you give an example of what we need to INSERT to reproduce?



[jira] [Commented] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key

2013-10-20 Thread Ashot Golovenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800213#comment-13800213
 ] 

Ashot Golovenko commented on CASSANDRA-6220:


Looks like the same problem.
