[jira] [Comment Edited] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key
[ https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802237#comment-13802237 ]

Constance Eustace edited comment on CASSANDRA-6220 at 10/22/13 8:41 PM:

If I do this sequence:

DROP SCHEMA
CREATE SCHEMA
CREATE INITIAL DATA (i.e. no updates to existing data)
NODETOOL COMPACT -- magic sauce
MASSIVE INSERT + SIMULTANEOUS UPDATES to INITIAL DATA

it does not reproduce. The nodetool compact after the schema creation seems to reset/stabilize the database. I used to replicate very reliably after about 300,000 inserts / 2,000 updates. Now I do 1.75 million inserts with 20,000 updates and no reproduction.

Obviously you could probably run the nodetool compact after the SCHEMA creation, and then do the initial data creation/update+insert run.

was (Author: cowardlydragon):

If I do this sequence:

DROP SCHEMA
CREATE SCHEMA
CREATE INITIAL DATA (i.e. no updates to existing data)
NODETOOL COMPACT -- magic sauce
MASSIVE INSERT + SIMULTANEOUS UPDATES to INITIAL DATA

it does not reproduce. The nodetool compact after the schema creation seems to reset/stabilize the database. I used to replicate very reliably after about 300,000 inserts / 2,000 updates. Now I do 1.75 million inserts with 20,000 updates and no reproduction.
Unable to select multiple entries using In clause on clustering part of compound key

Key: CASSANDRA-6220
URL: https://issues.apache.org/jira/browse/CASSANDRA-6220
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ashot Golovenko
Attachments: inserts.zip

I have the following table:

CREATE TABLE rating (
    id bigint,
    mid int,
    hid int,
    r double,
    PRIMARY KEY ((id, mid), hid));

And I get really really strange result sets on the following queries:

cqlsh:bm> SELECT hid, r FROM rating WHERE id = 755349113 and mid = 201310 and hid = 201329320;

 hid       | r
-----------+--------
 201329320 | 45.476

(1 rows)

cqlsh:bm> SELECT hid, r FROM rating WHERE id = 755349113 and mid = 201310 and hid = 201329220;

 hid       | r
-----------+-------
 201329220 | 53.62

(1 rows)

cqlsh:bm> SELECT hid, r FROM rating WHERE id = 755349113 and mid = 201310 and hid in (201329320, 201329220);

 hid       | r
-----------+--------
 201329320 | 45.476

(1 rows) -- WRONG - should be two records

As you can see, although both records exist I'm not able to fetch all of them using the in clause. For now I have to cycle through my requests, about 30 of them, which I find highly inefficient given that I query physically the same row. What's more, it doesn't happen all the time! For different id values I sometimes get the correct dataset.

Ideally I'd like the following select to work:

SELECT hid, r FROM rating WHERE id = 755349113 and mid in ? and hid in ?;

Which doesn't work either.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
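The per-value workaround the reporter describes (cycling single-hid requests instead of one IN query) could be sketched roughly as follows. This is only an illustration, not the reporter's code: `fetch_ratings`, `Execute`, and the `execute` callable are hypothetical names, and `execute` stands in for whatever driver call runs a parameterized CQL statement and returns rows.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical executor type (an assumption, not a real driver API):
# takes a CQL string plus its bind values and returns (hid, r) rows.
Execute = Callable[[str, Sequence[object]], List[Tuple[int, float]]]

# Equality on the clustering column, as in the queries that returned
# correct results in the report.
SINGLE_HID_QUERY = (
    "SELECT hid, r FROM rating WHERE id = ? AND mid = ? AND hid = ?"
)


def fetch_ratings(execute: Execute, id_: int, mid: int,
                  hids: Iterable[int]) -> List[Tuple[int, float]]:
    """Fetch one row per hid with equality queries instead of a single IN."""
    rows: List[Tuple[int, float]] = []
    for hid in hids:
        # One round trip per hid; this is the inefficiency the reporter
        # complains about, traded for correct results.
        rows.extend(execute(SINGLE_HID_QUERY, (id_, mid, hid)))
    return rows
```

Each call costs one round trip per hid value, so for about 30 values this means about 30 requests against the same partition.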
[jira] [Comment Edited] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key
[ https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802269#comment-13802269 ]

Constance Eustace edited comment on CASSANDRA-6220 at 10/22/13 9:11 PM:

I was able to reproduce it the original way (drop schema, create schema, INSERT / UPDATE with no nodetool compact in there).

Repairing the corruption afterwards seemed to require nodetool compact, invalidatekeycache, and/or possibly flush.

Now that I've repaired, I'm going to run a 3.5 million insert + simultaneous update run to see if the nodetool compact repair makes the data more durable.

was (Author: cowardlydragon):

It may also require invalidatekeycache / caches, possibly with a flush in there as well...
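The repair steps named in the comment above (nodetool compact, invalidatekeycache, and possibly flush) could be collected in a small helper like the one below. This is a sketch under assumptions: `repair_commands` is a hypothetical name, the keyspace and table arguments are placeholders, and the ordering (flush, then compact, then invalidatekeycache) is a guess, since the comment does not say which order or which subset was actually needed.

```python
from typing import List


def repair_commands(keyspace: str, table: str) -> List[List[str]]:
    """Build the nodetool invocations mentioned in the comment.

    The order used here is an assumption, not something the comment states.
    """
    return [
        ["nodetool", "flush", keyspace, table],    # write memtables to disk
        ["nodetool", "compact", keyspace, table],  # force a major compaction
        ["nodetool", "invalidatekeycache"],        # drop the key cache
    ]


if __name__ == "__main__":
    # Print rather than execute, so the commands can be reviewed first.
    # "my_keyspace" and "my_table" are placeholder names.
    for argv in repair_commands("my_keyspace", "my_table"):
        print(" ".join(argv))
```

Printing the commands instead of running them keeps the sketch safe to try on a machine without Cassandra installed.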
[jira] [Comment Edited] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key
[ https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802269#comment-13802269 ]

Constance Eustace edited comment on CASSANDRA-6220 at 10/22/13 9:12 PM:

I was able to reproduce it the original way (drop schema, create schema, INSERT / UPDATE with no nodetool compact in there).

Repairing the corruption afterwards seemed to require nodetool compact, invalidatekeycache, and/or possibly flush.

Now that I've repaired, I'm going to run a 3.5 million insert + simultaneous update run to see if the nodetool compact repair makes the data more durable, as seen earlier today.

was (Author: cowardlydragon):

I was able to reproduce it the original way (drop schema, create schema, INSERT / UPDATE with no nodetool compact in there).

Repairing the corruption afterwards seemed to require nodetool compact, invalidatekeycache, and/or possibly flush.

Now that I've repaired, I'm going to run a 3.5 million insert + simultaneous update run to see if the nodetool compact repair makes the data more durable.
[jira] [Comment Edited] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key
[ https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800709#comment-13800709 ]

Constance Eustace edited comment on CASSANDRA-6220 at 10/21/13 2:58 PM:

One of the CASS-6137 comments has a github with a reproduction script if you need to reliably reproduce. Takes about 400,000 inserts + 6,000 updates for me, single node.

https://github.com/cowarlydragon/CASS-6137

was (Author: cowardlydragon):

One of the CASS-6137 comments has a github with a reproduction script if you need to reliably reproduce. Takes about 400,000 inserts + 6,000 updates.
[jira] [Comment Edited] (CASSANDRA-6220) Unable to select multiple entries using In clause on clustering part of compound key
[ https://issues.apache.org/jira/browse/CASSANDRA-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800984#comment-13800984 ]

Constance Eustace edited comment on CASSANDRA-6220 at 10/21/13 8:15 PM:

Does nodetool compact keyspace tablename fix the corruption? It did for me, but I don't think it stops the ongoing corruption...

EDIT: my reproduction seems to indicate that nodetool compact MAY also protect updates made after it was executed... I was unable to generate bad queries after another 1.5 million row inserts and 30,000 updates to existing data.

was (Author: cowardlydragon):

Does nodetool compact keyspace tablename fix the corruption? It did for me, but I don't think it stops the ongoing corruption...