[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358561#comment-16358561 ]

Datta commented on CASSANDRA-6875:
----------------------------------

Is there any other way to compare multiple columns in an IN clause (other than clustering keys)?

> CQL3: select multiple CQL rows in a single partition using IN
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-6875
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>            Reporter: Nicolas Favre-Felix
>            Assignee: Tyler Hobbs
>            Priority: Minor
>             Fix For: 2.0.9, 2.1 rc1
>
>         Attachments: 6875-part2-v2.txt, 6875-part2.txt
>
>
> In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is
> important to support reading several distinct CQL rows from a given partition
> using a distinct set of "coordinates" for these rows within the partition.
> CASSANDRA-4851 introduced a range scan over the multi-dimensional space of
> clustering keys. We also need to support a "multi-get" of CQL rows,
> potentially using the "IN" keyword to define a set of clustering keys to
> fetch at once.
> (reusing the same example:)
> Consider the following table:
> {code}
> CREATE TABLE test (
>   k int,
>   c1 int,
>   c2 int,
>   PRIMARY KEY (k, c1, c2)
> );
> {code}
> with the following data:
> {code}
>  k | c1 | c2
> ---+----+----
>  0 |  0 |  0
>  0 |  0 |  1
>  0 |  1 |  0
>  0 |  1 |  1
> {code}
> We can fetch a single row or a range of rows, but not a set of them:
> {code}
> > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
> Bad Request: line 1:54 missing EOF at ','
> {code}
> Supporting this syntax would return:
> {code}
>  k | c1 | c2
> ---+----+----
>  0 |  0 |  0
>  0 |  1 |  1
> {code}
> Being able to fetch these two CQL rows in a single read is important to
> maintain partition-level isolation.
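For readers skimming the thread, here is a minimal in-memory sketch of the semantics the issue asks for: given the four example rows, keep only those whose (c1, c2) pair matches one of the requested tuples. It has nothing to do with how Cassandra implements the feature; the class name MultiColumnInSketch and the method selectIn are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: an in-memory model of "(c1, c2) IN (...)" on one partition.
final class MultiColumnInSketch {

    /** Returns the rows {k, c1, c2} whose (c1, c2) pair equals one of the requested tuples. */
    static List<int[]> selectIn(List<int[]> partition, List<int[]> tuples) {
        List<int[]> result = new ArrayList<>();
        for (int[] row : partition) {
            for (int[] t : tuples) {
                if (row[1] == t[0] && row[2] == t[1]) {
                    result.add(row);
                    break; // a row is returned at most once
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The four rows of partition k = 0 from the issue description.
        List<int[]> partition = Arrays.asList(
            new int[]{0, 0, 0}, new int[]{0, 0, 1},
            new int[]{0, 1, 0}, new int[]{0, 1, 1});
        // WHERE k = 0 AND (c1, c2) IN ((0, 0), (1, 1))
        List<int[]> wanted = Arrays.asList(new int[]{0, 0}, new int[]{1, 1});
        for (int[] row : selectIn(partition, wanted))
            System.out.println(Arrays.toString(row));
    }
}
```

The point of the real feature is that both rows come back from a single read of the partition, which is what preserves partition-level isolation; the sketch only shows which rows qualify.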
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009075#comment-14009075 ]

Bill Mitchell commented on CASSANDRA-6875:
------------------------------------------

To try this out, I cobbled together a test case by accessing the TupleType directly on the client side, as this feature is not yet supported in the Java driver. My approach was to serialize my two ordering column values, use TupleType.buildValue() to concatenate them into a single ByteBuffer, build a List of all these, then call serialize on a ListType instance to get a single ByteBuffer representing the entire list, and bind that using setBytesUnsafe(). I'm not totally sure of all this, but it seems reasonable.

My SELECT statement syntax followed the first of the three Tyler suggested: ... WHERE (c1, c2) IN ?, as this allows the statement to be prepared only once, irrespective of the number of compound keys provided.

What I saw was the following traceback on the server:

14/05/26 14:33:09 ERROR messages.ErrorMessage: Unexpected exception during request
java.util.NoSuchElementException
	at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:396)
	at java.util.LinkedHashMap$ValueIterator.next(LinkedHashMap.java:409)
	at org.apache.cassandra.cql3.statements.SelectStatement.buildMultiColumnInBound(SelectStatement.java:941)
	at org.apache.cassandra.cql3.statements.SelectStatement.buildBound(SelectStatement.java:814)
	at org.apache.cassandra.cql3.statements.SelectStatement.getRequestedBound(SelectStatement.java:977)
	at org.apache.cassandra.cql3.statements.SelectStatement.makeFilter(SelectStatement.java:444)
	at org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:340)
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:210)
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:61)
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
	at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:309)
	at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:132)
	at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)

Stepping through the code, it appears to have analyzed my statement correctly. In buildMultiColumnInBound, splitInValues contains 1426 tuples, which is the number I intended to pass. The names parameter identifies two columns, createdate and emailcrypt. The loop executes twice, but on the third iteration there are no more elements in names, thus the exception.

Moving the construction of the iterator within the loop fixed my exception. The code still looks suspect, though, as it calculates a bound b based on whether the first column is reversed, then uses bound, not b, in the following statement. I've not researched which would be correct, as this appears closely related to the fix Sylvain just developed for CASSANDRA-7105.

{code}
TreeSet<ByteBuffer> inValues = new TreeSet<>(isReversed ? cfDef.cfm.comparator.reverseComparator : cfDef.cfm.comparator);
for (List<ByteBuffer> components : splitInValues)
{
    ColumnNameBuilder nameBuilder = builder.copy();
    for (ByteBuffer component : components)
        nameBuilder.add(component);

    // Iterator is now constructed inside the loop, once per tuple
    Iterator<CFDefinition.Name> iter = names.iterator();
    Bound b = isReversed == isReversedType(iter.next()) ? bound : Bound.reverse(bound);
    // The condition below tests bound, not b
    inValues.add((bound == Bound.END && nameBuilder.remainingCount() > 0)
                 ? nameBuilder.buildAsEndOfRange()
                 : nameBuilder.build());
}
return new ArrayList<>(inValues);
{code}
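The iterator exhaustion Bill describes can be reproduced outside Cassandra. The sketch below is not the SelectStatement code; the class IteratorScopeSketch and its two methods are hypothetical stand-ins that only mimic the loop shape: one iterator over the column names shared across all tuples, versus a fresh iterator per tuple.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Illustrative only: mimics the loop shape, not the real bound-building logic.
final class IteratorScopeSketch {

    /** Iterator created once, outside the loop: it advances once per tuple and runs out. */
    static int sharedIterator(List<String> names, int tupleCount) {
        Iterator<String> iter = names.iterator();
        int processed = 0;
        for (int i = 0; i < tupleCount; i++) {
            iter.next(); // throws NoSuchElementException once i == names.size()
            processed++;
        }
        return processed;
    }

    /** Iterator created per tuple: every iteration reads the first column name. */
    static int freshIterator(List<String> names, int tupleCount) {
        int processed = 0;
        for (int i = 0; i < tupleCount; i++) {
            Iterator<String> iter = names.iterator();
            iter.next();
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("createdate", "emailcrypt");
        System.out.println(freshIterator(names, 1426));
        try {
            sharedIterator(names, 1426);
        } catch (NoSuchElementException e) {
            System.out.println("shared iterator ran out on the third tuple");
        }
    }
}
```

With two column names, the shared iterator fails on the third tuple regardless of how many tuples are bound, which matches the "loop executes twice" observation. Whether a per-tuple iterator or a single lookup of the first column is the right fix is a separate question that the sketch does not settle.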
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007313#comment-14007313 ]

Tyler Hobbs commented on CASSANDRA-6875:
----------------------------------------

[~slebresne] done. Thanks for catching that.
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006950#comment-14006950 ]

Sylvain Lebresne commented on CASSANDRA-6875:
---------------------------------------------

[~thobbs] Can you bump the CQL version in QueryProcessor and update the CQL doc accordingly too?
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004951#comment-14004951 ]

Sylvain Lebresne commented on CASSANDRA-6875:
---------------------------------------------

bq. As I mentioned, I've kept the same behavior for slices over mixed-asc/desc comparators; I'll wait for your response to decide what to do about that.

That's definitely a problem and we should fix it. I've created CASSANDRA-7281 for that. As far as this patch goes, as long as it preserves correct behavior for all-ASC or all-DESC (which I believe it does), we can leave the rest to CASSANDRA-7281.

The patch lgtm with 2 minor nits (and you will have to rebase it before commit btw):
* In buildBound, I believe we can just as well check restriction[0], saving the iterator creation.
* There's an unnecessary import in QueryProcessor (of VisibleForTesting).

That said, I'm still a bit uncomfortable pushing this in 2.0 given where we are in the 2.0 cycle. But if I'm the only one feeling this way, I suppose I can shut up (and so +1 minus that caveat).
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004088#comment-14004088 ]

Tyler Hobbs commented on CASSANDRA-6875:
----------------------------------------

I've backported TupleType from 7248 and pushed my changes to the same branch. (I merged in cassandra-2.0, so let me know if you'd like a rebase or anything.)
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999757#comment-13999757 ]

Sylvain Lebresne commented on CASSANDRA-6875:
---------------------------------------------

I've created CASSANDRA-7248 for the more general support of tuple types discussed above.

I'll note that imo the correct course of action would be to postpone this to 2.1 and base it on top of CASSANDRA-7248. This would make it much easier on the client drivers anyway, since they'll have support for the TupleType in the v3 protocol, while if we commit this to 2.0 they won't and will have to special-case it. Also, the changes here are far from trivial, so it would also make sense in terms of "not introducing risky new features" in a minor release.

Though if we absolutely insist on getting this in 2.0, we can just extract the TupleType AbstractType from the CASSANDRA-7248 patch and use that here.
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996581#comment-13996581 ]

Sylvain Lebresne commented on CASSANDRA-6875:
---------------------------------------------

I haven't looked at that last version yet (but will try to shortly), but something occurred to me. The patch currently uses CompositeType for tuple markers (and thus tuple values must be serialized following CompositeType). We shouldn't do that. CompositeType is a type that is never used by the native protocol, so 1) it will appear like a "custom type" to drivers and 2) the composite encoding is inconsistent with the rest of the encodings; in particular, the end-of-component byte that said encoding contains is completely useless. This is basically the same problem as CASSANDRA-7209: we shouldn't commit to such an inadequate encoding.

Another way to put it is that this patch introduces the concept of tuple values in the native protocol, but we don't have such a type. So maybe the right way to go would be to introduce such a tuple type first (which would really just be some form of anonymous user type) and use that.

Of course, adding such a type is even more out of scope for 2.0 than this ticket already is. So I think we may want to introduce such a tuple type in 2.1, but I suppose we could still commit this to 2.0 with the right encoding (there is no reason not to reuse the one for user types from CASSANDRA-7209) and just some simple TupleType (that won't have more support in 2.0 than what this issue does; it will be exposed to drivers as a custom type for 2.0, but I don't think there is anything we can do about that one).
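To make the encoding objection in the comment above concrete, here is a rough sketch of the two layouts side by side. The byte layouts are assumptions based on a general understanding of Cassandra's formats and are not taken from the patch: composite-style is assumed to be a 2-byte length, the value, and a 1-byte end-of-component marker per component; tuple-style is assumed to be a 4-byte length followed by the value. The class TupleEncodingSketch and its methods are hypothetical.

```java
import java.nio.ByteBuffer;

// Illustrative only: layouts are assumptions, not the patch's serialization code.
final class TupleEncodingSketch {

    /** Serializes an int as 4 big-endian bytes, ready for reading. */
    static ByteBuffer intBytes(int v) {
        ByteBuffer b = ByteBuffer.allocate(4);
        b.putInt(v);
        b.flip();
        return b;
    }

    /** Assumed composite-style layout: [2-byte length][value][1-byte end-of-component]. */
    static ByteBuffer compositeStyle(ByteBuffer... components) {
        int size = 0;
        for (ByteBuffer c : components)
            size += 2 + c.remaining() + 1;
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components) {
            out.putShort((short) c.remaining());
            out.put(c.duplicate()); // duplicate() leaves the caller's buffer position untouched
            out.put((byte) 0);      // end-of-component byte, carries no information for a plain value
        }
        out.flip();
        return out;
    }

    /** Assumed tuple-style layout: [4-byte length][value], with no trailing marker. */
    static ByteBuffer tupleStyle(ByteBuffer... components) {
        int size = 0;
        for (ByteBuffer c : components)
            size += 4 + c.remaining();
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components) {
            out.putInt(c.remaining());
            out.put(c.duplicate());
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(compositeStyle(intBytes(0), intBytes(1)).remaining());
        System.out.println(tupleStyle(intBytes(0), intBytes(1)).remaining());
    }
}
```

Under these assumptions the composite form of the tuple (0, 1) is 14 bytes and the tuple form is 16 bytes, so the argument is about consistency with the other native-protocol encodings rather than about size.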
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993078#comment-13993078 ]

Tyler Hobbs commented on CASSANDRA-6875:
----------------------------------------

Okay, most of your concerns should be addressed on the [same branch|https://github.com/thobbs/cassandra/compare/3808c2b35993fed50554f13a73b2444afb598715...CASSANDRA-6875-2.0].

As I mentioned, I've kept the same behavior for slices over mixed-asc/desc comparators; I'll wait for your response to decide what to do about that.

Tuples.Value no longer returns serialized values, so {{buildBounds}} takes care of that work. It seemed cleanest to just split out different functions for building the different types of multi-column bounds instead of fitting it all into one loop.

I did try moving MultiColumnRestrictions into a separate attribute in SelectStatement, but that didn't seem to make things much clearer. The column names that the restriction applies to would need to be separately tracked. When processing restrictions, you would have to refer back to that list and check it or update it. I think the current behavior is reasonably clear if you think of it like this: SelectStatement.columnRestrictions tracks restrictions on each column; in the case of multi-column restrictions, some columns may share a single restriction object.

Once ASF mail is working again, I'll start that thread about tests on the dev ML.
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987526#comment-13987526 ]

Sylvain Lebresne commented on CASSANDRA-6875:
---------------------------------------------

I don't disagree: the split I'm suggesting is not absolutely future proof, and we'll need to find something better in the long run. But that doesn't change the fact that storing a multi-column restriction in an array when we only use one is misleading and error prone as of this patch, imo. I mainly meant the splitting as a way to support what we do support today more cleanly, but I absolutely do think SelectStatement will require some serious refactoring sooner rather than later. There's CASSANDRA-4762 too, which we will want to get to at some point (and which, it's worth mentioning, is made slightly less urgent by this ticket because it covers some of the same cases, though it's not fully equivalent) and which will complicate matters here.

So anyway, I'm all for a third option that is both clean and future proof, but I'm coming up short here right now. Or rather, I see ways to get there, but they require more refactoring than is reasonable for a 2.0 target. So I went with the suggestion that felt cleaner, if not too future proof :). But if you prefer not to do the class splitting, I can live with that as long as we maybe split buildBound and add some comments.
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986889#comment-13986889 ]

Tyler Hobbs commented on CASSANDRA-6875:
----------------------------------------

After thinking more closely about splitting up SelectStatement into two subclasses for single-column and multi-column restrictions, I'm not 100% convinced that's the best path. For example, suppose you have a query like {{SELECT * FROM foo WHERE key=0 AND c1 > 0 AND (c1, c2) < (2, 3)}}. We could a) require it to be written like {{(c1) > (0) AND (c1, c2) < (2, 3)}}, or b) accept that syntax and correctly reduce the expressions to a single multi-column slice restriction. I'm not sure that option (b) would be clearer than keeping the restrictions separate.

I can also imagine us supporting something like {{SELECT ... WHERE key=0 AND c1=0 AND (c2, c3) > (1, 2)}} in the future. Of course, we could also require this to be written differently ({{(c1, c2, c3) > (0, 1, 2) AND (c1) <= (0)}}) or reduce it to a single multi-column slice restriction. I'm just pointing out that this may become less clear than simply improving the bounds-building code (which I agree is needed).
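As a reading aid for the mixed restriction discussed in the comment above, {{c1 > 0 AND (c1, c2) < (2, 3)}}, the sketch below evaluates it over an in-memory list of (c1, c2) pairs, assuming the multi-column comparison is lexicographic over all-ascending clustering columns. The class TupleSliceSketch and its methods are hypothetical and say nothing about how SelectStatement represents restrictions internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: evaluates the restriction row by row, not via slice bounds.
final class TupleSliceSketch {

    /** Lexicographic comparison of two equal-length tuples. */
    static int compareTuples(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            int cmp = Integer.compare(a[i], b[i]);
            if (cmp != 0)
                return cmp;
        }
        return 0;
    }

    /** Rows (c1, c2) satisfying c1 > lower AND (c1, c2) < upper. */
    static List<int[]> slice(List<int[]> rows, int lower, int[] upper) {
        List<int[]> out = new ArrayList<>();
        for (int[] r : rows)
            if (r[0] > lower && compareTuples(r, upper) < 0)
                out.add(r);
        return out;
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
            new int[]{0, 5}, new int[]{1, 0}, new int[]{1, 9},
            new int[]{2, 2}, new int[]{2, 3}, new int[]{3, 0});
        // c1 > 0 AND (c1, c2) < (2, 3)
        for (int[] r : slice(rows, 0, new int[]{2, 3}))
            System.out.println(Arrays.toString(r));
    }
}
```

Note that (1, 9) qualifies even though 9 > 3, because the tuple comparison is decided by c1 first; this is the main way a multi-column slice differs from two independent single-column restrictions.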
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986810#comment-13986810 ]

Sylvain Lebresne commented on CASSANDRA-6875:
---------------------------------------------

bq. Do you mind if I start a dev ML thread?

Absolutely not.
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986770#comment-13986770 ]

Tyler Hobbs commented on CASSANDRA-6875:
----------------------------------------

bq. We can, see cql_prepared_test.py (arguably our number of tests for prepared statements is deeply lacking, but it's possible to have some).

Ah, thanks, good to know.

bq. I'll fight to the death the concept that "unit tests are a lot faster to work with" as an absolute truth.

Don't worry, I'm not going to challenge you to a duel :). It's not an absolute truth, but it's easy to do things like run a unit test with the debugger on, which makes a big difference in some cases.

bq. fixing those issues is likely simpler than migrating all the existing tests back to the unit tests.

I'm definitely not suggesting moving any existing dtests to unit tests. I'm just proposing that we allow some mix of unit tests and dtests for newly written tests.

bq. we could have a debate I suppose (but definitely not here)

Do you mind if I start a dev ML thread? It would be good to get input from other devs and QA.
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986597#comment-13986597 ] Sylvain Lebresne commented on CASSANDRA-6875: - Ahah, it is uncommitted. Well I just did commit it for info. It's just one uninteresting test though, not sure why I never committed it. > CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.0.8 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once. > (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986584#comment-13986584 ] Ryan McGuire commented on CASSANDRA-6875: - bq. We can, see cql_prepared_test.py (arguably our number of tests for prepared statements is deeply lacking, but it's possible to have some). [~slebresne] Is that test possibly uncommitted? I actually don't have that test in my repo. > CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.0.8 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once. > (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986466#comment-13986466 ] Sylvain Lebresne commented on CASSANDRA-6875: -
bq. we can't use prepared statements with the dtests right now (as far as I can tell)
We can, see cql_prepared_test.py (arguably our number of tests for prepared statements is deeply lacking, but it's possible to have some). On the more general question of where tests should go, we could have a debate I suppose (but definitely not here), but frankly, I don't think there are very many downsides to having the tests in the dtests, and since we have tons of them there already, I'd much rather not waste precious time changing things for the sake of change. But quickly, the reasons why I think dtests are really not that bad here:
# it doesn't get a whole lot more end-to-end than the CQL tests imo.
# dtests feel to me a lot more readable and easier to work with, mainly because for that kind of test Python is just more comfortable/quick to get things done.
# there *are* a few of the CQL dtests where we do want a distributed setup, like CAS tests. Of course we could leave those in dtests and move the rest into the unit test suite, but keeping all CQL tests in the same place just feels simpler (you don't duplicate all those small utility functions that you invariably need to make tests easier).
# I work with unit tests and dtests daily, and it honestly is not at all my experience that working with unit tests is a "lot faster". Quite the contrary in fact. I'm willing to admit that one may be more comfortable with one suite or the other, but I'll fight to the death the concept that "unit tests are a *lot* faster to work with" as an absolute truth.
I'll note the CQL dtests are not perfect. They could use being reorganized a bit, and we can speed them up dramatically by not spinning up a cluster on every test.
That said, fixing those issues is likely simpler than migrating all the existing tests back to the unit tests. All this said, to focus back on this issue, I'd rather keep CQL tests in dtests for now (even if we do start a debate on changing that on the dev list), but I won't block the issue if that's not the case. That remark was in the category "minor comments". > CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.0.8 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once. > (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
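The remark about not spinning up a cluster on every test can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: `FakeCluster` is a hypothetical stand-in that merely counts how often it is started, not a real dtest or ccm class, and the two test methods are invented for the example. The point is the `unittest` pattern of creating the expensive resource once in `setUpClass` and resetting only per-test state in `setUp`.

```python
import unittest


class FakeCluster:
    """Hypothetical stand-in for a test cluster; NOT a real dtest/ccm class."""
    starts = 0  # how many times a "cluster" has been started

    def __init__(self):
        FakeCluster.starts += 1
        self.tables = set()

    def execute(self, statement):
        # Pretend to run a statement; this sketch only tracks table names.
        if statement.startswith("CREATE TABLE "):
            self.tables.add(statement.split()[2])

    def reset(self):
        # Clear per-test state instead of tearing the whole cluster down.
        self.tables.clear()


class SharedClusterTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # The expensive step runs once for the whole class, not once per test.
        cls.cluster = FakeCluster()

    def setUp(self):
        # Cheap cleanup so tests stay independent of one another.
        self.cluster.reset()

    def test_create_first(self):
        self.cluster.execute("CREATE TABLE test1 (k int PRIMARY KEY)")
        self.assertEqual(self.cluster.tables, {"test1"})

    def test_create_second(self):
        self.cluster.execute("CREATE TABLE test2 (k int PRIMARY KEY)")
        self.assertEqual(self.cluster.tables, {"test2"})


def run_suite():
    """Run both tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SharedClusterTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` executes two tests while the stand-in cluster is started only once; the trade-off is that each test must now clean up after itself (here via `reset()`), which is the kind of reorganization the comment alludes to.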
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986178#comment-13986178 ] Tyler Hobbs commented on CASSANDRA-6875:
bq. It bothers me to start adding unit tests for CQL queries when all of our CQL tests are currently in dtest. I'd much rather keep it all in the dtests to avoid the confusion on what is tested where.
I went with unit tests for a couple of reasons. The main one is that we can't use prepared statements with the dtests right now (as far as I can tell). The other is that it's a lot faster to work with unit tests than dtests here, and a distributed setup isn't really required for any of these tests. We should definitely have a discussion about where and how to test in general (maybe in a dev ML thread), but my feeling is that dtests are good for:
* Testing end-to-end
* Tests that require more than one node
Otherwise, if it's reasonable to do in a unit test, I don't see why we wouldn't do it there. FWIW, if we could use prepared statements in the dtests, I would also add some of the prepared statement test cases from the unit tests there. All of your other suggestions sound good, thanks!
> CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.0.8 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once.
> (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985587#comment-13985587 ] Sylvain Lebresne commented on CASSANDRA-6875: - The overall principle looks good, but I feel this could use some more comments and/or be made a little clearer. Mainly, the {{isMultiColumn}} path in {{SelectStatement.buildBound}} looks weird at face value: we're inside a loop that populates a {{ColumnNameBuilder}}, but the {{isMultiColumn}} path completely ignores both the iterated object and the builder. This works because it relies on the fact that when there is a multi-column restriction then it's the only restriction (which is duplicated in the {{SelectStatement.columnRestrictions}} array), and that the value from such a restriction is not a single column value but rather a fully built, serialized composite value (which is not self-evident from the method naming in particular). But it's hard to piece all that together when you look at {{buildBound}} currently. Some comments would help, but I'd prefer going even further and moving the {{isMultiColumn}} path outside of the loop (by looking up first if the first restriction in {{columnRestrictions}} is a multi-column one or not) since it has no reason to be in the loop. In fact, I'd go a tad further by making SelectStatement abstract and having 2 subclasses, one for single-column restrictions with a {{SingleColumnRelation[] columnRestrictions}} array field as we have now, and one for multi-column that has just one non-array {{MultiColumnRestriction columnsRestriction}} field. After all, both cases exclude one another in the current implementation.
Somewhat related, I'm slightly afraid that the parts about multi-column restrictions returning fully serialized composites (through Tuples.Value.get()) will not play nice with the 2.1 code, where we don't manipulate composites as opaque ByteBuffers anymore (concretely, Tuples will serialize the composite, but SelectStatement will have to deserialize it back right away to get a Composite, which will be both ugly and inefficient). So to avoid having to change everything on merge, I think it would be cleaner to make Tuples.Value return a list of (individual column) values instead of just one, and let SelectStatement build back the full composite name using a ColumnNameBuilder. Especially if you make the per-column and multi-column paths a tad more separated as suggested above, I suspect it might clarify things a bit. Other than that, a bunch of more minor comments and nits:
* The {{SelectStatement.RawStatement.prepare()}} re-org patch breaks proper indentation in places (for instance, the indentation of parameters to 'new ColumnSpecification' in the first branch of udpateSingleColumnRestriction, though there are a few other places). Would be nice to fix those.
* Can't we use {{QueryProcessor.instance.getPrepared}} instead of creating an only-for-test {{QueryProcessor.staticGetPrepared}}? Or at the very least leave such shortcuts in the tests, where they belong.
* In Tuples.Literal.prepare, I'd prefer a good old-fashioned indexed loop to iterate over 2 lists (feels clearer, and saves the allocator creation as a bonus).
* Tuples.Raw.makeInReceiver should probably be called makeReceiver (it's not related to "IN"). I'd also drop the spaces in the string generated (if only for consistency with the string generated in INRaw). As a side note, Raw.makeReceiver uses indexed iteration while INRaw.makeInReceiver doesn't; can't we make both consistent style-wise, for OCD's sake?
* Why make methods of CQLStatement abstract (it's an interface)?
Also, I'd rather add the QueryOptions parameter to the existing executeInternal and default to QueryOptions.DEFAULT when calling it, rather than having 2 methods. Though tbh, my preference would be to move the tests to dtest and leave those somewhat unrelated changes to another ticket, see below.
* SingleColumnRelation.previousInTuple is now unused but not removed.
* We could save one list allocation (instead of both toCreate and toUpdate) in SelectStatement.updateRestrictionsForRelation (for EQ and IN, we know it can only be a create, and for slices we can look up with getExisting).
* In Restriction, the {{values}} method is already at top-level, no reason to re-declare it for EQ.
* It bothers me to start adding unit tests for CQL queries when all of our CQL tests are currently in dtest. I'd *much* rather keep it all in the dtests to avoid the confusion on what is tested where.
> CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter:
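The restructuring suggested in the review above (decide once whether the restriction is multi-column instead of checking inside the per-column loop, and have the multi-column restriction expose one value per column rather than a pre-serialized composite) can be pictured with a toy Python model. This is a sketch under loud assumptions: the classes and `build_bound` below are hypothetical simplifications written for illustration, they are not Cassandra's `SelectStatement`, `ColumnNameBuilder`, or `Tuples` code, and a plain list stands in for the builder.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Union


@dataclass
class SingleColumnRestriction:
    """Hypothetical: an EQ restriction on exactly one clustering column."""
    value: int


@dataclass
class MultiColumnRestriction:
    """Hypothetical: one restriction covering several clustering columns.

    It carries one value per column, not a pre-serialized composite.
    """
    values: List[int]


Restriction = Union[SingleColumnRestriction, MultiColumnRestriction]


def build_bound(restrictions: Sequence[Optional[Restriction]]) -> List[int]:
    """Return the clustering prefix as a list of per-column values."""
    # Multi-column case: handled once, before (and instead of) the loop.
    if restrictions and isinstance(restrictions[0], MultiColumnRestriction):
        return list(restrictions[0].values)

    # Single-column case: append one value per restricted column and
    # stop at the first unrestricted column (modeled here as None).
    builder: List[int] = []
    for restriction in restrictions:
        if restriction is None:
            break
        builder.append(restriction.value)
    return builder
```

With this shape, `build_bound([SingleColumnRestriction(0), None])` yields the one-column prefix `[0]`, and a multi-column restriction duplicated across both slots yields its full list of values without the loop ever looking at it.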
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984098#comment-13984098 ] Michaël Figuière commented on CASSANDRA-6875: - All right. Makes sense. > CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.0.8 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once. > (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983930#comment-13983930 ] Brandon Williams commented on CASSANDRA-6875: - Moving this back to 'bug', [~mfiguiere], because anything that thrift can do that CQL can't is a bug. > CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.0.8 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once. > (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973391#comment-13973391 ] Tyler Hobbs commented on CASSANDRA-6875: Regarding prepared statements, I assume we want to support all of the following: * {{... WHERE (k, c1) IN ?}} * {{... WHERE (k, c1) IN (?, ?, ...)}} * {{... WHERE (k, c1) IN ((?, ?), (?, ?), ...)}} > CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.0.8 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once. > (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
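To make the difference between the three placeholder forms listed in the comment above concrete, here is a small Python sketch. It does not talk to Cassandra or use any driver; `multi_column_in_forms` is a hypothetical helper that only builds the restriction text and the shape of the values a client would bind. How each marker maps to a value (a whole list of tuples, one tuple, or one column value) is an interpretation of the three forms, not something the ticket states, and the example uses the ticket's clustering columns `c1` and `c2`.

```python
def multi_column_in_forms(columns, rows):
    """Build (restriction text, bind values) pairs for three marker styles.

    columns: names of the restricted columns, e.g. ["c1", "c2"]
    rows:    the tuples to fetch, e.g. [(0, 0), (1, 1)]
    """
    lhs = "(" + ", ".join(columns) + ")"

    # Form 1: a single marker bound to the whole list of tuples.
    whole_list = (f"{lhs} IN ?", [list(rows)])

    # Form 2: one marker per tuple; each bound value is a complete tuple.
    per_tuple_markers = ", ".join("?" for _ in rows)
    per_tuple = (f"{lhs} IN ({per_tuple_markers})", list(rows))

    # Form 3: one marker per individual column value, flattened row by row.
    one_tuple = "(" + ", ".join("?" for _ in columns) + ")"
    per_value_markers = ", ".join(one_tuple for _ in rows)
    per_value = (
        f"{lhs} IN ({per_value_markers})",
        [value for row in rows for value in row],
    )

    return whole_list, per_tuple, per_value
```

For `["c1", "c2"]` and the rows `[(0, 0), (1, 1)]`, the first form binds one value, the second binds two tuples, and the third binds four individual integers, which is why the three forms likely need distinct handling when a statement is prepared.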
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955776#comment-13955776 ] Brandon Williams commented on CASSANDRA-6875: - Since CASSANDRA-4851 was in 2.0, let's do this one in 2.0 as well. > CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.0.7 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once. > (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
[ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938104#comment-13938104 ] Sylvain Lebresne commented on CASSANDRA-6875: - That's a relatively natural extension of CASSANDRA-4851 and we can definitely support that. > CQL3: select multiple CQL rows in a single partition using IN > - > > Key: CASSANDRA-6875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6875 > Project: Cassandra > Issue Type: Bug > Components: API >Reporter: Nicolas Favre-Felix >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 2.1 > > > In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is > important to support reading several distinct CQL rows from a given partition > using a distinct set of "coordinates" for these rows within the partition. > CASSANDRA-4851 introduced a range scan over the multi-dimensional space of > clustering keys. We also need to support a "multi-get" of CQL rows, > potentially using the "IN" keyword to define a set of clustering keys to > fetch at once. > (reusing the same example\:) > Consider the following table: > {code} > CREATE TABLE test ( > k int, > c1 int, > c2 int, > PRIMARY KEY (k, c1, c2) > ); > {code} > with the following data: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 0 | 1 > 0 | 1 | 0 > 0 | 1 | 1 > {code} > We can fetch a single row or a range of rows, but not a set of them: > {code} > > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ; > Bad Request: line 1:54 missing EOF at ',' > {code} > Supporting this syntax would return: > {code} > k | c1 | c2 > ---++ > 0 | 0 | 0 > 0 | 1 | 1 > {code} > Being able to fetch these two CQL rows in a single read is important to > maintain partition-level isolation. -- This message was sent by Atlassian JIRA (v6.2#6252)
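As a rough illustration of the semantics the ticket asks for, the following Python sketch models the example partition in memory and contrasts a multi-column IN with a range scan over clustering keys. It is a model only: `select_in` and `select_slice` are hypothetical helpers, and treating the CASSANDRA-4851 range scan as a lexicographic tuple comparison is an assumption made for this example, not a description of the server implementation.

```python
# Rows of the ticket's example table, as (k, c1, c2), in clustering order.
ROWS = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]


def select_in(rows, k, coordinates):
    """Model of: WHERE k = <k> AND (c1, c2) IN (<coordinates>)."""
    wanted = set(coordinates)
    return [row for row in rows if row[0] == k and (row[1], row[2]) in wanted]


def select_slice(rows, k, lower):
    """Model of a range scan: WHERE k = <k> AND (c1, c2) > <lower>.

    Python compares tuples lexicographically, which is used here as a
    stand-in for ordering by clustering columns.
    """
    return [row for row in rows if row[0] == k and (row[1], row[2]) > lower]
```

`select_in(ROWS, 0, [(0, 0), (1, 1)])` returns the two rows shown in the ticket's expected output, which a single contiguous range cannot express without also returning the rows in between; `select_slice(ROWS, 0, (0, 0))` returns every row after `(0, 0)`.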