[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2018-02-09 Thread Datta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358561#comment-16358561
 ] 

Datta commented on CASSANDRA-6875:
--

Is there any other way to compare multiple columns in an IN clause (for 
columns other than the clustering key)? 

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.9, 2.1 rc1
>
> Attachments: 6875-part2-v2.txt, 6875-part2.txt
>
>
> In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is 
> important to support reading several distinct CQL rows from a given partition 
> using a distinct set of "coordinates" for these rows within the partition.
> CASSANDRA-4851 introduced a range scan over the multi-dimensional space of 
> clustering keys. We also need to support a "multi-get" of CQL rows, 
> potentially using the "IN" keyword to define a set of clustering keys to 
> fetch at once.
> (reusing the same example:)
> Consider the following table:
> {code}
> CREATE TABLE test (
>   k int,
>   c1 int,
>   c2 int,
>   PRIMARY KEY (k, c1, c2)
> );
> {code}
> with the following data:
> {code}
>  k | c1 | c2
> ---+----+----
>  0 |  0 |  0
>  0 |  0 |  1
>  0 |  1 |  0
>  0 |  1 |  1
> {code}
> We can fetch a single row or a range of rows, but not a set of them:
> {code}
> > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
> Bad Request: line 1:54 missing EOF at ','
> {code}
> Supporting this syntax would return:
> {code}
>  k | c1 | c2
> ---+----+----
>  0 |  0 |  0
>  0 |  1 |  1
> {code}
> Being able to fetch these two CQL rows in a single read is important to 
> maintain partition-level isolation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-26 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009075#comment-14009075
 ] 

Bill Mitchell commented on CASSANDRA-6875:
--

To try this out, I cobbled up a test case by accessing the TupleType directly 
on the client side, as this feature is not yet supported in the Java driver.  
My approach was to serialize my two ordering column values, then use 
TupleType.buildValue() to concatenate them into a single ByteBuffer, build a 
List of all these, then use serialize on a ListType instance to get 
a single ByteBuffer representing the entire list, and bind that using 
setBytesUnsafe().  I'm not totally sure of all this, but it seems reasonable.  

My SELECT statement syntax followed the first of the three Tyler suggested: ... 
WHERE (c1, c2) IN ?, as this allows the statement to be prepared only once, 
irrespective of the number of compound keys provided.  
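
The client-side steps described above can be sketched with plain java.nio, without any driver or server classes. This is only an illustration under stated assumptions: the tuple layout (a 4-byte length before each component) and the list layout (a 2-byte element count, then a 2-byte length before each element, as in the native protocol v2 collection format) are assumptions, and the class and method names are hypothetical. It uses the int columns of the ticket's test table rather than the columns of the real schema.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class TupleListEncodingSketch {
    // A CQL int is serialized as 4 big-endian bytes.
    static ByteBuffer encodeInt(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.putInt(v);
        bb.flip();
        return bb;
    }

    // ASSUMPTION: a tuple is each component prefixed by its 4-byte length.
    static ByteBuffer encodeTuple(ByteBuffer... components) {
        int size = 0;
        for (ByteBuffer c : components) size += 4 + c.remaining();
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components) {
            out.putInt(c.remaining());
            out.put(c.duplicate()); // duplicate so the caller's buffer is not consumed
        }
        out.flip();
        return out;
    }

    // ASSUMPTION: a list is a 2-byte count, then each element prefixed by its 2-byte length.
    static ByteBuffer encodeList(List<ByteBuffer> elements) {
        int size = 2;
        for (ByteBuffer e : elements) size += 2 + e.remaining();
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putShort((short) elements.size());
        for (ByteBuffer e : elements) {
            out.putShort((short) e.remaining());
            out.put(e.duplicate());
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        // Values for: WHERE k = 0 AND (c1, c2) IN ?  with the tuples (0, 0) and (1, 1)
        List<ByteBuffer> tuples = new ArrayList<>();
        tuples.add(encodeTuple(encodeInt(0), encodeInt(0)));
        tuples.add(encodeTuple(encodeInt(1), encodeInt(1)));
        ByteBuffer value = encodeList(tuples);
        System.out.println("bytes to bind for the single IN marker: " + value.remaining());
    }
}
```

The resulting buffer is what would be bound to the one `?` marker, which is why the statement needs to be prepared only once regardless of how many tuples are passed.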

What I saw was the following traceback on the server:
14/05/26 14:33:09 ERROR messages.ErrorMessage: Unexpected exception during request
java.util.NoSuchElementException
    at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:396)
    at java.util.LinkedHashMap$ValueIterator.next(LinkedHashMap.java:409)
    at org.apache.cassandra.cql3.statements.SelectStatement.buildMultiColumnInBound(SelectStatement.java:941)
    at org.apache.cassandra.cql3.statements.SelectStatement.buildBound(SelectStatement.java:814)
    at org.apache.cassandra.cql3.statements.SelectStatement.getRequestedBound(SelectStatement.java:977)
    at org.apache.cassandra.cql3.statements.SelectStatement.makeFilter(SelectStatement.java:444)
    at org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:340)
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:210)
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:61)
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
    at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:309)
    at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:132)
    at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)

Stepping through the code, it appears to have analyzed my statement correctly.  
In buildMultiColumnInBound, splitInValues contains 1426 tuples, which is the 
number I intended to pass.  The names parameter identifies two columns, 
createdate and emailcrypt.  The iterator over names is created once, outside 
the loop over the tuples, and advanced once per tuple.  The loop therefore 
executes twice, but on the third iteration there are no more elements in 
names, thus the exception. 

Moving the construction of the iterator within the loop fixed the exception.  
The code still looks suspect, though, as it calculates a bound b based on 
whether the first column is reversed, then uses bound, not b, in the following 
statement.  I've not researched which would be correct, as this appears closely 
related to the fix Sylvain just developed for CASSANDRA-7105.   

{code}
TreeSet<ByteBuffer> inValues = new TreeSet<>(isReversed ? cfDef.cfm.comparator.reverseComparator : cfDef.cfm.comparator);
for (List<ByteBuffer> components : splitInValues)
{
    ColumnNameBuilder nameBuilder = builder.copy();
    for (ByteBuffer component : components)
        nameBuilder.add(component);

    // Iterator construction moved inside the loop; this is the change that avoided the exception.
    Iterator<CFDefinition.Name> iter = names.iterator();
    // b is computed here but never used; the next statement tests bound instead.
    Bound b = isReversed == isReversedType(iter.next()) ? bound : Bound.reverse(bound);
    inValues.add((bound == Bound.END && nameBuilder.remainingCount() > 0) ? nameBuilder.buildAsEndOfRange() : nameBuilder.build());
}
return new ArrayList<>(inValues);
{code}
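
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Cassandra code: the class and method names are hypothetical, and a list of strings stands in for the names collection. It shows that one iterator shared across a loop that runs once per tuple runs out after as many passes as there are names, while an iterator created inside the loop does not.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class IteratorScopeSketch {
    // Failing shape: the iterator is created once and advanced once per tuple.
    static int sharedIterator(List<String> names, int tupleCount) {
        Iterator<String> iter = names.iterator();
        int processed = 0;
        for (int i = 0; i < tupleCount; i++) {
            iter.next(); // throws NoSuchElementException once names is exhausted
            processed++;
        }
        return processed;
    }

    // Workaround shape: a fresh iterator per tuple always yields the first name.
    static int freshIterator(List<String> names, int tupleCount) {
        int processed = 0;
        for (int i = 0; i < tupleCount; i++) {
            Iterator<String> iter = names.iterator();
            iter.next();
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("createdate", "emailcrypt");
        System.out.println("fresh iterator handled " + freshIterator(names, 3) + " tuples");
        try {
            sharedIterator(names, 3);
        } catch (NoSuchElementException e) {
            System.out.println("shared iterator failed on the third tuple");
        }
    }
}
```

With two names, any IN list of three or more tuples triggers the exception in the shared-iterator shape, which matches the third-iteration failure reported above.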


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-23 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007313#comment-14007313
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


[~slebresne] done.  Thanks for catching that.



[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-23 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006950#comment-14006950
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

[~thobbs] Can you bump the CQL version in QueryProcessor and update the CQL doc 
accordingly too?



[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-21 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004951#comment-14004951
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

bq. As I mentioned, I've kept the same behavior for slices over mixed-asc/desc 
comparators; I'll wait for your response to decide what to do about that.

That's definitely a problem and we should fix it. I've created CASSANDRA-7281 
for that.  As far as this patch goes, as long as it preserves correct behavior 
for all-ASC or all-DESC (which I believe it does), we can leave the rest to 
CASSANDRA-7281.

The patch lgtm with 2 minor nits (and you will have to rebase it before commit 
btw):
* In buildBound, I believe we can just as well check restriction[0], saving the 
iterator creation.
* There's an unnecessary import in QueryProcessor (of VisibleForTesting)

That said, I'm still a bit uncomfortable pushing this in 2.0 given where we are 
in the 2.0 cycle. But if I'm the only one feeling this way, I suppose I can 
shut up (and so +1 minus that caveat).




[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-20 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004088#comment-14004088
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


I've backported TupleType from 7248 and pushed my changes to the same branch.  
(I merged in cassandra-2.0, so let me know if you'd like a rebase or anything.)



[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-16 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999757#comment-13999757
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

I've created CASSANDRA-7248 for the more general support of tuple types 
discussed above. I'll note that imo the correct course of action would be to 
postpone this to 2.1 and base it on top of CASSANDRA-7248. This would make 
it much easier on the client drivers anyway since they'll have support for the 
TupleType in the v3 protocol, while if we commit this to 2.0 they won't and 
will have to special-case it. Also, the changes here are far from trivial, 
so it would also make sense in terms of "not introducing risky new features" in 
a minor release.

Though if we absolutely insist on getting this in 2.0, we can just extract the 
TupleType AbstractType from the CASSANDRA-7248 patch and use that here.



[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-15 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996581#comment-13996581
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

I haven't looked at that last version yet (but will try to shortly), but 
something occurred to me. The patch currently uses CompositeType for tuple 
markers (and thus tuple values must be serialized following CompositeType). We 
shouldn't do that. CompositeType is a type that is never used by the native 
protocol, so 1) it will appear like a "custom type" to drivers and 2) the 
composite encoding is inconsistent with the rest of the encodings and in 
particular the end-of-component byte that said encoding contains is completely 
useless. This is basically the same problem as CASSANDRA-7209; we shouldn't 
commit to such an inadequate encoding.

Another way to put it is that this patch introduces the concept of tuple values 
in the native protocol, but we don't have such a type. So maybe the right way to 
go would be to introduce such a tuple type first (which would really just be some 
form of anonymous user type) and use that. Of course, adding such a type is even 
more out of scope for 2.0 than this ticket already is. So I think we may want 
to introduce such a tuple type in 2.1, but I suppose we could still commit this to 
2.0 with the right encoding (there is no reason not to reuse the one for 
user types from CASSANDRA-7209) and just some simple TupleType (that won't have 
more support in 2.0 than what this issue does; it will be exposed to drivers as 
a custom type for 2.0 but I don't think there is anything we can do about that 
one).
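
To make the encoding difference concrete, here is a small sketch that lays out the same pair of int values both ways. It is an illustration only. The composite layout used here is a 2-byte length, the value, and a 1-byte end-of-component marker per component. The tuple layout (a 4-byte length, then the value) is an assumption about the user-type-style encoding referred to above. Class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

public class CompositeVsTupleSketch {
    static byte[] intBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Composite-style layout: [2-byte length][value][1-byte end-of-component] per component.
    static byte[] encodeComposite(byte[]... components) {
        int size = 0;
        for (byte[] c : components) size += 2 + c.length + 1;
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] c : components) {
            out.putShort((short) c.length);
            out.put(c);
            out.put((byte) 0); // end-of-component marker: carries no information for a plain value
        }
        return out.array();
    }

    // ASSUMED tuple layout: [4-byte length][value] per component, no trailing marker.
    static byte[] encodeTuple(byte[]... components) {
        int size = 0;
        for (byte[] c : components) size += 4 + c.length;
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] c : components) {
            out.putInt(c.length);
            out.put(c);
        }
        return out.array();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] c1 = intBytes(0);
        byte[] c2 = intBytes(1);
        System.out.println("composite: " + hex(encodeComposite(c1, c2)));
        System.out.println("tuple:     " + hex(encodeTuple(c1, c2)));
    }
}
```

A driver that already decodes 4-byte length-prefixed values can read the second layout with its existing code, whereas the first needs a dedicated decoder that knows about the 2-byte lengths and the trailing marker.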




[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-15 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993078#comment-13993078
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


Okay, most of your concerns should be addressed on the [same 
branch|https://github.com/thobbs/cassandra/compare/3808c2b35993fed50554f13a73b2444afb598715...CASSANDRA-6875-2.0].

As I mentioned, I've kept the same behavior for slices over mixed-asc/desc 
comparators; I'll wait for your response to decide what to do about that.

Tuples.Value no longer returns serialized values, so {{buildBounds}} takes care 
of that work.  It seemed cleanest to just split out different functions for 
building the different types of multi-column bounds instead of fitting it all 
into one loop.

I did try moving MultiColumnRestrictions into a separate attribute in 
SelectStatement, but that didn't seem to make things much clearer.  The column 
names that the restriction applies to would need to be separately tracked.  
When processing restrictions, you would have to refer back to that list and 
check it or update it.  I think the current behavior is reasonably clear if you 
think of it like this: SelectStatement.columnRestrictions tracks restrictions 
on each column; in the case of multi-column restrictions, some columns may 
share a single restriction object.
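
The "one slot per column, shared object for multi-column restrictions" idea can be pictured with a small sketch. Everything here is hypothetical and for illustration only: the Restriction class below is a stand-in, not Cassandra's, and restrictionsFor is not a method of SelectStatement.

```java
import java.util.Arrays;
import java.util.List;

public class SharedRestrictionSketch {
    // Hypothetical stand-in for a restriction object.
    static final class Restriction {
        final String text;
        Restriction(String text) { this.text = text; }
    }

    // One slot per clustering column; every column covered by the tuple
    // restriction points at the same object rather than a copy.
    static Restriction[] restrictionsFor(List<String> columns,
                                         List<String> tupleColumns,
                                         Restriction tupleRestriction) {
        Restriction[] slots = new Restriction[columns.size()];
        for (int i = 0; i < columns.size(); i++) {
            if (tupleColumns.contains(columns.get(i)))
                slots[i] = tupleRestriction;
        }
        return slots;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("c1", "c2", "c3");
        Restriction in = new Restriction("(c1, c2) IN ((0, 0), (1, 1))");
        Restriction[] slots = restrictionsFor(columns, Arrays.asList("c1", "c2"), in);
        System.out.println("c1 and c2 share one object: " + (slots[0] == slots[1]));
        System.out.println("c3 is unrestricted: " + (slots[2] == null));
    }
}
```

Code that walks the slots has to recognize when consecutive entries are the same object, so that a shared restriction is processed once rather than once per column.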

Once ASF mail is working again, I'll start that thread about tests on the dev 
ML.



[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-02 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987526#comment-13987526
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

I don't disagree, the split I'm suggesting is not absolutely future proof, and 
we'll need to find something better in the long run. But that doesn't change the 
fact that storing a multi-column restriction in an array when we only use one is 
misleading and error prone as of this patch imo. I mainly meant the splitting 
as a way to support what we do support today more cleanly, but I do absolutely 
think SelectStatement will require serious refactoring sooner rather than later. 
There's CASSANDRA-4762 too that we will want to get at some point (which, it's 
worth mentioning, is made slightly less urgent by this ticket because it covers 
some of the same cases, though it's not fully equivalent) that will complicate 
matters here.

So anyway, I'm all for a third option that is both clean and future proof, but 
I'm coming up short here right now. Or rather, I see ways to get there, but they 
require more refactoring than is reasonable for a 2.0 target. So I went with 
the suggestion that felt cleaner, if not too future proof :). But if you prefer 
not to do the class splitting, I can live with that as long as we maybe split 
buildBound and add some comments.



[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986889#comment-13986889
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


After thinking more closely about splitting up SelectStatement into two 
subclasses for single-column and multi-column restrictions, I'm not 100% 
convinced that's the best path.

For example, suppose you have a query like {{SELECT * FROM foo WHERE key=0 AND 
c1 > 0 AND (c1, c2) < (2, 3)}}.  We could a) require it to be written like 
{{(c1) > (0) AND (c1, c2) < (2, 3)}}, or b) accept that syntax and correctly 
reduce the expressions to a single multi-column slice restriction.  I'm not 
sure that option (b) would be clearer than keeping the restrictions separate.

I can also imagine us supporting something like {{SELECT ... WHERE key=0 AND 
c1=0 AND (c2, c3) > (1, 2)}} in the future.  Of course, we could also require 
this to be written differently ({{(c1, c2, c3) > (0, 1, 2) AND (c1) <= (0)}}) or 
reduce it to a single multi-column slice restriction.  I'm just pointing out 
that this may become less clear than simply improving the bounds-building code 
(which I agree is needed).
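
For readers less familiar with tuple comparisons, the first query above, {{c1 > 0 AND (c1, c2) < (2, 3)}}, can be evaluated by hand with a lexicographic comparison. The sketch below is an illustration only, assuming all-ascending int clustering columns; the class and method names are hypothetical and none of this is Cassandra code.

```java
import java.util.ArrayList;
import java.util.List;

public class TupleSliceSketch {
    // Lexicographic comparison of two tuples of equal length:
    // the first differing component decides the result.
    static int compareTuples(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = Integer.compare(a[i], b[i]);
            if (c != 0) return c;
        }
        return 0;
    }

    // Evaluates: c1 > 0 AND (c1, c2) < (2, 3)
    static boolean matches(int c1, int c2) {
        return c1 > 0 && compareTuples(new int[] {c1, c2}, new int[] {2, 3}) < 0;
    }

    public static void main(String[] args) {
        List<String> selected = new ArrayList<>();
        for (int c1 = 0; c1 <= 2; c1++)
            for (int c2 = 0; c2 <= 3; c2++)
                if (matches(c1, c2))
                    selected.add("(" + c1 + ", " + c2 + ")");
        System.out.println(selected);
    }
}
```

Note that the row (1, 3) is selected even though its c2 is not below 3: the tuple restriction is not a column-by-column test. This is why reducing the two restrictions to one slice needs care.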





[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986810#comment-13986810
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

bq. Do you mind if I start a dev ML thread?

Absolutely not.



[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986770#comment-13986770
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


bq. We can, see cql_prepared_test.py (arguably our number of tests for prepared 
statements is deeply lacking, but it's possible to have some).

Ah, thanks, good to know.

bq. I'll fight to the death the concept that "unit tests are a lot faster to 
work with" as an absolute truth.

Don't worry, I'm not going to challenge you to a duel :).  It's not an absolute 
truth, but it's easy to do things like run a unit test with the debugger on, 
which makes a big difference in some cases.

bq. fixing those issues is likely simpler than migrating all the existing tests 
back to the unit tests.

I'm definitely not suggesting moving any existing dtests to unit tests.  I'm 
just proposing that we allow some mix of unit tests and dtests for newly 
written tests.

bq.  we could have a debate I suppose (but definitely not here)

Do you mind if I start a dev ML thread?  It would be good to get input from 
other devs and QA.



> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.8
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986597#comment-13986597
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

Ahah, it was uncommitted. Well, I've just committed it, FYI. It's just one 
uninteresting test though; not sure why I never committed it.

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.8
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986584#comment-13986584
 ] 

Ryan McGuire commented on CASSANDRA-6875:
-

bq. We can, see cql_prepared_test.py (arguably our number of tests for prepared 
statements is deeply lacking, but it's possible to have some).

[~slebresne] Is that test possibly uncommitted? I actually don't have that test 
in my repo.

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.8
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986466#comment-13986466
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

bq. we can't use prepared statements with the dtests right now (as far as I can 
tell)

We can, see cql_prepared_test.py (arguably our number of tests for prepared 
statements is deeply lacking, but it's possible to have some).

On the more general question of where tests should go, we could have a debate I 
suppose (but definitely not here), but frankly, I don't think there are very 
many downsides to having the tests in the dtests, and since we have tons of them 
there already, I'd much rather not waste precious time changing for the sake 
of change. But quickly, here are the reasons why I think dtests are really not 
that bad here:
# it doesn't get a whole lot more end-to-end than the CQL tests imo.
# dtests feel to me a lot more readable and easier to work with. Mainly 
because for that kind of test Python is just more comfortable/quick to get 
things done.
# there *are* a few of the CQL dtests where we do want a distributed setup, like 
CAS tests. Of course we could leave those in dtests and move the rest to the 
unit test suite, but keeping all CQL tests in the same place just feels simpler 
(you don't duplicate all those small utility functions that you invariably need 
to make tests easier).
# I work with unit tests and dtests daily, and it honestly is not at all my 
experience that working with unit tests is a "lot faster". Quite the contrary 
in fact. I'm willing to admit that one may be more comfortable with one suite 
or the other, but I'll fight to the death the concept that "unit tests are a 
*lot* faster to work with" as an absolute truth.

I'll note the CQL dtests are not perfect. They could use some reorganizing, 
and we can speed them up dramatically by not spinning up a cluster on 
every test. That said, fixing those issues is likely simpler than migrating all 
the existing tests back to the unit tests.
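The speed-up from not spinning up a cluster on every test can be sketched with Python's standard unittest class-level fixtures. This is a minimal illustration only: FakeCluster is a hypothetical stand-in that merely counts how often it is started, and nothing here reflects the actual dtest or ccm API.

```python
import unittest


class FakeCluster:
    """Hypothetical stand-in for a test cluster; it only counts its starts."""
    starts = 0

    def start(self):
        FakeCluster.starts += 1
        return self

    def stop(self):
        pass


class SharedClusterTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Started once for the whole class instead of once per test method.
        cls.cluster = FakeCluster().start()

    @classmethod
    def tearDownClass(cls):
        cls.cluster.stop()

    def test_first(self):
        self.assertIsNotNone(self.cluster)

    def test_second(self):
        self.assertIsNotNone(self.cluster)


def run_suite():
    """Run both tests and return (tests_run, cluster_starts)."""
    FakeCluster.starts = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SharedClusterTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.testsRun, FakeCluster.starts
```

The trade-off in a real suite would be that tests sharing a cluster must clean up after themselves (for example by using a fresh keyspace per test), which this sketch does not model.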

All this said, to focus back on this issue, I'd rather keep CQL tests in the 
dtests for now (even if we do start a debate about changing that on the dev 
list), but I won't block the issue if that's not the case. That remark was in 
the category "minor comments".

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.8
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-04-30 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986178#comment-13986178
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


bq. It bothers me to start adding unit tests for CQL queries when all of our 
CQL tests are currently in dtest. I'd much rather keep it all in the dtests to 
avoid confusion about what is tested where.

I went with unit tests for a couple of reasons.  The main one is that we can't 
use prepared statements with the dtests right now (as far as I can tell).  The 
other is that it's a lot faster to work with unit tests than dtests here, and a 
distributed setup isn't really required for any of these tests.  We should 
definitely have a discussion about where and how to test in general (maybe in a 
dev ML thread), but my feeling is that dtests are good for:
* Testing end-to-end
* Tests that require more than one node

Otherwise, if it's reasonable to do in a unit test, I don't see why we wouldn't 
do it there.  FWIW, if we could use prepared statements in the dtests, I would 
also add some of the prepared statement test cases from the unit tests there.

All of your other suggestions sound good, thanks!

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.8
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-04-30 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985587#comment-13985587
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

The overall principle looks good, but I feel this could use some more comments 
and/or be made a little clearer. Mainly, the {{isMultiColumn}} path in 
{{SelectStatement.buildBound}} looks weird at face value: we're inside a loop 
that populates a {{ColumnNameBuilder}}, but the {{isMultiColumn}} path 
completely ignores both the iterated object and the builder. This works because 
it relies on the fact that when there is a multi-column restriction, it's 
the only restriction (which is duplicated in the 
{{SelectStatement.columnRestrictions}} array), and that the value from such a 
restriction is not a single column value but rather a fully built, serialized 
composite value (which is not self-evident, from the method naming in 
particular).  But it's hard to piece all that together when you look at 
{{buildBound}} currently.  Some comments would help, but I'd prefer going even 
further and moving the {{isMultiColumn}} path outside of the loop (by first 
checking whether the first restriction in {{columnRestrictions}} is a 
multi-column one) since it has no reason to be in the loop. In fact, I'd go a 
tad further by making SelectStatement abstract and having 2 subclasses, one for 
single-column restrictions with a {{SingleColumnRelation[] columnRestrictions}} 
array field as we have now, and one for multi-column that has just one non-array 
{{MultiColumnRestriction columnsRestriction}} field. After all, both cases 
exclude one another in the current implementation.

Somewhat related, I'm slightly afraid that the parts about multi-column 
restrictions returning fully serialized composites (through Tuples.Value.get()) 
will not play nice with the 2.1 code, where we don't manipulate composites as 
opaque ByteBuffers anymore (concretely, Tuples will serialize the composite, 
but SelectStatement will have to deserialize it back right away to get a 
Composite, which will be both ugly and inefficient).  So to avoid having to 
change everything on merge, I think it would be cleaner to make Tuples.Value 
return a list of (individual column) values instead of just one, and let 
SelectStatement build back the full composite name using a ColumnNameBuilder.  
Especially if you make the per-column and multi-column paths a tad more 
separated as suggested above, I suspect it might clarify things a bit.  

Other than that, a bunch of more minor comments and nits:
* The {{SelectStatement.RawStatement.prepare()}} re-org patch breaks proper 
indentation in places (for instance, the indentation of parameters to 'new 
ColumnSpecification' in the first branch of updateSingleColumnRestriction, 
though there are a few other places). Would be nice to fix those.
* Can't we use {{QueryProcessor.instance.getPrepared}} instead of creating an 
only-for-test {{QueryProcessor.staticGetPrepared}}? Or at the very least leave 
such shortcuts in the tests where they belong.
* In Tuples.Literal.prepare, I'd prefer a good old-fashioned indexed loop to 
iterate over 2 lists (feels clearer, and saves the allocator creation as a 
bonus).
* Tuples.Raw.makeInReceiver should probably be called makeReceiver (it's not 
related to "IN"). I'd also drop the spaces in the string generated (if only for 
consistency with the string generated in INRaw). As a side note, 
Raw.makeReceiver uses indexed iteration while INRaw.makeInReceiver doesn't; 
can't we make both consistent style-wise for OCD's sake?
* Why make methods of CQLStatement abstract (it's an interface)?  Also, I'd 
rather add the QueryOptions parameter to the existing executeInternal and 
default to QueryOptions.DEFAULT when calling it, rather than having 2 methods. 
Though tbh, my preference would be to move the tests to dtest and leave those 
somewhat unrelated changes to another ticket, see below.
* SingleColumnRelation.previousInTuple is now unused but not removed.
* We could save one list allocation (instead of both toCreate and toUpdate) in 
SelectStatement.updateRestrictionsForRelation (for EQ and IN, we know it can 
only be a create, and for slices we can look up with getExisting).
* In Restriction, the {{values}} method is already at top-level, no reason to 
re-declare it for EQ.
* It bothers me to start adding unit tests for CQL queries when all of our CQL 
tests are currently in dtest. I'd *much* rather keep it all in the dtests to 
avoid confusion about what is tested where.
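The restructuring suggested in the first two paragraphs of this review (an abstract statement with one subclass per restriction kind, and a multi-column value exposed as a list of individual column values rather than a pre-serialized composite) can be sketched as follows. This is a rough illustration in Python rather than Cassandra's Java, it is not the actual patch, and every class and method name in it is hypothetical.

```python
from abc import ABC, abstractmethod


class SelectStatement(ABC):
    """Hypothetical base class; only the clustering-bound logic is modelled."""

    @abstractmethod
    def build_bound(self):
        """Return the clustering bound as a list of individual column values."""


class SingleColumnSelect(SelectStatement):
    def __init__(self, column_restrictions):
        # One restriction value per clustering column, in declaration order;
        # None marks a column without a restriction.
        self.column_restrictions = column_restrictions

    def build_bound(self):
        bound = []
        for value in self.column_restrictions:
            if value is None:  # stop at the first unrestricted column
                break
            bound.append(value)
        return bound


class MultiColumnSelect(SelectStatement):
    def __init__(self, columns_restriction):
        # A single tuple restriction covering several clustering columns.
        self.columns_restriction = columns_restriction

    def build_bound(self):
        # No loop over per-column restrictions: the tuple already holds the
        # individual values, and the caller assembles the composite from them.
        return list(self.columns_restriction)
```

With this split, neither path needs to know about the other, and both hand back the same shape of result (a list of column values) for the caller to turn into a composite name.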


> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter:

[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-04-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984098#comment-13984098
 ] 

Michaël Figuière commented on CASSANDRA-6875:
-

All right. Makes sense.

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.8
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-04-28 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983930#comment-13983930
 ] 

Brandon Williams commented on CASSANDRA-6875:
-

Moving this back to 'bug', [~mfiguiere], because anything that thrift can do 
that CQL can't is a bug.

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.8
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-04-17 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973391#comment-13973391
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


Regarding prepared statements, I assume we want to support all of the following:
* {{... WHERE (k, c1) IN ?}}
* {{... WHERE (k, c1) IN (?, ?, ...)}}
* {{... WHERE (k, c1) IN ((?, ?), (?, ?), ...)}}
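One way to read the three marker forms above is that they differ only in how the bound values are grouped. The sketch below, in Python, normalizes each form to the same list of tuples. The binding shapes are an assumption made for illustration (a list of tuples for a single marker, one tuple per marker, or one column value per marker), and the function names are hypothetical.

```python
def tuples_from_single_marker(bound):
    """Form `... IN ?`: one marker bound to a whole list of tuples."""
    return [tuple(t) for t in bound]


def tuples_from_tuple_markers(*bound):
    """Form `... IN (?, ?, ...)`: one marker per tuple."""
    return [tuple(t) for t in bound]


def tuples_from_value_markers(width, *bound):
    """Form `... IN ((?, ?), (?, ?), ...)`: one marker per column value."""
    if len(bound) % width != 0:
        raise ValueError("number of bound values must be a multiple of the tuple width")
    return [tuple(bound[i:i + width]) for i in range(0, len(bound), width)]


# All three calls describe the same two tuples.
print(tuples_from_single_marker([(0, 0), (1, 1)]))
print(tuples_from_tuple_markers((0, 0), (1, 1)))
print(tuples_from_value_markers(2, 0, 0, 1, 1))
```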

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.8
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-03-31 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955776#comment-13955776
 ] 

Brandon Williams commented on CASSANDRA-6875:
-

Since CASSANDRA-4851 was in 2.0, let's do this one in 2.0 as well.

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.7
>
>


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-03-17 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938104#comment-13938104
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

That's a relatively natural extension of CASSANDRA-4851 and we can definitely 
support that.

> CQL3: select multiple CQL rows in a single partition using IN
> -
>
> Key: CASSANDRA-6875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
> Project: Cassandra
>  Issue Type: Bug
>  Components: API
>Reporter: Nicolas Favre-Felix
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.1
>
>