[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-05-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266726#comment-13266726
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

+1

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt, 4004_alternative-v2.txt, 4004_alternative.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-05-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266569#comment-13266569
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

My concern is that that the order of the terms in the {{ORDER BY}} clause is 
relevant -- in the dtest, c1 is always given first (which is valid).  But I 
don't think the code would reject {{c2 DESC, c1 ASC}}, although it should.

So what I'm looking for is either a reference to the ordering from 
{{columnAliases}}, or an ordered map for {{definedOrderings}}, which I'm not 
seeing.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt, 4004_alternative.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-05-02 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266497#comment-13266497
 ] 

Sylvain Lebresne commented on CASSANDRA-4004:
-

Which hashmap are you talking about? If it's {{stmt.parameters.orderings}}, the 
actual order of elements in the map is not used by the code so that shouldn't 
matter. I'll note there is a test for this patch at 
https://github.com/riptano/cassandra-dtest/blob/master/cql_tests.py (the line 
{{require('#4004')}} needs to be commented to actually run the test) showing 
that ORDER BY X DESC, Y ASC is the reverse of ORDER BY Y ASC, X DESC.

As a side note, I've convinced myself that 4004_alternative.txt is maybe a bit 
more natural than creating a new, reversed type so I'm good with that.



> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt, 4004_alternative.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265416#comment-13265416
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

I may be misreading this, but it looks like it treats {{ORDER BY X DESC, Y 
ASC}} the same as {{ORDER BY Y ASC, X DESC}} since it iterates in HashMap's 
entrySet order.  A simple fix might be to use a LinkedHashMap.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt, 4004_alternative.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-19 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257663#comment-13257663
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

bq. the model defines an ordering of the rows (where rows is in the sense of 
SQL) in tables, order that is defined as the ordering implied by the types of 
the "clustered" keys (and to be clear, I don't care what clustering mean in 
SQL, I'm reusing the name because you're using it, but I only mean by that term 
the fields in the PK after the first one). That doesn't imply the disk order 
has to respect it

I think the mental model of rows as predicates, queries returning sets of rows 
with no inherent order, and ORDER BY as specifying the desired order, is much 
simpler and easier to reason about (see prior point about having to consult DDL 
+ QUERY to figure out what order results are supposed to appear in).

bq. To my defence, you're attributing your semantic to my made up syntax 

I was trying to say that I view ReversedType(Int32Type) as modification of 
Int32Type (which should not affect int ordering) and not a completely new type, 
the way the (hypothetical) ReversedInt (or BackwardsInt, or AlmostNotQuiteInt) 
type would be.  Since the latter isn't really related to an int at all, even 
though they look a lot like ints in many respects.

bq. I do think that in most case it's more natural to define a reversed type 
rather than just adding an optim for reversed queries. 

I don't follow.

bq. I do think that have a form of syntactic double negation that is not 
equivalent to removing both is kind of weird... I do think that it's not 
necessarily clear per se (i.e to anyone that may not be familiar with SQL 
clustering for instance) that "WITH CLUSTERING ORDER (x DESC)" does not change 
the ordering

But saying "{{ORDER BY X DESC}} always gives you higher X first" is the only 
way to avoid the double negation!  Otherwise in your original syntax of PK (X, 
Y DESC), the only way to get 1 to sort before 100 is to ask for ORDER BY Y DESC 
so the DESC cancel out!

I just can't agree that "ORDER BY Y DESC" giving {1, 100} is going to be less 
confusing than {100, 1}, no matter how much we tell users, "No, you see, it's 
really just reversing the clustering order, which you already reversed..."

Users may not be familiar with clustering, but they're *very* familiar with 
ORDER BY, which as I said above, is very clear on what it does.  Clustering is 
the closest example of how performance hints should *not* change the semantics 
of the query, but indexes fall into the same category.

It may also be worth pointing out that it's worth preserving CQL compatibility 
with Hive; queries that execute on both (and to the best of my knowledge CQL3 
is a strict subset of Hive SQL) should not give different results.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-19 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257619#comment-13257619
 ] 

Sylvain Lebresne commented on CASSANDRA-4004:
-

bq. Nothing, except that it's a separate ticket's worth of work.

Oh ok. For the records I didn't implied otherwise.

bq. But that's not the same as reverse-clustering on a type: "y int ... PRIMARY 
KEY (x, y DESC)" (to use your syntax) is NOT the same as "y ReversedInt ... 
PRIMARY KEY (x, y)." In the former, ORDER BY Y DESC

What?! I said that I wasn't sure my syntax was good. But with all I've said I 
expected it was clear that what I want to do with this ticket from day one is 
to allow to define "y ReversedInt ... PRIMARY KEY" but without having to write 
a custom java class since we don't have to and that is *exactly* what my patch 
implements. I'm fine saying my syntax suck and allow to write is "y 
reversed(int) .. PK". But to be clear, I don't think that option is a bad fit 
at all for CQL3, and that's not the C* veteran talk.

bq. In the former, ORDER BY Y DESC should give larger Y before smaller (that 
is, 100 before 1); in the latter, the reverse

To my defence, you're attributing *your* semantic to *my* made up syntax (which 
again, may be is counter-intuitive to you with your background but is really 
not to me, and I made it clear that it was a suggestion. I even said in the 
description that "Alternatively, the DESC could also be put after the column 
name definition").

bq. I really don't see the distinction between saying "disk order" and 
"clustering order," as in "the clustered part of th PK induces an ordering of 
records ...

Maybe with the reversed(int) syntax it makes it more clear, but when I talk 
about ordering of records, I'm saying that we should say that in CQL the model 
defines an ordering of the rows (where rows is in the sense of SQL) in tables, 
order that is defined as the ordering implied by the types of the "clustered" 
keys (and to be clear, I don't care what clustering mean in SQL, I'm reusing 
the name because you're using it, but I *only* mean by that term the fields in 
the PK after the first one). That doesn't imply the disk order has to respect 
it (though it will but that's an implementation detail). In other words, and 
somewhat unrelated to this issue, I think there would be value to say that the 
order of SELECT without any ORDER BY is something defined by CQL (while SQL 
does not do that). I think there would be value because I think it helps 
understanding which model are a good fit for CQL.

Now, and to sum up, I think that having the "y reversed(int)" syntax has the 
following advantages over just allowing to change the on-disk order:
# I do think that in most case it's more natural to define a reversed type 
rather than just adding an optim for reversed queries. Typically, it means that 
'y reversed("myCustomType")' is the same than 'y "myReversedCustomType"' which 
has a nice consistency to it. In the alternative, and even though I'm *not* 
saying it's ill defined in any way, I do think that have a form of syntactic 
double negation that is not equivalent to removing both is kind of weird.
# Though that seems to be very clear to you, I do think that it's not 
necessarily clear per se (i.e to anyone that may not be familiar with SQL 
clustering for instance) that "WITH CLUSTERING ORDER (x DESC)" does not change 
the ordering (and by that I mean 'does not semantically mean "x 
reversed(type)"').
# With that solution, we can maintain (without doing anything) the fact that a 
select without ORDERING respect the ordering implied by the "clustering". I 
think it's convenient for C*. Again, lots of efficient model for C* uses that 
ordering, so it feels like a better idea to say 'oh, and contrarily to SQL the 
order of records in a table is defined (and thus the default ordering or 
SELECT) and lots of good modeling pattern for C* rely on this'.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is

[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-19 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257580#comment-13257580
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

bq. what happened to "Third (and this is the big one) I strongly suspect that 
we're going to start supporting at least limited run-time ordering in the near 
future" from CASSANDRA-3925

Nothing, except that it's a separate ticket's worth of work.

bq. I never suggested that [ORDER BY depends on disk order], not even a little. 
Not more than you did.

I really don't see the distinction between saying "disk order" and "clustering 
order," as in "the clustered part of th PK induces an ordering of records ... 
SELECT without ORDER BY should return records in that clustering order ... 
SELECT ORDER BY ASC returns 'z' before 'a'." 

But disk order or clustering order, I don't care which you call it; I reject 
both as modifiers of the semantics of DESC.

bq. the fact that value X is larger than Y depends on the ordering induces by 
your custom types

Agreed.  But that's not the same as reverse-clustering on a type: "y int ... 
PRIMARY KEY (x, y DESC)" (to use your syntax) is NOT the same as "y ReversedInt 
... PRIMARY KEY (x, y)."  In the former, ORDER BY Y DESC should give larger Y 
before smaller; in the latter, the reverse.


> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-19 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257568#comment-13257568
 ] 

Sylvain Lebresne commented on CASSANDRA-4004:
-

bq. This is what I'm referring to:

Wait, what happened to "Third (and this is the big one) I strongly suspect that 
we're going to start supporting at least limited run-time ordering in the near 
future" from CASSANDRA-3925. How can I reconcile that with "The other relevant 
context here is that we decided not to support arbitrary orderings"?

bq. "ORDER BY X DESC" does not mean "give me them in the reverse order that Xes 
are in on disk"

I *never* suggested that, not even a little. Not more than you did.

bq. it means "give me larger values before smaller ones." This isn't open for 
debate, it's a very clear requirement

Sure. But the definition of larger versus smaller depends on what ordering you 
are talking about. This isn't open for debate either. Math have closed that 
debate for ages. And SQL is not excluded from that rule, but it just happens 
that SQL has default orderings (based on the column type) and you can't define 
new ones. But we can do that in CQL. We can independently of this ticket 
because of custom types.

Again, once you consider custom types (which we have), you can't hide behind 
that the fact that value X is larger than Y depends on the ordering induces by 
your custom types. That's the ASC order, and DESC is the reverse of that. If 
someone define it's own custom types being "reverseIntegerType", how can you 
avoid SELECT ORDER BY DESC to not return 1 before 3? You can't, and returning 1 
before 3 absolutely make sense because 1 is larger than 3 if the order is 
'reverseInteger'.

bq. SQL and CQL are declarative languages: "Here is what I want; you figure out 
how to give me the results." This has proved a good design.

Sure, *nothing* in what I'm suggesting is at odd with that.

bq. Modifying the semantics of a query based on index or clustering

Again, I'm not suggesting any such thing at all. The semantic of a SELECT X 
ORDER BY Y depends on what ordering relation is defined for Y *because the 
ordering relation is what defines the order*. SQL has a limited and non 
customizable set of types *and* (implicitly) define an ordering relation for 
each one. If one type was 'thing' it would have to define the ordering of 
'thing' otherwise ORDER BY queries wouldn't be properly defined. CQL also has a 
default set of types which have associated ordering relation. I'm *only* 
suggesting we add a simple syntax so that given a type/relation (a default one 
or a custom one btw), we can define the type/ordering relation that validate 
the same value but have the reversed ordering.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-19 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257539#comment-13257539
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

bq. I'm either misunderstanding what you call 'arbitrary orderings' or I have 
not been part of that discussion

I think you are misunderstanding.  This is what I'm referring to:

{code}
.   if (stmt.parameters.orderBy != null)
{
CFDefinition.Name name = cfDef.get(stmt.parameters.orderBy);
if (name == null)
throw new InvalidRequestException(String.format("Order by 
on unknown column %s", stmt.parameters.orderBy));

if (name.kind != CFDefinition.Name.Kind.COLUMN_ALIAS || 
name.position != 0)
throw new InvalidRequestException(String.format("Order by 
is currently only supported on the second column of the PRIMARY KEY (if any), 
got %s", stmt.parameters.orderBy));
}
{code}

bq. How is that sophistry, seriously? 

"ORDER BY X DESC" does not mean "give me them in the reverse order that Xes are 
in on disk", it means "give me larger values before smaller ones."  This isn't 
open for debate, it's a very clear requirement.

Remember that clustering is not new ground for databases; SQL has been there, 
done that.  As I mentioned when we were designing the CQL3 schema syntax, 
RDBMSes have had a concept of clustered indexes for a long, long time.  But 
clustering on an index ASC or DESC does not affect the results other than as an 
optimization; when you ORDER BY X, that's what you get.

SQL and CQL are declarative languages: "Here is what I want; you figure out how 
to give me the results."  This has proved a good design.  Modifying the 
semantics of a query based on index or clustering or other declarations 
elsewhere has ZERO precedent and is bad design to boot; you don't want users to 
have to consult their DDL when debugging, to know what results a query will 
give.

Thus, the only design that makes sense in the larger context of a declarative 
language is to treat the clustering as an optimization as I've described (or 
"as an index", if you prefer), and continue to reject ORDER BY requests that 
are neither forward- nor reverse-clustered.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-19 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257521#comment-13257521
 ] 

Sylvain Lebresne commented on CASSANDRA-4004:
-

bq. From the standpoint of a new user, this is sophistry.

How is that sophistry, seriously? I think it's very important that the 
clustered part of th PK induces an ordering of records (which SQL don't have, 
but we're not talking about SQL here, right). It's important  because you don't 
model things the same way in Cassandra than traditionally you do in SQL: you 
denormalize, you model time series etc... for which the notion that there is an 
ordering or records is not an implementation detail, nor something dealt with 
at query time (contrarily to SQL), but is an important part of the model. It 
would be confusing for brand new user to *not* say that ordering is part of the 
data model (and again yes, that's a difference with SQL). I also don't see how 
saying that is in any way related to being a veteran and whatnot. I 200% agree 
that CQL3 is an abstraction and that it is A Good Thing. I'm saying the 
ordering induced by the PK should be part of that abstraction.

But then it's natural that SELECT without ORDER BY should return records in 
that clustering order, which will indeed not be the same than the order of with 
ORDER BY ASC *unless* Y is the first clustered key. But if is the first 
clustered key, then yes, SELECT and SELECT ORDER BY ASC should be the same (and 
they are). But then it's not rocket science to say that if the ordering is 
'reversed alphabetical order', then 'z' > 'a' and thus a SELECT ORDER BY ASC 
returns 'z' before 'a'.

So I absolutely and strongly refute that this proposal is somehow sophistry and 
even more so that it's a negation of the abstractive nature of CQL3 or 
influenced by the thrift API any more that the solution you're pushing for.

bq. The other relevant context here is that we decided not to support arbitrary 
orderings

I'm either misunderstanding what you call 'arbitrary orderings' or I have not 
been part of that discussion. Because if you talk of custom types, then CQL3 
does support them (you can declare CREATE TABLE foo (k "myCustomType" PRIMARY 
KEY)). And I'm -1 on removing that support, unless someone has compelling 
reason to do so because I certainly don't see any and that's useful. And yes, I 
do see this as a good reason to go with my proposal, since it's not very 
consistent if
{noformat}
CREATE TABLE foo (
k1 uuid,
k2 "myEfficientComplexNumberType",
c text,
PRIMARY KEY (k1, k2)
) WITH CLUSTERING ORDER BY (k2 DESC)
{noformat}
is *not* the same than
{noformat}
CREATE TABLE foo (
k1 uuid,
k2 "myReversedEfficientComplexNumberType",
c text,
PRIMARY KEY (k1, k2)
)
{noformat}

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-18 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256595#comment-13256595
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

bq. it doesn't change the semantic of the query, it changes the logical 
ordering (which happens to be the same than the physical one but that last part 
is an implementation detail) of records in the table

>From the standpoint of a new user, this is sophistry.  If I ask for X DESC but 
>I get X1 before X10, then you've most certainly changed the semantics of my 
>query out from under me.

With Thrift we've been doing the equivalent of writing assembly code.  CQL lets 
us step up a level of abstraction. This is a Good Thing, but it will require a 
little getting used to for us veterans.

bq. if the DESC above is "just an optimisation for reversed queries"

Remember that {{SELECT X WHERE Y}} is NOT the reverse order from {{SELECT X 
WHERE Y ORDER BY X DESC}}, or put another way, {{ORDER BY}} is never implicit.  
If {{ORDER BY}} is unspecified, any order is valid, so we are free to use the 
most efficient one.

(The other relevant context here is that we decided not to support arbitrary 
orderings.  The two valid orderings a user may request correspond to 
physical/clustered order, and the reverse of that.  Both of which are 
reasonably efficient, but the former is more so.)

So, I think we should continue to enforce that restriction, and allow 
clustering order only to optimize for forwards/backwards (and by implication, 
to affect the default result ordering).  In your example, the permissible 
orderings would be KIND ASC, time DESC (most efficient, and what we should use 
when no explicit ORDER BY is given) and KIND DESC, time ASC.

bq. the question of the syntax is another matter

My syntax preference falls out of the way I view the problem above. :)  I.e., 
that it's there if you want to optimize for DESC but it doesn't change things 
in a really fundamental way. So I'd prefer to have it in the WITH extensions 
section along with similar options like COMPACT STORAGE.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-16 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254684#comment-13254684
 ] 

Sylvain Lebresne commented on CASSANDRA-4004:
-

bq. but I don't think it makes sense for CQL.

Why wouldn't it?  The notion of what is largest or smallest only make sense 
once you've defined what ordering you're talking about (what I'm calling the 
logical ordering). We do still allow in CQL custom orderings (which is useful), 
so why giving a simple syntax to define the reverse ordering or an existing one 
wouldn't make sense? With my first patch, "ORDER BY X DESC" does *always* 
return the largest X first, given the ordering.

bq. but it shouldn't change the semantics of the query itself

To be precise, it doesn't change the semantic of the query, it changes the 
logical ordering (which happens to be the same than the physical one but that 
last part is an implementation detail) of records in the table.


Now, looking more closely at the alternative of keeping the logical ordering 
unchanged but changing the physical ordering (in order to get faster reversed 
queries), I think this just doesn't work. And by "doesn't work", I mean that as 
soon as we have composites, it would be costly to implement (making it 
useless). Typically, suppose you follow that idea and declare:
{noformat}
CREATE TABLE timeseries (
  key text,
  kind int,
  time timestamp,
  value text,
  PRIMARY KEY (key, kind, time)
) WITH CLUSTERING ORDER BY (kind ASC, time DESC)
{noformat}

Now, if the query is:
{noformat}
SELECT kind, time FROM timeseries WHERE key =  LIMIT 200;
{noformat}
then, if the DESC above is "just an optimisation for reversed queries, then the 
expected result is (say):
{noformat}
kind | time
---
   0 |0
   0 |1
   ...
   0 |   99
   0 |  100
   1 |0
   1 |1
   1 |2
   ...
{noformat}
but the physical layout is now in fact:
{noformat}
kind | time
---
   0 |  100
   0 |   99
   ...
   0 |1
   0 |0
   1 |  100
   1 |   99
   1 |   98
   ...
{noformat}
I don't see how to implement that query efficiently (without potentially making 
many queries).

Lastly, while I think that changing the logical ordering is the correct way to 
deal with this, the question of the syntax is another matter. I do happen to 
like the syntax in the description of this ticket, but I don't care too much 
either.


> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-14 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254139#comment-13254139
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

bq. The idea of my patch is that adding DESC to a field in a table declaration 
changes the order of records logically 

That makes sense for the old world of ReversedType and {{reversed}} slice flag, 
but I don't think it makes sense for CQL. When I ask for "ORDER BY X DESC" I 
expect the largest X first, period. So a table declaration like this makes 
sense as an optimization if DESC is your most frequent query type, but it 
shouldn't change the semantics of the query itself.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-14 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254128#comment-13254128
 ] 

Sylvain Lebresne commented on CASSANDRA-4004:
-

The idea of my patch is that adding DESC to a field in a table declaration 
changes the order of records *logically* (it also does it physically, but that 
is an implementation detail). The typical example (which I think is very 
common) is for time series, where you're often interested by having record "by 
default" in reverse time order (from the more recent to the older). In that 
case a SELECT without any ORDER BY would return records directly in reverse 
time order. So at least given that definition, no, I don't think there is 
anything to do for the code of ORDER BY in SELECT.

I reckon there is another way to add ReversedType, which I'm assuming is what 
you are referring to, which consists in *not* changing the logical order, but 
to change the physical order. In which case, yes, we would have to take that 
into account for ORDER BY during SELECT.

I'll admit I'm less of a fan of that second option though. It seems that 
changing the logical order is 1) more natural (to me at least obviously) and 2) 
avoids having to use ORDER BY DESC in all your query.

bq. I think I'd prefer to move this into our "implementation extensions" 
section as WITH CLUSTERING ORDERED BY (time DESC) or something similar

Why not, but I do prefer "my" notation in that it's more concise and, at least 
if it changes the logical order, I guess it doesn't feel like being just an 
'implementation extension'.

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4004) Add support for ReversedType

2012-04-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253587#comment-13253587
 ] 

Jonathan Ellis commented on CASSANDRA-4004:
---

I think I'd prefer to move this into our "implementation extensions" section as 
{{WITH CLUSTERING ORDERED BY (time DESC)}} or something similar.

I also don't see any code to account for, if we've already ordered it as DESC 
in the clustering declaration, that's going to affect what we need to do for 
ORDER BY DESC (or ASC).

> Add support for ReversedType
> 
>
> Key: CASSANDRA-4004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4004
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.1.1
>
> Attachments: 4004.txt
>
>
> It would be nice to add a native syntax for the use of ReversedType. I'm sure 
> there is anything in SQL that we inspired ourselves from, so I would propose 
> something like:
> {noformat}
> CREATE TABLE timeseries (
>   key text,
>   time uuid,
>   value text,
>   PRIMARY KEY (key, time DESC)
> )
> {noformat}
> Alternatively, the DESC could also be put after the column name definition 
> but one argument for putting it in the PK instead is that this only apply to 
> keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira