[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185514#comment-13185514 ]
Sylvain Lebresne commented on CASSANDRA-2474:
---------------------------------------------

bq. Personally I'd rather support both for one release to make the transition easier, but with neither super nor composite support I doubt many people are using the current .., so if doing both adds a lot of complexity I'm okay with this.

The patch changes the implementation of select enough that keeping support for '..' would amount to adding back some special code to handle it. But I guess removing it right away may mean a rather painful upgrade for anyone using CQL in production right now, so maybe it's worth it. Once the patch is ready, I'll see exactly what adding back the '..' to ease the transition entails.

bq. I've made static definitions (i.e, those definitions that don't use COMPACT STORAGE basically) really static.

To try to justify this a little bit more (so it doesn't seem too random a choice), I see mainly two big advantages to doing that:

# Added validation/security for the programmer. If you define:
{noformat}
CREATE TABLE Users (
    ID int PRIMARY KEY,
    NAME text,
    EMAIL text)
{noformat}
I think it's great that the DB warns you that
{noformat}
INSERT INTO users (ID, NAME, EMA1L) VALUES (2, "Jacques", "j...@cques.com")
{noformat}
or
{noformat}
SELECT * FROM users WHERE EMA1L = "j...@cques.com"
{noformat}
are likely mistakes on your side. It's also what someone coming from SQL would expect :P
# It adds some (imo reassuring) regularity to the language, in that in
{noformat}
SELECT xxx, yyy FROM cf WHERE zzz > 3;
{noformat}
we know that xxx, yyy and zzz are *always* names defined in the "schema" (schema meaning here the CREATE TABLE definition). If we allow arbitrary names, that will only be meaningful for static (and sparse) CFs, and we will have to deal with conflicts with other column definitions (typically parts of the PRIMARY KEY). In my example above, it means we would allow arbitrary column names to be inserted, except for the column name ID.
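To make the first point concrete, here is a minimal Python sketch of the kind of check a "really static" definition enables. It is an illustration only, not the code from the patch: the names StaticTable, UnknownColumnError and validate are hypothetical, and treating identifiers as case-insensitive is an assumption made to match the mixed-case example above.

```python
# Hypothetical sketch: none of these names come from Cassandra itself.
class UnknownColumnError(ValueError):
    """Raised when a query references a column missing from the schema."""


class StaticTable:
    """A table whose column set is fixed by its CREATE TABLE definition."""

    def __init__(self, name, columns):
        self.name = name
        # Assumption: identifiers are compared case-insensitively.
        self.columns = {c.lower() for c in columns}

    def validate(self, referenced):
        """Raise if any referenced column is not declared in the schema."""
        unknown = sorted(c for c in referenced if c.lower() not in self.columns)
        if unknown:
            raise UnknownColumnError(
                "undefined column(s) %s in table %s"
                % (", ".join(unknown), self.name)
            )


users = StaticTable("Users", ["ID", "NAME", "EMAIL"])

# Columns named in a well-formed INSERT are accepted.
users.validate(["ID", "NAME", "EMAIL"])

# The EMA1L typo is rejected instead of silently creating a new column.
try:
    users.validate(["ID", "NAME", "EMA1L"])
    typo_rejected = False
except UnknownColumnError as exc:
    typo_rejected = True
    message = str(exc)
```

The same check would apply to names in a SELECT list or a WHERE clause, which is what gives the regularity described in the second point.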
And I don't see any downside, since you can cheaply update the schema or use wide rows if appropriate. Yes, internally our engine would allow inserting non-predefined columns for 'static' CFs, but whether that is useful is the right question. Or, as a great man once said: "schemaless" is a non-feature; "painless schema" is what people care about.

bq. Granted, it doesn't make a great deal of sense to use IN + LIMIT, but if someone does, the LIMIT should take precedence

What I meant is that I'm not sure how to implement it. Suppose you have the following wide row definition (good ol' time series):
{noformat}
CREATE TABLE Events (
    event_type text,
    time date,
    event_details binary,
    PRIMARY KEY (event_type, time)
) USING COMPACT STORAGE
{noformat}
and say that for two event_type values, e1 and e2, you have 1000 events each. Now if you do (with limit as a way to do paging):
{noformat}
SELECT * FROM Events WHERE event_type IN (e1, e2) LIMIT 500;
{noformat}
How does that translate internally? If we do a multiGetSlice with a slice having a limit of 500, we'll read 500 columns from e2 uselessly. And we have a similar problem if we do more simply:
{noformat}
SELECT * FROM Events LIMIT 1000
{noformat}
because we currently have no way to do a range query that stops when we have n columns *across* all rows. In a way it's a simpler problem than in the 'IN' case because we could add internal support for this, but it's additional work and not really in the scope of this ticket. In other words, I'm not sure how to implement LIMIT currently with the new definitions introduced by this patch while keeping its SQL semantics.

bq. What if we allowed "ORDER BY DESC" instead?

I'd be fine with that (though wouldn't "ORDER DESC" sound less weird?).

bq. BTW, why test this with dtest instead of just single node mode?
No reason other than it being simpler for me (the tests only use a single node) and my ignorance of an "official" CQL test suite (but I kind of think the dtest framework would be a good official test framework for anything that is not a unit test).

> CQL support for compound columns and wide rows
> ----------------------------------------------
>
>                 Key: CASSANDRA-2474
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Eric Evans
>            Assignee: Sylvain Lebresne
>            Priority: Critical
>              Labels: cql
>             Fix For: 1.1
>
>         Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG,
> 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG,
> cql_tests.py, raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of
> compound column names (the CQL syntax is colon-delimited terms), and then
> teaching the decoders (drivers) to create structures from the results.