[ https://issues.apache.org/jira/browse/CASSANDRA-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-5226: ---------------------------------------- Attachment: 0003-Add-bytes-conversion-functions.txt 0002-Allow-functions-in-selection.txt 0001-Refactor-to-support-CQL3-functions.txt > CQL3 refactor to allow conversion function > ------------------------------------------ > > Key: CASSANDRA-5226 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5226 > Project: Cassandra > Issue Type: Improvement > Reporter: Sylvain Lebresne > Assignee: Sylvain Lebresne > Fix For: 1.2.2 > > Attachments: 0001-Refactor-to-support-CQL3-functions.txt, > 0002-Allow-functions-in-selection.txt, 0003-Add-bytes-conversion-functions.txt > > > In CASSANDRA-5198, we've fixed CQL3 type validation and talked about adding > conversion functions to ease working with the different data types. However, > the current CQL3 code makes it fairly hard to add such functions in a > non-hacky way. In fact, we already support a few conversion functions (token, > minTimeuuid, maxTimeuuid, now) but the way we support them is extremely ugly > (the token function is competely special cased and min/maxTimeuuid are ugly > hacks in TimeUUIDType that I'm really not proud of). > So I'm attaching a refactor that cleans that up, making it easy to add new > conversion functions. Now, said refactor is a big one. While the goal is to > make it easy to add functions, I think it also improve the code in the > following ways: > * It much more clearly separate the phase of validating the query from > executing it. And in particular, it moves more work in the preparing phase. > Typically, the parsing of constants is now done in the preparing phase, not > the execution one. It also groups validation code much more cleanly imo. > * It simplify UpdateStatement. The Operation business was not very clean and > in particular the same operations where not handled by the same code > depending on whether they were prepared or not, which was error prone. This > is no longer the case. > * It somewhat simplify the parser. A few parsing rules were a bit too > convoluted, trying to enforce invariants that are much more easily checked > post parsing (and doing it post parsing often allow better error messages, > the parser tends to send cryptic errors). > The first attached part is the initial refactor. It also adds some relatively > generic code for adding conversion functions (it would typically not be very > hard to allow user defined functions, though that's not part of the patch at > all) and uses that to handle the existing token, minTimeuuid and maxTimeuuid > functions. > It's also worth mentioning that this first patch introduces type casts. The > main reason is that it allows multiple overloads of the same function. > Typically, the minTimeuuid allows both strings arguments (for dates) or > integer ones (for timestamp), but so when you have: > {noformat} > SELECT * FROM foo WHERE t > minTimeuuid(?); > {noformat} > then the code doesn't know which function to use. So it complains. But you > can remove the ambiguity with > {noformat} > SELECT * FROM foo WHERE t > minTimeuuid((bigint)?); > {noformat} > The 2nd patch finishes what the first one started by extending this > conversion functions support to select clauses. So after this 2nd patch you > can do stuff like: > {noformat} > SELECT token(k), k FROM foo; > {noformat} > for instance. > The 3rd patch builds on that to actually add new conversion functions. > Namely, for every existing CQL3 <type> it adds a blobTo<type> and a > <type>ToBlob function that convert from and to blobs. And so you can do (not > that this example is particularly smart): > {noformat} > SELECT varintToBlob(v) FROM foo WHERE v > blobToVarint(bigintToBlob(3)); > {noformat} > Honestly this last patch is more for demonstration purpose and we can discuss > this separately. In particular, we may want better different names for those > functions. But at least it should highlight that adding new function is easy > (this could be used to add methods to work with dates for instance). > Now, at least considering the 2 first patches, this is not a small amount of > code but I would still suggest pushing this in 1.2 (the patches are against > 1.2) for the following reasons: > * It fixes a few existing bugs (CASSANDRA-5198 has broke prepared statement > for instance, which this patch fixes) and add missing validation in a few > places (we are allowing sets literals like \{ 1, 1 \} for instance, which is > kind of wrong as it suggests we support multisets). We could fix those > separatly but honestly I'm not we won't miss some. > * We do have a fair amount of CQL dtests and i've check all pass. The > refactor also cleans up some part of the code quite a bit imo. So overall I > think I'm almost more confident in the code post-refactor than the current > one. > * We're early in 1.2 and it's an improvement after all. It would be a bit > sad to have to wait for 2.0 to get this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira