[ 
https://issues.apache.org/jira/browse/CASSANDRA-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-5226:
----------------------------------------

    Attachment: 0003-Add-bytes-conversion-functions.txt
                0002-Allow-functions-in-selection.txt
                0001-Refactor-to-support-CQL3-functions.txt
    
> CQL3 refactor to allow conversion function
> ------------------------------------------
>
>                 Key: CASSANDRA-5226
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5226
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.2.2
>
>         Attachments: 0001-Refactor-to-support-CQL3-functions.txt, 
> 0002-Allow-functions-in-selection.txt, 0003-Add-bytes-conversion-functions.txt
>
>
> In CASSANDRA-5198, we've fixed CQL3 type validation and talked about adding 
> conversion functions to ease working with the different data types. However, 
> the current CQL3 code makes it fairly hard to add such functions in a 
> non-hacky way. In fact, we already support a few conversion functions (token, 
> minTimeuuid, maxTimeuuid, now) but the way we support them is extremely ugly 
> (the token function is competely special cased and min/maxTimeuuid are ugly 
> hacks in TimeUUIDType that I'm really not proud of).
> So I'm attaching a refactor that cleans that up, making it easy to add new 
> conversion functions. Now, said refactor is a big one. While the goal is to 
> make it easy to add functions, I think it also improve the code in the 
> following ways:
> * It much more clearly separate the phase of validating the query from 
> executing it. And in particular, it moves more work in the preparing phase.  
> Typically, the parsing of constants is now done in the preparing phase, not 
> the execution one. It also groups validation code much more cleanly imo.
> * It simplify UpdateStatement. The Operation business was not very clean and 
> in particular the same operations where not handled by the same code 
> depending on whether they were prepared or not, which was error prone. This 
> is no longer the case.
> * It somewhat simplify the parser. A few parsing rules were a bit too 
> convoluted, trying to enforce invariants that are much more easily checked 
> post parsing (and doing it post parsing often allow better error messages, 
> the parser tends to send cryptic errors).
> The first attached part is the initial refactor. It also adds some relatively 
> generic code for adding conversion functions (it would typically not be very 
> hard to allow user defined functions, though that's not part of the patch at 
> all) and uses that to handle the existing token, minTimeuuid and maxTimeuuid 
> functions.
> It's also worth mentioning that this first patch introduces type casts. The 
> main reason is that it allows multiple overloads of the same function. 
> Typically, the minTimeuuid allows both strings arguments (for dates) or 
> integer ones (for timestamp), but so when you have:
> {noformat}
> SELECT * FROM foo WHERE t > minTimeuuid(?);
> {noformat}
> then the code doesn't know which function to use. So it complains. But you 
> can remove the ambiguity with
> {noformat}
> SELECT * FROM foo WHERE t > minTimeuuid((bigint)?);
> {noformat}
> The 2nd patch finishes what the first one started by extending this 
> conversion functions support to select clauses. So after this 2nd patch you 
> can do stuff like:
> {noformat}
> SELECT token(k), k FROM foo;
> {noformat}
> for instance.
> The 3rd patch builds on that to actually add new conversion functions. 
> Namely, for every existing CQL3 <type> it adds a blobTo<type> and a 
> <type>ToBlob function that convert from and to blobs. And so you can do (not 
> that this example is particularly smart):
> {noformat}
> SELECT varintToBlob(v) FROM foo WHERE v > blobToVarint(bigintToBlob(3));
> {noformat}
> Honestly this last patch is more for demonstration purpose and we can discuss 
> this separately. In particular, we may want better different names for those 
> functions. But at least it should highlight that adding new function is easy 
> (this could be used to add methods to work with dates for instance).
> Now, at least considering the 2 first patches, this is not a small amount of 
> code but I would still suggest pushing this in 1.2 (the patches are against 
> 1.2) for the following reasons:
>  * It fixes a few existing bugs (CASSANDRA-5198 has broke prepared statement 
> for instance, which this patch fixes) and add missing validation in a few 
> places (we are allowing sets literals like \{ 1, 1 \} for instance, which is 
> kind of wrong as it suggests we support multisets). We could fix those 
> separatly but honestly I'm not we won't miss some.
>  * We do have a fair amount of CQL dtests and i've check all pass. The 
> refactor also cleans up some part of the code quite a bit imo. So overall I 
> think I'm almost more confident in the code post-refactor than the current 
> one.
>  * We're early in 1.2 and it's an improvement after all. It would be a bit 
> sad to have to wait for 2.0 to get this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to