[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

Sylvain Lebresne (JIRA) Mon, 16 Feb 2015 01:10:28 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322549#comment-14322549
 ]


Sylvain Lebresne commented on CASSANDRA-7970:
---------------------------------------------

Sorry for the lack of communication. I'll have a good look at this but that 
might only be next week if that's ok (got a long flight next Sunday so I'll 
review it then). That said a couple of early remarks (that might be somewhat 
off since I haven't checked the patch) based on the comments on this ticket so 
far.

bq. I've made the column name declaration optional with doing INSERT JSON

I'd actually have a preference for not allowing the column name declaration at 
all as it doesn't buy us anything and imo having 2 forms is more confusing than 
anything. Even if we later want to allow both {{VALUES}} and {{JSON}} (which 
I'm actually kind of against but we can argue later since we've at least agreed 
on postponing that option), we can reintroduce the names declaration then.

bq. toJson() can only be used in the selection clause of a SELECT statement, 
because it can accept any type and the exact argument type must be known.

Not 100% sure I see where the problem is on this one, at least in theory. Even 
if some of our literals can be of multiple types (typically numeric literals), 
they will always translate to the same thing in JSON anyway so that shouldn't 
be a problem. As for bind markers, we can do what we do for other functions 
when there is an ambiguity and require the user to provide a type-cast. Is it 
just that it's not convenient to do with the current code, or is there 
something more fundamental I'm missing?
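
As an illustration of the argument above, here is a minimal Python sketch (not 
Cassandra code; {{BIND_MARKER}}, {{SERIALIZERS}} and {{to_json}} are hypothetical 
names). It assumes an integer literal could be typed as {{int}}, {{bigint}} or 
{{varint}}, shows that every candidate yields the same JSON text, and rejects 
a bind marker that carries no type-cast.

```python
import json

BIND_MARKER = object()  # hypothetical stand-in for an unbound `?`

# Hypothetical per-type serializers for the types an integer literal could take.
SERIALIZERS = {
    "int": lambda v: json.dumps(int(v)),
    "bigint": lambda v: json.dumps(int(v)),
    "varint": lambda v: json.dumps(int(v)),
}


def to_json(term, cast=None):
    """Toy model of toJson() argument handling, for illustration only."""
    if term is BIND_MARKER:
        # Nothing is known about a bare bind marker, so a cast is required.
        if cast is None:
            raise TypeError("toJson(?): ambiguous argument, add a type-cast")
        return f"<JSON of the bound {cast} value>"
    # A literal may have several candidate types; check they all agree.
    candidates = [cast] if cast else list(SERIALIZERS)
    renderings = {SERIALIZERS[t](term) for t in candidates}
    if len(renderings) != 1:
        raise TypeError("literal renders differently depending on its type")
    return renderings.pop()
```

In this model {{to_json(42)}} succeeds without a cast because all candidate 
types agree, while {{to_json(BIND_MARKER)}} is rejected until a cast is given.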

bq. fromJson() can only be used in INSERT/UPDATE/DELETE statements because the 
receiving type must be known in order to parse the JSON correctly.

That one I understand, but I'm not sure a per-statement restriction is 
necessarily the most appropriate because I suppose there is a problem with 
functions too since we allow overloading (namely, we can have 2 {{foo}} methods, 
one taking a {{list<text>}} as argument, and the other taking an {{int}}, so 
{{foo(fromJson(z))}} would be problematic). So the most logical way to handle 
this for me would be to generalize slightly the notion of "some type" that we 
already have due to bind markers. Typically, both a bind marker type and 
the {{fromJson}} return type would be "some type", and when the type checker 
encounters one and can't resolve it to a single type, it would reject it, asking 
the user to type-cast explicitly. Similarly, the {{toJson()}} argument could be 
"some type". Again, we already do this for bind markers, it's just a bit 
ad hoc so it would just be a matter of generalizing it a bit. 
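
The "some type" idea can be sketched roughly as follows. This is a simplified 
Python model under stated assumptions, not the actual type checker: 
{{SOME_TYPE}}, {{Function}}, {{AmbiguousCall}} and {{resolve}} are hypothetical 
names, and types are compared as plain strings.

```python
from dataclasses import dataclass

# Hypothetical marker for a term whose type is not yet known,
# e.g. a bind marker or the result of fromJson().
SOME_TYPE = None


@dataclass(frozen=True)
class Function:
    name: str
    arg_types: tuple


class AmbiguousCall(Exception):
    """Raised when the caller must add an explicit type-cast."""


def resolve(name, arg_types, functions):
    """Pick the single overload compatible with the given argument types.

    A SOME_TYPE argument is compatible with any parameter type.
    """
    candidates = [
        f for f in functions
        if f.name == name
        and len(f.arg_types) == len(arg_types)
        and all(a is SOME_TYPE or a == p for a, p in zip(arg_types, f.arg_types))
    ]
    if not candidates:
        raise LookupError(f"no function {name} matches {arg_types}")
    if len(candidates) > 1:
        raise AmbiguousCall(f"{name}: add a type-cast to disambiguate")
    return candidates[0]


# The two overloads from the example above.
FUNCTIONS = [
    Function("foo", ("list<text>",)),
    Function("foo", ("int",)),
]
```

With these definitions, {{resolve("foo", [SOME_TYPE], FUNCTIONS)}} raises 
{{AmbiguousCall}}, whereas {{resolve("foo", ["int"], FUNCTIONS)}}, which models 
an explicit cast, selects the {{int}} overload.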

> JSON support for CQL
> --------------------
>
>                 Key: CASSANDRA-7970
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>              Labels: client-impacting, cql3.3, docs-impacting
>             Fix For: 3.0
>
>         Attachments: 7970-trunk-v1.txt
>
>
> JSON is popular enough that not supporting it is becoming a competitive 
> weakness.  We can add JSON support in a way that is compatible with our 
> performance goals by *mapping* JSON to an existing schema: one JSON document 
> maps to one CQL row.
> Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
> [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
> document from a service or a user into a CQL row, with all the validation 
> that entails.
> Since we are not looking to support schemaless documents, we will not be 
> adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
> map the JSON to UDT, collections, and primitive CQL types.
> Here's how this might look:
> {code}
> CREATE TYPE address (
>   street text,
>   city text,
>   zip_code int,
>   phones set<text>
> );
> CREATE TABLE users (
>   id uuid PRIMARY KEY,
>   name text,
>   addresses map<text, address>
> );
> INSERT INTO users JSON
> {"id": 4b856557-7153,
>    "name": "jbellis",
>    "addresses": {"home": {"street": "123 Cassandra Dr",
>                           "city": "Austin",
>                           "zip_code": 78747,
>                           "phones": ["2101234567"]}}};
> SELECT JSON id, addresses FROM users;
> {code}
> (We would also want to_json and from_json functions to allow mapping a single 
> column's worth of data.  These would not require extra syntax.)
> [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
> [2] https://blog.compose.io/schema-less-is-usually-a-lie/
> [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
