[ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322549#comment-14322549 ]
Sylvain Lebresne commented on CASSANDRA-7970:
---------------------------------------------

Sorry for the lack of communication. I'll have a good look at this, but that might only be next week if that's ok (I've got a long flight next Sunday, so I'll review it then). That said, here are a couple of early remarks (which might be somewhat off since I haven't checked the patch), based on the comments on this ticket so far.

bq. I've made the column name declaration optional with doing INSERT JSON
I'd actually have a preference for not allowing the column name declaration at all, as it doesn't buy us anything and imo having 2 forms is more confusing than anything. Even if we later want to allow both {{VALUES}} and {{JSON}} (which I'm actually kind of against, but we can argue later since we've at least agreed on postponing that option), we can reintroduce the names declaration then.

bq. toJson() can only be used in the selection clause of a SELECT statement, because it can accept any type and the exact argument type must be known.
Not 100% sure I see where the problem is on this one, at least in theory. Even if some of our literals can be of multiple types (typically numeric literals), they will always translate to the same thing in JSON anyway, so that shouldn't be a problem. As for bind markers, we can do what we do for other functions when there is an ambiguity and require the user to provide a type-cast. Is it just that it's not convenient to do with the current code, or is there something more fundamental I'm missing?

bq. fromJson() can only be used in INSERT/UPDATE/DELETE statements because the receiving type must be known in order to parse the JSON correctly.
That one I understand, but I'm not sure a per-statement restriction is necessarily the most appropriate, because I suppose there is a problem with functions too since we allow overloading (namely, we can have 2 {{foo}} methods, one taking a {{list<text>}} as argument and the other taking an {{int}}, so {{foo(fromJson(z))}} would be problematic). So the most logical way to handle this, for me, would be to generalize slightly the notion of "some type" that we already have due to bind markers. Typically, both a bind marker's type and the {{fromJson}} return type would be "some type", and when the type checker encounters one and can't resolve it to a single type, it would reject it, asking the user to type-cast explicitly. Similarly, the {{toJson()}} argument could be "some type". Again, we already do this for bind markers; it's just a bit ad hoc, so it would just be a matter of generalizing it a bit.

> JSON support for CQL
> --------------------
>
>                 Key: CASSANDRA-7970
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>              Labels: client-impacting, cql3.3, docs-impacting
>             Fix For: 3.0
>
>         Attachments: 7970-trunk-v1.txt
>
>
> JSON is popular enough that not supporting it is becoming a competitive
> weakness.  We can add JSON support in a way that is compatible with our
> performance goals by *mapping* JSON to an existing schema: one JSON document
> maps to one CQL row.
> Thus, it is NOT a goal to support schemaless documents, which is a misfeature
> [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON
> document from a service or a user into a CQL row, with all the validation
> that entails.
> Since we are not looking to support schemaless documents, we will not be
> adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will
> map the JSON to UDT, collections, and primitive CQL types.
> Here's how this might look:
> {code}
> CREATE TYPE address (
>     street text,
>     city text,
>     zip_code int,
>     phones set<text>
> );
> CREATE TABLE users (
>     id uuid PRIMARY KEY,
>     name text,
>     addresses map<text, address>
> );
> INSERT INTO users JSON
> {'id': 4b856557-7153,
>    'name': 'jbellis',
>    'addresses': {"home": {"street": "123 Cassandra Dr",
>                           "city": "Austin",
>                           "zip_code": 78747,
>                           "phones": ["2101234567"]}}};
> SELECT JSON id, addresses FROM users;
> {code}
> (We would also want to_json and from_json functions to allow mapping a single
> column's worth of data.  These would not require extra syntax.)
> [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
> [2] https://blog.compose.io/schema-less-is-usually-a-lie/
> [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
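
As an illustration of the "some type" resolution suggested in the comment above, here is a minimal Python sketch (not Cassandra code). The overload table, {{SOME_TYPE}}, {{AmbiguousCallError}} and {{resolve_argument_type}} are all hypothetical names invented for this example, and only one-argument functions are modelled; the real type checker may work quite differently.

```python
# Hypothetical marker for "some type": the type of a bind marker or of a
# fromJson() call, which is unknown until the surrounding context fixes it.
SOME_TYPE = None


class AmbiguousCallError(Exception):
    """Raised when a "some type" argument matches more than one overload."""


# Hypothetical overload table: function name -> accepted argument types.
OVERLOADS = {
    "foo": ["list<text>", "int"],  # two overloads, as in the comment
    "bar": ["text"],               # a single overload
}


def resolve_argument_type(function_name, arg_type, cast=None):
    """Pick the parameter type for a one-argument call.

    arg_type is a CQL type name, or SOME_TYPE for a bind marker or a
    fromJson() result. cast is an optional explicit type-cast from the user.
    """
    candidates = OVERLOADS[function_name]
    declared = cast if cast is not None else arg_type

    if declared is SOME_TYPE:
        # No cast given: accept only if exactly one overload is possible.
        if len(candidates) == 1:
            return candidates[0]
        raise AmbiguousCallError(
            f"ambiguous call to {function_name}(); add an explicit "
            f"type-cast to one of: {', '.join(candidates)}"
        )

    if declared in candidates:
        return declared
    raise TypeError(f"no overload of {function_name}() accepts {declared}")
```

Under these assumptions, {{foo(fromJson(z))}} is rejected with a request for a type-cast, the same call with a cast to {{int}} resolves to the {{int}} overload, and {{bar(fromJson(z))}} resolves without a cast because only one overload exists.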