[ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300120#comment-14300120 ]
Robert Stupp commented on CASSANDRA-7970: ----------------------------------------- I took a short look at the patch and played around a little bit with this. Some comments: * A test with null values in {{testFromJsonFct()}} and {{testToJsonFct()}} would be nice (toJson handles null correctly, fromJson throws an NPE for {{INSERT INTO tojson (k, asciival ) VALUES ( 0, fromJson(null) );}}) * Can you add a {{INSERT JSON}} variant that tackles tuples and types? * I tried {{insert into some_table (k, asciival ) JSON \{"k": 0, "asciival": "foobar"\};}} in cqlsh - it complains with a syntax error (obviously my bad). Was wondering if we really need to mention the column names since they are contained in the JSON. Couldn't a {{insert into some_table JSON ?;}} with json {{"\{"k": 0, "asciival": "foobar"\}"}} also work? * The Java Driver tries to contact a coordinator that owns the target partition. This gets complicated with {{INSERT JSON}} since the driver would have to parse the JSON before it actually knows the "correct" node. Maybe we can discuss a follow-up that both allows {{VALUES}} and {{JSON}} - e.g. {{INSERT INTO some_table (partitionKey, clusteringKey) VALUES (1, 2) JSON ?}}, where the {{JSON ?}} part contains more columns. * missing license header in {{Json.java}} * In {{FunctionCall}} you exchanged „fun“ with „fun.name()“ eliminating the function signature in the exception message * {{Selection.rowToJson}} : use Collections.singleton instead of {{Arrays.asList}} * {{Selection.rowToJson}} : would be nicer (and eliminate one String instance) to replace sb.append(JSONValue.escape(columnName)) with something like JSONValue.escape(columnName, sb); (make escape write to sb) * Similar for {{AbstractType.toJSONString}} - let the implementation write to the string builder * {{ReversedType}} : not sure whether the impl is correct - doesn’t it need to reverse the binary representation? * {{FromJsonFct}} + {{ToJsonFct}} : would be nicer to have a CHM instead of synchronized HashMap. Also these maps in these classes have {{AbstractType}} as the key and are never evicted - means: a changed user type would remain forever in these maps and if a UDT. This might get irrelevant when {{AbstractType}} is removed - so not sure, whether we should fix this now - maybe open a follow-up-ticket for this? A simple fix would be to 'statically cache' function instances for primitive types but always create a new instance of these functions for tuples + UDTs ; then you could pre-create the function instances completely eliminating the need for CHM or synchronized. Altogether: really nice work :) > JSON support for CQL > -------------------- > > Key: CASSANDRA-7970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7970 > Project: Cassandra > Issue Type: New Feature > Components: API > Reporter: Jonathan Ellis > Assignee: Tyler Hobbs > Fix For: 3.0 > > Attachments: 7970-trunk-v1.txt > > > JSON is popular enough that not supporting it is becoming a competitive > weakness. We can add JSON support in a way that is compatible with our > performance goals by *mapping* JSON to an existing schema: one JSON documents > maps to one CQL row. > Thus, it is NOT a goal to support schemaless documents, which is a misfeature > [1] [2] [3]. Rather, it is to allow a convenient way to easily turn a JSON > document from a service or a user into a CQL row, with all the validation > that entails. > Since we are not looking to support schemaless documents, we will not be > adding a JSON data type (CASSANDRA-6833) a la postgresql. Rather, we will > map the JSON to UDT, collections, and primitive CQL types. > Here's how this might look: > {code} > CREATE TYPE address ( > street text, > city text, > zip_code int, > phones set<text> > ); > CREATE TABLE users ( > id uuid PRIMARY KEY, > name text, > addresses map<text, address> > ); > INSERT INTO users JSON > {‘id’: 4b856557-7153, > ‘name’: ‘jbellis’, > ‘address’: {“home”: {“street”: “123 Cassandra Dr”, > “city”: “Austin”, > “zip_code”: 78747, > “phones”: [2101234567]}}}; > SELECT JSON id, address FROM users; > {code} > (We would also want to_json and from_json functions to allow mapping a single > column's worth of data. These would not require extra syntax.) > [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/ > [2] https://blog.compose.io/schema-less-is-usually-a-lie/ > [3] http://dl.acm.org/citation.cfm?id=2481247 -- This message was sent by Atlassian JIRA (v6.3.4#6332)