On 03/13/2013 12:40 PM, David E. Wheeler wrote:
> On Mar 13, 2013, at 5:17 AM, Robert Haas <robertmh...@gmail.com> wrote:
>
>> What I think is tricky here is that there's more than one way to
>> conceptualize what the JSON data type really is.  Is it a key-value
>> store of sorts, or just a way to store text values that meet certain
>> minimalist syntactic criteria?  I had imagined it as the latter, in
>> which case normalization isn't sensible.  But if you think of it the
>> first way, then normalization is not only sensible, but almost
>> obligatory.
>
> That makes a lot of sense. Given the restrictions I tend to prefer in my
> database data types, I had imagined it as the former. And since I'm using it
> now to store key/value pairs (killing off some awful EAV implementations in
> the process, BTW), I certainly think of it more formally as an object.
>
> But I can live with the other interpretation, as long as the differences are
> clearly understood and documented. Perhaps a note could be added to the docs
> explaining this difference, and what one can do to adapt for it. A
> normalizing function would certainly help.
I guess the easiest and most generic way to normalize is to convert to some
internal representation and back.

In PL/Python this would look like this:

hannu=# create function normalize(IN ij json, OUT oj json) language plpythonu as $$
import json
# json.loads() keeps only the last value given for a duplicated key, so a
# round-trip through Python's parser collapses duplicates (and normalizes
# whitespace) as a side effect.
return json.dumps(json.loads(ij))
$$;
CREATE FUNCTION
hannu=# select normalize('{"a":1, "a":"b", "a":true}');
  normalize
-------------
 {"a": true}
(1 row)

If we wanted to be really fancy we could start storing our json in some format that is faster to parse, like tnetstrings, but it is probably too late in the release cycle to change this now.
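
(For illustration only, a minimal tnetstring encoder sketched from the
tnetstrings spec; this is just a sketch, not a proposal for the on-disk
format, and it assumes ASCII strings where the spec actually counts bytes:)

import json

def tns_dumps(v):
    # Encode a parsed JSON value as a tnetstring: "<length>:<payload><type>".
    if v is None:
        return '0:~'
    if v is True or v is False:  # test before int: bool is an int subtype
        s = 'true' if v else 'false'
        return '%d:%s!' % (len(s), s)
    if isinstance(v, int):
        s = str(v)
        return '%d:%s#' % (len(s), s)
    if isinstance(v, float):
        s = repr(v)
        return '%d:%s^' % (len(s), s)
    if isinstance(v, str):
        return '%d:%s,' % (len(v), v)
    if isinstance(v, list):
        payload = ''.join(tns_dumps(x) for x in v)
        return '%d:%s]' % (len(payload), payload)
    if isinstance(v, dict):
        payload = ''.join(tns_dumps(k) + tns_dumps(val) for k, val in v.items())
        return '%d:%s}' % (len(payload), payload)
    raise TypeError('cannot encode %r' % (v,))

tns_dumps(json.loads('{"a": true, "n": [1, 2]}'))
# -> '26:1:a,4:true!1:n,8:1:1#1:2#]}'

Since every element is length-prefixed, a reader never has to scan for
delimiters or unescape anything, which is where the parsing speedup would
come from.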

Hannu

> Best,
>
> David
