On Tue, Apr 8, 2014 at 11:37 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote: > As the code stands, you don't have a choice on any of those things. The > decisions have been made by us, PostgreSQL developers. The only choice you > have is between jsonb_ops and jsonb_hash_ops, with a strange combination of > tradeoffs in both. Sure, they're still useful, if not optimal, for a > wide-range of applications. For more complicated cases, you will have to > resort to expression indexes. It bugs me greatly that the underlying indexam > could do all those things, we're just not exposing the capability.
Why would you ever not have to use expression indexes? Idiomatic usage of jsonb involves expression indexes because it's desirable to index only a expression. People will want to do things like only index the nested "tags" array far more frequently then they'll only want to index keys (that is, Object pair keys) in the entire document. I don't get why you'd say that they'd "resort" to expression indexes, like they're a kludge. Have you ever tried out one of the new document databases? I suggest you do. Expression indexes on jsonb map pretty closely onto how you're frequently expected to index data in those systems. That's something that they make heavy use of. Why would you ever not really have to consider ahead of time what is important enough to be indexed, and what is not? > ISTM we need a way to parameterize opclasses, so that when you create the > index, you specify the above things. That would be nice. > In the absence of parameterizable opclasses, it would be much more flexible > to have opclasses that index, keys, values, key-value pairs and paths > separately, instead of the current json_ops and json_hash_ops opclasses > which index all of those in the same index. That way, if you only e.g. ever > query on the existence of a key, you'd only need to index the keys. I think only ever needing to index the keys is not a common use-case. It pretty much makes exactly as much sense to do so as it would with hstore, and yet hstore doesn't support that after all these years. > I don't understand how we ended up with the current dichotomy of json_ops > and json_hash_ops... It makes sense if you consider jsonb_ops best suited to simpler hstore-style indexing, while jsonb_hash_ops is best suited to testing containment of JSON documents, potentially with lots of nesting. These documents are typically homogeneous in structure. Idiomatic usage of systems like MongoDB involves "collections" of fairly homogeneous documents. If there is a lot of variability in their structure within a collection, the collection more or less becomes impossible to usefully query. They aim to be flexible, but still implicitly require you to insert data with a half-way sensible/consistent structure. This makes separately indexing the keys less than compelling as a default, because there is so much duplication of keys in practice. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers