On 09/05/14 15:34, Bruce Momjian wrote:
On Thu, May  8, 2014 at 06:39:11PM -0400, Tom Lane wrote:
I wrote:
I think the idea of hashing only keys/values that are "too long" is a
reasonable compromise.  I've not finished coding it (because I keep
getting distracted by other problems in the code :-() but it does not
look to be very difficult.  I'm envisioning the cutoff as being something
like 128 bytes; in practice that would mean that few if any keys get
hashed, I think.
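
A minimal sketch of the length-cutoff idea described above, not the code from the patch. Items at or below the cutoff are stored verbatim; longer ones are replaced by a fixed-width hash. The names `make_index_item` and `MAX_PLAIN_LEN` are hypothetical, FNV-1a is only a stand-in hash function, and the 128-byte cutoff is the figure floated above rather than a confirmed value.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_PLAIN_LEN 128		/* hypothetical cutoff, per the figure above */

/* FNV-1a, used here only as a stand-in for whatever hash the index uses. */
static uint32_t
fnv1a(const char *s, size_t len)
{
	uint32_t	h = 2166136261u;

	for (size_t i = 0; i < len; i++)
	{
		h ^= (unsigned char) s[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Write the index item for str into out, which must hold at least
 * MAX_PLAIN_LEN + 1 bytes.  Returns 1 if the item was hashed, 0 if it
 * was stored verbatim.
 */
static int
make_index_item(const char *str, size_t len, char *out)
{
	if (len > MAX_PLAIN_LEN)
	{
		/* Too long: store an 8-digit hex hash instead of the text. */
		snprintf(out, MAX_PLAIN_LEN + 1, "%08x", (unsigned) fnv1a(str, len));
		return 1;
	}

	/* Short enough: store the text itself, so no information is lost. */
	memcpy(out, str, len);
	out[len] = '\0';
	return 0;
}
```

With a cutoff this generous, typical object keys would take the verbatim branch, which matches the expectation that few if any keys get hashed.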
Attached is a draft patch for this.  In addition to the hash logic per se,
I made these changes:

* Replaced the K/V prefix bytes with a code that distinguishes the types
of JSON values.  While this is not of any huge significance for the
current index search operators, it's basically free to store the info,
so I think we should do it for possible future use.

* Fixed the problem with "exists" returning rows it shouldn't.  I
concluded that the best fix is just to force recheck for exists, which
allows considerable simplification in the consistent functions.

* Tried to improve the comments in jsonb_gin.c.
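
To illustrate the first change in the list above, here is a rough sketch of prefixing each index item with a one-byte type code instead of a plain K/V marker. The enum values and the helper `make_typed_item` are hypothetical and chosen only for illustration; the codes in the actual patch may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical one-byte type codes; not taken from the patch. */
enum item_type
{
	ITEM_KEY = 1,				/* object key */
	ITEM_NULL,					/* JSON null */
	ITEM_BOOL,					/* JSON true/false */
	ITEM_NUM,					/* JSON number */
	ITEM_STR					/* JSON string value */
};

/*
 * Build an index item consisting of a type code followed by the text.
 * out must hold at least len + 2 bytes.  Returns the item length,
 * not counting the terminating NUL.
 */
static size_t
make_typed_item(enum item_type type, const char *text, size_t len, char *out)
{
	out[0] = (char) type;
	memcpy(out + 1, text, len);
	out[len + 1] = '\0';
	return len + 1;
}
```

Under this scheme the key "a" and the string value "a" produce different items, because they differ in the first byte, and the extra type information costs nothing beyond the prefix byte that was already being stored.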

Barring objections I'll commit this tomorrow, and also try to improve the
user-facing documentation about the jsonb opclasses.
Looks good.  I was thinking the jsonb_ops name could remain unchanged
and the jsonb_hash_ops could be called jsonb_combo_ops as it combines
the key and value into a single index entry.

If you have 'jsonb_combo_ops', then surely 'jsonb_ops' should be called 'jsonb_xxx_ops', where the 'xxx' distinguishes it from 'jsonb_combo_ops'? I guess that if every appropriate wording for 'xxx' turned out to be too cumbersome, the rename would make things worse.


Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
