On Nov 12, 2013, at 10:35 AM, Teodor Sigaev <teo...@sigaev.ru> wrote:

> Hi!
> 
> Attatched patch adds nesting feature, types (string, boll and numeric 
> values), arrays and scalar to hstore type.

My apologies for not getting to this sooner, work has been a bit nutty. The 
truth is that I reviewed this patch quite a bit a month back, mostly so I could 
write documentation, the results of which are included in this patch. And I'm 
super excited for what's to come in the next iteration, as I hear that Teodor 
and Andrew are hard at work adding jsonb as a binary-compatible JSON data type.

Meanwhile, for this version, a quick overview of what has changed since 9.2.

Contents & Purpose
==================

Improved Data Type Support
--------------------------

* Added data type support for values. Previously they could only be strings or 
NULL, but with this patch they can also be numbers or booleans.

* Added array support. Values can be arrays of other values. The format for 
arrays is a bracketed, comma-delimited list.

* Added nesting support. hstore values can themselves be hstores. Nested 
hstores are wrapped in braces, but the root-level hstore is not (for 
compatibility with the format of previous versions of hstore).

* An hstore value is no longer required to be an hstore object. It can now be 
any scalar value.

These three items make the basic format feature-complete with JSON. Here's an 
example where the values are scalars:

    =% SELECT 'foo'::hstore, '"hi \"bob\""'::hstore, '1.0'::hstore, 
'true'::hstore, NULL::hstore;
     hstore |    hstore    | hstore | hstore | hstore 
    --------+--------------+--------+--------+--------
     "foo"  | "hi \"bob\"" | 1.0    | t      | 

And here are a couple of arrays with strings, numbers, booleans, and NULLs:

    SELECT '[k,v]'::hstore, '[1.0, "hi there", false, null]'::hstore;
       hstore   |           hstore           
    ------------+----------------------------
     ["k", "v"] | [1.0, "hi there", f, NULL]

Here's a complicated example formatted with `hstore.pretty_print` enabled.

        =% SET hstore.pretty_print=true;
        =% SELECT '{
          "type" => "Feature",
          "bbox" => [-180.0, -90.0, 180.0, 90.0],
          "geometry" => {
            "type" => "Polygon",
            "coordinates" => [[
              [-180.0, 10.0], [20.0, 90.0], [180.0, -5.0], [-30.0, -90.0]
              ]]
            }
        }'::hstore;
                  hstore          
        --------------------------
         "bbox"=>                +
         [                       +
             -180.0,             +
             -90.0,              +
             180.0,              +
             90.0                +
         ],                      +
         "type"=>"Feature",      +
         "geometry"=>            +
         {                       +
             "type"=>"Polygon",  +
             "coordinates"=>     +
             [                   +
                 [               +
                     [           +
                         -180.0, +
                         10.0    +
                     ],          +
                     [           +
                         20.0,   +
                         90.0    +
                     ],          +
                     [           +
                         180.0,  +
                         -5.0    +
                     ],          +
                     [           +
                         -30.0,  +
                         -90.0   +
                     ]           +
                 ]               +
             ]                   +
         }

So, exact feature parity with the JSON data type.

* hstore.pretty_print is a new GUC, specifically to allow an HSTORE value to be 
pretty-printed. There is also a function to pretty-print, so we might be able 
to just do away with the GUC.

Interface
---------

* New operators:
  + `hstore -> int`:     Get string value at array index (starting at 0)
  + `hstore ^> text`:    Get numeric value for key
  + `hstore ^> int`:     Get numeric value at array index
  + `hstore ?> text`:    Get boolean value for key
  + `hstore ?> int`:     Get boolean value at array index
  + `hstore #> text[]`:  Get string value for key path
  + `hstore #^> text[]`: Get numeric value for key path
  + `hstore #?> text[]`: Get boolean value for key path
  + `hstore %> text`:    Get hstore value for key
  + `hstore %> int`:     Get hstore value at array index
  + `hstore #%> text[]`: Get hstore value for key path
  + `hstore ? int`:      Does hstore contain array index
  + `hstore #? text[]`:  Does hstore contain key path
  + `hstore - int`:      Delete index from left operand
  + `hstore #- text[]`:  Delete key path from left operand

* New functions:
  + `hstore(text)`:             Make a text scalar hstore
  + `hstore(numeric)`:          Make a numeric scalar hstore
  + `hstore(boolean)`:          Make a boolean scalar hstore
  + `hstore(text, hstore)`:     Make a nested hstore
  + `hstore(text, numeric)`:    Make an hstore with a key and numeric value
  + `hstore(text, boolean)`:    Make an hstore with a key and boolean value
  + `array_to_hstore(anyarray): Make an array hstore from an SQL array
  + `hvals(hstore)`             Get values as a set of hstore values
  + `json_to_hstore(json)`      Convert JSON to hstore
  + `each_hstore(hstore)`       Get set of hstore key/value pairs
  + `hstore_typeof(hstore)`     Return text name for the hstore type (hash, 
array, text, numeric, etc.)
  + `replace(hstore,text[],hstore)`:     Replace value at specified path
  + `concat_path(hstore,text[],hstore)`: Concatenate hstore value at specified 
path
  + `hstore_print(hstore, params)`:      Format hstore as text

  The hstore_print() function has a number of optional boolean parameters to 
affect how the resulting text is formatted. They all default to false:

    - pretty_print
    - array_curly_braces: use {} instead of [] for arrays
    - root_hash_decorated: Use {} for the root hash
    - json: Format as JSON
    - loose: Try to parse numbers and booleans from text values

Other Changes
-------------

* New casts: JSON and HSTORE can be cast to each other. I don't think they're 
implicit, though the forthcoming jsonb data type might support explicit casting 
to and from hstore, since internally they will be identical.

* The internal representation has been changed, but should be backward (and 
pg_upgrade) compatible, just as Andrew Gierth's change from 8.4 to 9.0 was. One 
can do an in-place update to rewrite all records at once. Of course, nested 
and/or non-hash hstore values dumped from 9.4 will not be able to be loaded 
into 9.3.

* GIN indexing is now supported. This is actually pretty amazing. For an hstore 
value, even hash keys are considered values, as far as the index is concerned. 
This makes it efficient to find hstore values that contain a key. I wrote an 
example in this blog post:

  http://theory.so/pg/2013/10/25/indexing-nested-hstore/

Submission review
=================
* Is the patch in a patch format which has context? Yes.
* Does it apply cleanly to the current git master? It did for me, though I 
think Peter has found an issue or two since.
* Does it include reasonable tests, necessary doc patches, etc? Yes.

Usability review
================

* Does the patch actually implement what it says it does? Yes.
* Do we want that? OH yes.
* Do we already have it? No.
* Does it follow SQL spec, or the community-agreed behavior? Yes, though want 
jsonb, too.
* Does it include pg_dump support? Yes
* Are there dangers?  Could break backward compatibility, though I don't think 
it does.
* Have all the bases been covered? I think so

Feature test
============

* Does the feature work as advertised? Yes.
* Are there corner cases the author has failed to consider? All I noticed were 
promptly fixed.
* Are there any assertion failures or crashes? No

Performance review
==================

* Does the patch slow down simple tests?  No
* If it claims to improve performance, does it? Yes, with GIN index support. 
Loading hstore values is slower than loading JSON, but everything else is 
faster than JSON.
* Does it slow down other things? No.

Coding review
=============

* Does it follow the project guidelines? Yes.
* Are there portability issues?  Unknown
* Will it work on Windows/BSD etc? Tested on OS X only.
* Are the comments sufficient and accurate? Yes.
* Does it do what it says, correctly? As best I can tell, yes.
* Does it produce compiler warnings? No.
* Can you make it crash? No, but I did find a bug or two that was promptly 
fixed.

Architecture review
===================

* Is everything done in a way that fits together coherently with other 
features/modules? yes.
* Are there interdependencies that can cause problems? No

Conclusion
==========

I love where nested hstore is going, especially since it will be used for 
jsonb, too. The nesting, data type, and GIN index support is really great, and 
the new constructors provide a nice SQL API that make it easy to use. I think 
that the next version of this patch will be full of win for the project.

This was considered a WIP patch, since the jsonb support is still forthcoming, 
so it's appropriate to leave it marked “Returned with feedback”. As Andrew is 
doing much of that work, the code itself will get a much closer examination 
from him. But for the hstore feature itself, I think the current interface and 
features are ready to go.

Best,

David












-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to