Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON

Joey Adams Sun, 24 Jul 2011 15:49:21 -0700

On Sat, Jul 23, 2011 at 11:14 PM, Robert Haas <[email protected]> wrote:
> I doubt you're going to want to reinvent TOAST, ...

I was thinking about making it efficient to access or update
foo.a.b.c.d[1000] in a huge JSON tree.  Simply TOASTing the varlena
text means we have to unpack the entire datum to access and update
individual members.  An alternative would be to split the JSON into
chunks (possibly by using the pg_toast_<id> table) and have some sort
of index that can be used to efficiently look up values by path.

This would not be trivial, and I don't plan to implement it any time soon.
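
To make the path-lookup idea concrete, here is a minimal sketch in Python. It is only an illustration of indexing values by path, not a proposal for the on-disk format: the `flatten` helper and the in-memory `dict` index are hypothetical names invented for this example, and nothing here touches TOAST or pg_toast_<id>.

```python
import json

def flatten(value, prefix=()):
    """Yield (path, scalar) pairs for every scalar leaf of a parsed JSON value.

    Hypothetical helper: empty objects and arrays produce no entries.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, prefix + (key,))
    elif isinstance(value, list):
        for i, child in enumerate(value):
            yield from flatten(child, prefix + (i,))
    else:
        yield prefix, value

doc = json.loads('{"foo": {"a": {"b": {"c": {"d": [10, 20, 30]}}}}}')

# The "index": one entry per leaf, keyed by its full path.
index = dict(flatten(doc))

# Look up foo.a.b.c.d[1] without walking the tree again.
print(index[("foo", "a", "b", "c", "d", 1)])  # prints 20
```

A real implementation would have to keep such an index on disk and update it in place, which is where the non-trivial work lies.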

On Sun, Jul 24, 2011 at 2:19 PM, Florian Pflug <[email protected]> wrote:
> On Jul24, 2011, at 05:14 , Robert Haas wrote:
>> On Fri, Jul 22, 2011 at 10:36 PM, Joey Adams <[email protected]> wrote:
>>> ... Fortunately, JSON's definition of a
>>> "number" is its decimal syntax, so the algorithm is child's play:
>>>
>>>  * Figure out the digits and exponent.
>>>  * If the exponent is greater than 20 or less than -6 (arbitrary), use
>>> exponential notation.
>>>
>>
>
> I agree. As for your proposed algorithm, I suggest instead using
> exponential notation if it produces a shorter textual representation.
> In other words, for values between -1 and 1, we'd switch to exponential
> notation if there's more than 1 leading zero (to the right of the decimal
> point, of course), and for values outside that range if there are more than
> 2 trailing zeros and no decimal point. All of this applies after redundant
> zeros and decimal points are removed. So we'd store
>
> 0 as 0
> 1 as 1
> 0.1 as 0.1
> 0.01 as 0.01
> 0.001 as 1e-3
> 10 as 10
> 100 as 100
> 1000 as 1e3
> 1000.1 as 1000.1
> 1001 as 1001
>

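One way to read Florian's rule is "pick the exponential spelling only when it is strictly shorter than the plain one". The Python sketch below works under that assumption. The function name `shortest_json_number` is made up for this example, and writing the exponential form with an integer mantissa (digits followed by `e` and the exponent) is my own choice, not something stated in the thread.

```python
from decimal import Decimal

def shortest_json_number(text):
    """Return the plain spelling unless the exponential one is strictly shorter."""
    d = Decimal(text).normalize()        # drops redundant trailing zeros
    sign, digits, exponent = d.as_tuple()
    plain = format(d, "f")               # fixed-point, never uses an exponent
    if exponent == 0:
        return plain                     # 0, and integers without trailing zeros
    # Assumed exponential form: integer mantissa, e.g. 1e3 or 12e-4.
    mantissa = ("-" if sign else "") + "".join(map(str, digits))
    exponential = f"{mantissa}e{exponent}"
    return exponential if len(exponential) < len(plain) else plain

# The inputs from Florian's table.
for s in ["0", "1", "0.1", "0.01", "0.001", "10", "100", "1000", "1000.1", "1001"]:
    print(s, "as", shortest_json_number(s))
```

Ties go to the plain spelling, which is why 0.01 and 100 stay as they are while 0.001 and 1000 switch to exponential notation.
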
Interesting idea.  The reason I suggested using exponential notation
only for extreme exponents (less than -6 or greater than +20) is
partly for presentation value.  Users might be annoyed to see 1000000
turned into 1e6.  Moreover, applications working solely with integers
that don't expect the floating point syntax may choke on the converted
numbers.  32-bit integers can be losslessly encoded as IEEE
double-precision floats (JavaScript's internal representation), and
JavaScript's algorithm for converting a number to a string ([1],
section 9.8.1) happens to preserve the integer syntax (I think).
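
For comparison, the threshold rule can be sketched in Python as follows. This is only an illustration under stated assumptions: `render_json_number` and its `low`/`high` parameters are names invented here, and the exponential form is written without the explicit "+" sign that JavaScript emits for positive exponents.

```python
from decimal import Decimal

def render_json_number(text, low=-6, high=20):
    """Use exponential notation only when the exponent is below low or above high."""
    d = Decimal(text).normalize()            # drops redundant trailing zeros
    sign, digits, exponent = d.as_tuple()
    sci_exp = exponent + len(digits) - 1     # power of ten of the leading digit
    if low <= sci_exp <= high:
        return format(d, "f")                # plain decimal spelling
    head = str(digits[0])
    tail = "".join(map(str, digits[1:]))
    mantissa = head + ("." + tail if tail else "")
    return ("-" if sign else "") + f"{mantissa}e{sci_exp}"

print(render_json_number("1000000"))     # prints 1000000
print(render_json_number("0.0000001"))   # prints 1e-7
print(render_json_number("1e21"))        # prints 1e21
```

With these defaults, integers of everyday size keep their integer syntax, which is the presentation concern raised above.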

Should we follow the JavaScript standard for rendering numbers (which
my suggestion approximates)?  Or should we use the shortest encoding
as Florian suggests?

- Joey

[1] http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262%205th%20edition%20December%202009.pdf

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
