On 2014-02-09 22:23, John Bollinger wrote:


On Monday, September 1, 2014 3:55:03 AM UTC-5, henrik lindberg wrote:

    Hi,
    Recently I have been looking into serialization of various kinds, and
    the issue of how we represent and serialize/deserialize numbers has
    come up.


[...]


    Proposal
    ========
    I would like to cap a Puppet Integer to be a 64 bit signed value when used
    as a resource attribute, or anywhere in external formats. This means a
    value range of -2^63 to 2^63-1 which is in Exabyte range (1 exabyte
    = 2^60).

    I would like to cap a Puppet Float to be a 64 bit (IEEE 754 binary64)
    value when used as a resource attribute or anywhere in external formats.

    With respect to intermediate results, I propose that we specify that
    values are of arbitrary size and that it is an error to store a value



What, specifically, does it mean to "store a value"?  Does that mean to
assign it to a resource attribute?

It was vague on purpose since I cannot currently enumerate the places where this should take place, but I was thinking resource attributes at least.


    that is too big for the typed representation Integer (64 bit signed).
    For Float (64 bit) representation there is no error, but it loses
    precision.



What about numbers that overflow or underflow a 64-bit Float?


That would also be an error (when it cannot lose more precision).
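
To make the proposed rule concrete, here is a minimal sketch in Python (chosen only because its integers are already of arbitrary size). The names check_integer and to_float64 are hypothetical and not part of Puppet; the sketch assumes that "storing" means passing a value through one of these checks.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def check_integer(value: int) -> int:
    """Return value unchanged if it fits a 64 bit signed Integer, else raise."""
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError(f"{value} does not fit in a 64 bit signed Integer")
    return value


def to_float64(value: int) -> float:
    """Convert to binary64. Precision may be lost silently; a value beyond
    the binary64 range is an error (Python raises OverflowError here)."""
    return float(value)
```

Under these assumptions, check_integer(2**63) is an error, to_float64(2**64 + 1) succeeds but yields 2**64 exactly, and to_float64(10**400) is an error because no further precision can be given up.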

    When specifying an attribute to have Number type, automatic
    conversion to Float (with loss of precision) takes place if an internal
    integer number is too big for the Integer representation.

    (Note, by default, attributes are typed as Any, which means that
    they by default would store a Float if the integer value
    representation overflows).



And if BigDecimal (and maybe BigInteger) were added to the type system,
then I presume the expectation would be that over/underflowing Floats
would go there?  And maybe that overflowing integers would go there if
necessary to avoid loss of precision?


If we add them, then the runtime should be specified to gracefully choose the required size while calculating. The types Any and Number would then accept them, while Integer and Float would not accept values that are outside their valid range. (I must say I have not thought this through completely at this point.)
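
A rough sketch of that acceptance rule, again in Python and purely illustrative: the function accepts and the use of Decimal as a stand-in for a Big type are assumptions, not Puppet's type system.

```python
from decimal import Decimal

INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def accepts(type_name: str, value) -> bool:
    """Hypothetical check: does a value match the named numeric type?"""
    if type_name == "Any":
        return True
    if type_name == "Number":
        # Number admits every numeric representation, including Big values.
        return isinstance(value, (int, float, Decimal))
    if type_name == "Integer":
        # Integer admits only values inside the 64 bit signed range.
        return isinstance(value, int) and INT64_MIN <= value <= INT64_MAX
    if type_name == "Float":
        # A Python float is always binary64, so it is always in range;
        # a Decimal (the Big stand-in) is not accepted.
        return isinstance(value, float)
    raise ValueError(f"unknown type {type_name}")
```

With this sketch, accepts("Number", 2**63) is true while accepts("Integer", 2**63) is false, which is the distinction described above.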


    Questions
    =========
    * Is it important that Javascript can be used to (accurately) read JSON
    generated by Puppet? (If so, the limit needs to be 2^53 or values lose
    precision).



I think that question is moot.  No matter what, Javascript is limited in
that it cannot with full fidelity consume or produce Puppet data having
more than 53 bits of numeric precision.  I don't think it helps anyone
to project that limitation into Puppet.
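
The 53-bit limit is easy to demonstrate with any IEEE 754 binary64 implementation. A small Python sketch follows; the helper name survives_binary64 is made up for illustration, and it assumes n is small enough to convert to a float at all.

```python
def survives_binary64(n: int) -> bool:
    """True if n converts to a binary64 float and back without change."""
    return int(float(n)) == n


# 2^53 is still exact, but 2^53 + 1 has no binary64 representation
# and rounds to a neighbouring value.
print(survives_binary64(2**53))
print(survives_binary64(2**53 + 1))
```

The same check fails for 2^63 - 1, so a consumer that reads JSON numbers as binary64 cannot faithfully read the top of the proposed Integer range either.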

    * Is it important in Puppet Manifests to handle values larger than
    2^63-1 (smaller than -2^63), and if not so, why isn't it sufficient to
    use a floating point value (with reduced precision).



I am not prepared to offer examples of why Puppet manifests would need
to handle more than 63 bits of fixed-point precision, nor even more than
53 bits of floating-point precision.  I am uneasy about pulling back
from Puppet's documented greater current capabilities, however.

    * If you think Puppet needs to handle very large values (yottabyte
    sized disks?), should the language have convenient ways of expressing
    such values e.g. 42yb ?



I would prefer to avoid adding such expressions, especially if there
will not be similar ones all the way down the size scale.  I would not
be enthusiastic even with a full range of such expressions.


    * Is it ok to automatically do transformation to floating point if
    values overflow, and the type of an attribute is Number? (as discussed
    above). I can imagine this making it difficult to efficiently represent
    an attribute in a database and support may vary between different
    database engines.



It is not ok to silently lose precision.  It might be ok to lose
precision if doing so is accompanied by a warning.

I'd anyway be inclined to say that the problem here is not so much
possible loss of precision as it is specifying the type of the attribute
as Number instead of something more specific.  OF COURSE that presents
issues for recording the value in a database.

    * Do you think it is worth the trouble to add the types BigInteger and
    BigDecimal to the type system to allow the representation to be more
    precise? (Note that this makes it difficult to use standard number
    representation in serialization formats). This means that Number is not
    allowed as an attribute/storage type (user must choose Integer, Float,
    or one of the Big... types).



1) If you have BigDecimal then you don't need BigInteger.

True, but BigInteger specifies that a fraction is not allowed.

2) Why would allowing one or both of the Bigs prevent Number from being
allowed as a serializable type?

Not sure I said that. The problem is that if something is potentially Big... then a database must be prepared to deal with it and it has a high cost. Specifying that Number means Integer, Float, or a Big type is perfectly fine.

The way I see it, if you allow Bigs then Numbers must always be
(de)serialized as BigDecimal.  Where you want attributes or other values
to be efficiently serializable / indexable / etc. you assign them a
narrower type appropriate for that purpose.  If this is too big a
challenge for users accustomed to not specifying types, then perhaps the
whole type system thing -- cool as it is -- is just not a good fit for
Puppet.

Yes, that is how I thought this could work. However, since everything is basically untyped now (which we translate to the type Any), this means that PuppetDB must be changed to use BigDecimal instead of integer 64 and float. That is a lose^3: it is lots of work to implement, performance is bad, and everyone needs to type everything.

3) Do you actually need one or both Bigs as named types in order to
allow Big values?  Could it not be that Big values are representable via
the Number type, but there is no (other) named numeric type that
specifically allows such values?  Since you seem to prefer that users to
not work with such values, would that not influence them in that direction?

Possibly. Having Number be concrete and represented as BigDecimal is ok; it can hold any value described by subclasses.


    * Do you think it should work as in Ruby? If so, are you ok with
    serialization that is non standard?



I think disallowing Bigs in the serialization formats will present its
own problems, only some of which you have touched on so far.  I think
the type system should offer /opportunities/ for greater efficiency in
numeric handling, rather than serving as an excuse to limit numeric
representations.


I don't quite get the point here; the proposed cap is not something that the type system needs. As an example, MsgPack does not have standard Big types, so a serialization would need to be special, and it would not be possible to just use something like "readInt" to get what you know should be an integer value. The other example is PuppetDB, where a decision has to be made about how to store integers: the slower Big types, or a more efficient 64 bit value? This is not just about storage, but also about indexing speed and query/comparison. If some values are stored as 64 bits and others as a Big type for the same entity, that would be even slower to query.

So - idea, make it safe and efficient for the normal cases. Only when there is a special case (if indeed we do need the big types) then take the less efficient route.
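
To illustrate the serialization cost, here is a hedged sketch of a made-up tagged encoding in Python. It is not MsgPack and not any existing Puppet format; the tags "i" and "B" and both function names are invented for the example. The point is only that once Big values are allowed, a reader must inspect a tag instead of always reading a fixed 8 bytes.

```python
def encode_integer(value: int) -> bytes:
    """Fixed 8 bytes when the value fits int64, variable length otherwise."""
    try:
        return b"i" + value.to_bytes(8, "big", signed=True)
    except OverflowError:
        # Special case: one length byte, then as many bytes as needed.
        # (This toy format only supports sizes up to 255 bytes.)
        size = value.bit_length() // 8 + 1  # leaves room for the sign bit
        return b"B" + bytes([size]) + value.to_bytes(size, "big", signed=True)


def decode_integer(data: bytes) -> int:
    """Inverse of encode_integer; the reader must branch on the tag."""
    tag = data[:1]
    if tag == b"i":
        return int.from_bytes(data[1:9], "big", signed=True)
    if tag == b"B":
        size = data[1]
        return int.from_bytes(data[2:2 + size], "big", signed=True)
    raise ValueError("unknown tag")
```

Normal values take the efficient fixed-width route, and only values outside the 64 bit range pay for the variable-length form.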


- henrik

--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/
