On 2014-01-09 19:15, Trevor Vaughan wrote:
TL;DR; BigInteger/BigDecimal is the "right" thing to do, otherwise cap
at the client/server floor.

I have a few thoughts here:

1) I don't like losing precision in any case so a cap makes sense (maybe)

2) If you do cap, would you not want to cap to the lowest of the client
or server? I.e. if the client is a 32 bit system and the server is a 64
bit system, you'd cap at 32 bits.


There is no need to do that - today's systems handle both 32 and 64 bit values just fine - it's the max unsigned 64 bit int and values above that, and values smaller than -2^63, that cause problems. If you had such values today, they would not roundtrip through the system.

It does not matter all that much if a 32 bit system has a bit more work to do when adding 64 bit numbers - the main problems are serialization, and storage formats for efficient processing at larger scale (where 64 bit systems are indeed used).

3) There may be cases where someone needs higher precision numbers. I
can't think of them off hand, but I can guarantee that they'll happen so
adding BigInteger and BigDecimal are probably a good idea.


I also imagine them being needed - they are needed in various applications - I just wonder what the need may be in Puppet's domain.
(Total sum of diskspace in a report?)

Adding them as explicit types will work fine when we do need them BTW, but it requires a fair amount of work as there are many touchpoints in the system that have to deal with them.

4) For any fact that is retrieved that has multiple formats, I would
like to see a standard set of a hash for each size so that it is easier
to work with. Sure, right now, I can do variable mangling or post
retrieval math, but it's so very untidy.

disk_size => {
   '/dev/sda' => {
      'B' => 10737418240,
      'kB' => 10485760,
      'MB' => 10240,
      'GB' => 10,
   }
}

But then, how far do you take this? TB, PB? EB........?

We probably have to stop at Geopbyte since we are back at 'G' :-)
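
One possible answer to "how far do you take this?" is to stop at the last unit that still gives a whole number. The following is only a sketch of that idea in Python; the helper name `size_hash`, the unit list, and the use of binary multiples (1 kB = 1024 B, as in the figures above) are assumptions for illustration, not anything Facter does today.

```python
# Hypothetical helper: expand one byte count into a hash keyed by unit.
# Binary multiples (1 kB = 1024 B) are assumed, matching the example figures.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB"]

def size_hash(size_in_bytes):
    """Return {unit: value}, stopping at the last unit with a whole value."""
    result = {}
    value = size_in_bytes
    for unit in UNITS:
        result[unit] = value
        # Stop when the next unit would need a fraction (or is just zero).
        if value == 0 or value % 1024 != 0:
            break
        value //= 1024
    return result

disk_size = {"/dev/sda": size_hash(10737418240)}
print(disk_size)
```

For the 10 GB disk above this yields exactly the four entries B, kB, MB and GB, and no TB entry, since 10 is not a multiple of 1024.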

- henrik

On Mon, Sep 1, 2014 at 4:54 AM, Henrik Lindberg
<henrik.lindb...@cloudsmith.com <mailto:henrik.lindb...@cloudsmith.com>>
wrote:

    Hi,
    Recently I have been looking into serialization of various kinds,
    and the issue of how we represent and serialize/deserialize numbers
    has come up.

    TL;DR - I want to specify the max values of integers and floats in
    the puppet language for a number of reasons. Skip the background part
    to get to "Questions and Proposal" if you are already familiar with
    serialization formats, and issues regarding numeric representation.

    Background
    ---
    As you may know, Ruby has fluent handling of integers - if a number
    would overflow its current byte-size a larger representation will be
    used - i.e. from 32 to 64 to (ruby) BigInteger (unlimited). Floating
    point numbers do not undergo the same transition; a Ruby Float is
    always 64 bit, and BigDecimal (unlimited) has to be used explicitly.

    This is very flexible and helpful most of the time, but it creates
    problems when serializing / deserializing. Most serialization formats
    simply cannot deal with > 64 bit values as regular numbers. They may do
    horrible things like truncation, or use the max/min value if a value
    is too big, or for floating point drastically lose precision.
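
To make the three failure modes concrete, here is a small Python sketch (Python integers are unbounded, much like Ruby's). The function names are made up for illustration; they only mimic what a format or library limited to 64 bits might do to an oversized value.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1

def truncate_to_int64(n):
    """Keep only the low 64 bits and reinterpret them as a signed value."""
    low = n & (2**64 - 1)
    return low - 2**64 if low > INT64_MAX else low

def clamp_to_int64(n):
    """Substitute the max/min value when n is out of range."""
    return max(INT64_MIN, min(INT64_MAX, n))

too_big = 2**64 + 5

print(truncate_to_int64(too_big))          # high bits silently dropped
print(clamp_to_int64(too_big))             # pinned to 2**63 - 1
print(int(float(too_big)) == too_big)      # False: the float lost the low digits
```

In all three cases the value that comes back is not the value that went in, and none of them raises an error.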

    YAML
    - specifies integers to have arbitrary size, but recommends that an
    implementation uses its native integer size. The specification says:
    "In some languages (such as C), an integer may overflow the native
    type's storage capability. A YAML processor may reject such a value
    as an error, truncate it with a warning, or find some other manner
    to round-trip it. In general, integers representable using 32 binary
    digits should safely round-trip through most systems.".
    http://www.yaml.org/spec/1.2/spec.html

    For floating point values, only IEEE 32 bit are safe.

    In other words, it is unspecified, which means a YAML implementation
    may silently truncate numbers to the 32 bit max int (2,147,483,647)
    when running on a 32 bit machine (some implementations do, as noted
    in "gotchas" blog posts (google for it)).

    JSON
    - is similar to YAML in that it specifies a number to be an
    arbitrary number of digits and it is thus up to an implementation to
    bind this to a representation. It has the same problems as YAML.
    Notably, if used with JavaScript which only has Number for both
    Integer and Real, the largest integer number is 2^53 (after which it
    starts to lose precision).
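
The 2^53 limit can be demonstrated without a JavaScript engine. The sketch below uses Python's standard `json` module and passes `parse_int=float` to imitate a reader that only has 64 bit floating point numbers; that imitation is an assumption made for illustration, not how any particular JavaScript runtime is implemented.

```python
import json

LIMIT = 2**53

def survives_double(n):
    """True if integer n is unchanged after conversion to a 64 bit float."""
    return int(float(n)) == n

# The writer emits the exact digits ...
text = json.dumps({"size": LIMIT + 1})

# ... but a reader that maps every number to a double does not get them back.
as_doubles = json.loads(text, parse_int=float)

print(text)
print(as_doubles["size"] == float(LIMIT))
print(survives_double(LIMIT), survives_double(LIMIT + 1))
```

The JSON text itself is fine; the loss happens on the reading side, which is why the producer cannot detect it.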

    MsgPack
    - handles 8-16-32-64 bit integers (signed and unsigned) as well as
    32 and 64 bit floating point. Does not have built in BigInteger,
    BigDecimal types.
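
A format with fixed-width integer slots has nowhere to put a larger value. The sketch below uses Python's standard `struct` module as a stand-in for such a format; it is not the MsgPack library or its API, and `pack_fixed` is a hypothetical helper.

```python
import struct

def pack_fixed(n):
    """Pack n into 8 bytes, signed if possible, else unsigned.

    Returns (format, bytes), or None when n fits neither 64 bit slot.
    """
    for fmt in (">q", ">Q"):  # big-endian signed, then unsigned 64 bit
        try:
            return fmt, struct.pack(fmt, n)
        except (struct.error, OverflowError):
            continue
    return None

print(pack_fixed(2**63 - 1))   # fits the signed slot
print(pack_fixed(2**63))       # needs the unsigned slot
print(pack_fixed(2**64))       # None: no 64 bit slot can hold it
print(pack_fixed(-2**63 - 1))  # None: too small for signed, negative for unsigned
```

Anything that returns None here would need a custom extension (for example a string or byte-array encoding) to travel through such a format.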

    The Puppet Language Specification
    ---
    In the Puppet Language Specification the size and precision of
    numbers is currently specified as Ruby numbers (simply because this
    was easiest). This is sloppy and leaves edge cases for serialization
    and storage of data.

    Proposal
    ========
    I would like to cap a Puppet Integer to be a 64 bit signed value when
    used as a resource attribute, or anywhere in external formats. This
    means a value range of -2^63 to 2^63-1 which is in Exabyte range (1
    exabyte = 2^60 bytes).

    I would like to cap a Puppet Float to be a 64 bit (IEEE 754
    binary64) when used as a resource attribute or anywhere in external
    formats.

    With respect to intermediate results, I propose that we specify that
    values are of arbitrary size and that it is an error to store a
    value that is too big for the typed representation Integer (64 bit
    signed). For Float (64 bit) representation there is no error, but it
    loses precision. When specifying an attribute to have Number type,
    automatic conversion to Float (with loss of precision) takes place
    if an internal integer number is too big for the Integer representation.

    (Note, by default, attributes are typed as Any, which means that
    they by default would store a Float if the integer value
    representation overflows).
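
The storage rules proposed above can be sketched as a single function. This is only an illustration in Python under stated assumptions: the function name `store`, the string type names, and the use of `ValueError` are hypothetical and do not reflect how Puppet implements or would implement this.

```python
INT_MIN, INT_MAX = -2**63, 2**63 - 1

def store(value, attr_type="Any"):
    """Apply the proposed rules to an arbitrary-size integer result.

    attr_type is one of "Integer", "Number" or "Any" (the default).
    """
    if INT_MIN <= value <= INT_MAX:
        return value  # fits a 64 bit signed Integer, stored as is
    if attr_type == "Integer":
        raise ValueError("value does not fit a 64 bit signed Integer")
    # Number and Any: automatic conversion to Float, with loss of precision.
    return float(value)

print(store(2**63 - 1, "Integer"))  # stored unchanged
print(store(2**63, "Number"))       # becomes a float
print(store(2**63 + 1))             # Any: also a float, and the +1 is lost
```

Note that the last two calls return the same float, which is the precision loss the proposal accepts for Number and Any.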

    Questions
    =========
    * Is it important that JavaScript can be used to (accurately) read
    JSON generated by Puppet? (If so, the limit needs to be 2^53 or
    values lose precision).

    * Is it important in Puppet Manifests to handle values larger than
    2^63-1 (or smaller than -2^63), and if so, why isn't it sufficient
    to use a floating point value (with reduced precision)?

    * If you think Puppet needs to handle very large values (yottabyte
    sized disks?), should the language have convenient ways of expressing
    such values e.g. 42yb ?

    * Is it ok to automatically do transformation to floating point if
    values overflow, and the type of an attribute is Number? (as
    discussed above). I can imagine this making it difficult to
    efficiently represent an attribute in a database and support may
    vary between different database engines.

    * Do you think it is worth the trouble to add the types BigInteger
    and BigDecimal to the type system to allow the representation to be
    more precise? (Note that this makes it difficult to use standard
    number representation in serialization formats). This means that
    Number is not allowed as an attribute/storage type (user must choose
    Integer, Float, or one of the Big... types).

    * Do you think it should work as in Ruby? If so, are you ok with
    serialization that is non standard?
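
To get a feel for the sizes involved in the questions about very large values, here is a worked example in Python. The literal syntax (digits followed by a two letter suffix such as `42yb`) and the choice of binary multiples are assumptions for illustration; the Puppet language defines no such literals.

```python
import re

# Hypothetical suffixes, binary multiples assumed (1 kb = 2**10 bytes).
SUFFIXES = {"kb": 10, "mb": 20, "gb": 30, "tb": 40,
            "pb": 50, "eb": 60, "zb": 70, "yb": 80}

def parse_size(literal):
    """Turn a literal such as '42yb' into an exact byte count."""
    match = re.fullmatch(r"(\d+)([a-z]{2})", literal)
    if match is None or match.group(2) not in SUFFIXES:
        raise ValueError("not a size literal: " + literal)
    return int(match.group(1)) << SUFFIXES[match.group(2)]

value = parse_size("42yb")
fits_int64 = value <= 2**63 - 1
exact_as_float = int(float(value)) == value

print(value, fits_int64, exact_as_float)
print(int(float(value + 1)) == value)  # the extra byte vanishes in a float
```

A round figure like 42yb overflows a 64 bit Integer but happens to be exact as a 64 bit float, because it has few significant bits; a value one byte larger is not.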

    - henrik
    --

    Visit my Blog "Puppet on the Edge"
    http://puppet-on-the-edge.blogspot.se/

    --
    You received this message because you are subscribed to the Google
    Groups "Puppet Developers" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to puppet-dev+unsubscribe@googlegroups.com.
    To view this discussion on the web visit
    https://groups.google.com/d/msgid/puppet-dev/lu1c8m%24a2n%241%40ger.gmane.org.
    For more options, visit https://groups.google.com/d/optout.




--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvaug...@onyxpoint.com <mailto:tvaug...@onyxpoint.com>

-- This account not approved for unencrypted proprietary information --



--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/
