Re: [Puppet-dev] A question about numbers and representation

Trevor Vaughan Mon, 01 Sep 2014 10:17:00 -0700

TL;DR: BigInteger/BigDecimal is the "right" thing to do; otherwise cap at
the lower of the client and server limits.


I have a few thoughts here:

1) I don't like losing precision in any case so a cap makes sense (maybe)

2) If you do cap, would you not want to cap to the lower of the client and
server? I.e. if the client is a 32 bit system and the server is a 64 bit
system, you'd cap at 32 bits.

3) There may be cases where someone needs higher precision numbers. I can't
think of them offhand, but I can guarantee that they'll happen, so adding
BigInteger and BigDecimal is probably a good idea.

4) For any fact that can be expressed in multiple units, I would like to
see a standard hash with one entry per unit so that it is easier to work
with. Sure, right now, I can do variable mangling or post-retrieval math,
but it's so very untidy.

disk_size => {
  '/dev/sda' => {
    'B'  => 10737418240,
    'kB' => 10485760,
    'MB' => 10240,
    'GB' => 10,
  }
}

But then, how far do you take this? TB, PB, EB...?
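
A minimal sketch of how a per-unit hash like the one above could be derived
from a single byte count. The helper name `size_by_unit` and the unit list are
assumptions for illustration only, not an existing Facter API; units are
powers of 1024 to match the values shown above.

```python
# Hypothetical helper (not part of Facter): expand a byte count into a
# per-unit hash, stopping at the first unit that would be smaller than 1.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB"]


def size_by_unit(size_bytes, units=UNITS):
    """Return {unit: value} for every unit in which the size is >= 1."""
    result = {}
    for power, unit in enumerate(units):
        whole, remainder = divmod(size_bytes, 1024 ** power)
        if whole == 0:
            break
        # Keep exact integers where possible; fall back to a float otherwise.
        result[unit] = whole if remainder == 0 else size_bytes / 1024 ** power
    return result


disk_size = {"/dev/sda": size_by_unit(10737418240)}
```

Extending the unit list answers the "how far" question mechanically: larger
units simply never appear until a disk is big enough to need them.
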

Thanks,

Trevor


On Mon, Sep 1, 2014 at 4:54 AM, Henrik Lindberg <
henrik.lindb...@cloudsmith.com> wrote:

> Hi,
> Recently I have been looking into serialization of various kinds, and the
> issue of how we represent and serialize/deserialize numbers has come up.
>
> TL;DR - I want to specify the max values of integers and floats in the
> puppet language for a number of reasons. Skip the background part
> to get to "Questions and Proposal" if you are already familiar with
> serialization formats, and issues regarding numeric representation.
>
> Background
> ---
> As you may know, Ruby has fluent handling of numbers - if a number would
> overflow its current byte-size a larger representation will be used - i.e.
> from 32 to 64 to (ruby) BigInteger (unlimited). Floating point numbers
> undergo the same transition from 32 to 64 to BigDecimal (unlimited).
>
> This is very flexible and helpful most of the time, but it creates problems
> when serializing / deserializing. Most serialization formats simply
> cannot deal with values larger than 64 bits as regular numbers. They may do
> horrible things like truncation, or use the max/min value if a value is
> too big, or for floating point drastically lose precision.
>
> YAML
> - specifies integers to have arbitrary size, but recommends that an
> implementation uses its native integer size. The specification says:
> "In some languages (such as C), an integer may overflow the native type's
> storage capability. A YAML processor may reject such a value as an error,
> truncate it with a warning, or find some other manner to round-trip it. In
> general, integers representable using 32 binary digits should safely
> round-trip through most systems.". http://www.yaml.org/spec/1.2/spec.html
>
> For floating point values, only IEEE 32 bit are safe.
>
> In other words, it is unspecified, which means a YAML implementation may
> silently truncate numbers to the 32 bit max int (2,147,483,647) when
> running on a 32 bit machine (some implementations do, as noted as
> "gotchas" in blog posts (google for it)).
>
> JSON
> - is similar to YAML in that it specifies a number to be an arbitrary
> number of digits and it is thus up to an implementation to bind this to a
> representation. It has the same problems as YAML. Notably, if used with
> JavaScript which only has Number for both Integer and Real, the largest
> integer number is 2^53 (after which it starts to lose precision).
>
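
The 2^53 limit is a property of IEEE 754 binary64 rather than of JSON itself.
A short sketch in Python, whose `float` is also a binary64; passing
`parse_int=float` is only a way to simulate a reader that maps every number
to a double, as JavaScript does.

```python
import json

LIMIT = 2 ** 53

# 2^53 itself is exactly representable as a binary64 value...
assert int(float(LIMIT)) == LIMIT
# ...but 2^53 + 1 is not, and rounds back down to 2^53.
assert int(float(LIMIT + 1)) == LIMIT

# The JSON text carries all the digits; the loss happens in the reader.
text = json.dumps({"size": LIMIT + 1})
exact = json.loads(text)["size"]                   # arbitrary-size int
lossy = json.loads(text, parse_int=float)["size"]  # simulated double-only reader
```
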
> MsgPack
> - handles 8-16-32-64 bit integers (signed and unsigned) as well as 32 and
> 64 bit floating point. Does not have built in BigInteger, BigDecimal types.
>
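
A sketch of what a fixed-width 64 bit integer field implies, using Python's
stdlib `struct` module as a stand-in for a binary format (this is not the
MsgPack API, and `fits_int64` is a name made up for the example).

```python
import struct


def fits_int64(value):
    """True if value can be packed as a big-endian signed 64 bit integer."""
    try:
        struct.pack(">q", value)
        return True
    except (struct.error, OverflowError):
        # Out-of-range values are rejected rather than silently truncated.
        return False


# Values inside the range survive a pack/unpack round trip unchanged.
round_tripped = struct.unpack(">q", struct.pack(">q", 2 ** 63 - 1))[0]
```
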
> The Puppet Language Specification
> ---
> In the Puppet Language Specification the size and precision of numbers is
> currently specified as Ruby numbers (simply because this was easiest). This
> is sloppy and leaves edge cases for serialization and storage of data.
>
> Proposal
> ========
> I would like to cap a Puppet Integer to be a 64 bit signed value when used as
> a resource attribute, or anywhere in external formats. This means a value
> range of -2^63 to 2^63-1 which is in Exabyte range (1 exabyte = 2^60 bytes).
>
> I would like to cap a Puppet Float to be a 64 bit (IEEE 754 binary64) when
> used as a resource attribute or anywhere in external formats.
>
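
The arithmetic behind the proposed caps, as a quick sketch. The constant names
are assumptions for illustration; "exabyte" follows the 2^60 definition used
in the text.

```python
import sys

INT64_MIN = -2 ** 63
INT64_MAX = 2 ** 63 - 1
EXABYTE = 2 ** 60  # bytes, as defined in the proposal

# Largest whole number of 2^60-byte units that fits in a signed 64 bit value.
max_exabytes = INT64_MAX // EXABYTE

# Python's float is an IEEE 754 binary64, so this reports its 53 bit mantissa.
float_mantissa_bits = sys.float_info.mant_dig
```

So a 64 bit signed Integer can count bytes exactly up to just under 8 of
these exabytes, while a binary64 Float is exact only up to 2^53.
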
> With respect to intermediate results, I propose that we specify that
> values are of arbitrary size and that it is an error to store a value that
> is too big for the typed representation Integer (64 bit signed). For Float
> (64 bit) representation there is no error, but it loses precision. When
> specifying an attribute to have Number type, automatic conversion to Float
> (with loss of precision) takes place if an internal integer number is too
> big for the Integer representation.
>
> (Note, by default, attributes are typed as Any, which means that they by
> default would store a Float if the integer value representation overflows).
>
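
One way to read the proposed storage rules, as a rough sketch. The function
names `store_integer` and `store_number` are hypothetical and do not exist in
Puppet; they only model the behaviour described above.

```python
INT64_MIN = -2 ** 63
INT64_MAX = 2 ** 63 - 1


def store_integer(value):
    """Attribute typed Integer: an out-of-range value is an error."""
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError(f"{value} does not fit in a 64 bit signed Integer")
    return value


def store_number(value):
    """Attribute typed Number: an overflowing integer becomes a Float."""
    if isinstance(value, int) and not INT64_MIN <= value <= INT64_MAX:
        return float(value)  # may lose precision
    return value


big = 2 ** 64 + 1
stored = store_number(big)  # a float; the trailing +1 is lost
```

Note that in this sketch `float()` itself raises `OverflowError` for integers
beyond the binary64 range, so the Number path is not unlimited either.
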
> Questions
> =========
> * Is it important that Javascript can be used to (accurately) read JSON
> generated by Puppet? (If so, the limit needs to be 2^53 or values lose
> precision).
>
> * Is it important in Puppet Manifests to handle values larger than 2^63-1
> (or smaller than -2^63), and if so, why isn't it sufficient to use a
> floating point value (with reduced precision)?
>
> * If you think Puppet needs to handle very large values (yottabyte sized
> disks?), should the language have convenient ways of expressing
> such values e.g. 42yb ?
>
> * Is it ok to automatically do transformation to floating point if values
> overflow, and the type of an attribute is Number? (as discussed above). I
> can imagine this making it difficult to efficiently represent an attribute
> in a database and support may vary between different database engines.
>
> * Do you think it is worth the trouble to add the types BigInteger and
> BigDecimal to the type system to allow the representation to be more
> precise? (Note that this makes it difficult to use standard number
> representation in serialization formats). This means that Number is not
> allowed as an attribute/storage type (user must choose Integer, Float, or
> one of the Big... types).
>
> * Do you think it should work as in Ruby? If so, are you ok with
> serialization that is non standard?
>
> - henrik
> --
>
> Visit my Blog "Puppet on the Edge"
> http://puppet-on-the-edge.blogspot.se/
>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to puppet-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/puppet-dev/lu1c8m%24a2n%241%40ger.gmane.org.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvaug...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --

