[Puppet-dev] Re: The string to number torture never stops

Henrik Lindberg Tue, 04 Nov 2014 08:47:30 -0800

On 2014-03-11 18:19, John Bollinger wrote:



On Friday, October 31, 2014 8:22:36 PM UTC-5, henrik lindberg wrote:

[...]

    Yet again someone was bit by the automatic String to Numeric conversion
    that is in Puppet (and also in --parser future).



I must confess to a certain dark amusement at Puppet struggling with its
weak-typing legacy and moving more and more in the direction of strong
typing.  Not that I am in any way happy about these difficulties, but
I'm an old-school dinosaur, and weak typing has never seemed like such a
great idea to me.

It is never a great idea to have weak / no typing. It is an even worse
idea to have all types represented as strings...

[...]

    We fixed PUP-3602 by not converting strings that are floating point 0
    with exponential part and we also do not convert values that are
    floating point infinite in Ruby (e.g. 4e999 and such). This is a crutch
    though, and it is only a matter of time until someone stumbles over the
    next SURPRISE !



Indeed so.  If you rely on heuristics to choose behavior then you have
to accept that sometimes the wrong behavior will be chosen.  In this
case, it is purely a guess that a string starting with "0e[digit]" is
not meant to be interpreted as a number.


Yes, this was a bit of a panic fix. It really sucks.
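
For illustration, the shape of such a guard can be sketched in Ruby roughly as follows. This is only a sketch under assumptions: `guarded_to_number` is a hypothetical name, the patterns are simplified, and none of this is the actual code behind PUP-3602.

```ruby
# Hypothetical sketch of a heuristic string-to-number guard.
# Not Puppet's implementation; the patterns are deliberately simplified
# (no hex, no octal, no surrounding whitespace).
INTEGER_PATTERN = /\A[+-]?\d+\z/
FLOAT_PATTERN   = /\A[+-]?(?:\d+\.\d+(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+)\z/

def guarded_to_number(str)
  return Integer(str, 10) if str.match?(INTEGER_PATTERN)
  return str unless str.match?(FLOAT_PATTERN)

  value = Float(str)
  # Guess 1: a value too large for a Float (e.g. "4e999") stays a string.
  return str if value.infinite?
  # Guess 2: a zero written with an exponent (e.g. "0e123") stays a string.
  return str if value.zero? && str.match?(/[eE]/)

  value
end
```

Here "0e123" and "4e999" come back unchanged as strings while "1e3" still becomes a number, which shows how arbitrary the boundary is.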

    The best cure is naturally to never do String to numeric conversion.



What about numeric to string conversion?  I guess Puppet doesn't do that
automatically now, so maybe this isn't the time to start, but that /is/
an alternative approach to mixed string/number comparison.

True, 4.0.0 will not do numeric to string automatically (because of
radix, precision and floating point formatting). I do not see that
coming back.
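
To show why there is no single obvious string form, here are a few lines of plain Ruby. They are used only for illustration, with arbitrarily chosen values; nothing here is Puppet code.

```ruby
# One number, several defensible string forms.
n = 255
decimal  = n.to_s              # "255"
hex      = format("%x", n)     # "ff"
prefixed = format("0x%x", n)   # "0xff"

# Floats add precision and notation choices on top of that.
f = 0.1 + 0.2
full    = f.to_s               # "0.30000000000000004"
rounded = format("%.2f", f)    # "0.30"
```

An automatic conversion would have to pick one of these silently, and any choice would be wrong for someone.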

    And we wonder what people feel about that. Should we go for this in
    Puppet 4.0.0 (and have a 3.7.4 release as the last of the 3x series
    where this behavior is implemented when using --parser future)?



I think avoiding automatic string to numeric conversion is consistent
with forbidding bare strings starting with a digit.


Good point.

It would be a lot
better, though, if in the context of the manifest it were clearer which
expressions are strings and which are numeric.  That's no problem for
literals, of course, but none of this is interesting in the case where
all the values involved are literals.  I (think I) understand that with
the P4 parser and evaluator it will be possible to declare types
specifically enough to address that issue, but it is also my
understanding that expressions won't /necessarily/ have formal types
specific enough for that.

It is a bit difficult since operators are overloaded on type. The good
part is that if we stop transforming strings to numbers there will be
errors for arithmetic expressions.

The bad part is that ==, != cannot raise errors (since a string is
simply not equal to a number). Currently comparisons order all numbers
to be smaller than all strings. We could change those to instead error
if the types are not comparable to each other.
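
As a rough Ruby sketch of that split (equality never raises, ordering raises on mixed types). The names `strict_equal` and `strict_compare` are hypothetical and this is not how the Puppet evaluator is written.

```ruby
# Hypothetical sketch: equality is always answerable, ordering is not.
def strict_equal(a, b)
  # A string is simply not equal to a number; Integer and Float still
  # compare by value (0 == 0.0).
  a == b
end

def strict_compare(a, b)
  both_numeric = a.is_a?(Numeric) && b.is_a?(Numeric)
  both_strings = a.is_a?(String) && b.is_a?(String)
  unless both_numeric || both_strings
    raise ArgumentError, "cannot compare #{a.class} with #{b.class}"
  end
  a <=> b
end
```

With this, `strict_equal("42", 42)` is just false, while `strict_compare("10", 9)` fails loudly instead of silently ordering the number before the string.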

It is highly desirable to give manifest authors sufficient control over
conversions to avoid unwanted ones, but it is not altogether clear to me
whether the best approach is to nix all automatic string-to-number
conversions.  A lot of existing manifests rely on such conversions,
since they used to be the only alternative.  P4 is at liberty to break
backward compatibility, but maybe a little less breakage would be wiser?

I am not so worried about the logic in the manifests themselves. There
have not been that many problems reported with respect to the
conversion in the other direction. For resource attributes, however,
the situation is worse, since there are many types out there where it
is unknown how they deal with data types, what sort of munging /
processing they do of strings/numbers, etc.

This would be the primary reason (IMO) to not do this until resource
types can type their parameters. (Typed parameters direct
serialization, so there is no longer a question of whether a serialized
"42" should be a number or a string.)

    * Add === operator to compare both type and value. This is a slippery
    slope since we probably want Integer and Float to compare equal - say
    0.0 and 0. It adds yet another operator, and we have to decide what
    case, selector and in should use since there is no way to specify if
    one or the other should be used.



I agree that as proposed, the '===' operator would be troublesome.
There is always the alternative, though, of keeping '==' as it is, and
making '===' simply perform a comparison without string/number
conversion.  I think 'case', selector, and 'in' behavior are
collectively a red herring, though: if '===' were adopted in any form
then 'case', selector, and 'in' behavior would still be whatever is
specified for them, whether that's their current behavior or a variant
one based on '==='.  There is no requirement that that behavior be
selectable between different senses of equality.

I'm not necessarily advocating that solution, but I think it's
appropriate to take a careful, unbiased look at all the alternatives.
I'm not certain they're all on the table, yet.  For example, how about this:

* The value of an expression may be converted to a different type only
to the extent that the target type is consistent with the expression's
/formal/ type.

For example, if a class parameter is declared to be type String then
its value cannot be automatically converted to Numeric or any of its
subtypes, but if it is type Scalar or Object and happens to /contain/ a
string, then the string value /can/ be automatically converted to a
number (a Float, for instance).  That could yield backward compatibility
for existing manifests that do not declare types, while still allowing
authors to control the allowed conversions.

Interesting idea, but problematic to implement with good performance.
Now the type of an expression is encoded in the resulting instance /
value. Adding the more advanced "declared type narrows conversion", I
think it is required to also keep track of the declared type of every
(intermediate) value. All functions must declare their return type etc.
Since no functions do that now, we would basically always operate on
the Any type, and all conversions would be allowed.
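
The extra bookkeeping can be sketched in Ruby along these lines. Everything here is hypothetical: the symbols merely stand in for Puppet types, and `TypedValue` and `number_or_original` are made-up names, not existing code.

```ruby
# Hypothetical sketch: every value carries its declared (formal) type.
TypedValue = Struct.new(:declared, :value)

# Declared types under which a contained string may become a number.
CONVERTIBLE = [:Any, :Scalar].freeze

def number_or_original(typed)
  return typed.value unless typed.value.is_a?(String)
  return typed.value unless CONVERTIBLE.include?(typed.declared)

  begin
    Float(typed.value)
  rescue ArgumentError
    typed.value   # not numeric text, leave it alone
  end
end
```

A value declared as String stays "42", while the same string declared as Any becomes 42.0. Since undeclared values would all be Any, in practice nearly everything would still convert.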

    Instead, since we already have sprintf for value to string conversion,
    and there is a scanf in Ruby which does the conversion the other
    way, we can simply add that function to core puppet for value
    conversion.

    We could also add to_number(s), or to_number(s, format_string).
    When doing that though, we risk ending up with a plethora of to_xxx
    functions, and we could instead offer one more universal
    convert_to(Type, value, options) function e.g.

    # best effort, or fail
    convert_to(Number, value)

    # only if it is an integer, or fail
    convert_to(Integer, value, {base => 10})

    # convert integer to hex string with some extra text (the sprintf way)
    convert_to(String, value, {format => "Hex %x"})

    # convert to array of strings (giving full control over all
    # nested conversions)
    #
    convert_to(Array[String], value, {
        Integer => { format => "0x%x" },
        Float   => { format => "%#.4G" }})

    # etc.



I do think there need to be type conversion facilities in some form.
I'm inclined to like the idea of a generic facility based on types, as
opposed to a host of specific conversion functions.
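
To make the single-entry-point idea concrete, here is a small Ruby model of it. It is only a sketch: Ruby classes stand in for Puppet types, the `base` and `format` option names mirror the examples above, and no such function exists in Puppet at this point.

```ruby
# Hypothetical Ruby model of convert_to(Type, value, options).
def convert_to(type, value, options = {})
  text = value.to_s
  if type == Integer
    # Only if it is an integer in the given base, or fail.
    Integer(text, options.fetch(:base, 10))
  elsif type == Float
    Float(text)
  elsif type == Numeric
    # Best effort: try Integer first, then Float, or fail.
    begin
      Integer(text, 10)
    rescue ArgumentError
      Float(text)
    end
  elsif type == String
    # The sprintf way; defaults to plain "%s".
    format(options.fetch(:format, "%s"), value)
  else
    raise ArgumentError, "no conversion to #{type}"
  end
end
```

For example, `convert_to(Integer, "ff", base: 16)` gives 255 and `convert_to(String, 255, format: "Hex %x")` gives "Hex ff", while `convert_to(Integer, "3.5")` raises. The nested Array[String] case is left out of the sketch.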

    So - "Boldly break all the (s)t(r)hings"?



I don't know how costly it would be to implement, but I rather like the
combination of control, flexibility, and backward compatibility that
could be afforded by using formal types to limit the scope of allowed
conversions, instead of altogether forbidding (some) automatic conversions.

In general I think narrowing the conversions to declared type would be
difficult. It could possibly be done when passing arguments to
functions, defined types, or parameterized classes. Currently it only
checks for type compliance, and I don't have any immediate ideas for how
to specify type conversion except something like specifying a lambda per
parameter to do type conversion, or special type conversion functions
that apply in different scopes. I think that adds complexity that is
more difficult to deal with than explicit conversion (i.e. accept a
broader type, then convert/assert inside the body of the construct).


- henrik

--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/
