Seeing as how lots of folks are on the road, and you can hear the on-list crickets chirping, I'm not sure how much can be accomplished, but I'll repost this as one of those perennial things-which-really-need-to-be-decided, because lots of stuff depends on it in terms of basic A2 examples and best practices.


Namely: how do the most basic Perl6 types interact with each other, which types can and can't be converted automatically, and what is the philosophy behind it all? Is conversion supposed to be silent and DWIMmy? Is it supposed to be pedantic enough to prohibit "lossy" conversions? Is there more than one level of strictness? Who controls it -- the caller, or the callee?


(i) ---- Type Matrix ----


The following matrix depicts the basic P6 scalar types, and the "kinds" of conversions that may/must take place between them. Key is as follows:

+: automatic conversion; not lossy
*: undefness and properties will be lost
N: numeric range or precision may be lost (esp. bigints, bignums)
F: numeric (float) conversion -- conversion to int is lossy
S: string conversion -- if string is not *entirely* numeric, is lossy
B: boolean conversion -- loses all but true/false
J: junction type; coercing to non-junction type may be lossy


FROM ->  str  Str  int  Int  num  Num  bit  Bit  bool  Bool Scalar
TO: str   -    *    +    *    +    *    +    *    +     *    *J
    Str   +    -    +    +    +    +    +    +    +     +     J
    int   S   *S    -   *N    F  *NF    +    *    +     *    *J
    Int   S    S    +    -    F    F    +    +    +     +     J
    num   S   *S    +    *    -   *N    +    *    +     *    *J
    Num   S    S    +    +    +    -    +    +    +     +     J
    bit   B   *B    B   *B    B   *B    -    *    +     *    *J
    Bit   B    B    B    B    B    B    +    -    +     +     J
    bool  B   *B    B   *B    B   *B    +    *    -     *    *J
    Bool  B    B    B    B    B    B    +    +    +     -     J
  Scalar  +    +    +    +    +    +    +    +    +     +     -


(ii) ---- Initial Assumptions ----


I previously proposed simplifying the matrix using the following Big Assumptions. This was not universally agreed upon, however; it may be that these assumptions are instead controlled by pragma (see section (iv)). I include them here separately for reference.

*: (undefness and properties lost)

Using/converting an uppercase type as/to a lowercase (primitive) type is silently allowed. If you're sending an Int to something that requires an C<int>, you know that the 'something' can't deal with the undef case anyway -- it doesn't differentiate between undef and zero. Thus, you meant to do that: it's an "intentionally destructive" narrowing, and the C<undef> becomes a C<0>.

  my Int $a = undef;
  my int $b = $a;     # $b is now C<0>, NOT C<undef>


B: (conversion to boolean)


Converting to/from a bit or bool value is silently allowed. The Perl5 rules for "truth" are preserved, such that:

  my bool $b = undef;  # $b is C<0>
  my bool $b = 0;      # $b is C<0>
  my bool $b = 1;      # $b is C<1>
  my bool $b = -5;     # $b is C<1>
  my bool $b = 'foo';  # $b is C<1>

Converting a C<bit> or C<bool> to any other type always results in C<0> or C<1> for numeric conversions, or C<'0'> or C<'1'> for string conversions.
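
So, for example, assuming the rules above hold (the results shown for bool-to-str and bool-to-num are just my reading of this, not settled syntax):

   my bool $b = 'foo';   # true
   my str  $s = $b;      # $s is C<'1'>
   my num  $n = $b;      # $n is C<1>

   my bool $c = 0;       # false
   my str  $t = $c;      # $t is C<'0'>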


J: (scalar junctive to typed scalar)


A scalar junctive, i.e. an "untyped" scalar, can always be silently used as and/or converted to a more specific primitive type. This will quite frequently result in the loss of information; for example, saying:

   my     $a = 'foo';
   my int $b = $a;    # $b is now C<0>

works, but silently sets $b to 0, because the numeric value of C<'foo'> is C<0 but true>.

This means that using untyped scalars gets you back to Perl5 behavior of 'silently' accepting pretty much any conversion you can think of.

   my str $a = 'foo';
   my     $b = $a;
   my int $c = $a;    # COMPILE TIME ERROR
   my int $c = $b;    # OK

If you are using typed variables to enforce strict conversions, you probably want to be warned if you are using any untyped variables, anywhere. Something like:

   use strict types;
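
Purely as an illustration of the intent -- nothing here is settled -- such a pragma might flag something like:

   use strict types;

   my Int $i = 5;     # OK: explicitly typed
   my     $x = 5;     # warning or error: untyped variable in a strict-types scope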


(iii) ---- Simplified Matrix ----


The above assumptions result in a simplified conversion matrix, as follows:

FROM ->  str  Str  int  Int  num  Num  bit  Bit  bool  Bool Scalar
TO: str   -    +    +    +    +    +    +    +    +     +     +
    Str   +    -    +    +    +    +    +    +    +     +     +
    int   S    S    -    N    F   NF    +    +    +     +     +
    Int   S    S    +    -    F    F    +    +    +     +     +
    num   S    S    +    +    -    N    +    +    +     +     +
    Num   S    S    +    +    +    -    +    +    +     +     +
    bit   +    +    +    +    +    +    -    +    +     +     +
    Bit   +    +    +    +    +    +    +    -    +     +     +
    bool  +    +    +    +    +    +    +    +    -     +     +
    Bool  +    +    +    +    +    +    +    +    +     -     +
  Scalar  +    +    +    +    +    +    +    +    +     +     -

This leaves three "tricky" kinds of conversions -- S, F, and N -- to be dealt with.


(iv) ---- Pragma-Controlled Conversions ----


Given the above, and given the fact that our general answer to everything is to Use A Pragma ;-), we have the following tentative list of needed pragmas. For each possibility, we need to be able to declare that a given implicit type conversion will be silently allowed, will result in a warning, or will result in an exception.

A (very) rough proposed pragma form, for the sake of argument, is:

   use strict conversions;     # all on  (exceptions)
   no  strict conversions;     # all off

   use strict conversions allow => << cv1 cv2 ... >>;   # selected conversions are allowed
   use strict conversions warn  => << cv1 cv2 ... >>;   # selected conversions give warnings
   use strict conversions fail  => << cv1 cv2 ... >>;   # selected conversions give exceptions



S: (string to numeric)


Any given string may not be entirely numeric. In P5, a string like '1234foo' numifies to 1234, but you don't always want that ... sometimes, you want to throw an error if the string isn't 'cleanly' a number. Proposed pragma variants (exact P6 pragma syntax unknown):

   use strict conversions allow => << str_to_num >>;   # which is the default?
   use strict conversions warn  => << str_to_num >>;
   use strict conversions fail  => << str_to_num >>;
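
As a purely illustrative sketch of what the C<fail> setting would mean here:

   use strict conversions fail => << str_to_num >>;

   my str $clean = '1234';
   my str $messy = '1234foo';

   my int $i = $clean;   # OK: the string is cleanly numeric
   my int $j = $messy;   # exception: lossy string-to-number conversion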


F: (float to int)

Historically, accidentally using a float as an int can be a significant source of errors. Proposed pragma variants:

   use strict conversions allow => << num_to_int >>;   # which is the default?
   use strict conversions warn  => << num_to_int >>;
   use strict conversions fail  => << num_to_int >>;
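
Again, a rough sketch of the intent, under the C<warn> setting:

   use strict conversions warn => << num_to_int >>;

   my num $n = 3.5;
   my int $i = $n;    # warning: the fractional part is silently discarded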


N: (numeric range)

This one is a giant pain. Converting, say, an Int to an int will in fact fail to do the right thing if you're in BigInt territory, since the number would have to be truncated to fit in a standard C<int>. But 99% of the time you won't be working with numbers like that, so it would seem a horrible thing to disallow Int --> int and Num --> num conversions on the remote chance you *might* be hitting the range boundary. Then again, it would seem equally horrible to hit the range boundary and not be informed of that fact. Thus, deciding the default state here will be a challenge:

   use strict conversions allow => << Int_to_int >>;   # which is the default?
   use strict conversions warn  => << Int_to_int >>;
   use strict conversions fail  => << Int_to_int >>;

   use strict conversions allow => << Num_to_num >>;   # which is the default?
   use strict conversions warn  => << Num_to_num >>;
   use strict conversions fail  => << Num_to_num >>;
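
Just to make the tradeoff concrete, a sketch of the C<warn> case (the particular bigint value is arbitrary):

   use strict conversions warn => << Int_to_int >>;

   my Int $small = 42;
   my Int $huge  = 123_456_789_012_345_678_901_234_567_890;   # BigInt territory

   my int $a = $small;   # fine -- well within native range
   my int $b = $huge;    # warning: value won't fit in a standard C<int>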



(v) ---- Alternative Pragma Form ----


An alternative pragma form could possibly allow finer control over every individual possible conversion. The disadvantage of this form is that it would be very difficult to "correctly" set each of the >100 cells of the matrix, or even the 14 "critical" cells that are most likely to need adjusting:

   use strict conversions allow {
       str => int,
       str => Int,
       str => num,
       str => Num,
       Str => int,
       Str => Int,
       Str => num,
       Str => Num,
       Int => int,
       num => int,
       num => Int,
       Num => int,
       Num => Int,
       Num => num,
   };


(vi) ---- Conversions of User Defined Types/Classes ----


It may be useful to allow the same level of pragma-based control for user-defined types and classes. For example, a given class Foo may wish to be "silently" convertible to an C<int>. One proposed syntax for declaring the method of coercion/conversion might be:

    class Foo {
        ...

        to int {...}      # or C<as int {...}>?
    }
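
Usage would then presumably look something like this (C<Foo.new> is just placeholder constructor syntax):

    my Foo $foo = Foo.new;
    my int $i   = $foo;    # implicitly calls Foo's C<to int {...}> conversion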

However, users of such a class could adjust the warning level of the given conversion using the alternative syntax given in (v) above:

   use strict conversions warn { Foo => int };

----

Comments?

MikeL


