Here's me thinking out loud.  I'm thinking about how to avoid a lot of
explicit type casting without entering a maze of twisty typecasting
rules, all different.


Imagine we have a typing system where types are allowed to
automatically cast AS LONG AS NO INFORMATION IS LOST.

So let's start with something simple.  Int, Num [1], String.

All Ints can auto-cast to Numbers, no information lost.

All Ints and Numbers can auto-cast to Strings, again nothing is lost.
This is similar to how Perl works right now.

Strings can only auto-cast to Ints if the string looks like an integer.
Ditto String -> Num.  Again, similar to how Perl works right now.
Obviously this will be a run-time type check.

So:

        my Int    $int = 4;     # Int -> Int assignment, ok
        my String $foo = $int;  # implicit Int -> String cast, ok

        print $int;             # implicit Int -> String cast, ok
        print $int + $foo;      # $foo is implicitly cast from String -> Int
                                #     since no information is lost.
                                # $int + $foo is integer math.
                                # The result is cast to a String.
        my Num    $dec = 4.2;   # Num -> Num assignment, ok
        $int = $dec;            # *error* 4.2 would have to be truncated
                                #     to fit into $int.
        $int = int($dec);       # This is ok.  int takes a Num and returns
                                #     an Int.
        $dec = $int;            # implicit Int -> Num cast, ok.

About the only thing remotely radical there is that Num -> Int won't
work.  The key is, never lose information.
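
To make the rule concrete, here is a rough Perl 5 sketch of the
run-time half of the check.  It is only a model of the idea: the names
fits_type() and assign() are made up for illustration, and the type
names are plain strings rather than real declarations.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $value can be stored in a slot of type
# $type ('Int', 'Num' or 'String') without losing information.
sub fits_type {
    my ($value, $type) = @_;
    return 0 unless defined $value;
    return 1 if $type eq 'String';                  # everything stringifies
    return $value =~ /\A[+-]?\d+\z/ ? 1 : 0
        if $type eq 'Int';                          # must look like an integer
    return $value =~ /\A[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?\z/ ? 1 : 0
        if $type eq 'Num';                          # must look like a number
    die "unknown type '$type'\n";
}

# Hypothetical checked assignment: dies instead of silently truncating.
sub assign {
    my ($type, $value) = @_;
    die "cannot store '$value' as $type without losing information\n"
        unless fits_type($value, $type);
    return $value;
}

my $int = assign(Int    => 4);              # Int -> Int, ok
my $foo = assign(String => $int);           # Int -> String, ok
my $dec = assign(Num    => 4.2);            # Num -> Num, ok
$dec    = assign(Num    => $int);           # Int -> Num, ok
my $ok  = eval { assign(Int => 4.2); 1 };   # refused, 4.2 would be truncated
print "Num -> Int refused: $@" unless $ok;
```

Because the sketch checks how the value looks as a string, a Perl 5
number written as 4.0 stringifies to "4" and would pass the Int check.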

I'm pondering this being okay:

        my Num    $dec = 4.0;
        my Int    $int = $dec;  # Num -> Int okay since 4.0 truncates to 4
                                #     with no(?) information lost

but I think it will complicate my Grand Plan which I'm not ready to
reveal.


Now, obviously there's no need to declare hashes, arrays and scalars
as being such, we have the sigils.  But references we do need to declare.

        my HASH  $ref = \%hash; # ok
        my ARRAY $ref = \%hash; # *error*

HASH, ARRAY, SCALAR, GLOB, etc... references do not cast.

        my SCALAR $ref = \$foo;
        print $ref;             # *error* implicit SCALAR -> String cast
                                #    illegal.

You'd have to do an explicit typecast (syntax left as an exercise).
Given that most times when you try to use a reference as a string
you're making a mistake, this shouldn't be a big deal.
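
As a run-time approximation in today's Perl 5, the same two rules
might look like the sketch below.  assign_ref() and to_string() are
hypothetical names, and reftype() from Scalar::Util stands in for
whatever the compiler would really know about the reference.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Hypothetical checked declaration for references: the kind of referent
# must match exactly, there is no cast between kinds.
sub assign_ref {
    my ($kind, $ref) = @_;
    my $got = reftype($ref) // 'not a reference';
    die "expected $kind reference, got $got\n" unless $got eq $kind;
    return $ref;
}

# Hypothetical implicit stringification: refuses references outright.
sub to_string {
    my ($value) = @_;
    die "implicit reference -> String cast is illegal\n" if ref $value;
    return "$value";
}

my %hash = (foo => 1);
my $href = assign_ref(HASH => \%hash);                  # ok
my $ok   = eval { assign_ref(ARRAY => \%hash); 1 };     # refused
print "refused: $@" unless $ok;
$ok = eval { to_string($href); 1 };                     # refused
print "refused: $@" unless $ok;
```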

For arrays, it's probably enough that the whole array be of a single type:

        my Int @array = (1..10);        # array of integers
        $array[20] = 4.2;               # *error*, Nums not allowed.

This provides some nice optimizations.  You can do the same for hashes:

        my Int %hash = (foo => 1, bar => 2);    # hash of integers
        $hash{baz} = 'yarrow';                  # *error* Strings not allowed

but we like to mix types up in hashes.  One way we can allow this is
to have a generic type, Value (or perhaps Scalar) that every type will
cast into.  Hash keys would be implicitly of type Value.

        my Value %hash = (foo => 1, bar => 2, baz => 'yarrow');

but that's pretty much the same as switching off strong typing.  So
obviously hashes have to have per-key types.  I will leave the syntax
of that as an exercise for the reader.  It would have to make it easy
to declare lists of keys as being a single type.  If we come up with
something nice here, it can probably be backported onto arrays.
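
Without committing to any syntax, the behaviour of per-key types can
be mocked up in Perl 5 with a lookup table.  Everything named here
(declare_keys(), typed_store(), the %type_of table) is invented for
the sketch and is not a proposal for the real declaration.

```perl
use strict;
use warnings;

# Hypothetical per-key type table for one hash.
my %type_of;

# Declare a whole list of keys as being a single type.
sub declare_keys {
    my ($type, @keys) = @_;
    $type_of{$_} = $type for @keys;
    return;
}

# Run-time checks standing in for the real type rules.
my %check = (
    Int    => sub { defined $_[0] && $_[0] =~ /\A[+-]?\d+\z/ },
    String => sub { defined $_[0] && !ref $_[0] },
);

# Store a value only if it fits the type declared for that key.
sub typed_store {
    my ($hash, $key, $value) = @_;
    my $type = $type_of{$key} or die "no type declared for key '$key'\n";
    die "key '$key' is $type, cannot store '$value'\n"
        unless $check{$type}->($value);
    $hash->{$key} = $value;
    return;
}

declare_keys(Int    => qw(foo bar));
declare_keys(String => qw(baz));

my %hash;
typed_store(\%hash, foo => 1);                              # ok
typed_store(\%hash, baz => 'yarrow');                       # ok
my $ok = eval { typed_store(\%hash, bar => 'yarrow'); 1 };  # refused
print "refused: $@" unless $ok;
```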


Regexes pose an interesting problem.  m// and s/// would take a
String.  Num and Int can both auto-cast to a String fine, but
consider...

        my Int $foo = 42;
        $foo =~ s/\d+/Basset Hounds/;   # *run-time error*

Sure, $foo will cast to a String and be operated on, but then the
regex changes it to 'Basset Hounds' and tries to put that back into
$foo.  *BANG* Run time type error.  I think that's the best we can do
with regexes.  Perhaps there could be an option for the typing system
which says "s/// only takes Strings and does not cast" turning the
above into a compile-time error.
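
The run-time behaviour described above could be modelled in Perl 5
roughly like this.  subst_int() is a hypothetical stand-in for what
s/// would do internally on an Int variable: cast to String, operate
on the copy, and only write back if the result is still an Int.

```perl
use strict;
use warnings;

# Hypothetical model of s/// applied to an Int variable.
sub subst_int {
    my ($int_ref, $pattern, $replacement) = @_;
    my $copy = ${$int_ref};                 # implicit Int -> String cast
    $copy =~ s/$pattern/$replacement/;      # operate on the String copy
    die "run-time type error: '$copy' is not an Int\n"
        unless $copy =~ /\A[+-]?\d+\z/;
    ${$int_ref} = $copy;                    # write back only if still an Int
    return;
}

my $foo = 42;
subst_int(\$foo, qr/\d+/, '7');             # ok, result looks like an Int
my $ok = eval { subst_int(\$foo, qr/\d+/, 'Basset Hounds'); 1 };
print "refused: $@" unless $ok;             # $foo keeps its old value
```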

At least it does prevent some silly errors:

        my HASH $ref = \%hash;
        $ref =~ m/.../;                 # *error*


Now, here's an example of something that might be really annoying to
get right.  Let's say localtime() returns a hash in Perl 6 (sensible).
Let's also say that not only does it return the year, mday, hour, min,
sec, etc... as integers, it also returns things like the name of the
month, name of the day, etc...

        my %time = localtime;
        # "Today is Monday, July 9, 2001"
        printf "Today is %s, %s %d, %d", @time{qw(dow month mday year)};

What should the return signature of localtime look like?  Obviously it
should return a hash, so that much checks out.  But each of the keys
should be typed as well, String or Int.  It would be really, really
annoying to have to set up the hash and all its keys just right so it
can accept the return value from localtime.  Instead, perhaps Perl can
simply imply its type from the declaration:

        my %time = localtime;   # %time's type declaration is implied by
                                #    localtime's return signature AT
                                #    COMPILE TIME!
        %time = gmtime;         # ok, localtime and gmtime have the same
                                #    signature.
        $time{mday}++;          # ok, $time{mday} is an Int
        $time{mday} .= 'foofer'; # *run-time error*  Implicit $time{mday}
                                #    Int -> String cast ok for the string
                                #    append, but trying to convert the
                                #    String "10foofer" back to an Int fails.

So a newly declared variable can imply its type based on its initial
function assignment.  You might even take this further to work with
other variables.

        my String $moo = 'foo';
        my $bar = $moo;         # $bar implied to be String

However, if there is no type and nothing to imply it from, that's an
error.

        my $foo;                # *error* forgot to declare a type.

We could have Perl go through heroics to try and find $foo's first
assignment and imply a type from that, but I think that will rapidly
get Messy and Surprising.

I think this implied type from declared assignment is Super Cool.  It
means you only have to declare a few initial variables and the rest
can largely have their types implied.  More DWIM, more concise, less
annoying type gyrations, more chance hard-core Perl programmers will
accept strong types.
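
Here is a toy Perl 5 model of the bookkeeping this would need,
assuming the compiler keeps a table of return signatures.  The table
contents and the names implied_type() and same_signature() are all
made up for illustration; only a few of the keys are shown.

```perl
use strict;
use warnings;

# Hypothetical return signatures, as a compiler might record them.
my %time_sig = (year => 'Int', mday => 'Int', month => 'String', dow => 'String');
my %signature_of = ('localtime' => {%time_sig}, 'gmtime' => {%time_sig});

# "my %time = localtime;"  The new variable takes on a copy of the
# function's return signature as its own type.
sub implied_type {
    my ($func) = @_;
    my $sig = $signature_of{$func} or die "no signature known for $func\n";
    my %copy = %{$sig};
    return \%copy;
}

# Later assignments must come from something with an identical signature.
sub same_signature {
    my ($x, $y) = @_;
    return 0 unless scalar(keys %{$x}) == scalar(keys %{$y});
    for my $key (keys %{$x}) {
        return 0 unless exists $y->{$key} && $y->{$key} eq $x->{$key};
    }
    return 1;
}

my $time_type = implied_type('localtime');
print "gmtime assignable\n"
    if same_signature($time_type, $signature_of{'gmtime'});
print "mday is implied to be $time_type->{mday}\n";
```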


I'm going to stop here and digest.  I'm sure you can see where this is
all headed: I want strong typing but I want it as implicit and
unambiguous as possible (goals seemingly at odds).  I think if we bang
our heads into it enough, nice tricks like the implied type from
declaration will fall out.


[1] I'm calling floating point numbers just 'Numbers' or 'Num' for
short in my little imaginary typing system.  What we call a float is
what everyone else thinks of as a normal number (you may find this
shocking, but most people don't understand what a floating point
number is).  Internally Numbers would naturally upgrade themselves
from floats to doubles to BigFloats without the programmer having to
know.  Perhaps there would be a Float alias.

[2] *error* is compile-time error.

[3] This all, of course, is optional.

-- 
Michael G Schwern   <[EMAIL PROTECTED]>   http://www.pobox.com/~schwern/
Perl6 Quality Assurance     <[EMAIL PROTECTED]>       Kwalitee Is Job One
