Re: "Implied types, first try." Or "Its amazing what you can do with potatoes"

Steve Fink Mon, 09 Jul 2001 20:30:31 -0700
[EMAIL PROTECTED] wrote:

> Here's me thinking out loud.  I'm thinking about how to avoid a lot of
> explicit type casting without entering a maze of twisty typecasting
> rules, all different.
> 
> 
> Imagine we have a typing system where types are allowed to
> automatically cast AS LONG AS NO INFORMATION IS LOST.


User-defined types? Do you tell them what they can accept without loss, 
or just have them in the big dusty Other box?

This really just forms a type hierarchy, where child-of requires 
can-be-losslessly-converted. May as well just create an explicit 
hierarchy, and maybe make it extensible with a strong suggestion that 
any additions follow certain rules.

>       my Int    $int = 4;     # Int -> Int assignment, ok
>       my String $foo = $int;  # implicit Int -> String cast, ok
> 
>       print $int;             # implicit Int -> String cast, ok
>       print $int + $foo;      # $foo is implicitly cast from String -> Int
>                               #     since no information is lost.
>                               # $int + $foo is integer math.
>                               # The result is cast to a String.
>       my Num    $dec = 4.2;   # Num -> Num assignment, ok
>       $int = $dec;            # *error* 4.2 would have to be truncated
>                               #     to fit into $int.
>       $int = int($dec);       # This is ok.  int takes a Num and returns
>                               #     an Int.
>       $dec = $int;            # implicit Int -> Num cast, ok.


Ok, fine.

> I'm pondering this being okay:
> 
>       my Num    $dec = 4.0;
>       my Int    $int = $dec;  # Num -> Int okay since 4.0 truncates to 4
>                               #     with no(?) information lost
> 
> but I think it will complicate my Grand Plan which I'm not ready to
> reveal.


Compile-time or run-time?

my Num $x = 3.0;
$x++;
my Int $y = $x;

Could be compile-time, if you do constant folding first.

Allow me to ramble:

You can think of having a universe of (an infinite number of) 'values'. 
A type is a subset of those values. Things like Int, Num, and String are 
just labels of subsets. So rather than thinking of a subtype (or a 
safely castable type) of another type, think of a subset. I'm not 
originating this, btw. So Int is a subset of Num, and '4' and '4.0' are 
considered to be exactly equivalent.
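
To make the subset view concrete, here is a minimal Perl 5 sketch in which each type label is just a membership test over values. The %type table and in_type() are made up for illustration, and the predicates are only one plausible reading of Int, Num and String.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Each type label names a subset of values, modelled here as a membership test.
# These predicates are illustrative assumptions, not a proposed Perl 6 API.
my %type = (
    Num    => sub { defined $_[0] && !ref $_[0] && looks_like_number($_[0]) },
    Int    => sub { defined $_[0] && !ref $_[0] && looks_like_number($_[0])
                        && $_[0] == int($_[0]) },
    String => sub { defined $_[0] && !ref $_[0] },
);

# A value may be stored in a typed slot only if it is a member of the subset.
sub in_type {
    my ($name, $value) = @_;
    return $type{$name}->($value) ? 1 : 0;
}

print in_type(Int => 4.0), "\n";   # 4.0 is the same value as 4
print in_type(Int => 4.2), "\n";   # 4.2 is outside the Int subset
print in_type(Num => 4),   "\n";   # every Int is also a Num
```

Under this reading "can be losslessly converted" becomes "is already a member", which is a question about the value rather than about the declared type of the variable it came from.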

> Now, obviously there's no need to declare hashes, arrays and scalars
> as being such, we have the sigils.  But references we do.
> 
>       my HASH  $ref = \%hash; # ok
>       my ARRAY $ref = \%hash; # *error*
> 
> HASH, ARRAY, SCALAR, GLOB, etc... references do not cast.


...although in perl6 that's my HASH $ref = %hash, but sure.

>       my SCALAR $ref = \$foo;
>       print $ref;             # *error* implicit SCALAR -> String cast
>                               #    illegal.
> 
> You'd have to do an explicit typecast (syntax left as an exercise).
> Given that most times when you try to use a reference as a string
> you're making a mistake, this shouldn't be a big deal.


my SCALAR $ref is stringify_sub(&f) = \$foo

currently known as

use overload '""' => \&f
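
For reference, a small self-contained Perl 5 sketch of that mechanism. In Perl 5 the overload key for stringification is '""' and the handler is a code reference. The StringyRef class and its new() and as_string() methods are hypothetical names chosen for this example.

```perl
use strict;
use warnings;

package StringyRef;

# '""' is the overload key for stringification in Perl 5.
use overload '""' => \&as_string;

# Hypothetical constructor: wraps a reference to an existing scalar.
sub new {
    my ($class, $scalar_ref) = @_;
    return bless { target => $scalar_ref }, $class;
}

# Called whenever the object is used as a string.
sub as_string {
    my ($self) = @_;
    return "ref to '" . ${ $self->{target} } . "'";
}

package main;

my $foo = 'potato';
my $ref = StringyRef->new(\$foo);
print "$ref\n";
```

The point is that the explicit cast lives with the type (here, the class) rather than at every place the reference is printed.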

> 
> For arrays, it's probably enough that the whole array be of a single type:
> 
>       my Int @array = (1..10);        # array of integers
>       $array[20] = 4.2;               # *error*, Nums not allowed.
> 
> provides some nice optimizations.  You can do the same for hashes:
> 
>       my Int %hash = (foo => 1, bar => 2);    # hash of integers
>       $hash{baz} = 'yarrow';                  # *error* Strings not allowed
> 
> but we like to mix types up in hashes.  One way we can allow this is
> to have a generic type, Value (or perhaps Scalar) that every type will
> cast into.  Hash keys would be implicitly of type Value.
> 
>       my Value %hash = (foo => 1, bar => 2, baz => 'yarrow');
> 
> but that's pretty much the same as switching off strong typing.  So
> obviously hashes have to have per-key types.  I will leave the syntax
> of that as an exercise for the reader.  It would have to make it easy
> to declare lists of keys as being a single type.  If we come up with
> something nice here, it can probably be backported onto arrays.


It's only switching off strong types for that variable, so it's a 
valuable thing to do.

my qt(String => DontEvenThinkAboutIt) %tree = (id => 1);
$tree{parent} = \%tree;

(or if you think you can tackle that)

$tree{parent_with_distance} = [ 0, \%tree ];

a map from strings to numbers and lists of two elements, where the first 
element is a number and the second is a reference to a map from strings 
to numbers and lists of two elements...

> 
> 
> Regexes pose an interesting problem.  m// and s/// would take a
> String.  Num and Int can all autocast to a string fine, but
> consider...
> 
>       my Int $foo = 42;
>       $foo =~ s/\d+/Basset Hounds/;   # *run-time error*
> 
> Sure, $foo will cast to a String and be operated on, but then the
> regex changes it to 'Basset Hounds' and tries to put that back into
> $foo.  *BANG* Run time type error.  I think that's the best we can do
> with regexes.  Perhaps there could be an option for the typing system
> which says "s/// only takes Strings and does not cast" turning the
> above into a compile-time error.
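
The run-time failure described above can be approximated in today's Perl 5 with a tied scalar. This is only a sketch: the IntOnly class is hypothetical, and its integer test (an optional sign followed by digits) is an assumption about what Int would accept.

```perl
use strict;
use warnings;

package IntOnly;

# Hypothetical tied-scalar class: accepts only integer-looking values.
sub TIESCALAR { my ($class) = @_; my $value = 0; return bless \$value, $class }
sub FETCH     { my ($self) = @_; return $$self }
sub STORE {
    my ($self, $new) = @_;
    die "type error: value is not an Int\n"
        unless defined $new && $new =~ /\A[+-]?\d+\z/;
    $$self = $new;
}

package main;

tie my $foo, 'IntOnly';
$foo = 42;

# The substitution reads "42" as a string, then tries to store the result back.
my $ok = eval { $foo =~ s/\d+/Basset Hounds/; 1 };
print $ok ? "stored\n" : "caught: $@";
```

The match itself succeeds; it is the write-back of the modified string that trips the check, which is why the error can only surface at run time here.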


How is a regex different from any other function that makes sense with 
multiple types?

my Int $foo = 42;
my Num $bar = 4.2;
my Int $foo2 = increment($foo);
my Num $bar2 = increment($bar);
my Sub $sub2 = increment(sub { 42 });

sub increment {
        my $x = shift;
        if (ref $x eq 'CODE') {
                return sub { 1 + $x->(); }
        } else...
}
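
A runnable Perl 5 version of the same idea, for anyone who wants to poke at it. The numeric branch is an assumption, since the else case is left open above, and the typed declarations are dropped because Perl 5 has no Int or Num.

```perl
use strict;
use warnings;

# Dispatch on what the argument is at run time. The numeric branch is an
# assumption: the original elides it.
sub increment {
    my $x = shift;
    if (ref $x eq 'CODE') {
        # Wrap the sub so that its result is incremented when called.
        return sub { 1 + $x->() };
    }
    return $x + 1;
}

my $foo2 = increment(42);
my $bar2 = increment(4.2);
my $sub2 = increment(sub { 42 });

print "$foo2 $bar2 ", $sub2->(), "\n";
```

The return type follows the argument type in each call, which is exactly what a single fixed signature for increment() cannot express.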

Hm. Too abstract. How about

sub escape_cgi_param { return quotemeta(shift()) }

my Int $id = escape_cgi_param(...);
my String $name = escape_cgi_param(...);

> 
> At least it does prevent some silly errors:
> 
>       my HASH $ref = \%hash;
>       $ref =~ m/.../;                 # *error*
> 
> 
> Now, here's an example of something that might be really annoying to
> get right.  Let's say localtime() returns a hash in Perl 6 (sensible).
> Let's also say that not only does it return the year, mday, hour, min,
> sec, etc... as integers, it also returns things like the name of the
> month, name of the day, etc...
> 
>       my %time = localtime;
>       # "Today is Monday, July 9, 2001"
>       printf "Today is %s, %s %d, %d", $time{qw(dow month mday year)};
> 
> What should the return signature of localtime look like?  Obviously it
> should return a hash, so that much checks out.  But each of the keys
> should be typed as well, String or Num.  It would be really, really
> annoying to have to set up the hash and all its keys just right so it
> can accept the return value from localtime.  Instead, perhaps Perl can
> simply imply its type from the declaration:
> 
>       my %time = localtime;   # %time's type declaration is implied by
>                               #    localtime's return signature AT
>                               #    COMPILE TIME!
>       %time = gmtime;         # ok, localtime and gmtime have the same
>                               #    signature.
>       $time{mday}++;          # ok, $time{mday} is an Int
>       $time{mday} .= 'foofer' # *run-time error*  Implicit $time{mday} 
>                               #    Int -> String cast ok for the string 
>                               #    append, but trying to convert the 
>                               #    String "10foofer" back to an Int fails.
> 
> So a newly declared variable can imply its type based on its initial
> function assignment.  You might even take this further to work with
> other variables.
> 
>       my String $moo = 'foo';
>       my $bar = $moo;         # $bar implied to be String
> 
> However, if there is no type and nothing to imply it from, that's an
> error.
> 
>       my $foo;                # *error* forgot to declare a type.


But it needs to resolve the type of an arbitrarily nasty expression 
anyway, doesn't it?

my $foo = (time % 2) ? "one" : 3; # $foo is a String

Otherwise, it seems like you only get implied types for constants and 
direct function calls of functions whose declarations are visible. I 
guess you demand a declaration if it gets nasty?

How about sub foo { return [ { }, 3 ]; }? What should its return type 
be? A ref to an array of two elements, with the first element a hash 
mapping nothing at all, or anything to anything? How about

my $x = foo();
my $y = $x->[0];
$y->{3} = 4;
$y->{bob} = "mary";

How about

my $x = $object->spruk();

when ref $object eq 'Gloof', and there is no &Gloof::spruk, and 
@Gloof::ISA is defined at runtime?
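
A minimal Perl 5 sketch of that situation. ParentA and ParentB are hypothetical classes invented for the example; only Gloof and spruk come from the question above.

```perl
use strict;
use warnings;

package ParentA; sub spruk { return 42 }
package ParentB; sub spruk { return 'forty-two' }

package Gloof;
sub new { return bless {}, shift }

package main;

# The parent class is chosen while the program runs, so nothing before this
# point can say which spruk() will be found.
my $parent = (time % 2) ? 'ParentA' : 'ParentB';
@Gloof::ISA = ($parent);

my $object = Gloof->new;
my $x = $object->spruk();
print "spruk came from $parent and returned $x\n";
```

Here the best a compiler could say about $x is "whatever either spruk returns", and only if it can see every class that might end up in @Gloof::ISA.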

Or consider

package Flug;
sub somenum { return scalar(@ARGV); }
sub pi { return 3.1; }
sub AUTOLOAD {
        my $tmp = [ { }, 3, { a => [ 3 ] } ];
        push @$tmp, \$tmp;
        return $tmp;
}
my Flug $albert = new Flug;
my $method = (time % 2) ? "somenum" : "pi";
my $x = $albert->$method();

Sometimes, this is tractable. Recall the subset definition of types. 
$method's type is the subset containing the strings "somenum" and "pi". 
So $x's type is the union of the possible return values of Flug::somenum 
and Flug::pi. The possible return values of Flug::somenum are 0..the max 
number of arguments, which can be simplified to 0..MAXINT, also known as 
the value subset labelled Int. Flug::pi always returns the value 3.1; no 
need to name its singleton subset. Their union is the set containing all 
integers along with 3.1. Probably some simplification rule will kick in 
here to reduce that to the numerical subset of values, but who knows. 
(e.g. it would be useful to have both "Int" and "Int or undef")
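
One possible simplification rule, sketched in Perl 5: collapse a union to the smallest named type that contains every member. The widening chain Int, Num, String, Value and the %returns table are assumptions made for the example, not something the proposal specifies.

```perl
use strict;
use warnings;

# Assumed widening chain: each type is a subset of the one after it.
my @chain = qw(Int Num String Value);
my %rank  = map { $chain[$_] => $_ } 0 .. $#chain;

# Hypothetical declared return types for the two candidate methods.
my %returns = (somenum => 'Int', pi => 'Num');

# The smallest named type containing every listed type is the widest of them.
sub join_types {
    my @types = @_;
    my $widest = 0;
    for my $t (@types) {
        die "unknown type $t\n" unless exists $rank{$t};
        $widest = $rank{$t} if $rank{$t} > $widest;
    }
    return $chain[$widest];
}

# $method can only be "somenum" or "pi", so $x is the join of their returns.
my $x_type = join_types(@returns{qw(somenum pi)});
print "$x_type\n";
```

This loses the precise "all integers plus 3.1" set in exchange for a type that has a name, which is the trade-off the paragraph above is guessing at.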

> 
> We could have Perl go through heroics to try and find $foo's first
> assignment and imply a type from that, but I think that will rapidly
> get Messy and Surprising.


my $foo = $object->$method();
my $bar = My::AutoloadCrazy->gloof();

are Messy too.

> I think this implied type from declared assignment is Super Cool.  It
> means you only have to declare a few initial variables and the rest
> can largely have their types implied.  More DWIM, more concise, less
> annoying type gyrations, more chance hard-core Perl programmers will
> accept strong types.


Implied typing appears to be type inference with extremely weak 
inference rules. I'm not saying that's a bad thing. But I'd probably 
prefer if

my $deleted_mask = 0x8000;
my $id = cgi_param("id");
$id->{foo}++;
die "go away" if $deleted_mask & $id;
print "hello admin" if $id < 10;

were converted internally to something more like

my Int $deleted_mask = 0x8000;
my Nonref $id = cgi_param("id");
$id->{foo}++; # compile-time error
{
        my Int $id = Int($id); # Runtime error if Int() fails
        die "go away" if $deleted_mask & $id;
        print "hello admin" if $id < 10;
}

assuming cgi_param() returns a Nonref. The compiler sees that $id is 
used as an integer within its scope, so it checks just before the first 
use and afterwards doesn't bother to check. cgi_param() cannot return an 
Int because it's often used for other things like user names and credit 
card numbers (heh...). Though perhaps it *is* safe to assume that it 
won't return a reference, and thus get some compile-time checking.
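
A rough Perl 5 sketch of the "check once before first use" step. Int() here is a hypothetical helper, cgi_param() is a stand-in that returns a fixed string, and the integer test is an assumption about what Int() would accept.

```perl
use strict;
use warnings;

# Hypothetical run-time conversion: returns the value as an integer or dies.
sub Int {
    my ($value) = @_;
    die "not an Int\n"
        unless defined $value && !ref $value && $value =~ /\A[+-]?\d+\z/;
    return $value + 0;
}

# Stand-in for cgi_param(); assumed to return a non-reference scalar.
sub cgi_param { return '7' }

my $deleted_mask = 0x8000;
my $raw_id       = cgi_param('id');

# Check once, just before the first integer use; later uses need no re-check.
my $id = Int($raw_id);
print "go away\n"     if $deleted_mask & $id;
print "hello admin\n" if $id < 10;
```

The difference from the compiler-generated version is only that here the programmer writes the Int() call by hand.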

Any way you do it, it's tough to ensure that the type signature of a 
function is available at compile-time. We need some sort of typesafe 
import().

(you wrote your ramblings, I wrote mine. I spent much time thinking in 
circles about this stuff during the RFC phase and didn't really come up 
with much. So now I'm seeing what I come up with without thinking. :-) )