[EMAIL PROTECTED] wrote:
> Here's me thinking out loud. I'm thinking about how to avoid a lot of
> explicit type casting without entering a maze of twisty typecasting
> rules, all different.
>
>
> Imagine we have a typing system where types are allowed to
> automatically cast AS LONG AS NO INFORMATION IS LOST.
User-defined types? Do you tell them what they can accept without loss,
or just have them in the big dusty Other box?
This really just forms a type hierarchy, where child-of requires
can-be-losslessly-converted. May as well just create an explicit
hierarchy, and maybe make it extensible with a strong suggestion that
any additions follow certain rules.
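Roughly what I have in mind, as a Perl 5 sketch. The %PARENT table and
can_widen() are names I'm making up for illustration, not anything that
has been proposed, and the particular parent links are an assumption:

```perl
use strict;
use warnings;

# Hypothetical hierarchy: each type names the parent it can be
# converted to without loss. The table itself is an assumption.
my %PARENT = (
    Int    => 'Num',
    Num    => 'String',
    String => 'Value',
);

# True if $from is $to or a descendant of $to, i.e. the implicit
# cast $from -> $to is allowed because nothing is lost.
sub can_widen {
    my ($from, $to) = @_;
    for (my $t = $from; defined $t; $t = $PARENT{$t}) {
        return 1 if $t eq $to;
    }
    return 0;
}

# A user-defined type joins the hierarchy by naming its parent.
$PARENT{Even} = 'Int';
```

With this, can_widen('Int', 'String') is true and can_widen('Num', 'Int')
is false. A type that never registers a parent stays in the Other box.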
> my Int $int = 4; # Int -> Int assignment, ok
> my String $foo = $int; # implicit Int -> String cast, ok
>
> print $int; # implicit Int -> String cast, ok
> print $int + $foo; # $foo is implicitly cast from String -> Int
> # since no information is lost.
> # $int + $foo is integer math.
> # The result is cast to a String.
> my Num $dec = 4.2; # Num -> Num assignment, ok
> $int = $dec; # *error* 4.2 would have to be truncated
> # to fit into $int.
> $int = int($dec); # This is ok. int takes a Num and returns
> # an Int.
> $dec = $int; # implicit Int -> Num cast, ok.
Ok, fine.
> I'm pondering this being okay:
>
> my Num $dec = 4.0;
> my Int $int = $dec; # Num -> Int okay since 4.0 truncates to 4
> # with no(?) information lost
>
> but I think it will complicate my Grand Plan which I'm not ready to
> reveal.
Compile-time or run-time?
    my Num $x = 3.0;
    $x++;
    my Int $y = $x;
Could be compile-time, if you do constant folding first.
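If it ends up as a run-time check, it might look something like this in
Perl 5. to_int_lossless() is a hypothetical helper, not a proposed
builtin; it only illustrates "allow the cast when truncation changes
nothing":

```perl
use strict;
use warnings;

# Hypothetical run-time check for a Num -> Int assignment: succeed
# only when truncating would not change the value.
sub to_int_lossless {
    my ($num) = @_;
    my $int = int($num);
    die "cannot store $num in an Int without loss\n" if $int != $num;
    return $int;
}

my $x = 3.0;
$x++;
my $y = to_int_lossless($x);              # 4
my $z = eval { to_int_lossless(4.2) };    # undef; $@ holds the error
```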
Allow me to ramble:
You can think of having a universe of (an infinite number of) 'values'.
A type is a subset of those values. Things like Int, Num, and String are
just labels of subsets. So rather than thinking of a subtype (or a
safely castable type) of another type, think of a subset. I'm not
originating this, btw. So Int is a subset of Num, and '4' and '4.0' are
considered to be exactly equivalent.
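A small Perl 5 sketch of the subset view, where each type is nothing
more than a membership test. The %IN_TYPE table, is_plain() and
types_of() are made-up names, and the regexes are deliberately
simplistic (no exponents, for instance):

```perl
use strict;
use warnings;

# A plain value is defined and not a reference.
sub is_plain {
    my ($v) = @_;
    return defined($v) && !ref($v);
}

# Each type is a membership test over the universe of values.
my %IN_TYPE = (
    Int    => sub { is_plain($_[0]) && $_[0] =~ /\A-?\d+(?:\.0*)?\z/ },
    Num    => sub { is_plain($_[0]) && $_[0] =~ /\A-?(?:\d+\.?\d*|\.\d+)\z/ },
    String => sub { is_plain($_[0]) },
);

# Names of every subset the value belongs to, in sorted order.
sub types_of {
    my ($value) = @_;
    return grep { $IN_TYPE{$_}->($value) } sort keys %IN_TYPE;
}
```

Here types_of('4') and types_of('4.0') both give Int, Num and String,
while types_of('4.2') gives only Num and String.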
> Now, obviously there's no need to declare hashes, arrays and scalars
> as being such, we have the sigils. But references we do.
>
> my HASH $ref = \%hash; # ok
> my ARRAY $ref = \%hash; # *error*
>
> HASH, ARRAY, SCALAR, GLOB, etc... references do not cast.
...although in Perl 6 that would be my HASH $ref = %hash; but sure.
> my SCALAR $ref = \$foo;
> print $ref; # *error* implicit SCALAR -> String cast
> # illegal.
>
> You'd have to do an explicit typecast (syntax left as an exercise).
> Given that most times when you try to use a reference as a string
> you're making a mistake, this shouldn't be a big deal.
    my SCALAR $ref is stringify_sub(&f) = \$foo;
currently known as
    use overload '""' => \&f;
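For reference, here is how the existing Perl 5 mechanism looks when
spelled out. The overload pragma and its '""' key are real; the LoudRef
class and its as_string() method are invented for the example:

```perl
package LoudRef;
use strict;
use warnings;

# Stringification is routed through as_string().
use overload '""' => \&as_string;

sub new {
    my ($class, $target) = @_;
    return bless { target => $target }, $class;
}

sub as_string {
    my ($self) = @_;
    return 'ref to ' . ${ $self->{target} };
}

package main;

my $foo = 42;
my $ref = LoudRef->new(\$foo);
print "$ref\n";    # prints "ref to 42"
```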
>
> For arrays, it's probably enough that the whole array be of a single type:
>
> my Int @array = (1..10); # array of integers
> $array[20] = 4.2; # *error*, Nums not allowed.
>
> provides some nice optimizations. You can do the same for hashes:
>
> my Int %hash = (foo => 1, bar => 2); # hash of integers
> $hash{baz} = 'yarrow'; # *error* Strings not allowed
>
> but we like to mix types up in hashes. One way we can allow this is
> to have a generic type, Value (or perhaps Scalar) that every type will
> cast into. Hash keys would be implicitly of type Value.
>
> my Value %hash = (foo => 1, bar => 2, baz => 'yarrow');
>
> but that's pretty much the same as switching off strong typing. So
> obviously hashes have to have per-key types. I will leave the syntax
> of that as an exercise for the reader. It would have to make it easy
> to declare lists of keys as being a single type. If we come up with
> something nice here, it can probably be backported onto arrays.
It's only switching off strong types for that one variable, so it's a
valuable thing to do.
    my qt(String => DontEvenThinkAboutIt) %tree = (id => 1);
    $tree{parent} = \%tree;
(or, if you think you can tackle that)
    $tree{parent_with_distance} = [ 0, \%tree ];
That last one is a map from strings to numbers and lists of two
elements, where the first element is a number and the second is a
reference to a map from strings to numbers and lists of two elements...
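To show that the recursive description is at least checkable at run
time, here is a Perl 5 sketch. is_tree() is a hypothetical name, it
covers only the parent_with_distance shape, and treating an
already-visited hash as valid is my own assumption for handling cycles:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number refaddr);

# True if $v is a hash whose values are numbers or two-element lists
# of [ number, reference to another such hash ].
sub is_tree {
    my ($v, $seen) = @_;
    $seen ||= {};
    return 0 unless ref($v) eq 'HASH';
    # A hash we are already checking is assumed valid (breaks cycles).
    return 1 if $seen->{ refaddr($v) }++;
    for my $item (values %$v) {
        next if !ref($item) && looks_like_number($item);
        return 0 unless ref($item) eq 'ARRAY' && @$item == 2;
        my ($distance, $map) = @$item;
        return 0 unless !ref($distance) && looks_like_number($distance);
        return 0 unless is_tree($map, $seen);
    }
    return 1;
}

my %tree = (id => 1);
$tree{parent_with_distance} = [ 0, \%tree ];
print is_tree(\%tree) ? "ok\n" : "not ok\n";    # prints "ok"
```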
>
>
> Regexes pose an interesting problem. m// and s/// would take a
> String. Num and Int can all autocast to a string fine, but
> consider...
>
> my Int $foo = 42;
> $foo =~ s/\d+/Basset Hounds/; # *run-time error*
>
> Sure, $foo will cast to a String and be operated on, but then the
> regex changes it to 'Basset Hounds' and tries to put that back into
> $foo. *BANG* Run time type error. I think that's the best we can do
> with regexes. Perhaps there could be an option for the typing system
> which says "s/// only takes Strings and does not cast" turning the
> above into a compile-time error.
How is a regex different from any other function that makes sense with
multiple types?
    my Int $foo = 42;
    my Num $bar = 4.2;
    my Int $foo2 = increment($foo);
    my Num $bar2 = increment($bar);
    my Sub $sub2 = increment(sub { 42 });
    sub increment {
        my $x = shift;
        if (ref $x eq 'CODE') {
            return sub { 1 + $x->(); }
        } else...
    }
Hm. Too abstract. How about
    sub escape_cgi_param { return quotemeta(shift()) }
    my Int $id = escape_cgi_param(...);
    my String $name = escape_cgi_param(...);
>
> At least it does prevent some silly errors:
>
> my HASH $ref = \%hash;
> $ref =~ m/.../; # *error*
>
>
> Now, here's an example of something that might be really annoying to
> get right. Let's say localtime() returns a hash in Perl 6 (sensible).
> Let's also say that not only does it return the year, mday, hour, min,
> sec, etc... as integers, it also returns things like the name of the
> month, name of the day, etc...
>
> my %time = localtime;
> # "Today is Monday, July 9, 2001"
> printf "Today is %s, %s %d, %d", $time{qw(dow month mday year)};
>
> What should the return signature of localtime look like? Obviously it
> should return a hash, so that much checks out. But each of the keys
> should be typed as well, String or Num. It would be really, really
> annoying to have to set up the hash and all its keys just right so it
> can accept the return value from localtime. Instead, perhaps Perl can
> simply imply its type from the declaration:
>
> my %time = localtime; # %time's type declaration is implied by
> # localtime's return signature AT
> # COMPILE TIME!
> %time = gmtime; # ok, localtime and gmtime have the same
> # signature.
> $time{mday}++; # ok, $time{mday} is an Int
> $time{mday} .= 'foofer' # *run-time error* Implicit $time{mday}
> # Int -> String cast ok for the string
> # append, but trying to convert the
> # String "10foofer" back to an Int fails.
>
> So a newly declared variable can imply its type based on its initial
> function assignment. You might even take this further to work with
> other variables.
>
> my String $moo = 'foo';
> my $bar = $moo; # $bar implied to be String
>
> However, if there is no type and nothing to imply it from, that's an
> error.
>
> my $foo; # *error* forgot to declare a type.
But it needs to resolve the type of an arbitrarily nasty expression
anyway, doesn't it?
    my $foo = (time % 2) ? "one" : 3; # $foo is a String
Otherwise, it seems like you only get implied types for constants and
direct function calls of functions whose declarations are visible. I
guess you demand a declaration if it gets nasty?
How about sub foo { return [ { }, 3 ]; }? What should its return type
be? A ref to an array of two elements, with the first element a hash
mapping nothing at all, or anything to anything? How about
    my $x = foo();
    my $y = $x->[0];
    $y->{3} = 4;
    $y->{bob} = "mary";
How about
    my $x = $object->spruk();
when ref $object eq 'Gloof', and there is no &Gloof::spruk, and
@Gloof::ISA is defined at runtime?
Or consider
    package Flug;
    sub somenum { return scalar(@ARGV); }
    sub pi { return 3.1; }
    sub AUTOLOAD {
        my $tmp = [ { }, 3, { a => [ 3 ] } ];
        push @$tmp, \$tmp;
        return $tmp;
    }
    my Flug $albert = new Flug;
    my $method = (time % 2) ? "somenum" : "pi";
    my $x = $albert->$method();
Sometimes, this is tractable. Recall the subset definition of types.
$method's type is the subset containing the strings "somenum" and "pi".
So $x's type is the union of the possible return values of Flug::somenum
and Flug::pi. The possible return values of Flug::somenum are 0..the max
number of arguments, which can be simplified to 0..MAXINT, also known as
the value subset labelled Int. Flug::pi always returns the value 3.1; no
need to name its singleton subset. Their union is the set containing all
integers along with 3.1. Probably some simplification rule will kick in
here to reduce that to the numerical subset of values, but who knows.
(E.g., it would be useful to have both "Int" and "Int or undef".)
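One possible simplification rule, sketched in Perl 5: replace the union
by the smallest labelled subset that contains both sides. Everything
here is an assumption for illustration: the %PARENT hierarchy, the
%RETURNS table of declared return types (with Flug::pi already widened
from its singleton to Num), and the helper names.

```perl
use strict;
use warnings;

my %PARENT  = (Int => 'Num', Num => 'String', String => 'Value');
my %RETURNS = (somenum => 'Int', pi => 'Num');

# A type followed by all of its ancestors, nearest first.
sub lineage {
    my ($type) = @_;
    my @chain;
    for (my $t = $type; defined $t; $t = $PARENT{$t}) {
        push @chain, $t;
    }
    return @chain;
}

# Smallest labelled subset that contains both types.
sub join_types {
    my ($left, $right) = @_;
    my %above_right = map { $_ => 1 } lineage($right);
    for my $t (lineage($left)) {
        return $t if $above_right{$t};
    }
    return 'Value';
}

# Type of $albert->$method() when $method may be any of @names.
sub call_type {
    my @names = @_;
    my $type  = $RETURNS{ shift @names };
    $type = join_types($type, $RETURNS{$_}) for @names;
    return $type;
}
```

With these tables, call_type('somenum', 'pi') comes out as Num.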
>
> We could have Perl go through heroics to try and find $foo's first
> assignment and imply a type from that, but I think that will rapidly
> get Messy and Surprising.
    my $foo = $object->$method();
    my $bar = My::AutoloadCrazy->gloof();
are Messy too.
> I think this implied type from declared assignment is Super Cool. It
> means you only have to declare a few initial variables and the rest
> can largely have their types implied. More DWIM, more concise, less
> annoying type gyrations, more chance hard-core Perl programmers will
> accept strong types.
Implied typing appears to be type inference with extremely weak
inference rules. I'm not saying that's a bad thing. But I'd probably
prefer if
    my $deleted_mask = 0x8000;
    my $id = cgi_param("id");
    $id->{foo}++;
    die "go away" if $deleted_mask & $id;
    print "hello admin" if $id < 10;
were converted internally to something more like
    my Int $deleted_mask = 0x8000;
    my Nonref $id = cgi_param("id");
    $id->{foo}++;                  # compile-time error
    {
        my Int $id = Int($id);     # Runtime error if Int() fails
        die "go away" if $deleted_mask & $id;
        print "hello admin" if $id < 10;
    }
assuming cgi_param() returns a Nonref. The compiler sees that $id is
used as an integer within its scope, so it checks just before the first
use and afterwards doesn't bother to check. cgi_param() cannot return an
Int because it's often used for other things like user names and credit
card numbers (heh...). Though perhaps it *is* safe to assume that it
won't return a reference, and thus get some compile-time checking.
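The run-time half of that can be approximated in Perl 5 today. In this
sketch Int() is a hypothetical checking function (not a builtin), and
greeting() and its extra "hello" fallback are invented so the example
runs without a real cgi_param():

```perl
use strict;
use warnings;

# Hypothetical Int(): return the value as an integer, or die.
sub Int {
    my ($value) = @_;
    die "not an Int\n"
        unless defined($value) && !ref($value) && $value =~ /\A-?\d+\z/;
    return $value + 0;
}

my $deleted_mask = 0x8000;

# $raw stands in for whatever cgi_param("id") returned.
sub greeting {
    my ($raw) = @_;
    my $id = Int($raw);    # checked once, then trusted below
    return "go away"     if $deleted_mask & $id;
    return "hello admin" if $id < 10;
    return "hello";
}
```

So greeting('7') gives "hello admin", while greeting('bob') dies in
Int() before any of the integer operations run.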
Any way you do it, it's tough to ensure that the type signature of a
function is available at compile-time. We need some sort of typesafe
import().
(You wrote your ramblings, I wrote mine. I spent much time thinking in
circles about this stuff during the RFC phase and didn't really come up
with much. So now I'm seeing what I come up with without thinking. :-) )