Re: Sane (less insane) pair semantics

TSa Tue, 11 Oct 2005 09:50:20 -0700

HaloO,

Larry Wall wrote:

> It still has to figure out how to reconcile the named arguments
> with the positional parameters, of course, unless someone has
> made sufficient representation to the compiler that all calls to
> a particular short name have particular named parameters that are
> guaranteed to be in the same position everywhere, in which case the
> compiler is allowed to simply translate them to positionals on the
> call end.


Does that mean that a named binding excludes a positional one and
vice versa? I would consider that too much of a burden that tries
to solve the unsolvable problem of meeting unstated assumptions ;)


The thing I try to get at is the unification of classes and their
instances on the one hand and subs and their invocations on the
other.

Let's take the four sigiled vars $_, @_, %_ and &_ and assume that
they always contain the current state of affairs. Let's say they are
the "registers" of the virtual machine. Everything else is the memory
that refers to itself in all sorts of messy ways. Memory locations not
rooted in the four registers are garbage collected eventually.

Making a goto kind of call just means putting a new address into &_.
At the statement level there's of course an automatic goto to the
next instruction.

A sub kind of call with parameters means messing with $_, @_ and %_
prior to making the jump. The return values are passed out by messing
with $_, @_ and %_ in the sub. With CPS there actually is no returning.
It's just the next jump. Well, the handling of the four magicals is
more complicated than that. There'll be lexical binding, temping etc.
And how about rules and grammars? Is there a $/, @/, %/ and &/ quadriga?

Method dispatch means basing the selection of the address
to be put into &_ on the content of $_, @_ and %_.
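
As a rough illustration of the register picture above, here is a minimal
sketch in Python (not Perl 6, since the syntax under discussion is still
speculative). The Registers class, the run loop and the double/finish steps
are hypothetical names of mine, not part of any Perl 6 design. A call sets
the scalar, positional and keyed registers and then puts a new address into
the code register; "returning" is just the next jump.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional

@dataclass
class Registers:
    scalar: Any = None                               # stands in for $_
    array: list = field(default_factory=list)        # stands in for @_
    hash: dict = field(default_factory=dict)         # stands in for %_
    code: Optional[Callable[["Registers"], None]] = None  # stands in for &_

def run(regs: Registers) -> Registers:
    # Trampoline: keep jumping until no next address has been set.
    while regs.code is not None:
        step = regs.code
        regs.code = None      # each step must set the next address itself
        step(regs)
    return regs

def double(regs: Registers) -> None:
    # "Return" by leaving the result in the registers, then jump onwards.
    regs.scalar = regs.array[0] * 2
    regs.code = regs.hash.get("then")   # continuation passed as a keyed arg

def finish(regs: Registers) -> None:
    regs.array = [regs.scalar]          # sets no next address: machine halts

def call_double(x):
    # A sub-style call: load the registers, then jump.
    regs = Registers(array=[x], hash={"then": finish}, code=double)
    return run(regs)
```

Calling call_double(21) leaves 42 in the scalar register and halts with an
empty code register.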


Not considering dispatch and the topic for now I come back to
the problem of this thread: the constructor call for an invocation
of a sub stored in &_. The task of the compiler is to output code
that doesn't violate structural type constraints extracted from
the source while preserving unspecificity as much as possible.

The first set of constraints comes from the signature of &_ and the
second from the *syntactical* structure of the arguments. The left
to right order gives the positional structure while the adverbial
pairs and the autoquoting version of fat arrow give the named or keyed
structure. Splatted terms are actually lazily evaluated code fragments
that insert information into the magicals prior to this matching.
Double splat does so eagerly. The prefix fundamental type enforcers
?, + and ~ also produce structural items as a side effect. So do most
circumfix operators like [], '' or "". The parens are for grouping
only, they are transparent to structural typing! The loose precedence
separators , and ; never take keyed args. I think it makes sense to
exclude them from overloading. They might be a context sensitive parser
concept, actually.

Well, and at some point prior to binding the args to the params
the trait blocks like ENTER and PRE are called. And I wonder how
labels inside the sub's block can be used to enter it through
different paths:

  sub through
  {
     first:  say "first";  return;
     second: say "second"; return;
     third:  say "third";  return;
  }

  through :third; # prints "third"?
  through;        # prints "first"?


Obviously neither the compiler nor the virtual machine will have
difficulties with the above procedure. Also the programmers can
shy away from being specific on the caller side or the callee
side by splatting. Note that what I call structural type match is
only an arity and key match. Type constraints are checked only if
the structure fits!
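
To make the two-phase idea concrete, a small Python sketch under my own
assumptions: structural_fit and bind are hypothetical helpers, and the
signature is reduced to a positional range plus a set of allowed keys. The
point is only the ordering: arity and keys are matched first, and value types
are looked at only when that structure fits.

```python
def structural_fit(n_positional, keys, min_pos, max_pos, allowed_keys):
    """Arity and key match only; no value types are inspected here."""
    return (min_pos <= n_positional <= max_pos
            and set(keys) <= set(allowed_keys))

def bind(args, kwargs, min_pos, max_pos, allowed_keys, type_checks=()):
    """Bind args if the structure fits, then apply (index, type) checks."""
    if not structural_fit(len(args), kwargs, min_pos, max_pos, allowed_keys):
        raise TypeError("structure does not fit")
    # Type constraints run only after the structural match succeeded.
    for index, expected in type_checks:
        if index < len(args) and not isinstance(args[index], expected):
            raise TypeError(f"argument {index} is not {expected.__name__}")
    return list(args), dict(kwargs)
```

With a range of 2..3 and type_checks=[(0, int)], a call with the single
argument "a" fails on structure and never reaches the type check.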

From this thread I gather that the case of a pair hidden in an array
is the most debated one:

  my @array = ( 1, two => 2, 3 );

I hope we all agree that +@array == 3. But is @array<two> == 2?
If yes, what shall happen with

  foo( *@array );

when foo requires exactly three params?

  sub foo ($x, $y, $z) {...}

And how shall the arity 2..3 be handled?

  sub foo ($x, $y, ?$z) # structure is foo:($,$,?$) without param names
  {
     $x eqv 1; # yes
     $y eqv (two => 2) && $z eqv 3;     # this?
     $y eqv 3          && $z eqv undef; # or this?
  }

How about a named optional with my preferred keyed item twigil :$
where actually the key and variable name might differ as in :two$z?
Perhaps that might be spelled :two($z) and allow the fat arrow
syntax two => $z as well.

  sub foo ($x, $y, :$two) # structure is foo:($,$,:two) without param names
  # S06:  ($x, $y, +$two)
  #       ($x, $y, two => $z)
  {
     $x eqv 1; # yes
     $y eqv (two => 2) && $two eqv 2; # this?
     $y eqv 3          && $two eqv 2; # or this?
  }
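
The two readings of a pair inside a splatted array can be modelled side by
side. This is a Python sketch only; Pair, splat_positional and splat_named
are hypothetical names, and neither function claims to be what Perl 6 does.

```python
from typing import Any, NamedTuple

class Pair(NamedTuple):
    """Stand-in for a key => value pair."""
    key: str
    value: Any

def splat_positional(items):
    """Reading A: a pair in the array stays an ordinary positional value."""
    return list(items), {}

def splat_named(items):
    """Reading B: pairs are pulled out and become keyed arguments."""
    positional = [x for x in items if not isinstance(x, Pair)]
    keyed = {x.key: x.value for x in items if isinstance(x, Pair)}
    return positional, keyed

array = [1, Pair("two", 2), 3]
```

Under reading A the call sees three positionals, so $y would get the pair.
Under reading B it sees the positionals 1 and 3 plus the key "two", so $y
would get 3.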

My 'Seven Sigils of Perl6' idea also nicely delivers an invocant markup
in methods:

  method foo (.$self, $x, $y, two => $z) {...}

Without the multi prefix

  method blahh ( .$first, .$second ) {...} # perhaps without comma?
  # S12:       (  $first:  $second:)

gives the left biased dispatch, while

  multi method blahh ( .$first, .$second ) {...}

aims at MMD. A method form without a dotted item assumes .$_ perhaps?

A strict parse time separation into zones for invocants, required and
optional positionals, then keyed and finally the slurpy part in definitional
forms makes sense, I think. But we could allow overlap between optional and
keyed:

  sub foo ($x, $y, ?:$opt, :$key) {...}

Which means a positional arity 2..3 and a keyed arity of 0..4 if the
identifiers $x and $y are known outside of the definition. With a stub

  sub foo ($, $, ?:opt, :key) {...}

the keyed arity is 0..2 or perhaps 2..4 if the keyed arity includes
the positional. Can we define a way to retrieve this information
from foo nicely?
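
For comparison, this is roughly how such information can be pulled out of a
signature in Python with the standard inspect module. The arities helper and
the two foo definitions are my own illustration, with Python's
positional-only marker (/) standing in for a stub that hides the names of
the first two parameters.

```python
import inspect

def arities(func):
    """Return ((min_pos, max_pos), (min_keyed, max_keyed)) for func."""
    P = inspect.Parameter
    params = list(inspect.signature(func).parameters.values())
    pos = [p for p in params
           if p.kind in (P.POSITIONAL_ONLY, P.POSITIONAL_OR_KEYWORD)]
    kw_only = [p for p in params if p.kind is P.KEYWORD_ONLY]
    # Every parameter whose name is visible to callers can be passed by key.
    keyable = [p for p in params
               if p.kind in (P.POSITIONAL_OR_KEYWORD, P.KEYWORD_ONLY)]
    min_pos = sum(p.default is P.empty for p in pos)
    min_keyed = sum(p.default is P.empty for p in kw_only)
    return (min_pos, len(pos)), (min_keyed, len(keyable))

def foo(x, y, opt=None, *, key=None): ...          # names x and y visible
def foo_stub(x, y, /, opt=None, *, key=None): ...  # x and y positional only
```

Here arities(foo) gives ((2, 3), (0, 4)) and arities(foo_stub) gives
((2, 3), (0, 2)), which mirrors the 0..4 versus 0..2 distinction above.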

Infinite arity is of course indicated with *. The sigils determine only
how the inside wants to access the variadic part which is only disjointly
split into positional and keyed if both types of sigils are present. The
positional ones are *$, *& and *@ and keyed params are indicated by *:$,
*:& and *%. Should a slurpy array do double duty if no slurpy hash is there?
Or should such an overlap be requested explicitly with *:@array in the sig?

In the end a signature has 1 + 2 + 2 parts or zones from left to right:

  1) finite invocants
  2) finite positionals
  3) finite keyed
  4) infinite positionals
  5) infinite keyed

A nice syntax for this structural type could be
&foo:( .i..I --> p..P :k..K --> r..R ) where
 i and I are the min and max number of invocants,
 p and P are the min and max positional arity,
 k and K are the min and max keyed      arity,
 r and R are the min and max return     arity.

..P, ..K and ..R can be Inf, ..I is finite and they are optional if equal
to their lowercase partner. For a sub i == 0 and the left --> is optional.

For a non-multi method i == 1 and I == 1 but the dot is mandatory when used
on a &var invocation term. In a method definition this part is derived from
the lexical scope when omitted.

The right --> is optional and defaults to --> 0..Inf then. The keyed part
including the colon is optional and defaults to 0..0 for subs and 0..Inf for
methods, I guess? The reverse applies to the positionals where subs default
to 0..Inf and methods to 0..0. The central part is subtyped contravariantly,
the outer parts covariantly after structural fitting.

The --> can be repeated to the left and right. But note that
extending into the dispatch part at least needs a dot:

  &foo:( . --> . --> 3 --> 4 ) $x, $y, $z,1,2, *<a b c d x y z>;

This means that the coderef &foo is dispatched on $x and has to yield a
method that is then dispatched on $y which has to yield a sub that takes
three required params. These receive $z, 1 and 2 respectively and the sub
has to return a coderef that requires 4 params. These would receive
'a' .. 'd' respectively. But the superfluous 'x' .. 'z' cause an
arity error. The latter could be prevented by writing ( --> 4..Inf)
there. But note that quite a few checks are deferred until runtime
nonetheless.
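
The arity bookkeeping of that chained example can be sketched in Python. The
Arity class and check_chain are hypothetical, and I assume that every stage
but the last consumes exactly its minimum number of arguments while the last
stage receives whatever is left.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Arity:
    lo: int
    hi: float   # an int, or math.inf for an open-ended range

    def accepts(self, n):
        return self.lo <= n <= self.hi

def check_chain(stages, args):
    """Feed a flat argument list through the stages; return counts used."""
    rest = list(args)
    used = []
    for i, arity in enumerate(stages):
        last = i == len(stages) - 1
        # Assumption: the final stage takes everything that remains.
        take = len(rest) if last else min(len(rest), arity.lo)
        if not arity.accepts(take):
            raise TypeError(
                f"stage {i}: got {take} args, want {arity.lo}..{arity.hi}")
        used.append(take)
        rest = rest[take:]
    return used

one = Arity(1, 1)
args = ["x", "y", "z", 1, 2, "a", "b", "c", "d", "x", "y", "z"]
strict = [one, one, Arity(3, 3), Arity(4, 4)]          # . --> . --> 3 --> 4
relaxed = [one, one, Arity(3, 3), Arity(4, math.inf)]  # ... --> 4..Inf
```

check_chain(strict, args) raises at the last stage because seven arguments
are left for four params, while check_chain(relaxed, args) goes through.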

So much, so good.



There's another funny thing that I haven't thought through, yet. If &_ is
the post-dispatch, post-binding coderef to the innermost call instance
then first of all a plain _ is just a call through that ref maintaining
the environment or so.

More interestingly the invocant part is pre-bound, and that might
lead to the natural re-call of current method syntax

  _();

or if you push the parens one char rightwards

  _.();

Once the dot has appeared, shifting further to the right reveals nothing
new, of course:

  _  .();

Well, and the non-invocant params can be bound to new arguments

  _ .(1,2);

in method syntax or with listop syntax

  _ 1,2;

Named args also work as usual:

  _ 1, :blahh(13), 23, xx => 2;

Ohh, and in a binop instance

  $x _ $y;

just means to call the current operator with two new args. A bit further
down this road we observe that the method sigil requests a re-dispatch

  $x._; # definitely weird looking

  ._; # same on $_ or $?SELF

Somewhat off-topic, sorry.
--
$TSa.greeting := "HaloO"; # mind the echo!
