Re: Autothreading generalization

2005-02-01 Thread Markus Laire
Luke Palmer writes:
Craig DeForest writes:
Yeah, the sigils do get in the way for small placeholder variables like
these.
«@C[ $i; $j; $k; $l ] = @A[ $i; $j ] * @B[ $k; $l ]»
Losing the carets doesn't do much for us (and would force us to use the
explicit syntax, whatever that might be).  Hmm, on the other hand, ^
doesn't mean anything in term context yet.  I feel uncomfortable about
allowing ^ as a shorthand for $^, since every other variable in the
whole damn language has one of the four standard sigils.
What about adding fifth sigil into the language? To be used only with 
placeholder-variables?

That way ^ (or whatever char we choose) wouldn't be shorthand for $^, 
but part of the actual variable name, as are other sigils. That char 
should be selected based on its readability in expressions like these.

Would placeholder variables be used often enough to warrant their own sigil?
Luke
--
Markus Laire
Jam. 1:5-6


Re: Autothreading generalization

2005-02-01 Thread Craig DeForest
On Tuesday 01 February 2005 01:18 am, Markus Laire wrote:
 Luke Palmer writes:
  Yeah, the sigils do get in the way for small placeholder variables like
  these:
  «@C[ $i; $j; $k; $l ] = @A[ $i; $j ] * @B[ $k; $l ]»
...
 Would placeholder variables be used often enough to warrant their own
 sigil?

Can't say for sure.  This style of explicit threading was tried in PDL but 
turned out to not be as important as robust implicit threading, and the 
explicit threading (handled by block-scoped thread variables that you would 
declare before using them) never got robust enough to be considered a 
production feature.

The current threading engine in PDL uses positional dimension slots to work 
out what is what:  active dimensions have to be at the start of the dimension 
list, and thread dimensions have to be at the end.   In principle, this is a 
Bad Thing since you sometimes have to transpose arrays just to get the dims 
in the right slots; but it turns out to be a Good Thing: it is the right 
optimization for nearly all of the expressions that ever get written.

For example, with PDL-style threading, Luke's expression (written by someone 
paranoid) would be:
$c = $a(:,:,*1,*1) * $b(*1,*1,:,:);  # perl5/PDL 4-D outer product
where the names have been replaced by dimensional positions.
That still turns out to be more concise than the named version.
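
For readers who don't use PDL, here is a minimal sketch in plain Python of what that positional expression computes, assuming small rectangular nested lists stand in for the arrays. The function name outer_4d is made up for illustration and is not part of PDL; the point is only that each dummy dimension makes one operand repeat while the other operand's index varies.

```python
def outer_4d(a, b):
    """Return c with c[i][j][k][l] == a[i][j] * b[k][l].

    a and b are rectangular 2-D nested lists. The two outer loops walk
    the indices of a, the two inner loops walk the indices of b, which
    is what the dummy dimensions arrange positionally in the PDL line.
    """
    return [[[[a_ij * b_kl for b_kl in b_row]   # l: second index of b
              for b_row in b]                   # k: first index of b
             for a_ij in a_row]                 # j: second index of a
            for a_row in a]                     # i: first index of a


a = [[1, 2], [3, 4]]     # 2 x 2
b = [[10, 20, 30]]       # 1 x 3
c = outer_4d(a, b)       # 2 x 2 x 1 x 3
```

Here c[1][0][0][2] is a[1][0] * b[0][2]: no index names appear anywhere, only positions.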

Most of the threading ops I do in PDL turn out to have zero active dimensions 
(scalar op, being looped over).  A few of them are vector ops (e.g. lookup 
into a table), and a lot are just matrix multiplication.  I have only had a 
few opportunities to use 5-D constructs (e.g. a collection of i x j image 
tiles that are themselves arrayed in an n x m matrix, with 3 colors).  I 
suspect that these sorts of higher dimensional operations will turn out to be 
ignored by virtually everyone -- but that they will turn out to be devilishly 
useful for a handful of brilliant people.





Autothreading generalization

2005-01-31 Thread Luke Palmer
S09 states (in Parallelized parameters and autothreading) that if you
use a parameter as an array subscript in a closure, the closure is a
candidate for autothreading.

-> $x, $y { @foo[$x;$y] }

Means:

-> ?$x = @foo.shape[0].range,
   ?$y = @foo.shape[1].range { @foo[$x;$y] }

And each range is automatically iterated.
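
A rough way to picture that defaulting, sketched in Python rather than Perl 6: the helper names shape and autothread below are hypothetical, and the sketch assumes a non-empty rectangular two-dimensional list. It only illustrates the idea that missing subscript arguments are filled in from the declared shape and then looped over.

```python
def shape(nested):
    """Return the dimensions of a non-empty rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


def autothread(array, body):
    """Call body(x, y) once for every index pair the 2-D shape allows."""
    rows, cols = shape(array)
    return [[body(x, y) for y in range(cols)] for x in range(rows)]


foo = [[1, 2, 3], [4, 5, 6]]
# The body uses its parameters only as subscripts, as in the closure above.
result = autothread(foo, lambda x, y: foo[x][y] * 10)
```

The caller never supplies x or y; both come from the shape. That is also why passing the wrong number of arguments can change the meaning silently.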

This is considered by some to be far too subtle.  A simple error in
number of parameters passed could result in very strange semantics;
also, declaring some dimensions for efficiency can wildly change some
semantics. The situation is made somewhat better by the fact that this
only happens on arrays that have predeclared dimensions (though I'd
argue that that's even more subtle).

I have a different idea.

Let's put the current meaning of the qw «» aside for the moment.  We'll
now use them as threading brackets.

The bracketing construct «@foo[$^i]» makes a junction-like object
threaded over all reasonable values of $^i.  Similarly,
«@foo[$^i] * @bar[$^j]» creates a two-dimensional object which is the
outer product of @foo and @bar under multiplication (just iterating over
all values of $^i and $^j).

In the case that the values of $^i and $^j cannot be determined from the
way they are used, some extra syntax will be necessary.  I'm not sure
what that is (suggestions welcome).

In list context, the objects expand out into appropriately-dimensioned
lists.  In scalar context they behave much like junctions, threading all
operations, but they perform inner products as many times as necessary.

So:

my @result = «@foo[$^i] + @bar[$^i]»

Is the same as:

my @result = «@foo[$^i]» + «@bar[$^i]»
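
To make the difference between a shared placeholder and distinct placeholders concrete, here is a small Python sketch. It is only one reading of the proposal: it assumes that "all reasonable values" of an index means every valid subscript of the arrays it appears in, and the variable names are illustrative.

```python
foo = [1, 2, 3]
bar = [10, 20, 30]

# Same placeholder in both subscripts: one loop, elements paired up.
elementwise = [foo[i] + bar[i] for i in range(min(len(foo), len(bar)))]

# Distinct placeholders: one loop per placeholder, giving the
# two-dimensional outer product under multiplication.
outer = [[foo[i] * bar[j] for j in range(len(bar))]
         for i in range(len(foo))]
```

So elementwise has one dimension (one placeholder), while outer has two (two placeholders), with outer[i][j] equal to foo[i] * bar[j].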

If you give it a statement without placeholders:

If it's a plain array, it creates an appropriately dimensioned
object.

If it's a scalar and an iterator, then it iterates it.  If it's any
other scalar there is an error.

These are lexical distinctions (except for checking whether something is
an iterator).

Here comes the fun part.

The typical hyper operation now looks like:

my @result = «@foo» + «@bar»;

And we can drop the outer brackets, saying that they're implied in this
simple case.

my @result = @foo + @bar;

And we also have a list iterator notation:

for $fh {
say .uc;
}

Unfortunately, the scalar iterator notation would have to be different.
But perhaps it should be.

Here are some examples derived from S09:

To write a typical tensor product:

C_{ijkl} = A_{ij} * B_{kl}

You write either of:

«@C[$^i; $^j; $^k; $^l] = @A[$^i; $^j] * @B[$^k; $^l]»
@C = «@A[$^i; $^j] * @B[$^k; $^l]»

Or to write another typical tensor product:

a^j = L_i^j b^i

You write either of:

«@a[$^j] = @L[$^i; $^j] * @b[$^i]»
@a = @L * @b;

(The last one works because the first index of @L is the one we want to
iterate over---like PDL threading)

As for stealing the french brackets, I think that it's a justified
cause.  They're already used for hyper operations, and this is just
generalizing that.  I argue that the interpolating qw meaning will be
the most neglected quote around.  For it to be useful, you have to be
slicing on variables and constants at the same time, which is quite
uncommon.

Luke


Re: Autothreading generalization

2005-01-31 Thread Craig DeForest
Quoth Luke Palmer on Monday 31 January 2005 03:46 pm,
 C_{ijkl} = A_{ij} * B_{kl}

 You write either of:

 «@C[$^i; $^j; $^k; $^l] = @A[$^i; $^j] * @B[$^k; $^l]»
 @C = «@A[$^i; $^j] * @B[$^k; $^l]»

Hmm... This is both insanely great and also greatly insane.  

The issue is that, although the tensor notation is powerful, the readability 
is becoming lost in all the sigils/funny_characters on the thread variables.

Most of the non-perl-geek scientific-computing people I know already balk at 
the '$' and '@' characters because they increase the amount of black noise in 
scientific code too much; constructions like that just might send them all 
screaming back to FORTRAN.  Is there a way to generalize that reduces the 
amount of black noise so that the expression shines through?

@C[ ^i; ^j; ^k; ^l ] = @A[ ^i; ^j ] * @B[ ^k; ^l ] 

is much better from a readability standpoint since the j's and k's are 
actually visible, but may be horrific from a parsing perspective.



Re: Autothreading generalization

2005-01-31 Thread Luke Palmer
Luke Palmer writes:
 Or to write another typical tensor product:
 
 a^j = L_i^j b^i
 
 You write either of:
 
 «@a[$^j] = @L[$^i; $^j] * @b[$^i]»
 @a = @L * @b;

Or not.  There's that implicit Einstein summation involved, and a
general purpose programming language isn't about to give dibs to
summation.

I think it would have to be:

@a = reduce { @^a + @^b } @L * @b;

And that ain't so bad (much more explicit, certainly).
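
For comparison, the same explicit summation sketched in Python. This is an illustration under stated assumptions, not a definition of the Perl 6 semantics: it assumes @L * @b threads over the first index of @L (scaling row i of L by b[i]), and that the reduce block adds two rows element by element. The name contract is made up.

```python
from functools import reduce


def contract(L, b):
    """Compute a[j] = sum over i of L[i][j] * b[i], with the sum explicit."""
    # Thread over the first index of L: row i is scaled by b[i].
    scaled_rows = [[L_ij * b_i for L_ij in row] for row, b_i in zip(L, b)]
    # Fold the rows together with elementwise addition.
    return reduce(lambda u, v: [x + y for x, y in zip(u, v)], scaled_rows)


L = [[1, 2], [3, 4], [5, 6]]   # indexed as L[i][j], 3 x 2
b = [1, 10, 100]               # indexed as b[i]
a = contract(L, b)             # indexed as a[j]
```

The multiplication and the summation are separate steps here, which is the point of spelling out the reduce: nothing is summed unless you ask for it.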

Luke


Re: Autothreading generalization

2005-01-31 Thread Luke Palmer
Craig DeForest writes:
 Quoth Luke Palmer on Monday 31 January 2005 03:46 pm,
  C_{ijkl} = A_{ij} * B_{kl}
 
  You write either of:
 
  «@C[$^i; $^j; $^k; $^l] = @A[$^i; $^j] * @B[$^k; $^l]»
  @C = «@A[$^i; $^j] * @B[$^k; $^l]»
 
 Hmm... This is both insanely great and also greatly insane.  
 
 The issue is that, although the tensor notation is powerful, the readability 
 is becoming lost in all the sigils/funny_characters on the thread variables.
 
 Most of the non-perl-geek scientific-computing people I know already balk at 
 the '$' and '@' characters because they increase the amount of black noise in 
 scientific code too much; constructions like that just might send them all 
 screaming back to FORTRAN.  Is there a way to generalize that reduces the 
 amount of black noise so that the expression shines through?
 
 @C[ ^i; ^j; ^k; ^l ] = @A[ ^i; ^j ] * @B[ ^k; ^l ] 
 
 is much better from a readability standpoint since the j's and k's are 
 actually visible, but may be horrific from a parsing perspective.

Yeah, the sigils do get in the way for small placeholder variables like
these.

«@C[ $i; $j; $k; $l ] = @A[ $i; $j ] * @B[ $k; $l ]»

Losing the carets doesn't do much for us (and would force us to use the
explicit syntax, whatever that might be).  Hmm, on the other hand, ^
doesn't mean anything in term context yet.  I feel uncomfortable about
allowing ^ as a shorthand for $^, since every other variable in the
whole damn language has one of the four standard sigils.

Luke