Re: Autothreading generalization
Luke Palmer writes:
> Yeah, the sigils do get in the way for small placeholder variables like these.
>
>     @C[ $i; $j; $k; $l ] = @A[ $i; $j ] * @B[ $k; $l ]
>
> Losing the carets doesn't do much for us (and would force us to use the explicit syntax, whatever that might be). Hmm, on the other hand, ^ doesn't mean anything in term context yet.
>
> I feel uncomfortable about allowing ^ as a shorthand for $^, since every other variable in the whole damn language has one of the four standard sigils.

What about adding a fifth sigil to the language, to be used only with placeholder variables? That way ^ (or whatever character we choose) wouldn't be shorthand for $^, but part of the actual variable name, as the other sigils are. The character should be chosen for its readability in expressions like these.

Would placeholder variables be used often enough to warrant their own sigil?

> Luke

--
Markus Laire
Jam. 1:5-6
Re: Autothreading generalization
On Tuesday 01 February 2005 01:18 am, Markus Laire wrote:
> Luke Palmer writes:
> > Yeah, the sigils do get in the way for small placeholder variables like these:
> >
> >     @C[ $i; $j; $k; $l ] = @A[ $i; $j ] * @B[ $k; $l ]
>
> ...
>
> Would placeholder variables be used often enough to warrant their own sigil?

Can't say for sure. This style of explicit threading was tried in PDL but turned out to be less important than robust implicit threading, and the explicit threading (handled by block-scoped thread variables that you would declare before using them) never got robust enough to be considered a production feature.

The current threading engine in PDL uses positional dimension slots to work out what is what: active dimensions have to be at the start of the dimension list, and thread dimensions have to be at the end. In principle, this is a Bad Thing, since you sometimes have to transpose arrays just to get the dims in the right slots; in practice it turns out to be a Good Thing: it is the right optimization for nearly all of the expressions that ever get written.

For example, with PDL-style threading, Luke's expression (written by someone paranoid) would be:

    $c = $a(:,:,*1,*1) * $b(*1,*1,:,:); # perl5/PDL 4-D outer product

where the names have been replaced by dimensional positions. That still turns out to be more concise than the named version.

Most of the threading ops I do in PDL turn out to have zero active dimensions (a scalar op being looped over). A few of them are vector ops (e.g. lookup into a table), and a lot are just matrix multiplication. I have had only a few opportunities to use 5-D constructs (e.g. a collection of i x j image tiles that are themselves arrayed in an n x m matrix, with 3 colors).

I suspect that these sorts of higher-dimensional operations will turn out to be ignored by virtually everyone -- but that they will turn out to be devilishly useful for a handful of brilliant people.
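As a rough illustration of the positional style described above, the following Python sketch threads a scalar operation over nested lists and stretches any dimension of length 1, much as a dummy dimension is stretched. The helper `thread` is hypothetical, the sample data are made up, and nested lists indexed outermost-first are an assumption of this sketch; it does not reproduce PDL's storage order or its actual threading engine.

```python
from operator import mul


def thread(op, a, b):
    """Apply a scalar op over nested lists, stretching length-1 dimensions.

    Hypothetical helper for illustration only; not PDL's engine.
    """
    a_is_list = isinstance(a, list)
    b_is_list = isinstance(b, list)
    if not a_is_list and not b_is_list:
        return op(a, b)                      # zero active dimensions: scalar op
    if not a_is_list:
        return [thread(op, a, y) for y in b]
    if not b_is_list:
        return [thread(op, x, b) for x in a]
    if len(a) == len(b):
        return [thread(op, x, y) for x, y in zip(a, b)]
    if len(a) == 1:                          # dummy dimension on the left
        return [thread(op, a[0], y) for y in b]
    if len(b) == 1:                          # dummy dimension on the right
        return [thread(op, x, b[0]) for x in a]
    raise ValueError("dimension sizes do not match and neither is 1")


A = [[1, 2], [3, 4]]                         # shape (2, 2)
B = [[10, 20, 30], [40, 50, 60]]             # shape (2, 3)

A4 = [[[[x]] for x in row] for row in A]     # shape (2, 2, 1, 1)
B4 = [[B]]                                   # shape (1, 1, 2, 3)

C = thread(mul, A4, B4)                      # C[i][j][k][l] = A[i][j] * B[k][l]
```

No index names appear anywhere: the position of each length-1 slot alone determines which dimensions are looped over.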
Autothreading generalization
S09 states (in "Parallelized parameters and autothreading") that if you use a parameter as an array subscript in a closure, the closure is a candidate for autothreading.

    -> $x, $y { @foo[$x;$y] }

Means:

    -> ?$x = @foo.shape[0].range, ?$y = @foo.shape[1].range { @foo[$x;$y] }

And each range is automatically iterated.

This is considered by some to be far too subtle. A simple error in the number of parameters passed could result in very strange semantics; also, declaring some dimensions for efficiency can wildly change some semantics. The situation is made somewhat better by the fact that this only happens on arrays that have predeclared dimensions (though I'd argue that that's even more subtle).

I have a different idea. Let's put the current meaning of the «» qw aside for the moment. We'll now use them as threading brackets.

The bracketing construct «@foo[$^i]» makes a junction-like object threaded over all reasonable values of $^i. Similarly, «@foo[$^i] * @bar[$^j]» creates a two-dimensional object which is the outer product of @foo and @bar under multiplication (just iterating over all values of $^i and $^j). In the case that the values of $^i and $^j cannot be determined from the way they are used, some extra syntax will be necessary. I'm not sure what that is (suggestions welcome).

In list context, the objects expand out into appropriately-dimensioned lists. In scalar context they behave much like junctions, threading all operations, but they perform inner products as many times as necessary. So:

    my @result = «@foo[$^i]» + «@bar[$^i]»

Is the same as:

    my @result = «@foo[$^i] + @bar[$^i]»

If you give it a statement without placeholders:

    If it's a plain array, it creates an appropriately dimensioned object.
    If it's a scalar and an iterator, then it iterates it.
    If it's any other scalar, there is an error.

These are lexical distinctions (except for checking whether something is an iterator).

Here comes the fun part.
The typical hyper operation now looks like:

    my @result = « @foo »+« @bar »;

And we can drop the outer brackets, saying that they're implied in this simple case.

    my @result = @foo »+« @bar;

And we also have a list iterator notation:

    for «$fh» { say .uc; }

Unfortunately, the scalar iterator notation would have to be different. But perhaps it should be.

Here are some examples derived from S09:

To write a typical tensor product:

    C_{ijkl} = A_{ij} * B_{kl}

You write either of:

    «@C[$^i; $^j; $^k; $^l] = @A[$^i; $^j] * @B[$^k; $^l]»
    @C = «@A[$^i; $^j] * @B[$^k; $^l]»

Or to write another typical tensor product:

    a^j = L_i^j b^i

You write either of:

    «@a[$^j] = @L[$^i; $^j] * @b[$^i]»
    @a = «@L * @b»;

(The last one works because the first index of @L is the one we want to iterate over, like PDL threading.)

As for stealing the French brackets, I think that it's a justified cause. They're already used for hyper operations, and this is just generalizing that. I argue that the interpolating qw meaning will be the most neglected quote around. For it to be useful, you have to be slicing on variables and constants at the same time, which is quite uncommon.

Luke
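One way to read the proposal above is as a loop nest over named indices whose ranges are inferred from the arrays they subscript. The Python sketch below takes that reading and nothing more: `dims`, `lookup`, and `threaded` are hypothetical helpers, index names are single characters, and the arrays are assumed to be rectangular, non-empty nested lists. A shared name yields an elementwise result; distinct names yield an outer product.

```python
from operator import add, mul


def dims(array):
    """Shape of a rectangular, non-empty nested list (assumption of this sketch)."""
    shape = []
    while isinstance(array, list):
        shape.append(len(array))
        array = array[0]
    return shape


def lookup(array, indices):
    """Follow a sequence of integer indices into a nested list."""
    for i in indices:
        array = array[i]
    return array


def threaded(op, left, left_names, right, right_names):
    """Combine two arrays over named indices; output axes are in first-seen order."""
    sizes = {}
    for array, names in ((left, left_names), (right, right_names)):
        for name, size in zip(names, dims(array)):
            # Each name's range is inferred from where it is used.
            if sizes.setdefault(name, size) != size:
                raise ValueError(f"index {name!r} has conflicting ranges")
    order = list(sizes)

    def build(depth, env):
        if depth == len(order):
            l = lookup(left, [env[n] for n in left_names])
            r = lookup(right, [env[n] for n in right_names])
            return op(l, r)
        name = order[depth]
        return [build(depth + 1, {**env, name: i}) for i in range(sizes[name])]

    return build(0, {})


foo = [1, 2, 3]
bar = [10, 20, 30]
elementwise = threaded(add, foo, "i", bar, "i")   # same name: one shared loop
outer = threaded(mul, foo, "i", bar, "j")         # distinct names: outer product

A = [[1, 2], [3, 4]]
B = [[10, 20, 30], [40, 50, 60]]
C = threaded(mul, A, "ij", B, "kl")               # C[i][j][k][l] = A[i][j] * B[k][l]
```

The case where a range cannot be inferred from usage, which the message leaves open, is not handled here; every name in this sketch subscripts at least one array.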
Re: Autothreading generalization
Quoth Luke Palmer on Monday 31 January 2005 03:46 pm,
>     C_{ijkl} = A_{ij} * B_{kl}
>
> You write either of:
>
>     «@C[$^i; $^j; $^k; $^l] = @A[$^i; $^j] * @B[$^k; $^l]»
>     @C = «@A[$^i; $^j] * @B[$^k; $^l]»

Hmm... This is both insanely great and also greatly insane. The issue is that, although the tensor notation is powerful, the readability is getting lost in all the sigils/funny_characters on the thread variables. Most of the non-perl-geek scientific-computing people I know already balk at the '$' and '@' characters because they increase the amount of black noise in scientific code too much; constructions like that just might send them all screaming back to FORTRAN.

Is there a way to generalize that reduces the amount of black noise so that the expression shines through?

    @C[ ^i; ^j; ^k; ^l ] = @A[ ^i; ^j ] * @B[ ^k; ^l ]

is much better from a readability standpoint since the j's and k's are actually visible, but may be horrific from a parsing perspective.
Re: Autothreading generalization
Luke Palmer writes:
> Or to write another typical tensor product:
>
>     a^j = L_i^j b^i
>
> You write either of:
>
>     «@a[$^j] = @L[$^i; $^j] * @b[$^i]»
>     @a = «@L * @b»;

Or not. There's that implicit Einstein summation involved, and a general-purpose programming language isn't about to give dibs to summation. I think it would have to be:

    @a = reduce { @^a + @^b } «@L * @b»;

And that ain't so bad (much more explicit, certainly).

Luke
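The explicit form can be sketched in Python as a threaded product followed by a reduction. This assumes that @L is indexed as L[i][j] and that the threaded product pairs each row L[i] with the scalar b[i]; the helper `add_vectors` and the sample data are made up for illustration.

```python
from functools import reduce

L = [[1, 2], [3, 4], [5, 6]]    # L[i][j], with i in 0..2 and j in 0..1
b = [10, 20, 30]                # b[i]

# Threaded product over the first index of L: each row is scaled by b[i].
scaled = [[L_ij * b_i for L_ij in row] for row, b_i in zip(L, b)]


def add_vectors(u, v):
    """Elementwise sum of two equal-length vectors."""
    return [x + y for x, y in zip(u, v)]


# The summation over i is spelled out as a reduce, not implied by the notation.
a = reduce(add_vectors, scaled)  # a[j] = sum over i of L[i][j] * b[i]
```

The product step alone leaves a list of three vectors; only the reduce collapses the i dimension, which is the point of writing the summation explicitly.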
Re: Autothreading generalization
Craig DeForest writes:
> Quoth Luke Palmer on Monday 31 January 2005 03:46 pm,
> >     C_{ijkl} = A_{ij} * B_{kl}
> >
> > You write either of:
> >
> >     «@C[$^i; $^j; $^k; $^l] = @A[$^i; $^j] * @B[$^k; $^l]»
> >     @C = «@A[$^i; $^j] * @B[$^k; $^l]»
>
> Hmm... This is both insanely great and also greatly insane. The issue is that, although the tensor notation is powerful, the readability is getting lost in all the sigils/funny_characters on the thread variables. Most of the non-perl-geek scientific-computing people I know already balk at the '$' and '@' characters because they increase the amount of black noise in scientific code too much; constructions like that just might send them all screaming back to FORTRAN.
>
> Is there a way to generalize that reduces the amount of black noise so that the expression shines through?
>
>     @C[ ^i; ^j; ^k; ^l ] = @A[ ^i; ^j ] * @B[ ^k; ^l ]
>
> is much better from a readability standpoint since the j's and k's are actually visible, but may be horrific from a parsing perspective.

Yeah, the sigils do get in the way for small placeholder variables like these.

    @C[ $i; $j; $k; $l ] = @A[ $i; $j ] * @B[ $k; $l ]

Losing the carets doesn't do much for us (and would force us to use the explicit syntax, whatever that might be). Hmm, on the other hand, ^ doesn't mean anything in term context yet.

I feel uncomfortable about allowing ^ as a shorthand for $^, since every other variable in the whole damn language has one of the four standard sigils.

Luke