Re: Status of Stream Fusion?

J. Reinders Mon, 14 Nov 2022 03:32:58 -0800

Thank you both for the quick responses.

> Can you say precisely what you mean by "using stream fusion instead of 
> foldr/build fusion in base"?   For example, do you have a prototype library 
> that demonstrates what you intend, all except concatMap?



I believe the stream-fusion library [1] and in particular the Data.List.Stream 
module [2] implements a version of Data.List from base with stream fusion 
instead of foldr/build fusion. It is pretty old at this point, so it may not 
completely match up with the current Data.List module any more and it doesn’t 
use skip-less stream fusion yet.

I now also see that #915 [3] tracks the exact issue of replacing foldr/build 
fusion with stream fusion in base, but is closed because it required more 
research at the time. And in that thread, Sebastian already asked the same 
question as me 5 years ago [4]:

> At least this could get rid of the concatMap roadblock, are there any others 
> I'm not aware of?

[1] https://hackage.haskell.org/package/stream-fusion
[2] 
https://hackage.haskell.org/package/stream-fusion-0.1.2.5/docs/Data-List-Stream.html
[3] https://gitlab.haskell.org/ghc/ghc/-/issues/915
[4] https://gitlab.haskell.org/ghc/ghc/-/issues/915#note_141373

> But what about
> 
> concatMap (\x. Stream next (x*2 +x))
> 
> Then you want matching to succeed, with the substitution
> f :->  (\p. p*2 +p)
> 
> This is called "higher order matching" and is pretty tricky.


First of all, I’d like to clarify I’ll write ‘unification variable’, but that 
might be the wrong term. What I mean is a variable that is bound by a ‘forall’ 
in a rewrite rule. So in the example program, I’ll call ‘f’ and ’next’ 
unification variables and ‘x’ a local variable.

Higher order matching sounds like it makes the problem too general (although 
I’ll admit I haven’t looked into it fully). For this matching I feel like the 
core is to change the free variable ‘x’ in the expression ‘(x*2 +x)’ into a 
bound variable to get ‘\p -> p*2 +p’. That part sounds very easy. The only 
problem that remains is when to actually perform this change. I would suggest 
that this should happen when a unification variable ‘f’ is applied to a locally 
bound variable ‘x’. The local variable ‘x’ should only be allowed to occur in 
unification variables that are applied to it.

And indeed this seems to be what Sebastian suggests:

> perhaps you could prevent that by saying that `x` may not occur freely in 
> `next` or `f`, but the paper explicitly *wants* `x` to occur in `next`


The paper explicitly says that ‘x’ should *not* occur in ’next’ and ‘f’ (except 
indirectly because ‘f’ is applied to ‘x’), so that doesn’t seem like a problem 
to me.

In a way I’m suggesting that function application could be used in rewrite 
rules to indicate which local variables may scope over which unification 
variables. If a unification variable is applied to a local variable then the 
local variable may occur in that unification variable. It should not only match 
on a literal application.

I believe the reason that this is easier than higher order matching in general 
because it is restricted to applications of unification variables to locally 
bound variables. No other forms are required for the concatMap rule.

There can be some discussion about whether this syntax is really desirable 
because it changes the way locally bound variables are treated in rewrite 
rules, but I personally think the current behavior is not very reliable anyway. 
And I don’t think many people are using rewrite rules that contain local 
variables. But maybe it would be better to introduce special syntax for 
indicating scoping in rewrite rules. Duncan Coutts also uses this alternative 
notation:

    λx → Stream (⟨next⟩[]) (⟨e⟩[x])

But I’d guess it would be difficult to extend GHC’s parser to support that 
syntax. Re-using function application seems preferable from that perspective. 
Another alternative that I considered was this syntax:

    concatMap (\x -> Stream next e) = concatMap' next (\x -> e)

Now the matching local variable name on the left and the right could be used to 
infer that they are really the same and that ‘x’ should only be allowed to 
scope over ‘e’. GHC could just take the intersection of the scopes on the left 
and the right to deduce where the local variable ‘x’ should be allowed to occur 
(in this case that intersection is just ‘e’). But this syntax seems a bit more 
magical to me, so it isn’t my first choice. This also leads to ambiguities if 
two locally bound variables on the same side share the same name, but that 
could simply be rejected.

Cheers,

Jaro

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Re: Status of Stream Fusion?

Reply via email to