Re: Status of Stream Fusion?

2022-11-14 Thread Sebastian Graf
Yes, if you confine yourself to Miller's "pattern fragment" (Meta variables
may appear in the head of an application, but then the arguments may only
be *distinct* bound variables) then you might be fine.

Thanks for giving it a try!


Am Mo., 14. Nov. 2022 um 14:27 Uhr schrieb J. Reinders <
jaro.reind...@gmail.com>:

> I think higher order pattern unification is different from higher order
> matching because the unification means both sides are allowed to be
> patterns that may contain meta variables, while matching works on only one
> pattern and one concrete term.
>
> The example pattern ‘forall a. someFunction (\x -> a x x)’ would indeed be
> a potential problem, but indeed I think disallowing that is fine because it
> is never necessary to write such patterns in practice.
>
> The paper linked by Simon contains another example: ‘forall f x. f x’
> matched to ‘0’, which is also a problem because it has infinitely many
> possible solutions, but I would consider this out of scope because it
> involves an application of two unification variables. For the practical
> examples I’ve seen up to now, we only need to handle the application of one
> unification variable with one locally bound variable.
>
> I’m pretty optimistic now that it is possible to implement an algorithm
> that covers the practical use cases (also the Simon’s ‘mapFB' example falls
> into this category) while being much simpler than existing algorithms for
> both higher order unification and higher order matching.
>
> I’ll do some experiments with the rule matcher.
>
> Cheers,
> Jaro
>
> > On 14 Nov 2022, at 14:03, Sebastian Graf  wrote:
> >
> > > I believe the reason that this is easier than higher order matching in
> general because it is restricted to applications of unification variables
> to locally bound variables.
> >
> > Indeed, it is easier to confine oneself to the pattern fragment. I think
> that's entirely the point of pattern unification: Meta variables may appear
> in the head of an application, but then the arguments may only be distinct
> bound variables. E.g., `α x x` would be disallowed.
> >
> > Perhaps it is possible to tweak RULEs just enough to match
> >
> >   forall f next. concatMap (λx → Stream next (f x)) = concatMap' next f
> >
> > (Now that I re-read your RULE, the explicit app `f x` makes sense
> compared to the absence of an app in `next` to indicate it must not mention
> `x`. I completely overread it before.)
> >
> > If `f x` should match `x*2+x` with `f := \x -> 2*x+x` then that means
> RULE matching modulo beta reductions, for a start.
> > Do we also want eta? I think we do, because we are already matching
> modulo eta since https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6222
> (and are very careful not to increase arity of target term in doing so).
> >
> > There are many other troubling issues with pattern unification stemming
> from meta variables on both sides of the equation, but perhaps they don't
> come up because the target term does not contain meta variables. Solving a
> meta variable simply does `eqExpr` with solutions from other use sites, as
> today. It doesn't matter much that we are `eq`ing lambdas, because that's
> pretty simple to do.
> > I can't think of an obvious example where higher-order pattern
> *matching* (urgh, what an unfortunate intersection of meaning between
> "higher-order pattern unification" and "pattern matching") would be much
> more complicated, but I'm no expert in the field.
> >
> > I didn't know the paper Simon cited, but note that Miller's higher-order
> pattern unification (which has become bog standard in dependently-typed
> langs) identifies a useful subset that is also efficient to check. It's
> surprising that the authors do not compare the pattern fragment to the
> identified fragment of their implementation (although they cite it in one
> place as [16]).
> >
> > Anyway, unless someone comes up with a tricky counter-example for a RULE
> that would be complicated to check, you should probably just give it a try.
> >
> > Cheers,
> > Sebastian
> >
> >
> > Am Mo., 14. Nov. 2022 um 12:32 Uhr schrieb J. Reinders <
> jaro.reind...@gmail.com>:
> > Thank you both for the quick responses.
> >
> > > Can you say precisely what you mean by "using stream fusion instead of
> foldr/build fusion in base"?   For example, do you have a prototype library
> that demonstrates what you intend, all except concatMap?
> >
> >
> > I believe the stream-fusion library [1] and in particular the
> Data.List.Stream module [2] implements a version of Data.List from base
> with stream fusion instead of foldr/build fusion. It is pretty old at this
> point, so it may not completely match up with the current Data.List module
> any more and it doesn’t use skip-less stream fusion yet.
> >
> > I now also see that #915 [3] tracks the exact issue of replacing
> foldr/build fusion with stream fusion in base, but is closed because it
> required more research at the time. And in that thread, Sebastian 

Re: Status of Stream Fusion?

2022-11-14 Thread J. Reinders
I think higher order pattern unification is different from higher order 
matching because the unification means both sides are allowed to be patterns 
that may contain meta variables, while matching works on only one pattern and 
one concrete term.

The example pattern ‘forall a. someFunction (\x -> a x x)’ would indeed be a 
potential problem, but indeed I think disallowing that is fine because it is 
never necessary to write such patterns in practice.

The paper linked by Simon contains another example: ‘forall f x. f x’ matched 
to ‘0’, which is also a problem because it has infinitely many possible 
solutions, but I would consider this out of scope because it involves an 
application of two unification variables. For the practical examples I’ve seen 
up to now, we only need to handle the application of one unification variable 
with one locally bound variable.

I’m pretty optimistic now that it is possible to implement an algorithm that 
covers the practical use cases (also the Simon’s ‘mapFB' example falls into 
this category) while being much simpler than existing algorithms for both 
higher order unification and higher order matching.

I’ll do some experiments with the rule matcher.

Cheers,
Jaro

> On 14 Nov 2022, at 14:03, Sebastian Graf  wrote:
> 
> > I believe the reason that this is easier than higher order matching in 
> > general because it is restricted to applications of unification variables 
> > to locally bound variables. 
> 
> Indeed, it is easier to confine oneself to the pattern fragment. I think 
> that's entirely the point of pattern unification: Meta variables may appear 
> in the head of an application, but then the arguments may only be distinct 
> bound variables. E.g., `α x x` would be disallowed.
> 
> Perhaps it is possible to tweak RULEs just enough to match
> 
>   forall f next. concatMap (λx → Stream next (f x)) = concatMap' next f
> 
> (Now that I re-read your RULE, the explicit app `f x` makes sense compared to 
> the absence of an app in `next` to indicate it must not mention `x`. I 
> completely overread it before.)
> 
> If `f x` should match `x*2+x` with `f := \x -> 2*x+x` then that means RULE 
> matching modulo beta reductions, for a start. 
> Do we also want eta? I think we do, because we are already matching modulo 
> eta since https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6222 (and are 
> very careful not to increase arity of target term in doing so).
> 
> There are many other troubling issues with pattern unification stemming from 
> meta variables on both sides of the equation, but perhaps they don't come up 
> because the target term does not contain meta variables. Solving a meta 
> variable simply does `eqExpr` with solutions from other use sites, as today. 
> It doesn't matter much that we are `eq`ing lambdas, because that's pretty 
> simple to do.
> I can't think of an obvious example where higher-order pattern *matching* 
> (urgh, what an unfortunate intersection of meaning between "higher-order 
> pattern unification" and "pattern matching") would be much more complicated, 
> but I'm no expert in the field.
> 
> I didn't know the paper Simon cited, but note that Miller's higher-order 
> pattern unification (which has become bog standard in dependently-typed 
> langs) identifies a useful subset that is also efficient to check. It's 
> surprising that the authors do not compare the pattern fragment to the 
> identified fragment of their implementation (although they cite it in one 
> place as [16]).
> 
> Anyway, unless someone comes up with a tricky counter-example for a RULE that 
> would be complicated to check, you should probably just give it a try.
> 
> Cheers,
> Sebastian
> 
> 
> Am Mo., 14. Nov. 2022 um 12:32 Uhr schrieb J. Reinders 
> :
> Thank you both for the quick responses.
> 
> > Can you say precisely what you mean by "using stream fusion instead of 
> > foldr/build fusion in base"?   For example, do you have a prototype library 
> > that demonstrates what you intend, all except concatMap?
> 
> 
> I believe the stream-fusion library [1] and in particular the 
> Data.List.Stream module [2] implements a version of Data.List from base with 
> stream fusion instead of foldr/build fusion. It is pretty old at this point, 
> so it may not completely match up with the current Data.List module any more 
> and it doesn’t use skip-less stream fusion yet.
> 
> I now also see that #915 [3] tracks the exact issue of replacing foldr/build 
> fusion with stream fusion in base, but is closed because it required more 
> research at the time. And in that thread, Sebastian already asked the same 
> question as me 5 years ago [4]:
> 
> > At least this could get rid of the concatMap roadblock, are there any 
> > others I'm not aware of?
> 
> [1] https://hackage.haskell.org/package/stream-fusion
> [2] 
> https://hackage.haskell.org/package/stream-fusion-0.1.2.5/docs/Data-List-Stream.html
> [3] https://gitlab.haskell.org/ghc/ghc/-/issues/915
> [4] 

Re: Status of Stream Fusion?

2022-11-14 Thread Sebastian Graf
> I believe the reason that this is easier than higher order matching in
general because it is restricted to applications of unification variables
to locally bound variables.

Indeed, it is easier to confine oneself to the pattern fragment. I think
that's entirely the point of pattern unification: Meta variables may appear
in the head of an application, but then the arguments may only be distinct
bound variables. E.g., `α x x` would be disallowed.

Perhaps it is possible to tweak RULEs just enough to match

  forall f next. concatMap (λx → Stream next (f x)) = concatMap' next f

(Now that I re-read your RULE, the explicit app `f x` makes sense compared
to the absence of an app in `next` to indicate it must not mention `x`. I
completely overread it before.)

If `f x` should match `x*2+x` with `f := \x -> 2*x+x` then that means RULE
matching modulo beta reductions, for a start.
Do we also want eta? I think we do, because we are already matching modulo
eta since https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6222 (and are
very careful not to increase arity of target term in doing so).

There are many other troubling issues with pattern unification stemming
from meta variables on both sides of the equation, but perhaps they don't
come up because the target term does not contain meta variables. Solving a
meta variable simply does `eqExpr` with solutions from other use sites, as
today. It doesn't matter much that we are `eq`ing lambdas, because that's
pretty simple to do.
I can't think of an obvious example where higher-order pattern *matching*
(urgh, what an unfortunate intersection of meaning between "higher-order
pattern unification" and "pattern matching") would be much more
complicated, but I'm no expert in the field.

I didn't know the paper Simon cited, but note that Miller's higher-order
pattern unification (which has become bog standard in dependently-typed
langs) identifies a useful subset that is also efficient to check. It's
surprising that the authors do not compare the pattern fragment to the
identified fragment of their implementation (although they cite it in one
place as [16]).

Anyway, unless someone comes up with a tricky counter-example for a RULE
that would be complicated to check, you should probably just give it a try.

Cheers,
Sebastian


Am Mo., 14. Nov. 2022 um 12:32 Uhr schrieb J. Reinders <
jaro.reind...@gmail.com>:

> Thank you both for the quick responses.
>
> > Can you say precisely what you mean by "using stream fusion instead of
> foldr/build fusion in base"?   For example, do you have a prototype library
> that demonstrates what you intend, all except concatMap?
>
>
> I believe the stream-fusion library [1] and in particular the
> Data.List.Stream module [2] implements a version of Data.List from base
> with stream fusion instead of foldr/build fusion. It is pretty old at this
> point, so it may not completely match up with the current Data.List module
> any more and it doesn’t use skip-less stream fusion yet.
>
> I now also see that #915 [3] tracks the exact issue of replacing
> foldr/build fusion with stream fusion in base, but is closed because it
> required more research at the time. And in that thread, Sebastian already
> asked the same question as me 5 years ago [4]:
>
> > At least this could get rid of the concatMap roadblock, are there any
> others I'm not aware of?
>
> [1] https://hackage.haskell.org/package/stream-fusion
> [2]
> https://hackage.haskell.org/package/stream-fusion-0.1.2.5/docs/Data-List-Stream.html
> [3] https://gitlab.haskell.org/ghc/ghc/-/issues/915
> [4] https://gitlab.haskell.org/ghc/ghc/-/issues/915#note_141373
>
> > But what about
> >
> > concatMap (\x. Stream next (x*2 +x))
> >
> > Then you want matching to succeed, with the substitution
> > f :->  (\p. p*2 +p)
> >
> > This is called "higher order matching" and is pretty tricky.
>
>
> First of all, I’d like to clarify I’ll write ‘unification variable’, but
> that might be the wrong term. What I mean is a variable that is bound by a
> ‘forall’ in a rewrite rule. So in the example program, I’ll call ‘f’ and
> ’next’ unification variables and ‘x’ a local variable.
>
> Higher order matching sounds like it makes the problem too general
> (although I’ll admit I haven’t looked into it fully). For this matching I
> feel like the core is to change the free variable ‘x’ in the expression
> ‘(x*2 +x)’ into a bound variable to get ‘\p -> p*2 +p’. That part sounds
> very easy. The only problem that remains is when to actually perform this
> change. I would suggest that this should happen when a unification variable
> ‘f’ is applied to a locally bound variable ‘x’. The local variable ‘x’
> should only be allowed to occur in unification variables that are applied
> to it.
>
> And indeed this seems to be what Sebastian suggests:
>
> > perhaps you could prevent that by saying that `x` may not occur freely
> in `next` or `f`, but the paper explicitly *wants* `x` to occur in `next`
>
>
> The paper explicitly 

Re: Status of Stream Fusion?

2022-11-14 Thread J. Reinders
Thank you both for the quick responses.

> Can you say precisely what you mean by "using stream fusion instead of 
> foldr/build fusion in base"?   For example, do you have a prototype library 
> that demonstrates what you intend, all except concatMap?


I believe the stream-fusion library [1] and in particular the Data.List.Stream 
module [2] implements a version of Data.List from base with stream fusion 
instead of foldr/build fusion. It is pretty old at this point, so it may not 
completely match up with the current Data.List module any more and it doesn’t 
use skip-less stream fusion yet.

I now also see that #915 [3] tracks the exact issue of replacing foldr/build 
fusion with stream fusion in base, but is closed because it required more 
research at the time. And in that thread, Sebastian already asked the same 
question as me 5 years ago [4]:

> At least this could get rid of the concatMap roadblock, are there any others 
> I'm not aware of?

[1] https://hackage.haskell.org/package/stream-fusion
[2] 
https://hackage.haskell.org/package/stream-fusion-0.1.2.5/docs/Data-List-Stream.html
[3] https://gitlab.haskell.org/ghc/ghc/-/issues/915
[4] https://gitlab.haskell.org/ghc/ghc/-/issues/915#note_141373

> But what about
> 
> concatMap (\x. Stream next (x*2 +x))
> 
> Then you want matching to succeed, with the substitution
> f :->  (\p. p*2 +p)
> 
> This is called "higher order matching" and is pretty tricky.


First of all, I’d like to clarify I’ll write ‘unification variable’, but that 
might be the wrong term. What I mean is a variable that is bound by a ‘forall’ 
in a rewrite rule. So in the example program, I’ll call ‘f’ and ’next’ 
unification variables and ‘x’ a local variable.

Higher order matching sounds like it makes the problem too general (although 
I’ll admit I haven’t looked into it fully). For this matching I feel like the 
core is to change the free variable ‘x’ in the expression ‘(x*2 +x)’ into a 
bound variable to get ‘\p -> p*2 +p’. That part sounds very easy. The only 
problem that remains is when to actually perform this change. I would suggest 
that this should happen when a unification variable ‘f’ is applied to a locally 
bound variable ‘x’. The local variable ‘x’ should only be allowed to occur in 
unification variables that are applied to it.

And indeed this seems to be what Sebastian suggests:

> perhaps you could prevent that by saying that `x` may not occur freely in 
> `next` or `f`, but the paper explicitly *wants* `x` to occur in `next`


The paper explicitly says that ‘x’ should *not* occur in ’next’ and ‘f’ (except 
indirectly because ‘f’ is applied to ‘x’), so that doesn’t seem like a problem 
to me.

In a way I’m suggesting that function application could be used in rewrite 
rules to indicate which local variables may scope over which unification 
variables. If a unification variable is applied to a local variable then the 
local variable may occur in that unification variable. It should not only match 
on a literal application.

I believe the reason that this is easier than higher order matching in general 
because it is restricted to applications of unification variables to locally 
bound variables. No other forms are required for the concatMap rule.

There can be some discussion about whether this syntax is really desirable 
because it changes the way locally bound variables are treated in rewrite 
rules, but I personally think the current behavior is not very reliable anyway. 
And I don’t think many people are using rewrite rules that contain local 
variables. But maybe it would be better to introduce special syntax for 
indicating scoping in rewrite rules. Duncan Coutts also uses this alternative 
notation:

λx → Stream (⟨next⟩[]) (⟨e⟩[x])

But I’d guess it would be difficult to extend GHC’s parser to support that 
syntax. Re-using function application seems preferable from that perspective. 
Another alternative that I considered was this syntax:

concatMap (\x -> Stream next e) = concatMap' next (\x -> e)

Now the matching local variable name on the left and the right could be used to 
infer that they are really the same and that ‘x’ should only be allowed to 
scope over ‘e’. GHC could just take the intersection of the scopes on the left 
and the right to deduce where the local variable ‘x’ should be allowed to occur 
(in this case that intersection is just ‘e’). But this syntax seems a bit more 
magical to me, so it isn’t my first choice. This also leads to ambiguities if 
two locally bound variables on the same side share the same name, but that 
could simply be rejected.

Cheers,

Jaro

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Status of Stream Fusion?

2022-11-14 Thread Simon Peyton Jones
Jaro

1. Is the ‘concatMap’ problem really the only problem left on the way to
> using stream fusion instead of foldr/build fusion in base?
>

Can you say precisely what you mean by "using stream fusion instead of
foldr/build fusion in base"?   For example, do you have a prototype library
that demonstrates what you intend, all except concatMap?

  concatMap (λx → Stream next (f x)) = concatMap' next f
>

If it was lined up nicely like that it'd be easy. But what about

concatMap (\x. Stream next (x*2 +x))

Then you want matching to succeed, with the substitution
f :->  (\p. p*2 +p)

This is called "higher order matching" and is pretty tricky.  You could
start with this paper

.

My comment on
#22361 also refers to the potential usefulness of higher order matching.

Simon

On Mon, 14 Nov 2022 at 09:47, J. Reinders  wrote:

> Dear GHC devs,
>
> I’m interested in stream fusion and would like to see what it takes to fix
> the remaining issues, so that it can replace foldr/build fusion in base.
>
> First of all I would like to know what exactly the challenges are that are
> left. I believe one of the main remaining problems is the fusion of
> ‘concatMap’. Is that really the only thing?
>
> Secondly, I would like to know what has already been tried. I know
> Sebastian Graf has spent a lot of effort trying to get SpecConstr to work
> on lambda arguments without success. I’ve read that Sebastian now considers
> the static argument transformation more promising.
>
> However, Duncan Coutts proposed in his thesis to make rewrite rules
> slightly more powerful and use the rewrite rule:
>
> concatMap (λx → Stream next (f x)) = concatMap' next f
>
> Has that ever been tried? If so, what is the problem with this rewrite
> rule approach? I can understand that the `f x` function application is
> usually in a more reduced form, but it seems relatively easy to make the
> rewrite rule matcher smart enough to see through beta-reductions like that.
>
> So my main questions are:
>
> 1. Is the ‘concatMap’ problem really the only problem left on the way to
> using stream fusion instead of foldr/build fusion in base?
>
> 2. Has the rewrite rule approach to solving the ‘concatMap’ problem ever
> been tried?
>
> Any other information about the current status of stream fusion is also
> much appreciated.
>
> Cheers,
>
> Jaro Reinders
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Status of Stream Fusion?

2022-11-14 Thread Sebastian Graf
Hi Jaro,

I'm very glad that you are interested in picking up the pieces I left
behind!

Re: SpecConstr: Yes, that pass is already moderately complicated and gets
much more complicated when you start specialisation for non-bound lambdas,
because that would need higher-order pattern unification in RULEs to be
useful (as well as SpecConstr applying those RULEs when specialising). One
smart suggestion by Simon to prevent that was to only specialise on bound
lambdas only and do a pass before that assigns a name to every lambda. I'm
not sure if that is enough for the recursive specialisation problem arising
in stream fusion, though. I simply haven't played it through so far.

Re: Static argument transformation: I find that much more promising indeed.
Not the pass that transforms the RHS of a binding, but a my new idea that
evaluates in the Simplifier for a recursive function whether it makes sense
to inline its on-the-fly SAT'd form. See
https://gitlab.haskell.org/ghc/ghc/-/issues/18962 and the prototype
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4553 for details. It
does just fine on concatMap stream fusion pipelines (at least when the
stepper function is small enough to inline), although I remember there are
a few annoying issues regarding SAT'ing stable unfoldings (think INLINE
recursive functions). Just thinking about it makes me excited again, but
I've to finish other stuff in my PhD first.

Re: Beefing up rewrite rules: I *think* the RULE you suggest amounts to
implementing pattern unification in RULEs (perhaps you could prevent that
by saying that `x` may not occur freely in `next` or `f`, but the paper
explicitly *wants* `x` to occur in `next`). I'd find that cool, but I'm a
bit wary that the RULE matcher (which I'm not very familiar with) might
behave subtly different in certain key scenarios than vanilla pattern
unification and we might get breaking changes as a result.
At the moment, RULEs matching only ever matches a term against a pattern,
where the former has no "unification variables", so it might be simpler
than full-blown pattern unification.

So in short, the problem was never that we couldn't write down the RULE,
but that it's hard to implement in GHC.

I can't really answer (1) or (2), but perhaps my summary above is useful to
you.

Sebastian

Am Mo., 14. Nov. 2022 um 10:47 Uhr schrieb J. Reinders <
jaro.reind...@gmail.com>:

> Dear GHC devs,
>
> I’m interested in stream fusion and would like to see what it takes to fix
> the remaining issues, so that it can replace foldr/build fusion in base.
>
> First of all I would like to know what exactly the challenges are that are
> left. I believe one of the main remaining problems is the fusion of
> ‘concatMap’. Is that really the only thing?
>
> Secondly, I would like to know what has already been tried. I know
> Sebastian Graf has spent a lot of effort trying to get SpecConstr to work
> on lambda arguments without success. I’ve read that Sebastian now considers
> the static argument transformation more promising.
>
> However, Duncan Coutts proposed in his thesis to make rewrite rules
> slightly more powerful and use the rewrite rule:
>
> concatMap (λx → Stream next (f x)) = concatMap' next f
>
> Has that ever been tried? If so, what is the problem with this rewrite
> rule approach? I can understand that the `f x` function application is
> usually in a more reduced form, but it seems relatively easy to make the
> rewrite rule matcher smart enough to see through beta-reductions like that.
>
> So my main questions are:
>
> 1. Is the ‘concatMap’ problem really the only problem left on the way to
> using stream fusion instead of foldr/build fusion in base?
>
> 2. Has the rewrite rule approach to solving the ‘concatMap’ problem ever
> been tried?
>
> Any other information about the current status of stream fusion is also
> much appreciated.
>
> Cheers,
>
> Jaro Reinders
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs