> This is a neat trick, though I’ve had trouble getting it to work
> reliably in my experiments (even though I was using GHC.Types.SPEC).
> That said, I also feel like I don’t understand the subtleties of
> SpecConstr very well, so it could have been my fault.

Yeah, SPEC is quite unreliable, because IIRC at some point it's either
consumed or irrelevant.
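For reference, the intended idiom is the one from stream fusion: pass
SPEC as a strict argument of the recursive loop, so that SpecConstr
specialises the loop for the concrete stream state. A minimal sketch
(the Stream/Step types here are just for illustration, not the encoding
from my gist):

    {-# LANGUAGE BangPatterns, ExistentialQuantification #-}
    import GHC.Types (SPEC (..))

    data Step s a = Yield a s | Done
    data Stream a = forall s. Stream (s -> Step s a) s

    -- The strict SPEC argument asks SpecConstr to specialise `go`
    -- for the concrete state type of the stream it consumes.
    sumS :: Stream Int -> Int
    sumS (Stream step s0) = go SPEC 0 s0
      where
        go !_ !acc s = case step s of
          Yield a s' -> go SPEC (acc + a) s'
          Done       -> acc

That's where the unreliability bites: by the time SpecConstr runs, the
SPEC argument may already have been consumed or become irrelevant.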
But none of the combinators you mentioned should rely on SpecConstr!
They are all non-recursive, so the Simplifier will take care of
"specialisation". And it works just fine; I just tried it:
https://gist.github.com/sgraf812/d15cd3ee9cc9bd2e72704f90567ef35b

`test` there is optimised reasonably well. The problem is that we don't
have the concrete a..f, so we can't cancel away all allocations. If you
give me a closed program where we fail to optimise away every bit of
allocation (and it isn't due to size concerns), I would be surprised.
Although there might be a bug in how I encoded the streams; maybe we
can be a bit stricter here or there if need be. `test2 = (double &&&
inc) >>> arr (uncurry (+)) :: SF Int Int` is such a function, and we
optimise it down to (the equivalent of) `arr (\n -> 3*n+1)`.

Maybe you can give a medium-sized program where you think GHC does a
poor job at optimising?

On Tue, 31 Mar 2020 at 23:18, Alexis King <lexi.lam...@gmail.com>
wrote:

> Sebastian and Simon,
>
> Thank you both for your responses; they are quite helpful! I agree
> with both of you that figuring out how to do this kind of
> specialization without any guidance from the programmer seems rather
> intractable. It’s too hard to divine where it would actually be
> beneficial, and even if you could, it seems likely that other
> optimizations would get in the way of it actually working out.
>
> I’ve been trying to figure out whether it would be possible to help
> the optimizer out by annotating the program with special combinators,
> like the existing ones provided by GHC.Magic. However, I haven’t yet
> come up with anything that seems like it would actually work.
>
> On Mar 31, 2020, at 06:12, Simon Peyton Jones
> <simo...@microsoft.com> wrote:
>
> > Wow – tricky stuff! I would never have thought of trying to
> > optimise that program, but it’s fascinating that you get lots and
> > lots of them from FRP.
>
> For context, the reason you get all these tiny loops is that
> arrowized FRP uses the Arrow and ArrowChoice interfaces to build its
> programs, and those interfaces use tiny combinator functions like
> these:
>
>     first :: Arrow a => a b c -> a (b, d) (c, d)
>     (***) :: Arrow a => a b d -> a c e -> a (b, c) (d, e)
>     (|||) :: ArrowChoice a => a b d -> a c d -> a (Either b c) d
>
> This means you end up with programs built out of dozens or hundreds
> of uses of these tiny combinators. You get code that looks like
>
>     first (left (arr f >>> g ||| right h) *** second i)
>
> and this is a textbook situation where you want to specialize and
> inline all the combinators! For arrows without this tricky recursion,
> doing that works as intended: GHC’s simplifier does what it’s
> supposed to, and you get fast code.
>
> But with FRP, each of these combinators is recursive. This means you
> often get really awful code that looks like this:
>
>     arr (\case { Nothing -> Left (); Just x -> Right x }) >>> (f ||| g)
>
> This converts a Maybe to an Either, then branches on it. It’s
> analogous to writing something like this in direct-style code:
>
>     let y = case x of { Nothing -> Left (); Just x -> Right x }
>     in case y of { Left () -> f (); Right x -> g x }
>
> We really want the optimizer to eliminate the intermediate Either and
> branch on the Maybe directly, and if GHC could fuse these tiny
> recursive loops, it could! But without that fusion, all this
> pointless shuffling of values around remains in the optimized
> program.
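> To see why the simplifier gets stuck, recall the shape of SF (the
> same shape as in Simon’s add1 example below). Sketched with
> illustrative names of my own, the core combinators look like this:
>
>     newtype SF a b = SF { runSF :: a -> (b, SF a b) }
>
>     arrSF :: (a -> b) -> SF a b
>     arrSF f = SF (\a -> (f a, arrSF f))        -- self-recursive
>
>     composeSF :: SF a b -> SF b c -> SF a c
>     composeSF (SF f) (SF g) = SF $ \a ->
>       let (b, f') = f a
>           (c, g') = g b
>       in (c, composeSF f' g')                  -- self-recursive
>
> Because arrSF and composeSF call themselves, the simplifier treats
> them as loop breakers and will not unfold them at call sites, so the
> intermediate Either never gets eliminated by
> case-of-known-constructor.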
> > - I wonder whether it’d be possible to adjust the FRP library to
> >   generate easier-to-optimise code. Probably not, but worth asking.
>
> I think it’s entirely possible to somehow annotate these combinators
> to communicate this information to the optimizer, but I don’t know
> what the annotations ought to look like. (That’s the research part!)
>
> But I’m not very optimistic about getting the library to generate
> easier-to-optimize code with the tools available today. Sebastian’s
> example of SF2 and stream fusion sort of works, but in my experience,
> something like that doesn’t handle enough cases well enough to work
> on real arrow programs.
>
> > - Unrolling one layer of a recursive function. That seems harder:
> >   how do we know to *stop* unrolling as we successively simplify?
> >   One idea: do one layer of unrolling by hand, perhaps even in FRP
> >   source code:
> >
> >       add1rec = SF (\a -> let !b = a+1 in (b,add1rec))
> >       add1    = SF (\a -> let !b = a+1 in (b,add1rec))
>
> Yes, at one point I was playing with the idea of some kind of RULE
> that inserts GHC.Magic.inline on the specialized RHS. That way the
> programmer could ask for the unrolling explicitly, since it otherwise
> seems unreasonable to ask the compiler to figure it out.
>
> On Mar 31, 2020, at 08:08, Sebastian Graf <sgraf1...@gmail.com>
> wrote:
>
> > We can formulate SF as a classic Stream that needs an `a` to
> > produce its next element of type `b`, like this (SF2 below).
>
> This is a neat trick, though I’ve had trouble getting it to work
> reliably in my experiments (even though I was using GHC.Types.SPEC).
> That said, I also feel like I don’t understand the subtleties of
> SpecConstr very well, so it could have been my fault.
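> For concreteness, the kind of encoding in question looks roughly like
> this (my paraphrase, with illustrative names, not Sebastian’s exact
> SF2 definition):
>
>     {-# LANGUAGE ExistentialQuantification #-}
>
>     data Step s b = Step s b                 -- next state and output
>     data SF2 a b  = forall s. SF2 (s -> a -> Step s b) s
>
>     arr2 :: (a -> b) -> SF2 a b
>     arr2 f = SF2 (\s a -> Step s (f a)) ()
>
>     comp2 :: SF2 a b -> SF2 b c -> SF2 a c
>     comp2 (SF2 f s0) (SF2 g t0) = SF2 h (s0, t0)
>       where
>         h (s, t) a = case f s a of
>           Step s' b -> case g t b of
>             Step t' c -> Step (s', t') c
>
>     runSF2 :: SF2 a b -> [a] -> [b]
>     runSF2 (SF2 f s0) = go s0
>       where
>         go _ []       = []
>         go s (a : as) = case f s a of
>           Step s' b -> b : go s' as
>
> Note that, unlike the SF combinators, arr2 and comp2 are not
> recursive; the only recursion is in the single driver loop (runSF2
> here), which is where SPEC and SpecConstr are meant to apply.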
> The more fundamental problem I’ve found with that approach is that it
> doesn’t do very well for arrow combinators like (***) and (|||),
> which come up very often in arrow programs but rarely in streaming
> code. Fusing long chains of first/second/left/right is actually
> pretty easy with ordinary RULEs, but (***) and (|||) are much harder,
> since they have multiple continuations.
>
> At first it seems appealing to rewrite `f *** g` into `first f >>>
> second g`, which solves the immediate problem, but this is actually a
> lot less efficient after repeated rewritings. You end up rewriting
> `(f ||| g) *** h` into `first (left f) >>> first (right g) >>> second
> h`, turning two distinct branches into four, and larger programs
> suffer much worse exponential blowups.
>
> So that’s where I’ve gotten stuck! I’ve been toying with the idea of
> thinking about expression “shells”, so if you have something like
>
>     first (a ||| b) >>> c *** second (d ||| e) >>> f
>
> then you have a “shell” of the shape
>
>     first (● ||| ●) >>> ● *** second (● ||| ●) >>> ●
>
> which theoretically serves as a key for the specialization. You can
> then generate a specialization and a rule:
>
>     $s a b c d e f = ...
>     {-# RULES "shell" forall a b c d e f.
>           first (a ||| b) >>> c *** second (d ||| e) >>> f
>             = $s a b c d e f
>       #-}
>
> The question then becomes: how do you specify what these shells are,
> and how do you specify how to transform a shell into a specialized
> function? I don’t know, but it’s something a Core plugin could
> theoretically do. Maybe it makes sense for this domain-specific
> optimization to be a Core pass that runs before the simplifier, like
> the typeclass specializer currently is, but I haven’t explored that
> yet.
>
> Alexis