David

Yes, quite right.  Do make a ticket.  But it’s not easy to see a truly robust 
way to fix this.

Selector thunks look like
              a = case x of (p,q) -> p

You example has thunks of the form
              a = case x of (_,b)  -> case b of (c,_) -> c

We want this thunk to vanish entirely if x is bound to a pair whose second 
component is a pair.  But, if x is bound to a pair, whose second component is 
not yet evaluated, we want the thunk to partially vanish, becoming
              a = case b of (c,_) -> c
where b is the second component of x.

This seems hard in general.

You could imagine translating the example into
              b = case x of (_,b) -> b
              a = case b of (c,_) -> c

Here I have built two thunks rather than one, but each is (independently) a 
selector thunk.  We get just the right thing happening if x is evaluated but 
its second component is note. Hooray.  What’s not nice is that execution is 
slower in the case where the selector-thunk mechanism doesn’t fire.

Another possibility.  Make selector thunks carry a kind of “path” indicating 
which fields to pick out as they select a nested component of the data 
structure.  A kind of selector-thunk chain.

Another possibility: for suitable selector-like thunks, compile special code 
that is run by the garbage collector.

This deserves a ticket and a wiki page. Feel free to plunder the above.

Simon

From: ghc-devs <ghc-devs-boun...@haskell.org> On Behalf Of David Feuer
Sent: 09 March 2020 17:28
To: ghc-devs <ghc-devs@haskell.org>
Subject: Selector thunks again

The fragility of this feature remains frustrating. A few days ago, I wrote this 
code for building a complete binary tree from its breadth-first traversal. 
(This is an improvement of a version by Will Ness.)

data Tree a
  = Empty
  | Node a (Tree a) (Tree a)
  deriving Show

-- An infinite list
data IL a = a :< IL a
infixr 5 :<

bft :: [a] -> Tree a
bft xs = tree
  where
    tree :< subtrees = go xs subtrees

    go :: [a] -> IL (Tree a) -> IL (Tree a)
    go (a : as) ~(b1 :< ~(b2 :< bs)) = Node a b1 b2 :< go as bs
    go [] _ = fix (Empty :<)

When GHC compiles the lazy patterns, we get something essentially like this:

    go (a : as) ys =
      Node a
        (case ys of b1 :< _ -> b1)
        (case ys of _ :< b2 :< _ -> b2)
      :<
      go as (case ys of _ :< _ :< bs -> bs)

Now `case ys of b1 :< _ -> b1` is a selector thunk, which is cool. The GC can 
reduce it as soon as either of the other thunks is forced. But neither of the 
other two case expressions is a selector thunk, so neither will ever be reduced 
by the GC. If I consume the result tree using an inorder traversal, for 
example, then all the elements in the left subtree of the root will remain live 
until I start to consume the right subtree of the root.

I can instead write this:

   go (a : as) ys = Node a b1 b2 :< go as bs
      where
         {-# NOINLINE b2bs #-}
         b1 :< b2bs = ys
         b2 :< bs = b2bs

Now all the suspended selections are selector thunks, so things should clean up 
nicely. There are still three problems, though. The first is that this is 
harder to read. The second is that now we have four suspended selections 
instead of three. Finally, if b1 is not the first one forced, we'll need to 
force two thunks instead of one.

Can't we do any better?
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Reply via email to