Re: Distinct closure types vs. known infotables for stack frames

2023-06-27 Thread Alexis King
On Tue, Jun 27, 2023 at 4:13 AM Simon Peyton Jones <
simon.peytonjo...@gmail.com> wrote:

> In short, why are the design considerations for stack frames different to
> heap objects?  I think of a stack frame simply as a heap object that
> happens to be allocated on the stack
>

I agree with this perspective—I think it is generally an accurate one.
Indeed, I think it may very well be true that what I’ve described largely
applies to heap objects as well as stack frames, and working on
continuations just means I’ve much more time thinking about stacks. Perhaps
if I were working on the garbage collector I would be asking the same
question about heap objects.

For example, we have MVAR_CLEAN and MVAR_DIRTY, but each of those types is
only used by one statically-allocated infotable, as far as I can tell. In
some parts of the code, we check that the closure type is MVAR_CLEAN or
MVAR_DIRTY, but in other places, we check whether the infotable is
stg_MVAR_CLEAN_info or stg_MVAR_DIRTY_info. Meanwhile, we have both
stg_TVAR_CLEAN_info and stg_TVAR_DIRTY_info, but they share the same TVAR
closure type!

The decisions here seem fairly arbitrary. But perhaps there is some method
to the madness, or perhaps someone prefers one approach over the others, in
which case I would like to hear it! And if not, well, at least I’ll know. :)
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Distinct closure types vs. known infotables for stack frames

2023-06-26 Thread Alexis King
Hello all,

I am tinkering with the RTS again while trying to fix #23513
, and every time I touch
the exceptions/continuations code, I find myself waffling about whether to
introduce more closure types. I’d like to get a second opinion so I can
stop changing my mind!

Currently, we have distinct closure types for certain special stack frames,
like CATCH_FRAME, ATOMICALLY_FRAME, and UNDERFLOW_FRAME. However, there are
other special stack frames that *don’t* get their own closure types: there
is no MASK_ASYNC_EXCEPTIONS_FRAME or PROMPT_FRAME. Instead, when code needs
to recognize these frames on the stack, it just looks for a known infotable
pointer. That is, instead of writing

if (frame->header.info->i.type == PROMPT_FRAME) { ... }

we write

if (*frame == _prompt_frame_info) { ... }

which works out because there’s only one info table that’s used for all
prompt frames.

There are a handful of stack frames that are recognized in this way by some
part of the RTS, but the criteria used to determine which frames get their
own types and which don’t is not particularly clear. For some frames, like
UPDATE_FRAME, the closure type is necessary because it is shared between
several infotables. But other types, like CATCH_FRAME and UNDERFLOW_FRAME,
are only ever used by precisely one infotable.

I think one can make the following arguments for/against using separate
closure types for these stack frames:

   -

   Pro: It’s helpful to have separate types for particularly special frames
   like UNDERFLOW_FRAME because it makes it easier to remember which
   special cases to handle when walking the stack.
   -

   Pro: Branching on stack frame closure types using switch is easier to
   read than comparing infotable pointers.
   -

   Con: Adding more closure types unnecessarily pollutes code that branches
   on closure types, like the garbage collector.
   -

   Con: Using special closure types for these frames might make it seem
   like they have some special layout when in fact they are just ordinary
   stack frames.

Does anyone have any opinions about this? I’m personally okay with the
status quo, but the inconsistency makes me constantly second-guess whether
someone else might feel strongly that I ought to be doing things the other
way!

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


The GHC(i)/RTS linker and Template Haskell

2022-05-31 Thread Alexis King
Hi all,

I’ve recently been trying to better understand how and where time is spent
at compile-time when running Template Haskell splices, and one of the areas
I’ve been struggling to figure out is the operation of the linker. From
reading the source code, here’s a summary of what I think I’ve figured out
so far:

   - TH splices are executed using the GHCi interpreter, though it may be
   internal or external (if -fexternal-interpreter is used).

   - Regardless of which mode is used, TH splices need their dependencies
   loaded into the interpreter context before they can be run. This is handled
   by the call to loadDecls in hscCompileCoreExpr', which in turn calls
   loadDependencies in GHC.Linker.Loader.

   - loadDependencies loads packages and modules in different ways. Package
   dependencies are just loaded via the appropriate built shared libraries,
   but modules from the current package have to be loaded a different way, via
   loadObjects (also in GHC.Linker.Loader).

Here, however, is where I get a bit lost. GHC has two strategies for
loading individual objects, which it chooses between depending on whether
the current value of interpreterDynamic is True. But I don’t actually
understand what interpreterDynamic means! The Haddock comment just says
that it determines whether or not the “interpreter uses the Dynamic way”,
but I don’t see why that matters. My understanding was that GHCi *always*
requires dynamic linking, since it is, after all, loading code dynamically.
Under what circumstances would interpreterDynamic ever be False?

Furthermore, I don’t actually understand precisely how and why this
influences the choice of loading strategy. In the case that
interpreterDynamic is True, GHC appears to convert the desired dyn_o object
into a shared library by calling the system linker, then loads that, which
can be very slow but otherwise works. However, when interpreterDynamic is
False, it loads the object directly. Both paths eventually call into “the
RTS linker”, implemented in rts/Linker.c, to actually load the resulting
object.

I have found precious little information on what the RTS linker does, in
which contexts it’s used, or how precisely it works. Note
[runtime-linker-phases] at the top of Linker.c has some information, but
it’s mostly a high-level description of what the code actually does rather
than an explanation of its role in the bigger picture. Does anyone know of
any resources they could possibly point me to that help to explain how all
the pieces fit together here? I’ve spent quite a bit of time reading the
code, but I’m afraid I still haven’t managed to see the forest for the
trees.

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Seeking RTS experts to review delimited continuations MR

2022-04-14 Thread Alexis King
Hi all,

I have recently opened a draft MR with my initial implementation of
first-class delimited continuations in the RTS, available here:
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7942

The MR is not entirely finished—it still requires docs and tests, which I
am gradually working on. However, barring any bugs I discover after writing
those tests, I believe the implementation itself is feature-complete with
respect to the proposal. Given that it is a nontrivial patch to a somewhat
unloved portion of GHC, and given that there are a couple questions I have
about how best to go about testing certain interactions in the first place,
I figured I would reach out and see if anyone particularly familiar with
the guts of the RTS would be willing to volunteer some time to give it a
careful look.

The good news is that the patch is not actually very large, and it changes
very little existing code: the diffstats currently sit at +1,078 -48. For
those interested in taking a look at the patch, I recommend starting with
the Notes mentioned in the MR description. I suspect they may be a bit
sparse at the moment, so please do not hesitate to ask questions; I will do
my best to respond promptly.

Many thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Concrete#, SpecialPred, and CtEvidence invariants

2021-12-09 Thread Alexis King
On Wed, Dec 8, 2021 at 2:20 PM Richard Eisenberg  wrote:

> Does this help?
>

Yes, it does, and what you said makes sense.

Really, the intent of the invariant isn’t “zonkedness” *per se*, but merely
that the type of the evidence should be “at least as zonked” as ctev_pred
is. But that is a little vague and handwavy, so I’ll go the route you
suggest and just reword things a little to explain.

Thanks!
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Concrete#, SpecialPred, and CtEvidence invariants

2021-12-08 Thread Alexis King
Hi all,

After a recent off-list conversation, I’ve been exploring using the new
SpecialPred mechanism to implement a custom constraint, but I’ve run into
some questions regarding evidence for CSpecialCan constraints.
Specifically, I’m uncertain about how to use them while satisfying Note
[CtEvidence invariants] in GHC.Tc.Types.Constraint.

Specifically, the Note in question states that ctev_pred must always be
kept in sync with the type of the evidence. For example, if ctev_dest for a
wanted constraint is a coercion hole, ctev_pred must be `varType
(coHoleCoVar hole)`. However, this seems rather odd in the case of
Concrete# constraints, since the evidence for a `Concrete# (ty :: ki)`
constraint is a coercion of type `ty ~# alpha`.

In canNonDecomposableConcretePrim in GHC.Tc.Solver.Canonical,
setCtEvPredType is used to update the type of a CSpecialCan constraint, and
setCtEvPredType automatically maintains the CtEvidence invariant mentioned
above. But this results in setCoHoleType being used to modify the type of
the CoVar from `ty ~# alpha` to `Concrete# (ty :: ki)`, which does not make
much sense, since that is definitely *not* the type of the CoVar. As far as
I can tell, the only reason that doesn’t cause Core Lint errors is that the
modified CoVar is never actually used, only the original one is.

This suggests to me that the invariant is not actually the right one. The
Note suggests that the reason the invariant exists is that the type must be
kept fully-zonked… which in turn suggests that perhaps fully-zonkedness is
the invariant that’s actually desired. In that case, ctev_pred would *not*
necessarily always be a cache for the type of the evidence, since in the
case of Concrete#, the evidence has a different type (though for other
types of constraints the old invariant would still coincidentally hold).
Rather, it would be the responsibility of canNonDecomposableConcretePrim to
merely ensure the evidence’s type is appropriately zonked.

Does this all sound right to people? If so, I will update the Note with the
modified invariant and update the code to match.

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: GHC indecisive whether matching on GADT constructors in arrow notation is allowed

2021-10-05 Thread Alexis King
->
>  case ds_d2kW of { ($dShow_a2jl, x_a2hL) ->
>  break<0>() Main.Showable @ x_a2jb $dShow_a2jl x_a2hL
>  }
>  }))
>(returnA @ a_a2ja @ (Showable x_a2jb) $dArrow_a2jd))
>
> - Oleg
>
> On 5.10.2021 19.12, Richard Eisenberg wrote:
>
> I think the real difference is whether new type variables are brought into
> scope. It seems that GHC can't deal with a proc-pattern-match that
> introduces type variables, but it *can* deal with introduced constraints. I
> have no idea *why* this is the case, but it seems a plausible (if
> accidental) resting place for the implementation.
>
> Richard
>
> On Oct 3, 2021, at 12:19 PM, Alexis King  wrote:
>
> Hi,
>
> I’ve been working on bringing my reimplementation of arrow notation back
> up to date, and I’ve run into some confusion about the extent to which
> arrow notation is “supposed” to support matching on GADT constructors. Note
> [Arrows and patterns] in GHC.Tc.Gen.Pat suggests they aren’t supposed to
> be supported at all, which is what I would essentially expect. But issues
> #17423 <https://gitlab.haskell.org/ghc/ghc/-/issues/17423> and #18950
> <https://gitlab.haskell.org/ghc/ghc/-/issues/18950> provide examples of
> using GADT constructors in arrow notation, and there seems to be some
> expectation that in fact they *ought* to be supported, and some
> recently-added test cases verify that’s the case.
>
> But this is quite odd, because it means the arrows test suite now includes
> test cases that verify both that this is supported *and* that it isn’t…
> and all of them pass! Here’s my understanding of the status quo:
>
>-
>
>Matching on constructors that bind bona fide existential variables is
>not allowed, and this is verified by the arrowfail004 test case, which
>involves the following program:
>
>data T = forall a. T a
>
>panic :: (Arrow arrow) => arrow T T
>panic = proc (T x) -> do returnA -< T x
>
>This program is rejected with the following error message:
>
>arrowfail004.hs:12:15:
>Proc patterns cannot use existential or GADT data constructors
>In the pattern: T x
>
>-
>
>Despite the previous point, matching on constructors that bind
>evidence is allowed. This is enshrined in test cases T15175, T17423,
>and T18950, which match on constructors like these:
>
>data DecoType a where
>  DecoBool :: Maybe (String, String) -> Maybe (Int, Int) -> DecoType Bool
>data Point a where
>  Point :: RealFloat a => a -> Point a
>
>
> This seems rather contradictory to me. I don’t think there’s much of a
> meaningful distinction between these types of matches, as they create
> precisely the same set of challenges from the Core point of view… right?
> And even if I’m wrong, the error message in arrowfail004 seems rather
> misleading, since I would definitely call DecoBool and Point above “GADT
> data constructors”.
>
> So what’s the intended story here? Is matching on GADT constructors in
> arrow notation supposed to be allowed or not? (I suspect this is really
> just yet another case of “nobody really knows what’s ‘supposed’ to happen
> with arrow notation,” but I figured I might as well ask out of hopefulness
> that someone has some idea.)
>
> Thanks,
> Alexis
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>
>
> ___
> ghc-devs mailing 
> listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Arrow notation, existentials, and TcType demotion

2021-10-04 Thread Alexis King
Hello,

As I mentioned in a previous email
, GHC
seems currently somewhat uncertain about whether or not matching on GADTs
is permitted in arrow notation. Sam Derbyshire suggested on Twitter
 that we
probably do, ideally speaking, want to support them. Unfortunately, I am
confident that the existing implementation is *not* up to this task, as I
have now demonstrated in issues #20469
 and #20470
.

The latter of those two issues is particularly challenging to solve, as it
highlights the sorts of tricky interactions that can arise when arrow
notation is mixed with existential quantification. To give an example,
suppose we have the following datatypes:

data A where
  A :: a -> B a -> A
data B a where
  B :: B ()

And suppose we have the following proc expression:

proc a -> case a of
  A x B -> id -< x

The match on the A constructor introduces a locally-scoped skolem, and even
though the use of id appears on the RHS, it is *not* actually within the
match’s scope—rather, its scope is that of the overall proc expression.

This makes it tricky to maintain typechecker invariants, as introduced
metavariables must not accidentally leak into the outer scope via these
strangely-scoped expressions. For example, when typechecking the above
expression, we’ll increment tcl_tclevel before typechecking the id -< x
command, and we may introduce fresh metavariables while doing so. This
means we may end up with an expected type for id that looks like this:

a1[tau:1] -> a2[tau:1]

However, when we actually check id against that type, we must do so in a
context where tcl_tclevel is 0, not 1. This violates WantedInv from Note
[TcLevel invariants] in GHC.Tc.Utils.TcType. This means we need to do a
three-step process to properly check id’s type:

   1.

   Synthesize a new metavariable a3[tau:0].
   2.

   Emit [W] a3[tau:0] ~ (a1[tau:1] -> a2[tau:1]) within the arrow scope,
   i.e. where tcl_tclevel = 1.
   3.

   Check id against a3[tau:0].

Discerning readers may note that this looks *awfully* similar to the
process by which we promote a type via promoteTcType, as described in Note
[Promoting a type] in GHC.Tc.Utils.TcMType. However, in this case, we
aren’t really promoting a type, but *demoting* it—we ultimately want to
decrease its level, not increase it. However, it seems to me that the
process of doing this demotion is in fact handled perfectly fine by
promoteTcType. So my question is twofold:

   1.

   Is my reasoning about what to do here correct?
   2.

   Is there any harm in using promoteTcType to do this demotion?

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


GHC indecisive whether matching on GADT constructors in arrow notation is allowed

2021-10-03 Thread Alexis King
Hi,

I’ve been working on bringing my reimplementation of arrow notation back up
to date, and I’ve run into some confusion about the extent to which arrow
notation is “supposed” to support matching on GADT constructors. Note
[Arrows and patterns] in GHC.Tc.Gen.Pat suggests they aren’t supposed to be
supported at all, which is what I would essentially expect. But issues
#17423  and #18950
 provide examples of
using GADT constructors in arrow notation, and there seems to be some
expectation that in fact they *ought* to be supported, and some
recently-added test cases verify that’s the case.

But this is quite odd, because it means the arrows test suite now includes
test cases that verify both that this is supported *and* that it isn’t… and
all of them pass! Here’s my understanding of the status quo:

   -

   Matching on constructors that bind bona fide existential variables is
   not allowed, and this is verified by the arrowfail004 test case, which
   involves the following program:

   data T = forall a. T a

   panic :: (Arrow arrow) => arrow T T
   panic = proc (T x) -> do returnA -< T x

   This program is rejected with the following error message:

   arrowfail004.hs:12:15:
   Proc patterns cannot use existential or GADT data constructors
   In the pattern: T x

   -

   Despite the previous point, matching on constructors that bind evidence
   is allowed. This is enshrined in test cases T15175, T17423, and T18950,
   which match on constructors like these:

   data DecoType a where
 DecoBool :: Maybe (String, String) -> Maybe (Int, Int) -> DecoType Bool
   data Point a where
 Point :: RealFloat a => a -> Point a


This seems rather contradictory to me. I don’t think there’s much of a
meaningful distinction between these types of matches, as they create
precisely the same set of challenges from the Core point of view… right?
And even if I’m wrong, the error message in arrowfail004 seems rather
misleading, since I would definitely call DecoBool and Point above “GADT
data constructors”.

So what’s the intended story here? Is matching on GADT constructors in
arrow notation supposed to be allowed or not? (I suspect this is really
just yet another case of “nobody really knows what’s ‘supposed’ to happen
with arrow notation,” but I figured I might as well ask out of hopefulness
that someone has some idea.)

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-05 Thread Alexis King

On 4/5/21 2:36 PM, Ian Lynagh wrote:
It was originally designed by John Meacham: 
https://gitlab.haskell.org/haskell/prime/-/wikis/alternative-layout-rule 
https://www.mail-archive.com/haskell-prime@haskell.org/msg01938.html


Thanks, Ian—I had stumbled across a link to the old Haskell Prime trac 
wiki while I was searching for information, but I didn’t realize where 
it had been migrated to.


It isn't exactly equivalent to the Haskell layout rule, but it's 
fairly close and much simpler (due to not having the "on a parse 
error" case).


Yes, I gathered as much from the implementation. The idea makes sense, 
but of course it doesn’t provide much benefit to have a simpler 
implementation unless it actually /replaces/ the “on parse error” approach.


Given this appears to be a long-defunct proposal, a natural followup 
question is to ask whether there’s any reason this code is still in GHC. 
Is it used for any purpose, or could it be removed?


Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-04 Thread Alexis King

On 4/4/21 1:52 PM, Iavor Diatchki wrote:

Hi Alexis,

I wasn't sure what the "alternative layout" is either and did some 
googling, and it appears that it is something that was never really 
documented properly.   The following link contains pointers to the 
commit that introduced it (in 2009!) (not the main ticket but some of 
the comments)


Thanks, that’s a helpful pointer, though of course it still doesn’t 
explain very much. I’m still interested in understanding what the 
purpose of “alternative layout” is and how it operates, if anyone else 
has any idea.


Overall, I do think that Haskell's layout rule is more complicated 
than it needs to be, and this is mostly because of the rule that 
requires the insertion of a "virtual close curly" on a parse error.


Yes, this does seem to be by far the trickiest bit. But I’d be sad not 
to have it, as without it, even simple things like


   let x = 3 in e

would not be grammatically valid.

My feeling is that it'd be pretty tricky to do layout in the parser 
with grammar rules, but you may be able to do something with the 
parser state.


Yes, I have some vague ideas, but none of them are particularly fleshed 
out. It’s entirely possible that I just don’t understand the 
relationship between the lexer and the parser (which seems somewhat 
obscured by the “alternative layout” stuff), and the ideas I have are 
what’s already implemented today. I’ll have to study the implementation 
more closely.


In any case, thank you for your response! The ALR-related pointer 
certainly clarifies at least a little.


Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


How does GHC implement layout?

2021-04-03 Thread Alexis King

Hi all,

I’m wondering if there are any resources that discuss the design of 
GHC’s implementation of layout. (I haven’t been able to find any.) From 
looking at the code, here’s what I’ve gathered so far:


 * Layout is implemented in the lexer (compiler/GHC/Parser/Lexer.x).

 * The implementation is similar in some respects to the approach
   described in the Haskell Report, but still fairly different. Virtual
   braces and semicolons are inserted during the lexing process itself
   with the assistance of Alex lexer states (aka “start codes”).

 * In order to handle particularly tricky cases like

if e then do x; y else z


   where the virtual close brace must be inserted in the middle of a
   line, tokens such as in and else are given special context-sensitive
   treatment. This appears to be quite subtle.

Overall, I can mostly follow the code, but I still have a few unanswered 
questions:


 * The layout-related code consistently uses the phrase “alternative
   layout rule”—what does “alternative” mean here? Does it refer to
   GHC’s implementation of layout? Or maybe it refers to
   NondecreasingIndentation? It isn’t clear.

 * The implementation of layout seems quite complex, in large part
   because it has to worry about parsing concerns in the lexer in order
   to handle tricky cases like the one I provided above. Is there are
   reason all this is done in the lexer, rather than deferring some
   more of the work to the parser?

I’ve found remarkably little information about implementing layout in 
general, so perhaps I’m missing some resources or keywords to search 
for, but any information or perspectives would be appreciated!


Thanks,
Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Type inference of singular matches on GADTs

2021-03-30 Thread Alexis King

On 3/28/21 9:17 PM, Richard Eisenberg wrote:



I think this is the key part of Alexis's plea: that the type checker 
take into account exhaustivity in choosing how to proceed.


[…]

Does this match your understanding?


Yes, precisely. :) Without GADTs, exhaustivity doesn’t yield any useful 
information to the typechecker, but with them, it can.


I agree with Simon that it seems tricky—his examples are good ones—and I 
agree with Richard that I don’t know if this is actually a good or 
fruitful idea. I’m certainly not demanding anyone else produce a 
solution! But I was curious if anyone had explored this before, and it 
sounds like perhaps the answer is “no.” Fair enough! I still appreciate 
the discussion.


Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Type inference of singular matches on GADTs

2021-03-20 Thread Alexis King

Hi all,

Today I was writing some code that uses a GADT to represent 
heterogeneous lists:


   data HList as where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)

This type is used to provide a generic way to manipulate n-ary 
functions. Naturally, I have some functions that accept these n-ary 
functions as arguments, which have types like this:


   foo :: Blah as => (HList as -> Widget) -> Whatsit

The idea is that Blah does some type-level induction on as and supplies 
the function with some appropriate values. Correspondingly, my use sites 
look something like this:


   bar = foo (\HNil -> ...)

Much to my dismay, I quickly discovered that GHC finds these expressions 
quite unfashionable, and it invariably insults them:


   • Ambiguous type variable ‘as0’ arising from a use of ‘foo’
  prevents the constraint ‘(Blah as0)’ from being solved.

The miscommunication is simple enough. I expected that when given an 
expression like


   \HNil -> ...

GHC would see a single pattern of type HList '[] and consequently infer 
a type like


   HList '[] -> ...

Alas, it was not to be. It seems GHC is reluctant to commit to the 
choice of '[] for as, lest perhaps I add another case to my function in 
the future. Indeed, if I were to do that, the choice of '[] would be 
premature, as as ~ '[] would only be available within one branch. 
However, I do not in fact have any such intention, which makes me 
quietly wish GHC would get over its anxiety and learn to be a bit more 
of a risk-taker.


I ended up taking a look at the OutsideIn(X) paper, hoping to find some 
commentary on this situation, but in spite of the nice examples toward 
the start about the trickiness of GADTs, I found no discussion of this 
specific scenario: a function with exactly one branch and an utterly 
unambiguous pattern. Most examples come at the problem from precisely 
the opposite direction, trying to tease out a principle type from a 
collection of branches. The case of a function (or perhaps more 
accurately, a case expression) with only a single branch does not seem 
to be given any special attention.


Of course, fewer special cases is always nice. I have great sympathy for 
generality. Still, I can’t help but feel a little unsatisfied here. 
Theoretically, there is no reason GHC cannot treat


   \(a `HCons` b `HCons` c `HCons` HNil) -> ...

and

   \a b c -> ...

almost identically, with a well-defined principle type and pleasant type 
inference properties, but there is no way for me to communicate this to 
the typechecker! So, my questions:


1. Have people considered this problem before? Is it discussed anywhere
   already?
2. Is my desire here reasonable, or is there some deep philosophical
   argument for why my program should be rejected?
3. If it /is/ reasonable, are there any obvious situations where a
   change targeted at what I’m describing (vague as that is) would
   affect programs negatively, not positively?

I realize this gets rather at the heart of the typechecker, so I don’t 
intend to imply a change of this sort should be made frivolously. 
Indeed, I’m not even particularly attached to the idea that a change 
must be made! But I do want to understand the tradeoffs better, so any 
insight would be much appreciated.


Thanks,
Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Nested constructed product results?

2020-12-14 Thread Alexis King
Hi all,

I spent some time today looking into the performance of a program
involving a parser type that looks something like this:

data AnnotatedParser a = AnnotatedParser
  { annotation :: Annotation
  , parser :: String -> (Maybe a, String)
  }

The `Annotation` records metadata about the structure of an
`AnnotatedParser` that can be accessed statically (that is, without
having to run the parser on some input). `AnnotatedParser`s are built
from various primitive constructors and composed using various
combinators. These combinators end up looking something like this:

(<+>) :: AnnotatedParser a -> AnnotatedParser b -> AnnotatedParser (a, b)
AnnotatedParser ann1 f <+> AnnotatedParser ann2 g = AnnotatedParser
  { annotation = Seq ann1 ann2
  , parser = \s1 ->
  let !(a, s2) = f s1
  !(b, s3) = g s2
  in ((,) <$> a <*> b, s3)
  }

Use of these combinators leads to the construction and subsequent case
analysis of numerous `AnnotatedParser` closures. Happily, constructed
product result[1] analysis kicks in and rewrites such combinators to cut
down on the needless boxing, leading to worker/wrapper splits like this:

$w<+> :: Annotation
  -> (String -> (Maybe a, String))
  -> Annotation
  -> (String -> (Maybe b, String))
  -> (# Annotation, String -> (Maybe (a, b), String) #)
$w<+> ann1 f ann2 g =
  (# Seq ann1 ann2
   , \s1 -> let !(a, s2) = f s1
!(b, s3) = g s2
in ((,) <$> a <*> b, s3) #)

<+> :: AnnotatedParser a -> AnnotatedParser b -> AnnotatedParser (a, b)
<+> (AnnotatedParser ann1 f) (AnnotatedParser ann2 g) =
  case $w<+> ann1 f ann2 g of
(# a, b #) -> AnnotatedParser a b
{-# INLINE <+> #-}

This is great, and it cuts down on allocation significantly, but there
is still something unsatisfying about it: the `parser` function inside
the record is not affected by CPR! This is a shame, because
essentially all use sites immediately deconstruct the pair, making it
a prime candidate for unboxing. Ideally, we’d like to get this, instead:

$w<+> :: Annotation
  -> (String -> (Maybe a, String))
  -> Annotation
  -> (String -> (Maybe b, String))
  -> (# Annotation, String -> (# Maybe (a, b), String #) #)
$w<+> ann1 f ann2 g =
  (# Seq ann1 ann2
   , \s1 -> let !(a, s2) = f s1
!(b, s3) = g s2
in (# (,) <$> a <*> b, s3 #) #)

In practice, little combinators like `$w<+>` are marked INLINE, so `f`
and `g` are usually known rather than unknown calls. This nested CPR
transformation would allow the tuple construction to fuse with the
tuple deconstruction, eliminating quite a lot of unnecessary
boxing/unboxing.

Unfortunately, it seems as though GHC’s implementation of CPR is
entirely first-order: although function arguments are given rich
demand signatures, results are only described one level deep. But as
the above example hopefully illustrates, that’s leaving significant
optimization opportunities on the table! Hence, my questions:

1. Has this notion of “nested CPR” been explored at all before?
2. Does such an extension to CPR sound worth its weight?

I peeked a little at GHC.Types.Cpr and GHC.Core.Opt.CprAnal, and it
seems quite manageable to me, but I haven’t actually looked into an
implementation attempt just yet. I’m mostly interested in whether
others have thought about something like this and/or run into similar
issues in the past, or if this is really an unusual construction.

Thanks,
Alexis

[1]: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/cpr.pdf

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Async exceptions and delimited continuations

2020-07-01 Thread Alexis King
Hi all,

As some of you are likely aware, I have an open GHC proposal[1] to add
native support for delimited continuations to the RTS. I also have an
incomplete implementation,[2] and the only major remaining obstacle
concerns async exceptions. The issue is subtle, so I’ve tried to
provide all the necessary context in this email. If you’re already
familiar with the ideas at play, you can skip the context about how
delimited continuations work.

For those unfamiliar, delimited continuations allow capturing slices
of the call stack and restoring them later. For example, the program

do y <- prompt $ do x <- control0 $ \k -> k (pure 10)
pure (x + 5)
   print y

will print 15. To understand what’s happening operationally, we can
imagine an abstract call stack made up of continuation frames:

┌──┐
│  ● + 5   │redex: control0 $ \k -> k (pure 10)
├──┤
│ prompt ● │
├──┤
│ print ●  │
├──┤
│   ...│
├──┤

Here, each ● represents the “hole” where the evaluated result of the
redex will be returned. `control0` moves all the frames between the
top of the stack and the first `prompt` into the heap and returns a
reference to them, so after a single reduction step, we have

┌──┐
│ print ●  │redex: k1 (pure 10)
├──┤heap:  k1 = ┌──┐
│   ...││  ● + 5   │
├──┤└──┘

When a continuation is applied, its stored stack frames are copied
back onto the top of the current stack, and the argument becomes the
new redex:

┌──┐
│  ● + 5   │redex: pure 10
├──┤
│ print ●  │
├──┤
│   ...│
├──┤

Now it should hopefully be clear how we end up printing 15.

With that context, consider the following expression:

prompt $ mask_ $ do x <- control0 $ \k -> k (pure 10)
f x

The call stack at the point of control0 looks very similar in this
program, but now we have a use of `mask_` in there as well:

┌──┐
│   f ●│redex: control0 $ \k -> k (pure 10)
├──┤exns:  masked
│ mask_ ●  │
├──┤
│ prompt ● │
├──┤
│   ...│
├──┤

When capturing the continuation, we’ll unwind the stack the same way
we did before. Because we’re popping mask_ off the stack, we’ll unmask
async exceptions:

┌──┐redex: k1 (pure 10)
│   ...│exns:  not masked
├──┤heap:  k1 = ┌──┐
│   f ●│
├──┤
│ mask_ ●  │
└──┘

Now when we apply `k1`, we’ll copy the `mask_` frame back onto the
stack, and we must re-mask async exceptions. Otherwise, exceptions
will not be masked during the call to `f`, which would be wrong.

Why is this a problem? The RTS applies an optimization: if you call
mask_ (actually maskAsyncExceptions#) while exceptions are already
masked, it doesn’t push a new stack frame at all. So, for example, if
you write

mask_ $ mask_ $ foo bar

you’ll end up with only one mask_ frame on the call stack, not two.
This tiny optimization actually allows not one but two savings:

1. The immediate benefit is that we save a stack frame.

2. The hidden benefit is that we never need to push the old
   exception masking state onto the stack.

   If we had multiple mask_ frames on the stack simultaneously, we
   wouldn’t know what to do when returning: should we unmask them,
   or should they stay masked? We’d need to push that information
   onto the stack in the same way we must push a return address
   when calling a function.

   By skipping these redundant stack frames, we can always be
   certain the right thing to do on return is to unmask async
   exceptions. No need to store anything else.

(This explanation is a slight simplification, especially once
maskUninterruptible comes into play, but it’s close enough.)

Now you may see the looming problem: this strategy completely breaks
down in the presence of delimited continuations. With delimited
continuations, we might have a program like this:

mask_ $ prompt $ mask_ $ ...

If we capture a continuation up to this prompt, we expect the inner
mask_ frame to be captured along with it. But that won’t happen if we
never pushed a mask_ frame at all due to the aforementioned
optimization.

So now I can finally state my question: what is the right solution for
this? I see three obvious possible ways forward:

1. Keep track of whether or not we’re inside a prompt and skip the
   optimization in that case.

2. Keep some bookkeeping that tracks the modifications to the
   async exception masking state since the most recently pushed
   prompt.

3. Just don’t bother with the 

Re: Introducing GHC whole program compiler (GHC-WPC)

2020-06-13 Thread Alexis King
Hi Csaba,

I originally posted this comment on /r/haskell 

 before I saw you also sent this to ghc-devs. I’ve decided to reproduce my 
comment here as well, since this list probably has a more relevant audience:

> I want to start by saying that I think this sounds totally awesome, and I 
> think it’s a fantastic idea. I’m really interested in seeing how this 
> progresses!
> 
> I do wonder if people might find the name a little misleading. “Whole program 
> compilation” usually implies “whole program optimization,” but most of GHC’s 
> key optimizations happen at the Core level, before STG is even generated. (Of 
> course, I’m sure you’re well aware of that, I’m just stating it for the sake 
> of others who might be reading who aren’t aware.)
> 
> This seems much closer in spirit to “link-time optimization” (LTO) as 
> performed by Clang and GCC than whole program compilation. For example, 
> Clang’s LTO works by “linking” LLVM bitcode files instead of fully-compiled 
> native objects. STG is not quite analogous to LLVM IR—GHC’s analog would be 
> Cmm, not STG—but I think that difference is not that significant here: the 
> STG-to-Cmm pass is quite mechanical, and STG is mostly just easier to 
> manipulate.
> 
> tl;dr: Have you considered naming this project GHC-LTO instead of GHC-WPC?

Alexis

> On Jun 12, 2020, at 16:16, Csaba Hruska  wrote:
> 
> Hello,
> 
> I've created a whole program compilation pipeline for GHC via STG.
> Please read my blog post for the details:
> Introducing GHC whole program compiler (GHC-WPC) 
> 
> 
> Regards,
> Csaba Hruska
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Expanding a particular type family in type errors

2020-05-05 Thread Alexis King
> On May 5, 2020, at 04:00, Richard Eisenberg  wrote:
> 
> reportWanteds reports errors, yes, but it also fills in deferred type errors. 
> [snip] It's a bit awkward that a function called reportWanteds does this, but 
> it actually makes good sense, because reportWanteds knows the text that 
> should be in the runtime error call.

Yes, that does make sense; thank you for the explanation.

> You say that they're not, in general, fully known. But are they often known 
> in practice? My suggestion here isn't to remove the type families entirely, 
> but maybe to short-circuit around them. If that's often possible, it takes 
> some pressure off fixing the pretty-printing.

You’re right that this might be feasible, but I think it would be awkward, and 
it probably would break down just often enough to be an unsatisfying solution.

> I wonder if we should have a preprocessTypeForPrinting :: DynFlags -> Type -> 
> Type function. This could be run *before* the conversion to IfaceType. It 
> could handle both your new idea and the existing idea for 
> -f(no-)print-explicit-runtime-reps. The supplied DynFlags could be gotten 
> from sdocWithDynFlags, I think. I bet that once we have this facility 
> working, we'll find more uses for it.

If you think this is a reasonable idea, I’m happy to give it a shot—it 
definitely seems like the nicest way to do it to me. I’ll try it out and report 
back if I run into any trouble. Thanks again!

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Expanding a particular type family in type errors

2020-05-04 Thread Alexis King
> On May 4, 2020, at 04:25, Richard Eisenberg  wrote:
> 
> 1. pprType seems a very challenging place to do this, given that pprType 
> works on IfaceTypes, which are not amenable to much of any analysis. Maybe 
> you could do the switcho-changeo before converting to IfaceType. Are your 
> type families closed? Then it's just barely conceivable. If they're open, you 
> won't have the instances available in pprType. Still, this seems like far too 
> much processing to be done in pretty-printing.

You are right—I discovered this after my previous email. But they are more than 
closed: they are BuiltInSynFams, so they can be expanded easily by 
isBuiltInSynFamTyCon_maybe + sfMatchFam.

> 2. tidyType really cannot change one type to another unequal one. Chaos will 
> ensue.

Agreed.

> 3. reportWanteds will have a similar problem: if we have -fdefer-type-errors, 
> then reportWanteds may actually happen before desugaring and execution, and 
> we can't have a type magically changed in the process. Doing so will lead to 
> a core-lint error.

I don’t totally understand this point—I thought reportWanteds just printed 
things! It sounds like you’re saying it does something more, but I don’t 
immediately see what.

> 4-5. These seem the post plausible. Yes, it's true that calls to 
> ArrowStackTup might leak through, but doing so wouldn't be strictly *wrong*, 
> just suboptimal. So it might be that a best-effort here is good enough.

I think 4 makes sense for situations where the type family is stuck, since in 
that case we can’t just expand it away, and there is a more useful error 
message we can report. Maybe 5 makes more sense for the other cases, but I 
started down that path and it’s quite awkward.

> Another possibility: when these ArrowStackTup calls are born, do you already 
> sometimes know the arguments? That is, there must be some code that produces 
> the ArrowStackTup '[a] call -- could it just special-case an '[a] argument 
> and produce `a` instead? Then, the ArrowStackTup family is used only when you 
> have an abstract argument, which might be considerably rarer.

I’m afraid not. The entire reason I have the type families in the first place 
is that the arguments are not, in general, fully known when the equality 
constraint is emitted. If that weren’t true, I could drop them completely. (The 
proposal discusses that in a little more detail, as do some Notes in my WIP MR.)

It’s all a bit frustrating, because it’s really not like printing things this 
way is “cheating” or somehow morally wrong—these types are nominally equal, so 
from the user’s point of view they are completely interchangeable. It only 
becomes sketchy from the Core point of view, and I’m totally fine to not do any 
of this magic there!

Really, I think this morally belongs in the printer. This is just a display 
thing, nothing more. I want to be able to say “these type families are not 
something the user is likely to understand, so when printing error messages, 
expand them if at all possible.” But it’s very difficult to do that currently 
given the way pprType goes through IfaceType, since you’ve lost the necessary 
information by that point.

Alexis___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Expanding a particular type family in type errors

2020-05-01 Thread Alexis King
Answering my own question: it looks like cleaning this up in GHC.Tc.Errors is 
rather impractical—there are just too many types buried inside other datatypes 
that would have to be updated—so it has to be wired into pprType. Fortunately, 
I can branch on userStyle to decide whether or not to do the expansion. This 
still seems like a bit of a hack, but I can’t think of a better way. Do tell me 
if there’s a better approach!

> On May 1, 2020, at 16:36, Alexis King  wrote:
> 
> Hi all,
> 
> As part of my ongoing rework of arrow notation 
> <https://gitlab.haskell.org/ghc/ghc/-/merge_requests/3191>, I have introduced 
> two wired-in type families, which assist in the typechecking process. The 
> details are not terribly important (you can read the proposal if you’d like 
> for those), but the relevant detail is that equalities of the shape
> 
> ArrowStackTup a ~ b
> 
> are produced by the typechecker in key places. ArrowStackTup is one of my 
> wired-in type families, and it’s really an implementation detail. 
> Unfortunately, I often end up with type errors that leak this detail, with 
> expected/actual types like these ones:
> 
> Expected type: ArrowStackTup '[Int] -> Bool
>   Actual type: Int -> String
> 
> This is quite annoying, as it’s exceedingly rare that these type families 
> actually get stuck, so they can almost always be expanded in type errors. As 
> it happens, `ArrowStackTup '[a]` expands to simply `a`, so the type error I 
> would like to report is this one:
> 
> Expected type: Int -> Bool
>   Actual type: Int -> String
> 
> Not technically challenging, but I find myself faced with the question of 
> where this special expansion logic ought to go. It seems like it could go in 
> any of several places:
> 
> It could be hard-wired into pprType, and perhaps selectively disabled with an 
> -fprint-* flag. This is nice in that it’s universal, but it’s almost 
> certainly a step too far: the casts for ArrowStackTup still end up in Core, 
> and expanding the type synonyms there would be quite confusing.
> 
> The expansion could happen in tidyType, since it’s called before reporting an 
> error. But this seems probably even worse than putting it in pprType, since 
> it’s still used in too many places, and it isn’t supposed to actually change 
> the type.
> 
> It could be handled as an extra, ad-hoc preprocessing step in reportWanteds. 
> This is much closer to reasonable, though it feels like quite a hack.
> 
> A separate error Reporter could catch these errors before the other reporters 
> do and perform the expansion there. But I don’t think this actually makes 
> sense, as the above example demonstrates that ArrowStackTup might be buried 
> inside another type and in fact might not actually be the source of the type 
> error at all!
> 
> It could be done last-minute in mkEqErr. But I don’t know if this is too 
> late, and ArrowStackTup could leak into an error through some other code path.
> 
> Of those options, the best one I’ve come up with seems to be option 3, an 
> ad-hoc preprocessing step in reportWanteds. Does that seem reasonable? Or is 
> it too much of a kludge?
> 
> Thanks,
> Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Expanding a particular type family in type errors

2020-05-01 Thread Alexis King
Hi all,

As part of my ongoing rework of arrow notation 
, I have introduced 
two wired-in type families, which assist in the typechecking process. The 
details are not terribly important (you can read the proposal if you’d like for 
those), but the relevant detail is that equalities of the shape

ArrowStackTup a ~ b

are produced by the typechecker in key places. ArrowStackTup is one of my 
wired-in type families, and it’s really an implementation detail. 
Unfortunately, I often end up with type errors that leak this detail, with 
expected/actual types like these ones:

Expected type: ArrowStackTup '[Int] -> Bool
  Actual type: Int -> String

This is quite annoying, as it’s exceedingly rare that these type families 
actually get stuck, so they can almost always be expanded in type errors. As it 
happens, `ArrowStackTup '[a]` expands to simply `a`, so the type error I would 
like to report is this one:

Expected type: Int -> Bool
  Actual type: Int -> String

Not technically challenging, but I find myself faced with the question of where 
this special expansion logic ought to go. It seems like it could go in any of 
several places:

It could be hard-wired into pprType, and perhaps selectively disabled with an 
-fprint-* flag. This is nice in that it’s universal, but it’s almost certainly 
a step too far: the casts for ArrowStackTup still end up in Core, and expanding 
the type synonyms there would be quite confusing.

The expansion could happen in tidyType, since it’s called before reporting an 
error. But this seems probably even worse than putting it in pprType, since 
it’s still used in too many places, and it isn’t supposed to actually change 
the type.

It could be handled as an extra, ad-hoc preprocessing step in reportWanteds. 
This is much closer to reasonable, though it feels like quite a hack.

A separate error Reporter could catch these errors before the other reporters 
do and perform the expansion there. But I don’t think this actually makes 
sense, as the above example demonstrates that ArrowStackTup might be buried 
inside another type and in fact might not actually be the source of the type 
error at all!

It could be done last-minute in mkEqErr. But I don’t know if this is too late, 
and ArrowStackTup could leak into an error through some other code path.

Of those options, the best one I’ve come up with seems to be option 3, an 
ad-hoc preprocessing step in reportWanteds. Does that seem reasonable? Or is it 
too much of a kludge?

Thanks,
Alexis___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Why doesn’t the simple optimizer fuse casts?

2020-04-27 Thread Alexis King
Hi Richard,

Nested casts can absolutely happen after the simple optimizer runs.
Suppose we produce a desugared expression like:

let { y = x `cast` co1 } in y `cast` co2

The simple optimizer will inline y, since it only appears once, so now
we have a nested cast. Perhaps you mean you expect the casts to be
merged when the simple optimizer gets to the Cast expression itself,
but they aren’t, as it doesn’t use mkCast to reconstruct the result:

-- from GHC.Core.SimpleOpt.simple_opt_expr
go (Cast e co)  | isReflCo co' = go e
| otherwise= Cast (go e) co'

I suppose a really simple fix would be to just use mkCast here instead
of Cast, but that wouldn’t be completely satisfying, since the merged
coercion wouldn’t be optimized. So you’d have to do something slightly
more complicated to detect if the result of `go e` was a cast
expression and combine the coercions before calling optCoercion.

Whether or not doing that would be a good idea is precisely what I’m
asking about. :)

Alexis

> On Apr 27, 2020, at 16:22, Richard Eisenberg  wrote:
> 
> Hi Alexis,
> 
> Nested casts shouldn't happen. The mkCast function gets rid of them.
> Someone somewhere is forgetting to call it. If you have a concrete
> program that leads to nested casts, post a bug report. :)
> 
> Thanks!
> Richard
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Why doesn’t the simple optimizer fuse casts?

2020-04-27 Thread Alexis King
This question is spurred by curiosity more than anything else, but
I’ve noticed that the simple optimizer doesn’t fuse nested casts, and
I’m wondering if there’s any reason it couldn’t. To make what I’m
talking about more concrete, suppose we have an expression like this:

(x |> co) |> sym co

It seems like it would be trivial for simpleOptExpr to fuse the nested
casts to get

x |> co; sym co

and then the coercion optimizer could get rid of it entirely.
Moreover, this seems within the spirit of the simple optimizer, since
it’s really just “cleaning up” an expression. Is there any reason the
simple optimizer doesn’t do this, or is it just something nobody
implemented?

(For context, I’ve recently been staring at a lot of -ddump-ds output,
and there happen to be a bunch of nested casts in the result that are
really just noise. It would be nice if the simple optimizer got rid of
them for me.)

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Desugaring matches with already-desugared RHSs?

2020-04-25 Thread Alexis King
Hi all,

I’m currently in the process of rewriting most of the arrow desugaring
code. One of the most awkward parts of the current implementation is
the way case commands are desugared. Given a case command like

case e1 of
  A a b   -> cmd1
  B c -> cmd2
  C d e f -> cmd3

the desugarer actually replaces each command on the RHS with an
Either-wrapped tuple to get something like this:

arr (\env -> case e1 of
  A a b   -> Left (Left (a, b))
  B c -> Left (Right c)
  C d e f -> Right (d, e, f))
>>> ((cmd1 ||| cmd2) ||| cmd3)

This means the RHSs of the case expression are really already
desugared, and ideally they would be CoreExprs, but matchWrapper
expects the RHSs to be HsExprs. The current implementation
accommodates this restriction by building fake HsExprs with no
location information, but this means the logic for building the tuples
in the RHSs has to be duplicated (since other places do want
CoreExprs).

I was thinking it would be nice to avoid this hack, but I’m not sure
what the best way to do it is. One way would be to create a variant of
matchWrapper with a type like

matchWrapper'
  :: HsMatchContext GhcRn
  -> Maybe (LHsExpr GhcTc)
  -> MatchGroup GhcTc rhs
  -> (rhs -> DsM CoreExpr)  -- how to desugar the RHSs
  -> DsM ([Id], CoreExpr)

and update dsGRHSs to accept an extra argument as well. Then the arrow
desugaring code could just pass `return` to matchWrapper' so it
wouldn’t touch its RHSs. But I’m not sure if this approach makes sense
— nothing else in the desugarer seems to work this way. Is there a
better approach I’m not seeing?

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fusing loops by specializing on functions with SpecConstr?

2020-04-03 Thread Alexis King

I fiddled with alternative representations for a while and didn’t make
any progress—it was too easy to end up with code explosion in the
presence of any unknown calls—but I seem to have found a RULES-based
approach that works very well on the examples I’ve tried. It’s quite
simple, which makes it especially appealing!

I started by defining a wrapper around the `SF` constructor to attach
rules to:

mkSF :: (a -> s -> Step s b) -> s -> SF a b
mkSF = SF
{-# INLINE CONLIKE [1] mkSF #-}

I  then changed the definitions of (.), (***), (&&&), (+++), and (&&&)
to use `mkSF` instead of `SF`, but I left the other methods alone, so
they just use `SF` directly. Then I defined two rewrite rules:

{-# RULES
"mkSF @((), _)" forall f s. mkSF f ((), s) =
  SF (\a s1 -> case f a ((), s1) of Step ((), s2) b -> Step s2 b) s
"mkSF @(_, ())" forall f s. mkSF f (s, ()) =
  SF (\a s1 -> case f a (s1, ()) of Step (s2, ()) b -> Step s2 b) s
#-}

That’s it. These two rules alone are enough to eliminate the redundant
tupling. Now the optimized version of `mapMaybeSF` is beautiful!

mapMaybeSF = \ @ a @ b f -> case f of { SF @ s f2 s2 ->
  SF (\ a1 s1 -> case a1 of {
   Nothing -> case s1 of dt { __DEFAULT -> Step dt Nothing }
   Just x -> case f2 x s1 of {
 Step s2' c1 -> Step s2' (Just c1) }})
 s2 }

So unless this breaks down in some larger situation I’m not aware of, I
think this solves my problem without the need for any fancy SpecConstr
shenanigans. Many thanks to you, Sebastian, for pointing me in the right
direction!

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fusing loops by specializing on functions with SpecConstr?

2020-04-01 Thread Alexis King
> On Apr 1, 2020, at 03:21, Sebastian Graf  wrote:
> 
> That is indeed true. But note that as long as you manage to inline 
> `mapMaybeSF`, the final `runSF` will only allocate once on the "edge" of each 
> iteration, all intermediate allocations will have been fused away. But the 
> allocation of these non-sense records seems unfortunate.

Yes, that is technically true, but note that even if we inline mapMaybeSF, 
those nonsense records don’t go away, they just bubble up to the “fringe” of 
the enclosing computation. And consider how tiny mapMaybeSF is: I shudder to 
think how enormous that “fringe” would be for a large program written in SF!

(And of course, nothing prevents the runSF itself from appearing in a 
loop—quite probable, in fact, given its use in the hypothetical `lazy` 
combinator.)

> So this already seems quite brittle. Maybe a very targeted optimisation that 
> gets rid of the boring ((), _) wrappers could be worthwhile, given that a 
> potential caller is never able to construct such a thing themselves. But that 
> very much hinges on being able to prove that in fact every such ((), _) 
> constructed in the function itself terminates.

Yes, that is a good point. I concede that seems much less tractable than I had 
initially hoped.

Still, as you suggest, it does seem plausible that a different encoding could 
avoid this problem. I will experiment with a few different things and get back 
to you if I find anything interesting (assuming you don’t beat me to it first!).
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fusing loops by specializing on functions with SpecConstr?

2020-03-31 Thread Alexis King
> On Mar 31, 2020, at 17:05, Sebastian Graf  wrote:
> 
> Yeah, SPEC is quite unreliable, because IIRC at some point it's either 
> consumed or irrelevant. But none of the combinators you mentioned should rely 
> on SpecConstr! They are all non-recursive, so the Simplifier will take care 
> of "specialisation". And it works just fine, I just tried it

Ah! You are right, I did not read carefully enough and misinterpreted. That 
approach is clever, indeed. I had tried something similar with a CPS encoding, 
but the piece I was missing was using the existential to tie the final knot.

I have tried it out on some of my experiments. It’s definitely a significant 
improvement, but it isn’t perfect. Here’s a small example:

mapMaybeSF :: SF a b -> SF (Maybe a) (Maybe b)
mapMaybeSF f = proc v -> case v of
  Just x -> do
y <- f -< x
returnA -< Just y
  Nothing -> returnA -< Nothing

Looking at the optimized core, it’s true that the conversion of Maybe to Either 
and back again gets eliminated, which is wonderful! But what’s less wonderful 
is the value passed around through `s`:

mapMaybeSF
  = \ (@ a) (@ b) (f :: SF a b) ->
  case f of { SF @ s f2 s2 ->
  SF
(\ (a1 :: Maybe a) (ds2 :: ((), ((), (((), (((), (((), s), ())), 
((), ((), (), ((), ()) ->

Yikes! GHC has no obvious way to clean this type up, so it will just grow 
indefinitely, and we end up doing a dozen pattern-matches in the body followed 
by another dozen allocations, just wrapping and unwrapping tuples.

Getting rid of that seems probably a lot more tractable than fusing the 
recursive loops, but I’m still not immediately certain how to do it. GHC would 
have to somehow deduce that `s` is existentially-bound, so it can rewrite 
something like

SF (\a ((), x) -> ... Yield ((), y) b ...) ((), s)

to

SF (\a x -> ... Yield y b) s

by parametricity. Is that an unreasonable ask? I don’t know!

Another subtlety I considered involves recursive arrows, where I currently 
depend on laziness in (|||). Here’s one example:

mapSF :: SF a b -> SF [a] [b]
mapSF f = proc xs -> case xs of
  x:xs -> do
y <- f -< x
ys <- mapSF f -< xs
returnA -< (y:ys)
  [] -> returnA -< []

Currently, GHC will just compile this to `mapSF f = mapSF f` under your 
implementation, since (|||) and (>>>) are both strict. However, I think this is 
not totally intractable—we can easily introduce an explicit `lazy` combinator 
to rein in strictness:

lazy :: SF a b -> SF a b
lazy sf0 = SF g (Unit sf0) where
  g a (Unit sf1) = case runSF sf1 a of
(b, sf2) -> Yield (Unit sf2) b

And now we can write `lazy (mapSF f)` at the point of the recursive call to 
avoid the infinite loop. This defeats some optimizations, of course, but 
`mapSF` is fundamentally recursive, so there’s only so much we can really 
expect.

So perhaps my needs here are less ambitious, after all! Getting rid of all 
those redundant tuples is my next question, but that’s rather unrelated from 
what we’ve been talking about so far.

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fusing loops by specializing on functions with SpecConstr?

2020-03-31 Thread Alexis King
Sebastian and Simon,

Thank you both for your responses—they are all quite helpful! I agree with both 
of you that figuring out how to do this kind of specialization without any 
guidance from the programmer seems rather intractable. It’s too hard to divine 
where it would actually be beneficial, and even if you could, it seems likely 
that other optimizations would get in the way of it actually working out.

I’ve been trying to figure out if it would be possible to help the optimizer 
out by annotating the program with special combinators like the existing ones 
provided by GHC.Magic. However, I haven’t been able to come up with anything 
yet that seems like it would actually work.

> On Mar 31, 2020, at 06:12, Simon Peyton Jones  wrote:
> 
> Wow – tricky stuff!   I would never have thought of trying to optimise that 
> program, but it’s fascinating that you get lots and lots of them from FRP.

For context, the reason you get all these tiny loops is that arrowized FRP uses 
the Arrow and ArrowChoice interfaces to build its programs, and those 
interfaces use tiny combinator functions like these:

first :: Arrow a => a b c -> a (b, d) (c, d)
(***) :: Arrow a => a b d -> a c e -> a (b, c) (d, e)
(|||) :: ArrowChoice a => a b d -> a c d -> a (Either b c) d

This means you end up with programs built out of dozens or hundreds of uses of 
these tiny combinators. You get code that looks like

first (left (arr f >>> g ||| right h) *** second i)

and this is a textbook situation where you want to specialize and inline all 
the combinators! For arrows without this tricky recursion, doing that works as 
intended, and GHC’s simplifier will do what it’s supposed to, and you get fast 
code.

But with FRP, each of these combinators is recursive. This means you often get 
really awful code that looks like this:

arr (\case { Nothing -> Left (); Just x -> Right x }) >>> (f ||| g)

This converts a Maybe to an Either, then branches on it. It’s analogous to 
writing something like this in direct-style code:

let y = case x of { Nothing -> Left (); Just x -> Right x }
in case y of { Left () -> f; Right x -> g x }

We really want the optimizer to eliminate the intermediate Either and just 
branch on it directly, and if GHC could fuse these tiny recursive loops, it 
could! But without that, all this pointless shuffling of values around remains 
in the optimized program.

> I wonder whether it’d be possible to adjust the FRP library to generate 
> easier-to-optimise code. Probably not, but worth asking.

I think it’s entirely possible to somehow annotate these combinators to 
communicate this information to the optimizer, but I don’t know what the 
annotations ought to look like. (That’s the research part!)

But I’m not very optimistic about getting the library to generate 
easier-to-optimize code with the tools available today. Sebastian’s example of 
SF2 and stream fusion sort of works, but in my experience, something like that 
doesn’t handle enough cases well enough to work on real arrow programs.

> Unrolling one layer of a recursive function.  That seems harder: how we know 
> to *stop* unrolling as we successively simplify?  One idea: do one layer of 
> unrolling by hand, perhaps even in FRP source code:
> add1rec = SF (\a -> let !b = a+1 in (b,add1rec))
> add1 = SF (\a -> let !b = a+1 in (b,add1rec))

Yes, I was playing with the idea at one point of some kind of RULE that inserts 
GHC.Magic.inline on the specialized RHS. That way the programmer could ask for 
the unrolling explicitly, as otherwise it seems unreasonable to ask the 
compiler to figure it out.

> On Mar 31, 2020, at 08:08, Sebastian Graf  wrote:
> 
> We can formulate SF as a classic Stream that needs an `a` to produce its next 
> element of type `b` like this (SF2 below)

This is a neat trick, though I’ve had trouble getting it to work reliably in my 
experiments (even though I was using GHC.Types.SPEC). That said, I also feel 
like I don’t understand the subtleties of SpecConstr very well, so it could 
have been my fault.

The more fundamental problem I’ve found with that approach is that it doesn’t 
do very well for arrow combinators like (***) and (|||), which come up very 
often in arrow programs but rarely in streaming. Fusing long chains of 
first/second/left/right is actually pretty easy with ordinary RULEs, but (***) 
and (|||) are much harder, since they have multiple continuations.

It seems at first appealing to rewrite `f *** g` into `first f >>> second g`, 
which solves the immediate problem, but this is actually a lot less efficient 
after repeated rewritings. You end up rewriting `(f ||| g) *** h` into `first 
(left f) >>> first (right g) >>> second h`, turning two distinct branches into 
four, and larger programs have much worse exponential blowups.

So that’s where I’ve gotten stuck! I’ve been toying with the idea of thinking 
about expression “shells”, so if you have something like

first (a ||| b) >>> c 

Fusing loops by specializing on functions with SpecConstr?

2020-03-29 Thread Alexis King
Hi all,

I have recently been toying with FRP, and I’ve noticed that
traditional formulations generate a lot of tiny loops that GHC does
a very poor job optimizing. Here’s a simplified example:

newtype SF a b = SF { runSF :: a -> (b, SF a b) }

add1_snd :: SF (String, Int) (String, Int)
add1_snd = second add1 where
  add1 = SF $ \a -> let !b = a + 1 in (b, add1)
  second f = SF $ \(a, b) ->
let !(c, f') = runSF f b
in ((a, c), second f')

Here, `add1_snd` is defined in terms of two recursive bindings,
`add1` and `second`. Because they’re both recursive, GHC doesn’t
know what to do with them, and the optimized program still has two
separate recursive knots. But this is a missed optimization, as
`add1_snd` is equivalent to the following definition, which fuses
the two loops together and consequently has just one recursive knot:

add1_snd_fused :: SF (String, Int) (String, Int)
add1_snd_fused = SF $ \(a, b) ->
  let !c = b + 1
  in ((a, c), add1_snd_fused)

How could GHC get from `add1_snd` to `add1_snd_fused`? In theory,
SpecConstr could do it! Suppose we specialize `second` at the call
pattern `second add1`:

{-# RULE "second/add1" second add1 = second_add1 #-}

second_add1 = SF $ \(a, b) ->
  let !(c, f') = runSF add1 b
  in ((a, c), second f')

This doesn’t immediately look like an improvement, but we’re
actually almost there. If we unroll `add1` once on the RHS of
`second_add1`, the simplifier will get us the rest of the way. We’ll
end up with

let !b1 = b + 1
!(c, f') = (b1, add1)
in ((a, c), second f')

and after substituting f' to get `second add1`, the RULE will tie
the knot for us.

This may look like small potatoes in isolation, but real programs
can generate hundreds of these tiny, tiny loops, and fusing them
together would be a big win. The only problem is SpecConstr doesn’t
currently specialize on functions! The original paper, “Call-pattern
Specialisation for Haskell Programs,” mentions this as a possibility
in Section 6.2, but it points out that actually doing this in
practice would be pretty tricky:

> Specialising for function arguments is more slippery than for
> constructor arguments. In the example above the argument was a
> simple variable, but what if it was instead a lambda term? [...]
>
> The trouble is that lambda abstractions are much more fragile than
> constructor applications, in the sense that simple transformations
> may make two abstractions look different although they have the
> same value.

Still, the difference this could make in a program of mine is so
large that I am interested in exploring it anyway. I am wondering if
anyone has investigated this possibility any further since the paper
was published, or if anyone knows of other use cases that would
benefit from this capability.

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Specializing functions with implicit parameters

2020-03-14 Thread Alexis King
> On Mar 14, 2020, at 20:03, Sandy Maguire  wrote:
> 
> What GHC are you testing against? I suspect 
> https://gitlab.haskell.org/ghc/ghc/merge_requests/668 
>  will fix this.

I’ve tested against HEAD. I think the change you link is helpful, but it 
doesn’t quite get there: the usage gets dumped before specHeader even gets a 
chance to look at the call. The relevant bit of code is here:

https://gitlab.haskell.org/ghc/ghc/blob/1de3ab4a147eeb0b34b24a3c0e91f174e6e5cb79/compiler/specialise/Specialise.hs#L2274-2302
 


Specifically, this line seals the deal:

ClassPred cls _ -> not (isIPClass cls)  -- Superclasses can't be IPs

So maybe the right fix is just to change the role of type_determines_value so 
that it turns SpecDicts into UnspecArgs, and then with your change everything 
would just happily work out.___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Specializing functions with implicit parameters

2020-03-14 Thread Alexis King
Hi all,

I discovered today that GHC never specializes functions with implicit 
parameters. This is not that surprising—I wouldn’t expect GHC to specialize the 
implicit parameters themselves—but it’s unfortunate because it means a single 
implicit parameter somewhere can transitively destroy specialization that would 
otherwise be very helpful.

Is there any obstacle to specializing these functions’ other dictionaries and 
leaving the implicit parameters alone? That is, if I have a function

foo :: (?foo :: Bool, Show a) => a -> String

could GHC specialize `foo @Int` to

foo :: (?foo :: Bool) => Int -> String

treating the implicit parameter little differently from an ordinary function 
argument?

As far as I can tell, there isn’t any real obstacle to doing this, so unless 
I’m missing something, I might give it a try myself. I just wanted to make sure 
I wasn’t missing anything before diving in.

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Feasibility of native RTS support for continuations?

2020-03-03 Thread Alexis King
As a small update on this for anyone following along, I submitted a GHC 
proposal about a week ago to add the discussed primops (albeit with some 
tweaked names). For those who haven’t seen it already, the pull request is here:

https://github.com/ghc-proposals/ghc-proposals/pull/313

So far, the reception has been quite positive, so I’m optimistic about getting 
these added. Of course, if anyone has any concerns, please voice them in the PR 
thread!

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Calling an unknown function from low-level Cmm

2020-02-24 Thread Alexis King
> On Feb 24, 2020, at 20:03, Simon Peyton Jones  wrote:
> 
> I don’t know the answer to this, but Alexis when you find out can I ask (I 
> know I'm a broken record on this) that you write a Note to explain, with 
> pointers from the various places you looked when you were trying to find out 
> the answer?

If I do find out, I will! However, I’ve interpreted the lack of response to 
mean there isn’t one: the only easy way to do this is to pass the arguments on 
the stack using one of the stg_ap closures. I’m not sure if that non-answer is 
worth writing someplace, and if it is, I’m not quite sure where it ought to go. 
(Perhaps in GHC/Cmm/Parser.y?)
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Calling an unknown function from low-level Cmm

2020-02-14 Thread Alexis King
Hi all,

I’m trying to understand how to properly call an unknown function from 
low-level Cmm code. If I’m just applying a function to a state token, it’s 
easy; I can just do

R1 = io;
jump stg_ap_v_fast [R1];

since the calling convention is consistent in that case. But what if my 
function takes actual arguments? I can’t do

R1 = fun;
R2 = arg;
jump stg_ap_p_fast [R1, R2];

because if the calling convention doesn’t pass any arguments in registers, that 
would be wrong. I could check if NO_ARG_REGS is defined and generate different 
code in that situation, but that seems extreme. One option I think would work 
would be to do

R1 = fun;
Sp_adj(-2);
Sp(1) = arg;
jump RET_LBL(stg_ap_p) [R1];

but that seems wasteful if I have the argument in a register already anyway. Am 
I missing something?

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Confused about PAP object layout

2020-02-14 Thread Alexis King
Disclaimer: I am not an expert. But I happened to have been looking at this 
code just yesterday, so I’ll try to answer to check my understanding. :)

Fundamentally, a PAP is not fully-saturated, so the number of arguments in its 
payload may be smaller than the information contained in the function’s bitmap. 
scavenge_large_bitmap calls walk_large_bitmap, which uses the bitmap as a 
“ruler” to guide the traversal, lining up each element in the payload to 
information in the bitmap. But the traversal only actually walks a payload of 
the specified size, so if there’s less information in the payload than there is 
information in the bitmap, the traversal will just terminate early.

> On Feb 14, 2020, at 09:30, Ömer Sinan Ağacan  wrote:
> 
> Right, I think that's the problem. We then pass the same "size" to
> scavenge_large_bitmap as the size of the bitmap. So we assume size of the 
> bitmap
> is pap->n_args.
> 
> So the call stack is
> 
> - scavenge_PAP, calls scavenge_PAP_payload with pap->n_args as "size"
> - scavenge_PAP_payload, calls scavenge_large_bitmap with "size" (== 
> pap->n_args)
>  as the bitmap's size
> 
> Is this expected?
> 
> Ömer
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Feasibility of native RTS support for continuations?

2020-02-12 Thread Alexis King
> On Feb 10, 2020, at 02:18, Simon Marlow  wrote:
> 
>> On Mon, 10 Feb 2020 at 08:17, Simon Marlow  wrote:
>> 
>> Let me just say "unsafePerformIO" :)  You probably want to at least ensure 
>> that things don't crash in that case, even if you can't give a sensible 
>> semantics to what actually happens. We have a similar situation with 
>> unsafeIOToST - we can't tell you exactly what it does in general, except 
>> that it doesn't crash (I hope!).
> 
> Typo - I meant unsafeIOToSTM here.

I’ve been thinking about this. At first I figured you were probably right, and 
I decided I’d switch to a more raiseAsync-like approach. But once I started 
trying to implement it, I became less convinced it makes sense.

As its name implies, an AP_STACK is sort of like a saturated application, where 
the stack itself is the “function.” Extending the analogy, a continuation would 
be a PAP_STACK, since the stack is awaiting a value. This difference is 
significant. Suppose you write:

let x = 1 + unsafePerformIO (shift f >>= g) in ...

If you force x, the stack will unwind to the nearest reset. When you unwind 
past x’s UPDATE_FRAME, you can’t replace the blackhole with an AP_STACK, since 
the captured slice of the stack represents the expression

\m -> 1 + unsafePerformIO (m >>= g)

which is a function, not a thunk. The only logical interpretation of this 
situation is that x is a thunk quite like

let y = 1 + error "bang"

except that it doesn’t just abort to the enclosing reset frame when forced, it 
actually composes the captured continuation with the upper frames of the 
current stack and passes the composed continuation to f to resume computation. 
That’s an awful lot of trouble given the resulting semantics is going to be 
unpredictable, anyway!

I should be clear that I do not intend shift/reset to have any safe interface 
in IO directly. Even if you could make it type safe, it would break all kinds 
of code that manages resources using bracket. Rather, I have built a library 
that defines a totally separate Eff monad, and that monad uses the primops 
directly, wrapped in a safe interface. It isn’t possible for a user to screw 
things up using unsafePerformIO because there is no way to call shift from IO 
(unless you wrap shift# in IO yourself, but then I think you can be expected to 
know what you’re getting yourself into). So without mucking about with 
GHC.Exts, you still can’t get segfaults.

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Feasibility of native RTS support for continuations?

2020-02-10 Thread Alexis King
> On Feb 6, 2020, at 02:28, Simon Marlow  wrote:
> 
> The issue here is that raiseAsync is destructive - it *moves* the stack to 
> the heap, rather than copying it. So if you want to continue execution where 
> you left off, for shift#, you would have to copy it back onto the stack 
> again. That's the point I was trying to highlight here.

Ah, yes, I see what you mean! It happens that for my use case I actually do 
want to unwind the stack when I capture a continuation, so that isn’t a problem 
for me.

> Yes, these are all the things that make raiseAsync tricky! You can either 
> copy what raiseAsync does (but be warned, it has taken a lot of iteration to 
> get right) or try to use raiseAsync and/or modify it to do what you want.

My point was more that I’m unsure that shift# should handle most of those 
cases. For raiseAsync, it makes sense, since asynchronous interrupts can, by 
their nature, occur at any time, even during pure code. But my shift# operation 
lives in IO, and the intent is to only capture up to a reset# in the same state 
thread.

My justification for this is that if you could use shift# in pure code, it 
would be ill-defined what you’d even be capturing. Suppose you return a thunk 
containing a call to shift#. When the thunk is evaluated, you capture up to the 
nearest reset#… but who knows what that is now? This opens you up to all sorts 
of general badness.

Therefore, I don’t think there should ever be an UPDATE_FRAME in the captured 
continuation—if there is, it’s probably a bug. So unless someone can think of 
any valid use cases, I’ll make that more explicit by modifying the 
continuation-capturing code to add some assertions that those frames never 
appear in the captured stack.___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Feasibility of native RTS support for continuations?

2020-02-01 Thread Alexis King
I took a stab at implementing this today, using the “continuation is a
stack” implementation strategy I described in my previous email. I
haven’t tried very hard to break it yet, but this tiny test program
works:

{-# LANGUAGE BangPatterns, BlockArguments, MagicHash,
 ScopedTypeVariables, UnboxedTuples #-}

import GHC.Prim
import GHC.Types

data Continuation a b = Continuation# (Continuation# RealWorld a b)

reset :: IO a -> IO a
reset (IO m) = IO (reset# m)

shift :: (Continuation a b -> IO b) -> IO a
shift f = IO (shift# \k -> let !(IO m) = f (Continuation# k) in m)

applyContinuation :: Continuation a b -> a -> IO b
applyContinuation (Continuation# k) a = IO (applyContinuation# k a)

main :: IO ()
main = do
  ns <- reset do
n <- shift \(k :: Continuation Integer [Integer]) -> do
  a <- applyContinuation k 2
  b <- applyContinuation k 3
  pure (a ++ b)
pure [n]
  print ns

The output of this program is [2, 3], as expected.

My implementation doesn’t share any code with raiseAsync. Currently, it
isn’t very clever:

* reset# pushes a RET_SMALL frame with a well-known info pointer,
  _reset_frame_info.

* shift# walks the stack and copies it up to the nearest reset
  frame. If the stack consists of several chunks, it only copies the
  chunk that contains the reset frame, and it just repurposes the
  other chunks as the continuation (since the stack is unwinding
  anyway).

* applyContinuation# copies the captured stack and updates the
  UNDERFLOW frames as needed to point to the current stack.

* I haven’t implemented it yet, but it would be straightforward to
  implement an applyContinuationOneShot# operation that works like
  applyContinuation#, but doesn’t actually copy anything and just
  updates the UNDERFLOW frames in the captured stack itself.

This seems to work in my very simple examples, but there are also things
I know it doesn’t handle properly:

* It doesn’t make any attempt to handle modifications to the
  interrupt masking state properly. The right thing to do here is
  probably to look for mask/unmask frames on the stack during
  unwinding and to stash that information somewhere.

* It doesn’t do anything special for UPDATE_FRAMEs, so if there’s an
  UPDATE_FRAME that owns a blackhole on the stack, things will go
  quite wrong.

  I haven’t been worrying about this because I don’t think there
  should ever be any update frames between shift# and reset#. In the
  case of raiseAsync, the location of the “prompt” is well-defined:
  it’s the update frame. But shift# captures up to an explicit
  prompt, so using shift# when there’s an update frame on the stack
  can surely only lead to nonsense... right?

* It doesn’t do anything special for STM frames, so trying to
  capture a continuation through those will be similarly broken.

There are also probably bugs I don’t know about — I haven’t exercised
the implementation very hard yet — but I’ll keep playing with it. If
anyone is at all interested, I’ve pushed the code to a branch here:


https://gitlab.haskell.org/lexi.lambda/ghc/compare/master...first-class-continuations

My thanks again to everyone’s help!

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Feasibility of native RTS support for continuations?

2020-01-31 Thread Alexis King
> On Jan 30, 2020, at 02:35, Simon Marlow  wrote:
> 
> My guess is you can almost do what you want with asynchronous exceptions but 
> some changes to the RTS would be needed.
> 
> There's a bit of code in the IO library that literally looks like this 
> (https://gitlab.haskell.org/ghc/ghc/blob/master/libraries%2Fbase%2FGHC%2FIO%2FHandle%2FInternals.hs#L175)

Thanks for the pointer, I definitely had not discovered that! That is an 
interesting trick. I think your explanation paired with the Note is enough for 
it to make sense to me, though I don’t yet understand all the implementation 
subtleties of raiseAsync itself.

> Also you might want to optimise the implementation so that it doesn't 
> actually tear down the stack as it copies it into the heap, so that you could 
> avoid the need to copy it back from the heap again in shift#.

I don’t fully understand this point — do you mean “avoid the need to copy it 
back on continuation application”? shift# just copies the stack slice to the 
heap, so it doesn’t need to copy it back.

If I was right, and you were referring to continuation application rather than 
shift#, I agree with you there. It seems as though it ought to be possible to 
represent the stack slice as a stack itself, so if the stack looks like

┌───┐
│ RET_SMALL │
├───┤
│ CATCH │
├───┤
│ RESET │
├───┤

then the captured continuation could itself essentially be a stack like

┌───┐
│ RET_SMALL │
├───┤
│ CATCH │
├───┤
│ UNDERFLOW │
└───┘

where restoring a continuation just copies the captured stack and updates its 
underflow frame to point at the top of the current stack. If the caller 
promises not to use the continuation again after applying it, the copying could 
be skipped entirely, and the captured stack could just become the new stack.

However, I don’t understand enough about the way the RTS currently works to 
know if this is viable. For example, what if I write this:

reset (mask_ (shift f))

Now presumably I want to ensure the masking state is restored when I invoke the 
continuation. But it can’t just be unconditionally restored, since if I write

mask_ (reset (shift f >>= g))

then the mask frame isn’t included on the stack, so applying the continuation 
shouldn’t affect the masking state. Presumably this means a continuation 
restore can’t be as simple as copying the captured stack frames onto the 
current stack, since restoring certain frames affects other parts of the RTS 
state.

(Tangentially, it seems like this particular example is not handled properly in 
the existing capture/restore code, as this comment in Exception.cmm notes:

 NB. there's a bug in here.  If a thread is inside an
 unsafePerformIO, and inside maskAsyncExceptions# (there is an
 unmaskAsyncExceptions_ret on the stack), and it is blocked in an
 interruptible operation, and it receives an exception, then the
 unsafePerformIO thunk will be updated with a stack object
 containing the unmaskAsyncExceptions_ret frame.  Later, when
 someone else evaluates this thunk, the original masking state is
 not restored.

I think getting this right probably matters more if continuations are added, so 
that’s something else to worry about.)

> So that's shift#. What about reset#? I expect it's something like 
> `unsafeInterleaveIO`, that is it creates a thunk to name the continuation. 
> You probably also want a `catch` in there, so that we don't tear down more of 
> the stack than we need to.

It would be nice to be able to capture slices of the stack that include catch 
frames, though theoretically it isn’t necessary — it would be possible to 
arrange for a continuation that wants to capture through a catch to do 
something like

reset (... (catch (reset ...) ...))

instead, then call shift twice and compose the two continuations by hand 
(inserting another call to catch in between). I don’t know enough yet to 
understand all the tradeoffs involved, but I’ll see if it’s possible to get 
away with the userland emulation, since I figure the less code in the RTS the 
better!

> Hope this is helpful.

Very much so, thank you!

> On Jan 30, 2020, at 10:31, Ben Gamari  wrote:
> 
> For the record, runtime system captures the stack state in an AP_STACK
> closure. This is done in rts/RaiseAsync.c:raiseAsync and some of this is
> described in the comment attached to that function.
> 
> As Simon PJ points out, this is all very tricky stuff, especially in a
> concurrent context. If you make any changes in this area do be sure to
> keep in mind the considerations described in Note [AP_STACKs must be
> eagerly blackholed], which arose out of the very-nast #13615.

Thank you for the pointers — I did manage to find raiseAsync, but I hadn’t seen 
that Note. I will try my best to be suitably wary, though I imagine I’m in far 
enough over my head that I don’t yet know 

Re: Feasibility of native RTS support for continuations?

2020-01-29 Thread Alexis King
> On Jan 29, 2020, at 03:32, Simon Peyton Jones  wrote:
> 
> Suppose a thread happens to be evaluating a pure thunk for (factorial 200). 
> […] This stack-freezing stuff is definitely implemented.

That’s fascinating! I had no idea, but your explanation makes sense (as do the 
papers you linked). That is definitely promising, as it seems like many of the 
tricky cases may already be accounted for? I’ll see if I can follow the Cmm 
code well enough to hunt down how it’s implemented.

One other thing I have been thinking about: this is completely incompatible 
with the state hack, isn’t it? That is not a showstopper, of course—I do not 
intend to suggest that continuations be capturable in ordinary IO—but it does 
mean I probably want a way to selectively opt out. (But I’ll worry about that 
if I ever get that far.)

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Feasibility of native RTS support for continuations?

2020-01-28 Thread Alexis King
> On Jan 28, 2020, at 04:09, Simon Peyton Jones  wrote:
> 
> I've thought about this quite a bit in the past, but got stalled for lack of 
> cycles to think about it more.  But there's a paper or two

Many thanks! I’d stumbled upon the 2007 paper, but I hadn’t seen the 2016 one. 
In the case of the former, I had thought it probably wasn’t enormously 
relevant, since the “continuations” appear to be fundamentally one-shot. At 
first glance, that doesn’t seem to have changed in the JFP article, but I 
haven’t really read it yet, so maybe I’m mistaken. I’ll take a closer look.

> On the effects front I think Daan Leijen is doing interesting stuff, although 
> I'm not very up to date:
> https://www.microsoft.com/en-us/research/people/daan/publications/

Indeed, I’ve read a handful of his papers while working on this! I didn’t 
mention it in the original email, but I’ve also talked a little with Matthew 
Flatt about efficient implementation of delimited control, and he pointed me to 
a few papers a couple of months ago. One of those was “Final Shift for call/cc: 
a Direct Implementation of Shift and Reset” by Gasbichler and Sperber, which 
describes an approach closest to what I currently have in mind to try to 
implement in the RTS.

> One interesting dimension is whether or not the continuations you capture are 
> one-shot.  If so, particularly efficient implementations are possible.

Quite so. One thing I’ve considered is that it’s possible to obtain much of 
that efficiency even without requiring strict one-shot continuations if you 
have a separate operation for restoring a continuation that guarantees you 
won’t ever restore it again, sort of like the existing unsafeThaw/unsafeFreeze 
operations. That is, you can essentially convert a multi-shot continuation into 
a one-shot continuation and reap performance benefits, even if you’ve already 
applied the continuation.

This is a micro-optimization, though, so I’m not worrying too much about it 
right now.

> Also: much of the "capture stack chunk" stuff is *already* implemented, 
> because it is (I think) what happens when a thread receives an asynchronous 
> exception, and just abandon its evaluation of thunks that it has started work 
> on.

Now that is very interesting, and certainly not something I would have 
expected! Why would asynchronous exceptions need to capture any portion of the 
stack? Exceptions obviously trigger stack unwinding, so I assumed the “abort to 
the current prompt” part of my implementation would already exist, but not the 
“capture a slice of the stack” part. Could you say a little more about this, or 
point me to some relevant code?

Thanks again,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Feasibility of native RTS support for continuations?

2020-01-28 Thread Alexis King
Hi all,

tl;dr: I want to try to implement native support for capturing slices of
the RTS stack as a personal experiment; please tell me the obstacles I
am likely to run into. Much more context follows.

---

I have been working on an implementation of an algebraic effect system
that uses unsafe primops to be as performant as possible. However, the
unavoidable need to CPS the entire program balloons heap allocation.
Consider the following definition:

f a b = g a >>= \c -> h (b + c)

Assume `g` and `h` are not inlined. If the monad used is IO, this will
be compiled efficiently: the result of `g` is returned on the stack, and
no closure needs to be allocated for the lambda. However, if the monad
supports capturing the continuation, the above definition must be CPS’d.
After inlining, we end up with

f a b = \k -> let lvl = \c -> h (b + c) k in g a lvl

which must allocate a closure on the heap. This is frustrating, as it
must happen for every call to a non-inlined monadic operation, even if
that operation never captures the continuation. In an algebraic effect
system, there are many shortcuts that avoid the need to capture the
continuation, and under my implementation, large swaths of code never do
so. I’ve managed to exploit that to get some savings, but I can’t escape
the need to allocate all these closures.

This motivates my question: how difficult would it be to allow capturing
a portion of the RTS call stack directly? My requirements are fairly
minimal, as continuations go:

  1. Capturing a continuation is only legal from within a strict state
 thread (i.e. IO or strict ST).

  2. The continuation is captured up to a prompt, which would be a new
 kind of RTS stack frame. Prompts are not tagged, so there is only
 ever exactly one prompt active at any time (which may be the root
 prompt).

  3. Capturing a continuation is unsafe. The behavior of capturing a
 continuation is undefined if the current prompt was not created by
 the current state thread (and it is never legal to capture up to
 the root prompt).

  4. Applying a continuation is unsafe. Captured continuations return
 `Any`, and type safety is the caller’s obligation.

  5. Continuations are “functional,” which is to say applying them does
 not trigger any additional stack unwinding.

This minimal support for primitive continuation capturing would be
enough to support my efficient, safe delimited control implementation.
In my ignorant mind, implementing this ought to be as simple as defining
two new primops,

reset# :: (State# s -> (# State# s, a #))
   -> State# s -> (# State# s, a #)

shift# :: ((a -> State# s -> (# State# s, Any #))
   -> State# s -> (# State# s, Any #))
   -> State# s -> (# State# s, a #)

where reset# pushes a new prompt frame and shift# captures a slice of
the RTS stack up to that frame and copies it into the heap. Restoring a
continuation would copy all the captured frames onto the end of the
current stack. Sounds simple enough!

I would like to experiment with implementing something like this myself,
just to see if it would really work, but somehow I doubt it is actually
as simple as it sounds. Minor complications are fine, but what worries
me are major obstacles I haven’t found in my limited attempts to learn
about the RTS.

So far, I’ve read the old “The New GHC/Hugs Runtime System” paper, which
still seems mostly accurate from a high level, though I imagine many
details have changed since then. I’ve also read the
Commentary/RTS/Storage/Stack wiki page, and I’ve peeked at some of the
RTS source code. I’ve also skimmed a handful of other papers and wiki
pages, but it’s hard to know what I’m looking for. Can anyone point me
to better resources or give me some insight into what will be hard?

Thanks in advance,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Superclasses of type families returning constraints?

2020-01-11 Thread Alexis King
> On Jan 6, 2020, at 15:41, Simon Peyton Jones  wrote:
> 
> Ah, I see a bit better now.  So you want a way to get from evidence that
>   co1 :: F a b ~# ()
> to evidence that
>   co2 :: a ~# b
> 
> So you'd need some coercion form like
>   co2 = runBackwards co1
> 
> or something, where runBackwards is some kind of coercion form, like sym or 
> trans, etc.

Precisely.

I’ve begun to feel there’s something markedly GADT-like going on here. Consider 
this similar but slightly different example:

type family F1 a where
  F1 () = Bool

Given this definition, this function is theoretically well-typed:

f1 :: F1 a -> a
f1 !_ = ()

Since we have access to closed-world reasoning, we know that matching on a term 
of type `F1 a` implies `a ~ ()`. But it gets more interesting with more 
complicated examples:

type family F2 a b where
  F2 ()() = Bool
  F2 (Maybe a) a  = Int

These equations theoretically permit the following definitions:

f2a :: F2 () b -> b
f2a !_ = ()

f2b :: F2 (Maybe ()) b -> b
f2b !_ = ()

That is, matching on a term of type `F2 a b` gives rise to a set of 
*implications.* This is sort of interesting, since we can’t currently write 
implications in source Haskell. Normally we avoid this problem by using GADTs, 
so F2 would instead be written like this:

data T2 a b where
  T2A :: Bool -> T2 () ()
  T2B :: Int -> T2 (Maybe a) a

But these have different representations: `T2` is tagged, so if we had a value 
of type `T2 a ()`, we could branch on it to find out if `a` is `()` or `Maybe 
()` (and if we had a `Bool` or an `Int`, for that matter). `F2`, on the other 
hand, is just a synonym, so we cannot do that.

In this case, arguably, the right answer is “just use a GADT.” In my case, 
however, I cannot, because I actually want to write something like

type family F2' a b where
  F2' ()() = Void#
  F2' (Maybe a) a  = Void#

so that the term has no runtime representation at all. Even if we had 
`UnliftedData` (and we don’t), and even if it supported GADTs (seems unlikely), 
this still couldn’t be encoded using it because the term would still have to be 
represented by an `Int#` for the same reason `(# Void# | Void# #)` is. On the 
other hand, if this worked as above, `F2'` would really just be a funny-looking 
way of encoding a proof of a set of implications in source Haskell.

> I don't know what a design for this would look like.   And even if we had it, 
> would it pay its way, by adding substantial and useful new programming 
> expressiveness?

For the latter question, I definitely have no idea! In my time writing Haskell, 
I have never personally found myself wanting this until now, so it may be of 
limited practical use. I have no idea how to express my `F2'` term in Haskell 
today, and I’d very much like to be able to, but of course this is not the only 
mechanism by which it could be expressed.

Still, I find the relationship interesting, and I wonder if this particular 
connection between type families and GADTs is well-known. If it isn’t, I wonder 
whether or not it’s useful more generally. Of course, it might not be... but 
then maybe someone else besides me will also find it interesting, at least. :)

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Superclasses of type families returning constraints?

2020-01-06 Thread Alexis King
> On Jan 6, 2020, at 05:29, Richard Eisenberg  wrote:
> 
> You're absolutely right that improvement doesn't solve your problem. But 
> what I didn't say is that there is no real reason (I think) that we can't 
> improve improvement to produce givens. This would likely require a change to 
> the coercion language in Core (and thus not to be taken lightly), but if we 
> can identify a class of programs that clearly benefits from that work, it is 
> more likely to happen. The thoughts about improvement were just a very basic 
> proof-of-concept. Sadly, the proof-of-concept failed, so I think a first step 
> in this direction would be to somehow encode these partial improvements, and 
> then work on changing Core. That road seems too long, though, so in the end, 
> I think constrained type families (with superclass constraints) might be a 
> more fruitful direction.


Thanks, this all makes sense to me. I actually went back and re-read the 
injective type families paper after responding to your previous email, and I 
discovered it actually alludes to the issue we’re discussing! At the end of the 
paper, in section 7.3, it provides the following example:

> Could closed type families move beyond injectivity and functional 
> dependencies by applying closed-world reasoning that derives solutions of 
> arbitrary equalities, provided a unique solution exists? Consider this 
> example:
> 
> type family J a where
>   J Int = Char
>   J Bool = Char
>   J Double = Float
> 
> One might reasonably expect that if we wish to prove (J a ∼ Float), we will 
> simplify to (a ∼ Double). Yet GHC does not do this as neither injectivity nor 
> functional dependencies can discover this solution.

This is not quite the same as what we’re talking about here, but it’s clearly 
in the same ballpark.

I think what you’re describing makes a lot of sense, and it would be 
interesting to explore what it would take to encode into core. After thinking a 
little more on the topic, I think that probably makes by far the most sense 
from the core side of things. But I agree it seems like a significant change, 
and I don’t know enough about the formalism to know how difficult it would be 
to prove soundness. (I haven’t done much formal proving of anything!)

Alexis___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Superclasses of type families returning constraints?

2020-01-06 Thread Alexis King
> On Jan 6, 2020, at 12:05, Alexis King  wrote:
> 
>type family F a b where
>  F a b = () -- regular (), not (() :: Constraint)

(Err, sorry, this should be `F a a = ()`. But I think you can understand what 
I’m getting at.)
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Superclasses of type families returning constraints?

2020-01-06 Thread Alexis King
> On Jan 6, 2020, at 08:52, Simon Peyton Jones  wrote:
> 
> This is rejected, and it's not in the least bit puzzling!  You have evidence 
> for (F a b) and you need to prove (a~b) -- for any a and b.  Obviously you 
> can't.  And in Core you could write (eq @Int @Bool (error "urk")) and you 
> jolly well don’t want it to return Refl.

I’m not sure I totally understand your complaint. If I were to write

eq :: a ~ b => a :~: b
eq = Refl

then in core I could still write `eq @Int @Bool (error "urk")`, and that would 
be entirely well-typed. It would not really return Refl at runtime, of course, 
but neither would the example I gave in my original email.

Perhaps a better way to phrase my original example is to write it this way:

type family F a b where
  F a b = () -- regular (), not (() :: Constraint)

eq :: F a b ~ () => a :~: b
eq = Refl

In this case, in core, we receive an argument of type `F a b ~ ()`, and we can 
force that to obtain a coercion of type `F a b ~# ()`. For reasons similar to 
the mechanisms behind injective type families, that equality really does imply 
`a ~# b`. The fact that the type family returns an empty constraint tuple is 
really incidental, and perhaps unnecessarily misleading.

Does that clear up your confusion? Or have I misunderstood your concern?

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Superclasses of type families returning constraints?

2020-01-05 Thread Alexis King
> On Jan 4, 2020, at 20:38, Richard Eisenberg  wrote:
> 
> I thought that, maybe, we could use partial improvement to give you what you 
> want

I think that improvement alone cannot possibly be enough here, as improvement 
by its nature does not provide evidence. Improvement allows us to take a set of 
constraints like

[G] FD2 a Bool
[W] FD2 a b

and derive [WD] b ~ Bool, but importantly this does not produce a new given! 
This only works if b is a metavariable, since we can solve the new wanted by 
simply taking b := Bool, but if b is rigid, we are just as stuck as before. In 
other words, improvement only helps resolve ambiguities, not derive any new 
information.

That’s why I think the “superclass” characterization is more useful. If instead 
we express your FD2 class as

class b ~ B a => FD2 a b where
  type B a

then if we have [G] FD2 a Bool, we can actually derive [G] B a ~ Bool, which is 
much stronger than what we were able to derive using improvement.

I imagine you are aware of all of the above already, but it’s not immediately 
clear to me from your description why you need functional dependencies (and 
therefore improvement) rather than this kind of approximation using 
superclasses and type families. Would modeling things with that approximation 
help at all? If not, why not? I think that would help me understand what you’re 
saying a little better.

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Superclasses of type families returning constraints?

2019-12-30 Thread Alexis King
Hi Richard,

Thanks for the pointer to constrained type families—that’s helpful to clarify 
some of my thoughts about this, and it is indeed relevant! One example I had in 
my real code seems difficult to express even with constrained type families, 
however; here is a stripped-down version of it:

type family F a b :: Constraint where
  F '[](A _)   = ()
  F (B a ': b) (C c d) = (a ~ c, F b d)

With the above type family, if we have the given `F (B a ': b) c`, we’d really 
like GHC to be able to derive `c ~ C t1 t2` so that the type family can 
continue to reduce, yielding the givens `a ~ t1` and `F b t2`. But we can’t use 
your trick of moving the equality into the RHS because we need to procure two 
fresh unification variables, something we are unable to do. The family is 
simply stuck, and there isn’t much we can do to cajole GHC into believing 
otherwise.

In the above example, what the “superclasses” of F are is much less clear. We 
end up with a few different implications:

F '[] a :-  a ~ A t1
F a (A b)   :-  a ~ '[]
F (B a ': b) c  :-  c ~ C t1 t2
F a (C b c) :-  a ~ (B t1 ': t2)

I don’t know if there is a way to encode those using the constraint language 
available in source Haskell today. If I understand correctly, I don’t think 
they can be expressed using quantified constraints. Maybe it’s more trouble 
than it’s worth, but I find it a little interesting to think about.

Anyway, thank you for the response; I think my original question has been 
answered. :)

Alexis

> On Dec 29, 2019, at 21:02, Richard Eisenberg  wrote:
> 
> Hi Alexis,
> 
> I'm not aware of any work in this direction. It's an interesting idea to 
> think about. A few such disconnected thoughts:
> 
> - You could refactor the type family equation to be `F a b = a ~ b`. Then 
> your program would be accepted. And -- assuming the more sophisticated 
> example is also of kind constraint -- any non-linear pattern could be 
> refactored similarly, so perhaps this observation would carry over.
> 
> - If we had constrained type families (paper: 
> https://repository.brynmawr.edu/cgi/viewcontent.cgi?article=1075=compsci_pubs,
>  proposal:https://github.com/ghc-proposals/ghc-proposals/pull/177), you could 
> express a superclass constraint on the enclosing type family, which would 
> likely work very much in the way you would want.

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Superclasses of type families returning constraints?

2019-12-27 Thread Alexis King
Hello all,

I recently noticed that GHC rejects the following program:

type family F a b :: Constraint where
  F a a = ()

eq :: F a b => a :~: b
eq = Refl

This is certainly not shocking, but it is a little unsatisfying: as far as I 
can tell, accepting this program would be entirely sound. That is, `a ~ b` is 
morally a “superclass” of `F a b`. In this example the type family is 
admittedly rather pointless, as `a ~ b` could be used instead, but it is 
possible to construct more sophisticated examples that cannot be so 
straightforwardly expressed in other ways.

I am therefore curious: has this kind of scenario ever been discussed before? 
If yes, is there a paper/GitLab issue/email thread somewhere that discusses it? 
And if no, is there any fundamental reason that GHC does not propagate such 
information (i.e. it’s incompatible with some aspect of the type system or 
constraint solver), or is it simply something that has not been explored? 
(Maybe you think the above program is horrible and *shouldn’t* be accepted even 
if it were possible, but that is a different question entirely. :))

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Simplifier bug fixed in GHC 8.8.1?

2019-10-28 Thread Alexis King
Hi all,

I have an odd question: I’ve bumped into a clear simplifier bug, and although 
it only happens on GHC 8.6.5, not 8.8.1, I’d like to locate the change that 
fixed it. My library’s test suite currently fails on GHC 8.6.5 due to the bug, 
and I’d rather not force all my users to upgrade to 8.8 if I can help it, so 
I’m hoping to find a workaround.

The minimal test case I’ve found for the bug is this program:

{-# LANGUAGE GeneralizedNewtypeDeriving, StandaloneDeriving, TypeFamilies 
#-}

import Control.Exception
import Control.Monad.IO.Class
import Control.Monad.Trans.Identity
import Control.Monad.Trans.Reader

class Monad m => MonadFoo m where
  foo :: m a -> m a
instance MonadFoo IO where
  foo m = onException m (pure ())
instance MonadFoo m => MonadFoo (ReaderT r m) where
  foo m = ReaderT $ \r -> foo (runReaderT m r)
deriving instance MonadFoo m => MonadFoo (IdentityT m)

type family F m where
  F m = IdentityT m

newtype FT m a = FT { runFT :: F m a }
  deriving (Functor, Applicative, Monad, MonadIO, MonadFoo)

main :: IO ()
main = run (foo (liftIO (throwIO (IndexOutOfBounds "bang"
  where
run :: ReaderT () (FT (ReaderT () IO)) a -> IO a
run = flip runReaderT () . runIdentityT . runFT . flip runReaderT ()

Using GHC 8.6.5 on macOS 10.14.5, compiling this program with optimizations 
reliably triggers the -fcatch-bottoms sanitization:

$ ghc -O -fcatch-bottoms weird.hs && ./weird
[1 of 1] Compiling Main ( weird.hs, weird.o )
Linking weird ...
weird: Bottoming expression returned

What goes wrong? Somehow the generated core for this program includes the 
following:

lvl_s47B :: SomeException
lvl_s47B = $fExceptionArrayException_$ctoException lvl_s483

m_s47r :: () -> State# RealWorld -> (# State# RealWorld, () #)
m_s47r
  = \ _ (eta_B1 :: State# RealWorld) -> raiseIO# lvl_s47B eta_B1

main_s2Ww :: State# RealWorld -> (# State# RealWorld, () #)
main_s2Ww
  = \ (eta_a2wK :: State# RealWorld) ->
  catch# (case m_s47r `cast`  of { }) raiseIO# eta_a2wK

This core is completely bogus: it assumes that m_s47r is bottom, but m_s47r is 
a top-level function! The program still passes -dcore-lint, unfortunately, as 
it is still well-typed. (Also, in case it helps: -ddump-simplifier-iterations 
shows that the buggy transformation occurs in the first iteration of the very 
first simplifier pass.)

I’ve been trying to figure out what change might have fixed this so that I can 
assess if it’s possible to work around, but I haven’t found anything obvious. 
I’ve been slowly `git bisect`ing to look for the commit that introduced the 
fix, but many of the commits I’ve tested cause unrelated panics on my machine, 
which has been exacerbating the problem of the slow recompilation times. I’m a 
little at wits’ end, but opening a bug report hasn’t felt right, since the bug 
does appear to already be fixed.

Does this issue ring any bells to anyone on this list? Is there a particular 
patch that landed between GHC 8.6.5 and GHC 8.8.1 that might have fixed this 
problem? If not, I’ll keep trying with `git bisect`, but I’d appreciate any 
pointers.

Thanks,
Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Debugging specialization

2019-10-24 Thread Alexis King
Hi all,

I am trying to understand why GHC is not specializing an imported INLINABLE 
instance method, and the information provided by -ddump-spec 
-Wall-missed-specialisations has not been enough to help me figure it out. Is 
there some easy/well-trodden way that I could build GHC with some additional 
instrumentation in place to help me better understand the decisions being made 
by the specializer, or is -ddump-spec the most granularity available?

I’ve noticed that Specialise.hs has handful of pprTrace calls sprinkled about, 
but they are all commented out. Is the recommended method to just uncomment 
these calls and rebuild GHC?

Thanks,
Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Compiling projects with dependencies using GHC HEAD

2019-10-04 Thread Alexis King
> On Oct 1, 2019, at 18:56, Ben Gamari  wrote:
> 
> Indeed it's just a function of this being a bit work in progress; we
> only started allowing multiple patch versions relatively recently and it
> looks like the constraint file generation hasn't been updated to account
> for this yet. I'll try to fix that tonight.

Thanks for the clarification. I just wanted to follow up and say that this was 
indeed enough to get me off the ground, and I was eventually able to 
successfully compile my project with GHC HEAD after updating a few other 
dependencies for MonadFail compatibility. Thanks again for your efforts!

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Compiling projects with dependencies using GHC HEAD

2019-10-01 Thread Alexis King
> On Oct 1, 2019, at 18:19, Ben Gamari  wrote:
> 
> Your question is a very good one which, until quite recently, there was
> not a particularly good answer to. However, in the past months we have
> been working on infrastructure (in the form of head.hackage [0]) to make
> GHC more testable in pre-release form.
> 
> I have a draft blog post explaining the currently state-of-play
> here [1,2]. Unfortunately between ICFP, release things, vacation, and bug
> fixing I've lacked the time to finish these off (the tutorial in particular).

Hi Ben,

Many thanks for your response—this infrastructure looks incredibly helpful! 
I’ll give it a try. One question I do have just from giving it a look involves 
the cabal.constraints file available here:

https://ghc.gitlab.haskell.org/head.hackage/cabal.constraints 


A number of packages appear to be listed twice, and some of them (but not all 
of them) are pinned to different versions. Is that intentional? Or is it just a 
part of still being a bit of a work in progress?

In any case, this gets me quite a bit further, and though I’m not sure yet if 
my project will actually build, I can now at least construct a valid build 
plan. Thanks again!

Alexis___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Compiling projects with dependencies using GHC HEAD

2019-10-01 Thread Alexis King
Hi all,

I have a project that I’d like to try to build with GHC HEAD, but it’s a
large project with a lot of dependencies, and while there’s a lot of
information on building GHC itself, I haven’t been able to find very many
resources on what workflow people use to actually build projects in the
wild using it. I found this wiki page, “Installing packages in your test
compiler,” which has a little guidance, but the details are sparse:

https://gitlab.haskell.org/ghc/ghc/wikis/debugging/installing-packages-inplace


However, I’ve been struggling to actually get my project to build using
cabal-install paired with a stage 2 compiler. I realize some of this is
just fundamental—various libraries need to be updated for new versions of
GHC—but I’m willing to go to the effort to manually resolve those conflicts
so long as I know the right way to go about doing it.

One simple thing I did try (and failed at) is building an updated version
of cabal-install itself using my updated GHC. From inside my checkout of
the GHC repo, I tried the following:

$ cd libraries/Cabal/cabal-install
$ cabal --version
cabal --version
cabal-install version 2.4.1.0
compiled using version 2.4.1.0 of the Cabal library
$ cabal --with-compiler= --package-db= \
new-build cabal-install


However, this failed to compile due to out-of-bounds versions of certain
packages, so I tried again with --allow-newer:

$ cabal --with-compiler= --package-db= \
new-build cabal-install --allow-newer=base,ghc-prim,template-haskell


This time, it managed to produce a build plan, but for reasons I don’t
understand, both Cabal-3.1.0.0 and Cabal-2.4.1.0 are in the build plan, and
the latter fails to compile against newer versions of base (due to the
MonadFail changes). I tried to figure out what was causing Cabal-2.4.1.0 to
end up in the build plan, but running new-build with the -v3 option didn’t
seem to give me any more explanation.

Apologies if this is really a request for support with cabal-install, and
it doesn’t belong on this list, but I figured I’d start here, since I
imagine the people on this list are best-equipped to answer questions
involving using GHC HEAD. I’d appreciate any pointers or resources anyone
has!

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Typechecker plugins and BuiltInSynFamily

2019-08-20 Thread Alexis King
Hello all,

As I’ve been dabbling with typechecker plugins, I’ve found myself primarily 
using them to define new “magic” type families, and I don’t think I’m 
alone—Sandy Maguire recently released the magic-tyfams package for precisely 
that purpose. However, I can’t help but notice that GHC already has good 
internal support for such type families via BuiltInSynFamily and CoAxiomRule, 
which are mostly used to implement operations on Nats. As a plugin author, I 
would love to be able to use that functionality directly instead of being 
forced to reimplement it myself, for two big reasons:

AxiomRuleCo provides significantly more safety from -dcore-lint than UnivCo, 
but UnivCo is currently the only way to provide evidence for plugin-solved 
families.

The sfInteractTop and sfInteractInert fields of BuiltInSynFamily make it easy 
to support improvement for custom type families, which I believe would take a 
non-trivial amount of tricky code to get right using the current typechecker 
plugin API.

Given the above, I started wondering if it is possible to define a 
BuiltInSynFamily from inside a plugin or, failing that, to modify GHC to expose 
that functionality to typechecker plugin authors. I am not familiar with GHC’s 
internals, but in my brief reading of the source code, the following two things 
seem like the trickiest obstacles:

BuiltInSynFamily TyCons need to be injected into the initial name cache, since 
otherwise those names will get resolved to their ordinary, non-built-in 
counterparts (e.g. the ordinary open type families defined in GHC.TypeLits).

Since CoAxiomRule values actually have functions inside them, they can’t be 
serialized into interface files. Therefore, it looks like GHC currently 
maintains a hardcoded list of all the known CoAxiomRules, and 
tcIfaceCoAxiomRule just searches for a value in that list using a well-known 
(i.e. not in any way namespaced!) string.

I am not knowledgable enough about GHC to say how hard overcoming either of 
those issues would be. Point 1 seems possible to achieve by arranging for 
plugins to export the built-in names they want to define and propagating those 
to the initial name cache, but I don’t know enough about how plugins are loaded 
to know if that would create any possible circular dependencies (i.e. does the 
name cache need to already exist in order to load a plugin in the first place?).

Point 2 seems harder. My gut instinct is that it could probably be overcome by 
somehow storing a reference to the plugin that defined the CoAxiomRule in the 
interface file (by, for example, storing its package-qualified module name), 
but I’m not immediately certain when that reference should be resolved to the 
actual CoAxiomRule value. It also presumably needs to complain if the necessary 
plugin is not actually loaded when the CoAxiomRule needs to be resolved!

I’m willing to try my hand at experimenting with an implementation of this if 
someone can give me a couple pointers on where to start on the above two issues 
(assuming people don’t think it’s a bad idea to do it at all). Any advice would 
be appreciated!

Thanks,
Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Properly writing typechecker plugins

2019-08-16 Thread Alexis King
Apologies for the long delay before replying to this, I ended up becoming very 
busy for a couple weeks.

> On Aug 2, 2019, at 02:57, Simon Peyton Jones  wrote:
>  
> We should not need to delete solved _givens_ from the inert set.  We can 
> augment givens with extra facts, but deleting them seems wrong.

I agree with this, so no complaints from me about that.

> There should be no Derived constraints in the inert set anyway.  They should 
> all be in the WantedConstraints passed to runTcPluginsWanted.They were 
> extracted from the inert set, along with the Deriveds, by getUnsolvedInerts 
> in solve_simple_wanteds

Upon looking at the code more carefully, you’re quite right—I was dead wrong, 
and typechecker plugins can solve derived constraints just fine. I spent 
several hours debugging the gory details this evening, and after finding the 
problem, I realized the bug had already been reported as issue #16735 
. A little anticlimactic, but 
yet another reason to figure out how to build GHC HEAD on my machine (and maybe 
then I can think about submitting some of the documentation changes, too).

Thanks again,
Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Properly writing typechecker plugins

2019-08-01 Thread Alexis King
My thanks to you, Simon, for your prompt response. After reading your message, 
I realized I perhaps read too far into this sentence on the wiki page:

> Defining type families in plugins is more work than it needs to be, because 
> the current interface forces the plugin to search the unsolved constraints 
> for the type family in question (which might be anywhere within the types), 
> then emit a new given constraint to reduce the type family.

Emphasis mine. Although that sentence seems to imply that adding given 
constraints from a typechecker plugin is both possible and sanctioned, your 
message, my experimentation, and the GHC source code all seem to agree that is 
not true, after all. After realizing that was a dead end, I puzzled for some 
time on how one is intended to solve nested type families after all, only to 
realize that a proper solution is simpler than I had realized.

To make things more concrete, I was confused about how my plugin might solve a 
constraint like

[W] {co_0} :: F (ToLower "Foo") ~# Length "bar"

if it knows about ToLower and Length but nothing about F. The solution is 
actually extremely straightforward to implement, but the idea was not at all 
obvious to me at first. Namely, it’s possible to just recursively walk the 
whole type and solve any known families bottom-up, producing a new constraint:

[W] {co_1} :: F "foo" ~# 3

At the same time, to construct evidence for the first constraint, the recursive 
function can also build a coercion as it goes, producing

co_2 = (F (Univ :: "foo" ~ ToLower "Foo")) ~# (Univ :: 3 ~ Length "bar")

which can be used to cast the evidence for the new constraint to evidence for 
the old one:

co_0 := co_2 ; {co_1}

That all seems to be working well, and it’s much nicer than whatever it was I 
was trying before. There’s one small wrinkle I’ve bumped into, however, which 
was preventing the solver from terminating. Specifically, typechecker plugins 
do not seem to be able to solve derived constraints.

To elaborate, my plugin was getting called with the constraint [D] (Length "f" 
~ 1), which it was happily turning into [D] (1 ~ 1), but the derived constraint 
was never removed from the inert set, and it would produce new [D] (1 ~ 1) 
constraints forever. I went looking in the GHC source, only to discover that 
you, Simon, are apparently deeply suspicious 
<https://gitlab.haskell.org/ghc/ghc/blob/986643cb3506f2eedce96bf2d2c03873f105fad5/compiler/typecheck/TcInteract.hs#L276-277>
 of typechecker plugins that solve derived constraints! I don’t know if there’s 
a reason behind that—maybe I’m going about things the wrong way, and I 
shouldn’t need to solve those derived constraints—but in the meantime, I added 
some kludgey mutable state to keep track of the derived constraints my plugin 
has already attempted to solve so it won’t try to solve them again.

Anyway, that aside, if anyone finds what I’ve written in this email helpful, I 
can certainly flesh it out a bit more and stick it in a wiki page somewhere. 
Though, for what it’s worth, I think it would be even more helpful if some of 
the relevant information made its way into the GHC user’s guide, since I found 
that more discoverable than the wiki page (which took a comparatively large 
amount of digging to locate).

Thanks again,
Alexis

> On Aug 1, 2019, at 03:01, Simon Peyton Jones  wrote:
> 
> Alexis
>  
> Thanks for writing this up so carefully.  I hope that others will join in.   
> And please then put the distilled thought onto the wiki page(s) so they are 
> not lost.
>  
> Some quick thoughts from me:
>  
> Flattening.  I’m pretty sure we pass constraints unflattened because that’s 
> what someone wanted at  the time.  It could easily be changed, but it might 
> complicate the API.  E.g. you might reasonably want to know the mapping from 
> type variable to function application. There is no fundameental obstacle.
>  
> Letting the plugin add given constraints.  This looks a bit like: let the 
> plugin prove lemmas and hand them back to GHC (along with their proof) to 
> exploit. Yes, that seems reasonable too.  Again, something new in the API.
>  
> I don’t understand enough of your type-class instance question to comment 
> meaningfully, but perhaps others will.
>  
> Nothing about the plugin interface is cast in stone.  There are quite a few 
> “customers” but few enough that they’ll probably be happy to adapt to 
> changes.   Go for it, in consultation with them!
>  
> Simon
>  
>  
>  
> From: ghc-devs  On Behalf Of Alexis King
> Sent: 01 August 2019 07:55
> To: ghc-devs@haskell.org
> Subject: Properly writing typechecker plugins
>  
> Hi all,
>  
> I have recently decided to try writing a GHC typechecker plugin so I can get 
> my hands on some extra operations on typ

Properly writing typechecker plugins

2019-08-01 Thread Alexis King
Hi all,

I have recently decided to try writing a GHC typechecker plugin so I can get my 
hands on some extra operations on type-level strings. My plugin works, but only 
sort of—I know some things about it are plain wrong, and I have a sneaking 
suspicion that plenty of other things are not handled properly.

First, some context about what I do and don’t already know: I have a high-level 
understanding of the basic concepts behind GHC’s solver. I understand what 
evidence is and what purpose it serves, I mostly understand the different 
flavors of constraints, and I think I have a decent grasp on some of the 
operational details of how individual passes work. I’ve spent a little time 
reading through comments in the GHC source code, along with pieces of the 
source code itself, but I’m sure my understanding is pretty patchy.

With that out of the way, here are my questions:

First, I’m trying to understand: why are wanted constraints passed to 
typechecker plugins unflattened? This is my single biggest point of confusion. 
It certainly seems like the opposite of what I want. Consider that I have a 
type family

type family ToUpper (s :: Symbol) :: Symbol where {}

that I wish to solve in my plugin. At first, I just naïvely looked through the 
bag of wanted constraints and looked for constraints of the shape

t ~ ToUpper s

but this isn’t good enough, since my plugin regularly receives constraints that 
look more like

t ~ SomeOtherTypeFamily (ToUpper s)

so I have to recursively grovel through every type equality constraint looking 
for an application of a family I care about. Furthermore, once I’ve found one, 
I’m not sure how to actually let GHC know that I’ve solved it—do I really have 
to just generate a new given constraint and let GHC’s solver connect the dots?

I have seen the note on the typechecker plugins wiki page 

 about possibly baking support for type families into the plugin interface, 
which would indeed be nicer than the status quo, but it seems odd to me that 
they aren’t just passed to plugins flattened, which seems like it would spare a 
lot of effort. Isn’t the flattened representation really what typechecker 
plugins would like to see, anyway?

But let’s put families aside for a moment. I’m not just solving type families 
in my plugin, I’m also solving classes. These classes have no methods, but they 
do have functional dependencies. For example, I have a class

class Append (a :: Symbol) (b :: Symbol) (c :: Symbol) | a b -> c, a c -> 
b, b c -> a

which is like GHC.TypeLits.AppendSymbol, but the fundeps make GHC a bit happier 
when running it “backwards” (since GHC doesn’t really know about AppendSymbol’s 
magical injectivity, so it sometimes complains).

In any case, I was hoping that GHC’s solver would handle the improvement 
afforded by the fundeps for me once I provided evidence for Append constraints, 
but that doesn’t seem to be the case. Currently, I am therefore manually 
generating derived constraints based on the functional dependency information, 
plumbing FunDepOrigin2 through and all. Is there some way to cooperate better 
with GHC’s solver so I don’t have to duplicate all that logic in my plugin?

I guess one thing I didn’t try is returning given constraints from my solver 
instead of just solving them and providing evidence. That is, if my plugin 
received a

[W] d1 :: Append "foo" "bar" c

constraint, then instead of solving the constraint directly, I could leave it 
alone and instead return a new constraint

[G] d2 :: Append "foo" "bar" "baz"

and presumably GHC’s solver would use that constraint to improve and solve d1. 
But similar to my confusion about type families above, I’m uncertain if that’s 
the intended method or not, since it seems like it’s sort of circumventing the 
plugin API.

Finally, on a related note, building evidence for these solver-generated 
typeclass instances is a bit of a pain. They have no methods, but they do 
sometimes have superclasses. Currently, I’ve been generating CoreExprs as 
carefully as I’m able to after reading through the desugaring code: I call 
dataConWrapId on the result of classDataCon, then use mkTyConApp and mkCoreApps 
on that directly. It seems to work okay now, but it didn’t always: -dcore-lint 
thankfully caught my mistakes, but I’ve been wondering if there’s a safer way 
to build the dictionary that I’ve been missing.

That’s it for now—I’ve just been muddling through until things work. Once I get 
something that feels closer to right, maybe I’ll put the code somewhere and ask 
for more low-level comments if anyone would like to take the time to offer 
them, but for now, I’m still working on the high-level ideas. The wiki pages 
I’ve found have been very helpful; my appreciation to all who have contributed 
to them!

Many thanks,
Alexis

___