Re: TTG: Handling Source Locations

2019-02-12 Thread Adam Gundry
While we're on the topic, is there any plan to get rid of all those
panics? AFAICS they are entirely unnecessary: we should just use an
empty datatype for unused constructor extension points, then we can
eliminate it to get whatever we like. See #15247.

Adam


On 12/02/2019 15:40, Shayan Najd wrote:
>> Someone could easily call rnPatAndThen when they should call rnLPatAndThen. 
>> This would cause a panic.
> 
> With Solution A, there shouldn't be two functions `rnLPatAndThen` and
> `rnPatAndThen` anyways. There should be only `rnPatAndThen` with an
> extra case for the wrapper node.
> 
>> There's also the problem that the pattern-match checker can't usefully look 
>> through view patterns.
> 
> Yes, I have reported it while back. I don't know of the progress in fixing 
> this.
> 
> On Tue, 12 Feb 2019 at 16:24, Richard Eisenberg  wrote:
>>
>> That's true, but how would it play out in practice? For example, take a look 
>> at RnPat. There is a rnLPatAndThen which uses wrapSrcSpanCps to extract the 
>> location and then call rnPatAndThen. rnPatAndThen, in turn, just panics if 
>> it sees the extension point, because that's an unexpected constructor. 
>> Someone could easily call rnPatAndThen when they should call rnLPatAndThen. 
>> This would cause a panic.
>>
>> There's also the problem that the pattern-match checker can't usefully look 
>> through view patterns. If there is a nested pattern-match (that is, we see 
>> dL->L _ (SomeOtherConstructor), then there is no way to guarantee a complete 
>> pattern-match short of a catch-all. So it doesn't seem to me that the 
>> pattern-match checker is really helping us achieve what we want here.
>>
>> Richard
>>
>>> On Feb 12, 2019, at 9:30 AM, Shayan Najd  wrote:
>>>
 My problem, though, is that this is just a convention -- no one checks it. 
 It would be easy to forget.
>>>
>>> I am not sure if I understand: shouldn't the totality checker warn if
>>> there is no pattern for the wrapper constructor (hence enforce the
>>> convention)?
>>>
>>>
>>> On Tue, 12 Feb 2019 at 15:19, Richard Eisenberg  
>>> wrote:



> On Feb 12, 2019, at 5:19 AM, Shayan Najd  wrote:
>
> About the new code, the convention is straightforward: anytime you
> destruct an AST node, assume a wrapper node inside (add an extra
> case), or use the smart constructors/pattern synonyms.

 Aha! This, I did not know. So, you're saying that all the consumers of the 
 GHC AST need to remember to use dL every time they pattern-match. With the 
 new design, using dL when it's unnecessary doesn't hurt, but forgetting it 
 is problematic. So: just use it every time. My problem, though, is that 
 this is just a convention -- no one checks it. It would be easy to forget.

> On Feb 12, 2019, at 6:00 AM, Simon Peyton Jones via ghc-devs 
>  wrote:
>
> One way to think of it is this: we can now put SrcSpans where they make 
> sense, rather than everywhere.

 This has some logic to it, but I'm not quite sold. Another way of saying 
 this is that the new design favors flexibility for the producer, at the 
 cost of requiring consumers to be aware of and consistently apply the 
 convention Shayan describes above. The problem is, though, that if the 
 producer is stingy in adding source locations, the consumer won't know 
 which locations are expected to be informative. Is the consumer expected 
 to collect locations from a variety of places and try to combine them 
 somehow? I doubt it. So this means that the flexibility for the producer 
 isn't really there -- the type system will accept arbitrary choices of 
 where to put locations, but consumers won't get the locations where they 
 expect them.

>  We can still say (Located t) in places where we want to guarantee a 
> SrcSpan.

 This seems to go against the TTG philosophy. We can do this in, say, the 
 return type of a function, but we can't in the AST proper, because that's 
 shared among a number of clients, some of whom don't want the source 
 locations.

>
> Yes, this lets us add more than one; that's redundant but not harmful.

 I disagree here. If we add locations to a node twice, then we'll have to 
 use dL twice to find the underlying constructor. This is another case 
 there the type system offers the producer flexibility but hamstrings the 
 consumer.


> On Feb 12, 2019, at 7:32 AM, Vladislav Zavialov  
> wrote:
>
> I claim an SrcSpan makes sense everywhere, so this is not a useful
> distinction. Think about it as code provenance, an AST node always
> comes from somewhere

 I agree with this observation. Perhaps SrcSpan is a bad name, and 
 SrcProvenance is better. We could even imagine using the new HasCallStack 
 feature to track where generated code comes from (perhaps only in DEBUG 
 compilers). Do we 

Re: Marge has been stabilised

2019-02-12 Thread Matthew Pickering
Yes, on a quiet day she will idly wait for the opportunity to merge
two MRs rather than try and merge just one.

I don't think this is a bad default as her CI cycle has proved to be
quite long so only 2-3 batches happen per day at most.

Matt

On Tue, Feb 12, 2019 at 6:36 PM Alan & Kim Zimmerman
 wrote:
>
> > Every 30 minutes whilst idle she will try to find new MRs to batch
> >  together. If she finds at least two MRs to batch together then she
> > creates a batch as a new merge request. A batch is the series of MRs
> > rebased on top of each other from oldest to newest.
>
> Does this mean that on a quiet day (if there are ever any for GHC dev), a 
> lone merge
> will languish until someone else puts forward a valid merge too?
>
> Or will marge just merge a non-batched lone merge request?
>
> Alan
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Marge has been stabilised

2019-02-12 Thread Alan & Kim Zimmerman
> Every 30 minutes whilst idle she will try to find new MRs to batch
>  together. If she finds at least two MRs to batch together then she
> creates a batch as a new merge request. A batch is the series of MRs
> rebased on top of each other from oldest to newest.

Does this mean that on a quiet day (if there are ever any for GHC dev), a
lone merge
will languish until someone else puts forward a valid merge too?

Or will marge just merge a non-batched lone merge request?

Alan
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Marge has been stabilised

2019-02-12 Thread Matthew Pickering
Yes there are two others I use frequently.

/approve - Approves a Merge request
/assign - Assign a user

and I imagine I will use /label and /relabel frequently when we have labels.

It's also useful to know that the ! autocomplete for merge requests
can be filtered by name of MR. The user autocomplete @ can also be
filtered by a user's real name.

There is a full list here:
https://docs.gitlab.com/ee/user/project/quick_actions.html

Cheers,

Matt

On Tue, Feb 12, 2019 at 4:23 PM Richard Eisenberg  wrote:
>
>
>
> On Feb 12, 2019, at 10:51 AM, Matthew Pickering  
> wrote:
>
> This can be quickly toggled on/off by typing the /wip quick command in
> a comment.
>
>
> This is an interesting aside. I understand this to mean: If I make a comment 
> (the same place that I would write a comment for humans) that consists solely 
> of "/wip", then instead of posting anything to humans, the title of my MR 
> changes, either adding "WIP: " or deleting that from the beginning.
>
> Are there other such pieces of magic? Is there a place they are listed?
>
> Thanks!
> Richard
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Marge has been stabilised

2019-02-12 Thread Richard Eisenberg


> On Feb 12, 2019, at 10:51 AM, Matthew Pickering  
> wrote:
> 
> This can be quickly toggled on/off by typing the /wip quick command in
> a comment.

This is an interesting aside. I understand this to mean: If I make a comment 
(the same place that I would write a comment for humans) that consists solely 
of "/wip", then instead of posting anything to humans, the title of my MR 
changes, either adding "WIP: " or deleting that from the beginning.

Are there other such pieces of magic? Is there a place they are listed?

Thanks!
Richard___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Marge has been stabilised

2019-02-12 Thread Matthew Pickering
WIP merge requests have "WIP:" at the front of the title.

I have been marking MRs as WIP is they are not ready to merge to try
to keep track of things that need to be added to the merge queue or
not.

This can be quickly toggled on/off by typing the /wip quick command in
a comment.

Cheers,

Matt

On Tue, Feb 12, 2019 at 3:44 PM Richard Eisenberg  wrote:
>
> Thanks for these instructions!
>
> On Feb 12, 2019, at 3:37 AM, Matthew Pickering  
> wrote:
>
> 3. Make sure it is not marked as WIP
>
>
> What does this mean, precisely?
>
> Thanks,
> Richard
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: TTG: Handling Source Locations

2019-02-12 Thread Shayan Najd
> Someone could easily call rnPatAndThen when they should call rnLPatAndThen. 
> This would cause a panic.

With Solution A, there shouldn't be two functions `rnLPatAndThen` and
`rnPatAndThen` anyways. There should be only `rnPatAndThen` with an
extra case for the wrapper node.

> There's also the problem that the pattern-match checker can't usefully look 
> through view patterns.

Yes, I have reported it while back. I don't know of the progress in fixing this.

On Tue, 12 Feb 2019 at 16:24, Richard Eisenberg  wrote:
>
> That's true, but how would it play out in practice? For example, take a look 
> at RnPat. There is a rnLPatAndThen which uses wrapSrcSpanCps to extract the 
> location and then call rnPatAndThen. rnPatAndThen, in turn, just panics if it 
> sees the extension point, because that's an unexpected constructor. Someone 
> could easily call rnPatAndThen when they should call rnLPatAndThen. This 
> would cause a panic.
>
> There's also the problem that the pattern-match checker can't usefully look 
> through view patterns. If there is a nested pattern-match (that is, we see 
> dL->L _ (SomeOtherConstructor), then there is no way to guarantee a complete 
> pattern-match short of a catch-all. So it doesn't seem to me that the 
> pattern-match checker is really helping us achieve what we want here.
>
> Richard
>
> > On Feb 12, 2019, at 9:30 AM, Shayan Najd  wrote:
> >
> >> My problem, though, is that this is just a convention -- no one checks it. 
> >> It would be easy to forget.
> >
> > I am not sure if I understand: shouldn't the totality checker warn if
> > there is no pattern for the wrapper constructor (hence enforce the
> > convention)?
> >
> >
> > On Tue, 12 Feb 2019 at 15:19, Richard Eisenberg  
> > wrote:
> >>
> >>
> >>
> >>> On Feb 12, 2019, at 5:19 AM, Shayan Najd  wrote:
> >>>
> >>> About the new code, the convention is straightforward: anytime you
> >>> destruct an AST node, assume a wrapper node inside (add an extra
> >>> case), or use the smart constructors/pattern synonyms.
> >>
> >> Aha! This, I did not know. So, you're saying that all the consumers of the 
> >> GHC AST need to remember to use dL every time they pattern-match. With the 
> >> new design, using dL when it's unnecessary doesn't hurt, but forgetting it 
> >> is problematic. So: just use it every time. My problem, though, is that 
> >> this is just a convention -- no one checks it. It would be easy to forget.
> >>
> >>> On Feb 12, 2019, at 6:00 AM, Simon Peyton Jones via ghc-devs 
> >>>  wrote:
> >>>
> >>> One way to think of it is this: we can now put SrcSpans where they make 
> >>> sense, rather than everywhere.
> >>
> >> This has some logic to it, but I'm not quite sold. Another way of saying 
> >> this is that the new design favors flexibility for the producer, at the 
> >> cost of requiring consumers to be aware of and consistently apply the 
> >> convention Shayan describes above. The problem is, though, that if the 
> >> producer is stingy in adding source locations, the consumer won't know 
> >> which locations are expected to be informative. Is the consumer expected 
> >> to collect locations from a variety of places and try to combine them 
> >> somehow? I doubt it. So this means that the flexibility for the producer 
> >> isn't really there -- the type system will accept arbitrary choices of 
> >> where to put locations, but consumers won't get the locations where they 
> >> expect them.
> >>
> >>>  We can still say (Located t) in places where we want to guarantee a 
> >>> SrcSpan.
> >>
> >> This seems to go against the TTG philosophy. We can do this in, say, the 
> >> return type of a function, but we can't in the AST proper, because that's 
> >> shared among a number of clients, some of whom don't want the source 
> >> locations.
> >>
> >>>
> >>> Yes, this lets us add more than one; that's redundant but not harmful.
> >>
> >> I disagree here. If we add locations to a node twice, then we'll have to 
> >> use dL twice to find the underlying constructor. This is another case 
> >> there the type system offers the producer flexibility but hamstrings the 
> >> consumer.
> >>
> >>
> >>> On Feb 12, 2019, at 7:32 AM, Vladislav Zavialov  
> >>> wrote:
> >>>
> >>> I claim an SrcSpan makes sense everywhere, so this is not a useful
> >>> distinction. Think about it as code provenance, an AST node always
> >>> comes from somewhere
> >>
> >> I agree with this observation. Perhaps SrcSpan is a bad name, and 
> >> SrcProvenance is better. We could even imagine using the new HasCallStack 
> >> feature to track where generated code comes from (perhaps only in DEBUG 
> >> compilers). Do we need to do this today? I'm not sure there's a crying 
> >> need. But philosophically, we are able to attach a provenance to every 
> >> slice of AST, so there's really no reason for uninformative locations.
> >>
> >>> My concrete proposal: let's just put SrcSpan in the extension fields
> >>> of each node
> >>
> >> I support this 

Re: TTG: Handling Source Locations

2019-02-12 Thread Richard Eisenberg
That's true, but how would it play out in practice? For example, take a look at 
RnPat. There is a rnLPatAndThen which uses wrapSrcSpanCps to extract the 
location and then call rnPatAndThen. rnPatAndThen, in turn, just panics if it 
sees the extension point, because that's an unexpected constructor. Someone 
could easily call rnPatAndThen when they should call rnLPatAndThen. This would 
cause a panic.

There's also the problem that the pattern-match checker can't usefully look 
through view patterns. If there is a nested pattern-match (that is, we see 
dL->L _ (SomeOtherConstructor), then there is no way to guarantee a complete 
pattern-match short of a catch-all. So it doesn't seem to me that the 
pattern-match checker is really helping us achieve what we want here.

Richard

> On Feb 12, 2019, at 9:30 AM, Shayan Najd  wrote:
> 
>> My problem, though, is that this is just a convention -- no one checks it. 
>> It would be easy to forget.
> 
> I am not sure if I understand: shouldn't the totality checker warn if
> there is no pattern for the wrapper constructor (hence enforce the
> convention)?
> 
> 
> On Tue, 12 Feb 2019 at 15:19, Richard Eisenberg  wrote:
>> 
>> 
>> 
>>> On Feb 12, 2019, at 5:19 AM, Shayan Najd  wrote:
>>> 
>>> About the new code, the convention is straightforward: anytime you
>>> destruct an AST node, assume a wrapper node inside (add an extra
>>> case), or use the smart constructors/pattern synonyms.
>> 
>> Aha! This, I did not know. So, you're saying that all the consumers of the 
>> GHC AST need to remember to use dL every time they pattern-match. With the 
>> new design, using dL when it's unnecessary doesn't hurt, but forgetting it 
>> is problematic. So: just use it every time. My problem, though, is that this 
>> is just a convention -- no one checks it. It would be easy to forget.
>> 
>>> On Feb 12, 2019, at 6:00 AM, Simon Peyton Jones via ghc-devs 
>>>  wrote:
>>> 
>>> One way to think of it is this: we can now put SrcSpans where they make 
>>> sense, rather than everywhere.
>> 
>> This has some logic to it, but I'm not quite sold. Another way of saying 
>> this is that the new design favors flexibility for the producer, at the cost 
>> of requiring consumers to be aware of and consistently apply the convention 
>> Shayan describes above. The problem is, though, that if the producer is 
>> stingy in adding source locations, the consumer won't know which locations 
>> are expected to be informative. Is the consumer expected to collect 
>> locations from a variety of places and try to combine them somehow? I doubt 
>> it. So this means that the flexibility for the producer isn't really there 
>> -- the type system will accept arbitrary choices of where to put locations, 
>> but consumers won't get the locations where they expect them.
>> 
>>>  We can still say (Located t) in places where we want to guarantee a 
>>> SrcSpan.
>> 
>> This seems to go against the TTG philosophy. We can do this in, say, the 
>> return type of a function, but we can't in the AST proper, because that's 
>> shared among a number of clients, some of whom don't want the source 
>> locations.
>> 
>>> 
>>> Yes, this lets us add more than one; that's redundant but not harmful.
>> 
>> I disagree here. If we add locations to a node twice, then we'll have to use 
>> dL twice to find the underlying constructor. This is another case there the 
>> type system offers the producer flexibility but hamstrings the consumer.
>> 
>> 
>>> On Feb 12, 2019, at 7:32 AM, Vladislav Zavialov  
>>> wrote:
>>> 
>>> I claim an SrcSpan makes sense everywhere, so this is not a useful
>>> distinction. Think about it as code provenance, an AST node always
>>> comes from somewhere
>> 
>> I agree with this observation. Perhaps SrcSpan is a bad name, and 
>> SrcProvenance is better. We could even imagine using the new HasCallStack 
>> feature to track where generated code comes from (perhaps only in DEBUG 
>> compilers). Do we need to do this today? I'm not sure there's a crying need. 
>> But philosophically, we are able to attach a provenance to every slice of 
>> AST, so there's really no reason for uninformative locations.
>> 
>>> My concrete proposal: let's just put SrcSpan in the extension fields
>>> of each node
>> 
>> I support this counter-proposal. Perhaps if it required writing loads of 
>> extra type instances, I wouldn't be as much in favor. But we already have to 
>> write those instances -- they just change from NoExt to SrcSpan. This seems 
>> to solve all the problems nicely, at relatively low cost. And, I'm sure it's 
>> more efficient at runtime than either the previous ping-pong style or the 
>> current scenario, as we can pattern-match on constructors directly, 
>> requiring one less pointer-chase or function call.
>> 
>> One downside of this proposal is that it means that more care will have to 
>> be taken when setting the extension field of AST nodes after a pass, making 
>> sure to preserve the location. (This 

Re: TTG: Handling Source Locations

2019-02-12 Thread Shayan Najd
> [Richard:] I disagree here. If we add locations to a node twice, then we'll 
> have to use dL twice to find the underlying constructor. This is another case 
> there the type system offers the producer flexibility but hamstrings the 
> consumer.

Depends on the semantics of `dL`: currently  (for `Pat`) it returns
the top-level `SrcSpan` and then the underlying node with all the
inner wrappers stripped away. So one use of `dL` is enough in this
semantic.
(see https://github.com/ghc/ghc/blob/master/compiler/hsSyn/HsPat.hs#L341)

> [Vlad:] As to the better solution, I think we should just go with Solution B 
> from the Wiki page.
> [Richard:] I support this counter-proposal. Perhaps if it required writing 
> loads of extra type instances, I wouldn't be as much in favour.

It may help to identify at least three sorts of functions commonly
used *currently* in GHC when interacting with AST nodes (please add,
if I am missing some):
(a) those that ignore source locations;
(b) those that generically handle source locations regardless of the
constructor of the underlying node; and
(c) those that handle source locations case-by-case (often by nested
pattern matching).

The key issue with Solution B, as listed in the wiki, is that it ruins
the separation of two concerns in functions working on AST nodes:
handling source locations, and the actual logic of the function.
With the ping-pong style, handling of source locations is sometimes
refactored in a separate function, and with Solution A refactored in a
separate case/function clause.
With Solution B, however, every time we construct a node we should
have a source location ready to put into it.
That is, with Solution B, (a) and (b) are not cleanly implemented.

(I can explain more if not clear.)

/Shayan

On Tue, 12 Feb 2019 at 15:30, Shayan Najd  wrote:
>
> > My problem, though, is that this is just a convention -- no one checks it. 
> > It would be easy to forget.
>
> I am not sure if I understand: shouldn't the totality checker warn if
> there is no pattern for the wrapper constructor (hence enforce the
> convention)?
>
>
> On Tue, 12 Feb 2019 at 15:19, Richard Eisenberg  wrote:
> >
> >
> >
> > > On Feb 12, 2019, at 5:19 AM, Shayan Najd  wrote:
> > >
> > > About the new code, the convention is straightforward: anytime you
> > > destruct an AST node, assume a wrapper node inside (add an extra
> > > case), or use the smart constructors/pattern synonyms.
> >
> > Aha! This, I did not know. So, you're saying that all the consumers of the 
> > GHC AST need to remember to use dL every time they pattern-match. With the 
> > new design, using dL when it's unnecessary doesn't hurt, but forgetting it 
> > is problematic. So: just use it every time. My problem, though, is that 
> > this is just a convention -- no one checks it. It would be easy to forget.
> >
> > > On Feb 12, 2019, at 6:00 AM, Simon Peyton Jones via ghc-devs 
> > >  wrote:
> > >
> > > One way to think of it is this: we can now put SrcSpans where they make 
> > > sense, rather than everywhere.
> >
> > This has some logic to it, but I'm not quite sold. Another way of saying 
> > this is that the new design favors flexibility for the producer, at the 
> > cost of requiring consumers to be aware of and consistently apply the 
> > convention Shayan describes above. The problem is, though, that if the 
> > producer is stingy in adding source locations, the consumer won't know 
> > which locations are expected to be informative. Is the consumer expected to 
> > collect locations from a variety of places and try to combine them somehow? 
> > I doubt it. So this means that the flexibility for the producer isn't 
> > really there -- the type system will accept arbitrary choices of where to 
> > put locations, but consumers won't get the locations where they expect them.
> >
> > >   We can still say (Located t) in places where we want to guarantee a 
> > > SrcSpan.
> >
> > This seems to go against the TTG philosophy. We can do this in, say, the 
> > return type of a function, but we can't in the AST proper, because that's 
> > shared among a number of clients, some of whom don't want the source 
> > locations.
> >
> > >
> > > Yes, this lets us add more than one; that's redundant but not harmful.
> >
> > I disagree here. If we add locations to a node twice, then we'll have to 
> > use dL twice to find the underlying constructor. This is another case there 
> > the type system offers the producer flexibility but hamstrings the consumer.
> >
> >
> > > On Feb 12, 2019, at 7:32 AM, Vladislav Zavialov  
> > > wrote:
> > >
> > > I claim an SrcSpan makes sense everywhere, so this is not a useful
> > > distinction. Think about it as code provenance, an AST node always
> > > comes from somewhere
> >
> > I agree with this observation. Perhaps SrcSpan is a bad name, and 
> > SrcProvenance is better. We could even imagine using the new HasCallStack 
> > feature to track where generated code comes from (perhaps only in DEBUG 
> > 

Re: TTG: Handling Source Locations

2019-02-12 Thread Shayan Najd
> My problem, though, is that this is just a convention -- no one checks it. It 
> would be easy to forget.

I am not sure if I understand: shouldn't the totality checker warn if
there is no pattern for the wrapper constructor (hence enforce the
convention)?


On Tue, 12 Feb 2019 at 15:19, Richard Eisenberg  wrote:
>
>
>
> > On Feb 12, 2019, at 5:19 AM, Shayan Najd  wrote:
> >
> > About the new code, the convention is straightforward: anytime you
> > destruct an AST node, assume a wrapper node inside (add an extra
> > case), or use the smart constructors/pattern synonyms.
>
> Aha! This, I did not know. So, you're saying that all the consumers of the 
> GHC AST need to remember to use dL every time they pattern-match. With the 
> new design, using dL when it's unnecessary doesn't hurt, but forgetting it is 
> problematic. So: just use it every time. My problem, though, is that this is 
> just a convention -- no one checks it. It would be easy to forget.
>
> > On Feb 12, 2019, at 6:00 AM, Simon Peyton Jones via ghc-devs 
> >  wrote:
> >
> > One way to think of it is this: we can now put SrcSpans where they make 
> > sense, rather than everywhere.
>
> This has some logic to it, but I'm not quite sold. Another way of saying this 
> is that the new design favors flexibility for the producer, at the cost of 
> requiring consumers to be aware of and consistently apply the convention 
> Shayan describes above. The problem is, though, that if the producer is 
> stingy in adding source locations, the consumer won't know which locations 
> are expected to be informative. Is the consumer expected to collect locations 
> from a variety of places and try to combine them somehow? I doubt it. So this 
> means that the flexibility for the producer isn't really there -- the type 
> system will accept arbitrary choices of where to put locations, but consumers 
> won't get the locations where they expect them.
>
> >   We can still say (Located t) in places where we want to guarantee a 
> > SrcSpan.
>
> This seems to go against the TTG philosophy. We can do this in, say, the 
> return type of a function, but we can't in the AST proper, because that's 
> shared among a number of clients, some of whom don't want the source 
> locations.
>
> >
> > Yes, this lets us add more than one; that's redundant but not harmful.
>
> I disagree here. If we add locations to a node twice, then we'll have to use 
> dL twice to find the underlying constructor. This is another case there the 
> type system offers the producer flexibility but hamstrings the consumer.
>
>
> > On Feb 12, 2019, at 7:32 AM, Vladislav Zavialov  
> > wrote:
> >
> > I claim an SrcSpan makes sense everywhere, so this is not a useful
> > distinction. Think about it as code provenance, an AST node always
> > comes from somewhere
>
> I agree with this observation. Perhaps SrcSpan is a bad name, and 
> SrcProvenance is better. We could even imagine using the new HasCallStack 
> feature to track where generated code comes from (perhaps only in DEBUG 
> compilers). Do we need to do this today? I'm not sure there's a crying need. 
> But philosophically, we are able to attach a provenance to every slice of 
> AST, so there's really no reason for uninformative locations.
>
> > My concrete proposal: let's just put SrcSpan in the extension fields
> > of each node
>
> I support this counter-proposal. Perhaps if it required writing loads of 
> extra type instances, I wouldn't be as much in favor. But we already have to 
> write those instances -- they just change from NoExt to SrcSpan. This seems 
> to solve all the problems nicely, at relatively low cost. And, I'm sure it's 
> more efficient at runtime than either the previous ping-pong style or the 
> current scenario, as we can pattern-match on constructors directly, requiring 
> one less pointer-chase or function call.
>
> One downside of this proposal is that it means that more care will have to be 
> taken when setting the extension field of AST nodes after a pass, making sure 
> to preserve the location. (This isn't really all that different from 
> location-shuffling today.) A quick mock-up shows that record-updates can make 
> this easier:
>
> > data Phase = Parsed | Renamed
> >
> > data Exp p = Node (XNode p) Int
> >
> > type family XNode (p :: Phase)
> > type instance XNode p = NodeExt p
> >
> > data NodeExt p where
> >   NodeExt :: { flag :: Bool, fvs :: RenamedOnly p String } -> NodeExt p
> >
> > type family RenamedOnly p t where
> >   RenamedOnly Parsed _ = ()
> >   RenamedOnly Renamed t = t
> >
> > example :: Exp Parsed
> > example = Node (NodeExt { flag = True, fvs = () }) 5
> >
> > rename :: Exp Parsed -> Exp Renamed
> > rename (Node ext n) = Node (ext { fvs = "xyz" }) n
>
> Note that the extension point is a record type that has a field available 
> only after renaming. We can then do a type-changing record update when 
> producing the renamed node, preserving the flag in the code above. What's sad 
> is that, if 

Re: TTG: Handling Source Locations

2019-02-12 Thread Richard Eisenberg


> On Feb 12, 2019, at 5:19 AM, Shayan Najd  wrote:
> 
> About the new code, the convention is straightforward: anytime you
> destruct an AST node, assume a wrapper node inside (add an extra
> case), or use the smart constructors/pattern synonyms.

Aha! This, I did not know. So, you're saying that all the consumers of the GHC 
AST need to remember to use dL every time they pattern-match. With the new 
design, using dL when it's unnecessary doesn't hurt, but forgetting it is 
problematic. So: just use it every time. My problem, though, is that this is 
just a convention -- no one checks it. It would be easy to forget.

> On Feb 12, 2019, at 6:00 AM, Simon Peyton Jones via ghc-devs 
>  wrote:
> 
> One way to think of it is this: we can now put SrcSpans where they make 
> sense, rather than everywhere.

This has some logic to it, but I'm not quite sold. Another way of saying this 
is that the new design favors flexibility for the producer, at the cost of 
requiring consumers to be aware of and consistently apply the convention Shayan 
describes above. The problem is, though, that if the producer is stingy in 
adding source locations, the consumer won't know which locations are expected 
to be informative. Is the consumer expected to collect locations from a variety 
of places and try to combine them somehow? I doubt it. So this means that the 
flexibility for the producer isn't really there -- the type system will accept 
arbitrary choices of where to put locations, but consumers won't get the 
locations where they expect them.

>   We can still say (Located t) in places where we want to guarantee a SrcSpan.

This seems to go against the TTG philosophy. We can do this in, say, the return 
type of a function, but we can't in the AST proper, because that's shared among 
a number of clients, some of whom don't want the source locations.

> 
> Yes, this lets us add more than one; that's redundant but not harmful.

I disagree here. If we add locations to a node twice, then we'll have to use dL 
twice to find the underlying constructor. This is another case there the type 
system offers the producer flexibility but hamstrings the consumer.


> On Feb 12, 2019, at 7:32 AM, Vladislav Zavialov  wrote:
> 
> I claim an SrcSpan makes sense everywhere, so this is not a useful
> distinction. Think about it as code provenance, an AST node always
> comes from somewhere

I agree with this observation. Perhaps SrcSpan is a bad name, and SrcProvenance 
is better. We could even imagine using the new HasCallStack feature to track 
where generated code comes from (perhaps only in DEBUG compilers). Do we need 
to do this today? I'm not sure there's a crying need. But philosophically, we 
are able to attach a provenance to every slice of AST, so there's really no 
reason for uninformative locations.

> My concrete proposal: let's just put SrcSpan in the extension fields
> of each node

I support this counter-proposal. Perhaps if it required writing loads of extra 
type instances, I wouldn't be as much in favor. But we already have to write 
those instances -- they just change from NoExt to SrcSpan. This seems to solve 
all the problems nicely, at relatively low cost. And, I'm sure it's more 
efficient at runtime than either the previous ping-pong style or the current 
scenario, as we can pattern-match on constructors directly, requiring one less 
pointer-chase or function call.

One downside of this proposal is that it means that more care will have to be 
taken when setting the extension field of AST nodes after a pass, making sure 
to preserve the location. (This isn't really all that different from 
location-shuffling today.) A quick mock-up shows that record-updates can make 
this easier:

> data Phase = Parsed | Renamed
> 
> data Exp p = Node (XNode p) Int
> 
> type family XNode (p :: Phase)
> type instance XNode p = NodeExt p
> 
> data NodeExt p where
>   NodeExt :: { flag :: Bool, fvs :: RenamedOnly p String } -> NodeExt p
> 
> type family RenamedOnly p t where
>   RenamedOnly Parsed _ = ()
>   RenamedOnly Renamed t = t
> 
> example :: Exp Parsed
> example = Node (NodeExt { flag = True, fvs = () }) 5
> 
> rename :: Exp Parsed -> Exp Renamed
> rename (Node ext n) = Node (ext { fvs = "xyz" }) n

Note that the extension point is a record type that has a field available only 
after renaming. We can then do a type-changing record update when producing the 
renamed node, preserving the flag in the code above. What's sad is that, if 
there were no renamer-only field, we couldn't do a type-changing empty record 
update as the default case. (Haskell doesn't have empty record updates. Perhaps 
it should. They would be useful in doing a type-change on a datatype with a 
phantom index. A clever compiler could even somehow ensure that such a record 
update is completely compiled away.) In any case, this little example is 
essentially orthogonal to my points above, and the choice of whether to use 
records or other structures are 

Re: TTG: Handling Source Locations

2019-02-12 Thread Vladislav Zavialov
> One way to think of it is this: we can now put SrcSpans where they make 
> sense, rather than everywhere.

I claim an SrcSpan makes sense everywhere, so this is not a useful
distinction. Think about it as code provenance, an AST node always
comes from somewhere: a user-written .hs file, a GHCi command, or
compiler-generated code (via TH or deriving). We should never omit
this information from a node.

And when we are writing code that consumes an AST, it always makes
sense to ask what the provenance of a node is, for example to use it
in an error message.

> this lets us add more than one; that's redundant but not harmful

It goes against the philosophy of making illegal states
irrepresentable. Now all code must be careful not to end up in an
illegal state of nested SrcSpan, without any help from the
typechecker.

The code that pattern matches on an AST, at the same time, must be
prepared to handle this case anyway (or else we risk to crash), which
it currently does with stripSrcSpanPat in the implementation of dL.

And having to remember to apply dL when matching on the AST is more
trivia to learn and remember. Not even a warning if one forgets to do
that, no appropriate place to explain this to new contributors
(reading another Note just to start doing anything at all with the
AST? unnecessary friction), and only a test failure at best in case of
a mistake.

My concrete proposal: let's just put SrcSpan in the extension fields
of each node. In other words, take these lines

type instance XVarPat  (GhcPass _) = NoExt
type instance XLazyPat (GhcPass _) = NoExt
type instance XAsPat   (GhcPass _) = NoExt
type instance XParPat  (GhcPass _) = NoExt
type instance XBangPat (GhcPass _) = NoExt
...

and replace them with

type instance XVarPat  (GhcPass _) = SrcSpan
type instance XLazyPat (GhcPass _) = SrcSpan
type instance XAsPat   (GhcPass _) = SrcSpan
type instance XParPat  (GhcPass _) = SrcSpan
type instance XBangPat (GhcPass _) = SrcSpan
...

And don't bother with the HasSrcSpan class, don't define
composeSrcSpan and decomposeSrcSpan. Very straightforward and
beneficial for both producers and consumers of an AST.

All the best,
Vladislav
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


RE: TTG: Handling Source Locations

2019-02-12 Thread Simon Peyton Jones via ghc-devs
One way to think of it is this: we can now put SrcSpans where they make sense, 
rather than everywhere.   We can still say (Located t) in places where we want 
to guarantee a SrcSpan.

Yes, this lets us add more than one; that's redundant but not harmful.

Simon

|  -Original Message-
|  From: ghc-devs  On Behalf Of Matthew
|  Pickering
|  Sent: 12 February 2019 09:08
|  To: Vladislav Zavialov 
|  Cc: GHC 
|  Subject: Re: TTG: Handling Source Locations
|  
|  I just did this now, it was quite disconcerting that my code continued to
|  compile after applying `cL loc` to the return value of one of my
|  functions.
|  
|  On Sat, Feb 9, 2019 at 5:40 PM Vladislav Zavialov 
|  wrote:
|  >
|  > I wholly share this concern, which is why I commented on the Phab diff:
|  >
|  > > Does this rely on the caller to call dL on the pattern? Very fragile,
|  let's not do that.
|  >
|  > In addition, I'm worried about illegal states where we end up with
|  > multiple nested levels of `NewPat`, and calling `dL` once is not
|  > sufficient.
|  >
|  > As to the better solution, I think we should just go with Solution B
|  > from the Wiki page. Yes, it's somewhat more boilerplate, but it
|  > guarantees to have locations in the right places for all nodes. The
|  > main argument against it was that we'd have to define `type instance
|  > XThing (GhcPass p) = SrcSpan` for many a `Thing`, but I don't see it
|  > as a downside at all. We should do so anyway, to get rid of parsing
|  > API annotations and put them in the AST proper.
|  >
|  > All the best,
|  > Vladislav
|  >
|  > On Sat, Feb 9, 2019 at 7:19 PM Richard Eisenberg 
|  wrote:
|  > >
|  > > Hi devs,
|  > >
|  > > I just came across [TTG: Handling Source Locations], as I was poking
|  around in RdrHsSyn and found wondrous things like (dL->L wiz waz) all
|  over the place.
|  > >
|  > > General outline:
|  > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgh
|  > > c.haskell.org%2Ftrac%2Fghc%2Fwiki%2FImplementingTreesThatGrow%2FHand
|  > > lingSourceLocationsdata=02%7C01%7Csimonpj%40microsoft.com%7C915
|  > > 2cd5c5b624a9fac5c08d690c9a908%7C72f988bf86f141af91ab2d7cd011db47%7C1
|  > > %7C0%7C636855593134767677sdata=I6kltUVNtcMItCao1dPvnM86%2FlE8ky
|  > > CwshV81dD6mbY%3Dreserved=0 Phab diff:
|  > > https://phabricator.haskell.org/D5036
|  > > Trac ticket:
|  > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgh
|  > > c.haskell.org%2Ftrac%2Fghc%2Fticket%2F15495data=02%7C01%7Csimon
|  > > pj%40microsoft.com%7C9152cd5c5b624a9fac5c08d690c9a908%7C72f988bf86f1
|  > > 41af91ab2d7cd011db47%7C1%7C0%7C636855593134767677sdata=VeRbLhJD
|  > > ZQv%2FCZ39lMpwo2SRhmcyIsHRgwXNYDN28cA%3Dreserved=0
|  > > Commit:
|  > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgi
|  > > tlab.haskell.org%2Fghc%2Fghc%2Fcommit%2F509d5be69c7507ba5d0a5f39ffd1
|  > > 613a59e73eeadata=02%7C01%7Csimonpj%40microsoft.com%7C9152cd5c5b
|  > > 624a9fac5c08d690c9a908%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C
|  > > 636855593134767677sdata=nv9GjvSvGweBPmsHEVD1jBB7yz0Br0hDHtZ5Exv
|  > > uDqU%3Dreserved=0
|  > >
|  > > I see why this change is wanted and how the new version works.
|  > >
|  > > It seems to me, though, that this move makes us *less typed*. That
|  is, it would be very easy (and disastrous) to forget to match on a
|  location node. For example, I can now do this:
|  > >
|  > > > foo :: LPat p -> ...
|  > > > foo (VarPat ...) = ...
|  > >
|  > > Note that I have declared that foo takes a located pat, but then I
|  forgot to extract the location with dL. This would type-check, but it
|  would fail. Previously, the type checker would ensure that I didn't
|  forget to match on the L constructor. This error would get caught after
|  some poking about, because foo just wouldn't work.
|  > >
|  > > However, worse, we might forget to *add* a location when downstream
|  functions expect one. This would be harder to detect, for two reasons:
|  > > 1. The problem is caught at deconstruction, and figuring out where an
|  object was constructed can be quite hard.
|  > > 2. The problem might silently cause trouble, because dL won't
|  actually fail on a node missing a location -- it just gives noSrcSpan. So
|  the problem would manifest as a subtle degradation in the quality of an
|  error message, perhaps not caught until several patches (or years!)
|  later.
|  > >
|  > > So I'm uncomfortable with this direction of travel.
|  > >
|  > > Has this aspect of this design been brought up before? I have to say
|  I don't have a great solution to suggest. Perhaps the best I can think of
|  is to make Located a type family. It would branch on the type index to
|  HsSyn types, introducing a Located node for GhcPass but not for other
|  types. This Isn't really all that extensible (I think) and it gives
|  special status to GHC's usage of the AST. But it seems to solve the
|  immediate problems without the downside above.
|  > >
|  > > Sorry for 

Re: TTG: Handling Source Locations

2019-02-12 Thread Shayan Najd
Hi Richard,

> [Richard:]
> It seems to me, though, that this move makes us *less typed*.
> [and]
> However, worse, we might forget to *add* a location when downstream functions 
> expect one.

We had a more sophisticated version of TTG that could support the
ping-pong style (and hence typed tagging of locations), but it came at
the price of more complicated encoding [0].
We have decided to abandon the more typed variant since tracking
whether a node is located or not is inherently a dynamic/run-time
process, not a static/compile-time process:
there are some nodes that are generated in the process by the compiler
by an arbitrary logic (hard to encode by types), hence have no
location (in the source code).

The types `LHsExpr`, `LPat`, and the like will be deleted! It will be
all `HsExpr`, `Pat` and the like.
Baking-in, e.g. `LHsExpr` into `HsExpr`, was a mistake in the first
place: we were cheating using `Maybe` type anyway, when for example an
`LHsExpr` was forcibly required but we had only `HsExpr` and used
`noLoc`.

> Sorry for reopening something that has already been debated, but (unless I'm 
> missing something) the current state of affairs seems like a potential 
> wellspring of subtle bugs.

We were really careful about the refactoring. The new code aside, I
don't see how we can introduce any bugs by the refactoring of the old
code explained in the wiki.
About the new code, the convention is straightforward: anytime you
destruct an AST node, assume a wrapper node inside (add an extra
case), or use the smart constructors/pattern synonyms.

I'd be happy to rediscuss the design space here. It would be great to
have everyone fully on board as it is not a trivial change.

/Shayan

[0] https://github.com/shayan-najd/HsAST/blob/master/Paper.pdf

On Sat, 9 Feb 2019 at 17:19, Richard Eisenberg  wrote:
>
> Hi devs,
>
> I just came across [TTG: Handling Source Locations], as I was poking around 
> in RdrHsSyn and found wondrous things like (dL->L wiz waz) all over the place.
>
> General outline: 
> https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow/HandlingSourceLocations
> Phab diff: https://phabricator.haskell.org/D5036
> Trac ticket: https://ghc.haskell.org/trac/ghc/ticket/15495
> Commit: 
> https://gitlab.haskell.org/ghc/ghc/commit/509d5be69c7507ba5d0a5f39ffd1613a59e73eea
>
> I see why this change is wanted and how the new version works.
>
> It seems to me, though, that this move makes us *less typed*. That is, it 
> would be very easy (and disastrous) to forget to match on a location node. 
> For example, I can now do this:
>
> > foo :: LPat p -> ...
> > foo (VarPat ...) = ...
>
> Note that I have declared that foo takes a located pat, but then I forgot to 
> extract the location with dL. This would type-check, but it would fail. 
> Previously, the type checker would ensure that I didn't forget to match on 
> the L constructor. This error would get caught after some poking about, 
> because foo just wouldn't work.
>
> However, worse, we might forget to *add* a location when downstream functions 
> expect one. This would be harder to detect, for two reasons:
> 1. The problem is caught at deconstruction, and figuring out where an object 
> was constructed can be quite hard.
> 2. The problem might silently cause trouble, because dL won't actually fail 
> on a node missing a location -- it just gives noSrcSpan. So the problem would 
> manifest as a subtle degradation in the quality of an error message, perhaps 
> not caught until several patches (or years!) later.
>
> So I'm uncomfortable with this direction of travel.
>
> Has this aspect of this design been brought up before? I have to say I don't 
> have a great solution to suggest. Perhaps the best I can think of is to make 
> Located a type family. It would branch on the type index to HsSyn types, 
> introducing a Located node for GhcPass but not for other types. This Isn't 
> really all that extensible (I think) and it gives special status to GHC's 
> usage of the AST. But it seems to solve the immediate problems without the 
> downside above.
>
> Sorry for reopening something that has already been debated, but (unless I'm 
> missing something) the current state of affairs seems like a potential 
> wellspring of subtle bugs.
>
> Thanks,
> Richard
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: TTG: Handling Source Locations

2019-02-12 Thread Matthew Pickering
I just did this now, it was quite disconcerting that my code continued
to compile after applying `cL loc` to the return value of one of my
functions.

On Sat, Feb 9, 2019 at 5:40 PM Vladislav Zavialov  wrote:
>
> I wholly share this concern, which is why I commented on the Phab diff:
>
> > Does this rely on the caller to call dL on the pattern? Very fragile, let's 
> > not do that.
>
> In addition, I'm worried about illegal states where we end up with
> multiple nested levels of `NewPat`, and calling `dL` once is not
> sufficient.
>
> As to the better solution, I think we should just go with Solution B
> from the Wiki page. Yes, it's somewhat more boilerplate, but it
> guarantees to have locations in the right places for all nodes. The
> main argument against it was that we'd have to define `type instance
> XThing (GhcPass p) = SrcSpan` for many a `Thing`, but I don't see it
> as a downside at all. We should do so anyway, to get rid of parsing
> API annotations and put them in the AST proper.
>
> All the best,
> Vladislav
>
> On Sat, Feb 9, 2019 at 7:19 PM Richard Eisenberg  wrote:
> >
> > Hi devs,
> >
> > I just came across [TTG: Handling Source Locations], as I was poking around 
> > in RdrHsSyn and found wondrous things like (dL->L wiz waz) all over the 
> > place.
> >
> > General outline: 
> > https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow/HandlingSourceLocations
> > Phab diff: https://phabricator.haskell.org/D5036
> > Trac ticket: https://ghc.haskell.org/trac/ghc/ticket/15495
> > Commit: 
> > https://gitlab.haskell.org/ghc/ghc/commit/509d5be69c7507ba5d0a5f39ffd1613a59e73eea
> >
> > I see why this change is wanted and how the new version works.
> >
> > It seems to me, though, that this move makes us *less typed*. That is, it 
> > would be very easy (and disastrous) to forget to match on a location node. 
> > For example, I can now do this:
> >
> > > foo :: LPat p -> ...
> > > foo (VarPat ...) = ...
> >
> > Note that I have declared that foo takes a located pat, but then I forgot 
> > to extract the location with dL. This would type-check, but it would fail. 
> > Previously, the type checker would ensure that I didn't forget to match on 
> > the L constructor. This error would get caught after some poking about, 
> > because foo just wouldn't work.
> >
> > However, worse, we might forget to *add* a location when downstream 
> > functions expect one. This would be harder to detect, for two reasons:
> > 1. The problem is caught at deconstruction, and figuring out where an 
> > object was constructed can be quite hard.
> > 2. The problem might silently cause trouble, because dL won't actually fail 
> > on a node missing a location -- it just gives noSrcSpan. So the problem 
> > would manifest as a subtle degradation in the quality of an error message, 
> > perhaps not caught until several patches (or years!) later.
> >
> > So I'm uncomfortable with this direction of travel.
> >
> > Has this aspect of this design been brought up before? I have to say I 
> > don't have a great solution to suggest. Perhaps the best I can think of is 
> > to make Located a type family. It would branch on the type index to HsSyn 
> > types, introducing a Located node for GhcPass but not for other types. This 
> > Isn't really all that extensible (I think) and it gives special status to 
> > GHC's usage of the AST. But it seems to solve the immediate problems 
> > without the downside above.
> >
> > Sorry for reopening something that has already been debated, but (unless 
> > I'm missing something) the current state of affairs seems like a potential 
> > wellspring of subtle bugs.
> >
> > Thanks,
> > Richard
> > ___
> > ghc-devs mailing list
> > ghc-devs@haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Marge has been stabilised

2019-02-12 Thread Matthew Pickering
Hi all,

I think I have finally managed to stablise the merge bot (Marge).

If you have a patch ready to merge then

1. Make sure that CI shows as passing
2. Make sure it has been approved by at least one person
3. Make sure it is not marked as WIP

Once these three conditions are met, assign the PR to "Marge Bot" and
then she will pick it up and merge it, hopefully within 24 hours
depending on her current state.

Any more questions, feel free to ask. I included some more specific
detail below.

Cheers,

Matt

--

Here is her current mode of operation.

Every 30 minutes whilst idle she will try to find new MRs to batch
together. If she finds at least two MRs to batch together then she
creates a batch as a new merge request. A batch is the series of MRs
rebased on top of each other from oldest to newest.

She then waits for this merge request to pass CI, checking every 10
minutes to see if it has done so.

Once the MR has passed CI, she rebases and merges each MR in the batch
one by one to the target branch. In between each MR she merges she
waits 5 minutes.

Due to the fact each MR is merged one by one:

1. The original MR will be closed automatically.
2. A CI job will trigger for each MR as it lands on master.
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs