Re: How does GHC implement layout?

2021-04-06 Thread Simon Marlow
The history here: several people independently noticed that it might be
better to implement the "parse error" part of layout by doing something
simpler than full parsing. For example, if we just count brackets in the
lexer, then we can handle things like

   f (case x of y -> z)

and if we treat let/in as a pair of brackets too, then the common case of

  let x = y in z

also works. The AlternativeLayoutRule was Ian's implementation of this
idea, and (if I remember correctly) contains a number of fixes that arose
from discovering interesting cases of code in the wild that weren't handled
by the obvious bracket-matching techniques.

My conclusion from this experiment - which is a bit subjective and might
differ from others - is that
* you need a lot of special cases to handle code that Haskell users expect
to parse
* and then the specification becomes pretty complex, and hard to explain to
someone

So ultimately it didn't solve the problem in a nice way, unfortunately. The
"parse error" rule is hard for implementers, but it's not actually that
hard for users.

If other people agree with this conclusion, we should kill off the ALR code
now.

Someone is going to ask me for examples of those "special cases", I'm
afraid I don't remember - I only cached the answer to the question, not the
working :) You'd probably have to go digging through the ALR code.

Cheers
Simon


On Tue, 6 Apr 2021 at 01:36, Alexis King  wrote:

> On 4/5/21 2:36 PM, Ian Lynagh wrote:
>
> It was originally designed by John Meacham:
> https://gitlab.haskell.org/haskell/prime/-/wikis/alternative-layout-rule
> https://www.mail-archive.com/haskell-prime@haskell.org/msg01938.html
>
> Thanks, Ian—I had stumbled across a link to the old Haskell Prime trac
> wiki while I was searching for information, but I didn’t realize where it
> had been migrated to.
>
> It isn't exactly equivalent to the Haskell layout rule, but it's fairly
> close and much simpler (due to not having the "on a parse error" case).
>
> Yes, I gathered as much from the implementation. The idea makes sense, but
> of course it doesn’t provide much benefit to have a simpler implementation
> unless it actually *replaces* the “on parse error” approach.
>
> Given this appears to be a long-defunct proposal, a natural followup
> question is to ask whether there’s any reason this code is still in GHC. Is
> it used for any purpose, or could it be removed?
>
> Alexis
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-05 Thread Alexis King

On 4/5/21 2:36 PM, Ian Lynagh wrote:
It was originally designed by John Meacham: 
https://gitlab.haskell.org/haskell/prime/-/wikis/alternative-layout-rule 
https://www.mail-archive.com/haskell-prime@haskell.org/msg01938.html


Thanks, Ian—I had stumbled across a link to the old Haskell Prime trac 
wiki while I was searching for information, but I didn’t realize where 
it had been migrated to.


It isn't exactly equivalent to the Haskell layout rule, but it's 
fairly close and much simpler (due to not having the "on a parse 
error" case).


Yes, I gathered as much from the implementation. The idea makes sense, 
but of course it doesn’t provide much benefit to have a simpler 
implementation unless it actually /replaces/ the “on parse error” approach.


Given this appears to be a long-defunct proposal, a natural followup 
question is to ask whether there’s any reason this code is still in GHC. 
Is it used for any purpose, or could it be removed?


Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-05 Thread Ian Lynagh
On Sun, Apr 04, 2021 at 05:18:52PM -0500, Alexis King wrote:
> On 4/4/21 1:52 PM, Iavor Diatchki wrote:
> > 
> > Overall, I do think that Haskell's layout rule is more complicated than
> > it needs to be, and this is mostly because of the rule that requires the
> > insertion of a "virtual close curly" on a parse error.
> 
> Yes, this does seem to be by far the trickiest bit. But I’d be sad not to
> have it, as without it, even simple things like
> 
>let x = 3 in e
> 
> would not be grammatically valid.

That is accepted by the AlternativeLayoutRule.


Thanks
Ian

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-05 Thread Ian Lynagh
On Mon, Apr 05, 2021 at 05:09:21PM +, Simon Peyton-Jones wrote:
> Two people who may know more about the alternative layout rule are Simon 
> Marlow and Ian Lynagh (cc'd).

It was originally designed by John Meacham:
https://gitlab.haskell.org/haskell/prime/-/wikis/alternative-layout-rule
https://www.mail-archive.com/haskell-prime@haskell.org/msg01938.html

It isn't exactly equivalent to the Haskell layout rule, but it's fairly
close and much simpler (due to not having the "on a parse error" case).


Thanks
IAn

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-05 Thread Carter Schonwald
while not totally perfectly related, i have wondered if something like the
the recent pika parsing paper https://arxiv.org/abs/2005.06444 could be
used to provide better error recovery  from layout related errors in
haskell and similar languages.

On Sun, Apr 4, 2021 at 6:19 PM Alexis King  wrote:

> On 4/4/21 1:52 PM, Iavor Diatchki wrote:
>
> Hi Alexis,
>
> I wasn't sure what the "alternative layout" is either and did some
> googling, and it appears that it is something that was never really
> documented properly.   The following link contains pointers to the commit
> that introduced it (in 2009!)  (not the main ticket but some of the
> comments)
>
> Thanks, that’s a helpful pointer, though of course it still doesn’t
> explain very much. I’m still interested in understanding what the purpose
> of “alternative layout” is and how it operates, if anyone else has any idea.
>
> Overall, I do think that Haskell's layout rule is more complicated than it
> needs to be, and this is mostly because of the rule that requires the
> insertion of a "virtual close curly" on a parse error.
>
> Yes, this does seem to be by far the trickiest bit. But I’d be sad not to
> have it, as without it, even simple things like
>
> let x = 3 in e
>
> would not be grammatically valid.
>
> My feeling is that it'd be pretty tricky to do layout in the parser with
> grammar rules, but you may be able to do something with the parser state.
>
> Yes, I have some vague ideas, but none of them are particularly fleshed
> out. It’s entirely possible that I just don’t understand the relationship
> between the lexer and the parser (which seems somewhat obscured by the
> “alternative layout” stuff), and the ideas I have are what’s already
> implemented today. I’ll have to study the implementation more closely.
>
> In any case, thank you for your response! The ALR-related pointer
> certainly clarifies at least a little.
>
> Alexis
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-05 Thread Sebastian Graf
Hi Alexis, Hi Iavor,

I'm afraid I'm not particularly acquainted with how GHC implements
indentation-sensitive parsing, but I really like the way in which this book
 frames the problem. If you
look at the preview for the first chapter, you'll notice that (they call
the lexer scanner, and) they introduce an additional pass between lexer and
parser that handles the context-sensitive bits about indentation-sensitive
parsing, which doesn't fit well with the parser (which assumes a
context-free grammar to stay sane) or with the lexer (which should better
be just a simple DFA).

In particular, the short section 1.3 about the screener explicitly mentions
Haskell as use case:

[image: 2021-04-05-140820_543x163_scrot.png]
Although that doesn't really explain the how. Maybe section 2.5 (where impl
of screeners is covered) provides more insight into that.
Screeners are a bit like semantic analysis, but on the token stream instead
of the parse tree.

Reach out in private if you want more excerpts.

Cheers,
Sebastian

Am Mo., 5. Apr. 2021 um 00:19 Uhr schrieb Alexis King :

> On 4/4/21 1:52 PM, Iavor Diatchki wrote:
>
> Hi Alexis,
>
> I wasn't sure what the "alternative layout" is either and did some
> googling, and it appears that it is something that was never really
> documented properly.   The following link contains pointers to the commit
> that introduced it (in 2009!)  (not the main ticket but some of the
> comments)
>
> Thanks, that’s a helpful pointer, though of course it still doesn’t
> explain very much. I’m still interested in understanding what the purpose
> of “alternative layout” is and how it operates, if anyone else has any idea.
>
> Overall, I do think that Haskell's layout rule is more complicated than it
> needs to be, and this is mostly because of the rule that requires the
> insertion of a "virtual close curly" on a parse error.
>
> Yes, this does seem to be by far the trickiest bit. But I’d be sad not to
> have it, as without it, even simple things like
>
> let x = 3 in e
>
> would not be grammatically valid.
>
> My feeling is that it'd be pretty tricky to do layout in the parser with
> grammar rules, but you may be able to do something with the parser state.
>
> Yes, I have some vague ideas, but none of them are particularly fleshed
> out. It’s entirely possible that I just don’t understand the relationship
> between the lexer and the parser (which seems somewhat obscured by the
> “alternative layout” stuff), and the ideas I have are what’s already
> implemented today. I’ll have to study the implementation more closely.
>
> In any case, thank you for your response! The ALR-related pointer
> certainly clarifies at least a little.
>
> Alexis
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-04 Thread Alexis King

On 4/4/21 1:52 PM, Iavor Diatchki wrote:

Hi Alexis,

I wasn't sure what the "alternative layout" is either and did some 
googling, and it appears that it is something that was never really 
documented properly.   The following link contains pointers to the 
commit that introduced it (in 2009!) (not the main ticket but some of 
the comments)


Thanks, that’s a helpful pointer, though of course it still doesn’t 
explain very much. I’m still interested in understanding what the 
purpose of “alternative layout” is and how it operates, if anyone else 
has any idea.


Overall, I do think that Haskell's layout rule is more complicated 
than it needs to be, and this is mostly because of the rule that 
requires the insertion of a "virtual close curly" on a parse error.


Yes, this does seem to be by far the trickiest bit. But I’d be sad not 
to have it, as without it, even simple things like


   let x = 3 in e

would not be grammatically valid.

My feeling is that it'd be pretty tricky to do layout in the parser 
with grammar rules, but you may be able to do something with the 
parser state.


Yes, I have some vague ideas, but none of them are particularly fleshed 
out. It’s entirely possible that I just don’t understand the 
relationship between the lexer and the parser (which seems somewhat 
obscured by the “alternative layout” stuff), and the ideas I have are 
what’s already implemented today. I’ll have to study the implementation 
more closely.


In any case, thank you for your response! The ALR-related pointer 
certainly clarifies at least a little.


Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How does GHC implement layout?

2021-04-04 Thread Iavor Diatchki
Hi Alexis,

I wasn't sure what the "alternative layout" is either and did some
googling, and it appears that it is something that was never really
documented properly.   The following link contains pointers to the commit
that introduced it (in 2009!)  (not the main ticket but some of the
comments)

https://ghc-tickets.haskell.narkive.com/htgwkF80/13087-alternativelayoutrule-breaks-lambdacase

Overall, I do think that Haskell's layout rule is more complicated than it
needs to be, and this is mostly because of the rule that requires the
insertion of a "virtual close curly" on a parse error.  This means that the
parser and lexer have to communicate.  I've implemented a few
languages with layout, and usually use a simpler version of layout that
just omits that case.  The benefit is that layout can be implemented as a
simple pre-processor pass on the stream of tokens,  which is much simpler
to specify and implement.   The drawback is that sometimes you have to
write programs in a slightly different way, but nothing that can't be
easily worked around.

My feeling is that it'd be pretty tricky to do layout in the parser with
grammar rules, but you may be able to do something with the parser state.
 I wonder how different it would end up looking though, as in a way that's
exactly what we are doing now, it is just that some of the state is the
lexer.

-Iavor


On Sat, Apr 3, 2021 at 5:05 PM Alexis King  wrote:

> Hi all,
>
> I’m wondering if there are any resources that discuss the design of GHC’s
> implementation of layout. (I haven’t been able to find any.) From looking
> at the code, here’s what I’ve gathered so far:
>
>- Layout is implemented in the lexer (compiler/GHC/Parser/Lexer.x).
>
>- The implementation is similar in some respects to the approach
>described in the Haskell Report, but still fairly different. Virtual braces
>and semicolons are inserted during the lexing process itself with the
>assistance of Alex lexer states (aka “start codes”).
>
>- In order to handle particularly tricky cases like
>
>if e then do x; y else z
>
>
>where the virtual close brace must be inserted in the middle of a
>line, tokens such as in and else are given special context-sensitive
>treatment. This appears to be quite subtle.
>
> Overall, I can mostly follow the code, but I still have a few unanswered
> questions:
>
>- The layout-related code consistently uses the phrase “alternative
>layout rule”—what does “alternative” mean here? Does it refer to GHC’s
>implementation of layout? Or maybe it refers to
>NondecreasingIndentation? It isn’t clear.
>
>- The implementation of layout seems quite complex, in large part
>because it has to worry about parsing concerns in the lexer in order to
>handle tricky cases like the one I provided above. Is there are reason all
>this is done in the lexer, rather than deferring some more of the work to
>the parser?
>
> I’ve found remarkably little information about implementing layout in
> general, so perhaps I’m missing some resources or keywords to search for,
> but any information or perspectives would be appreciated!
>
> Thanks,
> Alexis
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


How does GHC implement layout?

2021-04-03 Thread Alexis King

Hi all,

I’m wondering if there are any resources that discuss the design of 
GHC’s implementation of layout. (I haven’t been able to find any.) From 
looking at the code, here’s what I’ve gathered so far:


 * Layout is implemented in the lexer (compiler/GHC/Parser/Lexer.x).

 * The implementation is similar in some respects to the approach
   described in the Haskell Report, but still fairly different. Virtual
   braces and semicolons are inserted during the lexing process itself
   with the assistance of Alex lexer states (aka “start codes”).

 * In order to handle particularly tricky cases like

if e then do x; y else z


   where the virtual close brace must be inserted in the middle of a
   line, tokens such as in and else are given special context-sensitive
   treatment. This appears to be quite subtle.

Overall, I can mostly follow the code, but I still have a few unanswered 
questions:


 * The layout-related code consistently uses the phrase “alternative
   layout rule”—what does “alternative” mean here? Does it refer to
   GHC’s implementation of layout? Or maybe it refers to
   NondecreasingIndentation? It isn’t clear.

 * The implementation of layout seems quite complex, in large part
   because it has to worry about parsing concerns in the lexer in order
   to handle tricky cases like the one I provided above. Is there are
   reason all this is done in the lexer, rather than deferring some
   more of the work to the parser?

I’ve found remarkably little information about implementing layout in 
general, so perhaps I’m missing some resources or keywords to search 
for, but any information or perspectives would be appreciated!


Thanks,
Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs