Re: Lexing / Parsing and final token

2021-01-19 Thread Alan & Kim Zimmerman
FYI I did the horrible thing for now, optimisations welcome.

The change is at [1]

Alan

[1]
https://gitlab.haskell.org/ghc/ghc/-/commit/742273a94c187f51e3b143f9c206c42024486ecf?merge_request_iid=2418

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Lexing / Parsing and final token

2021-01-19 Thread Alan & Kim Zimmerman
And if there is a comment after the '}' and then more blank lines, the last
token is a comment.

If there are no curlies, it is an ITsemi at the last location, after the comment.

So my hacky scheme of using ITsemi as the means to track the last gap is
not viable.

And I don't want to put extra housekeeping on every token to track two
tokens back, not just one. Back to the drawing board.

Thanks
  Alan




Re: Lexing / Parsing and final token

2021-01-19 Thread Richard Eisenberg
So, I think there's your answer: the last token might be ITccurly, not ITsemi. 
It seems that the "insert invisible curlies and semis" is taken more literally 
for semis than for curlies.

Richard




Re: Lexing / Parsing and final token

2021-01-19 Thread Alan & Kim Zimmerman
Changing it to remove the final ';' gives a last token of ITccurly.

Changing it to

module Bug where
x = 5
y = 6

Gives a last token of ITsemi.

Alan



Re: Lexing / Parsing and final token

2021-01-19 Thread Richard Eisenberg
That's bizarre. Does it still happen with explicit braces?

Just to test, I tried

module Bug where {
x = 5;
y = 6;
};

and GHC rejected it because of the trailing ;.

Richard




Lexing / Parsing and final token

2021-01-19 Thread Alan & Kim Zimmerman
I am (still) working on !2418 to bring the API Annotations into the GHC
ParsedSource, and making good progress.

I am currently making a rough port of ghc-exactprint, to ensure I can get
all the tests around modifying the AST to work.

One of the last pieces is being able to capture the spacing from the last
token in the file to the EOF.  I guess technically it is the second last
token.

Empirically (calling getTokenStream), it seems this is always ITsemi.  I am
not sure how this comes about, as the `module` parsing rule in Parser.y
ends with body or body2, and those both finish with an actual or virtual
'}'.

Can I rely on the token before ITEof always being ITsemi?

Alan


Storage layout of integral types

2021-01-19 Thread Stefan Schulze Frielinghaus
Hi all,

I'm wondering what the intended storage layout of integral types is, in
particular for integral types smaller than a word.  For example, on a 64-bit
machine, is a 32-bit integer supposed to be written into the payload of a
closure as a whole word (64 bits) or as just 32 bits?

I'm asking because since commit be5d74ca I see differently aligned integers in
the payload of a closure on a 64-bit big-endian machine.  For example, the
following code creates an Int32 object which contains the actual integer in
the high part of the payload (the snippet comes from the add operator
GHC.Int.$fNumInt32_$c+_entry):

Hp = Hp + 16;
...
I64[Hp - 8] = GHC.Int.I32#_con_info;
I32[Hp] = _scz7::I32;

whereas e.g. in function rts_getInt32 the opposite is assumed and the actual
integer is expected in the low part of the payload:

HsInt32
rts_getInt32 (HaskellObj p)
{
    // See comment above:
    // ASSERT(p->header.info == I32zh_con_info ||
    //        p->header.info == I32zh_static_info);
    return (HsInt32)(HsInt)(UNTAG_CLOSURE(p)->payload[0]);
}

The same seems to be the case for the interpreter and foreign calls (case
bci_CCALL) where integral arguments are passed in the low part of a whole word.

Currently, my intuition is that the payload of a closure for an integral type
smaller than the word size is written as a whole word, with the subword
aligned according to the machine's endianness.  Can someone confirm this?  If
that is indeed true, then rts_getInt32 seems to be correct and the code
generated for GHC.Int.$fNumInt32_$c+_entry does not.  Otherwise the converse
seems to be the case.
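
To make the two conventions concrete, here is a minimal C sketch.  It is not
taken from the GHC sources: payload_word, store_as_i32, store_as_word,
load_via_word and host_is_little_endian are hypothetical names, and a zeroed
uint64_t merely stands in for one payload word of a closure.  It only
illustrates why a 4-byte store followed by a whole-word load behaves
differently depending on endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 64-bit payload word of a closure. */
typedef uint64_t payload_word;

/* Mimics a store like `I32[Hp] = x`: only 4 bytes are written, at the
   lowest address of the (zeroed) payload word. */
static payload_word store_as_i32(int32_t x)
{
    payload_word w = 0;
    assert(sizeof x <= sizeof w);
    memcpy(&w, &x, sizeof x);
    return w;
}

/* Mimics a whole-word store: sign-extend to 64 bits, write all 8 bytes. */
static payload_word store_as_word(int32_t x)
{
    return (payload_word)(int64_t)x;
}

/* Mimics the access pattern of rts_getInt32: load the whole word and keep
   only its low 32 bits. */
static int32_t load_via_word(payload_word w)
{
    uint32_t low = (uint32_t)w;
    int32_t result;
    memcpy(&result, &low, sizeof result);
    return result;
}

/* Returns 1 on a little-endian host and 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

With these helpers, load_via_word(store_as_word(x)) returns x on any host.
load_via_word(store_as_i32(x)) returns x only on a little-endian host; on a
64-bit big-endian host the four bytes land in the high half of the word, so
for x = 42 the whole-word load yields 0.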

Cheers,
Stefan


Large Haddock submodule merge landing soon

2021-01-19 Thread Ben Gamari

Hi all,

I have a very large Haddock change (merging Haddock's development
branch, `ghc-8.10`, into the `ghc-head` branch) pending in !4819. I
will try to merge it as soon as Marge's next batch finishes.

If you have a Haddock change outstanding, you will need to perform a
rather significant rebase after this change goes in. However, I am
available to help with the rebase if needed; just send me the number of
your merge request and I can take care of the rest.

Cheers,

- Ben

