Re: Potential improvement in compacting GC

2021-07-13 Thread Ömer Sinan Ağacan
Here are two other ideas that should improve GHC's compacting GC much more
significantly. I've implemented both of these in another project and the
results were great. In a GC benchmark (the mutator does almost no work other
than allocating data using a bump allocator), the first one reduced Wasm
instructions executed by 14%, the second one by 19.8%.

Both of these ideas require pushing object headers to the mark stack with the
objects, which means larger mark stacks. This is the only downside of these
algorithms.

- Instead of marking and then threading in the next pass, the mark phase
  threads all fields as it pushes them to the mark stack. We still need two
  other passes: one to unthread headers, another to move the objects. (We
  can't do both in one pass; let me know if you're curious and I can
  elaborate.)

  This has the same number of passes as the current implementation, but it
  visits object fields only once. Currently we visit fields once when marking
  (to mark what they point to) and then again in `update_fwd`. This
  implementation does both in one pass over the fields, so `update_fwd` no
  longer needs to visit fields.

  This reduced Wasm instructions executed by 14% in my benchmark.

- The marking phase threads backward pointers (and ignores forward pointers).
  Then we do one pass instead of two: for each marked object, unthread it
  (updating every reference threaded onto its header to the object's new
  location), move it to its new location, then thread its forward pointers.
  This completely eliminates one of the three passes, but fields need to be
  visited twice, as before (unlike the algorithm above).

  I think this one is originally described in "An Efficient Garbage Compaction
  Algorithm", but I found that paper to be difficult to follow. A short
  description of the same algorithm exists in "High-Performance Garbage
  Collection for Memory-Constrained Environments" in section 5.1.2.

  This reduced Wasm instructions executed by 19.8% in my benchmark.

  In this algorithm, fields that won't be moved can be threaded at any time
  before the second pass (the pointed-to objects just need to be marked and
  pushed to the mark stack with their headers before a field is threaded).
  For example, in GHC, mut list entries can be threaded before or after
  marking (but before the second pass), as IIRC mut lists are not moved. The
  same goes for fields of large objects. A rough sketch of this single-pass
  scheme follows below.
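
For concreteness, here is a rough sketch of the single compaction pass of the
second scheme, and of the thread/unthread trick both schemes rely on. This is
a toy C model with a made-up header encoding and word-addressed objects, not
GHC's RTS code or object layout; the mark phase (which threads backward
pointers and pushes headers along with objects) and root/mut-list threading
are assumed to have run already and are not shown.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    typedef uintptr_t word_t;

    /* Assumed toy object model (not GHC's): an object is one header word,
     * (nfields << 2) | flags, followed by nfields pointer fields. */
    #define MARK_BIT    ((word_t)1)
    #define THREAD_BIT  ((word_t)2)

    static size_t hdr_nfields(word_t hdr) { return hdr >> 2; }
    static int    is_marked(word_t hdr)   { return (hdr & MARK_BIT) != 0; }
    static int    is_threaded(word_t v)   { return (v & THREAD_BIT) != 0; }

    /* Thread a reference: push the field onto the chain hanging off the
     * referent's header, so every reference to an object can later be found
     * (and fixed up) from the object itself. */
    static void thread(word_t *field)
    {
        word_t *obj = (word_t *)*field;
        *field = obj[0];
        obj[0] = (word_t)field | THREAD_BIT;
    }

    /* Unthread: walk the chain rooted at obj's header, point every threaded
     * field at the object's new address, and restore the original header. */
    static void unthread(word_t *obj, word_t *new_addr)
    {
        word_t v = obj[0];
        while (is_threaded(v)) {
            word_t *field = (word_t *)(v & ~THREAD_BIT);
            v = *field;
            *field = (word_t)new_addr;
        }
        obj[0] = v;
    }

    /* The single compaction pass: marking already threaded every backward
     * pointer, and forward pointers from lower addresses get threaded as we
     * go, so when we reach a live object every reference to it is already on
     * its chain. */
    void compact(word_t *heap_start, word_t *heap_end)
    {
        word_t *free = heap_start;
        word_t *p = heap_start;
        while (p < heap_end) {
            word_t hdr = p[0];
            if (!is_threaded(hdr) && !is_marked(hdr)) {
                p += 1 + hdr_nfields(hdr);    /* dead: header intact, skip */
                continue;
            }
            unthread(p, free);                /* refs to p now point at free */
            size_t size = 1 + hdr_nfields(p[0]);
            memmove(free, p, size * sizeof(word_t));
            free[0] &= ~MARK_BIT;             /* clear the mark on the copy */
            /* Thread the object's forward pointers from its *new* field
             * addresses; their targets sit at higher (old) addresses and
             * have not moved yet. Backward fields already hold final
             * addresses, written when their referents were unthreaded. */
            for (size_t i = 1; i < size; i++)
                if (free[i] > (word_t)p)      /* forward pointer */
                    thread(&free[i]);
            free += size;
            p += size;
        }
    }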

As far as I can see, mark-compact GC is still the default when a max heap size
is specified and the oldest generation's size exceeds (by default) 30% of the
max heap size. I'm not sure how often a max heap size is specified (it's off by
default), so I'm not sure what the impact of these improvements would be, but
if anyone would be interested in funding me to implement these ideas (the
second algorithm above, and the bitmap iteration in the previous email) I could
try to allocate one or two days a week and finish in a few months.

Normally these are simple changes, but it's difficult to test and debug GHC's
RTS, as we don't have a test suite readily available and the code is not easily
testable. In my previous implementations of these algorithms I had unit tests
for the GC where I could easily generate arbitrary graphs (with cycles,
backward ptrs, forward ptrs, ptrs from/to roots, etc.) and test the GC in
isolation. Implementing (2) took less than a day, and I didn't have to debug
it further once the tests passed. It's really unfortunate that GHC's RTS
makes this kind of thing difficult.

Ömer

On Thu, 7 Jan 2021 at 20:42, Ömer Sinan Ağacan wrote:
>
> Hello,
>
> I recently implemented the algorithm used by GHC's compacting GC in another
> project. The algorithm (after marking) makes two passes over the
> heap/generation. In GHC, these passes are implemented in [1] and in the next
> function.
>
> In my implementation I tried 3 ways of implementing these passes, one of which
> is the same as GHC's code, and benchmarked each version. I found that the
> fastest implementation is not what's used in GHC, but it could easily be used.
>
> I should mention that my code targets Wasm, and I benchmarked Wasm
> instructions executed, not CPU cycles, CPU instructions, or anything like
> that. It's very likely that the results will be different when benchmarking
> code running on actual hardware.
>
> Secondly, in my case the heap is mostly dead (residency is low). In GHC,
> compaction for the oldest generation is enabled when residency passes a
> threshold, so the assumption is the heap is mostly live. I'm guessing this
> should also make some difference.
>
> Anyway, the first implementation I tried was similar to GHC's scan, but I did
>
> scan += object_size(scan);
>
> instead of bumping scan by one, as GHC does in [2]. This was the slowest
> version.
>
> The second implementation did the same as GHC (bumped scan by one). This was
> faster, but still slower than the next version.
>
> What I found to be the best is scanning the bitmap, not the heap. The bitmap
> iterator reads one word at a time. In each iteration it checks if the bitmap
> word is 0. 
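
The archived message is cut off here, but the word-at-a-time bitmap scan it
describes can be sketched roughly as below. The one-bit-per-heap-word layout,
the helper names, and the visitor callback are assumptions made for the sketch
(and `__builtin_ctzll` is a GCC/Clang builtin), not the project's actual code.

    #include <stddef.h>
    #include <stdint.h>

    #define BITS_PER_WORD 64

    /* Scan the mark bitmap one 64-bit word at a time instead of walking the
     * heap. A zero bitmap word covers 64 heap words with no live object, so
     * it is skipped with a single comparison. */
    void scan_bitmap(const uint64_t *bitmap, size_t bitmap_words,
                     uint64_t *heap_start, void (*visit_live)(uint64_t *obj))
    {
        for (size_t i = 0; i < bitmap_words; i++) {
            uint64_t bits = bitmap[i];
            if (bits == 0)
                continue;                        /* nothing live here, skip */
            while (bits != 0) {
                int bit = __builtin_ctzll(bits); /* index of lowest set bit */
                visit_live(heap_start + i * BITS_PER_WORD + bit);
                bits &= bits - 1;                /* clear that bit */
            }
        }
    }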

mkIfaceTc panics with "lookupVers1"

2021-07-13 Thread Erdi, Gergo via ghc-devs

Hi,

I'm trying to use `mkIfaceTc` to make a ModIface from the results of 
typechecking. Everything goes well until it gets to `makeFullIface`, where it 
fails to find some imported fingerprints:

hello: hello: panic! (the 'impossible' happened)
  (GHC version 9.0.1:
lookupVers1

MyPrim Foo
Call stack:
CallStack (from HasCallStack):
  callStackDoc, called at compiler/GHC/Utils/Outputable.hs:1230:37 in 
ghc-lib-9.0.1.20210623-3xx7a2u7IkN9vKAnkscROb:GHC.Utils.Outputable
  pprPanic, called at compiler/GHC/Iface/Recomp.hs:1455:19 in 
ghc-lib-9.0.1.20210623-3xx7a2u7IkN9vKAnkscROb:GHC.Iface.Recomp

I am using the GHC API to load my modules differently from how GHC itself
does; I am attaching the full code, but perhaps worth noting is that I am
creating ModSummarys one by one and then typechecking them and adding them to
the moduleNameProviderMap in dependency order. Also, I am using `downsweep`
directly instead of `depanal`, because the latter's `flushFinderCaches` was
breaking my inter-unit imports.

Because the attached code is a minimized version of something larger, some
parts of it may seem to be doing things in a roundabout way, but that's because
in the real program all those degrees of freedom are used. In case it matters,
the only packages in my package DB are rts, ghc-prim-0.7.0 and ghc-bignum-1.0.

So my question is basically, what am I doing wrong? Why is fingerprinting 
failing on `Top.hs`'s import of `MyPrim.hs`, and how do I fix that, while still 
keeping this total control over when and how modules are typechecked and loaded?

Thanks,
Gergo


Attachments: Top.hs, Main.hs, MyPrim.hs


Re: Can NamedFieldPuns be added to `GHC.LanguageExtensions.Types.Extension`?

2021-07-13 Thread Alfredo Di Napoli
Hello all,

I am happy to see engagement on this issue. I hadn't read Ed's and
Richard's replies until now, but I have indeed explored the pattern synonym
solution, which I have materialised in an MR here:

https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6156

As I say in the MR description, the only small downside is the beefy
`COMPLETE` pragma, but we gain the possibility of deprecating the extension
a bit more explicitly, which is nice. Feedback welcome! :)

Thanks,

Alfredo



On Mon, 12 Jul 2021 at 17:08, Edward Kmett  wrote:

> There's always pattern synonyms as an option for cases like this, free of
> backwards compat issues.
>
> -Edward
>
> On Tue, Jul 6, 2021 at 3:00 AM Alfredo Di Napoli <
> alfredo.dinap...@gmail.com> wrote:
>
>>
>> Hello Simon,
>>
>> Yes, renaming and perhaps keeping `RecordPuns` as a pattern synonym to
>> not break backward compat, if that's feasible to define, as we are in
>> `ghc-boot-th` here. I'm not sure if `PatternSynonyms` and `COMPLETE` would
>> be available there.
>>
>> I am not sure how many libs that depend on the ghc API would break (I
>> haven't grepped on Hackage yet), but that might tip the benefits/troubles
>> ratio towards keeping the status quo.
>>
>> This is not a "problem" I have to solve today, and it might not be
>> considered a problem by others (just an inconsistency I guess): as a
>> colleague of mine pointed out, GHC is not necessarily "lying" here. It's
>> still the same underlying extension, it just happens that there are two
>> names that refer to it.
>>
>> Perhaps I could think about adding to `GhcHint` some kind of mapping
>> which would give IDEs or third-party libs the correct extension name
>> given an input `LangExt.Extension`; the problem then becomes making sure
>> that we keep this mapping in sync with the information contained in
>> `GHC.Driver.Session`.
>>
>> I will let it simmer.
>>
>> Thanks!
>>
>> A.
>>
>> On Tue, 6 Jul 2021 at 11:19, Simon Peyton Jones 
>> wrote:
>>
>>> 1. What prevents us from adding `NamedFieldPuns` as a proper constructor
>>> for the `Extension` type and in principle removing `RecordPuns`? Backward
>>> compatibility, I assume?
>>>
>>> You mean, essentially, rename `LangExt.RecordPuns` to `NamedFieldPuns`.
>>>
>>>
>>>
>>> I’d be fine with that.  There might be back-compat issues, but only with
>>> other plugins, and probably with vanishingly few of them.  Grep in Hackage!
>>>
>>>
>>>
>>> Simon
>>>
>>>
>>>
>>> From: ghc-devs  On Behalf Of Alfredo Di Napoli
>>> Sent: 06 July 2021 10:14
>>> To: Simon Peyton Jones via ghc-devs 
>>> Subject: Can NamedFieldPuns be added to
>>> `GHC.LanguageExtensions.Types.Extension`?
>>>
>>>
>>>
>>> Dear all,
>>>
>>>
>>>
>>> As some of you might know, for the past few months I have been working
>>> on changing GHC's diagnostic messages from plain SDocs to richer Haskell
>>> types.
>>>
>>>
>>>
>>> As part of this work, I have added a mechanism to embed hints into
>>> diagnostics, defined in `GHC.Types.Hint` in `HEAD`. One of the main
>>> workhorses of this `GhcHint` type is the `SuggestExtension
>>> LangExt.Extension` constructor, which embeds the extension to enable in
>>> order to use a particular feature. The `LangExt.Extension` type comes from
>>> `GHC.LanguageExtensions.Types`, and up until now there has always been a
>>> 1:1 mapping between the language pragma for the extension and the type
>>> itself.
>>>
>>>
>>>
>>> Today I was working on turning this error into a proper Haskell type:
>>>
>>>
>>>
>>> badPun :: Located RdrName -> TcRnMessage
>>> badPun fld = TcRnUnknownMessage $ mkPlainError noHints $
>>>   vcat [ text "Illegal use of punning for field" <+> quotes (ppr fld)
>>>        , text "Use NamedFieldPuns to permit this" ]
>>>
>>>
>>>
>>> I was ready to yield a `SuggestExtension LangExt.NamedFieldPuns` when I
>>> discovered that there is no `NamedFieldPuns` constructor. Rather, there is
>>> a `RecordPuns`, which refers to a deprecated flag, and we simply map
>>> `NamedFieldPuns` back to it in `GHC.Driver.Session`:
>>>
>>>
>>>
>>> ...
>>>   depFlagSpec' "RecordPuns"   LangExt.RecordPuns
>>>     (deprecatedForExtension "NamedFieldPuns"),
>>> ...
>>>   flagSpec "NamedFieldPuns"   LangExt.RecordPuns,
>>> ...
>>>
>>>
>>>
>>> This is problematic for the `GhcHint` type, because now if I were to
>>> yield `SuggestExtension LangExt.RecordPuns` to the user, I could still
>>> pretty-print the suggestion to turn `RecordPuns` into `NamedFieldPuns`, but
>>> this means that IDEs or third-party libraries would have access to the
>>> "raw" Haskell datatype, and at that point they would be stuck with a
>>> suggestion to enable a deprecated extension! (Or, best case scenario, they
>>> would have to transform the suggestion into something more sensible, which
>>> partially defeats the point of this refactoring work I have been doing.)
>>>
>>>
>>>
>>> I am not sure this behaviour is unique to `NamedFieldPuns`, but my
>>> question