Re: Potential improvement in compacting GC
Two other ideas that should improve GHC's compacting GC much more significantly. I've implemented both in another project and the results were great. In a GC benchmark (where the mutator does almost no work other than allocating data using a bump allocator), the first idea reduced Wasm instructions executed by 14%, the second by 19.8%. Both ideas require pushing object headers to the mark stack along with the objects, which means larger mark stacks. That is their only downside.

1. Instead of marking and then threading in the next pass, the mark phase
   threads all fields as it pushes them to the mark stack. We still need two
   other passes: one to unthread headers, another to move the objects. (We
   can't do both in one pass; let me know if you're curious and I can
   elaborate.) This has the same number of passes as the current
   implementation, but it visits object fields only once. Currently we visit
   fields once when marking, to mark them, and then again in `update_fwd`.
   This implementation does both in one pass over the fields; `update_fwd` no
   longer visits fields. This reduced Wasm instructions executed by 14% in my
   benchmark.

2. The marking phase threads only backwards pointers (and ignores forwards
   pointers). Then we do one pass instead of two: for each marked object,
   unthread it (updating forwards pointers to the object's new location),
   move it to its new location, then thread its forwards pointers. This
   completely eliminates one of the three passes, but fields need to be
   visited twice, as before (unlike the algorithm above). I believe this
   algorithm was originally described in "An Efficient Garbage Compaction
   Algorithm", but I found that paper difficult to follow. A short
   description of the same algorithm appears in section 5.1.2 of
   "High-Performance Garbage Collection for Memory-Constrained Environments".
   This reduced Wasm instructions executed by 19.8% in my benchmark.
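Both algorithms build on the same pointer-threading trick: every reference to an object is reversed into a linked list that runs through the object's header word, so that once the object's final address is known, a single walk of that list fixes up all referrers. Below is a minimal, hypothetical sketch of thread/unthread on a toy heap, not GHC's actual RTS code. The heap model (an `IOUArray` of `Int` cells) and the encoding (real headers are negative values, heap addresses are non-negative) are simplifying assumptions of the sketch; GHC distinguishes threaded words with tag bits instead.

```haskell
import Data.Array.IO

-- Toy heap model: each cell is an Int. Real object headers are encoded
-- as negative values; heap addresses (and thus pointer fields) are >= 0.

-- Reverse the pointer stored at slot `loc` into the chain rooted at the
-- header of the object it points to.
thread :: IOUArray Int Int -> Int -> IO ()
thread heap loc = do
  obj <- readArray heap loc   -- address this field points to
  hdr <- readArray heap obj   -- current head of obj's threading chain
  writeArray heap obj loc     -- chain now starts at this field...
  writeArray heap loc hdr     -- ...and the field holds the old head

-- Walk the chain rooted at `obj`'s header, pointing every threaded
-- field at `newAddr`, and restore the original (negative) header.
unthread :: IOUArray Int Int -> Int -> Int -> IO ()
unthread heap obj newAddr = go =<< readArray heap obj
  where
    go x
      | x < 0     = writeArray heap obj x   -- real header: restore it, done
      | otherwise = do
          next <- readArray heap x          -- next chain entry (or header)
          writeArray heap x newAddr         -- fix up this referrer
          go next
```

For example, with an object at address 5 (header -100) referenced from slots 0 and 2, threading both slots and then calling `unthread heap 5 7` leaves slots 0 and 2 pointing at 7, with the header restored.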
In this algorithm, fields that won't be moved can be threaded at any time before the second pass (pointed-to objects need to be marked and pushed to the mark stack with their headers before a field is threaded). For example, in GHC, mut list entries can be threaded before or after marking (but before the second pass), as IIRC mut lists are not moved. The same goes for fields of large objects.

As far as I can see, mark-compact GC is still the default when a max heap size is specified and the oldest generation size is (by default) more than 30% of the max heap size. I'm not sure how often a max heap size is specified (it's off by default), so I'm not sure what the impact of these improvements would be, but if anyone is interested in funding me to implement these ideas (the second algorithm above, and the bitmap iteration in the previous email), I could try to allocate one or two days a week and finish in a few months.

Normally these are simple changes, but it's difficult to test and debug GHC's RTS, as we don't have a test suite readily available and the code is not easily testable. In my previous implementations of these algorithms I had unit tests for the GC, where I could easily generate arbitrary graphs (with cycles, backwards ptrs, forwards ptrs, ptrs from/to roots, etc.) and test the GC in isolation. Implementing (2) took less than a day, and I didn't have to debug it again once the tests passed. It's really unfortunate that GHC's RTS makes this kind of thing difficult.

Ömer

On Thu, 7 Jan 2021 at 20:42, Ömer Sinan Ağacan wrote:
>
> Hello,
>
> I recently implemented the algorithm used by GHC's compacting GC in another
> project. The algorithm (after marking) makes two passes over the heap
> /generation. In GHC, these passes are implemented in [1] and in the next
> function.
>
> In my implementation I tried 3 ways of implementing these passes, one of
> which is the same as GHC's code, and benchmarked each version.
> I found that the fastest implementation is not what's used in GHC, but it
> could easily be used.
>
> I should mention that my code targets Wasm, and I benchmarked Wasm
> instructions executed -- not CPU cycles, CPU instructions, or anything like
> that. It's very likely that the results will be different when benchmarking
> code running on actual hardware.
>
> Secondly, in my case the heap is mostly dead (residency is low). In GHC,
> compaction for the oldest generation is enabled when residency passes a
> threshold, so the assumption there is that the heap is mostly live. I'm
> guessing this should also make some difference.
>
> Anyway, the first implementation I tried was similar to GHC's scan, but I did
>
>     scan += object_size(scan);
>
> instead of bumping scan by one, as GHC does in [2]. This was the slowest
> version.
>
> The second implementation did the same as GHC (bumped scan by one). This was
> faster, but still slower than the next version.
>
> What I found to be the best is scanning the bitmap, not the heap. The bitmap
> iterator reads one word at a time. In each iteration it checks whether the
> bitmap word is 0.
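The bitmap-scanning idea in the quoted message can be sketched as follows. This is a hypothetical Haskell model, not the RTS code: the iterator reads one 64-bit bitmap word at a time, skips zero words entirely (64 dead heap words dismissed with a single comparison), and otherwise extracts the set bits with count-trailing-zeros.

```haskell
import Data.Bits (countTrailingZeros, (.&.))
import Data.Word (Word64)

-- Heap indices of all marked objects, one bitmap word at a time.
-- A zero word lets us skip 64 heap words with a single comparison,
-- which is where the speedup over scanning the heap comes from.
markedIndices :: [Word64] -> [Int]
markedIndices bitmap =
  [ wordIx * 64 + bitIx
  | (wordIx, w) <- zip [0 ..] bitmap
  , w /= 0                          -- fast path: skip all-dead words
  , bitIx <- setBits w
  ]
  where
    -- positions of the set bits in a word, lowest first
    setBits :: Word64 -> [Int]
    setBits 0 = []
    setBits w = countTrailingZeros w : setBits (w .&. (w - 1))
```

Clearing the lowest set bit with `w .&. (w - 1)` keeps each iteration branch-light, which matters when counting executed instructions as in the benchmark above.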
mkIfaceTc panics with "lookupVers1"
Hi,

I'm trying to use `mkIfaceTc` to make a ModIface from the results of typechecking. Everything goes well until it gets to `makeFullIface`, where it fails to find some imported fingerprints:

    hello: hello: panic! (the 'impossible' happened)
      (GHC version 9.0.1:
        lookupVers1
        MyPrim
        Foo
    Call stack:
        CallStack (from HasCallStack):
          callStackDoc, called at compiler/GHC/Utils/Outputable.hs:1230:37 in ghc-lib-9.0.1.20210623-3xx7a2u7IkN9vKAnkscROb:GHC.Utils.Outputable
          pprPanic, called at compiler/GHC/Iface/Recomp.hs:1455:19 in ghc-lib-9.0.1.20210623-3xx7a2u7IkN9vKAnkscROb:GHC.Iface.Recomp

I am using the GHC API to load my modules differently than GHC itself does. I am attaching the full code, but perhaps worth noting is that I am creating ModSummarys one by one and then typechecking them and adding them to the moduleNameProviderMap in dependency order. Also, I am using `downsweep` directly instead of `depanal`, because the latter's `flushFinderCaches` was breaking my inter-unit imports.

Because the attached code is a minimized version of something larger, some parts of it may seem to be doing things in a roundabout way, but that's because in the real program all those degrees of freedom are used. In case it matters, the only packages in my package DB are rts, ghc-prim-0.7.0 and ghc-bignum-1.0.

So my question is basically: what am I doing wrong? Why is fingerprinting failing on `Top.hs`'s import of `MyPrim.hs`, and how do I fix that while still keeping this total control over when and how modules are typechecked and loaded?

Thanks,
	Gergo
Attachments: Top.hs, Main.hs, MyPrim.hs

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: Can NamedFieldPuns be added to `GHC.LanguageExtensions.Types.Extension`?
Hello all,

I am happy to see engagement on this issue. I didn't read Ed's and Richard's replies until now, but I have indeed explored the pattern synonym solution, which I have materialised in a MR here:

https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6156

As I say in the MR description, the only small downside is the beefy `COMPLETE` pragma, but we gain the possibility of deprecating the extension a bit more explicitly, which is nice. Feedback welcome! :)

Thanks,

Alfredo

On Mon, 12 Jul 2021 at 17:08, Edward Kmett wrote:
>
> There's always pattern synonyms as an option for cases like this, free of
> backwards compat issues.
>
> -Edward
>
> On Tue, Jul 6, 2021 at 3:00 AM Alfredo Di Napoli <alfredo.dinap...@gmail.com> wrote:
>>
>> Hello Simon,
>>
>> Yes, renaming and perhaps keeping `RecordPuns` as a pattern synonym to
>> not break backward compat, if that's feasible to define, as we are in
>> `ghc-boot-th` here. Not sure if `PatternSynonyms` and `COMPLETE` would be
>> available there.
>>
>> I am not sure how many libs that depend on the GHC API would break (I
>> haven't grepped Hackage yet), but that might tip the benefits/troubles
>> ratio towards keeping the status quo.
>>
>> This is not a "problem" I have to solve today, and it might not be
>> considered a problem by others (just an inconsistency, I guess): as a
>> colleague of mine pointed out, GHC is not necessarily "lying" here. It's
>> still the same underlying extension; it just happens that there are two
>> names that refer to it.
>>
>> Perhaps I could think about adding to `GhcHint` some kind of mapping
>> which would give IDEs or third-party libs the correct extension name
>> given an input `LangExt.Extension`; the problem then becomes making sure
>> that we keep this mapping in sync with the information contained in
>> `GHC.Driver.Session`.
>>
>> I will let it simmer.
>>
>> Thanks!
>>
>> A.
>>
>> On Tue, 6 Jul 2021 at 11:19, Simon Peyton Jones wrote:
>>>
>>> 1. What prevents us from adding `NamedFieldPuns` as a proper constructor
>>> for the `Extension` type and in principle removing `RecordPuns`? Backward
>>> compatibility, I assume?
>>>
>>> You mean, essentially, rename `LangExt.RecordPuns` to `NamedFieldPuns`.
>>>
>>> I'd be fine with that. There might be back-compat issues, but only with
>>> other plugins, and probably with vanishingly few of them. Grep Hackage!
>>>
>>> Simon
>>>
>>> *From:* ghc-devs *On Behalf Of* Alfredo Di Napoli
>>> *Sent:* 06 July 2021 10:14
>>> *To:* Simon Peyton Jones via ghc-devs
>>> *Subject:* Can NamedFieldPuns be added to `GHC.LanguageExtensions.Types.Extension`?
>>>
>>> Dear all,
>>>
>>> As some of you might know, for the past few months I have been working
>>> on changing GHC's diagnostic messages from plain SDocs to richer Haskell
>>> types.
>>>
>>> As part of this work, I have added a mechanism to embed hints into
>>> diagnostics, defined in `GHC.Types.Hint` in `HEAD`. One of the main
>>> workhorses of this `GhcHint` type is the `SuggestExtension
>>> LangExt.Extension` constructor, which embeds the extension to enable in
>>> order to use a particular feature. The `LangExt.Extension` type comes from
>>> `GHC.LanguageExtensions.Types`, and up until now there has always been a
>>> 1:1 mapping between the language pragma for an extension and the type
>>> itself.
>>>
>>> Today I was working on turning this error into a proper Haskell type:
>>>
>>> badPun :: Located RdrName -> TcRnMessage
>>> badPun fld = TcRnUnknownMessage $ mkPlainError noHints $
>>>   vcat [ text "Illegal use of punning for field" <+> quotes (ppr fld)
>>>        , text "Use NamedFieldPuns to permit this" ]
>>>
>>> I was ready to yield a `SuggestExtension LangExt.NamedFieldPuns` when I
>>> discovered that there is no `NamedFieldPuns` constructor. Rather, there is
>>> a `RecordPuns` constructor, which refers to a deprecated flag, and we
>>> simply map `NamedFieldPuns` back to it in `GHC.Driver.Session`:
>>>
>>> ...
>>> depFlagSpec' "RecordPuns" LangExt.RecordPuns
>>>   (deprecatedForExtension "NamedFieldPuns"),
>>> ...
>>> flagSpec "NamedFieldPuns" LangExt.RecordPuns,
>>> ...
>>>
>>> This is problematic for the `GhcHint` type, because if I were to yield
>>> `SuggestExtension LangExt.RecordPuns` to the user, I could still
>>> pretty-print the suggestion to turn `RecordPuns` into `NamedFieldPuns`,
>>> but this means that IDEs or third-party libraries would have access to the
>>> "raw" Haskell datatype, and at that point they would be stuck with a
>>> suggestion to enable a deprecated extension! (Or, best case, they would
>>> have to transform the suggestion into something more sensible, which
>>> partially defeats the point of this refactoring work I have been doing.)
>>>
>>> I am not sure this behaviour is unique to just `NamedFieldPuns`, but my
>>> question