RE: Again: Uniques in GHC
Some of those clarifying points helped a *great* deal. Thanks. I've addressed comments / questions and linked from KeyTypes. Ph. From: Simon Peyton Jones simo...@microsoft.com Sent: 09 October 2014 22:36 To: Holzenspies, P.K.F. (EWI); carter.schonw...@gmail.com Cc: ghc-devs@haskell.org Subject: RE: Again: Uniques in GHC Thank you. A most helpful beginning. I have added some comments and queries, as well as clarifying some points. Simon From: p.k.f.holzensp...@utwente.nl [p.k.f.holzensp...@utwente.nl] Sent: 09 October 2014 12:39 To: Simon Peyton Jones; carter.schonw...@gmail.com Cc: ghc-devs@haskell.org Subject: RE: Again: Uniques in GHC Dear Simon, et al, I've created the wiki-page about the Unique-patch [1]. Should it be linked to from the KeyDataTypes [2]? Regards, Philip [1] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Unique [2] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/KeyDataTypes From: Simon Peyton Jones simo...@microsoft.com Sent: 07 October 2014 23:23 To: Holzenspies, P.K.F. (EWI); carter.schonw...@gmail.com Cc: ghc-devs@haskell.org Subject: RE: Again: Uniques in GHC One of the things I'm finding difficult about this Phab stuff is that I get presented with lots of code without enough supporting text saying * What problem is this patch trying to solve? * What is the user-visible design (for language features)? * What are the main ideas in the implementation? The place we usually put such design documents is on the GHC Trac Wiki. Email is ok for discussion, but the wiki is FAR better for stating clearly the current state of play. Philip, might you make such a page for this unique stuff? To answer some of your specific questions (please include the answers in the wiki page in some form): * Uniques are never put in .hi files (as far as I know). They do not survive a single invocation of GHC. * However with ghc --make, or ghci, uniques do survive for the entire invocation of GHC.
For example in ghc --make, uniques assigned when compiling module A should not clash with those for module B. * Yes, TyCons and DataCons must have separate uniques. We often form sets of Names, which contain both TyCons and DataCons. Let's not mess with this. * Having unique-supply-splitting as a pure function is so deeply embedded in GHC that I could not hazard a guess as to how difficult it would be to IO-ify it. Moreover, I would regret doing so because it would force sequentiality where none is needed. * Template Haskell is a completely independent Haskell library. It does not import GHC. If uniques were in their own package, then TH and GHC could share them. Ditto Hoopl. * You say that Uniques are serialised as Word32. I'm not sure why they are serialised at all! * Enforcing determinacy everywhere is a heavy burden. Instead I suppose that you could run a pass at the end to give everything a more determinate name. TidyPgm does this for the name strings, so it would probably be easy to do so for the uniques too. Simon From: ghc-devs [ghc-devs-boun...@haskell.org] on behalf of p.k.f.holzensp...@utwente.nl [p.k.f.holzensp...@utwente.nl] Sent: 07 October 2014 22:03 To: carter.schonw...@gmail.com Cc: ghc-devs@haskell.org Subject: RE: Again: Uniques in GHC Dear Carter, Simon, et al, (CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques) I agree, but to that end, a few parameters need to become clear. To this end, I've created a Phabricator-thing that we can discuss things off of: https://phabricator.haskell.org/D323 Here are my open issues: - There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know what they're for precisely.
I generally went for the module name as a starting point. I did, however, make a point of having different invocations of mkSplitUniqSupply et al all have different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If there are people that are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never cause conflict, they maybe can reduce the number of UniqueDomains. - Some UniqueDomains only have a handful of instances and seem a bit wasteful. - Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential
RE: Again: Uniques in GHC
Dear Simon, et al, I've created the wiki-page about the Unique-patch [1]. Should it be linked to from the KeyDataTypes [2]? Regards, Philip [1] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Unique [2] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/KeyDataTypes From: Simon Peyton Jones simo...@microsoft.com Sent: 07 October 2014 23:23 To: Holzenspies, P.K.F. (EWI); carter.schonw...@gmail.com Cc: ghc-devs@haskell.org Subject: RE: Again: Uniques in GHC One of the things I'm finding difficult about this Phab stuff is that I get presented with lots of code without enough supporting text saying * What problem is this patch trying to solve? * What is the user-visible design (for language features)? * What are the main ideas in the implementation? The place we usually put such design documents is on the GHC Trac Wiki. Email is ok for discussion, but the wiki is FAR better for stating clearly the current state of play. Philip, might you make such a page for this unique stuff? To answer some of your specific questions (please include the answers in the wiki page in some form): * Uniques are never put in .hi files (as far as I know). They do not survive a single invocation of GHC. * However with ghc --make, or ghci, uniques do survive for the entire invocation of GHC. For example in ghc --make, uniques assigned when compiling module A should not clash with those for module B. * Yes, TyCons and DataCons must have separate uniques. We often form sets of Names, which contain both TyCons and DataCons. Let's not mess with this. * Having unique-supply-splitting as a pure function is so deeply embedded in GHC that I could not hazard a guess as to how difficult it would be to IO-ify it. Moreover, I would regret doing so because it would force sequentiality where none is needed. * Template Haskell is a completely independent Haskell library. It does not import GHC. If uniques were in their own package, then TH and GHC could share them. Ditto Hoopl.
* You say that Uniques are serialised as Word32. I'm not sure why they are serialised at all! * Enforcing determinacy everywhere is a heavy burden. Instead I suppose that you could run a pass at the end to give everything a more determinate name. TidyPgm does this for the name strings, so it would probably be easy to do so for the uniques too. Simon From: ghc-devs [ghc-devs-boun...@haskell.org] on behalf of p.k.f.holzensp...@utwente.nl [p.k.f.holzensp...@utwente.nl] Sent: 07 October 2014 22:03 To: carter.schonw...@gmail.com Cc: ghc-devs@haskell.org Subject: RE: Again: Uniques in GHC Dear Carter, Simon, et al, (CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques) I agree, but to that end, a few parameters need to become clear. To this end, I've created a Phabricator-thing that we can discuss things off of: https://phabricator.haskell.org/D323 Here are my open issues: - There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know what they're for precisely. I generally went for the module name as a starting point. I did, however, make a point of having different invocations of mkSplitUniqSupply et al all have different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If there are people that are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never cause conflict, they maybe can reduce the number of UniqueDomains. - Some UniqueDomains only have a handful of instances and seem a bit wasteful.
- Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential for undetected overlap/conflict there? - What is the scope in which a Unique must be Unique? I.e. what if independently compiled modules have overlapping Uniques (for different Ids) in their hi-files? Also, do TyCons and DataCons really need to have guaranteed different Uniques? Shouldn't the parser/renamer figure out what goes where and raise errors on domain violations? - There seem to be related-but-different Unique implementations in Template Haskell and Hoopl. Why is this? - How critical is it to let mkUnique (and mkSplitUniqSupply) be pure functions? If they can be IO, we could greatly simplify the management of (un)generated Uniques in each UniqueDomain and quite possibly make the move to a threaded
RE: Tentative high-level plans for 7.10.1
I’m with John wrt. the discussions on LTS and the 7.8.4 release being orthogonal. Especially if 7.8 does not have submodules and if this is a pain, there’s also no reason to backport our approach to LTS into 7.8. In other words, 7.10 could also be the first LTS version. Ph. From: John Lato [mailto:jwl...@gmail.com] Sent: Wednesday 8 October 2014 18:22 To: Edward Z. Yang Cc: ghc-devs@haskell.org; Simon Marlow Subject: Re: Tentative high-level plans for 7.10.1 Speaking for myself, I don't think the question of doing a 7.8.4 release at all needs to be entangled with the LTS issue. On Wed, Oct 8, 2014 at 8:23 AM, Edward Z. Yang ezy...@mit.edu wrote: Excerpts from Herbert Valerio Riedel's message of 2014-10-08 00:59:40 -0600: However, should GHC 7.8.x turn out to become a LTS-ishly maintained branch, we may want to consider converting it to a similar Git structure as GHC HEAD currently is, to avoid having to keep two different sets of instructions on the GHC Wiki for how to work on GHC 7.8 vs working on GHC HEAD/7.10 and later. Emphatically yes. Lack of submodules on the 7.8 branch makes working with it /very/ unpleasant. Edward ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
RE: Again: Uniques in GHC
Wait, wait, wait! I wasn't talking about a parallel *runtime*. Nothing changes there. All I'm talking about is something that is a very old issue that never got added / solved / resolved. Somewhere on the commentary, or the mailing list, I seem to recall that the generation of Uniques was the bottleneck for the parallelisation of GHC *itself*. It's about having a compiler using multiple threads and says nothing about programs coming out of it. I'm all with you on embedded processors and that kind of stuff, but I don't see a pressing need to compile *on* them. Isn't all ARM-stuff assuming cross-compilation? Ph. From: mad@gmail.com mad@gmail.com on behalf of Austin Seipp aus...@well-typed.com Sent: 07 October 2014 17:46 To: Holzenspies, P.K.F. (EWI) Cc: ghc-devs@haskell.org Subject: Re: Again: Uniques in GHC On Tue, Oct 7, 2014 at 1:32 AM, p.k.f.holzensp...@utwente.nl wrote: Yes, this approach to a parallel GHC would only work on 64-bit machines. The idea is, I guess, that we're not going to see a massive demand for parallel GHC running on multi-core 32-bit systems. In other words; 32-bit systems wouldn't get a parallel GHC. Let me make sure I'm understanding this correctly: in this particular proposed solution, the side effect would be that we no longer have a capable 32bit runtime which supports multicore parallelism? Sorry, but I'm afraid this approach is pretty much unacceptable IMO, for precisely the reason outlined in your last sentence. 32bit systems are surprisingly common. I have several multicore 32bit ARMv7 machines on my desk right now, for example. And there are a lot more of those floating around than you might think. If that's the 'cure', I think I (and other users) would consider it far worse than the disease.
Regards, Philip -- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/
RE: Again: Uniques in GHC
From: mad@gmail.com mad@gmail.com on behalf of Austin Seipp aus...@well-typed.com So I assume your change would mean 'ghc -j' would not work for 32bit. I still consider this a big limitation, one which is only due to an implementation detail. But we need to confirm this will actually fix any bottlenecks first though before getting to that point. Yes, that's what I'm saying. Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC-versions be parallel later on, it just doesn't solve the problem. It depends to what extent the fully deterministic behaviour bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; don't recall the details). Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on. Ph.
RE: Again: Uniques in GHC
Dear Carter, Simon, et al, (CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques) I agree, but to that end, a few parameters need to become clear. To this end, I've created a Phabricator-thing that we can discuss things off of: https://phabricator.haskell.org/D323 Here are my open issues: - There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know what they're for precisely. I generally went for the module name as a starting point. I did, however, make a point of having different invocations of mkSplitUniqSupply et al all have different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If there are people that are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never cause conflict, they maybe can reduce the number of UniqueDomains. - Some UniqueDomains only have a handful of instances and seem a bit wasteful. - Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential for undetected overlap/conflict there? - What is the scope in which a Unique must be Unique? I.e. what if independently compiled modules have overlapping Uniques (for different Ids) in their hi-files? Also, do TyCons and DataCons really need to have guaranteed different Uniques? Shouldn't the parser/renamer figure out what goes where and raise errors on domain violations? - There seem to be related-but-different Unique implementations in Template Haskell and Hoopl. Why is this? 
- How critical is it to let mkUnique (and mkSplitUniqSupply) be pure functions? If they can be IO, we could greatly simplify the management of (un)generated Uniques in each UniqueDomain and quite possibly make the move to a threaded GHC easier (for what that's worth). Also, this may help solve the non-determinism issues. - Missing haddocks, failing lints (lines too long) and a lot of cosmetics will be met when the above points have become a tad more clear. I'm more than happy to document a lot of the answers to the above stuff in Unique and/or commentary. Regards, Philip From: Carter Schonwald carter.schonw...@gmail.com Sent: 07 October 2014 21:30 To: Holzenspies, P.K.F. (EWI) Cc: Austin Seipp; ghc-devs@haskell.org Subject: Re: Again: Uniques in GHC in some respects, having fully deterministic builds is a very important goal: a lot of tooling for eg, caching builds of libraries works much much better if you have that property :) On Tue, Oct 7, 2014 at 12:45 PM, p.k.f.holzensp...@utwente.nl wrote: From: mad@gmail.com mad@gmail.com on behalf of Austin Seipp aus...@well-typed.com So I assume your change would mean 'ghc -j' would not work for 32bit. I still consider this a big limitation, one which is only due to an implementation detail. But we need to confirm this will actually fix any bottlenecks first though before getting to that point. Yes, that's what I'm saying. Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC-versions be parallel later on, it just doesn't solve the problem. It depends to what extent the fully deterministic behaviour bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; don't recall the details).
Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on. Ph.
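A minimal sketch of the UniqueDomain idea raised in this thread, in which a sum type replaces the ad hoc tag characters. This is not the code from D323. The constructor subset, the tag characters, the layout (an 8-bit tag above a 24-bit counter) and the helper unpkUnique are all assumptions made for illustration.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Hypothetical subset of domains; names follow the examples in the mail.
data UniqueDomain = HscMainA | HscMainB | HscMainC | ByteCodeGen
  deriving (Eq, Show, Enum, Bounded)

-- One tag character per domain. Keeping every character in this one
-- function makes accidental reuse of a tag easy to spot.
-- (The characters themselves are arbitrary choices for this sketch.)
uniqueDomainChar :: UniqueDomain -> Char
uniqueDomainChar d = case d of
  HscMainA    -> 'a'
  HscMainB    -> 'b'
  HscMainC    -> 'c'
  ByteCodeGen -> 'y'

-- Boxed representation, as proposed elsewhere in the thread.
newtype Unique = Unique Int
  deriving (Eq, Ord, Show)

-- Assumed layout: tag character in bits 24..31, counter in bits 0..23.
mkUnique :: UniqueDomain -> Int -> Unique
mkUnique d n =
  Unique ((ord (uniqueDomainChar d) `shiftL` 24) .|. (n .&. 0xFFFFFF))

-- Recover the tag character and the counter from a Unique.
unpkUnique :: Unique -> (Char, Int)
unpkUnique (Unique u) = (chr ((u `shiftR` 24) .&. 0xFF), u .&. 0xFFFFFF)
```

With this layout two uniques with the same counter but different domains never compare equal, which is the property the separate HscMainA through HscMainC constructors are meant to guarantee.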
RE: Tentative high-level plans for 7.10.1
I don't know whether this has ever been considered as an idea, but what about having a notion of Long Term Support version (similar to how a lot of processor and operating systems vendors go about this). The idea behind an LTS-GHC would be to continue bug-fixing on the LTS-version, even if newer major versions no longer get bug-fixing support. To some extent, there will be redundancies (bugs that have disappeared in newer versions because newer code does the same and more, still needing to be fixed on the LTS code base), but the upside would be a clear prioritisation between stability (LTS) and innovation (latest major release). The current policy for feature *use* in the GHC code-base is that they're supported in (at least) three earlier major release versions. Should we go the LTS-route, the logical choice would be to demand the latest LTS-version. The danger, of course, is that people aren't very enthusiastic about bug-fixing older versions of a compiler, but for language/compiler-uptake, this might actually be a Better Way. Thoughts? Ph. From: John Lato jwl...@gmail.com Sent: 06 October 2014 01:10 To: Johan Tibell Cc: Simon Marlow; ghc-devs@haskell.org Subject: Re: Tentative high-level plans for 7.10.1 Speaking as a user, I think Johan's concern is well-founded. For us, ghc-7.8.3 was the first of the 7.8 line that was really usable in production, due to #8960 and other bugs. Sure, that can be worked around in user code, but it takes some time for developers to locate the issues, track down the bug, and implement the workaround. And even 7.8.3 has some bugs that cause minor annoyances (either ugly workarounds or intermittent build failures that I haven't had the time to debug); it's definitely not solid. Similarly, 7.6.3 was the first 7.6 release that we were able to use in production. I'm particularly concerned about ghc-7.10 as the AMP means there will be significant lag in identifying new bugs (since it'll take time to update codebases for that major change). 
For the curious, within the past few days we've seen all the following, some multiple times, all so far intermittent: ghc: panic! (the 'impossible' happened) (GHC version 7.8.3.0 for x86_64-unknown-linux): kindFunResult ghc-prim:GHC.Prim.*{(w) tc 34d} ByteCodeLink.lookupCE During interactive linking, GHCi couldn't find the following symbol: some_mangled_name_closure ghc: mmap 0 bytes at (nil): Invalid Argument internal error: scavenge_one: strange object 2022017865 Some of these I've mapped to likely ghc issues, and some are fixed in HEAD, but so far I haven't had an opportunity to put together reproducible test cases. And that's just bugs that we haven't triaged yet, there are several more for which workarounds are in place. John L. On Sat, Oct 4, 2014 at 2:54 PM, Johan Tibell johan.tib...@gmail.com wrote: On Fri, Oct 3, 2014 at 11:35 PM, Austin Seipp aus...@well-typed.com wrote: - Cull and probably remove the 7.8.4 milestone. - Simply not enough time to address almost any of the tickets in any reasonable timeframe before 7.10.1, while also shipping them. - Only one, probably workaroundable, not game-changing bug (#9303) marked for 7.8.4. - No particular pressure on any outstanding bugs to release immediately. - ANY release would be extremely unlikely, but if so, only backed by the most critical of bugs. - We will move everything in 7.8.4 milestone to 7.10.1 milestone. - To accurately catalogue what was fixed. - To eliminate confusion. #8960 looks rather serious and potentially makes all of 7.8 a no-go for some users. I'm worried that we're (in general) pushing too many bug fixes towards future major versions. Since major versions tend to add new bugs, we risk getting into a situation where no major release is really solid.
RE: Again: Uniques in GHC
Very much part of my plan, Johan! I was a fervent +1 on that recommendation. Ph. From: Johan Tibell johan.tib...@gmail.com Sent: 06 October 2014 12:06 To: Holzenspies, P.K.F. (EWI) Cc: ghc-devs@haskell.org Subject: Re: Again: Uniques in GHC On Mon, Oct 6, 2014 at 11:58 AM, p.k.f.holzensp...@utwente.nl wrote: - The export-list of Unique has some comments stating that function X is only exported for module Y, yet is used elsewhere. This may be because these comments do not show up in haddock etc. leading some people to think they're up for general use. In my refactoring, I'm sticking the restriction in the function name, so it's no longer mkUniqueGrimily, but rather mkUniqueOnlyForUniqSupply (making the name even longer should discourage their use more). If at all possible, these should be removed altogether asap. Since you're touching this code base it would be a terrific time to add some Haddocks! (We recently decided, on the ghc-devs@ list, that all new top-level entities, i.e. functions, data types, and classes, should have Haddocks.)
RE: Show instance for SrcSpan
The way I read Alan's earlier mail is precisely that; auto-generated Show does what he wants (show the entire AST), whereas Outputable hides too much information. I very much understand his frustration with having to manually figure out what constructors and datatypes go where in a compiled program. Alan's point was the *absence* of auto derived Show instances and, in the case of SrcSpan, too much verbosity (rather than wanting stuff to be incomplete). Allowing some bespoke stuff to reduce the noise of something like record field names for SrcSpan makes even more sense in this context. Similarly, this is why Alan and I want everything to have Data instances, so you can (amongst many other nice things) selectively print parts of the AST. Ph. From: Alan Kim Zimmerman alan.z...@gmail.com Sent: 06 October 2014 15:15 To: Mateusz Kowalczyk Cc: ghc-devs@haskell.org Subject: Re: Show instance for SrcSpan True, but if you are using GHC generated stuff via the GHC API you sometimes do not want to have to implement Outputable for all your app types, when you can auto derive Show which mostly does what you need. On Mon, Oct 6, 2014 at 3:11 PM, Mateusz Kowalczyk fuuze...@fuuzetsu.co.uk wrote: On 10/06/2014 01:59 PM, Alan Kim Zimmerman wrote: Is there any reason I can't put in a diff request to replace the derived Show instance for SrcSpan with a handcrafted one that does not exhaustively list the constructors, making it more readable? Alan Why? If you're looking for pretty output then you should be changing Outputable. -- Mateusz K.
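To illustrate the point about Data instances allowing selective printing, here is a small sketch using only Data.Data from base. The Span and Expr types are toy stand-ins, not GHC's SrcSpan or HsSyn, and showData is a hypothetical helper written for this example.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, cast, gmapQ, showConstr, toConstr)

-- Toy stand-ins for a source span and an AST (assumptions, not GHC types).
data Span = Span Int Int
  deriving (Data, Show)

data Expr
  = Var Span String
  | App Span Expr Expr
  deriving (Data, Show)

-- Generic printer: shows every constructor in the tree, but collapses
-- spans to a placeholder and prints strings as ordinary literals.
showData :: Data a => a -> String
showData x
  | Just s <- cast x = show (s :: String)
  | Just _ <- (cast x :: Maybe Span) = "<span>"
  | otherwise =
      case gmapQ showData x of
        []   -> showConstr (toConstr x)
        kids -> "(" ++ unwords (showConstr (toConstr x) : kids) ++ ")"
```

Loaded in GHCi, showData (App (Span 1 1) (Var (Span 1 1) "f") (Var (Span 1 3) "x")) yields the string (App <span> (Var <span> "f") (Var <span> "x")). The derived Show instance would instead spell out every Span in full, which is the verbosity the thread is about.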
RE: Again: Uniques in GHC
Dear Joachim, Although I can't quite get what you're saying from the posts on that link, I'm not immediately sure what you're saying should extend to hi-files. These files are very much specific to the compiler version you're using, as in, new GHCs add stuff to them all the time and their binary format does not (seem to) provision for being able to skip unknown things (i.e. it doesn't say how large the next semantic block is in the hi-file). If we're going to keep the formats the same for any architecture, we're going to have to limit 64-bit machines to 32-bit (actually 30-bits, another thing I don't quite understand in BinIface) Uniques. There seem to be possibilities to alleviate the issues with parallel generation of fresh Uniques in a parallel version of GHC. The idea is that, since 64-bits is more than we'll ever assign anyway, to use a few for thread-ids, so we would guarantee non-conflicting Uniques generated by different threads. Anyway, maybe someone a tad more knowledgeable about Uniques could maybe tell me on what scale Uniques in the hi-files should be unique? Must they only be non-conflicting in a Module? In a Package? If I first compile a file with GHC and then, in a separate invocation of GHC, compile another, surely their hi-files will have some of the same Uniques for their own, different things? Where are these conflicts resolved when linking multiple independently compiled files? Are they ever? Regards, Philip From: Joachim Breitner m...@joachim-breitner.de Sent: 06 October 2014 12:36 Subject: Re: Again: Uniques in GHC snip A while ago we had problems with haddock in Debian when the serialization became bit-dependent.^1 I suggest to keep the specification of any on-disk format independent of architecture specifics. Greetings, Joachim ^1 http://bugs.debian.org/586723#15
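The thread-id idea in this message can be sketched as follows: reserve the top few bits of a 64-bit unique for the generating thread and leave the rest as a per-thread counter. The 8-bit split and all function names below are assumptions for illustration only; nothing here reflects how GHC actually lays out its Uniques.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

-- Assumed split: 8 bits of thread id, 56 bits of per-thread counter.
threadBits :: Int
threadBits = 8

counterBits :: Int
counterBits = 64 - threadBits

counterMask :: Word64
counterMask = (1 `shiftL` counterBits) - 1

-- Combine a thread id and that thread's local counter into one value.
-- Thread ids that do not fit in threadBits lose their high bits.
mkThreadUnique :: Word64 -> Word64 -> Word64
mkThreadUnique tid n = (tid `shiftL` counterBits) .|. (n .&. counterMask)

-- Which thread generated this unique?
uniqueThread :: Word64 -> Word64
uniqueThread u = u `shiftR` counterBits

-- The thread-local counter part of the unique.
uniqueCounter :: Word64 -> Word64
uniqueCounter u = u .&. counterMask
```

Two threads can then hand out counters independently without synchronising, since values with different thread ids can never collide. On a 32-bit word the same split would leave far fewer counter bits, which is the limitation discussed elsewhere in the thread.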
RE: GHC AST Annotations
Dear Alan, Nice going and thanks for undertaking yet another useful AST transformation! A few thoughts (do with them as you see fit): - Always called ann; doesn't this require OverloadedRecordFields? You're in danger of delaying your modification (scheduled to land in 7.10). Other than that, as before, from a design perspective: yes please. - In terms of presentation/comments; when I first started looking at (i.e. traversing, selectively printing etc.) the AST, I was always really annoyed that every child in the tree has one extra step of indirection, due to the location annotations being L loc thing, as opposed to a loc-field as part of the thing. I would simply call it annotation (no talk of external tool writers). In time, I hope GHC-annotations also move to that field. Regards, Philip From: Alan Kim Zimmerman alan.z...@gmail.com Sent: 23 September 2014 20:57 To: Richard Eisenberg Cc: ghc-devs@haskell.org Subject: Re: GHC AST Annotations I have created https://ghc.haskell.org/trac/ghc/ticket/9628 for this, and have decided to first tackle adding a type parameter to the entire AST, so that tool writers can add custom information as required. My first stab at this is to do it as follows
```
data HsModule r name
  = HsModule {
      ann :: r,
        -- ^ Annotation for external tool writers
      hsmodName :: Maybe (Located ModuleName),
        -- ^ @Nothing@: \"module X where\" is omitted (in which case the next
        --     field is Nothing too)
      hsmodExports :: Maybe [LIE name],
```
Salient points 1. It comes as the first type parameter, and is called r 2. It gets added as the first field of the syntax element 3. It is always called ann Before undertaking this particular change, I would appreciate some feedback. Regards Alan On Thu, Aug 28, 2014 at 8:34 PM, Alan Kim Zimmerman alan.z...@gmail.com wrote: This does have the advantage of being explicit.
I modelled the initial proposal on HSE as a proven solution, and I think that they were trying to keep it non-invasive, to allow both an annotated and non-annotated AST. I think the key question is whether it is acceptable to sprinkle this kind of information throughout the AST. For someone interested in source-to-source conversions (like me) this is great, others may find it intrusive. The other question, which is probably orthogonal to this, is whether we want the annotation to be a parameter to the AST, which allows it to be overridden by various tools for various purposes, or fixed as in Richard's suggestion. A parameterised annotation allows the annotations to be manipulated via something like this, from HSE:

    -- |AST nodes are annotated, and this class allows manipulation of the annotations.
    class Functor ast => Annotated ast where
        -- |Retrieve the annotation of an AST node.
        ann :: ast l -> l
        -- |Change the annotation of an AST node. Note that only the annotation of the node itself is affected, and not
        --  the annotations of any child nodes. if all nodes in the AST tree are to be affected, use fmap.
        amap :: (l -> l) -> ast l -> ast l

Alan On Thu, Aug 28, 2014 at 7:11 PM, Richard Eisenberg e...@cis.upenn.edu wrote: For what it's worth, my thought is not to use SrcSpanInfo (which, to me, is the wrong way to slice the abstraction) but instead to add SrcSpan fields to the relevant nodes. For example:

    | HsDo SrcSpan              -- of the word "do"
           BlockSrcSpans
           (HsStmtContext Name) -- The parameterisation is unimportant
                                -- because in this context we never use
                                -- the PatGuard or ParStmt variant
           [ExprLStmt id]       -- "do":one or more stmts
           PostTcType           -- Type of the whole expression
    ...
    data BlockSrcSpans = LayoutBlock Int -- the parameter is the indentation level
                                     ... -- stuff to track the appearance of any semicolons
                       | BracesBlock ... -- stuff to track the braces and semicolons

The way I understand it, the SrcSpanInfo proposal means that we would have lots of empty SrcSpanInfos, no? Most interior nodes don't need one, I think. Popping up a level, I do support the idea of including this info in the AST. Richard On Aug 28, 2014, at 11:54 AM, Simon Peyton Jones simo...@microsoft.com wrote: In general I’m fine with this direction of travel. Some specifics: · You’d have to be careful to document, for every data constructor in HsSyn, what the association is between the [SrcSpan] in the SrcSpanInfo and the “sub-entities” · Many of the sub-entities will have their own SrcSpanInfo wrapped around them, so there’s some unhelpful duplication. Maybe you only want the SrcSpanInfo to list the [SrcSpan]s for the sub-entities (like the syntactic keywords)
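A small sketch of how an HSE-style Annotated class behaves on an AST that is parameterised over its annotation type. The Expr type here is a toy stand-in, not GHC's HsSyn or haskell-src-exts' syntax tree, and the instances are written by hand for illustration.

```haskell
-- Toy AST with the annotation as the first field of every constructor.
data Expr l
  = Var l String
  | App l (Expr l) (Expr l)
  deriving (Eq, Show)

-- fmap rewrites the annotation on every node of the tree.
instance Functor Expr where
  fmap f (Var l v)   = Var (f l) v
  fmap f (App l a b) = App (f l) (fmap f a) (fmap f b)

-- Same shape as the class quoted in the thread.
class Functor ast => Annotated ast where
  -- Retrieve the annotation of the top node.
  ann  :: ast l -> l
  -- Change the annotation of the top node only.
  amap :: (l -> l) -> ast l -> ast l

instance Annotated Expr where
  ann (Var l _)   = l
  ann (App l _ _) = l
  amap f (Var l v)   = Var (f l) v
  amap f (App l a b) = App (f l) a b
```

The difference between the two operations is the point of the design: amap touches a single node, while fmap reaches the children too, so a tool can swap in its own annotation type across the whole tree with one fmap.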
RE: Unique as special boxing type hidden constructors
Dear Simon, The point is to have newtype Unique = Unique Int where we use the boxing of Int, instead of creating our own boxing. Actually, it seems useful to move to newtype Unique = Unique Word (see other discussions about unnecessary signedness). I've been working on this (although only as a side-project, so progress is very slow) and I've discovered a lot of API-out-of-sync-ness; there are comments stating we don't export mkUnique, so that we can keep track of all the Chars we use. Unfortunately, we *do* export mkUnique from Unique and we do *not* have consistent use of Chars everywhere. I'm working to replace the Char-mechanism with a (rather straightforward) sum-type UniqueDomain. This should also help get a more consistent treatment of serialisation (the one in the module Unique is *slightly* different from the one in BinIface). I'm still not quite sure how to do the performance tests on the actual compilation (i.e. runtime of GHC itself). If anything, moving Uniques to a higher abstraction (coerced boxed values, instead of manually boxed stuff) is actually a good litmus test of how far GHC's optimisations have come since '96 ;) If you have any more input, especially on performance stuff (what would be the worst acceptable performance hit and measured on what, for example), it would be *very* welcome! Regards, Philip From: Simon Marlow marlo...@gmail.com Sent: 04 September 2014 11:49 To: Edward Z. Yang; Holzenspies, P.K.F. (EWI) Cc: ghc-devs Subject: Re: Unique as special boxing type hidden constructors FastInt = Int#, so newtype doesn't work here. Cheers, Simon On 15/08/2014 14:01, Edward Z. Yang wrote: The definition dates back to 1996, so it seems plausible that newtype is the way to go now. 
Edward Excerpts from p.k.f.holzenspies's message of 2014-08-15 11:52:47 +0100: Dear all, I'm working with Alan to instantiate everything for Data.Data, so that we can do better SYB-traversals (which should also help newcomers significantly to get into the GHC code base). Alan's looking at the AST types, I'm looking at the basic types in the compiler. Right now, I'm looking at Unique and two questions come up: data Unique = MkUnique FastInt 1) As someone already commented: Is there a specific reason (other than history) that this isn't simply a newtype around an Int? If we're boxing anyway, we may as well use the default Int boxing and newtype-coerce to the specific purpose of Unique, no? 2) As a general question for GHC hacking style; what is the reason for hiding the constructors in the first place? I understand about abstraction and there are reasons for hiding, but there's a public GHC API and then there are all these modules that people can import at their own peril. Nothing is guaranteed about their consistency from version to version of GHC. I don't really see the point about hiding constructors (getting in the way of automatically deriving things) and then giving extra functions like (in the case of Unique): getKeyFastInt (MkUnique x) = x mkUniqueGrimily x = MkUnique (iUnbox x) I would propose to just make Unique a newtype for an Int and making the constructor visible. Regards, Philip ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
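As a rough illustration of the direction described in this thread (a newtype over a machine word plus a sum-type for the domain), here is a sketch. It assumes a fixed 64-bit word and keeps the domain in the top 8 bits, mirroring where the Char currently lives; the constructor names and the helper names `domainBits`, `payloadBits` and `payloadMask` are invented for the example and are not the contents of the actual patch.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

-- | Hypothetical domains; the real list in GHC is longer and named differently.
data UniqueDomain = UniqDesugarer | UniqSimplStg | UniqNativeCodeGen
  deriving (Eq, Show, Enum, Bounded)

-- | A newtype reuses the boxing of the underlying word type.
--   Word64 is an assumption; the thread only says "Word".
newtype Unique = Unique Word64
  deriving (Eq, Ord, Show)

-- | Number of high bits reserved for the domain (same as the old Char slot).
domainBits :: Int
domainBits = 8

-- | Number of low bits left for the counter.
payloadBits :: Int
payloadBits = 64 - domainBits

-- | Mask selecting the counter bits.
payloadMask :: Word64
payloadMask = (1 `shiftL` payloadBits) - 1

-- | Build a Unique from a domain and a counter value.
mkUnique :: UniqueDomain -> Word64 -> Unique
mkUnique d n =
  Unique ((fromIntegral (fromEnum d) `shiftL` payloadBits) .|. (n .&. payloadMask))

-- | Split a Unique back into its domain and counter.
unpkUnique :: Unique -> (UniqueDomain, Word64)
unpkUnique (Unique w) =
  (toEnum (fromIntegral (w `shiftR` payloadBits)), w .&. payloadMask)
```

Because the domain is an ordinary data type, the compiler can warn about unhandled cases, which a free-form Char cannot offer.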
RE: Suggestion for GHC System User's Guide documentation change
Dear Howard,

Yes, emphatically so! Any examples should be copy-paste-runnable, if reasonably possible, without any further switches, so that means the pragmas *should* be included!

Regards,
Philip

From: Howard B. Golden howard_b_gol...@yahoo.com
Sent: 22 August 2014 18:47
To: Holzenspies, P.K.F. (EWI); simo...@microsoft.com; ghc-devs@haskell.org
Subject: Re: Suggestion for GHC System User's Guide documentation change

p.k.f.,

I like your less verbose suggestion better than my original. I don't understand your comment about code examples: Are you supporting or opposing the inclusion of the LANGUAGE pragmas in the examples?

Howard

From: p.k.f.holzensp...@utwente.nl p.k.f.holzensp...@utwente.nl
To: simo...@microsoft.com; howard_b_gol...@yahoo.com; ghc-devs@haskell.org
Sent: Friday, August 22, 2014 5:38 AM
Subject: RE: Suggestion for GHC System User's Guide documentation change

Marginally less verbose; why not use the language extension *only* in running text? Preferably with a link to the documentation of that language extension. In your example:

| The language extension <ref>UnicodeSyntax</ref> enables Unicode characters to be
| used to stand for certain ASCII character sequences.

With regards to code examples: Ideally any explicit code example could just be copy-pasted into a .hs-file and loaded into ghci / compiled with ghc without special switches.

Just my two cents ;)

Ph.
RE: Suggestion for GHC System User's Guide documentation change
Marginally less verbose; why not use the language extension *only* in running text? Preferably with a link to the documentation of that language extension. In your example:

| The language extension <ref>UnicodeSyntax</ref> enables Unicode characters to be
| used to stand for certain ASCII character sequences.

With regards to code examples: Ideally any explicit code example could just be copy-pasted into a .hs-file and loaded into ghci / compiled with ghc without special switches.

Just my two cents ;)

Ph.

From: Simon Peyton Jones simo...@microsoft.com
Sent: 22 August 2014 09:37
To: Howard B. Golden; ghc-devs@haskell.org
Subject: RE: Suggestion for GHC System User's Guide documentation change

I'd be ok with this. It's a bit more verbose, but if it's less confusing for our users, then go for it. Thanks for offering to make a patch!

Simon

| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of
| Howard B. Golden
| Sent: 21 August 2014 22:30
| To: ghc-devs@haskell.org
| Subject: Suggestion for GHC System User's Guide documentation change
|
| I suggest changing the User's Guide extensions documentation to
| consistently use the LANGUAGE pragma form to specify extensions and
| code examples, rather than a combination of LANGUAGE pragmas and
| -XExtension flags. I find the combination of the two confusing. Also,
| the reader copying code examples which require a specific LANGUAGE to
| compile will be assisted by including the LANGUAGE pragma in the code
| examples.
|
| For example, in section 7.3, I would change:
|
| 7.3. Syntactic extensions
| 7.3.1. Unicode syntax
|
| The language extension -XUnicodeSyntax enables Unicode characters to be
| used to stand for certain ASCII character sequences.
|
| To:
|
| 7.3. Syntactic extensions
| 7.3.1. Unicode syntax
|
| The language extension {-# LANGUAGE UnicodeSyntax #-} enables Unicode
| characters to be used to stand for certain ASCII character sequences.
|
| Similarly, I would include the required LANGUAGE pragma(s) in _all_
| code examples. For example, in section 7.3.7, I would change:
|
| type Typ
|
| data TypView = Unit
|              | Arrow Typ Typ
|
| view :: Typ -> TypView
|
| -- additional operations for constructing Typ's ...
|
| To:
|
| {-# LANGUAGE ViewPatterns #-}
| type Typ
|
| data TypView = Unit
|              | Arrow Typ Typ
|
| view :: Typ -> TypView
|
| -- additional operations for constructing Typ's ...
|
| I realize that LANGUAGE pragmas must be in file headers. While it is
| possible that users may be confused if they try to put pragmas in the
| body of a source file, I believe this will be outweighed by the benefit
| of making the examples clearer about the extensions necessary to use
| them.
|
| If this change is accepted, I volunteer to make the necessary
| documentation patches to implement it.
|
| Howard B. Golden
| Northridge, CA USA
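To make the "copy-paste-runnable" goal from this thread concrete, here is a small sketch of what such a documentation example could look like. The functions `firstWord` and `greet` are made up for the illustration and do not come from the User's Guide; the point is only that the pragma travels with the code, so the file loads in ghci without extra flags.

```haskell
{-# LANGUAGE ViewPatterns #-}

-- | Hypothetical helper: the first whitespace-separated word, if any.
firstWord :: String -> Maybe String
firstWord s = case words s of
  (w:_) -> Just w
  []    -> Nothing

-- | The view pattern applies 'firstWord' to the argument and matches
--   on the result, so no intermediate binding is needed.
greet :: String -> String
greet (firstWord -> Just name) = "Hello, " ++ name
greet _                        = "Hello, stranger"
```

Without the first line, GHC rejects the `->` inside the pattern, which is the kind of failure a reader copying a pragma-less example would run into.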
RE: Unique as special boxing type hidden constructors
Dear Simon, et al, I seem to recall that the Unique(Supply) was an issue in parallelising GHC itself. There's a comment in the code (signed JSM) that there aren't any 64-bit bugs, if we have at least 32-bits for Ints and Chars fit in 8 characters. Then, there's bitmasks like 0x00FF to separate the Int-part from the Char-part. I was wondering; if we move Uniques to 64 bits, but use the top 16 (instead of the current 8) for *both* the tag (currently a Char, soon an sum-type) and the threadId of the supplying thread of a Unique, would that help? Regards, Philip From: Simon Peyton Jones simo...@microsoft.com Sent: 18 August 2014 23:29 To: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org Subject: RE: Unique as special boxing type hidden constructors 1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed? I think so, unless others yell to the contrary. 2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different classes of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like: data UniqueClass = UniqDesugarer | UniqAbsCFlattener | UniqSimplStg | UniqNativeCodeGen ... OK by me 3) Is there a reason for having functions implementing class-methods to be exported? In the case of Unique, there is pprUnique and: instance Outputable Unique where ppr = pprUnique Please don’t change this. If you want to change how pretty-printing of uniques works, and want to find all the call sites of pprUnique, it’s FAR easier to grep for pprUnique than to search for all calls of ppr, and work out which are at type Unique! (In my view) it’s usually much better not to use type classes unless you actually need overloading. 
Simon From: p.k.f.holzensp...@utwente.nl [mailto:p.k.f.holzensp...@utwente.nl] Sent: 18 August 2014 14:50 To: Simon Peyton Jones; ghc-devs@haskell.org Subject: RE: Unique as special boxing type hidden constructors Dear Simon, et al, Looking at Unique, there are a few more design choices that may be outdated, and since I'm polishing things now, anyway, I figured I could update it on more fronts. 1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed? 2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different classes of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like: data UniqueClass = UniqDesugarer | UniqAbsCFlattener | UniqSimplStg | UniqNativeCodeGen ... and a public (i.e. exported) function: mkUnique :: UniqueClass - Int - Unique ? The benefit of this would be to have more (to my taste) self-documenting code and a greater chance that documentation is updated (the list of unique supply characters in the comments is currently outdated). 3) Is there a reason for having functions implementing class-methods to be exported? In the case of Unique, there is pprUnique and: instance Outputable Unique where ppr = pprUnique Here pprUnique is exported and it is used in quite a few places where it's argument is unambiguously a Unique (so it's not to force the type) *and* ppr is used for all kinds of other types. I'm assuming this is an old choice making things marginally faster, but I would say cleaning up the API / namespace would now outweigh this margin. ? 
I will also be adding Haddock-comments, so when this is done, a review would be most welcome (I'll also be doing some similar transformations to other long-since-untouched-code).

Regards,
Philip

From: Simon Peyton Jones simo...@microsoft.com
Sent: Monday 18 August 2014 00:11
To: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org
Subject: RE: Unique as special boxing type hidden constructors

Re (1) I think this is historical. A newtype wrapping an Int should be fine. I’d be ok with that change.

Re (2), I think your question is: why does module Unique export the data type Unique abstractly, rather than exporting both the data type and its constructor. No deep reason here, but it guarantees that you can only *make* a unique from an Int by calling ‘mkUniqueGrimily’, which signals clearly that something fishy is going on. And rightly so!

Simon

From: ghc-devs
RE: Unique as special boxing type hidden constructors
Methinks a lot of the former performance considerations in Unique are out-dated (as per earlier discussion; direct use of unboxed ints etc.). An upside of using an ADT for the types of uniques is that we don't actually need to reserve 8 bits for a Char (which is committing to neither the actual number of classes, nor the nature of real Chars in Haskell). Instead, we can make a bitmask dependent on the number of classes that we actually use and stick the tag on the least-significant side of the Unique, as opposed to the most-significant (as we do now).

We want to keep things working on 32-bits, but maybe a future of parallel builds is only for 64-bits. In this case, I would suggest that the 64-bit-case looks like this:

    thread_id_bits:8 unique_id_bits:56-X tag_bits:X

whereas the 32-bit case simply has

    unique_id_bits:32-X tag_bits:X

where X is dependent on the size of the UniqueClass-sum-type (to be introduced). This would be CPP-magic'd using WORD_SIZE_IN_BITS.

Ph.

From: Simon Peyton Jones simo...@microsoft.com
Sent: 20 August 2014 13:01
To: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org
Subject: RE: Unique as special boxing type hidden constructors

Sounds like a good idea to me. Would need to think about making sure that it all still worked, somehow, on 32 bit.

S

From: p.k.f.holzensp...@utwente.nl [mailto:p.k.f.holzensp...@utwente.nl]
Sent: 20 August 2014 11:31
To: Simon Peyton Jones; ghc-devs@haskell.org
Subject: RE: Unique as special boxing type hidden constructors

Dear Simon, et al,

I seem to recall that the Unique(Supply) was an issue in parallelising GHC itself. There's a comment in the code (signed JSM) that there aren't any 64-bit bugs, if we have at least 32-bits for Ints and Chars fit in 8 bits. Then, there's bitmasks like 0x00FF to separate the Int-part from the Char-part.
I was wondering; if we move Uniques to 64 bits, but use the top 16 (instead of the current 8) for *both* the tag (currently a Char, soon an sum-type) and the threadId of the supplying thread of a Unique, would that help? Regards, Philip From: Simon Peyton Jones simo...@microsoft.commailto:simo...@microsoft.com Sent: 18 August 2014 23:29 To: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.orgmailto:ghc-devs@haskell.org Subject: RE: Unique as special boxing type hidden constructors 1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed? I think so, unless others yell to the contrary. 2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different classes of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like: data UniqueClass = UniqDesugarer | UniqAbsCFlattener | UniqSimplStg | UniqNativeCodeGen ... OK by me 3) Is there a reason for having functions implementing class-methods to be exported? In the case of Unique, there is pprUnique and: instance Outputable Unique where ppr = pprUnique Please don’t change this. If you want to change how pretty-printing of uniques works, and want to find all the call sites of pprUnique, it’s FAR easier to grep for pprUnique than to search for all calls of ppr, and work out which are at type Unique! (In my view) it’s usually much better not to use type classes unless you actually need overloading. 
Simon From: p.k.f.holzensp...@utwente.nlmailto:p.k.f.holzensp...@utwente.nl [mailto:p.k.f.holzensp...@utwente.nl] Sent: 18 August 2014 14:50 To: Simon Peyton Jones; ghc-devs@haskell.orgmailto:ghc-devs@haskell.org Subject: RE: Unique as special boxing type hidden constructors Dear Simon, et al, Looking at Unique, there are a few more design choices that may be outdated, and since I'm polishing things now, anyway, I figured I could update it on more fronts. 1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed? 2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different classes of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like: data UniqueClass = UniqDesugarer | UniqAbsCFlattener | UniqSimplStg | UniqNativeCodeGen ... and a public (i.e. exported) function: mkUnique :: UniqueClass - Int - Unique ? The benefit of this would be to have more (to my taste) self-documenting code
RE: Unique as special boxing type hidden constructors
On Wed, Aug 20, 2014 at 1:47 PM, p.k.f.holzensp...@utwente.nl wrote:

thread_id_bits:8 unique_id_bits:56-X tag_bits:X

Is the thread id deterministic between runs? If not, please do not use this layout. I remember vaguely Unique being relevant to ghc not having deterministic builds, my most wanted ghc feature: https://ghc.haskell.org/trac/ghc/ticket/4012

I think this depends on the policy GHC *will* have (there is no parallel build atm) wrt. the forking of threads. An actual Control.Concurrent.ThreadId might be as large as 64 bits, so, of course, we won't be using that, but rather the sequence number in which the UniqueSupply was split off for a new thread. In other words, if the decision to fork threads is deterministic, so are the Uniques with this layout.

Mind you, I imagine a parallel GHC would still have at most one thread working on a single module. I don't know too much about what makes it into the interface file of a module (I can't imagine the exact Uniques end up there, because they would overlap with other modules - with per-module compilation - and conflict that way).

Can you comment on how (the layout of) Uniques relate to #4012 in a little more detail? It seems that the Uniques that somehow end up in the interface files could simply be stripped of the thread id, in which case the problem reduces to the current one.

Ph.
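The 64-bit layout quoted above (thread_id_bits:8, unique_id_bits:56-X, tag_bits:X) can be sketched as follows. This is only an illustration under stated assumptions: the "thread id" is taken to be the sequence number of the supply split, as the message suggests; X is derived from the number of constructors; and the names `tagBits`, `lowMask`, `stripThread` and `uniqueClass` are invented here. Uniques are kept as raw `Word64` values to keep the bit arithmetic visible.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Word (Word64)

-- | Classes named in the thread; the real list would be longer.
data UniqueClass = UniqDesugarer | UniqAbsCFlattener | UniqSimplStg | UniqNativeCodeGen
  deriving (Eq, Show, Enum, Bounded)

-- | X: the number of bits needed to number all classes.
tagBits :: Int
tagBits = length (takeWhile (< n) (iterate (* 2) 1))
  where n = fromEnum (maxBound :: UniqueClass) + 1

-- | Bits for the split sequence number ("thread id").
threadBits :: Int
threadBits = 8

-- | Bits left for the counter: 56 - X.
idBits :: Int
idBits = 64 - threadBits - tagBits

-- | A mask with the lowest @b@ bits set (used here only for b < 64).
lowMask :: Int -> Word64
lowMask b = (1 `shiftL` b) - 1

-- | Pack split number, class and counter; the tag sits in the low bits.
mkUnique :: Word64 -> UniqueClass -> Word64 -> Word64
mkUnique splitNo cls n =
      ((splitNo .&. lowMask threadBits) `shiftL` (idBits + tagBits))
  .|. ((n .&. lowMask idBits) `shiftL` tagBits)
  .|. fromIntegral (fromEnum cls)

-- | Drop the split number, e.g. before anything is written to a file.
stripThread :: Word64 -> Word64
stripThread u = u .&. lowMask (idBits + tagBits)

-- | Recover the class from the low bits.
uniqueClass :: Word64 -> UniqueClass
uniqueClass u = toEnum (fromIntegral (u .&. lowMask tagBits))
```

Note that `stripThread` only makes the value independent of which supply produced it; whether the counter part itself is deterministic is a separate question, and the one the ticket is really about.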
RE: Unique as special boxing type hidden constructors
Dear Max, et al,

Here's hoping either you are still on the mailing list, or the address I found on your website (which says you're a Ph.D. student, so it's starting to smell) is still operational.

I'm working on redoing some Unique-stuff in GHC. Mostly, the code uses Unique's API in a well-behaved fashion. The only awkward bit I found is in BinIface.getSymtabName, which git blames you for ;)

I just wanted to ask: Why does this function do all the bit-masking and shifting stuff directly and with different masks than anything in Unique? Is there a reason why this doesn't use unpkUnique? The comments in Unique state that mkUnique is NOT EXPORTED (the caps are in the comments, I'm not shouting), but it is, it seems, specifically for BinIface. I would like to get rid of this, but dare not hack away in the dark.

Regards,
Philip

From: Alexander Kjeldaas alexander.kjeld...@gmail.com
Sent: 20 August 2014 15:48
To: Holzenspies, P.K.F. (EWI)
Cc: Simon Peyton Jones; ghc-devs
Subject: Re: Unique as special boxing type hidden constructors

On Wed, Aug 20, 2014 at 3:07 PM, p.k.f.holzensp...@utwente.nl wrote:

On Wed, Aug 20, 2014 at 1:47 PM, p.k.f.holzensp...@utwente.nl wrote:

thread_id_bits:8 unique_id_bits:56-X tag_bits:X

Is the thread id deterministic between runs? If not, please do not use this layout. I remember vaguely Unique being relevant to ghc not having deterministic builds, my most wanted ghc feature: https://ghc.haskell.org/trac/ghc/ticket/4012

I think this depends on the policy GHC *will* have (there is no parallel build atm) wrt. the forking of threads. An actual Control.Concurrent.ThreadId might be as large as 64 bits, so, of course, we won't be using that, but rather the sequence number in which the UniqueSupply was split off for a new thread. In other words, if the decision to fork threads is deterministic, so are the Uniques with this layout.
Mind you, I imagine a parallel GHC would still have at most one thread working on a single module. I don't know too much about what makes it into the interface file of a module (I can't imagine the exact Uniques end up there, because they would overlap with other modules - with per-module compilation - and conflict that way). Can you comment on how (the layout of) Uniques relate to #4012 in a little more detail? It seems that if the Uniques that somehow end up in the interface files could simply be stripped of the thread id, in which case, the problem reduces to the current one. I frankly don't know. I just think it's better to keep ThreadId out of data that can bleed into symbols and what not. As you can see, the thread id is just a counter, and as forkIO in a threaded runtime will be racy between threads, they aren't deterministic. http://stackoverflow.com/questions/24995262/how-can-i-build-a-threadid-given-that-i-know-the-actual-number Alexander ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
RE: Unique as special boxing type hidden constructors
PS. Unique also looks like a case where Ints are used and (>= 0) is asserted. Can these cases be converted to Word as per earlier discussions?

From: p.k.f.holzensp...@utwente.nl p.k.f.holzensp...@utwente.nl
Sent: Monday 18 August 2014 15:49
To: simo...@microsoft.com; ghc-devs@haskell.org
Subject: RE: Unique as special boxing type hidden constructors

Dear Simon, et al,

Looking at Unique, there are a few more design choices that may be outdated, and since I'm polishing things now, anyway, I figured I could update it on more fronts.

1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed?

2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different classes of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like:

    data UniqueClass = UniqDesugarer | UniqAbsCFlattener | UniqSimplStg | UniqNativeCodeGen ...

and a public (i.e. exported) function:

    mkUnique :: UniqueClass -> Int -> Unique

? The benefit of this would be to have more (to my taste) self-documenting code and a greater chance that documentation is updated (the list of unique supply characters in the comments is currently outdated).

3) Is there a reason for having functions implementing class-methods to be exported? In the case of Unique, there is pprUnique and:

    instance Outputable Unique where
      ppr = pprUnique

Here pprUnique is exported and it is used in quite a few places where its argument is unambiguously a Unique (so it's not to force the type) *and* ppr is used for all kinds of other types. I'm assuming this is an old choice making things marginally faster, but I would say cleaning up the API / namespace would now outweigh this margin.
I will also be adding Haddock-comments, so when this is done, a review would be most welcome (I'll also be doing some similar transformations to other long-since-untouched-code). Regards, Philip Van: Simon Peyton Jones simo...@microsoft.com Verzonden: maandag 18 augustus 2014 00:11 Aan: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org Onderwerp: RE: Unique as special boxing type hidden constructors Re (1) I think this is a historical. A newtype wrapping an Int should be fine. I’d be ok with that change. Re (2), I think your question is: why does module Unique export the data type Unique abstractly, rather than exporting both the data type and its constructor. No deep reason here, but it guarantees that you can only *make* a unique from an Int by calling ‘mkUniqueGrimily’, which signals clearly that something fishy is going on. And rightly so! Simon From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of p.k.f.holzensp...@utwente.nl Sent: 15 August 2014 11:53 To: ghc-devs@haskell.org Subject: Unique as special boxing type hidden constructors Dear all, I'm working with Alan to instantiate everything for Data.Data, so that we can do better SYB-traversals (which should also help newcomers significantly to get into the GHC code base). Alan's looking at the AST types, I'm looking at the basic types in the compiler. Right now, I'm looking at Unique and two questions come up: data Unique = MkUnique FastInt 1) As someone already commented: Is there a specific reason (other than history) that this isn't simply a newtype around an Int? If we're boxing anyway, we may as well use the default Int boxing and newtype-coerce to the specific purpose of Unique, no? 2) As a general question for GHC hacking style; what is the reason for hiding the constructors in the first place? I understand about abstraction and there are reasons for hiding, but there's a public GHC API and then there are all these modules that people can import at their own peril. 
Nothing is guaranteed about their consistency from version to version of GHC. I don't really see the point about hiding constructors (getting in the way of automatically deriving things) and then giving extra functions like (in the case of Unique): getKeyFastInt (MkUnique x) = x mkUniqueGrimily x = MkUnique (iUnbox x) I would propose to just make Unique a newtype for an Int and making the constructor visible. Regards, Philip ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
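The two API conventions debated in this message (a named printer next to the class instance, and a deliberately awkward constructor function) can be shown in a few lines. This sketch uses a hypothetical stand-in for GHC's Outputable class that renders to String, and the "u" prefix in the output is invented; only the names `pprUnique`, `ppr`, `mkUniqueGrimily` and `getKey` are taken from the discussion.

```haskell
-- | Newtype over Int, as proposed in the thread.
newtype Unique = Unique Int

-- | Hypothetical stand-in for GHC's Outputable class; the real class
--   produces an SDoc, not a String.
class Outputable a where
  ppr :: a -> String

-- | Monomorphic printer. Every use at type Unique can be found by
--   searching for this one name.
pprUnique :: Unique -> String
pprUnique (Unique n) = "u" ++ show n

-- | The instance only delegates, so overloaded callers still work.
instance Outputable Unique where
  ppr = pprUnique

instance Outputable Int where
  ppr = show

-- | The one sanctioned way to forge a Unique from an Int. In a real
--   module the constructor would be left out of the export list, so the
--   conspicuous name shows up wherever something fishy happens.
mkUniqueGrimily :: Int -> Unique
mkUniqueGrimily = Unique

-- | Read the underlying key back out.
getKey :: Unique -> Int
getKey (Unique n) = n
```

Elsewhere in the thread the case for keeping `pprUnique` exported is put in terms of searchability: a call to `ppr` says nothing about its argument type, whereas `pprUnique` does.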
Unique as special boxing type hidden constructors
Dear all, I'm working with Alan to instantiate everything for Data.Data, so that we can do better SYB-traversals (which should also help newcomers significantly to get into the GHC code base). Alan's looking at the AST types, I'm looking at the basic types in the compiler. Right now, I'm looking at Unique and two questions come up: data Unique = MkUnique FastInt 1) As someone already commented: Is there a specific reason (other than history) that this isn't simply a newtype around an Int? If we're boxing anyway, we may as well use the default Int boxing and newtype-coerce to the specific purpose of Unique, no? 2) As a general question for GHC hacking style; what is the reason for hiding the constructors in the first place? I understand about abstraction and there are reasons for hiding, but there's a public GHC API and then there are all these modules that people can import at their own peril. Nothing is guaranteed about their consistency from version to version of GHC. I don't really see the point about hiding constructors (getting in the way of automatically deriving things) and then giving extra functions like (in the case of Unique): getKeyFastInt (MkUnique x) = x mkUniqueGrimily x = MkUnique (iUnbox x) I would propose to just make Unique a newtype for an Int and making the constructor visible. Regards, Philip ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
RE: Broken Data.Data instances
Dear Alan, I’ve had a look at the diffs on Phabricator. They’re looking good. I have a few comments / questions: 1) As you said, the renamer and typechecker are heavily interwoven, but when you *know* that you’re between renamer and typechecker (i.e. when things have ‘Name’s, but not ‘Id’s), isn’t it better to choose the PreTcType as argument? (Basically, look for any occurrence of “Name PostTcType” and replace with Pre.) 2) I saw your point about being able to distinguish PreTcType from () in SYB-traversals, but you have now defined PreTcType as a synonym for (). With an eye on the maximum line-width of 80 characters and these things being explicit everywhere as a type parameter (as opposed to a type family over the exposed id-parameter), how much added value is there still in having the names PreTcType and PostTcType? Would “()” and “Type” not be as clear? I ask, because when I started looking at GHC, I was overwhelmed with all the names for things in there, most of which then turn out to be different names for the same thing. The main reason to call the thing PostTcType in the first place was to give some kind of warning that there would be nothing there before TC. 3) The variable name “ptt” is a bit misleading to me. I would use “ty”. 4) In the cases of the types that have recently been parameterized in what they contain, is there a reason to have the ty-argument *after* the content-argument? E.g. why is it “LGRHS RdrName (LHsExpr RdrName PreTcType) PreTcType” instead of “LGRHS RdrName PreTcType (LHsExpr RdrName PreTcType)”? This may very well be a tiny stylistic thing, but it’s worth thinking about. 5) I much prefer deleting code over commenting it out. I understand the urge, but if you don’t remove these lines before your final commit, they will become noise in the long term. Versioning systems preserve the code for you. 
(Example: Convert.void)

Regards,
Philip

From: Alan Kim Zimmerman [mailto:alan.z...@gmail.com]
Sent: Wednesday 13 August 2014 8:50
To: Holzenspies, P.K.F. (EWI)
Cc: Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
Subject: Re: Broken Data.Data instances

And I dipped my toes into the phabricator water, and uploaded a diff to https://phabricator.haskell.org/D153

I left the lines long for now, so that it is clear that I simply added parameters to existing type signatures.

On Tue, Aug 12, 2014 at 10:51 PM, Alan Kim Zimmerman alan.z...@gmail.com wrote:

Status update

I have worked through a proof of concept update to the GHC AST whereby the type is provided as a parameter to each data type. This was basically a mechanical process of changing type signatures, and required very few actual code changes, being only to initialise the placeholder types.

The enabling types are

    type PostTcType = Type    -- Used for slots in the abstract syntax
                              -- where we want to keep slot for a type
                              -- to be added by the type checker...but
                              -- [before typechecking it's just bogus]
    type PreTcType = ()       -- used before typechecking

    class PlaceHolderType a where
      placeHolderType :: a

    instance PlaceHolderType PostTcType where
      placeHolderType = panic "Evaluated the place holder for a PostTcType"

    instance PlaceHolderType PreTcType where
      placeHolderType = ()

These are used to replace all instances of PostTcType in the hsSyn types.

The change was applied against HEAD as of last Friday, and can be found here

https://github.com/alanz/ghc/tree/wip/landmine-param
https://github.com/alanz/haddock/tree/wip/landmine-param

They pass 'sh validate' with GHC 7.6.3, and compile against GHC 7.8.3. I have not tried to validate that yet, but have no reason to expect failure.

Can I please get some feedback as to whether this is a worthwhile change?
It is the first step to getting a generic-traversal-safe AST.

Regards
Alan

On Mon, Jul 28, 2014 at 5:45 PM, Alan Kim Zimmerman alan.z...@gmail.com wrote:

FYI I edited the paste at http://lpaste.net/108262 to show the problem

On Mon, Jul 28, 2014 at 5:41 PM, Alan Kim Zimmerman alan.z...@gmail.com wrote:

I already tried that, the syntax does not seem to allow it. I suspect some higher form of sorcery will be required, as alluded to here http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family

Alan

On Mon, Jul 28, 2014 at 4:55 PM, p.k.f.holzensp...@utwente.nl wrote:

Dear Alan,

I would think you would want to constrain the result, i.e.

    type family (Data (PostTcType a)) => PostTcType a where …

The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’. Your point about SYB-recognition of WrongPhase is, of course, a good one ;)

Regards,
Philip

From: Alan Kim Zimmerman [mailto:alan.z...@gmail.com]
Sent: Monday 28 July 2014 14:10
To:
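A minimal sketch of the parameterisation described in the status update above, assuming a toy two-constructor AST rather than GHC's real hsSyn types. `Type`, `Expr`, `mkMultiIf` and `typecheck` here are hypothetical stand-ins, and `error` stands in for GHC's `panic`; only `PlaceHolderType`, `placeHolderType`, `PreTcType` and `PostTcType` are names taken from the thread.

```haskell
-- Toy stand-in for GHC's Type (an assumption for illustration only).
newtype Type = Type String deriving (Show, Eq)

type PostTcType = Type  -- slot filled in by the type checker
type PreTcType  = ()    -- slot content before type checking

class PlaceHolderType a where
  placeHolderType :: a

-- Instances are written on the underlying types so that no language
-- extension is needed for instances on type synonyms.
instance PlaceHolderType Type where
  placeHolderType = error "Evaluated the place holder for a PostTcType"

instance PlaceHolderType () where
  placeHolderType = ()

-- The type slot is an explicit parameter of the AST.
data Expr id ty
  = Var id
  | MultiIf ty [Expr id ty]
  deriving (Show, Eq)

-- Construction code does not need to know which phase it is in.
mkMultiIf :: PlaceHolderType ty => [Expr id ty] -> Expr id ty
mkMultiIf = MultiIf placeHolderType

-- A pretend type checker: replaces every () slot with a concrete type.
typecheck :: Expr String PreTcType -> Expr String PostTcType
typecheck (Var x)           = Var x
typecheck (MultiIf () alts) = MultiIf (Type "Bool") (map typecheck alts)
```

With the slot instantiated to `()`, a tree built by `mkMultiIf` can be shown or traversed before type checking without hitting a bottom, which is the property the thread is after.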
RE: Broken Data.Data instances
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot. Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. From: Alan Kim Zimmerman [mailto:alan.z...@gmail.com] Sent: maandag 28 juli 2014 11:14 To: Simon Peyton Jones Cc: Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs Subject: Re: Broken Data.Data instances I have made a conceptual example of this here http://lpaste.net/108262 Alan On Mon, Jul 28, 2014 at 9:50 AM, Alan Kim Zimmerman alan.z...@gmail.commailto:alan.z...@gmail.com wrote: What about creating a specific type with a single constructor for the not relevant to this phase type to be used instead of () above? That would also clearly document what was going on. Alan On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones simo...@microsoft.commailto:simo...@microsoft.com wrote: I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc. If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there. 
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is | HsMultiIf PostTcType [LGRHS id (LHsExpr id)] After type checking we know what type the thing has, but before we have no clue. We could get around this by saying type PostTcType = Maybe TcType but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed. It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky. However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this: | HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)] and define PostTcType as a closed type family thus type family PostTcType a where PostTcType Id = TcType PostTcType other = () That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact. Simon From: Edward Kmett [mailto:ekm...@gmail.commailto:ekm...@gmail.com] Sent: 27 July 2014 18:27 To: p.k.f.holzensp...@utwente.nlmailto:p.k.f.holzensp...@utwente.nl Cc: alan.z...@gmail.commailto:alan.z...@gmail.com; Simon Peyton Jones; ghc-devs Subject: Re: Broken Data.Data instances Philip, Alan, If you need a hand, I'm happy to pitch in guidance. I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc. This works far better for users of the API than just randomly throwing them a live hand grenade. 
As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock. Simon, It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try. Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways. -Edward On Sun, Jul 27, 2014 at 10:17 AM, p.k.f.holzensp...@utwente.nlmailto:p.k.f.holzensp...@utwente.nl wrote: Alan, In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the
RE: Broken Data.Data instances
Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:

I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.

Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?

Regards,
Philip

From: Alan Kim Zimmerman [mailto:alan.z...@gmail.com]
Sent: Monday 28 July 2014 11:14
To: Simon Peyton Jones
Cc: Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
Subject: Re: Broken Data.Data instances

I have made a conceptual example of this here http://lpaste.net/108262

Alan

On Mon, Jul 28, 2014 at 9:50 AM, Alan Kim Zimmerman alan.z...@gmail.com wrote:

What about creating a specific type with a single constructor for the “not relevant to this phase” type to be used instead of () above? That would also clearly document what was going on.
Alan

On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones simo...@microsoft.com wrote:

I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.

If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there. There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is

    | HsMultiIf PostTcType [LGRHS id (LHsExpr id)]

After type checking we know what type the thing has, but before we have no clue. We could get around this by saying

    type PostTcType = Maybe TcType

but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed. It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.

However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:

    | HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]

and define PostTcType as a closed type family thus

    type family PostTcType a where
      PostTcType Id    = TcType
      PostTcType other = ()

That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon From: Edward Kmett [mailto:ekm...@gmail.commailto:ekm...@gmail.com] Sent: 27 July 2014 18:27 To: p.k.f.holzensp...@utwente.nlmailto:p.k.f.holzensp...@utwente.nl Cc: alan.z...@gmail.commailto:alan.z...@gmail.com; Simon Peyton Jones; ghc-devs Subject: Re: Broken Data.Data instances Philip, Alan, If you need a hand, I'm happy to pitch in guidance. I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc. This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock. Simon, It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try. Alternately, we could use the 'virtual constructor' trick there
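Philip's question above about traversals that "remember the things they visited" can be illustrated with a plain visited-list walk over a knot-tied graph. This is only a sketch under stated assumptions: the `TyCon` and `DataCon` records, their integer `unique` fields and the `visit` function are hypothetical toys, not GHC's definitions, and this is an ordinary fold rather than a lens or a Data.Data traversal.

```haskell
-- Toy cyclic structure: a TyCon points at its DataCons, which point back.
data TyCon = TyCon
  { tcUnique   :: Int
  , tcName     :: String
  , tcDataCons :: [DataCon]
  }

data DataCon = DataCon
  { dcUnique :: Int
  , dcName   :: String
  , dcTyCon  :: TyCon
  }

-- Knot-tied definitions: maybeTyCon and its constructors refer to each other.
maybeTyCon :: TyCon
maybeTyCon = TyCon 1 "Maybe" [justDC, nothingDC]

justDC, nothingDC :: DataCon
justDC    = DataCon 2 "Just" maybeTyCon
nothingDC = DataCon 3 "Nothing" maybeTyCon

-- A uniform view of both node kinds.
data Node = NTyCon TyCon | NDataCon DataCon

nodeKey :: Node -> Int
nodeKey (NTyCon tc)   = tcUnique tc
nodeKey (NDataCon dc) = dcUnique dc

nodeName :: Node -> String
nodeName (NTyCon tc)   = tcName tc
nodeName (NDataCon dc) = dcName dc

children :: Node -> [Node]
children (NTyCon tc)   = map NDataCon (tcDataCons tc)
children (NDataCon dc) = [NTyCon (dcTyCon dc)]

-- Depth-first walk that threads the list of keys already seen, so each
-- node is visited once and the walk terminates despite the cycle.
visit :: Node -> [String]
visit root = reverse (snd (go ([], []) root))
  where
    go (seen, acc) n
      | nodeKey n `elem` seen = (seen, acc)
      | otherwise             = foldl go (nodeKey n : seen, nodeName n : acc) (children n)
```

The walk relies on each node carrying a key that identifies it; without such a key (or observable sharing, as Edward mentions) the cycle cannot be detected from the values alone.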
RE: Broken Data.Data instances
Dear Alan,

I would think you would want to constrain the result, i.e.

    type family (Data (PostTcType a)) => PostTcType a where …

The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’. Your point about SYB-recognition of WrongPhase is, of course, a good one ;)

Regards,
Philip

From: Alan Kim Zimmerman [mailto:alan.z...@gmail.com]
Sent: Monday 28 July 2014 14:10
To: Holzenspies, P.K.F. (EWI)
Cc: Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
Subject: Re: Broken Data.Data instances

Philip

I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better.

On a related note, is there any way to constrain the 'a' in

    type family PostTcType a where
      PostTcType Id    = TcType
      PostTcType other = WrongPhaseTyp

to have an instance of Data?

I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commented out in the original).

Alan

On Mon, Jul 28, 2014 at 12:30 PM, p.k.f.holzensp...@utwente.nl wrote:

Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:

I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward? Regards, Philip From: Alan Kim Zimmerman [mailto:alan.z...@gmail.commailto:alan.z...@gmail.com] Sent: maandag 28 juli 2014 11:14 To: Simon Peyton Jones Cc: Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs Subject: Re: Broken Data.Data instances I have made a conceptual example of this here http://lpaste.net/108262 Alan On Mon, Jul 28, 2014 at 9:50 AM, Alan Kim Zimmerman alan.z...@gmail.commailto:alan.z...@gmail.com wrote: What about creating a specific type with a single constructor for the not relevant to this phase type to be used instead of () above? That would also clearly document what was going on. Alan On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones simo...@microsoft.commailto:simo...@microsoft.com wrote: I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc. If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there. There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is | HsMultiIf PostTcType [LGRHS id (LHsExpr id)] After type checking we know what type the thing has, but before we have no clue. 
We could get around this by saying

    type PostTcType = Maybe TcType

but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed. It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.

However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:

    | HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]

and define PostTcType as a closed type family thus

    type family PostTcType a where
      PostTcType Id    = TcType
      PostTcType other = ()

That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.

Simon

From: Edward Kmett [mailto:ekm...@gmail.com]
Sent: 27 July 2014 18:27
To: p.k.f.holzensp...@utwente.nl
Cc:
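One way the closed type family above can coexist with Data instances is to put the constraint on the instance rather than on the family, which addresses Alan's question about requiring `Data` for the result. The sketch below assumes toy `RdrName`, `Id`, `TcType` and `Expr` types and a hypothetical helper `tcTypes`; it needs a GHC recent enough to support closed type families, and it is not presented as what the GHC patch actually does.

```haskell
{-# LANGUAGE TypeFamilies, DeriveDataTypeable, StandaloneDeriving,
             FlexibleContexts, UndecidableInstances #-}

import Data.Data (Data, Typeable, cast, gmapQ)

-- Toy stand-ins for the GHC types named in the thread.
newtype RdrName = RdrName String deriving (Data, Typeable, Show, Eq)
newtype Id      = Id String      deriving (Data, Typeable, Show, Eq)
newtype TcType  = TcType String  deriving (Data, Typeable, Show, Eq)

-- The closed type family from Simon's message.
type family PostTcType a where
  PostTcType Id    = TcType
  PostTcType other = ()

data Expr id
  = Var id
  | MultiIf (PostTcType id) [Expr id]

-- The requirement "PostTcType id has a Data instance" is stated in the
-- instance context instead of on the family itself.
deriving instance (Show id, Show (PostTcType id)) => Show (Expr id)
deriving instance (Data id, Data (PostTcType id)) => Data (Expr id)

-- Generic query: every TcType found anywhere inside a value.
tcTypes :: Data a => a -> [TcType]
tcTypes x = maybe [] pure (cast x) ++ concat (gmapQ tcTypes x)
```

Applied to an `Expr Id`, `tcTypes` finds the type annotations; applied to an `Expr RdrName`, the slots are `()` and the query simply returns nothing, with no bottoms to trip over.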
RE: Broken Data.Data instances
Alan,

In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).

Regards,
Philip

From: Alan Kim Zimmerman [alan.z...@gmail.com]
Sent: Friday 25 July 2014 13:44
To: Simon Peyton Jones
Cc: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org
Subject: Re: Broken Data.Data instances

By the way, I would be happy to attempt this task, if the concept is viable.

On Thu, Jul 24, 2014 at 11:23 PM, Alan Kim Zimmerman alan.z...@gmail.com wrote:

While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?

In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/GhcUtils.hs)

    -- Check the Typeable items
    checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool
    checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x
      where nameSet    = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet    -> Bool
            postTcType = const (stage < SYB.TypeChecker)                  :: GHC.PostTcType -> Bool
            fixity     = const (stage < SYB.Renamer)                      :: GHC.Fixity     -> Bool

And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined' values.

Perhaps use an initialiser that can have its panic turned off when called via the GHC API?

Regards
Alan

On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones simo...@microsoft.com wrote:

So... does anyone object to me changing these broken instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.

Simon

From: Philip K.F. Hölzenspies [mailto:p.k.f.holzensp...@utwente.nl]
Sent: 24 July 2014 18:42
To: Simon Peyton Jones
Cc: ghc-devs@haskell.org
Subject: Re: Broken Data.Data instances

Dear Simon, et al,

These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).

So... does anyone object to me changing these broken instances with the ones given by DeriveDataTypeable?

Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Has anyone ever had a go at that? Would people be interested?

Regards,
Philip

Simon Peyton Jones simo...@microsoft.com, 24 Jul 2014 18:22:

GHC’s data structures are often mutually recursive. e.g.

• The TyCon for Maybe contains the DataCon for Just
• The DataCon for Just contains Just’s type
• Just’s type contains the TyCon for Maybe

So any attempt to recursively walk over all these structures, as you would a tree, will fail.

Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types

Simon

From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of p.k.f.holzensp...@utwente.nl
Sent: 24 July 2014 16:42
To: ghc-devs@haskell.org
Subject: Broken Data.Data instances

Dear GHC-ers,

Is there a reason for explicitly broken Data.Data instances? Case in point:

    instance Data Var where
      -- don't traverse?
      toConstr _   = abstractConstr "Var"
      gunfold _ _  = error "gunfold"
      dataTypeOf _ = mkNoRepType "Var"

I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:

    collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
    collect = everything mplus $ mkQ mzero return

    allTypes :: CoreExpr -> [Type]
    allTypes = collect

Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very
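The effect of the hand-written instance quoted above can be reproduced on a toy AST using only `Data.Data` from base. In this sketch `Ty`, `Expr` and `Opaque` are hypothetical types, and `collect` is specialised to lists and written directly with `gmapQ` and `cast` rather than with syb's `everything` and `mkQ`, so it is an approximation of the query in the message, not the same code.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data (..), Typeable, cast, mkNoRepType)

-- Toy AST with derived, fully traversable Data instances.
data Ty = TInt | TFun Ty Ty
  deriving (Data, Typeable, Show, Eq)

data Expr = Var String Ty | App Expr Expr | Lam String Ty Expr
  deriving (Data, Typeable, Show)

-- Collect every subterm of the requested type, top-down.
collect :: (Data a, Typeable b) => a -> [b]
collect x = maybe [] pure (cast x) ++ concat (gmapQ collect x)

allTypes :: Expr -> [Ty]
allTypes = collect

-- A wrapper with a hand-written "abstract" instance in the style of the
-- Var instance quoted above. gfoldl keeps its default definition, which
-- exposes no children, so generic queries cannot look inside.
newtype Opaque = Opaque Ty deriving Typeable

instance Data Opaque where
  toConstr _   = error "toConstr: Opaque is abstract"
  gunfold _ _  = error "gunfold"
  dataTypeOf _ = mkNoRepType "Opaque"
```

`allTypes` finds every `Ty` in an `Expr`, whereas `collect (Opaque t) :: [Ty]` is empty because the abstract instance hides the payload. That is the trade-off under discussion: such an instance is safe on cyclic data but useless for exploratory queries.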
RE: Unexpected failure to inline, even with pragma
Dear Joachim, et al.,

Yes, you were right, this does fix it. This confuses me even more as to why it *did* inline Foo.Bar.foo in Foo.Bar.bar without -O, though. Is -O required for optimization across module boundaries?

Also, since I really want a certain level of inlining for a plugin I'm working on: is there a way to force (from the plugin, i.e. using the API) inlining of a term at its call-site? Alternatively (weaker), can I force, from a compiler plugin, the inliner to behave as if -O was set even if it wasn't?

I may be doing things in a roundabout way, but here's my thinking: I want to write a combinator library with a plugin that optimizes uses of this library using non-trivial domain knowledge. However, the primitive combinators for which my plugin is defined are hard to use. I want to offer users higher-level combinators that are expressed in terms of those primitive ones, but I don't want to complicate my plugin too much. If, somehow, I could make absolutely sure all the non-primitive combinators were inlined (again, this may be done using API-calls rather than pragmas), I could define my plugin-logic only in terms of the primitive ones... this would be bliss :D

Regards,
Philip

-----Original Message-----
From: Joachim Breitner [mailto:m...@joachim-breitner.de]
Sent: Thursday 1 May 2014 16:14
To: ghc-devs@haskell.org
Subject: Re: Unexpected failure to inline, even with pragma

Hi,

On Thursday, 01.05.2014, at 12:59 +, p.k.f.holzensp...@utwente.nl wrote:

$ ghc -ddump-inlinings -fforce-recomp Main.hs

You need to pass -O to ghc. (I didn’t check if that fixes it, but without it the optimizer does very very little.)

Greetings,
Joachim

--
Joachim “nomeata” Breitner
m...@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nome...@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nome...@debian.org

___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
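The library layout Philip describes, with a small primitive core and derived combinators that should disappear before a plugin runs, might look roughly like the sketch below. Everything in it (`Pipe`, `primId`, `primSeq`, `primLift`, `chain`, `twice`) is a hypothetical illustration, not code from the thread. An `INLINE` pragma is a request rather than a guarantee, and as Joachim notes the optimiser does very little without `-O`, so whether this alone is enough for a plugin is left open here.

```haskell
-- Primitive combinators: the only ones a (hypothetical) plugin would
-- need to understand.
newtype Pipe a b = Pipe { runPipe :: a -> b }

primId :: Pipe a a
primId = Pipe id

primSeq :: Pipe a b -> Pipe b c -> Pipe a c
primSeq (Pipe f) (Pipe g) = Pipe (g . f)

primLift :: (a -> b) -> Pipe a b
primLift = Pipe

-- Derived combinators, written purely in terms of the primitives and
-- marked INLINE so that GHC is asked to expand them at call sites.
-- Compile with -O; the pragmas are not expected to achieve much otherwise.
{-# INLINE twice #-}
twice :: Pipe a a -> Pipe a a
twice p = p `primSeq` p

{-# INLINE chain #-}
chain :: [Pipe a a] -> Pipe a a
chain = foldr primSeq primId
```

For example, `runPipe (twice (primLift (+1))) 0` applies the increment twice, and `chain` runs its stages left to right. After successful inlining, only `primId`, `primSeq` and `primLift` would remain in the Core that a plugin inspects.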
Request: export runTcInteractive from TcRnDriver
Dear GHC-devs, Is there a reason why, in HEAD, TcRnDriver does *not* export runTcInteractive? If not, can it please be added? (I considered sending a patch with this email, but it's so trivial a change that the check of the patch is more work than manually adding runTcInteractive to the export list.) I'm developing against the GHC API of 7.6.3 and it would have saved me hours of work to have precisely that function. Seeing it's in HEAD, but not being exported seems a shame ;) Regards, Philip
FW: Detabbing patch and GHC developers Wiki
PS. Ignore my comments about git send-email. Ubuntu enjoys splitting everything up into minute sub-packages. I should have installed git-email.

From: Holzenspies, P.K.F. (EWI)
Sent: Thursday 3 October 2013 15:18
To: 'ghc-devs@haskell.org'
Subject: Detabbing patch and GHC developers Wiki

Dear GHCers,

Attached is a patch to detab the current HEAD. I've followed the GHC developers Wiki instructions on validation and patch generation. However, the Wiki seems to contain an error (or, at least, does not agree with my version of git). At [1], the wiki says I can submit patches using git send-email. My git complains that it does not know this command. If this is indeed wrong, could someone in the know update the wiki?

Can I consider my patch submitted for consideration or should I send it somewhere else?

Regards,
Philip