Re: Request for feedback: deriving strategies syntax
On 2016-09-28 04:06, Richard Eisenberg wrote:
> +1 on `stock` from me. Though I was all excited to get my class next semester jazzed for PL work by explaining that I had slipped a new keyword `bespoke` into a language. :)

Maybe there's still a spot you can slip it in, e.g. bespoke error messages. ;) I agree that "stock" is an acceptable alternative.

MarLinn
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: Making (useful subsets of) bytecode portable between targets
On 2016-11-25 12:11, Simon Marlow wrote:
> We basically have two worlds: first, the compile-time world. In this world, we need all the packages and modules of the current package built for the host platform. Secondly, we need the runtime world, with all the packages and modules of the current package cross-compiled for the target platform.

Maybe this separation, and the preceding discussion of the two possible solutions, suggests a usable approach to the architecture of this future system?

First, let me reframe the "runner" idea. In a real-world environment, this seems like a viable solution, either with two separate machines or with a VM nested in the main build machine. In both cases we would need two parts of the compiler, communicating over customized channels. The cross-compiler approach is more or less a variation on this with far less overhead. So why not build an architecture that supports both solutions?

In practice, this would mean we need a tightly defined but flexible API between at least two "architecture plugins" and one controller that could run on either side. To me, this sounds more like a build system than a mere compiler. I'm okay with that, but I don't think GHC+Cabal alone can or should shoulder the complexity. There are nice, working build systems out there that could take over the role of the controller, so all GHC and Cabal would have to offer are parsing, modularized steps, and nice hooks. In other words, a kind of meta-language to describe compiler deployments – and Haskell is great for describing languages.

Here's yet another idea I'd like to add, although it is rather silly. The idea of a meta-language that describes a conversion structure seems very close to what Pandoc is doing for documents. And while Pandoc's architecture and history make it a bit static, GHC can still learn from it.
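To make the plugin/controller split a little more concrete, here is a toy sketch. Every name in it (World, ArchPlugin, runStep, controller) is invented purely for illustration; it is not a proposal for any concrete GHC API.

```haskell
-- A toy sketch of the plugin/controller split: the controller only
-- sequences build steps and dispatches each one to the world it
-- belongs to; "host" and "target" could live on different machines
-- or be a compiler and a cross-compiler in one process.
data World = CompileTime | Runtime deriving (Eq, Show)

-- What an "architecture plugin" would have to provide to the controller.
data ArchPlugin = ArchPlugin
  { archName :: String
  , runStep  :: String -> IO String  -- run one build step, return its log
  }

controller :: ArchPlugin -> ArchPlugin -> [(World, String)] -> IO [String]
controller host target = mapM dispatch
  where
    dispatch (CompileTime, s) = runStep host   s
    dispatch (Runtime,     s) = runStep target s
```

The point of the sketch is only that once the two worlds are reified as values, swapping the "runner" solution for the cross-compiler solution is a matter of passing a different ArchPlugin, not restructuring the controller.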
Maybe, someday, there could even be a bigger, more over-arching build language that describes the program, the documentation, and the deployment processes of the whole system?

Cheers,
MarLinn
Re: Separating typechecking and type error reporting in two passes?
> But you are right that when the programmer sits there and waits for a result, that's when snappiness is important.

I had a random idea based on this observation: (with a certain flag set) the compiler could follow the existing strategy until it has hit the first n errors, possibly with n=1. Then it could switch off the context overhead, and all subsequent errors could be deferred or left unelaborated. Alternatively, the proposed new strategy is used, but the second pass only looks at the first n errors. Benefit: correct code stays on the fast path, but error reporting doesn't add too much overhead.

My experience when using the compiler to have a conversation about errors is that I correct one or two errors at a time, then re-compile. I discard all the extra information about the other errors anyway, at least most of the time. I don't know if that is a usual pattern, but if it is, we might as well exploit it.

This idea could already benefit from a separation, but we can go further. What if, in interactive sessions, you at first only got the result of the first pass: no details, only a list of error positions. In some cases that is all you need to find a dumb typo. It also doesn't clutter the screen with loads of fluff while still giving you a basic idea of how much is wrong. Now what if you could then instruct the system to run the second pass at places you choose, interactively? In other words, the conversation would become even more conversational.

Of course the benefits are debatable, and this is not something that's going to happen soon anyway. But for me the idea alone is an argument for the proposed new separation, because it would give us the flexibility to think of features like this.

Cheers,
MarLinn
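[Editorial sketch] The two-pass conversation proposed in the message above can be made concrete with a few lines of Haskell. None of these types exist in GHC; they are hypothetical, meant only to show the shape of the idea: the first pass yields cheap error sites, and the expensive message rendering hides behind a thunk that runs only on request.

```haskell
-- Hypothetical two-pass error reporting: positions first, detail on demand.
data Site = Site { siteLine :: Int, siteCol :: Int } deriving (Eq, Show)

data Report = Report
  { reportSite :: Site
  , reportMsg  :: () -> String  -- thunk standing in for the costly second pass
  }

-- Pass one: all you see at first is a list of positions.
summarize :: [Report] -> [Site]
summarize = map reportSite

-- On demand: flesh out only the first n errors (n = 1 in the extreme).
elaborate :: Int -> [Report] -> [String]
elaborate n = map (\r -> reportMsg r ()) . take n
```

Because the detail is a function, correct code (an empty report list) never pays for message construction, which is exactly the "fast path" property argued for above.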
Re: Telemetry (WAS: Attempt at a real world benchmark)
> > It could tell us which language features are most used.
>
> Language features are hard if they are not available in separate libs. If in libs, then IIRC Debian is packaging those in separate packages, again you can use their popularity contest.

What in particular makes them hard? Sorry if this seems like a stupid question to you, I'm just not that knowledgeable yet. One reason I can think of would be that we would want attribution, i.e. did the developer turn on the extension himself, or is it just used in a lib or template – but that should be easy to solve with a source hash, right? That source hash itself might need a bit of thought, though. Maybe it should not be a hash of a source file, but of the parse tree.

> > The big issue is (a) design and implementation effort, and (b) dealing with the privacy issues. I think (b) used to be a big deal, but nowadays people mostly assume that their software is doing telemetry, so it feels more plausible. But someone would need to work out whether it had to be opt-in or opt-out, and how to actually make it work in practice.
>
> Privacy here is a complete can of worms (keep in mind you are dealing with a lot of different legal systems), I strongly suggest not to even think about it for a second. Your note "but nowadays people mostly assume that their software is doing telemetry" may perhaps be true in the sick world of mobile apps, but I guess is not true in the world of developing secure and security-related applications for either server usage or embedded.

My first reaction to "nowadays people mostly assume that their software is doing telemetry" was to amend it with "* in the USA" in my mind. But yes, mobile is another place. Nowadays I do assume most software uses some sort of phone-home feature, but that's because it's on my to-do list of things to search for on first configuration. Note that I am using "phone home" instead of "telemetry" because some companies hide it in "check for updates" or mix it with some useless "account" stuff.
Finding out where it's hidden and how much information they give about the details tells a lot about the developers, as does opt-in vs. opt-out. Therefore it can be a reason not to choose a piece of software, or even an ecosystem, after a first try. (Say an operating system almost forces me to create an online account on installation. That not only tells me I might not want to use that operating system, it also sends a marketing message that the whole ecosystem is potentially toxic to my privacy, because they live in a bubble where that appears to be acceptable.) So I do have that aversion even in non-security-related contexts.

I would say people are aware that telemetry exists, and developers in particular. I would also say developers are aware of the potential benefits, so they might be open to it. But what they care and worry about is /what/ is reported and how they can /control/ it. Software being open source is a huge factor in that, because they know that, at least in theory, they could vet the source. But the reaction might still be very mixed – see Mozilla Firefox.

My suggestion would be a solution that gives developers the feeling of making the choices and puts them in control. It should also be compatible with configuration management, so that it can be integrated into company policies as easily as possible. Therefore my suggestions would be:

* Opt-in. Nothing takes away the feeling of being in control more than perceived "hijacking" of a device with "spy ware". This also helps circumvent legal problems, because the users or their employers now have the responsibility.

* The switches to turn it on or off should be in a configuration file. There should be several staged configuration files: one for a project, one for a user, one system-wide. This is for compatibility with configuration management. Configurations higher up the hierarchy override ones lower in the hierarchy, but they can't force telemetry to be on – at least not the sensitive kind.

* There should be several levels, or a set of options that can be switched on or off individually, for fine-grained control. All should be very well documented. Once integrated and documented, they can never change without also changing the configuration flag that switches them on.

There still might be some backlash, but a careful approach like this could soothe minds. If you are worried that we might get too little data this way, here's another thought, leading back to performance data: the most benefit in that regard would come from projects that are built regularly, on different architectures, with sources that can be inspected and an easy way to get diffs. In other words, projects that live on GitHub and Travis anyway. Their maintainers should be easy to convince to set that little switch to "on".

Regards,
MarLinn
___
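[Editorial sketch] The staged, opt-in resolution rule proposed in the message above can be pinned down in a few lines. Everything here is hypothetical: the Channel names, the Layer type, and the exact override rule are invented only to make the semantics concrete.

```haskell
-- Hypothetical staged telemetry configuration: each layer may leave a
-- channel unset (Nothing = no opinion). Layers are ordered lowest to
-- highest: project, then user, then system-wide. A higher layer
-- overrides a lower one, except that it can never turn a channel on
-- over a lower layer's explicit "off"; a channel nobody set is off.
data Channel = BuildTimes | ExtensionUsage | SourceHashes
  deriving (Eq, Show)

newtype Layer = Layer (Channel -> Maybe Bool)

resolve :: [Layer] -> Channel -> Bool
resolve layers ch = go Nothing layers
  where
    go acc []             = acc == Just True        -- opt-in: default off
    go acc (Layer f : ls) = case (f ch, acc) of
      (Just False, _)          -> go (Just False) ls  -- any layer may opt out
      (Just True,  Just False) -> go (Just False) ls  -- an explicit "off" sticks
      (Just True,  _)          -> go (Just True)  ls
      (Nothing,    _)          -> go acc          ls
```

The "opt-out sticks" clause is one reading of "they can't force telemetry to be on"; the actual policy choice would of course need wider discussion.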
Re: Telemetry
Pretty random idea: what if GHC exposed measurement points for performance and telemetry, but a separate tool handled the read-out, configuration, upload, etc.? That would keep the telemetry from being built in, while still being a way to get *some* information. Such a support tool might be interesting for other projects too, or even for slightly different use cases like monitoring servers.

The question is whether such a tool would bring enough benefit to enough projects for buy-in and to attract contributors. And just separating it doesn't solve the underlying issues of course, so attracting contributors and buy-in might be even harder than it already is for "normal" projects. Close ties to GHC might improve that, but I doubt the effect would be large. Additionally, this approach would just shift many of the questions over to the Haskell Platform and/or Stack instead of addressing them – or even further, onto that volatile front-line space where inner-community conflict roared recently. It wouldn't be the worst place to address them, but I would hesitate to throw yet another potential point of contention onto that burned field.

Basically: I like the idea, but I might just have proven it fruitless anyway.

Cheers,
MarLinn
Re: Help needed: Restrictions of proc-notation with RebindableSyntax
Sorry to barge into the discussion with neither much knowledge of the theory nor of the implementation. I tried to look at both, but my understanding is severely lacking. However, I do feel a tiny bit emboldened, because my own findings turned out to at least have the same shadow as the contents of this more thorough overview.

The one part of the existing story I personally found the most promising was to explore the category hierarchy around Arrows – in other words, the Gibbard/Trinkle perspective. Therefore I want to elaborate my own naive findings a tiny bit. Bear in mind that much of this is gleaned from experimental implementations or interpreted; I do not have proofs, or even theory.

Almost all parts necessary for an Arrow seem to already be contained in a symmetric braided category. Fascinatingly, even the braiding might be superfluous in some cases, leaving only the need for a monoidal category. But to get from a braided category to a full Arrow, there seems to be a need for "constructors" like (arr $ \x -> (x,x)) and "destructors" like (arr fst). There seem to be several options for those, and a choice would have to be made. Notably: is introduction done by duplicating existing values, or by introducing new "unit" values (for a suitable definition of "unit")? That choice doesn't seem impactful, but my gut feeling is that that's just because I cannot see the potential points of impact.

What makes this story worse is that the currently known hierarchies around ArrowChoice and ArrowLoop seem to be coarser still – although the work around profunctors might help. That said, my understanding is so bad that I cannot even see any benefits or drawbacks of the structure of ArrowLoop's "loop" versus a more "standard" fixed-point structure. I do, however, think there is something to be gained.
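The decomposition described above can be rendered naively in Haskell. The class names below (Monoidal, Braided, Structural) are invented for this sketch and come from no library; the point is only that a monoidal category plus explicit duplication and projection arrows is enough to recover the familiar Arrow combinators.

```haskell
import Prelude hiding (id, (.))
import Control.Category

-- A monoidal category over (,): par plays the role of (***).
class Category k => Monoidal k where
  par :: k a b -> k c d -> k (a, c) (b, d)

class Monoidal k => Braided k where
  braid :: k (a, b) (b, a)

-- The "constructors" and "destructors" from the text, as a class:
class Braided k => Structural k where
  dup  :: k a (a, a)   -- cf. arr (\x -> (x, x))
  proj :: k (a, b) a   -- cf. arr fst

-- first and fanout (&&&) then fall out of the weaker structure:
firstA :: Monoidal k => k a b -> k (a, c) (b, c)
firstA f = f `par` id

fanout :: Structural k => k a b -> k a c -> k a (b, c)
fanout f g = (f `par` g) . dup

-- Plain functions form an instance, so the claim can be sanity-checked:
newtype Fn a b = Fn { runFn :: a -> b }
instance Category Fn where
  id          = Fn (\x -> x)
  Fn g . Fn f = Fn (\x -> g (f x))
instance Monoidal Fn where
  par (Fn f) (Fn g) = Fn (\(a, c) -> (f a, g c))
instance Braided Fn where
  braid = Fn (\(a, b) -> (b, a))
instance Structural Fn where
  dup  = Fn (\x -> (x, x))
  proj = Fn fst
```

Note that what cannot fall out of this sketch is arr itself for arbitrary functions: that is precisely the extra power full Arrow adds, and why restricting to the weaker classes might open the door to braided-only interpretations.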
The good old Rosetta Stone paper still makes me think that what is now Arrow notation might be turned into a much more potent tool – exactly because we might be able to lift those restrictions. One particular idea I have in mind: if the notation can support purely braided categories, it might be used to describe reversible computation, which in turn is used in describing quantum computation.

The frustrating part for me is that I would like to contribute to this effort. But again, my understanding of each and every component is fleeting at best.

MarLinn

On 2016-12-21 06:15, Edward Kmett wrote:

Arrows haven't seen much love for a while. In part this is because many of the original applications for arrows have been shown to be perfectly suited to being handled by Applicatives, e.g. the Swierstra/Duponcheel parser that sort of kickstarted everything.

There are several options for improved arrow desugaring. Megacz's work on GArrows at first feels like it should be applicable here, as it lets you change out the choice of pseudo-product while preserving the general arrow feel. Unfortunately, the GArrow class isn't sufficient for most arrow desugaring, due to the fact that the arrow desugaring inherently involves breaking apart patterns for almost any non-trivial use, and nothing really requires the GArrow 'product' to actually even be product-like.

Cale Gibbard and Ryan Trinkle, on the other hand, like to use a more CCC-like basis for arrows. This stays in the spirit of the GArrow class, but you still have the problems around pattern matching. I don't think they actually wrote anything to deal with the actual arrow notation; they just programmed in the alternate style to get better introspection on the operations involved. I think the key insight there is that much of the notation can be made to work with weaker categorical structures than full arrows, but the existing class hierarchy around arrows is very coarse.
As a minor data point, both of these sorts of encodings of arrow problems start to drag in language extensions that make the notation harder to standardize. Currently they work with bog-standard Haskell 98/2010.

If you're looking for an interesting theoretical direction to extend Arrow notation: an arrow is a strong monad in the category of profunctors [1]. Using the profunctors library [2], (Strong p, Category p) is equivalent in power to Arrow p. Exploiting that, a profunctor-based desugaring could get away with much weaker constraints than Arrow, depending on how much of proc notation you use. Alternately, a separate class hierarchy that only required covariance in the second argument is an option, but my vague recollection from the last time I looked into this is that while such a desugaring only uses covariance in the second argument of the profunctor, you can prove that contravariance in the first argument follows from the pile of laws.

This subject came up the last time someone thought to extend the Arrow desugaring. You can probably find a thread on the mailing list from Ross Paterson a few years ago. This version has the ben
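[Editorial sketch] One direction of the claimed equivalence, deriving the Arrow combinators from (Strong p, Category p), fits in a few lines. To stay self-contained this sketch defines stripped-down stand-ins for Profunctor and Strong; the real classes, with these same method names, live in the "profunctors" package.

```haskell
import Prelude hiding (id, (.))
import Control.Category

class Profunctor p where
  dimap :: (a' -> a) -> (b -> b') -> p a b -> p a' b'

class Profunctor p => Strong p where
  first' :: p a b -> p (a, c) (b, c)

-- arr, second and (***) derived from nothing but the above:
arrP :: (Category p, Profunctor p) => (a -> b) -> p a b
arrP f = dimap (\x -> x) f id

secondP :: Strong p => p a b -> p (c, a) (c, b)
secondP f = dimap swap swap (first' f)
  where swap (x, y) = (y, x)

splitP :: (Category p, Strong p) => p a b -> p c d -> p (a, c) (b, d)
splitP f g = secondP g . first' f

-- Ordinary functions are an instance, so the derived combinators can
-- be checked against their Control.Arrow counterparts for (->).
instance Profunctor (->) where
  dimap f g h = g . h . f
instance Strong (->) where
  first' f (a, c) = (f a, c)
```

The other direction, and the proof that contravariance follows from the laws as mentioned above, is exactly the part that needs the theory and is not attempted here.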