Re: Request for feedback: deriving strategies syntax

2016-09-27 Thread MarLinn via ghc-devs

On 2016-09-28 04:06, Richard Eisenberg wrote:
+1 on `stock` from me. Though I was all excited to get my class next 
semester jazzed for PL work by explaining that I had slipped a new 
keyword `bespoke` into a language. :)


Maybe there's still a spot you can slip it in, e.g. bespoke error 
messages. ;)



I agree that "stock" is an acceptable alternative.
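
(For readers following along: under the proposal, a declaration using
the strategies would look roughly like this – a sketch based on my
reading of the proposal, so details may still change.)

    {-# LANGUAGE DerivingStrategies #-}
    {-# LANGUAGE GeneralizedNewtypeDeriving, DeriveAnyClass #-}

    class Describable a where
      describe :: a -> String
      describe _ = "something"  -- default, so an empty instance works

    newtype Age = Age Int
      deriving stock    Show         -- the built-in ("stock") mechanism
      deriving newtype  Num          -- reuse the underlying Int instance
      deriving anyclass Describable  -- empty instance via DeriveAnyClass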

MarLinn


Re: Making (useful subsets of) bytecode portable between targets

2016-11-25 Thread MarLinn via ghc-devs

On 2016-11-25 12:11, Simon Marlow wrote:
We basically have two worlds: first, the compile-time world. In this 
world, we need all the packages and modules of the current package 
built for the host platform. Secondly, we need the runtime world, with 
all the packages and modules of the current package cross-compiled for 
the target platform.


Maybe this separation, together with the preceding discussion of the
two possible solutions, suggests a usable approach to the architecture
of this future system?


First, let me reframe the "runner" idea. In a real-world environment, 
this seems like a viable solution either with two separate machines or 
with a VM nested in the main build machine. In both cases, we would need 
two parts of the compiler, communicating over customized channels.
The cross-compiler approach is more or less just a variation on this 
with far less overhead.

So why not build an architecture that supports both solutions?

In practice, this would mean we need a tightly defined but flexible API
between at least two "architecture plugins" and one controller that
could run on either side. To me, this sounds more like a build system
than a mere compiler. And I'm okay with that, but I don't think
GHC+Cabal alone can or should shoulder the complexity. There are nice,
working build systems out there that could take over the role of the
controller, so all GHC and Cabal would have to offer are parsing,
modularized steps, and nice hooks. In other words, *a kind of
meta-language to describe compiler deployments* – and Haskell is great
for describing languages.
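
To make the shape of that API concrete, here is a deliberately naive
sketch. Every name in it is invented for illustration; nothing like it
exists in GHC or Cabal today.

    -- One "architecture plugin" per world (host/target), as above.
    data Platform = Host | Target

    data ArchPlugin = ArchPlugin
      { platform  :: Platform
      , compile   :: FilePath -> IO FilePath  -- source -> object/bytecode
      , runSplice :: FilePath -> IO String    -- run compile-time code (TH)
      }

    -- The controller merely sequences steps across the two plugins; it
    -- could just as well live inside an external build system.
    controller :: ArchPlugin -> ArchPlugin -> FilePath -> IO FilePath
    controller host target src = do
      _ <- compile host src    -- compile-time world: build for the host...
      _ <- runSplice host src  -- ...so splices can run there
      compile target src       -- runtime world: the artifact we ship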


Here's yet another idea I'd like to add, although it is rather silly. 
The idea of a meta-language that describes a conversion structure seems 
very close to what Pandoc is doing for documents. And while Pandoc's 
architecture and history make it a bit static, GHC can still learn from 
it. Maybe, someday, there could even be a bigger, even more over-arching 
build language that describes the program, the documentation, and the 
deployment processes of the whole system?


Cheers,
MarLinn


Re: Separating typechecking and type error reporting in two passes?

2016-11-30 Thread MarLinn via ghc-devs



But you are right that when the programmer sits there and waits for a
result, that's when snappiness is important.


I had a random idea based on this observation:
(With a certain flag set) the compiler could follow the existing
strategy until it has hit the first n errors, possibly with n=1. Then
it could switch off the context overhead, and all subsequent errors
could be deferred or left unelaborated. Or, alternatively, the proposed
new strategy is used, but the second pass only looks at the first n
errors.
Benefit: correct code stays on the fast path, but error reporting
doesn't add too much overhead. My experience when using the compiler to
have a conversation about errors was that I was correcting one or two
errors at a time, then re-compiling. I discarded all the extra
information about the other errors anyway, at least most of the time. I
don't know if that is a usual pattern, but if it is, we might as well
exploit it.
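
Sketched as data, it might look like this (types invented purely for
illustration – GHC's real error pipeline is of course nothing this
simple):

    data SrcPos = SrcPos FilePath Int Int  -- file, line, column

    data ErrorReport = ErrorReport
      { errPos     :: SrcPos  -- cheap: always produced
      , errDetails :: String  -- expensive: only forced on demand
      }

    -- Flesh out only the first n errors; defer the rest.
    report :: Int -> [ErrorReport] -> [String]
    report n errs = map full (take n errs) ++ map brief (drop n errs)
      where
        full  (ErrorReport p d) = at p ++ ": " ++ d
        brief (ErrorReport p _) = at p ++ ": (details deferred)"
        at (SrcPos f l c) = f ++ ":" ++ show l ++ ":" ++ show c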


This idea could already benefit from a separation, but we can go further.
What if, in interactive sessions, you initially got only the result of
the first pass: no details, just a list of error positions? In some
cases, that is all you need to find a dumb typo. It also doesn't
clutter the screen with loads of fluff while still giving you a basic
idea of how much is wrong. Now what if you could then instruct the
system to do the second pass at places you choose, interactively? In
other words, the conversation would be even more conversational.
Of course the benefits are debatable, and this is not something that's
going to happen soon anyway. But for me the idea alone is an argument
for the proposed new separation, because it would give us the
flexibility to think of features like this.


Cheers,
MarLinn


Re: Telemetry (WAS: Attempt at a real world benchmark)

2016-12-09 Thread MarLinn via ghc-devs



It could tell us which language features are most used.


Language features are hard if they are not available in separate libs.
If in libs, then IIRC Debian is packaging those in separate packages;
again, you can use their package popularity contest.


What in particular makes them hard? Sorry if this seems like a stupid
question to you; I'm just not that knowledgeable yet. One reason I can
think of would be that we would want attribution, i.e. did the
developer turn on the extension himself, or is it just used in a lib or
template – but that should be easy to solve with a source hash, right?
That source hash itself might need a bit of thought, though. Maybe it
should not be a hash of a source file, but of the parse tree.
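
As a toy illustration of why the parse tree might be the better thing
to hash – so that whitespace or comment changes don't change identity.
(I'm assuming the hashable package here; GHC would of course hash its
real AST, not this three-constructor stand-in.)

    import Data.Hashable (hash)  -- from the 'hashable' package

    data Expr = Var String | App Expr Expr | Lam String Expr

    treeHash :: Expr -> Int
    treeHash (Var x)   = hash ("Var", x)
    treeHash (App f a) = hash ("App", treeHash f, treeHash a)
    treeHash (Lam x b) = hash ("Lam", x, treeHash b)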


The big issue is (a) design and implementation effort, and (b) 
dealing with the privacy issues. I think (b) used to be a big deal, 
but nowadays people mostly assume that their software is doing 
telemetry, so it feels more plausible.  But someone would need to 
work out whether it had to be opt-in or opt-out, and how to actually 
make it work in practice.


Privacy here is a complete can of worms (keep in mind you are dealing
with a lot of different legal systems); I strongly suggest not even
thinking about it for a second. Your note "but nowadays people mostly
assume that their software is doing telemetry" may perhaps be true in
the sick world of mobile apps, but I guess it is not true in the world
of developing secure and security-related applications for either
server usage or embedded.


My first reaction to "nowadays people mostly assume that their software
is doing telemetry" was to amend it with "* in the USA" in my mind. But
yes, mobile is another place. Nowadays I do assume most software uses
some sort of phone-home feature, but that's because it's on my to-do
list of things to search for on first configuration. Note that I am
using "phone home" instead of "telemetry" because some companies hide
it in "check for updates" or mix it with some useless "account" stuff.
Finding out where it's hidden and how much information they give about
the details tells you a lot about the developers, as does opt-in vs
opt-out. Therefore it can be a reason not to choose a piece of software
or even an ecosystem after a first try. (Let's say an operating system
almost forces me to create an online account on installation. That not
only tells me I might not want to use that operating system, it also
sends a marketing message that the whole ecosystem is potentially toxic
to my privacy, because they live in a bubble where that appears to be
acceptable.) So I do have that aversion even in non-security-related
contexts.


I would say people are aware that telemetry exists, and developers in 
particular. I would also say developers are aware of the potential 
benefits, so they might be open to it. But what they care and worry 
about is /what/ is reported and how they can /control/ it. Software 
being Open Source is a huge factor in that, because they know that, at 
least in theory, they could vet the source. But the reaction might still 
be very mixed – see Mozilla Firefox.


My suggestion would be a solution that gives the developer the feeling
of making the choices, and puts them in control. It should also be
compatible with configuration management, so that it can be integrated
into company policies as easily as possible. Therefore my suggestions
would be:


 * Opt-In. Nothing takes away the feeling of being in control more
   than perceived "hijacking" of a device with "spyware". This also
   helps circumvent legal problems, because the users or their
   employers now have the responsibility.

 * The switches to turn it on or off should be in a configuration
   file. There should be several staged configuration files: one for a
   project, one for a user, one system-wide. This is for compatibility
   with configuration management. Configurations higher up the
   hierarchy override ones lower in the hierarchy, but they can't
   force telemetry to be on – at least not the sensitive kind. (A
   minimal sketch of this staged lookup follows below.)

 * There should be several levels, or a set of options that can be
   switched on or off individually, for fine-grained control. All
   should be very well documented. Once integrated and documented,
   they can never change without also changing the configuration flag
   that switches them on.

There still might be some backlash, but a careful approach like this
could put minds at ease.
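
To illustrate the staged lookup from the second point, here is a
minimal sketch; the names, and my choice of which stage wins, are only
guesses at a sensible design:

    import qualified Data.Map as M
    import Data.Maybe (fromMaybe, listToMaybe)

    data Setting = Off | On deriving (Eq, Show)

    -- One map per stage; an absent key means "defer to the next stage".
    type Config = M.Map String Setting

    -- Project beats user beats system-wide, except that no stage can
    -- force a sensitive option On once any stage has switched it Off.
    -- An option nobody set defaults to Off (opt-in).
    resolve :: Bool -> String -> Config -> Config -> Config -> Setting
    resolve sensitive key project user system
      | sensitive && Just Off `elem` staged = Off
      | otherwise = fromMaybe Off (listToMaybe [s | Just s <- staged])
      where staged = map (M.lookup key) [project, user, system]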


If you are worried that we might get too little data this way, here's
another thought, leading back to performance data: the most benefit in
that regard would come from projects that are built regularly, on
different architectures, with sources that can be inspected, and with
an easy way to get diffs. In other words, projects that live on GitHub
and Travis anyway. Their maintainers should be easy to convince to set
that little switch to "on".



Regards,
MarLinn


Re: Telemetry

2016-12-09 Thread MarLinn via ghc-devs
Pretty random idea: what if GHC exposed measurement points for
performance and telemetry, but a separate tool handled the read-out,
configuration, upload etc.? That would keep the telemetry from being
built in, while still providing a way to get *some* information.
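
For what it's worth, GHC's eventlog already works a bit like that
split: the program only emits named measurement points, and read-out
and aggregation are left entirely to external tools (e.g. the
ghc-events package). A minimal sketch:

    import Debug.Trace (traceMarkerIO)

    -- Compile with -eventlog and run with +RTS -l; a separate tool then
    -- reads the .eventlog file and decides what, if anything, to upload.
    main :: IO ()
    main = do
      traceMarkerIO "phase:parse:start"
      -- ... real work would happen here ...
      traceMarkerIO "phase:parse:end"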


Such a support tool might be interesting for other projects, too, or
even for slightly different use cases like monitoring servers. The
question is whether such a tool would bring enough benefit to enough
projects for buy-in and to attract contributors. And just separating it
doesn't solve the underlying issues, of course, so attracting
contributors and buy-in might be even harder than it already is for
"normal" projects. Close ties to GHC might improve that, but I doubt
the effect would be big.


Additionally, this approach would just shift many of the questions over
to the Haskell Platform and/or Stack instead of addressing them – or
even further, onto that volatile front line where intra-community
conflict flared up recently. It wouldn't be the worst place to address
them, but I would hesitate to throw yet another potential point of
contention onto that burned field.


Basically: I like that idea, but I might just have proven it fruitless 
anyway.



Cheers,
MarLinn


Re: Help needed: Restrictions of proc-notation with RebindableSyntax

2016-12-20 Thread MarLinn via ghc-devs
Sorry to barge into the discussion with neither much knowledge of the
theory nor of the implementation. I tried to look at both, but my
understanding is severely lacking. However, I do feel a tiny bit
emboldened, because my own findings turned out to at least have the
same shadow as the contents of this more thorough overview.


The one part of the existing story I personally found the most
promising was exploring the category hierarchy around Arrows – in other
words, the Gibbard/Trinkle perspective. Therefore I want to elaborate
my own naive findings a tiny bit. Bear in mind that much of this is
gleaned from experimental implementations or interpretation; I have no
proofs, or even a theory.
Almost all parts necessary for an Arrow seem to already be contained in
a symmetric braided category. Fascinatingly, even the braiding might be
superfluous in some cases, leaving only the need for a monoidal
category. But to get from a braided category to a full Arrow, there
seems to be a need for "constructors" like (arr $ \x -> (x,x)) and
"destructors" like (arr fst). There seem to be several options for
those, and a choice would have to be made. Notably: is introduction
done by duplicating existing values, or by introducing new "unit"
values (for a suitable definition of "unit")? That choice doesn't seem
impactful, but my gut feeling is that that's just because I cannot see
the potential points of impact.
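
To make those naive findings concrete, here is roughly the hierarchy I
experimented with. The class names are mine, and I make no claims about
the laws:

    import Prelude hiding (id, (.))
    import Control.Category

    -- The tensor: (,) as a monoidal structure on the category.
    class Category k => Monoidal k where
      (***) :: k a b -> k c d -> k (a, c) (b, d)

    -- The braiding: swap the two sides of the tensor.
    class Monoidal k => Braided k where
      swap :: k (a, b) (b, a)

    -- The extra "constructors"/"destructors" needed for Arrow power:
    class Braided k => Cartesian k where
      dup :: k a (a, a)  -- like arr (\x -> (x, x))
      exl :: k (a, b) a  -- like arr fst

    -- Arrow's 'first' already falls out of the monoidal part alone:
    firstM :: Monoidal k => k a b -> k (a, c) (b, c)
    firstM f = f *** id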


What makes this story worse is that the currently known hierarchies
around ArrowChoice and ArrowLoop seem to be coarser still – although
the work around profunctors might help. That said, my understanding is
so bad that I cannot even see any benefits or drawbacks of the
structure of ArrowLoop's "loop" versus a more "standard" fixed-point
structure.
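
For reference, these are the two shapes I am comparing; the function
case below is just the standard value-recursion knot, quoted to show
the contrast rather than as any insight of mine:

    -- Control.Arrow has:  class Arrow a => ArrowLoop a where
    --                       loop :: a (b, d) (c, d) -> a b c
    -- versus the plain fixed point  fix :: (a -> a) -> a.
    -- For ordinary functions, loop ties a lazy knot on the second
    -- component of the pair:
    loopFn :: ((b, d) -> (c, d)) -> (b -> c)
    loopFn f b = let (c, d) = f (b, d) in c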


I do, however, think there is something to be gained. The good old 
Rosetta Stone paper still makes me think that what is now Arrow notation 
might be turned into a much more potent tool – exactly because we might 
be able to lift those restrictions. One particular idea I have in mind: 
If the notation can support purely braided categories, it might be used 
to describe reversible computation, which in turn is used in describing 
quantum computation.


The frustrating part for me is that I would like to contribute to this 
effort. But again, my understanding of each and every component is 
fleeting at best.


MarLinn


On 2016-12-21 06:15, Edward Kmett wrote:
Arrows haven't seen much love for a while. In part this is because many
of the original applications for arrows have been shown to be perfectly
suited to being handled by Applicatives, e.g. the Swierstra/Duponcheel
parser that sort of kickstarted everything.


There are several options for improved arrow desugaring.

Megacz's work on GArrows at first feels like it should be applicable
here, as it lets you change out the choice of pseudo-product while
preserving the general arrow feel. Unfortunately, the GArrow class
isn't sufficient for most arrow desugaring, because arrow desugaring
inherently involves breaking apart patterns for almost any non-trivial
use, and nothing really requires the GArrow 'product' to actually be
product-like.


Cale Gibbard and Ryan Trinkle, on the other hand, like to use a more
CCC-like basis for arrows. This stays in the spirit of the GArrow
class, but you still have the problems around pattern matching. I don't
think they actually wrote anything to deal with the actual arrow
notation; they just programmed in the alternate style to get better
introspection on the operations involved. I think the key insight there
is that much of the notation can be made to work with weaker
categorical structures than full arrows, but the existing class
hierarchy around arrows is very coarse.


As a minor data point, both of these sorts of encodings of arrow
problems start to drag in language extensions that make the notation
harder to standardize. Currently, arrows work with bog-standard Haskell
98/2010.


If you're looking for an interesting theoretical direction to extend 
Arrow notation:


An arrow is a strong monad in the category of profunctors [1].

Using the profunctors library [2] (Strong p, Category p) is equivalent 
in power to Arrow p.
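
Concretely, one direction of that equivalence is just a dictionary
translation. A sketch using the actual profunctors classes (the
function names are invented, to avoid clashing with Control.Arrow):

    import Prelude hiding (id, (.))
    import Control.Category (Category (..))
    import Data.Profunctor (Profunctor (..), Strong (..))

    -- Arrow's core operations, recovered from (Category p, Strong p):
    arrP :: (Category p, Profunctor p) => (a -> b) -> p a b
    arrP f = rmap f id   -- arr f: map f over the identity

    firstP :: Strong p => p a b -> p (a, c) (b, c)
    firstP = first'      -- Arrow's 'first' is exactly Strong's first'

    composeP :: Category p => p a b -> p b c -> p a c
    composeP f g = g . f -- (>>>) is Category composition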


Exploiting that, a profunctor-based desugaring could get away with 
much weaker constraints than Arrow depending on how much of proc 
notation you use.


Alternatively, a separate class hierarchy that only requires covariance
in the second argument is an option, but my vague recollection from the
last time I looked into this is that while such a desugaring only uses
covariance in the second argument of the profunctor, you can prove that
contravariance in the first argument follows from the pile of laws.
This subject came up the last time someone thought to extend the Arrow
desugaring. You can probably find a thread on the mailing list from
Ross Paterson a few years ago.


This version has the ben