RE: Telemetry (WAS: Attempt at a real world benchmark)

Simon Peyton Jones via ghc-devs Fri, 09 Dec 2016 07:16:15 -0800

Just to say:


·         Telemetry is a good topic

·         It is clearly a delicate one as we’ve already seen from two widely 
differing reactions.  That’s why I have never seriously contemplated doing 
anything about it.

·         I’d love a consensus to emerge on this, but I don’t have the 
bandwidth to drive it.

Incidentally, when I said “telemetry is common” I meant that almost every piece 
of software I run on my PC these days automatically checks for updates.  It no 
longer even asks me if I want to do that; it just does it.  That’s telemetry 
right there: the supplier knows how many people are running each version of 
their software.

Simon

From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of MarLinn via 
ghc-devs
Sent: 09 December 2016 14:52
To: ghc-devs@haskell.org
Subject: Re: Telemetry (WAS: Attempt at a real world benchmark)



>> It could tell us which language features are most used.

> Language features are hard if they are not available in separate libs. If
> they are in libs, then IIRC Debian packages those separately, so again you
> can use their popularity contest.

What in particular makes them hard? Sorry if this seems like a stupid question 
to you, I'm just not that knowledgeable yet. One reason I can think of would be 
that we would want attribution, i.e. did the developer turn on the extension 
himself, or is it just used in a lib or template – but that should be easy to 
solve with a source hash, right? That source hash itself might need a bit of 
thought though. Maybe it should not be a hash of a source file, but of the 
parse tree.
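
As a rough illustration of the difference between hashing the source text and hashing the parse tree, here is a minimal sketch. It is an analogy only: Python's standard `ast` module stands in for GHC's parser, and the function names `source_hash` and `parse_tree_hash` are invented for this example. The point is that two files differing only in whitespace or comments get different source hashes but the same parse-tree hash.

```python
import ast
import hashlib


def source_hash(src: str) -> str:
    """Hash the raw source text; any formatting change alters the result."""
    return hashlib.sha256(src.encode("utf-8")).hexdigest()


def parse_tree_hash(src: str) -> str:
    """Hash a textual dump of the parse tree instead of the raw text.

    Comments and whitespace never reach the tree, and source positions
    are excluded from the dump, so layout changes leave the hash unchanged.
    """
    tree = ast.parse(src)
    dumped = ast.dump(tree, include_attributes=False)
    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    original = "x = 1 + 2\n"
    reformatted = "x  =  1 +   2  # a comment\n"
    print(source_hash(original) == source_hash(reformatted))
    print(parse_tree_hash(original) == parse_tree_hash(reformatted))
```

Whether a tree-based hash is the right granularity for attribution would still need the "bit of thought" mentioned above; the sketch only shows that such a hash is insensitive to formatting.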


>> The big issue is (a) design and implementation effort, and (b) dealing with
>> the privacy issues.  I think (b) used to be a big deal, but nowadays people
>> mostly assume that their software is doing telemetry, so it feels more
>> plausible.  But someone would need to work out whether it had to be opt-in
>> or opt-out, and how to actually make it work in practice.

> Privacy here is a complete can of worms (keep in mind you are dealing with
> a lot of different legal systems); I strongly suggest not even thinking
> about it for a second. Your note "but nowadays people mostly assume that
> their software is doing telemetry" may perhaps be true in the sick world of
> mobile apps, but I guess it is not true in the world of developing secure
> and security-related applications for either server or embedded use.


My first reaction to "nowadays people mostly assume that their software is 
doing telemetry" was to amend it with "* in the USA" in my mind. But yes, 
mobile is another place. Nowadays I do assume most software uses some sort of 
phone-home feature, but that's because it's on my To Do list of things to 
search for on first configuration. Note that I am using "phone home" instead of 
"telemetry" because some companies hide it in "check for updates" or mix it 
with some useless "account" stuff. Finding out where it's hidden and how much 
information they give about the details tells a lot about the developers, as 
does opt-in vs opt-out. Therefore it can be a reason to not choose a piece of 
software or even an ecosystem after a first try. (Let's say an operating system 
almost forces me to create an online account on installation. That not only 
tells me I might not want to use that operating system, it also sends a 
marketing message that the whole ecosystem is potentially toxic to my privacy 
because they live in a bubble where that appears to be acceptable.) So I do 
have that aversion even in non-security-related contexts.

I would say people are aware that telemetry exists, and developers in 
particular. I would also say developers are aware of the potential benefits, so 
they might be open to it. But what they care and worry about is what is 
reported and how they can control it. Software being Open Source is a huge 
factor in that, because they know that, at least in theory, they could vet the 
source. But the reaction might still be very mixed – see Mozilla Firefox.

My suggestion would be a solution that gives the developer the feeling of 
making the choices, and puts them in control. It should also be compatible with 
configuration management so that it can be integrated into company policies as 
easily as possible. Therefore my suggestions would be

·      Opt-In. Nothing takes away the feeling of being in control more than 
perceived "hijacking" of a device with "spyware". This also helps circumvent 
legal problems because the users or their employers now have the responsibility.

·      The switches to turn it on or off should be in a configuration file. 
There should be several staged configuration files, one for a project, one for 
a user, one system-wide. This is for compatibility with configuration 
management. Configurations higher up the hierarchy override ones lower in the 
hierarchy, but they can't force telemetry to be on – at least not the sensitive 
kind.

·      There should be several levels or a set of options that can be switched 
on or off individually, for fine-grained control. All should be very well 
documented. Once integrated and documented, they can never change without also 
changing the configuration flag that switches them on.
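
The staged configuration described in the points above could be resolved roughly as follows. This is only a sketch of one possible reading: every option is off by default (opt-in), any layer may switch an option off for everything beneath it, and no layer can force an option on against another layer's explicit "off". The option names (`build_times`, `extensions_used`, `source_hashes`) and the function `effective_telemetry` are hypothetical, chosen purely for illustration.

```python
from typing import Dict, List

# One layer of configuration: option name -> True (opt in) or False (opt out).
# An option missing from a layer means that layer expresses no opinion.
Layer = Dict[str, bool]


def effective_telemetry(layers: List[Layer]) -> Dict[str, bool]:
    """Combine the system, user and project layers into effective settings.

    An option ends up enabled only if at least one layer opted in and no
    layer opted out. Options that no layer mentions are absent from the
    result and should be treated as off.
    """
    options = set()
    for layer in layers:
        options.update(layer.keys())

    result: Dict[str, bool] = {}
    for option in sorted(options):
        votes = [layer[option] for layer in layers if option in layer]
        result[option] = all(votes)  # a single False vetoes the option
    return result


if __name__ == "__main__":
    system = {"build_times": True, "extensions_used": False}
    user = {"build_times": True}
    project = {"extensions_used": True, "source_hashes": True}
    print(effective_telemetry([system, user, project]))
```

Treating an explicit "off" as a veto is just one way to honour the "can't force telemetry to be on" requirement; a real design might apply it only to the sensitive options and let the rest follow a plain override order.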

There still might be some backlash, but a careful approach like this could 
soothe the minds.

If you are worried that we might get too little data this way, here's another 
thought, leading back to performance data: The most benefit in that regard 
would come from projects that are built regularly, on different architectures, 
with sources that can be inspected and with an easy way to get diffs. In other 
words, projects that live on GitHub and Travis anyway. Their maintainers should 
be easy to convince to set that little switch to "on".



Regards,
MarLinn

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
