Re: Phoning home

2008-02-26 Thread Steve Langasek
On Wed, Feb 27, 2008 at 07:26:37AM +1000, Anthony Towns wrote:
> On Mon, Feb 25, 2008 at 04:25:28PM -0800, Steve Langasek wrote:
> > On Mon, Feb 25, 2008 at 10:16:29AM +0100, Giacomo A. Catenazzi wrote:
> > >> Speaking as a human being, I would suggest that Debian policy should be
> > >> that all "phoning home" MUST be enabled explicitly, and MUST be turned
> > >> off by default.
> > "should" here would only mean that we've failed to correctly define "phoning
> > home".

> So that'd be something like "Packages should not communicate on the
> network except as specifically necessary for their functionality. In
> particular they must not `phone home' by contacting a central service to
> report user statistics back to the author, unless the user specifically
> enables that option." ?

s/should not/must not/, then yes. :)

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer                    http://www.debian.org/
[EMAIL PROTECTED] [EMAIL PROTECTED]


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: Phoning home

2008-02-26 Thread Anthony Towns
On Mon, Feb 25, 2008 at 04:25:28PM -0800, Steve Langasek wrote:
> On Mon, Feb 25, 2008 at 10:16:29AM +0100, Giacomo A. Catenazzi wrote:
> >> Speaking as a human being, I would suggest that Debian policy should be
> >> that all "phoning home" MUST be enabled explicitly, and MUST be turned
> >> off by default.
> "should" here would only mean that we've failed to correctly define "phoning
> home".

So that'd be something like "Packages should not communicate on the
network except as specifically necessary for their functionality. In
particular they must not `phone home' by contacting a central service to
report user statistics back to the author, unless the user specifically
enables that option." ?

Cheers,
aj





Re: Phoning home

2008-02-26 Thread Ian Jackson
Steve Langasek writes ("Re: Phoning home"):
> On Mon, Feb 25, 2008 at 10:16:29AM +0100, Giacomo A. Catenazzi wrote:
> > No, I prefer the SHOULD form, because it permits the
> > right thing to be done, giving the Debian developer
> > the freedom (and burden) to check what is bad, and
> > what is acceptable.
> 
> "should" here would only mean that we've failed to correctly define "phoning
> home".  There's no legitimate reason for Debian packages to phone home, and
> it's always a bug if they do; if this is to be referenced in policy at all,
> this should be made plain.

I think you're twisting the definition here.  `Phoning home' means
connecting to some central server defined by the software developers.
It's value neutral.

If you use `phoning home' to mean only bad things, then we need a new
value-neutral phrase.


But taking your usage on board for the sake of the rest of your
message:

> > Think about:
> > apt
> 
> Not "phoning home":
> 
> - the requests don't contain identifying information about the client, with
>   the exception of the source IP address.

That is often enough to identify an individual user.

> - with the exception of security.d.o, there's no calling back to a central
>   server.

The mirrors are central servers.  I don't think it makes that much
difference whether it's one or more.

> - the requests are central to the functionality of the package, not
>   gratuitous calls for purposes of statistics-gathering.

That's the critical justification.

> - the requests must be initiated by the user.

Ubuntu prompts desktop users with admin ability when updates are
available.  I think this is a very good thing and we should do it too.

> > ntpdate
> 
> Not phoning home, for the same reasons as above (minus the last point).

Of course we don't run these NTP servers.  But we trust pool.ntp.org's
DNS servers (which can discover that our users are Debian users) and
the packets to the actual NTP servers are difficult to identify and
track.

> > clamav-freshclam
> 
> - central to the functionality of the package; if you don't want to be
>   trackable you don't install the package.
> - statistics gathering is a side-effect of the main purpose of the package,
>   and there's no way around this short of anonymizing your client access
>   through tor or similar.

Isn't this just an update downloader ?  What statistics are
collected ?  Do we direct our users to our servers, or to ones run by
upstream ?  If the latter, what privacy assurances do we have and why
do we believe them ?

> > Some of such project collects statistics.
> 
> The issue is not whether packages communicate with projects that collect
> statistics; the issue is whether the packages do so for the *purpose* of
> allowing statistics-gathering.

One of the key principles of data protection is that information
gathered for one purpose should not be used for another.

So information necessarily exposed to make the program work should not
be collected and statistically analysed by our servers even if that is
technically possible to do.

Ian.





Re: Phoning home

2008-02-26 Thread Ian Jackson
Thomas Bushnell BSG writes ("Re: Phoning home"):
> These are two separate concerns.
> 
> Concern One: What a server does with information as a result of its
> operations;
> 
> Concern Two: What network traffic a program makes in its operation.

I think it is a mistake to separate these things in this way.

In the context of a particular program, it makes sense to consider
them both at once.  What network traffic a program ought to make
depends crucially on the servers it might be talking to; likewise,
what a server ought to do depends on the circumstances in which it
might be contacted.

> We cannot fix Concern One directly for other people's servers, and so we
> must not get sidetracked into thinking we should.

I disagree.  We should consider whether we can take measures so that
users' data is exposed only to trustworthy servers.  That might mean
choosing different servers, running our own, or disabling relevant
features.

I'm not saying we should do anything impractical, risky or stupid,
like running our own sanitising forwarding DNS proxy, or something.

Ian.





Re: Phoning home

2008-02-26 Thread Ian Jackson
Russ Allbery writes ("Re: Phoning home"):
> I suppose that apt never updates itself unless you have something
> configured to do so (although does synaptic default to running aptitude
> update periodically?).

We can serve our users better by having our apt phone home to ask if
there are updates, because that way the user is more likely to be able
to conveniently install security fixes which can be very
important.

>  But at least in theory Debian could track all
> sorts of interesting information about users based on what packages they
> download and when.  We *don't*, of course, but companies whose software does
> similar things do so.

Indeed.

So we should
  * Phone home only when we need to
  * Send as little information as necessary when we do so
  * Connect to servers which are as trustworthy as possible
  * Trustworthy servers, amongst their other good attributes,
    store as little information as possible

Ian.





Re: Phoning home

2008-02-26 Thread Ian Jackson
Thomas Bushnell BSG writes ("Re: Phoning home"):
> On Sun, 2008-02-24 at 13:54 +, Ian Jackson wrote:
> > But I was rather surprised to find this situation.  It looks like the
> > prospective maintainer was aware of the phoning home but didn't
> > consider it a release-critical bug; they are also reluctant to
> > override upstream's wishes without some clear Debian policy statement
> > to the effect that this is not permissible.
> 
> I'm unclear about this "override upstream's wishes" part.  I have heard
> this kind of thing a number of times, and I strongly disagree with it.

That was my wording, but the prospective maintainer's sentiment.
I wholeheartedly agree with you.

> It sounds as if the maintainer is saying that upstream gets some kind of
> veto, which can only be overridden if there is a "clear Debian policy
> statement" on the point, and that is a mistaken and buggy approach.
> Upstream doesn't get a veto.

Yes.  I have explained this :-).

> There are good social and technical reasons not to deviate from upstream
> without good reasons, but this is a good reason, whether there is a
> "clear policy" or not.

I think what's not clear to everybody is that this is a good reason.

Conventional privacy mores in much of the world at large have greatly
changed.  I deplore these changes, and I'm glad to see that Debian
appears to be willing to hold a stronger line.

But to provide clarity, I think it would be a good idea to write
something down in policy - just as in other areas of potential
controversy, we have some explicit statements of our collective view.

Ian.





Re: Phoning home

2008-02-26 Thread Ian Jackson
Julian Gilbey writes ("Re: Phoning home"):
> On Sun, Feb 24, 2008 at 01:54:11PM +, Ian Jackson wrote:
> > I think therefore that we should add some statement to policy about
> > phoning home.
> 
> Agreed.
> 
> > As a starting point:
> > 
> >  * Software in Debian should not communicate over the network except
> >    - in order to, and as necessary to, perform their function
> >      (which includes the established Debian software update
> >       distribution infrastructure); or
> 
> I'm not sure what the phrase in parentheses means.

That is, apt is allowed to phone home to check for and download the
updates.

> >    - for other purposes with explicit permission from the user
> 
> So what about visiting a website with a browser which then opens a
> popup?  Not sure how best to word this, but I fundamentally agree with
> the sentiment.

I agree that unwanted popups are abusive and our browser should do its
best to stop them.  Sadly this is a difficult and complex problem with
political, economic and technological aspects.  I didn't intend to
stick my oar into the war between websites and browsers.

> >    - Usually, our software should communicate only to servers we
> >      control or which we have substantial reason to trust.
> 
> "By default", our software should ...
> The user might be given an option to change this (see below).

Yes, but also `usually' because there might be reasons for it to do
something else.

For example, if you ask to visit http://www.lycqmitb.com (perhaps
because lycqmitb is a website password and you fumbled the
cut-and-paste), the system will necessarily send your DNS query to
your ISP and ultimately to the root nameservers and to the nameservers
for .com.

We already know that the nameservers for .com are not trustworthy;
they have in the past betrayed users' trust quite egregiously.  But we
don't have any practical choice to avoid this for our users.

So I wanted, unfortunately, to leave open the possibility that we
might send our users' data to untrusted servers if we don't have any
other sane options.

> We could have one question which asks "Some software authors like
> collecting anonymised data about the usage of their software in order
> to better optimise it.  Would you be willing to participate in this?",
> and then the possibility of opting in/out of individual packages.
> Also, any package which does something essentially different could
> have its own question.

That was the kind of thing I was thinking about.  We should aggregate
as well as anonymise, and possibly add some randomness to the figures
or blank out figures with very small sample sizes compared to their
magnitudes, to avoid recovery through sophisticated analysis.
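
To make the idea concrete, here is a rough sketch of such a laundering
step in Python.  Nothing like this exists yet: the function name, the
MIN_SAMPLE threshold and the NOISE_SCALE value are all invented for
illustration, and a plain fixed threshold stands in for "very small
sample sizes".  It counts each user at most once per feature, drops the
user ids, blanks out small counts and adds some randomness to the rest.

```python
import random
from collections import Counter

MIN_SAMPLE = 10      # hypothetical threshold: smaller counts are blanked out
NOISE_SCALE = 2.0    # hypothetical spread of the added randomness


def launder(reports, rng=None):
    """Aggregate (user_id, feature) pairs into noisy per-feature counts.

    User ids are used only to deduplicate and never appear in the output.
    """
    rng = rng or random.Random()
    seen = set(reports)                      # one vote per user per feature
    counts = Counter(feature for _user, feature in seen)
    published = {}
    for feature, n in counts.items():
        if n < MIN_SAMPLE:
            published[feature] = None        # blank out small samples
        else:
            noisy = n + rng.gauss(0, NOISE_SCALE)
            published[feature] = max(0, round(noisy))
    return published
```

Only the dictionary returned by launder() would ever be passed on to
upstream; the raw reports would be discarded once it has been computed.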

So this would have to be configured and negotiated on a per-package
basis.  I would be happy to write and operate the central laundering
service.

> This could be an option given to the user, I guess.  I like the
> possibility of anonymising responses, as long as it does not
> negatively affect the benefits the phoning home provides.  (For
> example, it could be that upstream wants to know about the habits of
> individual users and their patterns over time rather than just the sum
> total of this information.  In such a case, Debian would have to track
> the individual users, then modify the info before sending it
> upstream.)

The per-user information could be tracked on the end user's system,
and only the summary transmitted to Debian.
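
As a sketch of what I mean (the class and method names are made up for
this example, and how the summary actually gets sent is left out
entirely), the per-user events would stay in a local log and only
per-feature totals would be handed to whatever does the transmitting:

```python
from collections import Counter


class LocalUsageLog:
    """Keeps per-user events on the user's own machine."""

    def __init__(self):
        self._events = []          # stays on this system

    def record(self, user, feature):
        self._events.append((user, feature))

    def summary(self):
        """Return only totals per feature, with no user names or ordering."""
        return dict(Counter(feature for _user, feature in self._events))
```

If upstream really needs patterns over time, those would have to be
computed locally too, and reduced to a summary before anything is sent.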

It is always better not to collect the information than to collect it
and then throw it away.

Ian.





Re: Phoning home

2008-02-26 Thread Bas Wijnen
On Sun, Feb 24, 2008 at 08:27:59PM -0600, Raphael Geissert wrote:
> > On Sun, Feb 24, 2008 at 07:44:53PM -0600, Raphael Geissert wrote:
> >> The problem I see here is that admin != user in all the situations.
> >> IMO it should ask, or at least warn, the user and not the admin.
> >> Because in the end it is the user's privacy that is affected, not the
> >> administrator's.
>
> All it has to do is check if the user has already been warned and if
> not do it, of course only when the program is run.

You make it sound as if that's simple (it is) and good (it's not, IMO),
but I think it very much resembles having to click through a license for
every package you install.  One of the nice things about Debian is that
the user doesn't need to worry about such things: Debian makes sure
things are fine.

IMO a dialog asking me if I want to send information to upstream is
annoying.  Getting one for every program for every user makes Debian
significantly worse for our users.  Let's not go that way, please.

> If there's no easy way to do it then just for the sake of simplicity a
> patch rewriting the 'phoning home' function should be written.

In all cases, a patch disabling the "feature" would be acceptable.  If
it makes upstream really happy, I can live with an option to enable the
functionality.  But it must be disabled by default, and the user must
not be asked anything (unless perhaps they have a "low" debconf
threshold).  See also my comments to Thomas' e-mail below.
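
Roughly, in Python (the "phone_home" setting name and both function
names are hypothetical, only there to show the default): the feature
stays off unless the admin has explicitly set the option, and nothing is
ever asked at run time.

```python
def may_phone_home(settings):
    """Return True only if reporting was explicitly enabled."""
    # A missing key, or any value other than True, means "off".
    return settings.get("phone_home") is True


def maybe_report(settings, send):
    """Call send() only when reporting is enabled; never prompt the user."""
    if may_phone_home(settings):
        send()
        return True
    return False
```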

> IMHO that sounds more reasonable than letting the admin decide about the
> users privacy.

As a user, if you don't trust the admin, you shouldn't use the machine.
More specifically, you shouldn't give any data to a computer that you
don't trust the administrator with.  If the administrator turns such a
feature on, then that's the person who passes your information to
upstream.  They can do this anyway.  Annoying the user just confuses the
issue.  If the admin really wants to send out this information, and he's
evil, he can ask the question and ignore the answer.

In other words, asking the user doesn't add any security, but it does
add annoyance.

The solution (to the problem that the user doesn't know that the admin
violates his privacy) is to educate users that anything they do on a
machine can be seen and modified by the administrator.  Asking such
questions to users suggests otherwise, which is a bad idea in itself
IMO.

The admin has full control over the machine, including all user data in
it.  Let's not pretend otherwise.

On Sun, Feb 24, 2008 at 05:40:42PM -0500, Thomas Bushnell BSG wrote:
> > they are also reluctant to override upstream's wishes without some
> > clear Debian policy statement to the effect that this is not
> > permissible.
> 
> I'm unclear about this "override upstream's wishes" part.  I have heard
> this kind of thing a number of times, and I strongly disagree with it.
> 
> Debian is not a conduit for upstream packages to get conveniently
> compiled for Debian, is it?  It's a coherent system.  Debian maintainers
> have the job of making their packages DTRT, whether upstream does that
> or not, whether upstream agrees or not.

I fully agree.

> It sounds as if the maintainer is saying that upstream gets some kind of
> veto, which can only be overridden if there is a "clear Debian policy
> statement" on the point, and that is a mistaken and buggy approach.
> Upstream doesn't get a veto.

I don't think this was meant.  However:

> There are good social and technical reasons not to deviate from upstream
> without good reasons, but this is a good reason, whether there is a
> "clear policy" or not.

Upstream apparently isn't so impressed by this reason.  For the
maintainer, it is socially a good thing to have some formal document to
point at; "this is how we do things in Debian" as opposed to "that's how
I personally prefer things to be done".

I share your feeling that some maintainers seem to not want to modify
upstream's work except to fix "real" bugs that upstream will want to fix
later.  I think that we should make clear that this is not the Right
Thing to do.  Debian is about making the best possible OS.  That
includes consistency.  If upstream's work is not consistent with the
rest, we modify it, whether upstream likes it or not.  The whole point
of free software is that we can do that.

However, good relations with upstream are valuable, and for that reason
it is good to formally write down some things, like "our software
doesn't by default connect to anything which isn't needed for it to
function, and doesn't by default send more than needed to any server".

Thanks,
Bas

-- 
I encourage people to send encrypted e-mail (see http://www.gnupg.org).
If you have problems reading my e-mail, use a better reader.
Please send the central message of e-mails as plain text
   in the message body, not as HTML and definitely not as MS Word.
Please do not use the MS Word format for attachments either.
For more information, see http://p