Re: Possibly moving Debian services to a CDN

2013-10-14 Thread Brian Gupta
On Mon, Oct 14, 2013 at 1:36 AM, Tollef Fog Heen  wrote:
> ]] Paul Wise
>
>> About the archive mirrors, some reworded thoughts from the DPL IRC
>> channel when this came up a few days ago:
>>
>>  [...] I think the current state of affairs is fine;
>
> I don't believe you're one of the people doing the legwork in
> maintaining any of the CDNs we're currently running, so how would you
> know that?  Because it's not visibly broken (most of the time)?  I see
> it breaking on a weekly basis, if not more often.
>
>> [...] Removing the mirror network won't be possible anyway, people are
>> still going to create mirrors, especially ISPs will for their
>> customers; due to quotas and distant mirrors being much slower.
>
> Nobody has suggested removing the mirror network.  What's being
> discussed is using a CDN for some .d.o services.  If somebody wants to
> continue running their mirror they will of course be free to do so.
>
>> Not all CDNs support IPv6.
>
> We will want to use CDNs that do support IPv6.  It's one of the
> technical bits that need to fall into place before we will want to
> switch.
>
>> I would rather expand the mirror network.
>
> Does that mean you're volunteering for the task of doing this and
> maintaining the various existing CDNs?

Yes, the hardest part of this is that Debian itself would basically need
to build its own distribution network. As for whether I would
personally volunteer to help if we decide not to go the route of
using CDNs, the answer is yes. However, I don't think that whether or
not I would volunteer should be a factor in this decision, so I will
share a more fleshed-out point of view than the couple of lines I
shared on IRC.

I believe the considerations here fall into a number of areas:

1) Technical. e.g. - CNAMEs, IPv6 support, security concerns related
to caching, and support for protocols other than HTTP under the same
DNS name as the HTTP services (rsync, SMTP, etc.). I believe these are
the biggest challenges that Debian would find moving to using CDNs,
but trust that if the DSA is seriously considering this, they know
what these issues are and would address them before making any
changes.

2) DFSG concerns - These blackbox services may or may not be built
using Free Software. We really have no way of knowing for sure, since
we would be abdicating the actual responsibility of running software.
That said, if our end users and Debian itself are not required to use
proprietary protocols or tools, I think this issue isn't as major a
concern as it might seem, and I believe that a CDN would be classified
as a "network service", as defined by the FSF. RMS has an interesting
blog post on this topic [1] that I largely agree with. Although it
raises other issues, I don't believe they would affect our decision on
whether or not to use a CDN, but I encourage everyone to read it
before making any decisions. My take is that using a CDN does not
violate the DFSG, but I defer to more experienced hands on this
particular issue.

3) Privacy - There have been issues raised about logging by these
third party CDNs. My sense is that if the CDNs do not replace our
mirror network, and people are free to continue to use existing
mirrors, this change may not introduce new privacy concerns.

4) Commercial nature of CDN services - As Tollef correctly points out,
Debian does rely on monetary and in-kind donations from a number of
for-profit enterprises. A CDN does not change this, so I think this
particular point should not be a major factor in our decision.

5) Relationships - We do have relationships with the members of our
mirror network. An open question remains, "Would migrating to CDN
services damage these relationships?" I suspect that if we allow the
mirror network to remain in place, and we communicate a solid case for
this change to our mirror network partners, prior to making it, the
majority would likely support the change.

6) Single point of failure - While technically any properly designed
CDN should be a distributed system resilient to single failures, the
fact remains that a CDN as a whole, if it were compromised or
suffered a systemic failure (technical or otherwise), could be
catastrophic for Debian and our users. That said, I believe Tollef's
proposal to work with multiple CDNs is likely a good way to address
this concern, which could be further mitigated by keeping our mirror
network in place. One thing to bear in mind is that if any single CDN
makes accommodations/changes for our unique technical needs, it becomes
less likely that these CDNs would be easily interchangeable. One way
to work around this is to make sure that we are working with at least
two networks from the start.

7) User experience - Does the use of CDNs improve the end user
experience? From my understanding and experience working with CDNs,
the answer is almost certainly yes, but I defer to the DSA to confirm
that this is in fact the case.
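
To illustrate point 6, here is a minimal sketch of how a client could try
several sources in order (two CDNs, then a classic mirror) so that no single
provider is a hard dependency. This is only an illustration under stated
assumptions, not how apt or any Debian tool actually behaves: the function
names and the example.org hostnames are hypothetical.

```python
import urllib.request
from typing import Callable, Iterable, Optional


def http_fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download a URL with the standard library and return the body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_fallback(
    path: str,
    base_urls: Iterable[str],
    fetch: Callable[[str], bytes] = http_fetch,
) -> Optional[bytes]:
    """Try each base URL in order; return the first successful response.

    Returns None if every source fails.
    """
    for base in base_urls:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses,
            # so a failing source simply falls through to the next one.
            continue
    return None


# Hypothetical sources: two independent CDNs, then a traditional mirror.
SOURCES = [
    "http://cdn-a.example.org/debian",
    "http://cdn-b.example.org/debian",
    "http://mirror.example.org/debian",
]
```

A caller would use something like fetch_with_fallback("dists/stable/Release",
SOURCES). The ordering only works if all sources serve the same content under
the same paths, which is the interchangeability concern raised above.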

Re: Possibly moving Debian services to a CDN

2013-10-14 Thread Nikolaus Rath
Tollef Fog Heen  writes:
>> 1) Privacy concerns: Debian would deliver much more data to business
>> companies than necessary. Keep in mind that personalized data is one
>> of the most valuable things to data miners. Currently I choose one
>> mirror site to pull my packages from. I can freely choose that mirror
>> on basis of location, bandwidth, personal likes or, let's say, privacy
>> reasons because I know that this specific mirror doesn't log my IPs.
>> When using a CDN, at least in that way I understood your proposal, I'm
>> not free to choose anymore. The company running that CDN will obtain
>> all of data like how many machines are behind a subnet or IP, what
>> kind of machines (intel, sparc, powerpc, m68k, ...) and might know if
>> I forget to update a machine (security).
>
> This is absolutely a valid concern.  I have a few mitigation strategies
> and one observation:
>
> - You can still run your own mirror.  We need that ourselves and like I
> wrote in the initial email, we need to find a way that keeps rsync
> working.
>
> - You can use an IP anonymizing service such as Tor.
 
Are you suggesting downloading Debian packages over Tor? Last time I
used it, I got about 25 kB/s of bandwidth. But even if that has changed,
I'm pretty sure the Tor network isn't intended for bulk transfer of the
Debian archive...


Best,
Nikolaus

-- 
Encrypted emails preferred.
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

 »Time flies like an arrow, fruit flies like a Banana.«


-- 
To UNSUBSCRIBE, email to debian-project-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/87mwmbiqa4@rath.org



Re: Possibly moving Debian services to a CDN

2013-10-14 Thread Henrique de Moraes Holschuh
On Mon, 14 Oct 2013, Paul Wise wrote:
> On Mon, Oct 14, 2013 at 2:16 AM, Joey Hess wrote:
> > But apparently not one solved by free software included in Debian.
> > Perhaps it's worth avoiding using it if that will help encourage the
> > development of libre alternatives.
> 
> I guess the hardest part of the problem is logistics to get machines
> and disks into the network of every ISP in the world and put those on
> an anycast IP address.

While building the anycast infrastructure is quite easy as long as you have
the ASN and IP blocks, getting it deployed is anything but trivial... and
the problems are _not_ technical.  It is not just logistics: getting the
anycast blocks routed and announced properly (remember, most of those will
be global nodes, not reduced-visibility nodes!) is a very large annoyance.

Let's stick to something DNSSEC-based, please.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Archive: http://lists.debian.org/20131014155546.gb29...@khazad-dum.debian.net



DSA Team Meeting minutes, 2013-10-11

2013-10-14 Thread Faidon Liambotis

Present:
luca (Luca Filipozzi)
paravoid (Faidon Liambotis)
weasel (Peter Palfrader)
zumbi (Héctor Orón Martínez)
Mithrandir (Tollef Fog Heen)
zobel (Martin Zobel-Helas)
sgran (Stephen Gran)

Ongoing project update
  o debian.org mail move status (tfheen, sgran)
    - big move is done, cleanup work is needed, esp. to reduce the number of
      mail-receiving hosts
    - unsure at this point if big services like ftp-master & draghi will
      stop accepting mail directly; starting with easier ones such as
      gobby.d.o.  Processing mail locally but not being open to the
      internet is the agreed middle ground.
  o openstack (sgran)
    - Package version evaluation complete.  Havana trial at work underway.  d.o
      deployment work plans started
  o disks for beethoven (weasel, zobel)
    - TK says they can't find any fault with the system; blame it on our
      SATA disks not being enterprise-grade.
    - Peter installed the box and it crashed again, with the controller
      going AWOL
    - This has consumed too much time; agreement to replace beethoven
      early and reuse the disks.
    - ACTION: zobel to look for beethoven replacement (rt #4724)
  o disks for bytemark (tfheen)
    - stalled, on the backburner due to other projects
    - backup disks are getting full; needs reprioritisation
    - ACTION: tfheen to do disks for bytemark
  o ns4 to move away from orff (weasel)
    - no progress
  o CDN plan (tfheen)
    - DPL wants a larger discussion
    - ACTION: tfheen to send an email to debian-project; Lucas to help
  o ARM OOB status (zumbi)
    - we now have serial console (via console servers) to the arm
      machines hosted at ynic and arm.
    - we don't have remote power; this is coming eventually.
  o ARM Calxeda nodes plan/roadmap (zumbi)
    - nothing new, waiting for the ARM sprint next month (14-17 Nov)
  o Shipping of cyclades console servers
    - we now have a console server in Vienna for the mipsels and we have
      remote power for them.
    - one of the console servers is in Darmstadt but not in man-da yet.
      ETA: one month at most.
  o debdelta, codesearch (tfheen)
    - no progress
  o franck and carepacks (luca, zobel)
    - softchoice unresponsive
    - ACTION: decide what we want, tell SPI treasurer to buy it from CDW
  o SSO status (zobel)
    - no progress. Sprint planned for January.
  o New UD status (luca)
    - plan to roll out in December
    - ACTION: luca to reply to Olivier Berger
  o SSL certificates
    - no reply from Gandi regarding billing challenge (root cause: they
      only handle CC well)
    - ACTION: luca to ping Gandi regarding billing mechanism or 'ssl
      certs in lieu'
  o GRnet hardware purchases
    - GRnet has recommitted to the rack space they offer to Debian but
      want to move Debian equipment to new DC
    - ACTION: paravoid to discuss timing / planning for the move; maybe
      we buy new equipment for the new DC
  o HP AllianceOne for Greece
    - our contact in HP@EU has changed position; need to find new
      contact
    - ACTION: paravoid will contact tbm
    - ACTION: paravoid will contact HP reseller that GRnet uses
  o MAN-DA hardware purchases
    - ACTION: zobel will coordinate purchases with those needed for DG-i
      (to replace beethoven)
  o ries disk shipping (luca)
    - in progress; no response from ECE yet
    - ACTION: luca to get this sorted today
  o small items budget
    - discussed on devel, went to d-d-a, done (thanks lucas)
  o alioth hw
    - bytemark blade to be used. no other blockers
  o service guidelines (zobel)
    - no progress
  o archive.org
    - ACTION: luca to ping them
  o stabile
    - ACTION: luca to source new controller cards
  o single source of truth
    - ACTION: luca to look at after ud roll-out
      (http://ralph.allegrogroup.com/)

Open items
  o broken HW that needs somebody to deal with it (there should be tickets):
    - saens disk
  o ravel move

Next Meeting: 2014-11-08 1430Z


Archive: http://lists.debian.org/20131014143426.gb26...@dewey.void.home



Re: Possibly moving Debian services to a CDN

2013-10-14 Thread Tollef Fog Heen
]] Philip Hands

> Tollef Fog Heen  writes:
> 
> ...
> > Nobody has suggested removing the mirror network.  What's being
> > discussed is using a CDN for some .d.o services.
> 
> That was certainly not clear from your original post.
> 
> I certainly read you as suggesting that some services could be moved to
> third-party CDN(s), with an eye to moving ftp.debian.org there too, with
> the implication that the mirror network would then become mostly
> redundant.

«Become redundant» is not the same as being removed, though.  It would
initially be something we ran alongside the regular mirror network
(anything else would be crazy for what I think are obvious reasons).

If our experiences are then positive, we might want to stop relying on
the mirror network in, say, d-i, but there's no central planning
committee shutting down any mirrors.

Local mirrors choose whether they want to carry Debian or not, and I
suspect many of them will want to use the resources for other things if
the usage falls below a threshold. Whether that actually happens or not
amounts to predicting the future, something I'm not going to try to do.

Does that make it clearer, or is it still confusing?

-- 
Tollef Fog Heen
UNIX is user friendly, it's just picky about who its friends are


Archive: http://lists.debian.org/87txgk5839@qurzaw.varnish-software.com



Re: Possibly moving Debian services to a CDN

2013-10-14 Thread Philip Hands
Tollef Fog Heen  writes:

...
> Nobody has suggested removing the mirror network.  What's being
> discussed is using a CDN for some .d.o services.

That was certainly not clear from your original post.

I certainly read you as suggesting that some services could be moved to
third-party CDN(s), with an eye to moving ftp.debian.org there too, with
the implication that the mirror network would then become mostly
redundant.

I would suggest that that's the scenario that is causing people to argue
against you, so if that's not what you were suggesting, perhaps you
should try to express your plans again to get the discussion closer to
what you think you were suggesting.

Cheers, Phil.
-- 
|)|  Philip Hands [+44 (0)20 8530 9560]    http://www.hands.com/
|-|  HANDS.COM Ltd.                        http://ftp.uk.debian.org/
|(|  10 Onslow Gardens, South Woodford, London  E18 1NE  ENGLAND




Re: Possibly moving Debian services to a CDN

2013-10-14 Thread Ingo Jürgensmann
Am 14.10.2013 um 07:29 schrieb Tollef Fog Heen :

>> 1) Privacy concerns: Debian would deliver much more data to business
>> companies than necessary. Keep in mind that personalized data is one
>> of the most valuable things to data miners. Currently I choose one
>> mirror site to pull my packages from. I can freely choose that mirror
>> on basis of location, bandwidth, personal likes or, let's say, privacy
>> reasons because I know that this specific mirror doesn't log my IPs.
>> When using a CDN, at least in that way I understood your proposal, I'm
>> not free to choose anymore. The company running that CDN will obtain
>> all of data like how many machines are behind a subnet or IP, what
>> kind of machines (intel, sparc, powerpc, m68k, ...) and might know if
>> I forget to update a machine (security).
> This is absolutely a valid concern.  I have a few mitigation strategies
> and one observation:
> - You can still run your own mirror.  We need that ourselves and like I
> wrote in the initial email, we need to find a way that keeps rsync
> working.

Yeah, running my own mirror is an option for me. I did run a backports.org 
mirror in the past and was thinking of expanding it to a full-blown mirror. 
But that, of course, is not an option for Joe Average User. 

> - You can use an IP anonymizing service such as Tor.

We know that the NSA and GCHQ are running Tor exit nodes. So far they don't
have the capacity to track all Tor traffic globally; they can only track
single users, at great cost and effort. But this might just be a matter of
time, a race against countermeasures from the Tor project.

> - You can use a local proxy that hides the details of how many nodes,
> etc. you have.

There are ways to distinguish nodes/users behind a proxy by using 
fingerprinting, latency checks and other stuff. 

Yes, I know it is rather unlikely that someone will do that for Debian
updates, but until a few weeks ago I couldn't imagine secret services
doing a Full Take of intercontinental sea cables like GCHQ is doing. The
lesson from that is: if a secret service like the NSA or GCHQ wants to know
something, no effort is too big.
All the Debian project can do is drive up the cost of this kind of
surveillance. Or, to put it the other way around: Debian should avoid making
it easier for them.

> - I would like us to have agreements with any donors that they're not
> allowed to use the information for anything but operational issues.  We
> can't tell them not to log (because that's really hard on a technical
> level), but we can restrict what they can do with the logs.

True. You can request agreements, but as the whole NSA affair is showing, they
don't matter when it comes to the NSA & Co. There are secret courts with
secret decisions, and National Security Letters to silence the providers,
even though international agreements like Safe Harbor exist.
So, while agreements can be made, there will be no way for Debian to verify
whether they are being honored.

> The observation is that we currently don't have any such control over
> mirror operators.  They are, AFAIK, free to use whatever information
> they collect for whatever purpose they would like.

Granted. That's maybe something Debian can address as well in the future. 

But having many mirror operators results in:
- higher "costs" for controlling them
- each mirror operator only sees its own traffic
- each mirror site will be subject to the specific law in that country (higher 
data protection level in Germany for example)

Well, I think you got the point already... ;-) 

>> 2) Integrity concerns: although Debian uses signed package lists and
>> hashed packages, using a CDN would raise the chances that there might
>> be attack vectors by manipulating the traffic. Maybe not by the will
>> of the operating company, but there are other groups that might have
>> the interest and the power to intercept traffic and manipulate it. This
>> is, of course, also true of current mirror sites, but a centralized
>> CDN will be more convenient for that kind of attacker.
> Given we don't use HTTPS and such today, you don't know if the traffic
> is actually going to the mirror you think it's going to, so this isn't
> really different from today.  With a CDN we could actually push more of
> the traffic to HTTPS if we wanted.  This isn't feasible with today's
> mirror network.
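
For reference, the "hashed packages" check mentioned in the quoted text can
be sketched as below. This is a minimal illustration, not apt's actual
implementation: it assumes the expected SHA256 value was taken from a
Packages index whose signed Release file has already been verified
separately (the signature check itself is not shown), and the function
name is hypothetical.

```python
import hashlib


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of data equals expected_hex.

    expected_hex is assumed to come from an index that was already
    authenticated; this function only covers the hash comparison.
    """
    actual = hashlib.sha256(data).hexdigest()
    # hexdigest() is lowercase, so normalise the expected value too.
    return actual == expected_hex.strip().lower()
```

Because the check depends only on the downloaded bytes and the signed
index, it gives the same result whether the file came from a CDN or a
mirror, and whether the transport was HTTP or HTTPS.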

That's a valid point, thanks! The use of HTTPS should be encouraged, of
course. How would HTTPS with a CDN work? I would expect the CDN provider
to use some kind of SSL proxy or SSL interception technique. Otherwise you
would have the same problems managing HTTPS as with the current mirror
network.
There are probably two possible ways:
a) The CDN provides an HTTPS entry point, but connects to the underlying
mirror by plain HTTP.
b) The CDN uses DPI and SSL interception to break end-to-end encryption

For example using Cisco WAAS is a nice and de