Report from the Ruby Sprint, January 2014
This is the report of the Ruby team sprint held at the IRILL offices in Paris, France, between January 15th and 17th, 2014. We thank IRILL for hosting us, and the Debian sponsors for funding the expenses required for the sprint.

# Attendees

* Antonio Terceiro
* Cédric Boutillier
* Christian Hofstaedtler
* Jonas Genannt
* Lucas Nussbaum (Jan 17th)

# Initial plans

The following topics were worked on:

- Work towards removal of Ruby 1.8
- Decide on supported interpreter versions for Jessie
- Get the Rails 3 package(s) in shape
- Get the Rails 4 package(s) in shape
- Discussion of an initial list of key packages for the team
- Drop the standalone `rubygems` package
- Work on removal of old transitional packages remaining from the wheezy changes (~100)
- Support for autopkgtest in gem2deb
- Ruby policy and the ruby-policy package

We were able to work on all but the last two items. The results are reported below. Raw notes (unreviewed) that were taken during the sprint can be found at http://www.okfnpad.org/p/DebianRubySprintParis2014

In general, the sprint was very productive and fun, and we encourage other teams that have not had one yet (as was the case for the Ruby team) to try it out! Some things really are best discussed in a room with a whiteboard. In our case, working out the plan forward for obsolete and new Ruby interpreters would probably have taken us weeks over email, but we were able to settle on a good enough solution in one afternoon.

# Work towards removal of Ruby 1.8

Work in this area mainly involves either fixing packages to work with a newer Ruby version, or giving up because the package is broken beyond repair, in which case we file removal bugs for them. Pretty much all of the packages in the second case have a long-dead upstream and nobody out there who cares about them.
On this front, we did:

- ~24 uploads fixing packages
- ~9 patches sent to the BTS
- ~29 bugs filed and/or updated

# Decision on interpreter versions

Taking the upstream support periods into account, this discussion started with us collecting the expected EOL dates for the Ruby series currently supported by upstream:

Version  Release Date  Upstream EOL
=======  ============  ====================
1.8      irrelevant    Past EOL
1.9.3    irrelevant    Feb 2015¹
2.0.0    Feb 2013      Feb 2016 (best case)²
2.1      Dec 2013      Dec 2016 (best case)²
2.2      Dec 2015 (?)  Dec 2018 (best case)²

¹ http://www.ruby-lang.org/en/news/2014/01/10/ruby-1-9-3-will-end-on-2015/
² https://bugs.ruby-lang.org/issues/9215#change-43556

Assuming that Jessie will freeze in early November 2014, estimating (optimistically, one could say) its release for June 2015, and assuming 3 years of support, the EOL for Jessie would be June 2018. With these dates in mind:

- It's impractical to keep Ruby 1.9 for Jessie
- Ruby 2.2 will be released too late to be considered for Jessie

It's clear to us that the default should be 2.1. The remaining question was whether it's worthwhile to also keep 2.0, and we thought it's not. The more recent Ruby versions have not introduced as much backwards incompatibility as Ruby 1.9 did with respect to Ruby 1.8. So if a Ruby project works today with Ruby 1.9 (the oldest of the upstream-supported series), it will most likely also work with both Ruby 2.0 and Ruby 2.1.

This led us to the realization that it does not make much sense for us to keep supporting multiple Ruby versions simultaneously. Having multiple interpreters is a burden:

- on the interpreter maintainer(s), who have to maintain multiple versions for multiple Debian releases. A single security issue might need 6+ separate uploads (2+ Ruby versions x 3 suites: unstable, stable, oldstable) that have to be built and tested.
- on the Ruby team, since whenever we want to drop an old, deprecated interpreter we have to hunt down every maintainer to get their packages updated before we can remove the old interpreter.
- because of the alternatives choice, upgrades might be a pain if users have explicitly selected an old interpreter as /usr/bin/ruby.

Moving forward, our plan is to have a single Ruby interpreter in the archive. We will keep the infrastructure for multiple versions around so that we can temporarily carry multiple versions when we want to introduce a new one, for a smoother transition. So over the next months you can expect that:

- ruby1.8 will be removed
- ruby1.9.1 will be removed
- ruby2.0 will be made the default (pretty soon)
- ruby2.1 will be made the default (later)
- ruby2.0 will be removed
- switching /usr/bin/ruby with update-alternatives will no longer be supported

# Get the Rails 3 package(s) in shape

- Finished and uploaded a unified Rails 3 source package (rails-3.2), to replace ~8 separate source packages. The new package includes an autopkgtest test suite that will alert us if any dependency breaks Rails. Currently waiting in NEW.
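The interpreter-version reasoning above can be sketched numerically. This is an illustrative calculation only: the dates come from the table and the report's own optimistic Jessie estimate, and the exact days-of-month are assumptions, since upstream only announced months.

```python
from datetime import date

# Projected Jessie EOL: release June 2015 + 3 years of support
# (the sprint's optimistic estimate; day-of-month assumed).
jessie_eol = date(2018, 6, 1)

# Best-case upstream EOLs from the table above; days-of-month assumed.
ruby_eol = {
    "1.9.3": date(2015, 2, 1),
    "2.0.0": date(2016, 2, 1),
    "2.1": date(2016, 12, 1),
}

# How much of Jessie's lifetime each series would spend past upstream EOL.
for series, eol in ruby_eol.items():
    uncovered = (jessie_eol - eol).days
    print(f"Ruby {series}: ~{uncovered} days of Jessie after upstream EOL")
```

Ruby 2.1 leaves the smallest unsupported tail of Jessie's lifetime, which is why it is the obvious choice for the default.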
Re: Possibly moving Debian services to a CDN
On 30/01/14 at 13:53 +0100, Tollef Fog Heen wrote:
> ]] Tollef Fog Heen
>
> Hi all,
>
> > - the various bits and bobs that are currently hosted on static.debian.org
>
> I thought it's time for a small update about this. As of about an hour ago, planet and metadata.ftp-master are now served from the Fastly CDN, and it all seems to be working quite smoothly.
>
> We've uncovered some bits we want to make work better, such as adding and removing backend servers automatically when they become unavailable or are added to the static DNS RR, purging content from the caches when it's updated, and possibly some other minor bits. This does sadly mean we don't currently have IPv6 for those two services, something that's being worked on by Fastly.
>
> As for the privacy concerns raised in the thread, I've had quite a lot of discussions with Fastly about how they operate wrt privacy. They don't store request-related logs (only billing information), so there are no URLs, cookies, client IPs or similar being stored. Varnish has an ephemeral log which they go through a couple of times a minute where some of that information is present, but it never leaves the host (unless we enable logging to an endpoint we control). I'm quite content with how they're handling the privacy concerns.
>
> In the interest of full disclosure I should also mention that I'm starting to work for Fastly in a few days' time. I don't believe that has influenced my views or judgements here.

Hi Tollef,

Thanks a lot for this status update. I'm very much in favor of exploring ways to make the Debian infrastructure easier to manage, and using a CDN sounds like a great way to do so. It's great that things worked out with Fastly (any plans for a more public announcement?).
However, in [1], I raised one main non-technical concern that is not mentioned in your mail: I fear that, by moving to CDNs without ensuring that there is a sufficient number of CDN providers willing and able to support Debian, we could end up in a lock-in situation with a specific CDN provider (after all, there are not so many of them, and an even smaller number may be able to deal with our technical requirements).

[1] https://lists.debian.org/debian-project/2013/10/msg00074.html

Of course, as long as we have the infrastructure to go back to the old way of doing things, it is not a big problem. So I'm not worried at the moment. But one of the end goals of using a CDN is to reduce the number of Debian PoPs (have Debian machines in a smaller number of datacenters, to make them easier to manage). Once we do that, it will be very hard to go back.

Have you been trying to reach out to other CDN providers about supporting Debian? I know of discussions with Amazon CloudFront, but I remember some technical blockers? Could the DPL be of some help to you in that process?

Cheers,

Lucas
Re: Debian services and Debian infrastructure
On 01/04/2014 11:56 PM, Lucas Nussbaum wrote:
> 3. to provide a place to experiment with new services
>    + create a Debian cloud with virtual machines to develop new services (maybe providing manually-created VMs would be enough -- I'm not sure we need a complex infra such as OpenStack).

My first remark about this would be: do we have any other cloud software that can extensively use something like Ceph for distributed storage? Because for me, distributed storage is a mandatory piece which we have to implement when thinking about cloud computing. Otherwise, we have no serious redundancy for storage, and then we have funny issues like what happened with the HDD of Alioth. If the DSA won't go with OpenStack, please name another (comparable) solution that would fit.

Also, I hereby offer my help if the DSA needs to set up OpenStack. I can probably also involve people from eNovance. They have the expertise to deploy OpenStack with full HA (which I don't know how to do yet). By the way, we need Galera in Debian! :)

On 01/07/2014 04:47 PM, Wouter Verhelst wrote:
> Since DSA is using puppet extensively ATM, wouldn't it be better to have a documented procedure on how to set up a VM or chroot or similar environment that uses DSA's puppet recipes to set up a development instance? That way, people can make changes where necessary (while obviously understanding these changes may or may not be acceptable), don't have to worry about making a mistake and killing someone else's machine (after all, it's their own machine), etc.

The Puppet recipes for setting up OpenStack are open source, and eNovance is actively working on them right now (to have a private cloud product based on just that...). I don't think they're fully production-ready just yet, but they're taking better shape every day, and I think they're nearly up to speed and could be completely usable in a few weeks.
Another thing: I've been providing a bit more than a dozen VMs based on the GPLHost Xen solution (I am the main owner of GPLHost, and it's been more than 10 years now...) for various people within the Debian project. The only requirements were that a DD would ask and write his PGP key ID in the registration form, and that the hosted project had to be related to free software in one way or another. Up to now, there's still some space available. I'm not sure if this means that there's not a lot of need for it, or if it just isn't well known enough. Anyway, I'm very happy I could help, but I'm still a bit frustrated that this isn't really part of the official Debian infrastructure.

Cheers,

Thomas Goirand (zigo)

P.S.: Note that I'm not subscribed to -project, which is why I missed this thread; however, I'd be very interested in helping.

--
To UNSUBSCRIBE, email to debian-project-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/52f50b2c.2080...@debian.org
Re: Debian services and Debian infrastructure
On 01/23/2014 06:07 AM, Tollef Fog Heen wrote:
> Who's then to ensure that machine is secured and kept up to date with security patches and doesn't become a source of spam?

Just like for every hosting solution: you deal with it when it happens. With my 10 years of experience running a hosting business, I don't think this is a huge problem.

On 01/23/2014 06:07 AM, Tollef Fog Heen wrote:
> Sorry to say, but most developers are not good sysadmins

I'm sure they are better than, let's say, 90% of my customers.

On 01/23/2014 06:07 AM, Tollef Fog Heen wrote:
> It's sgran who's been thinking about how to do this, but afaik he's seen close to zero interest from developers for it, so it's not happened yet. I don't think we need anything from the DPL as such, but if people are actually interested in something like this happening, saying so would be a good start.

I'm convinced as well that there's no demand for it. However, if we start doing some CI things (like piuparts, adequate, rebuilding twice, and so on) on each upload, then it makes sense to have a cloud with throwaway VMs. And there are all sorts of compute loads that we could make very good use of. It'd be super nice to have the archive rebuild jobs running on the Debian infrastructure rather than on AWS, for example.

Cheers,

Thomas
Re: Debian services and Debian infrastructure
On 01/22/2014 12:30 AM, Stephen Gran wrote:
> I am nervous that your plans to start handing out VMs with no DSA involvement means that in a few years, we'll have a few dozen owned VMs that no one ever bothered to clean up, costing us goodwill with these hosters.

What I've done with the VMs that I provided for free to any DD who asked was to have them expire every year. The DDs need to open a support ticket to ask for another year. It has worked out well so far.

Thomas
Re: Debian services and Debian infrastructure
On Sat, 08 Feb 2014, Thomas Goirand wrote:
> It'd be super nice to have the archive rebuild jobs running on the Debian infrastructure rather than on AWS for example.

I agree, and it has been proposed several times over the last few years. To say there was no interest whatsoever would overstate the amount of excitement those suggestions have received.

--
 .''`.  ** Debian **           Peter Palfrader
: :' :  The universal          http://www.palfrader.org/
`. `'   Operating System
  `-    http://www.debian.org/
Re: Debian services and Debian infrastructure
Thomas Goirand <z...@debian.org> wrote:
> My first remark about this would be: do we have any other cloud software that can extensively use something like Ceph for distributed storage? Because for me, distributed storage is a mandatory piece which we have to implement when thinking about cloud computing.

We use a MooseFS cluster for Brainfood's hosting cluster, with files on the distributed filesystem backing our KVM disk images. Moose does not have Ceph's RADOS block-device-level integration, but it does have some very useful features that Ceph does not. For instance, it has the ability to do shallow clones of a file (which Ceph can also do) and does not require that all children be removed before the original parent (a disadvantage which Ceph suffers from). We have found the performance of Moose adequate for our needs using an InfiniBand interconnect, and the reliability has been excellent.

> If the DSA won't go with OpenStack, please name another (comparable) solution that would fit.

We use OpenNebula because we started fairly early, and at the time we first began, the OpenNebula packaging on Debian was much more mature. That picture may have changed by now. We have considered looking more deeply into OpenStack, but OpenNebula is a breeze to script, and we do not want to make a hype-based decision, as some accuse the CERN switch of having been. We also have a smaller Eucalyptus installation that we are experimenting with for AWS-based services such as Asgard.

> The Puppet recipes for setting up OpenStack are open source, and eNovance actively works on that right now (to have a private cloud product based on just that...).

We have been using Ansible for our host configuration. We first learned of Ansible through our involvement with the Eucalyptus web UI development effort and some family ties between the Eucalyptus team and Ansible.
We helped Ansible with some early web development efforts and, as a result, were able to get started with their product earlier in its life. I don't really have anything negative to say about Puppet or Chef, but I would say that our use of Ansible has flourished at a rate that never happened with either of those products.

> Another thing: I've been providing a bit more than a dozen VMs based on GPLHost Xen solution (I am the main owner of GPLHost, and it's been now more than 10 years...) for various people within the Debian project. The only requirement was that a DD would ask and write his PGP key ID ... or if this is because this is not well known enough. Anyway, I'm very happy I could help, but I'm still a bit frustrated that this isn't really part of the Debian official infrastructure.

Similarly, we have handed out some pro-bono virtual machines to various DDs over time, just to get their feedback on how our environment worked from their perspective. I would agree that we have not had DDs beating down our door to request this service. If the availability of these kinds of machines were more clearly advertised to DDs, that might be a different story. We have also seen some use of the machines for personal services. That is not a problem from our perspective, but I can understand how there could be other viewpoints.

We would also be happy to collaborate in an effort to organize this kind of service into a more clearly defined Debian cloud platform that provides some well-understood set of services (authentication, VPNs, Git repos, etc.)

--
Debian, choice of a GNU generation.
Re: Possibly moving Debian services to a CDN
On Fri, Feb 07, 2014 at 02:08:26PM +0100, Lucas Nussbaum wrote:
> [...]
> Have you been trying to reach out to other CDN providers about supporting Debian? I know of discussions with Amazon CloudFront, but I remember some technical blockers? Could the DPL be of some help to you in that process?

I am in active discussion with another CDN provider, and I should restart the CloudFront conversation. There are technical considerations with Fastly, also, that Tollef will work through. We've always been of the opinion that we need two CDN providers; we're just as concerned about vendor lock-in as anyone.

Thank you for the offer of DPL help. I'll loop you in.

Luca

--
Luca Filipozzi http://www.crowdrise.com/SupportDebian
Re: Possibly moving Debian services to a CDN
On 8/02/2014 11:46 AM, Luca Filipozzi wrote:
> > Have you been trying to reach out to other CDN providers about supporting Debian? I know of discussions with Amazon CloudFront, but I remember some technical blockers? Could the DPL be of some help to you in that process?
>
> I am in active discussion with another CDN provider and I should restart the CloudFront conversation. There are technical considerations with Fastly, also, that Tollef will work through. We've always been of the opinion that we need two CDN providers. We're just as concerned about vendor lock-in as anyone.

Hello Luca, all,

http://cloudfront.debian.net/ is continuing to do some traffic. It is also offering CDN acceleration for debian-cd and cdimage. We set up a second CDN distribution, http://cloudfront-security.debian.net/; however, at this point in time CloudFront does not support IPv6.

CloudFront now has 51 edge locations worldwide (having recently added Rio de Janeiro, Taipei, Manila, Marseille and Warsaw) and supports custom SSL certificates if Debian wants to use them.

Using the Debian HTTP redirection service at http://http.debian.net/ is perhaps a preferred approach, as it allows real-time redirection to the desired CDN service. I'm happy to give any DD access to the AWS account that is running this; please mail me (GPG-signed) off-list.

As for lock-in: CloudFront is an HTTP cache network. Stop handing out the cloudfront.debian.net URL and traffic naturally drops off; there is nothing else to do.
James

--
Mobile: +61 422 166 708, Email: james_AT_rcpt.to
Re: Debian services and Debian infrastructure
On 07/02/14 at 18:16 +0100, Peter Palfrader wrote:
> On Sat, 08 Feb 2014, Thomas Goirand wrote:
> > It'd be super nice to have the archive rebuild jobs running on the Debian infrastructure rather than on AWS for example.
>
> I agree, and it has been proposed several times over the last few years. To say there was no interest whatsoever would overstate the amount of excitement those suggestions have received.

The archive rebuilds currently run over a very short amount of time (< 8 hours). That's an important feature, because it allows one to rebuild a snapshot of the archive (or a quasi-snapshot, since it could span two dinstalls) and be sure to find all build failures in that snapshot. Unfortunately, doing that requires quite a lot of computing power. As a wild guess, I would say 20 to 30 recent servers (I've been using more than 100 EC2 VMs for some rebuilds, especially when doing two rebuilds at the same time to compare them, e.g. gcc 4.n vs gcc 4.(n+1)).

We currently get this computing power for free from Amazon. We used to get it from my employer through a research project (Grid'5000), and we could return to that if needed: the move to Amazon was an attempt at switching from "only Lucas can work on that" to "other people can work on that", which was a success given that at least David Suarez (general-purpose rebuilds + bug filing) and Sylvestre Ledru (clang rebuilds) have been actively running rebuilds over the last year, and I haven't. Also, we made sure not to use anything Amazon-specific in the rebuild infrastructure.

Doing those rebuilds on Debian infrastructure would require either:

1) investing in a set of powerful machines to do the rebuilds ($50k as a guess, $2k * 25 -- and that's probably the bare minimum we would need), finding hosting for them, and managing them. We get all of that for free from Amazon, so I don't think this would be a good use of Debian's funds.
2) degrading the service: loosening the requirement to rebuild quasi-snapshots of the archive, or using snapshot.debian.org to build a possibly older snapshot of the archive (which would result in filing bugs that were already fixed).

Neither of those options looks particularly exciting. That's why I never explored further the idea of running those rebuilds on Debian infrastructure. But maybe you have a nice solution that you haven't explained yet?

One thing we could do, though, is move all the static parts of the rebuild infrastructure to Debian infrastructure. That's the virtual machine used to schedule builds on all the temporary nodes, and the virtual machine used to store logs. Those logs used to be stored in http://people.d.o/~lucas/, which did not scale well to several people doing the rebuilds. I approached DSA asking if they would be willing to provide similar archival space in a place accessible by a team including non-DDs (#debian-admin, 2013-06-03). All of the suggested solutions (mail all build logs as attachments to the bug reports; ask someone else or a cron job to rsync from an external host to qa.d.o; push to buildd.d.o) had drawbacks or required additional work that I didn't have time for, so we have just been using an EC2 VM (aws-logs.d.n) to store the logs since then. That should probably be revisited, but I'm no longer the right point of contact for that (David Suarez is).

Lucas
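The hardware estimate in option 1 can be sanity-checked with back-of-the-envelope arithmetic. The server count and per-unit price come from the mail; the archive size and the resulting per-server build rate are illustrative assumptions, not figures anyone in the thread stated:

```python
# Option 1 from the mail: buy dedicated rebuild hardware.
servers = 25
cost_per_server = 2_000  # USD, the mail's "recent server" ballpark
print("hardware cost:", servers * cost_per_server)  # the ~$50k guess

# Rough throughput check (assumed numbers, not from the mail):
source_packages = 20_000  # order-of-magnitude guess for the 2014 archive
target_hours = 8          # the "very short amount of time" goal
builds_per_server_per_hour = source_packages / (servers * target_hours)
print(f"each server must average ~{builds_per_server_per_hour:.0f} builds/hour")
```

Even under these rough assumptions, each machine has to sustain a high parallel build rate, which is why the fleet cannot be much smaller than the 20-30 servers Lucas estimates.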