On Wednesday, January 16, 2013 at 10:05 AM, Stephan Deibel wrote:
M.-A. Lemburg wrote:
I've been able to recover the pages from archive.org and have also
tried Google cache (which failed due to limits on the number of
allowed requests) and Yahoo/Bing cache. The latter worked, but
only returns a small fraction of the pages we have had in the wiki -
about 300+ pages. They are more recent than the archive.org ones,
though, so I'm trying to merge the Yahoo archive ones back into the
archive.org recovery.
I recovered around 4500 pages from archive.org... in HTML. Reimar
has a tool to convert them back into wiki markup, which we'll
try to use to prepare an import.
Meanwhile I'm also trying to see whether we can still extract some
data from the broken VM image. It does show traces of the wiki
file contents, so the data still exists on the image in some
form. Noah already tried extundelete with no success. I'm going
to give some of the other tools a try as well, e.g. ext4magic
or PhotoRec.
Phew, sounds like fun... thanks for everyone's work on this!
Can someone explain (to PSF members list) how it ended up that there
were no backups? I'm not trying to put anyone on the spot, just trying
to (a) understand how this happened, making it so hard to recover, and
(b) make sure that python.org and other important resources _are_ being
backed up in a way that prevents this kind of thing from taking down
services for a long time.
Thanks,
- Stephan
Noah can expand on this as Infrastructure lead, but the short version is this -
last year we got some beefy donations and hosting from OSU/OSL - this allows us
to run our own VM infrastructure and isolate/spin up new servers at will (which
is great). We've been slowly migrating the old services to the new systems.
Our backups are currently handled via services donated by Tummy.com - in the
transition, one of the things which had to be done was update those backups to
point to the new virtual machines. This happened for some of the more mission
critical virtual machines, but unfortunately one of the machines which fell
through the cracks was the wiki machine, which hosts not just one Moin instance
- but every single wiki the PSF hosts (including the members wiki, etc).
Due to this, when the server was compromised and the data deleted sometime
around the 28th of December through a 0-day exploit in MoinMoin, we lost all
data from the move to OSU.
We have coordinated with Noah, Sean at Tummy, etc. to ensure all VMs hosted at
the new setup are on a rigorous backup regime (offsite via Tummy). In addition
to this, Noah is deploying an on site backup system / coordinating with OSU to
ensure we have secondary / on site backups of everything.
This ultimately comes down to a miscommunication/miss on our part, and we are
examining ways to backfill our volunteer team with paid services and leveraging
the services OSU offers to ensure we have good backups, support and other
things we may lack today.
Thanks go out to Noah for identifying and triaging the issue as best as
possible and to Marc-Andre and others for working to recover what they can
from the compromised virtual machine and web archives.
All of our infrastructure is managed by Chef
(https://github.com/coderanger/psf-chef/tree/master/roles) and Ganeti at OSU.
Currently being backed up are:
virt-l4es2w.psf.osuosl.org
virt-gwhg4e.psf.osuosl.org
virt-wdiwcy.psf.osuosl.org
virt-sxw5uy.psf.osuosl.org
virt-oku3tm.psf.osuosl.org
virt-h669vt.psf.osuosl.org
virt-wzmlmm.psf.osuosl.org
virt-ys0nco.psf.osuosl.org
virt-7yvsjn.psf.osuosl.org
virt-k4b2sa.psf.osuosl.org
virt-ozvw2q.psf.osuosl.org
virt-8joqck.psf.osuosl.org
virt-et2yi0.psf.osuosl.org
This also includes non-PSF assets such as PyPy assets we are now hosting for
free. As I said, this is a combination of communication issues and
volunteer load. The board is examining paid backup/leads where needed and/or
leveraging OSU's services and administration.
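
The gap described above (a VM that was migrated but never added to the backup
configuration) is the sort of thing a simple set comparison can flag. What
follows is only a rough sketch, not the tooling actually in use: it assumes
the full VM inventory and the list of backed-up hosts are both available as
plain lists of hostnames. The function name missing_from_backup and the host
wiki.example.org are made up for illustration.

```python
def missing_from_backup(inventory, backed_up):
    """Return hosts that appear in the inventory but not in the backup list.

    Both arguments are iterables of hostnames. Names are compared
    case-insensitively with surrounding whitespace ignored; blank
    entries are skipped.
    """
    def normalize(name):
        return name.strip().lower()

    covered = {normalize(h) for h in backed_up if h.strip()}
    known = {normalize(h) for h in inventory if h.strip()}
    return sorted(known - covered)


# Illustrative data only: two hosts from the list above plus one
# hypothetical host that stands in for a machine nobody registered.
inventory = [
    "virt-l4es2w.psf.osuosl.org",
    "virt-gwhg4e.psf.osuosl.org",
    "wiki.example.org",  # hypothetical, not a real PSF host
]
backed_up = [
    "virt-l4es2w.psf.osuosl.org",
    "virt-gwhg4e.psf.osuosl.org",
]

for host in missing_from_backup(inventory, backed_up):
    print("NOT BACKED UP:", host)
```

Run periodically against whatever the real sources of truth are, a check like
this would report any machine that exists but has no backup entry.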
Jesse Noller
Director, Python Software Foundation
Chair, PyCon 2013 - http://us.pycon.org
jnol...@gmail.com / jnol...@python.org
+1 617-877-9135
___
pydotorg-www mailing list
pydotorg-www@python.org
http://mail.python.org/mailman/listinfo/pydotorg-www