The file system (Ceph) is now stable and we've turned all the VMs back
on. And the downtime paid off! We see no evidence of data loss or
corruption.
A few VMs have been a bit fussy about coming back up, so I encourage you
to 'Hard Reboot Instance' if you are seeing bad behavior. Toolforge
sh
We are having some very concerning instability with the cloud-vps file
system. Out of an abundance of caution I have shut off EVERYTHING in
cloud-vps to prevent rampant data corruption.
I don't expect this outage to last long but will notify you when things
start up again. Very sorry for the downt
Thanks largely to dschwen's hard work, we are about to move the
long-neglected postgres osmdb to a volunteer-managed project. Most
workloads have already moved to the new service. As far as anyone can
tell, there is only a single tool still hitting osmdb.eqiad.wmnet.
Later in the week, that too