[gentoo-dev] bugzilla unscheduled downtime
One of the DB cluster boxes seems to have spontaneously rebooted around 03:13:53 UTC. I'm working on tracing why now (and why Nagios didn't yell at us). Bugzie down until I've fixed it.

-- 
Robin Hugh Johnson
Gentoo Linux Developer
Infra Guy
E-Mail : [EMAIL PROTECTED]
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
Re: [gentoo-dev] bugzilla unscheduled downtime
On Sun, Aug 24, 2008 at 12:09:41AM -0700, Robin H. Johnson wrote:
> One of the DB cluster boxes seems to have spontaneously rebooted around 03:13:53 UTC.

Sorry, that's the time the kernel started up again. So it went down 2-3 minutes before that, right around the start of the daily database backup (03h10).

> I'm working on tracing why now (and why Nagios didn't yell at us). Bugzie down until I've fixed it.

I fixed the symptoms on the box, but no luck on the cause yet. I'll fix the Nagios tomorrow. If it breaks again tonight, find somebody in -infra to turn off the apache on the web node, or have them phone me.

If you deleted any CC entries between 03h10 and 08h00 UTC, they might have come back, but everything else merged perfectly (file a bug if you spot any other corruption).

-- 
Robin Hugh Johnson
Re: [gentoo-dev] bugzilla unscheduled downtime
On Sun, Aug 24, 2008 at 1:13 AM, Robin H. Johnson [EMAIL PROTECTED] wrote:
> I fixed the symptoms on the box, but no luck on the cause yet. I'll fix the Nagios tomorrow. If it breaks again tonight, find somebody in -infra to turn off the apache on the web node, or have them phone me.

Is it me or is bugzilla still down? Or has it gone down again? I can't access it, nor can some of my friends.
Re: [gentoo-dev] bugzilla unscheduled downtime
2008/8/24 Andrey Falko [EMAIL PROTECTED]:
> Is it me or is bugzilla still down? Or has it gone down again? I can't access it, nor can some of my friends.

Please be patient. Robin is working on it at the moment. It's the same problem as yesterday.

Kind regards,
Lukasz Damentko
Re: [gentoo-dev] bugzilla unscheduled downtime
On Mon, Aug 25, 2008 at 12:38:13AM +0200, Lukasz Damentko wrote:
> > Is it me or is bugzilla still down? Or has it gone down again? I can't access it, nor can some of my friends.
> Please be patient. Robin is working on it at the moment. It's the same problem as yesterday.

No, it's not the same problem. This time a set of 3 IPs from .nl were attacking. They managed to tie up the databases to their concurrent query limit, which caused mysqld to lock up on both of the database nodes.

Looking at more of the traffic, I think that some new web worm is just breaking onto the scene, and it's trying to abuse the forms to POST spam and malware.

-- 
Robin Hugh Johnson
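
For readers wondering how three addresses can exhaust a shared concurrent query limit: if nothing caps in-flight requests per client, a handful of aggressive clients can hold every available slot. The following is only a rough sketch of a per-client cap placed in front of the database, not what Gentoo infra actually runs. The class name `PerClientLimiter`, the `max_per_client` parameter, and the example addresses (taken from the documentation range 192.0.2.0/24) are all hypothetical.

```python
import threading


class PerClientLimiter:
    """Caps the number of in-flight requests per client address (illustrative only)."""

    def __init__(self, max_per_client):
        self.max_per_client = max_per_client
        self._in_flight = {}            # client address -> requests currently running
        self._lock = threading.Lock()   # guards _in_flight across worker threads

    def try_acquire(self, client):
        """Return True and take a slot, or False if the client is at its cap."""
        with self._lock:
            current = self._in_flight.get(client, 0)
            if current >= self.max_per_client:
                return False
            self._in_flight[client] = current + 1
            return True

    def release(self, client):
        """Give a slot back once the request has finished."""
        with self._lock:
            current = self._in_flight.get(client, 0)
            if current <= 1:
                # Drop idle clients so the table does not grow without bound.
                self._in_flight.pop(client, None)
            else:
                self._in_flight[client] = current - 1


# Hypothetical usage: one noisy client, one normal client.
limiter = PerClientLimiter(max_per_client=2)
noisy = [limiter.try_acquire("192.0.2.1") for _ in range(3)]
normal = limiter.try_acquire("192.0.2.2")
```

With a cap like this, the third simultaneous request from the noisy address is refused while the other client still gets a slot, so a few sources cannot consume the whole query budget on their own. A real deployment would more likely enforce this in the web server or a firewall than in application code.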