Hi,
The maintenance was scheduled on Monday, for the day after that. We
had only a few hours to plan for it and communicate about it, and I
think we did a pretty good job given the time we had.
The maintenance banner was up for a few hours (not a day) prior to the
maintenance window to give
On Wed, May 25, 2011 at 1:00 PM, Thomas Morton morton.tho...@googlemail.com
wrote:
Let's just drop it :) I'm not sure where things went so south but I take
full responsibility. I've pinged Tim off-list about contributing my own time
to work on the error page matter - which I think is only
On 05/25/2011 01:12 PM, Tim Starling wrote:
On 25/05/11 18:14, Thomas Morton wrote:
IRC was flooded with people who didn't understand what was going on. And
many didn't believe/understand that it was maintenance... so this is
definitely an area worth improving.
Maybe we can replace the IRC
Milos Rancic, 26/05/2011 09:57:
Site notice for a week before the maintenance would be useful, too. We
communicate with our users via web site, not via emails.
A week of pain to signal (and not avoid) an hour of pain? Doesn't look
like a gain.
Nemo
I'm pretty sure there was a site notice; I recall seeing one anyway :)
Tom
On 26 May 2011 09:09, Federico Leva (Nemo) nemow...@gmail.com wrote:
Milos Rancic, 26/05/2011 09:57:
Site notice for a week before the maintenance would be useful, too. We
communicate with our users via web site,
On 05/26/2011 10:09 AM, Federico Leva (Nemo) wrote:
Milos Rancic, 26/05/2011 09:57:
Site notice for a week before the maintenance would be useful, too. We
communicate with our users via web site, not via emails.
A week of pain to signal (and not avoid) an hour of pain? Doesn't look
like a
There was, it ran for a day
(http://meta.wikimedia.org/wiki/Special:CentralNotice) - generic maintenance
notice.
Theo
On Thu, May 26, 2011 at 1:41 PM, Thomas Morton morton.tho...@googlemail.com
wrote:
I'm pretty sure there was a site notice; I recall seeing one anyway :)
Tom
On 26 May
Thomas Morton, 26/05/2011 10:11:
I'm pretty sure there was a site notice; I recall seeing one anyway :)
For a day: http://meta.wikimedia.org/wiki/Special:CentralNotice
Nemo
___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe:
On 05/26/2011 10:18 AM, Theo10011 wrote:
There was, it ran for a day
(http://meta.wikimedia.org/wiki/Special:CentralNotice) - generic maintenance
notice.
So, then it should just last a bit longer (maybe three days if not a
week?) and we would avoid most of the complaints.
On 26/05/11 17:57, Milos Rancic wrote:
On 05/25/2011 01:12 PM, Tim Starling wrote:
On 25/05/11 18:14, Thomas Morton wrote:
IRC was flooded with people who didn't understand what was going on. And
many didn't believe/understand that it was maintenance... so this is
definitely an area worth
We already get spammed enough with notices, which is one of the
reasons many people hide them permanently via css so they never
intrude again, which would make them pointless for the more
established users, also overkill for what was meant to be (from my
understanding) only a few minutes of
On 24/05/11 23:32, Thomas Morton wrote:
So, just a quick thought for future reference - during maintenance is it
possible in future to update the error message to explain that maintenance
is ongoing?
Seeing as how widely WMF projects are used by a non-technical project the
current MySQL
I don't get this.
Would it be possible in future, if the sites are unresponsive, or will be
unresponsive due to planned maintenance, to establish a fallback that simply
displays an explanatory status message to the public?
FT2
On Wed, May 25, 2011 at 8:15 AM, Tim Starling
On 25/05/11 17:32, FT2 wrote:
I don't get this.
Would it be possible in future, if the sites are unresponsive, or will be
unresponsive due to planned maintenance, to establish a fallback that simply
displays an explanatory status message to the public?
You mean replace the entire site with
I think it's reasonable (and indeed standard) to deploy some sort of
downtime maintenance error message.
If that requires improving the error handling code to catch a wider variety
of errors and push people to the error message page then I understand the
time issues :).
If the short term
priority task being to get the site working again. Maybe at some time
in the future, we will have enough 24/7 sysadmin manpower that we can
respond to any unplanned downtime in the way you suggest. But we don't
have that capability just yet.
In future we will have five nines availability and
In future can I have vanilla and strawberry with that? :)
FT2
On Wed, May 25, 2011 at 9:16 AM, Domas Mituzas midom.li...@gmail.com wrote:
In future we will have five nines availability and no downtimes will
happen.
On Wed, May 25, 2011 at 9:32 AM, FT2 ft2.w...@gmail.com wrote:
I don't get this.
Would it be possible in future, if the sites are unresponsive, or will be
unresponsive due to planned maintenance, to establish a fallback that simply
displays an explanatory status message to the public?
Would
Austin,
That's interesting, what was the wording for the maintenance message? I only
ever saw the default "our servers are experiencing a technical problem"
error page.
Tom
On 25 May 2011 10:53, Austin Hair adh...@gmail.com wrote:
On Wed, May 25, 2011 at 9:32 AM, FT2 ft2.w...@gmail.com wrote:
On Wed, May 25, 2011 at 11:57 AM, Thomas Morton
morton.tho...@googlemail.com wrote:
That's interesting, what was the wording for the maintenance message? I only
ever saw the default "our servers are experiencing a technical problem"
error page.
I could be misremembering, because I honestly
unless, as Tim already addressed, you wanted a developer
assigned to updating the message in real time.
No, definitely not what was being suggested.
This is the error message that appeared for me (and apparently others):
http://nomulous.com/blog/wp-content/uploads/2009/09/wikipedia_error.png
As you can see it refers to some unknown error. In this case the
maintenance was known and *pre-planned* for several days.
technically this was unknown problem :)
A lot of people were confused by the outage and the error page was unhelpful
to them. This could have been mitigated simply
Huh? The downtime was expected between 13:00 and 14:00 UTC, or at least there
was an email warning of such things the day before... hardly unplanned or
unknown.
Tom
On 25 May 2011 11:12, Domas Mituzas midom.li...@gmail.com wrote:
As you can see it refers to some unknown error. In this case
On Wed, May 25, 2011 at 12:09 PM, Thomas Morton
morton.tho...@googlemail.com wrote:
This is the error message that appeared for me (and apparently others):
http://nomulous.com/blog/wp-content/uploads/2009/09/wikipedia_error.png
I won't continue arguing about whether or not it should say
It might be more worthwhile to put downtime status updates on
status.wikimedia.org as a logical page to display the status of the servers,
and link to it from the default error messages.
Given that status.wm.org is an external service, it would hopefully not be
affected by any outages and the
Hi!
Huh? The downtime was expected between 13:00 and 14:00 UTC, or at least there
was an email warning of such things the day before... hardly unplanned or
unknown.
there's a bit of a difference between maintenance window and expected downtime
during it.
Domas
The maintenance was planned, downtime was noted as possible. An error
message that reflects that seems, frankly, a good idea.
The response to what I thought to be a helpful suggestion in improving
communication with readership has been... incredibly disappointing. I wish I
hadn't bothered. :( I
Hi!
The maintenance was planned, downtime was noted as possible. An error
message that reflects that seems, frankly, a good idea.
There're lots of great ideas around the world, feeding the hungry and curing
the cancer among them.
The response to what I thought to be a helpful suggestion in
If we knew what would fail to put an appropriate error message there, we'd
probably fix the problem beforehand. :-)
That's... completely missing the point. Yes the specific errors faced were
unexpected or unforeseen, BUT they were a *direct result* of the maintenance
between 13:00 and 14:00. I am
Tim,
When I originally wrote:
during maintenance is it possible in future to update the error message to
explain that maintenance is ongoing?
That was a bit of a silly moment from me :) I see how that implies
in-maintenance updates.
In fact my suggestion was to update the error message to
Domas, what are you trying to achieve with your comments on Tom's
suggestions? He just said that if we know that maintenance is done and
could cause outages we should put up an error message that informs the
reader about the maintenance work and tells him not to worry. That's
obviously a
Hi!
That's... completely missing the point. Yes the specific errors faced were
unexpected or unforeseen, BUT they were a *direct result* of the maintenance
between 13:00 and 14:00. I am simply passing on the feeling of our
readership; which was that the situation was badly communicated to
On 25/05/11 18:14, Thomas Morton wrote:
IRC was flooded with people who didn't understand what was going on. And
many didn't believe/understand that it was maintenance... so this is
definitely an area worth improving.
Maybe we can replace the IRC link in the Squid error message with a
link to
Tim Starling wrote:
Maybe we can replace the IRC link in the Squid error message with a
link to the WatchMouse page (status.wikimedia.org). That would reduce
the IRC flood.
* https://bugzilla.wikimedia.org/show_bug.cgi?id=16043
* https://bugzilla.wikimedia.org/show_bug.cgi?id=20079
MZMcBride
m...@marcusbuck.org wrote:
The sensible reaction (from a person who is involved in the
maintenance) would be:
Oh, sorry, we were so much occupied with making the maintenance work
as smooth and uninterruptive as possible that we totally didn't think
about that. We will integrate it into our
Maybe we can replace the IRC link in the Squid error message with a
link to the WatchMouse page
@Tim; that seems a good idea.
@Domas, I'm afraid you don't seem to have understood the premise of my
suggestion.. which is fine. But one fallacy is worth responding to:
You have some annoying users,
On Wed, May 25, 2011 at 4:40 PM, Domas Mituzas midom.li...@gmail.com wrote:
Hi!
That's... completely missing the point. Yes the specific errors faced were
unexpected or unforeseen, BUT they were a *direct result* of the maintenance
between 13:00 and 14:00. I am simply passing on the
Theo10011 wrote:
Instead of diverting users to IRC, how about an outage/error page with a
twitter/identi.ca feed with updates from the tech team, or at least a page
with customized message in case of previously planned outage. Most of the
tech staff already use Twitter/Identi.ca to update
On Wed, May 25, 2011 at 5:31 PM, MZMcBride z...@mzmcbride.com wrote:
Theo10011 wrote:
Instead of diverting users to IRC, how about an outage/error page with a
twitter/identi.ca feed with updates from the tech team, or at least a page
with customized message in case of previously planned
On Wed, May 25, 2011 at 10:09 PM, Theo10011 de10...@gmail.com wrote:
On Wed, May 25, 2011 at 5:31 PM, MZMcBride z...@mzmcbride.com wrote:
Theo10011 wrote:
Instead of diverting users to IRC, how about an outage/error page with a
twitter/identi.ca feed with updates from the tech team, or at
What I understood from this thread is: if you have a planned
maintenance window between 13 and 14 GMT, it would be appreciated if
you could:
- create a simple page that says: We are working on our servers
between 13 and 14 GMT and Wikipedia might be unavailable during that
time
- replace the
On 25/05/11 21:19, MZMcBride wrote:
Tim Starling wrote:
Maybe we can replace the IRC link in the Squid error message with a
link to the WatchMouse page (status.wikimedia.org). That would reduce
the IRC flood.
* https://bugzilla.wikimedia.org/show_bug.cgi?id=16043
*
Tim,
Great, thanks for that. Seeing as it was me that raised this ;) I guess it's
only right I take up the gauntlet, so will try and find time later to
propose something.
Tom
On 25 May 2011 13:48, Tim Starling tstarl...@wikimedia.org wrote:
On 25/05/11 21:19, MZMcBride wrote:
Tim Starling
On 25/05/11 22:27, Strainu wrote:
What I understood from this thread is: if you have a planned
maintenance window between 13 and 14 GMT, it would be appreciated if
you could:
- create a simple page that says: We are working on our servers
between 13 and 14 GMT and Wikipedia might be
Me - no.
Readers who didn't know - yes.
Wikipedia going down without a temporary explanation page is roughly of the
same scale as apple.com going down with no explanation, google.com going
down with no explanation, microsoft.com going down with no explanation, and
so on.
Top 5 website means we
2011/5/25 Tim Starling tstarl...@wikimedia.org:
On 25/05/11 22:27, Strainu wrote:
What I understood from this thread is: if you have a planned
maintenance window between 13 and 14 GMT, it would be appreciated if
you could:
- create a simple page that says: We are working on our servers
As a non-tech, don't all reads (at least) pass through the squids, so we can
identify and report in a nice way a lot of connection errors at that point?
/ignoreifnaive
FT2
On Wed, May 25, 2011 at 2:18 PM, Tim Starling tstarl...@wikimedia.org wrote:
There are dozens of places where error
Just conceptualising...
I haven't played with Squid for a while (so am rusty) but the simplest
solution would probably be to catch all PHP errors somewhere in the
Mediawiki code and return a 500 status error code.
Then get Squid to map that to the static error page.
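To make the idea concrete, here is a rough sketch of the "map a 5xx to a static page" step, written in Python rather than as PHP or Squid configuration. Everything in it is a hypothetical illustration: MAINTENANCE_WINDOWS, page_for_error and the wording of both pages are assumptions, not MediaWiki or Squid code. The window reuses the 13:00 to 14:00 UTC slot on May 24 announced for this maintenance.

```python
from datetime import datetime, timezone

# Hypothetical list of planned maintenance windows, as (start, end) in UTC.
MAINTENANCE_WINDOWS = [
    (datetime(2011, 5, 24, 13, 0, tzinfo=timezone.utc),
     datetime(2011, 5, 24, 14, 0, tzinfo=timezone.utc)),
]

# Illustrative wording only; not the text of the real error pages.
GENERIC_PAGE = "Our servers are experiencing a technical problem."
MAINTENANCE_PAGE = (
    "We are performing planned maintenance between {start:%H:%M} and "
    "{end:%H:%M} UTC. The site may be unavailable during that time."
)


def page_for_error(status, now):
    """Return (status, body) to serve in place of a backend response.

    Responses below 500 are passed through untouched (body is None).
    A 5xx inside a planned window becomes a 503 with a maintenance
    notice; any other 5xx keeps its status and gets the generic page.
    """
    if status < 500:
        return status, None
    for start, end in MAINTENANCE_WINDOWS:
        if start <= now < end:
            # 503 tells clients and crawlers the condition is temporary.
            return 503, MAINTENANCE_PAGE.format(start=start, end=end)
    return status, GENERIC_PAGE
```

The point of the sketch is only that the proxy layer does not need to know what failed, just that something failed while a window was open; whether Squid can be configured to behave this way is exactly what would need testing.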
On the other hand throwing a
Wikipedia going down without a temporary explanation page is roughly of the
same scale as apple.com going down with no explanation, google.com going
down with no explanation, microsoft.com going down with no explanation, and
so on.
WHOAH THERE IS QUITE SOME SELF ENTITLEMENT THERE.
Microsoft
On 25/05/11 23:41, FT2 wrote:
As a non-tech, don't all reads (at least) pass through the squids, so we can
identify and report in a nice way a lot of connection errors at that point?
/ignoreifnaive
Maybe it would be possible to identify error messages by their HTTP
response code, and replace
Domas, why so defensive? No one accused you of anything or blamed you
for the downtime. The comments suggesting more finely-tuned error
messages weren't critical of you or Tim or the developers in general,
they were just (reasonable) suggestions. Maybe adjusting all the
various error messages in
Is the Squid configuration the foundation employs available publicly
somewhere (I'm scanning the SVN and not seeing it..)? Because I don't mind
having a look and filing a specific bugzilla correction with various bits of
code changes.
It's about time I refreshed my Squid knowledge :)
Tom
On 25
On 25 May 2011 09:50, Domas Mituzas midom.li...@gmail.com wrote:
Oh, by the way, I don't know where you look, but I somewhat missed
communication about maintenance events ongoing in Google or Microsoft or
Apple - you think they have none?
Did you get lots of clarification why your gmail was
Hi!
Domas, why so defensive?
I'm contrarian in this case :)
unfeasible because of the work involved, but you can probably say that
without all the combative snark.
Well, as with every downtime, there are way more issues* that end up uncovered
and have to be looked at, and yet largest
@Tim: Understood, I'll make sure I know this will work first so as not to
generate work for you. My initial idea might not be so workable given the
architecture used (and how Squid handles error codes). I'll roll up some
servers here at work and run some tests.
@Domas; echoing what Risker said...
Quoting MZMcBride z...@mzmcbride.com:
m...@marcusbuck.org wrote:
The sensible reaction (from a person who is involved in the
maintenance) would be:
Oh, sorry, we were so much occupied with making the maintenance work
as smooth and uninterruptive as possible that we totally didn't think
In a message dated 5/25/2011 3:33:57 AM Pacific Daylight Time,
midom.li...@gmail.com writes:
There're lots of great ideas around the world, feeding the hungry and
curing the cancer among them.
Domas your responses are not helpful at all. You are simply stirring the
pot to no point.
Hi!
Domas your responses are not helpful at all. You are simply stirring the
pot to no point. Please stop.
You forgot to tell if all of my responses or just some, and if there's really
no point at all, or there might be some.
Anyway, thanks for this helpful contribution!
Domas
In a message dated 5/25/2011 11:01:24 AM Pacific Daylight Time,
midom.li...@gmail.com writes:
You forgot to tell if all of my responses or just some, and if there's
really no point at all, or there might be some.
Anyway, thanks for this helpful contribution!
Refactoring my comments:
On Tue, May 24, 2011 at 6:32 AM, Thomas Morton
morton.tho...@googlemail.com wrote:
So, just a quick thought for future reference - during maintenance is it
possible in future to update the error message to explain that maintenance
is ongoing?
I work with lots of (library) databases, and
Let's just drop it :) I'm not sure where things went so south but I take
full responsibility. I've pinged Tim off-list about contributing my own time
to work on the error page matter - which I think is only fair enough given
that I raised it. And sorry for any offence caused to the ops team by my
Domas Mituzas wrote:
FAIL WHALE!
W W W
WW W W
'. W
.--._ \ \.--|
/ -..__) .-'
| _ /
\'-.__, .__.,'
`''._\--'
V
http://en.wikipedia.org/wiki/User:MZMcBride/Blame_wheel 3
MZMcBride
So, just a quick thought for future reference - during maintenance is it
possible in future to update the error message to explain that maintenance
is ongoing?
Seeing as how widely WMF projects are used by a non-technical project the
current MySQL connection error I am seeing on Commons is just
I totally agree with Thomas.
On Tue, May 24, 2011 at 4:32 PM, Thomas Morton morton.tho...@googlemail.com
wrote:
So, just a quick thought for future reference - during maintenance is it
possible in future to update the error message to explain that maintenance
is ongoing?
Seeing as how
Speaking of WP downtime, you might be particularly interested in today's
XKCD:
http://xkcd.com/903/
wittylama.com/blog
Peace, love metadata
On 24 May 2011 21:35, Itzik Edri it...@infra.co.il wrote:
I totally agree with Thomas.
On Tue, May 24, 2011 at 4:32 PM, Thomas Morton
Dear all,
The Wikimedia Foundation will be performing network maintenance on
Tuesday, May 24 between 13:00 and 14:00 (UTC) (see other timezones on
timeanddate.com: http://ur1.ca/49cl2 ).
During the maintenance period, you may experience intermittent
connection issues to Wikimedia Foundation