Lucas_Werkmeister_WMDE added a comment.
Deployed, and so far the servers seem to be happy, but I’ll check again in a couple of hours before closing this.
TASK DETAIL: https://phabricator.wikimedia.org/T184812
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
Stashbot added a comment.
Mentioned in SAL (#wikimedia-operations) [2018-03-12T13:59:57Z] Synchronized wmf-config/Wikibase-production.php: SWAT: [[gerrit:416748|Enable caching of constraint check results (T184812)]] (duration: 00m 57s)
Stashbot added a comment.
Mentioned in SAL (#wikimedia-operations) [2018-03-12T13:31:34Z] Synchronized wmf-config/Wikibase-production.php: SWAT: [[gerrit:416748|Enable caching of constraint check results (T184812)]] (duration: 03m 08s)
Stashbot added a comment.
Mentioned in SAL (#wikimedia-operations) [2018-03-12T13:17:22Z] Synchronized wmf-config/Wikibase-production.php: SWAT: [[gerrit:416748|Enable caching of constraint check results (T184812)]] (duration: 03m 09s)
gerritbot added a comment.
Change 416748 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable caching of constraint check results
https://gerrit.wikimedia.org/r/416748
Lucas_Werkmeister_WMDE added a comment.
Alright, I’ve added it to next Monday’s EU SWAT.
gerritbot added a comment.
Change 416748 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[operations/mediawiki-config@master] Enable caching of constraint check results
https://gerrit.wikimedia.org/r/416748
jcrespo added a comment.
Yes, I said I wanted to block it on the ticket I mentioned before, which is now resolved.
Lucas_Werkmeister_WMDE added a comment.
Apologies for being pushy earlier – I definitely want the Wikidata servers to stay alive as well :)
The bug that caused the incident should be thoroughly fixed now, with several commits already in wmf.23 (either before the cut or backported) and some more
Lydia_Pintscher added a comment.
In T184812#4010399, @jcrespo wrote:
@Lydia_Pintscher I consider that this is being pushed unnecessarily fast after an actual outage happened, without proper investigation of the causes leading to it -- that looks like a disregard for the site's reliability, and I do
Marostegui added a comment.
In T184812#4011213, @Ladsgroup wrote:
Hey,
We totally understand T188505 is a blocker for further work on the caching side. It was more of a question of how we should proceed with the deployment of something else that might get affected by caching not being enabled. If we don't
Ladsgroup added a comment.
Hey,
We totally understand T188505 is a blocker for further work on the caching side. It was more of a question of how we should proceed with the deployment of something else that might get affected by caching not being enabled. If we don't enable caching, other problems might
Marostegui added a comment.
In T184812#4010371, @Lucas_Werkmeister_WMDE wrote:
Okay… is there anything I can do to help move that task forward?
The reason I want to get caching into place soon is that we’ll roll out constraint checks to more users in the next weeks (starting on March 1st to all
Lucas_Werkmeister_WMDE added a comment.
Several of the actionables have been implemented; the main one was merged just before the wmf.23 branch cut, others I’ve now requested to be backported for the European Mid-Day SWAT today. If it’s okay, I’d like to try enabling caching again in today’s
Ladsgroup added a comment.
Sent to ops-l
Ladsgroup added a comment.
Well, it seems wikitech is completely down now.
Lucas_Werkmeister_WMDE added a comment.
Wikitech is writable again, here’s the report: https://wikitech.wikimedia.org/wiki/Incident_documentation/20180226-WikibaseQualityConstraints
I’ll ask @Ladsgroup to send out the email to ops@ tomorrow (AFAIK I don’t have the required permissions for that,
Bawolff added a comment.
PHP considering "0" to be false strikes again!
Lucas_Werkmeister_WMDE added a comment.
I’ve started writing up an incident report, but I can’t currently save it since wikitech is read-only due to a database migration. If it’s not back to read-write by the time I leave the office, I’ll dump it in an Etherpad or something.
Ladsgroup added a comment.
In T184812#4002850, @jcrespo wrote:
SELECT $startOpts $vars $from $useIndex $ignoreIndex " .
"WHERE $conds $preLimitTail";
That looks like a recipe for an sql injection, how did this pass security review?
Why is this public?
Lucas_Werkmeister_WMDE added a comment.
That looks like a recipe for an sql injection, how did this pass security review?
I should’ve been clearer – this is in Database::selectSQLText. That’s the one place where code like this would be expected, right? (Assuming that all the variables have been
Lucas_Werkmeister_WMDE added a comment.
Okay, so it’s actually the opposite. We’re somehow asking for the latest revision IDs of an empty array of entity IDs. And WikiPageEntityMetaDataLookup has a special safeguard for this:
if ( empty( $where ) ) {
// If we skipped all entity IDs, select
Lucas_Werkmeister_WMDE added a comment.
The WHERE would have something like one Q1234=page_title AND 0=page_namespace per entity ID.
An alternative way to confirm this would be to check the latestRevisionIds in a cached value… perhaps someone can look at the cached value for
Lucas_Werkmeister_WMDE added a comment.
Okay, then it’s almost certainly caused by this change, yes :( and I’m going to guess that the most likely reason for this query to become so slow would be that we’re asking for way too many revision IDs. Can you perhaps confirm this? The WHERE would have
Marostegui added a comment.
Something like:
SELECT /* Wikibase\Lib\Store\Sql\WikiPageEntityMetaDataLookup::selectRevisionInformationMultiple */ rev_id, rev_content_format, rev_timestamp, page_latest, page_is_redirect, old_id, old_text, old_flags, page_title FROM `page` INNER JOIN `revision` ON
Lucas_Werkmeister_WMDE added a comment.
Because I can’t think of anything SQL-related in that change… it shouldn’t result in any new SQL queries
Okay, that was wrong. There are new SQL queries: for each wbcheckconstraints request, one or two queries to get the latest revision ID for a set of
jcrespo added a comment.
F14034462: Screenshot_20180226_185900.png
greg added a comment.
This deserves an incident report.
Lucas_Werkmeister_WMDE added a comment.
To clarify – SQL queries? Because I can’t think of anything SQL-related in that change… it shouldn’t result in any new SQL queries, and should indirectly have reduced the number of SQL queries (because some Wikidata items would no longer need to be loaded).
Marostegui added a comment.
And looks like everything is back to normal: https://grafana.wikimedia.org/dashboard/file/server-board.json?panelId=10=1=db1109=eth0=now-1h=now=1m
We also killed queries (and they are not coming back after the revert).
Marostegui added a comment.
We have reverted the change.
Lucas_Werkmeister_WMDE added a comment.
Ouch. Please feel free to revert the config change https://gerrit.wikimedia.org/r/413724 – even if it turns out to be unrelated, it won’t hurt our users too much.
jcrespo added a comment.
https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m=10=1=db1109=eth0=1519578882341=1519665282341
Marostegui added a comment.
We are investigating if this has caused a massive spike on wikidata replicas: https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m=10=1=db1109=eth0=1519578882341=1519665282341
As per https://wikitech.wikimedia.org/wiki/Server_Admin_Log the time
gerritbot added a comment.
Change 413724 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable caching of constraint check results
https://gerrit.wikimedia.org/r/413724
gerritbot added a comment.
Change 413724 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[operations/mediawiki-config@master] Enable caching of constraint check results
https://gerrit.wikimedia.org/r/413724