[Wikidata-bugs] [Maniphest] [Commented On] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-07 Thread jcrespo
jcrespo added a comment. We can do it on commons and ruwiki, which are probably the most affected ones, plus some on s3.TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: D3r1ck01, matej_suchanek

[Wikidata-bugs] [Maniphest] [Commented On] T177707: don't dispatch changes to all affected pages for highly used items

2017-10-09 Thread jcrespo
jcrespo added a comment. I think the materialization is the wrong approach, and we should try to materialize the changes only at query time. E.g.: Watchlist x page x wikidata_usage JOINs queried in real time. Of course, that makes the query an order of magnitude slower and more complex, but- some
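As a rough illustration of the join-at-query-time idea, a minimal SQL sketch, assuming the standard MediaWiki watchlist, page and wbc_entity_usage tables (the comment calls the last one wikidata_usage); the user id and entity list are placeholders, not the actual proposed query:
SELECT w.wl_namespace, w.wl_title, u.eu_entity_id
  FROM watchlist w
  JOIN page p ON p.page_namespace = w.wl_namespace AND p.page_title = w.wl_title
  JOIN wbc_entity_usage u ON u.eu_page_id = p.page_id
 WHERE w.wl_user = 12345                     -- placeholder user id
   AND u.eu_entity_id IN ('Q42', 'Q64');     -- entities changed in the window being rendered
The point of the sketch is only that the per-page rows would be computed at read time instead of being written into recentchanges for every affected page.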

[Wikidata-bugs] [Maniphest] [Commented On] T177707: don't dispatch changes to all affected pages for highly used items

2017-10-09 Thread jcrespo
jcrespo added a comment. This is the rate on the last 7 days of Apiqueryrecentchanges: https://logstash.wikimedia.org/goto/7a5cebd94f0120ead4ca10a34f6b2b54 Coincidentally, it is much larger on ruwiki and commons, even though ruwiki was given a borrowed large server as a backend.TASK DETAILhttps

[Wikidata-bugs] [Maniphest] [Commented On] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-09 Thread jcrespo
jcrespo added a comment. BTW, the decision was already mentioned as the 4th option of @Catrope's suggestions: "Disable Wikidata RC on large wikis until we have a more scalable implementation of the feature". I think nobody predicted how bad things were at the time.TASK D

[Wikidata-bugs] [Maniphest] [Updated] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-09 Thread jcrespo
jcrespo removed a project: User-notice. TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: Finavon, WMDE-leszek, saper, Masti, Lydia_Pintscher, D3r1ck01, matej_suchanek, Ankry, Ladsgroup, Lsanabria

[Wikidata-bugs] [Maniphest] [Unblock] T90435: [Epic] Wikidata watchlist improvements (client)

2017-10-09 Thread jcrespo
jcrespo closed subtask T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis as "Resolved". TASK DETAILhttps://phabricator.wikimedia.org/T90435EMAIL PREFERENCEShttps://phabricator.wik

[Wikidata-bugs] [Maniphest] [Block] T90435: [Epic] Wikidata watchlist improvements (client)

2017-10-09 Thread jcrespo
jcrespo reopened subtask T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis as "Open". TASK DETAILhttps://phabricator.wikimedia.org/T90435EMAIL PREFERENCEShttps://phabricator.wik

[Wikidata-bugs] [Maniphest] [Reopened] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-09 Thread jcrespo
jcrespo reopened this task as "Open". TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: Finavon, WMDE-leszek, saper, Masti, Lydia_Pintscher, D3r1ck01, matej_suchanek, Ankry, Ladsgroup,

[Wikidata-bugs] [Maniphest] [Closed] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-09 Thread jcrespo
jcrespo closed this task as "Resolved".jcrespo added a project: User-notice.jcrespo added a comment. Notification for users: We are going to disable wikidata recentchanges (meaning, changes on pages on other wikis coming from changes done on wikidata; the recentchanges at wikidata is not

[Wikidata-bugs] [Maniphest] [Created] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-09 Thread jcrespo
jcrespo created this task.jcrespo added projects: MediaWiki-Watchlist, DBA, Wikidata.Herald added subscribers: Liuxinyu970226, Jay8g, TerraCodes, Aklapper. TASK DESCRIPTIONOnce rows stop coming in (see parent task), this will likely either solve or mitigate the performance issues. The rows

[Wikidata-bugs] [Maniphest] [Block] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-09 Thread jcrespo
jcrespo created subtask T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata). TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To

[Wikidata-bugs] [Maniphest] [Commented On] T151717: Usage tracking: record which statement group is used

2017-10-09 Thread jcrespo
jcrespo added a comment. What you write is ok, but IF you want our opinion, can you translate that into the increase in row storage and in inserts/other write activity compared to the full table size/previous state?TASK DETAILhttps://phabricator.wikimedia.org/T151717EMAIL PREFERENCEShttps

[Wikidata-bugs] [Maniphest] [Commented On] T151717: Usage tracking: record which statement group is used

2017-10-09 Thread jcrespo
jcrespo added a comment. I have the feeling that these numbers could be meaningless on such small wikis, given the issues on recentchanges with only some large wikis such as commons and ruwiki. Could the same issue happen to those with a 10x growth? I know the case is different, pages * edits, vs

[Wikidata-bugs] [Maniphest] [Commented On] T151717: Usage tracking: record which statement group is used

2017-10-09 Thread jcrespo
jcrespo added a comment. What I mean is that the numbers are ok to proceed (not a big deal), but I am still worried about the large wikis. I know you do not have all the answers; I was just thinking aloud.TASK DETAILhttps://phabricator.wikimedia.org/T151717EMAIL PREFERENCEShttps

[Wikidata-bugs] [Maniphest] [Commented On] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-09 Thread jcrespo
jcrespo added a comment. I am leaving a screen open on dbstore1002 loading a copy of recentchanges from commonswiki. Because of the size, it will take some time to be copied. Tomorrow I will test purging the table with something such as: pt-archiver --source h=dbstore1002.eqiad.wmnet,D
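The pt-archiver command above is cut off in the archive; as a rough SQL equivalent of what such a purge does, a sketch assuming rc_source = 'wb' marks the Wikidata-sourced rows (as confirmed later in this ticket) and an illustrative batch size - pt-archiver itself adds the batching, sleeping and replication-lag checks:
-- repeated until it affects 0 rows, with a pause between batches
DELETE FROM recentchanges
 WHERE rc_source = 'wb'
 LIMIT 1000;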

[Wikidata-bugs] [Maniphest] [Commented On] T151717: Usage tracking: record which statement group is used

2017-10-10 Thread jcrespo
jcrespo added a comment. Sure, it is ok.TASK DETAILhttps://phabricator.wikimedia.org/T151717EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: Bawolff, eranroz, Ottomata, PokestarFan, Ladsgroup, Stashbot, gerritbot, Halfak, jcrespo, TomT0m, Hall1467

[Wikidata-bugs] [Maniphest] [Commented On] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-10 Thread jcrespo
jcrespo added a comment. @Catrope- discussion ongoing, feel free to weigh in. As a DBA I required these 2 now because of the very large tables and high error rate.TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To

[Wikidata-bugs] [Maniphest] [Commented On] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-10 Thread jcrespo
jcrespo added a comment. I forgot to say we suspect the same thing happens on other s3 hosts, but these 2 previous wikis create so many errors that it is difficult to say until we fix these 2.TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] [Updated] T177707: don't dispatch changes to all affected pages for highly used items

2017-10-10 Thread jcrespo
jcrespo added a subscriber: brion.jcrespo added a comment. you are essentially proposing to get rid of the recentchanges table altogether No, I was asking to keep recentchanges as it was before, with the previous load/meaning and implement "new features" on separate tables, or on separa

[Wikidata-bugs] [Maniphest] [Commented On] T177707: don't dispatch changes to all affected pages for highly used items

2017-10-10 Thread jcrespo
jcrespo added a comment. One last thought, it is early to say, but commonswiki seems to have stopped having insert spikes of 100x the normal rate: https://grafana.wikimedia.org/dashboard/db/mysql?panelId=2&fullscreen&orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1068&va

[Wikidata-bugs] [Maniphest] [Commented On] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-10 Thread jcrespo
jcrespo added a comment. I have allowed codfw to lag- so that we can go at around 500 deletes/s. That means the whole thing will take less than 3 hours. Shout if anyone sees any strangeness on commonswiki (you shouldn't)- worst case scenario- kill the pt-archiver job on the screen sessi

[Wikidata-bugs] [Maniphest] [Commented On] T151717: Usage tracking: record which statement group is used

2017-10-10 Thread jcrespo
jcrespo added a comment. 51.5M you meant, maybe?TASK DETAILhttps://phabricator.wikimedia.org/T151717EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: Bawolff, eranroz, Ottomata, PokestarFan, Ladsgroup, Stashbot, gerritbot, Halfak, jcrespo, TomT0m

[Wikidata-bugs] [Maniphest] [Updated] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-10 Thread jcrespo
jcrespo added a comment. "That means the whole thing will take less than 3 hours" I had a mind slip... we have to delete 60M rows, not 6M, so that means 30 hours, not 3. I ran this for 6 hours; 10M rows were deleted. We will continue after the s4 maintenance tomorrow: T168661TASK DETAILhttps

[Wikidata-bugs] [Maniphest] [Commented On] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-11 Thread jcrespo
jcrespo added a comment. ruwiki results are more extreme:
root@db2076[ruwiki]> SELECT count(*) FROM recentchanges;
+----------+
| count(*) |
+----------+
| 37427748 |
+----------+
1 row in set (11.19 sec)
root@db2076[ruwiki]> SELECT COUNT(*) FROM recentchanges WHERE rc_source

[Wikidata-bugs] [Maniphest] [Commented On] T174028: Finalize database schema for MCR content meta-data

2017-10-12 Thread jcrespo
jcrespo added a comment. One additional comment: if you kind of agree, in some cases, to do the "new table and copy" pattern rather than an ALTER, it is much easier to add or transform columns later, so there is no need to optimize in advance. NULLs in compact or above row formats take
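A minimal sketch of the "new table and copy" pattern being referred to, with a hypothetical table and column (in production the copy also has to handle writes arriving during it, e.g. via pt-online-schema-change or a read-only window):
CREATE TABLE mytable_new LIKE mytable;
ALTER TABLE mytable_new ADD COLUMN new_col VARBINARY(255) DEFAULT NULL;  -- the column being added/transformed
INSERT INTO mytable_new SELECT t.*, NULL FROM mytable t;                 -- bulk copy into the new layout
RENAME TABLE mytable TO mytable_old, mytable_new TO mytable;             -- atomic swap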

[Wikidata-bugs] [Maniphest] [Commented On] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-12 Thread jcrespo
jcrespo added a comment.
root@db2076[ruwiki]> SELECT COUNT(*) FROM recentchanges WHERE rc_source = 'wb';
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (1.19 sec)
TASK DETAILhttps://phabricator.wikimedia.org/T177772EMAIL PREFERENCEShttps://phabricator.

[Wikidata-bugs] [Maniphest] [Commented On] T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-12 Thread jcrespo
jcrespo added a comment. This is the result on the query error rate after ruwiki has been purged: F10158519: Screenshot_20171012_130744.png Please note that these are preliminary results, and that the tables have yet to be optimized/analyzed, but I think this confirms we are on the right track. Commons

[Wikidata-bugs] [Maniphest] [Commented On] T178063: Decide on scalable approach for watchlist integration of Wikidata

2017-10-12 Thread jcrespo
jcrespo added a comment. Small correction/clarification on "this was found to generate too much load": as an op, I interpret load as throughput/backlog work. The insertion [load] itself was not the problem (the spikes on inserts were too large, but that is something that could be smoothed); t

[Wikidata-bugs] [Maniphest] [Commented On] T174028: Finalize database schema for MCR content meta-data

2017-10-12 Thread jcrespo
jcrespo added a comment. @Anomie I know it is on your mind; it was a wink that, for any estimation we give you now on some schema changes, I firmly believe we can cut the time it takes to deploy them by 10 if that was already there. It is also an explanation of why they take so much time right now.TASK

[Wikidata-bugs] [Maniphest] [Commented On] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-13 Thread jcrespo
jcrespo added a comment. 53156406 rows purged on commons so far, of the initial 58M estimate (it will probably end up being less because of the regular rc purge by timestamp).TASK DETAILhttps://phabricator.wikimedia.org/T177772EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To

[Wikidata-bugs] [Maniphest] [Closed] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-13 Thread jcrespo
jcrespo closed this task as "Resolved".jcrespo added a comment. There are more things pending, like running OPTIMIZE TABLE on the non-rc replicas of eqiad or on all of codfw, and checking wikis other than ruwiki and commons, but the initial scope (the emergency) has been fixed.TASK D

[Wikidata-bugs] [Maniphest] [Unblock] T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-13 Thread jcrespo
jcrespo closed subtask T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata) as "Resolved". TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org/sett

[Wikidata-bugs] [Maniphest] [Commented On] T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-13 Thread jcrespo
jcrespo added a comment. The initial scope- query issues on ruwiki and commonswiki- has, I think, been successfully patched. These are the statistics I have now: F10184141: Screenshot_20171013_165442.png I would ask the affected users to confirm the issues are gone- probably not 100% of them will

[Wikidata-bugs] [Maniphest] [Unassigned] T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-13 Thread jcrespo
jcrespo removed jcrespo as the assignee of this task. TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: JEumerus, Alsee, awight, Matthewrbowker, Noella94, Nirmos, Stryn, Mike_Peel, Capankajsmilyo

[Wikidata-bugs] [Maniphest] [Commented On] T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-13 Thread jcrespo
jcrespo added a comment. For a separate ticket, other potential projects to keep an eye on: ca.wikipedia.org, fr.wikipedia.org, ro.wikipedia.org TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc

[Wikidata-bugs] [Maniphest] [Lowered Priority] T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-13 Thread jcrespo
jcrespo lowered the priority of this task from "Unbreak Now!" to "Normal".jcrespo added a comment. Lowering priority unless we get more reports from other ruwiki or commons users.TASK DETAILhttps://phabricator.wikimedia.org/T171027EMAIL PREFERENCEShttps://phabricator.wikimedi

[Wikidata-bugs] [Maniphest] [Commented On] T177772: Purge 90% of rows from recentchanges (and possibly defragment) from commonswiki and ruwiki (the ones with source:wikidata)

2017-10-13 Thread jcrespo
jcrespo added a comment. For the curious, 150GB of disk space (and memory) was freed with the commonswiki purge.TASK DETAILhttps://phabricator.wikimedia.org/T177772EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: gerritbot, Stashbot, Jdforrester-WMF

[Wikidata-bugs] [Maniphest] [Unblock] T90435: [Epic] Wikidata watchlist improvements (client)

2017-10-16 Thread jcrespo
jcrespo closed subtask T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis as "Resolved". TASK DETAILhttps://phabricator.wikimedia.org/T90435EMAIL PREFERENCEShttps://phabricator.wik

[Wikidata-bugs] [Maniphest] [Closed] T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis

2017-10-16 Thread jcrespo
jcrespo closed this task as "Resolved".jcrespo assigned this task to Bawolff.jcrespo added a comment. I am going to declare the initial issue as resolved, as the only feedback I got was positive. There is still a lot of followup: checking the issue on other wikis and having a more long-ter

[Wikidata-bugs] [Maniphest] [Retitled] T178661: Drop wb_entity_per_page views in labs

2017-10-20 Thread jcrespo
jcrespo renamed this task from "Drop replication of wb_entity_per_page in labs" to "Drop wb_entity_per_page views in labs".jcrespo updated the task description. (Show Details) CHANGES TO TASK DESCRIPTIONWe've got https://gerrit.wikimedia.org/r/382694 merged, and DBAs will

[Wikidata-bugs] [Maniphest] [Updated] T178661: Drop wb_entity_per_page views in Wiki Replicas

2017-11-15 Thread jcrespo
jcrespo added a comment. This is the direct cause of T180564.TASK DETAILhttps://phabricator.wikimedia.org/T178661EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: chasemp, jcrespoCc: bd808, chasemp, Magnus, hoo, daniel, jcrespo, Marostegui, Aklapper, Ladsgroup

[Wikidata-bugs] [Maniphest] [Created] T180694: Test testwikidatawiki on s8

2017-11-16 Thread jcrespo
jcrespo created this task.jcrespo added projects: Operations, Wikidata, MediaWiki-Configuration, DBA. TASK DESCRIPTIONIn order to test that moving wikidatawiki on production to s8 will not break things, we have thought of moving one non-user-impacting wiki first to test that everything is ok. Testwiki

[Wikidata-bugs] [Maniphest] [Retitled] T180694: Test moving testwikidatawiki database to s8 replica set on Wikimedia

2017-11-16 Thread jcrespo
jcrespo renamed this task from "Test testwikidatawiki on s8" to "Test moving testwikidatawiki database to s8 replica set on Wikimedia".jcrespo updated the task description. (Show Details) CHANGES TO TASK DESCRIPTIONIn order to test that moving wikidatawiki on production to s8 w

[Wikidata-bugs] [Maniphest] [Commented On] T180694: Test moving testwikidatawiki database to s8 replica set on Wikimedia

2017-11-16 Thread jcrespo
jcrespo added a comment. Thanks, Ladsgroup; for starters we were thinking of preparing the topology changes for s8 on codfw (which, if it breaks, wouldn't be a huge deal because it is passive right now). Here is the change: https://gerrit.wikimedia.org/r/391835 Moving testwikidatawiki wou

[Wikidata-bugs] [Maniphest] [Unassigned] T176277: Provision a separate DB shard for wbc_entity_usage

2017-11-22 Thread jcrespo
jcrespo removed jcrespo as the assignee of this task.jcrespo added a comment. This needs hardware provisioning, and that means budget, and that means a detailed plan with our overall ok.TASK DETAILhttps://phabricator.wikimedia.org/T176277EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings

[Wikidata-bugs] [Maniphest] [Commented On] T173710: Job queue is increasing non-stop

2017-11-24 Thread jcrespo
jcrespo added a comment. @Aklapper Probably, but I would close that one, as that should not be happening right now, unless you have reports saying it is again.TASK DETAILhttps://phabricator.wikimedia.org/T173710EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To

[Wikidata-bugs] [Maniphest] [Updated] T181486: Wikidata database corruption?

2017-11-28 Thread jcrespo
jcrespo edited projects, added Wikidata; removed Toolforge. TASK DETAILhttps://phabricator.wikimedia.org/T181486EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: Ladsgroup, jcrespo, Aklapper, Magnus, Lahi, Gq86, GoranSMilovanovic, QZanden, Wikidata

[Wikidata-bugs] [Maniphest] [Commented On] T181486: Wikidata database corruption?

2017-11-28 Thread jcrespo
jcrespo added a comment. @Magnus- you probably didn't see Ladsgroup's comment; apparently it requires editing, not purging, for the table to repopulate.TASK DETAILhttps://phabricator.wikimedia.org/T181486EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferenc

[Wikidata-bugs] [Maniphest] [Commented On] T178661: Drop wb_entity_per_page views in Wiki Replicas

2017-11-30 Thread jcrespo
jcrespo added a comment. In the longer term we should make some changes to maintain-views so that it can drop a single view without needing to rebuild all of the views for the database. Also, given that the "new" servers can be depooled almost at will, metadata locking can be avoided,

[Wikidata-bugs] [Maniphest] [Declined] T180694: Test moving testwikidatawiki database to s8 replica set on Wikimedia

2017-12-04 Thread jcrespo
jcrespo closed this task as "Declined".jcrespo added a comment. We are happy with the configuration on both eqiad and codfw, we do not need to test testwiki- we did it with wikidatawiki easily.TASK DETAILhttps://phabricator.wikimedia.org/T180694EMAIL PREFERENCEShttps://phabricator.wik

[Wikidata-bugs] [Maniphest] [Commented On] T183341: New item fails (Special and WEF tool)

2017-12-20 Thread jcrespo
jcrespo added a comment. The LIMIT 1 FOR UPDATE (plus what marostegui comments) indicates that this is not a lag problem, but a contention problem (Error: 1205 Lock wait timeout exceeded)- many items wanting to lock the same rows at the same time. There is nothing for the DBAs to do here; code should

[Wikidata-bugs] [Maniphest] [Updated] T183341: New item fails (Special and WEF tool)

2017-12-20 Thread jcrespo
jcrespo edited projects, added MediaWiki-Database; removed DBA.jcrespo added a comment. @Billinghurst In non-technical terms, it seems to be an overload of some kind- something that normally doesn't happen, but that the code should prevent from happening in any case (e.g. it could be an unre

[Wikidata-bugs] [Maniphest] [Commented On] T183341: New item fails (Special and WEF tool)

2017-12-20 Thread jcrespo
jcrespo added a comment. @Ladsgroup After looking at the database logs, I can confirm there is something really bad with the wikidata new item logic- each request has to lock the table, which essentially serializes creations and puts a limit on the number of items that can be created per second (in a bad
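A sketch of the kind of locking pattern that produces this serialization, along the lines of Wikibase's SqlIdGenerator (the table and column names are approximate, shown only to illustrate why concurrent creations queue up):
BEGIN;
SELECT id_value FROM wb_id_counters
 WHERE id_type = 'wikibase-item'
 LIMIT 1
   FOR UPDATE;   -- every concurrent creation blocks on this single row lock
UPDATE wb_id_counters
   SET id_value = id_value + 1
 WHERE id_type = 'wikibase-item';
COMMIT;          -- the lock is held until commit, so creations are effectively serialized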

[Wikidata-bugs] [Maniphest] [Commented On] T183341: New item fails (Special and WEF tool)

2017-12-20 Thread jcrespo
jcrespo added a comment. @Billinghurst I was actually praising you for the report, for taking the time, and saying sorry it happened :-)TASK DETAILhttps://phabricator.wikimedia.org/T183341EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: jcrespo

[Wikidata-bugs] [Maniphest] [Commented On] T183341: New item fails (Special and WEF tool)

2017-12-20 Thread jcrespo
jcrespo added a comment. @Ladsgroup Without looking, I imagine a bot overloaded the creation, something that should be easy to check on recentchanges. Of course, that doesn't take away from avoiding or reducing the lock as much as possible.TASK DETAILhttps://phabricator.wikimedia.org/T183341

[Wikidata-bugs] [Maniphest] [Commented On] T183341: New item fails (Special and WEF tool)

2017-12-20 Thread jcrespo
jcrespo added a comment. Yes, the slowdown should avoid issues until a fix is conceived.TASK DETAILhttps://phabricator.wikimedia.org/T183341EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: Lydia_Pintscher, Daniel_Mietchen, mark, jcrespo, Marostegui

[Wikidata-bugs] [Maniphest] [Commented On] T181645: Help communicate read-only time for dewiki and wikidata for database split

2017-12-22 Thread jcrespo
jcrespo added a comment. As a note, in some parts of the world, in particular SFO, it will still be the 8th.TASK DETAILhttps://phabricator.wikimedia.org/T181645EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: Lea_Lacroix_WMDE, jcrespoCc: JStrodt_WMDE

[Wikidata-bugs] [Maniphest] [Updated] T144382: Spiky write pattern on core db masters

2016-08-31 Thread jcrespo
jcrespo added a project: Wikidata.jcrespo added a comment. An unreliable, heuristic approach seems to point to old knowns: LinksUpdate and, less probably, Wikibase\Client\Usage\Sql\EntityUsageTable. Maybe its job execution could be reviewed to do shorter, more frequent runs?TASK DETAILhttps

[Wikidata-bugs] [Maniphest] [Closed] T140968: Wikibase refreshLinks hook breaks implicit transactions

2016-08-31 Thread jcrespo
jcrespo closed this task as "Resolved".jcrespo claimed this task.jcrespo added a comment. I see this and T140955 resolved since Aug 4.TASK DETAILhttps://phabricator.wikimedia.org/T140968EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoC

[Wikidata-bugs] [Maniphest] [Closed] T140955: Wikibase\Repo\Store\WikiPageEntityStore::updateWatchlist: Automatic transaction with writes in progress (from DatabaseBase::query (LinkCache::addLinkObj))

2016-08-31 Thread jcrespo
jcrespo closed this task as "Resolved".jcrespo claimed this task.jcrespo added a comment. I see this and T140968 resolved since Aug 4.TASK DETAILhttps://phabricator.wikimedia.org/T140955EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc:

[Wikidata-bugs] [Maniphest] [Created] T144398: wikibase-addUsagesForPage generates transaction error logs "Implicit transaction already active" "Explicit commit of implicit transaction" "Implicit tran

2016-08-31 Thread jcrespo
jcrespo created this task.jcrespo added projects: Wikidata, Wikimedia-log-errors.Herald added a subscriber: Aklapper. TASK DESCRIPTIONThey are 90% of the db-related logs right now; I would bet they are not critical errors, but the main issue is that they make it difficult to monitor actual errors

[Wikidata-bugs] [Maniphest] [Commented On] T143818: Wikibase\SqlIdGenerator::generateNewId: Explicit commit of implicit transaction.

2016-09-01 Thread jcrespo
jcrespo added a comment. Thank you for the quick fix. It is highly appreciated.TASK DETAILhttps://phabricator.wikimedia.org/T143818EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: hoo, jcrespoCc: jcrespo, greg, gerritbot, hashar, Aklapper, D3r1ck01, Izno

[Wikidata-bugs] [Maniphest] [Commented On] T104762: Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry)

2016-09-14 Thread jcrespo
jcrespo added a comment. @Base, your questions are very interesting, and you seem to have really nice suggestions, but I would suggest a mailing list, wiki talk page (or if it was a bug/feature request, doing them on a separate ticket), as the preferred way to communicate. This ticket is probably

[Wikidata-bugs] [Maniphest] [Created] T146079: s5-master contention caused (?) by refreshlinksprioritized job running for all wikis

2016-09-19 Thread jcrespo
jcrespo created this task.jcrespo added projects: Wikimedia-log-errors, Wikidata, DBA, MediaWiki-JobRunner.Herald added a subscriber: Aklapper. TASK DESCRIPTIONThere is a high number of connection errors to 10.64.16.144 (db1049, or s5-master), probably caused by a high number of connections such as

[Wikidata-bugs] [Maniphest] [Retitled] T146079: s5-master contention caused (?) by refreshlinksprioritized/addUsagesForPage jobs running for all wikis

2016-09-19 Thread jcrespo
jcrespo changed the title from "s5-master contention caused (?) by refreshlinksprioritized job running for all wikis" to "s5-master contention caused (?) by refreshlinksprioritized/addUsagesForPage jobs running for all wikis". TASK DETAILhttps://phabricator.wikime

[Wikidata-bugs] [Maniphest] [Commented On] T146079: s5-master contention caused (?) by refreshlinksprioritized/addUsagesForPage jobs running for all wikis

2016-09-20 Thread jcrespo
jcrespo added a comment. The errors have slowed down to the point they disappeared. I will close this in my afternoon as invalid/a one-time spike unless it reappears.TASK DETAILhttps://phabricator.wikimedia.org/T146079EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] [Closed] T146079: s5-master contention caused (?) by refreshlinksprioritized/addUsagesForPage jobs running for all wikis

2016-09-20 Thread jcrespo
jcrespo closed this task as "Resolved".jcrespo claimed this task.jcrespo added a comment. There are still some spikes here and there, but the level of errors is low and it is only 30% of the total, so I am resolving. Probably there was an ongoing import causing higher load than usual. T

[Wikidata-bugs] [Maniphest] [Created] T146529: 5-10 times more rows are currently read from the s5 master than from all other masters together

2016-09-24 Thread jcrespo
jcrespo created this task.jcrespo added projects: Performance, Wikidata, DBA.Herald added a subscriber: Aklapper. TASK DESCRIPTIONF4519503: Screenshot from 2016-09-24 11-12-03.png https://grafana-admin.wikimedia.org/dashboard/db/mysql-aggregated?from=1474704583828&to=1474708183828&var-

[Wikidata-bugs] [Maniphest] [Commented On] T146529: 5-10 times more rows are currently read from the s5 master than from all other masters together

2016-09-25 Thread jcrespo
jcrespo added a comment. It only stopped from 19:24:45 to 20:00:45 on the 24th. I see an unusual amount of GET_LOCK('LinksUpdate:job:pageid:28776419', 15) on s5-master, mostly for linkupdates and category memberships, which is strange because I would only expect that if there was l
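For context, GET_LOCK is an advisory named lock, so the pattern seen in the processlist looks roughly like this (the lock name is taken from the log line above; the work in between is whatever the job does):
SELECT GET_LOCK('LinksUpdate:job:pageid:28776419', 15);   -- waits up to 15 seconds if another connection holds it
-- ... perform the links update for that page ...
SELECT RELEASE_LOCK('LinksUpdate:job:pageid:28776419');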

[Wikidata-bugs] [Maniphest] [Commented On] T146576: 502 Bad Gateway errors while trying to run simple queries with the Wikidata Query Service

2016-09-25 Thread jcrespo
jcrespo added a comment. According to icinga/grafana, several servers have lag (including that one).TASK DETAILhttps://phabricator.wikimedia.org/T146576EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: jcrespo, Stashbot, Multichill, Gehel, Smalyshev

[Wikidata-bugs] [Maniphest] [Commented On] T146529: 5-10 times more rows are currently read from the s5 master than from all other masters together

2016-09-25 Thread jcrespo
jcrespo added a comment. This stopped suddenly at 08:20:42 on the 25th.TASK DETAILhttps://phabricator.wikimedia.org/T146529EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: aaron, Aklapper, jcrespo, Marostegui, Vali.matei, Minhnv-2809, D3r1ck01, Izno

[Wikidata-bugs] [Maniphest] [Closed] T146529: 5-10 times more rows are currently read from the s5 master than from all other masters together

2016-09-28 Thread jcrespo
jcrespo closed this task as "Resolved".jcrespo claimed this task.jcrespo added a comment. This stopped happening 2016-09-27 ~07:00.TASK DETAILhttps://phabricator.wikimedia.org/T146529EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc:

[Wikidata-bugs] [Maniphest] [Commented On] T146576: 502 Bad Gateway errors while trying to run simple queries with the Wikidata Query Service

2016-10-01 Thread jcrespo
jcrespo added a comment. By looking at https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?panelId=15&fullscreen and doing some testing, I think this particular issue is solved. I would suggest to close this as resolved and create new tickets for remaining, unrelated issues.

[Wikidata-bugs] [Maniphest] [Created] T147748: Large number of CategoryMembershipChangeJob::run updates are failing

2016-10-09 Thread jcrespo
jcrespo created this task.jcrespo added projects: Wikidata, MediaWiki-General-or-Unknown, MediaWiki-JobRunner, MediaWiki-JobQueue.Herald added a subscriber: Aklapper. TASK DESCRIPTIONAfter handling an incident: T147747 I saw a large amount of jobs complaining on the mediawiki logs

[Wikidata-bugs] [Maniphest] [Triaged] T147748: Large number of CategoryMembershipChangeJob::run updates are failing

2016-10-09 Thread jcrespo
jcrespo triaged this task as "Unbreak Now!" priority.jcrespo added a comment.Herald added subscribers: Jay8g, Luke081515, TerraCodes. Unbreak now because it makes the logs impossible for me to read right now while unrelated ongoing issues are happening. Just stopping the logging

[Wikidata-bugs] [Maniphest] [Commented On] T147748: Large number of CategoryMembershipChangeJob::run updates are failing

2016-10-09 Thread jcrespo
jcrespo added a comment. Probably related, queries like this are causing thousands of errors a day:
Hits   Tmax  Tavg  Tsum    Hosts   Users     Schemas
11910  33    4     52,440  db1087  wikiuser  wikidatawiki
SELECT /* Wikibase\Lib\Store\Sql\SqlEntityInfoBuilder::collectTermsForEntities */ term_entity_type

[Wikidata-bugs] [Maniphest] [Updated] T147748: Large number of CategoryMembershipChangeJob::run updates are failing

2016-10-10 Thread jcrespo
jcrespo added a project: Wikimedia-log-errors. TASK DETAILhttps://phabricator.wikimedia.org/T147748EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: jcrespoCc: MZMcBride, TerraCodes, Luke081515, Jay8g, Aklapper, jcrespo, D3r1ck01, Izno, Wikidata-bugs, aude

[Wikidata-bugs] [Maniphest] [Commented On] T147748: Large number of CategoryMembershipChangeJob::run updates are failing

2016-10-10 Thread jcrespo
jcrespo added a comment. It is clear to me that the direct cause for the connection issues is pileups coming from queries such as:
     Id: 6451784424
   User: wikiuser
   Host: 10.64.32.35:38688
     db: wikidatawiki
Command: Query
   Time: 18
  State: Sending data
   Info: SELECT /* Wikibase

[Wikidata-bugs] [Maniphest] [Commented On] T147748: Large number of CategoryMembershipChangeJob::run updates are failing

2016-10-10 Thread jcrespo
jcrespo added a comment. Thank you for the quick response; seeing the effect already, I am more than ok with putting this as high or normal (sorry, I didn't see it sooner). Is the slow Wikibase\Lib\Store\Sql\SqlEntityInfoBuilder::collectTermsForEntities related to the job or is it a separate

[Wikidata-bugs] [Maniphest] [Commented On] T145412: Review & work on Cognate extension

2016-10-11 Thread jcrespo
jcrespo added a comment. Performance and scalability. We need a way to efficiently track and query page names across all Wiktionaries. Why not solve the problem forever by making title a first-class entity in core, solving the title and *link, page_assessment, etc. space issues at the same time? I am

[Wikidata-bugs] [Maniphest] [Commented On] T147748: Large number of CategoryMembershipChangeJob::run updates are failing

2016-10-13 Thread jcrespo
jcrespo added a comment. I conclude that we should run one query per term type here and simply merge the result in the application code. Yes, this is a property of B-tree indexes- 2 ranges cannot be handled efficiently- you can only handle one range with an index at a time. You could even merge both
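A sketch of the "one query per term type" rewrite, reusing the old wb_terms column names; the entity ids, term types and languages are placeholders, and the exact shape of the original two-list query is an assumption:
-- instead of one query filtering on two lists at once (term_type IN (...) AND term_language IN (...)),
-- run one query per term type and merge the results in application code:
SELECT term_entity_id, term_language, term_text
  FROM wb_terms
 WHERE term_entity_id IN (123, 456)
   AND term_type = 'label'
   AND term_language IN ('en', 'de');

SELECT term_entity_id, term_language, term_text
  FROM wb_terms
 WHERE term_entity_id IN (123, 456)
   AND term_type = 'description'
   AND term_language IN ('en', 'de');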

[Wikidata-bugs] [Maniphest] [Commented On] T145412: Review & work on Cognate extension

2016-10-13 Thread jcrespo
jcrespo added a comment. Title is already a first-class entity, that's what the page table is. No, right now, page is an entity that has a series of properties: a title, a text, etc. By setting title as a strong entity, it has meaning on its own: a page has 1:1 titles, a title has 1:0

[Wikidata-bugs] [Maniphest] [Unassigned] T144010: Drop eu_touched in production

2016-10-20 Thread jcrespo
jcrespo removed jcrespo as the assignee of this task.jcrespo added a comment. Please note that neither ticket T51188 nor #Schema-change tags are monitored by DBAs. See: https://wikitech.wikimedia.org/wiki/Schema_changes and https://www.mediawiki.org/wiki/Development_policy

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Initial Cognate DB review

2016-10-25 Thread jcrespo
jcrespo added a comment. @Addshore I have some questions; they are not long, but they depend on each other, so I would love to chat with you when you find the time, as that will simplify the interaction. I am in the European timezone, so if you can find some time to meet on IRC, that would be great. The

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Initial Cognate DB review

2016-10-26 Thread jcrespo
jcrespo added a comment. So, with the feedback @Addshore gave me on IRC, I would suggest converting, in the first case:
CREATE TABLE IF NOT EXISTS /*_*/cognate_titles (
  cgti_site VARBINARY(32) NOT NULL,
  cgti_namespace INT NOT NULL,
  cgti_title VARBINARY(255),
  cgti_key VARBINARY(255) NOT

[Wikidata-bugs] [Maniphest] [Claimed] T148988: Initial Cognate DB review

2016-10-26 Thread jcrespo
jcrespo claimed this task.jcrespo moved this task from Triage to In progress on the DBA board.jcrespo triaged this task as "Normal" priority.jcrespo added a comment. I will put this in progress, but I will be waiting for further feedback.TASK DETAILhttps://phabricator.wik

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Cognate DB review

2016-11-02 Thread jcrespo
jcrespo added a comment.
DROP TABLE cognate_sites;
DROP TABLE cognate_titles;
DROP TABLE cognate_normalizations;
/* Create the tables */
CREATE TABLE IF NOT EXISTS cognate_sites (
This is strange syntax. Normally, we want to DROP TABLE IF EXISTS, and force its creation, but anyway. For
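The pattern being suggested instead, as a minimal sketch (the columns of cognate_sites here are illustrative only, not the schema under review):
DROP TABLE IF EXISTS cognate_sites;
CREATE TABLE cognate_sites (
  cgsi_key    VARBINARY(32) NOT NULL,   -- illustrative columns
  cgsi_dbname VARBINARY(32) NOT NULL,
  PRIMARY KEY (cgsi_key)
);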

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Cognate DB review

2016-11-02 Thread jcrespo
jcrespo added a comment. I think the key here that may have been missed in our previous chat is that the normalization step is the same for all wikis. Ok, assuming that, which can create issues in the future, but that is your decision. Why do you need the other tables? You are maintaining core

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Cognate DB review

2016-11-02 Thread jcrespo
jcrespo added a comment.
RIGHT JOIN cognate_titles ON cn_raw = ct_title
LEFT JOIN cognate_sites ON ct_site = cs_dbname
No thanks, please continue with your original proposal, which was way better than this.TASK DETAILhttps://phabricator.wikimedia.org/T148988EMAIL PREFERENCEShttps

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Cognate DB review

2016-11-02 Thread jcrespo
jcrespo added a comment. Is there a reason why T148988#2744793, with some fixes based on your latest feedback, could not work? I do not know what you are trying to achieve here, honestly, only you know- and I do not have time to work on this unless you schedule it appropriately some months in

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Cognate DB review

2016-11-02 Thread jcrespo
jcrespo added a comment. A strictly unique key (possible PK) in the table would be a combination of all 4 fields. Do you see this as being too much? Would the id vs string for site make a difference here? Another option could be a PK on the site, namespace and title_key (but this could run into

[Wikidata-bugs] [Maniphest] [Unassigned] T148988: Cognate DB review

2016-11-02 Thread jcrespo
jcrespo removed jcrespo as the assignee of this task.jcrespo moved this task from In progress to Blocked external/Not db team on the DBA board. TASK DETAILhttps://phabricator.wikimedia.org/T148988WORKBOARDhttps://phabricator.wikimedia.org/project/board/1060/EMAIL PREFERENCEShttps

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Cognate DB review

2016-11-02 Thread jcrespo
jcrespo added a comment. the NULL values in the title field where the title is the same as the normalized title (to cut back on duplicated data) This is the kind of thing that makes SELECTs not simple- you have to do an IF in the code based on whether the value is NULL or not. That is not normalized
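The extra branching being objected to can be seen in a small sketch, reusing the cgti_* column names from the earlier proposal (the site value is a placeholder): with NULL standing in for "same as the normalized title", every reader needs a fallback such as
SELECT COALESCE(cgti_title, cgti_key) AS title
  FROM cognate_titles
 WHERE cgti_site = 'enwiktionary';
whereas storing the title explicitly keeps the SELECT a plain column read.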

[Wikidata-bugs] [Maniphest] [Commented On] T148988: Cognate DB review

2016-11-09 Thread jcrespo
jcrespo added a comment. @jcrespo Tags? What tags? And why multiple tags for the same title? We are comparing page titles between wikis, with some minimal string normalization applied. That's it. The above was an unrelated example, with columns that had nothing to do with the current desi

[Wikidata-bugs] [Maniphest] [Commented On] T151356: Wikibase\Repo\Store\Sql\SqlEntityIdPager::fetchIds query slow

2016-11-22 Thread jcrespo
jcrespo added a comment. This is the goal, but we will try to achieve this without the force index, depending on how much I can change the original query: MariaDB MARIADB db1082 wikidatawiki > EXPLAIN SELECT page_id, page_title FROM `page` FORCE INDEX(PRIMARY) WHERE (page_id > 312807

[Wikidata-bugs] [Maniphest] [Commented On] T151356: Wikibase\Repo\Store\Sql\SqlEntityIdPager::fetchIds query slow

2016-11-23 Thread jcrespo
jcrespo added a comment. The problem is not how to do the force, that can be added directly; the problem is that force index is a poor workaround: if the index or the row distribution changes in the future, we may be forcing a bad query plan.TASK DETAILhttps://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] [Commented On] T151356: Wikibase\Repo\Store\Sql\SqlEntityIdPager::fetchIds query slow

2016-11-23 Thread jcrespo
jcrespo added a comment. So we have the choice of forcing a bad query plan ourselves, or leaving it to Maria to pick a bad query plan... No, why is this a binary choice? Maybe we can rewrite the query to do what we want, and it changes if the environment changes. What we do not want to do at

[Wikidata-bugs] [Maniphest] [Commented On] T151356: Wikibase\Repo\Store\Sql\SqlEntityIdPager::fetchIds query slow

2016-11-23 Thread jcrespo
jcrespo added a comment. If we have to fall back to index hinting, this should be preferred- ignore rather than force: MariaDB [wikidatawiki]> FLUSH STATUS; pager cat > /dev/null; SELECT page_id,page_title FROM `page` IGNORE INDEX(page_redirect_namespace_len) WHERE (page_id > 312

[Wikidata-bugs] [Maniphest] [Commented On] T151356: Wikibase\Repo\Store\Sql\SqlEntityIdPager::fetchIds query slow

2016-11-23 Thread jcrespo
jcrespo added a comment. The histograms are still not enough to convince T151356#2817971 to not optimize the page_is_redirect condition. My advice is to rewrite the query into: MariaDB [wikidatawiki]> FLUSH STATUS; pager cat > /dev/null; SELECT page_id,page_title,page_is_redirect FROM

[Wikidata-bugs] [Maniphest] [Commented On] T151356: Wikibase\Repo\Store\Sql\SqlEntityIdPager::fetchIds query slow

2016-11-24 Thread jcrespo
jcrespo added a comment. I am ok with that, but in my version, it doesn't create a temporary table (maybe you were mixing 2 different executions?): MariaDB [wikidatawiki]> FLUSH STATUS; pager cat > /dev/null; SELECT page_id,page_title FROM `page` LEFT JOIN redirect ON rd_from = pa
