[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-06-21 Thread Gehel
Gehel closed this task as "Resolved".

TASK DETAIL
  https://phabricator.wikimedia.org/T275133

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse, Gehel
Cc: dcausse, Gehel, Aklapper, Invadibot, MPhamWMF, maantietaja, wkandek, 
JMeybohm, CBogen, Akuckartz, Nandana, Namenlos314, jijiki, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Addshore, Mbch331, Dzahn
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-06-10 Thread dcausse
dcausse claimed this task.
dcausse moved this task from Ready for Development to Needs Reporting on the 
Discovery-Search (Current work) board.
dcausse added a comment.


  I ran a backfill (reaching the app servers directly) using a thread pool of 
size 6 on each of 12 workers (72 concurrent requests at most), and the impact 
on the app servers was barely noticeable. We can control the parallelism to 
some extent using the options of the Flink pipeline itself. The pipeline has 
been running from YARN with these values for a couple of months now, so I'm 
tentatively calling this done.
  We can reconsider using other techniques, such as our PoolCounter, but I 
doubt this is worth the effort at this point.
  
  Related pipeline options:
  
parallelism: 12
wikibase_repo_thread_pool_size: 6
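
  As a rough illustration of how these two options bound the load, the sketch 
below multiplies them out. It assumes every Flink worker owns one thread pool 
of the configured size; the `max_concurrent_requests` helper and the dict 
layout are hypothetical and not part of the updater.

```python
# Hypothetical helper: theoretical ceiling on simultaneous requests to
# Wikidata, assuming one thread pool of the configured size per worker.
pipeline_options = {
    "parallelism": 12,                    # number of parallel workers
    "wikibase_repo_thread_pool_size": 6,  # fetch threads per worker
}


def max_concurrent_requests(options: dict) -> int:
    """Return workers * threads-per-worker."""
    return options["parallelism"] * options["wikibase_repo_thread_pool_size"]


print(max_concurrent_requests(pipeline_options))  # 72
```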

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/



[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-04-21 Thread jijiki
jijiki added a project: User-jijiki.



[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-04-15 Thread dcausse
dcausse added a comment.
Restricted Application added a project: wdwb-tech.


  Since we are going to use Envoy to contact the MW application servers, I 
wonder whether this kind of limit could be enforced by it.
  
  Today I think the WDQS updaters talk to the edge caches, so some requests 
might not reach the app servers, but with Envoy we will always hit the app 
servers.
  
  I have no clue what a reasonable limit would be here. I collected some stats 
on backend timings for the first 7 days of April 2021 (time_firstbyte on cache 
misses for `/wiki/Special:EntityData/QXYZ.ttl?flavor=dump&revision=XYZ`):
  
  | Day of April | Count   | p50   | p75   | p95   | p99   |
  | ------------ | ------- | ----- | ----- | ----- | ----- |
  | 1            | 1241154 | 0.083 | 0.104 | 0.157 | 0.212 |
  | 2            | 1570675 | 0.084 | 0.105 | 0.156 | 0.210 |
  | 3            | 1315251 | 0.083 | 0.103 | 0.153 | 0.209 |
  | 4            | 1064852 | 0.081 | 0.102 | 0.155 | 0.209 |
  | 5            | 1232205 | 0.081 | 0.103 | 0.154 | 0.209 |
  | 6            | 1242875 | 0.082 | 0.103 | 0.156 | 0.209 |
  | 7            | 1257607 | 0.082 | 0.103 | 0.157 | 0.212 |
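
  One way to turn these numbers into a feel for concurrency is Little's law 
(mean requests in flight = arrival rate x time in system). The sketch below is 
only a back-of-the-envelope estimate under several assumptions that are not 
stated in the stats themselves: that the timings are in seconds, that requests 
are spread evenly over the day, and that p50 can stand in for the mean latency. 
The `mean_in_flight` helper is hypothetical.

```python
# Daily request count and p50 latency, copied from the table above.
# Assumption: latencies are in seconds.
daily_stats = {
    1: (1241154, 0.083),
    2: (1570675, 0.084),
    3: (1315251, 0.083),
    4: (1064852, 0.081),
    5: (1232205, 0.081),
    6: (1242875, 0.082),
    7: (1257607, 0.082),
}

SECONDS_PER_DAY = 86_400


def mean_in_flight(count: int, latency_s: float) -> float:
    """Little's law: average in-flight requests = rate * latency."""
    rate_per_s = count / SECONDS_PER_DAY  # assumes evenly spread arrivals
    return rate_per_s * latency_s


for day, (count, p50) in daily_stats.items():
    rate = count / SECONDS_PER_DAY
    print(f"April {day}: {rate:.1f} req/s, "
          f"~{mean_in_flight(count, p50):.2f} requests in flight on average")
```

  Under these assumptions the averages come out at roughly 1 to 1.5 requests 
in flight. Real traffic is bursty, so peaks will be higher than this average 
suggests.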



[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-02-22 Thread MPhamWMF
MPhamWMF set the point value for this task to "2".



[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-02-22 Thread MPhamWMF
MPhamWMF moved this task from All WDQS-related tasks to Current work on the 
Wikidata-Query-Service board.
MPhamWMF added a project: Discovery-Search (Current work).

WORKBOARD
  https://phabricator.wikimedia.org/project/board/891/



[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-02-22 Thread MPhamWMF
MPhamWMF updated the task description.



[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-02-18 Thread Gehel
Gehel added a parent task: T244590: [Epic] Rework the WDQS updater as an event 
driven application.



[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-02-18 Thread Gehel
Gehel created this task.
Gehel added projects: serviceops, Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As an operator of WDQS, I want to ensure that the Updater isn't overloading 
dependent services by limiting max concurrent requests.
  
  Given the recent incident where the Flink-based WDQS streaming updater was 
blocked because it seemed to generate too much traffic, we want to enforce 
appropriate limits on the query parallelism it can generate. The implementation 
is already in place; this is just a matter of finding the appropriate 
configuration. The new updater is a lot more efficient than the current one 
(which is the whole point), so it can potentially generate a lot more load.
  
  As a distributed application, this updater has concurrency limits on each 
node (currently 12 nodes x 6 threads = 72 concurrent requests max). Note that 
this maximum is never reached in practice, since each node also does work other 
than querying Wikidata.
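  
  To illustrate how a fixed-size thread pool caps concurrency on a single 
node, here is a minimal sketch. It is not the updater's code: `fetch_entity` 
is a hypothetical stand-in that sleeps instead of calling Wikidata, and 
`InFlightCounter` exists only to observe the peak number of simultaneous calls.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 6  # threads per node, as in the task description


class InFlightCounter:
    """Tracks how many calls are running at once and the highest value seen."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self) -> "InFlightCounter":
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc) -> bool:
        with self._lock:
            self.current -= 1
        return False  # do not swallow exceptions


counter = InFlightCounter()


def fetch_entity(entity_id: str) -> str:
    """Hypothetical fetch: simulates request latency with a short sleep."""
    with counter:
        time.sleep(0.01)
    return entity_id


# However many tasks are queued, at most POOL_SIZE of them run at a time.
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    results = list(pool.map(fetch_entity, (f"Q{i}" for i in range(100))))

print(f"fetched {len(results)} entities, peak concurrency {counter.peak}")
```

  With 12 such nodes, the cluster-wide ceiling is the 72 concurrent requests 
quoted above.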
  
  In normal operations, the number of requests is limited by the edit rate on 
Wikidata, and whatever limit we set must be sufficient to support that edit 
rate.
  
  During initial data load, the backlog of edits is consumed as fast as 
possible, limited by whatever concurrency we set. This is where having a 
reasonable limit is necessary.
  
  The current updater duplicates requests to Wikidata for each node (18 nodes 
at the moment). The new updater centralises this, so we can expect an 18x 
reduction in queries during normal operation once the new updater is pushed to 
production.
  
  AC:
  
  - parallelism limits are agreed on with Service Ops and configured
