dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  When we exposed the 3 experimental endpoints to test the first version of the 
graph split, we disabled query throttling to avoid impacting the various 
analyses we had to run to evaluate the impact of the split.
  While analyzing what happens when federated queries are running, we then 
realized that this throttling mechanism might have a negative impact, with 
wdqs nodes throttling each other.
  
  This ticket is about finding a plan to ensure that query throttling does not 
interfere with federation.
  
  A simple approach would be to make the wdqs machine receiving the client 
traffic responsible for throttling the client; subsequent queries made 
internally as part of federation would be un-throttled. Nodes serving federated 
results to other nodes would still remain protected by the frontend node 
answering the client.
  
  To achieve this we need to detect when a query is emitted from another query 
service and craft a header at the nginx level to inform the throttling servlet 
that it should not be activated.
  Such a header exists, but sadly the throttling filter re-uses the existing 
`X-BIGDATA-READ-ONLY` header, which has another purpose and so cannot be 
re-used in our context (it would be too dangerous).
  
  One approach could be to use a new header, `X-Disable-Throttling`, dedicated 
to this purpose. The nginx settings would have to be adapted to set 
`X-Disable-Throttling` when the query is emitted from another blazegraph 
node. Unfortunately this might start to throttle local requests made directly 
on the blazegraph port (updates); their clients (streaming-updater-consumer, 
data import scripts) would then have to be adapted to set this header.
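
  The drawback of this first approach can be sketched as follows. This is only 
an illustration in Python, not the actual throttling filter: the function name 
is hypothetical, and it assumes that throttling would apply to every request 
unless `X-Disable-Throttling` is present (one possible reading of the proposal).

```python
from typing import Mapping


def should_throttle_first_approach(headers: Mapping[str, str]) -> bool:
    """Hypothetical model: throttle unless X-Disable-Throttling is present.

    Assumption: only the presence of the header is checked, not its value.
    """
    # HTTP header names are case-insensitive, so compare in lower case.
    names = {name.lower() for name in headers}
    return "x-disable-throttling" not in names


# A federated sub-query tagged by nginx would bypass throttling.
federated = {"X-Disable-Throttling": "true"}
# A local update sent directly to the blazegraph port carries no such header,
# so it would become subject to throttling until the client is adapted.
local_update: dict[str, str] = {}

print(should_throttle_first_approach(federated))
print(should_throttle_first_approach(local_update))
```

  Under these assumptions the local update is the request that ends up 
throttled, which is why the updater and import scripts would need changes.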
  
  Another approach is to adapt the throttling servlet and change how it's 
configured, adding a new config `disable-throttling-if-header`, such that a 
request with:
  
  - `X-BIGDATA-READ-ONLY: 1` and `X-Disable-Throttling: true` would disable 
throttling
  - `X-BIGDATA-READ-ONLY: 1` only would enable throttling
  - a request without any of these headers would not enable throttling
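
  The three cases above can be written down as a small decision function. This 
is a minimal sketch in Python, not the servlet code: the function and constant 
names are hypothetical, and it assumes that the mere presence of 
`X-BIGDATA-READ-ONLY` enables throttling while `X-Disable-Throttling` only 
counts when its value is `true`.

```python
from typing import Mapping

READ_ONLY_HEADER = "X-BIGDATA-READ-ONLY"
# Header proposed in this task; the accepted value "true" is an assumption.
DISABLE_HEADER = "X-Disable-Throttling"


def should_throttle(headers: Mapping[str, str]) -> bool:
    """Return True when the request should go through throttling."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Assumption: presence of the read-only header is what enables throttling.
    enabled = READ_ONLY_HEADER.lower() in normalized
    disabled = normalized.get(DISABLE_HEADER.lower(), "").strip().lower() == "true"
    return enabled and not disabled


# The three cases from the list, in the same order.
print(should_throttle({READ_ONLY_HEADER: "1", DISABLE_HEADER: "true"}))
print(should_throttle({READ_ONLY_HEADER: "1"}))
print(should_throttle({}))
```

  With this shape, requests made directly on the blazegraph port without any 
header stay un-throttled, so the update clients would not need to change.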
  
  AC:
  
  - decide on the approach
  - blazegraph does not throttle itself when running federated queries

TASK DETAIL
  https://phabricator.wikimedia.org/T361950

To: dcausse
Cc: Aklapper, dcausse, AWesterinen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE, 
EBjune, KimKelting, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, 
aude, Tobias1984, Manybubbles