Volans added subscribers: cmooney, ayounsi, Volans.
Volans reopened this task as "Open".
Volans added a comment.
Restricted Application added a project: wdwb-tech.


  I'm re-opening this as a follow-up to a discussion in this CR 
<https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/745629/3/cookbooks/sre/wdqs/data-reload.py#25>.
  I think we should find a solution for this: it is not ideal that we 
have to rely on an external source for our own data, and it raises some concerns:
  
  - the total size of the 3 files to download is over 100GB
  - with the external URL I guess you have to use the HTTP proxies, adding 
unnecessary strain there
  - the integrity of those files should be verified against a checksum coming 
from our internal and authoritative dumps, which doesn't seem to be the case 
AFAICT
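
  To illustrate the integrity check, here is a minimal sketch in Python. It 
assumes the expected digest has already been obtained from the internal dumps 
host; the function names and the choice of SHA-1 as the default are 
illustrative assumptions, not what the cookbook currently does.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha1",
                chunk_size: int = 1024 * 1024) -> str:
    """Return the hex digest of a file.

    The file is read in fixed-size chunks so that dumps of tens of GB
    never have to fit in memory. The default algorithm is an assumption.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str,
                    algorithm: str = "sha1") -> bool:
    """Compare a downloaded file against a digest from the authoritative source."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

  The cookbook could call something like verify_download() after each download 
and abort the reload on a mismatch.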
  
  Some alternatives that are worth investigating:
  
  - fix the rate-limiting issue for internal clients only. The current dumps 
host has a 10G NIC, so networking shouldn't be the bottleneck; I'm not sure 
about the disk side of it.
  - evaluate rsync for the transfer. For example, we could have a slow rsync 
that periodically copies only the required files to another host, and then 
have the cookbook fetch them quickly from that other location (via either 
rsync or curl).
  - evaluate transfer.py for this use case (see 
https://wikitech.wikimedia.org/wiki/Transfer.py)
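
  As a rough sketch of the "slow rsync" idea, the helper below only builds the 
rsync command line for copying a fixed set of files with a bandwidth cap. The 
helper name, host names, paths, file names and rate used here are all 
hypothetical placeholders; only the rsync options themselves (--archive, 
--partial, --bwlimit, --include, --exclude) are real.

```python
from typing import List, Sequence


def build_rsync_command(source: str, destination: str,
                        filenames: Sequence[str],
                        bwlimit_kib: int = 20000) -> List[str]:
    """Build an rsync invocation that copies only the named files.

    Assumes the files sit directly in the source directory. With no
    suffix, --bwlimit is expressed in units of 1024 bytes per second.
    """
    command = ["rsync", "--archive", "--partial", f"--bwlimit={bwlimit_kib}"]
    # Include rules must come before the catch-all exclude: first match wins.
    command += [f"--include={name}" for name in filenames]
    command.append("--exclude=*")
    command += [source, destination]
    return command


if __name__ == "__main__":
    # Hypothetical source module, destination and file name.
    print(" ".join(build_rsync_command(
        "dumps.example.org::dumps/wikidata/",
        "/srv/wdqs-dumps/",
        ["latest-all.ttl.bz2"],
    )))
```

  The resulting list could be passed to subprocess.run() from a periodic job, 
leaving the cookbook to read from the destination host at full speed.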
  
  Adding @ayounsi and @cmooney in case they have any comments on the network 
side of it.

TASK DETAIL
  https://phabricator.wikimedia.org/T222349

To: Gehel, Volans
Cc: Volans, ayounsi, cmooney, EBernhardson, Bstorm, ArielGlenn, Gehel, 
Aklapper, joanna_borun, Ramtin2021, Invadibot, MPhamWMF, dcaro, Devnull, 
Slst2020, GeminiAgaloos, maantietaja, nskaggs, lmata, Muchiri124, 
Raymond_Ndibe, CBogen, Nintendofan885, Akuckartz, Phamhi, RhinosF1, 
Legado_Shulgin, ReaperDawn, Nandana, Namenlos314, skpuneethumar, sietec, Zylc, 
Giuliamocci, Davinaclare77, 1978Gage2001, Techguru.pc, Lahi, Operator873, Gq86, 
Bsandipan, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Chicocvenancio, 
Allthingsgo, Hfbn0, QZanden, EBjune, Tbscho, merbst, LawExplorer, Zppix, 
JJMC89, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, Wong128hk, mys_721tx, 
jkroll, Wikidata-bugs, Jdouglas, Jitrixis, aude, Tobias1984, Manybubbles, 
Gryllida, faidon, scfc, Addshore, Mbch331, Jay8g, bd808, Krenair, fgiunchedi