I would probably open a task to have wget available in the Kubernetes cluster, and another, low-priority one to investigate why the connection gets dropped between Toolforge and dumps.w.o.
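In the meantime, a retry loop around curl might approximate the automatic retries that wget does on its own. This is only a minimal sketch, assuming curl is present in the container image (I have not verified that) and that the job writes to a file rather than a pipe. The helper name `retry`, the attempt count and the delay are made up for illustration.

```shell
#!/bin/bash
# retry MAX DELAY CMD...  (hypothetical helper, not part of any Toolforge tooling)
# Runs CMD until it succeeds, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 1 if every attempt failed.
retry() {
    local max=$1 delay=$2 attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example invocation (commented out so the sketch does nothing when sourced).
# "-C -" asks curl to resume from the size of the partial output file,
# so each attempt continues where the previous one stopped.
# retry 20 60 curl --fail --location -C - \
#     -o wikidatawiki-latest-pages-articles-multistream.xml.bz2 \
#     https://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-articles-multistream.xml.bz2
```

Saving to a file instead of piping straight into bzip2 is deliberate here: a dropped connection can be resumed into a file, but not into a stream that has already been partly consumed.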
On Sat, 13 Jan 2024 at 08:42, Wurgl <heisewu...@gmail.com> wrote:

> Hello!
>
> wget was the tool I was using with jsub-Environment, but wget is not
> available any more in kubernetes (with toolforge jobs start …) :-(
>
> $ webservice php7.4 shell
> tools.persondata@shell-1705135256:~$ wget
> bash: wget: command not found
>
>
> Wolfgang
>
>
> On Sat, 13 Jan 2024 at 02:20, Platonides <platoni...@gmail.com> wrote:
>
>> Gerhard said that for him the downloading job ran for about 12 hours. It
>> seems the connection was closed.
>> I wouldn't be surprised if this was facing a similar problem as
>> https://phabricator.wikimedia.org/T351876
>>
>> With such a long download time, it isn't that strange that there could be
>> connection errors (still something to look into, though, toolserver-to-Prod
>> shouldn't be suffering that).
>>
>> wget (used by Gerhard) retries automatically, perhaps curl doesn't and is
>> thus more susceptible to these errors.
>>
>> Try changing your job to
>> wget -O - https://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-articles-multistream.xml.bz2 | bzip2 -d | tail
>>