Hey Elmar, did you try the following?
wget2 -p -r -l 1 -N -D static.esquire.de https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur
It downloads 94 files; 44 of them are .jpg files in static.esquire.de/.

TBH, I am not 100% sure what you are trying to do, so excuse me if I am off track. The -p option downloads the files needed for displaying a page (e.g. inlined images). If the images are just links, they are not downloaded by -p; in that case, -r -l 1 helps. If the images displayed in the browser are downloaded/inserted by JavaScript, wget/wget2 won't help you.
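One quick way to tell which case you are in is to look at the HTML the server actually delivers, since wget/wget2 can only follow what is in that markup. A minimal sketch, assuming curl is available and that the image URLs live on static.esquire.de (the host seen above):

# Fetch the raw HTML (no JavaScript executed) and list the image URLs it contains.
# If this prints the images you see in the browser, -p / -r -l 1 can fetch them;
# if it prints nothing, they are most likely inserted by JavaScript.
curl -s https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur \
  | grep -oE 'https://static\.esquire\.de/[^" ]+\.jpg' \
  | sort -u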
Regards, Tim

On 9/13/23 00:14, Elmar Stellnberger wrote:
Hi to all!

Today I wanted to download the following web page for archiving purposes: https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur

The following command line did not do what I wanted:

wget -p -N -H -D esquire.de --tries=10 https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur

The following seemed to do it:

wget -p -r -N -H -D esquire.de --exclude-domains www.esquire.de --tries=10 https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur

Files downloaded:

now/static.esquire.de/1200x630/smart/images/2023-08/gettyimages-1391653079.jpg
now/www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur

dld.log:

... BEENDET --2023-09-12 23:18:01-- Verstrichene Zeit: 1,2s Geholt: 2 Dateien, 246K in 0,07s (3,62 MB/s)

i.e. this says "two files fetched, no error". Without -r and --exclude-domains it downloaded 52 files (most of them .js), all from www.esquire.de and none from static.esquire.de.

Finally I succeeded in downloading the images I wanted with the following (starting from the second file here, as I had downloaded the first one manually):

grep -o "https://static.esquire.de/[^ ]*\.jpg" schoenste-wasserfaelle-welt-natur.html | sed -n '2,500p' | while read line; do wget -p "$line"; done

It might (theoretically) be a bug of wget 1.21.4 (1.mga9, i.e. Mageia 9 i686) that it did not download more than two files on the second attempt, though that may also be supposed to be a public-avail-silicon fallacy by whomever wants to assume it.

BTW: 'wpdld' is my scriptlet to archive the web pages I read. For the pages it works on (using wget), I prefer it over a Firefox save-page, as it keeps the web page more or less in pristine state, so it can be mirrored like at the Wayback Machine if necessary. Not saving to disk what I read is something I have experienced can be nasty, because not every news article is kept online forever, or it may simply be deleted from the indexes of search engines (and from on-site searches). I would also have 'wpv' for viewing, but alas that isn't ready for multi-domain or non-relative links yet. So, what about a make-relative feature for already downloaded web pages on disk in wget2? That would be my wish, as I prefer to download with non-relative links, and converting them on disk allows a 'dircmp' (another self-written program to compare (and sync) directories; I have been using it more or less since 2008).

Regards, Elmar Stellnberger
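A note on the make-relative wish above: wget's -k/--convert-links already rewrites links for local viewing at download time, but for pages that are already on disk a rough post-hoc pass can be sketched in shell. A minimal sketch, assuming the mirror layout shown above (article under now/www.esquire.de/life/reisen/, images under now/static.esquire.de/), where three ../ segments match the page's directory depth:

# Hypothetical post-hoc "make relative" pass for one already-downloaded page.
# -i.bak keeps a backup of the original file; adjust the ../ count to the page's depth.
sed -i.bak 's|https://static\.esquire\.de/|../../../static.esquire.de/|g' \
    now/www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur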
