On Monday, 12 June 2017 17:07:30 CEST Chris wrote:
> Hi Tim,
>
> I just created a test page at -
> https://www.anfractuosity.com/files/test2.html
> where I still get the issue.
>
> The version is 'GNU Wget 1.19.1 built on linux-gnu.'

Thanks, Chris.

The issue is reproducible with latest git, thanks to your test page.

I'll create a test case tomorrow and then we'll fix it.
It has something to do with If-Modified-Since. If you use
--no-if-modified-since the links are converted correctly.

The good news is: Wget2 (https://gitlab.com/gnuwget/wget2) does it correctly :-)

With Best Regards, Tim

>
> cheers
> Chris
>
> On 12 June 2017 at 15:35, Tim Rühsen <tim.rueh...@gmx.de> wrote:
> > On 06/12/2017 10:27 AM, chris wrote:
> > > Hi Tim,
> > >
> > > Thanks for your reply, I notice the following in the debug logs:
> > >
> > > """
> > > will convert url
> > > http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to local
> > > site_output/fsk.png
> > > will convert url
> > > https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to local
> > > site_output/fsk.png.html
> > > """
> > >
> > > The difference between those URLs seems to be one is https and one
> > > isn't.
> > > When I wget those URLs though, both seem to return a .png, with 'Length:
> > > 51068 (50K) [image/png]'.
> > >
> > > So I'm a bit confused why I get the fsk.png.html URL.
> >
> > What version of wget are you using ? (1.19.1 here)
> >
> > I tried some combinations of srcset (with https and http) and your
> > original options. I thought of an issue with redirection (because that's
> > an answer with text/html Content-Type).
> >
> > Could you create a small reproducer page ? e.g. like
> > <html><body>
> > <img
> > srcset="https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png
> > 533w,
> > http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk-266x300.png
> > 266w"><a>
> > </body></html>
> >
> > With whatever paths you are using for the .png files.
> > I don't want to download tons of files (limited bandwidth here).
> >
> > > cheers
> > > Chris
> > >
> > > On Mon, Jun 12, 2017 at 9:08 AM, Tim Rühsen <tim.rueh...@gmx.de> wrote:
> > >> Hi Chris,
> > >>
> > >> On 06/11/2017 05:24 PM, chris wrote:
> > >>> Hi,
> > >>>
> > >>> I'm just wondering if I've possibly found a bug, unless I'm just doing
> > >>> something incorrectly (which I assume is more likely).
> > >>>
> > >>> I grab my webpage using 'wget -T1 -t1 -E -k -H -nd -N -p -P
> > >>> site_output
> > >>> https://www.anfractuosity.com/projects/ultrasound-networking/ > note1 2> note2'
> > >>>
> > >>> But I notice the srcset tags in the resulting downloaded files produce
> > >>> 'srcset="fsk.png.html 533w, fsk-266x300.png 266w" sizes="(max-width: 533px)
> > >>> 100vw, 533px" /></a></p>' in the output index.html.
> > >>>
> > >>> On the actual webpage it looks like "srcset="
> > >>> https://www.anfractuosity.com/wp-content/uploads/2014/02/fft.png 762w,...."
> > >>> no .html extension on the .png.
> > >>
> > >> You requested -E (--adjust-extension) and -k (--convert-links).
> > >> That would change the file name when the server tags the file as
> > >> content-type 'text/html'. You could see that in the debug output
> > >> (options -d or --debug).
> > >>
> > >>> Cheers
> > >>> Chris
> > >>
> > >> With Best Regards, Tim
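
For anyone following the thread, here is a rough sketch of how the --no-if-modified-since workaround from the reply might be tried. It reuses the options and URL from the original report (without the note1/note2 redirections) and only appends the extra flag. The WGET_CMD variable and the has_bad_srcset helper are made up for illustration and are not part of wget. The command is printed rather than run, so nothing is downloaded by accident.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): succeeds if the given file has a
# srcset attribute that references a name ending in ".png.html".
has_bad_srcset() {
    grep -Eq 'srcset="[^"]*\.png\.html' "$1"
}

# Options and URL copied from the original report, plus the workaround flag.
WGET_CMD='wget -T1 -t1 -E -k -H -nd -N -p -P site_output --no-if-modified-since https://www.anfractuosity.com/projects/ultrasound-networking/'

# Print the command instead of running it; paste it into a shell to try it.
echo "$WGET_CMD"
```

After running the printed command, something like `has_bad_srcset site_output/index.html && echo "srcset still broken"` could serve as a quick check, assuming the page is saved as index.html as in the report.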