Re: Download all the necessary files and linked images

2006-05-17 Thread Jean-Marc Molina
Hrvoje Niksic wrote: > I think you have a point there -- -A shouldn't so blatantly invalidate > -p. That would be IMHO the best fix to the problem you're > encountering. Frank mentioned that limitation in his first reply.
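
For illustration, the combination under discussion looks roughly like this (the URL is a placeholder; in the wget versions of this era the -A filter could discard page requisites that don't match it, which is exactly the behavior being called a bug):

    wget -p -A jpg,png http://example.com/page.html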

--page-requisites & host spanning

2006-05-17 Thread Jean-Marc Molina
Hello, I would like to talk about the --page-requisites (-p) and -H (host spanning) options. The manual states that -p « causes Wget to download all the files that are necessary to properly display a given html page. This includes such things as inlined images, sounds, and referenced sty
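
A sketch of the option pair being discussed, with a placeholder URL; adding -H allows -p to fetch requisites that live on other hosts:

    wget -p -H http://example.com/page.html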

Re: Download all the necessary files and linked images

2006-03-11 Thread Jean-Marc MOLINA
Mauro Tortonesi wrote: > although i really dislike the name "--no-follow-excluded-html", i > certainly agree on the necessity to introduce such a feature into > wget. > > can we come up with a better name (and reach consensus on that) > before i include this feature in wget 1.11? I agree "no" shou

Re: Download all the necessary files and linked images

2006-03-11 Thread Jean-Marc MOLINA
Tobias Tiederle wrote: >> I just set up my compile environment for WGet again. >> When I did regex support, I had the same problem with exclusion, so I >> introduced a new parameter "--follow-excluded-html". >> (Which is of course the default) but you can turn it off with >> --no-follow-excluded-ht
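
A sketch assuming the patched build described above; --no-follow-excluded-html comes from that patch and is not a stock wget flag, and the URL and excluded directory are placeholders:

    wget -r -X /private --no-follow-excluded-html http://example.com/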

Re: Download all the necessary files and linked images

2006-03-09 Thread Jean-Marc MOLINA
Frank McCown wrote: > I'm afraid wget won't do exactly what you want it to do. Future > versions of wget may enable you to specify a wildcard to select which > files you'd like to download, but I don't know when you can expect > that behavior. I have another opinion about that limitation. Could
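
For what it's worth, stock wget already offers a limited form of wildcard selection through -A, which takes comma-separated suffixes or shell-style patterns (placeholder URL):

    wget -r -np -A '*.jpg,*.png' http://example.com/gallery/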

Re: Download all the necessary files and linked images

2006-03-09 Thread Jean-Marc MOLINA
Frank McCown wrote: > I'm afraid wget won't do exactly what you want it to do. Future > versions of wget may enable you to specify a wildcard to select which > files you'd like to download, but I don't know when you can expect > that behavior. The more I use wget, the more I like it, even if I us

Download all the necessary files and linked images

2006-03-09 Thread Jean-Marc MOLINA
Hello, I want to archive an HTML page and « all the files that are necessary to properly display » it (Wget manual), plus all the linked images. I tried most options and features: recursive archiving, including and excluding directories and file types... But I can't work out the right options to
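
One commonly suggested recipe for this, with a placeholder URL; note that because of the -A/-p interaction raised later in this thread, some versions may drop requisites that don't match the accept list:

    wget -p -k -H -r -l 1 -A jpg,jpeg,png,gif http://example.com/page.html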

How come I get spammed by wget@sunsite.dk?

2005-11-09 Thread Jean-Marc MOLINA
Hello, Since I began to post here I got some spam from [EMAIL PROTECTED] Mostly it sends replies to my posts, and I never subscribed to any "mailing list". Will unsubscribing from the list stop it? Thanks, and sorry, I'm not accustomed to mailing lists but don't understand how I got subscrib

Re: bug retrieving embedded images with --page-requisites

2005-11-09 Thread Jean-Marc MOLINA
Tony Lewis wrote: > The --convert-links option changes the website path to a local file > system path. That is, it changes the directory, not the file name. Thanks, I didn't understand it that way. > IMO, your suggestion has merit, but it would require wget to maintain > a list of MIME types and c
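
wget does already perform one content-type-driven rename: -E (--html-extension) appends .html to documents served as text/html, which is a narrow version of the MIME-type mapping suggested here (placeholder URL):

    wget -p -k -E http://example.com/page.php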

Re: bug retrieving embedded images with --page-requisites

2005-11-09 Thread Jean-Marc MOLINA
Hrvoje Niksic wrote: > More precisely, it doesn't use the file name advertised by the > Content-Disposition header. That is because Wget decides on the file > name it will use based on the URL used, *before* the headers are > downloaded. This unfortunate design decision is the cause of all > thes
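
Later wget releases added an experimental flag that opts into the header-advertised name by deferring the file-name decision until the headers arrive; it did not exist at the time of this thread, and the URL below is a placeholder:

    wget --content-disposition 'http://example.com/download.php?id=42'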

Re: bug retrieving embedded images with --page-requisites

2005-11-09 Thread Jean-Marc MOLINA
Gavin Sherlock wrote: > i.e. the image is generated on the fly from a script, which then > essentially prints the image back to the browser with the correct > mime type. While this is a non-standard way to include an image on a > page, the --page-requisites are not fulfilled when retrieving this >
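
The scenario being described is roughly a page whose markup references a script, e.g. <img src="/image.php?id=3">, served with an image/* Content-Type; with a plain requisites fetch (placeholder URL) the result is saved under the script's name rather than as a normal image file:

    wget -p http://example.com/page.html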

Re: Getting list of files

2005-11-08 Thread Jean-Marc MOLINA
Jonathan wrote: > I think you should be using a tool like linklint (www.linklint.org) > not wget. Thanks, I didn't know that tool. However as I'm not really into Perl scripting I wonder if you know any PHP equivalent. And if I understand correctly how it works, it seems link checkers like Linklint are j
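
wget's closest built-in equivalent is spider mode; whether --spider actually recurses depends on the version, so this is only a sketch (placeholder URL, and the log wording the grep looks for varies by version and locale):

    wget --spider -r -np -o spider.log http://example.com/
    grep -i '404' spider.log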

logfile and log messages parser

2005-11-08 Thread Jean-Marc MOLINA
Hello, I was looking for an alternative to HTTrack to archive single pages and found wget. It works just like a charm thanks to the "--page-requisites" option. However I would like to post-process the archived files. I thought of using the logs but it seems they are... just a bunch of messages. I
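
One way to get something more parseable is to send the log to a file with -o and pull out the 'saved' lines; the exact wording and quoting of those lines varies with wget version and locale, so the pattern is only illustrative (placeholder URL):

    wget -p -o wget.log http://example.com/page.html
    grep 'saved' wget.log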

Re: Getting list of files

2005-11-02 Thread Jean-Marc MOLINA
Shahram Bakhtiari wrote: > I would like to share my experience of my failed attempt at using > wget to get the list of files. > I used the following command to get a list of all existing mp3 files, > without really downloading them: > > wget --http-user=peshvar2000 --http-passwd=peshvar2000 -r -np
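
Two hedged variants of the same idea, with USER, PASS and the URL as placeholders: --spider avoids saving anything but only recurses in some versions, while --delete-after fetches and then removes the files, leaving their names in the log:

    wget -r -np -A mp3 --spider --http-user=USER --http-passwd=PASS http://example.com/music/
    wget -r -np -A mp3 --delete-after -o list.log --http-user=USER --http-passwd=PASS http://example.com/music/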