Re: robots.txt
On Sat, Jun 08, 2002 at 10:30:02PM -0500, Amy Rupp wrote:
> On some sites I cannot download files. On one such site I found this file robots.txt. Is this file the cause for wget not downloading the files.

Why not just put robots=off in your .wgetrc?

Read the docs, people! That's why many hours were spent by persons preparing, correcting, and keeping them up to date as best they can.

-- AlanE
Re: robots.txt
On Sat, Jun 08, 2002 at 10:58:16PM -0500, Amy Rupp wrote:
> > Why not just put robots=off in your .wgetrc?
>
> With all due respect, I *DID* read the documentation, which I quoted, and I attempted to find the latest version available. I didn't post the original question, and *MY* question still stands: if you

robots=off does just that; it neither reads nor honors robots.txt.

As for User Agent, most sites like to see a string with WinXX or IE or Explorer in it. If you are using Windows, go to a friend's web site, then ask them to mail you the user agent string your Explorer sent, and use that. If you are using Unix and run Apache, look through your own logs to see what strings people send you.

-- AlanE
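For reference, the two settings discussed in this message could look roughly like this in a .wgetrc. This is only a sketch: the user-agent value is a made-up placeholder (an assumption, not a string given in the thread), to be replaced with whatever your own browser actually sends.

```
# ~/.wgetrc
# Do not fetch or honor robots.txt.
robots = off
# Placeholder value (assumption); substitute the string from your own logs.
user_agent = Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
```

The same thing for a single run, without touching .wgetrc, would be `wget -e robots=off -U "your agent string" URL`.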
Re: Extend timeout to DNS lookups
On Mon, Apr 15, 2002 at 04:20:58AM +0200, Hrvoje Niksic wrote:
> As suggested by Alan E, this patch extends the meaning of timeout to include DNS lookups. After this patch, I can't think of any network operation still allowed to take more than the specified timeout period.

Cool. Thanks. That's gonna save a lot of complaints and bug reports, and make for more satisfied customers.

Is the automatic subscription payment code about ready? Hehehe.

-- AlanE
How about leaving the headers from the spam messages?
It would be a lot easier to report this shit to SpamCop if the mailing list software didn't strip the incoming headers. -- AlanE
Re: Proposal for despamming the list
On Sunday 14 April 2002 01:23, you wrote:
> Alan E [EMAIL PROTECTED] writes:
> > Does it allow custom rules, such as bonus points for mails that mention wget or debug log in the body?
>
> AFAIK, yes. You can put your rules in the local configuration file.

-- AlanE
Re: Timeout Bug
On Thursday 21 February 2002 05:44, Ian Abbott wrote:
> On 21 Feb 2002 at 1:31, Alan Eldridge wrote:
> > You can't get it to work for timing out a socket connection, because that is a bit of code that hasn't been implemented yet. If no one else wants to, I can work up a patch for this next week. It's pretty standard coding, right out of Stevens. ;)
>
> Before you do that, have a look at this thread from December:
>
> Subject: Patch for wget hanging on connect() call
> http://www.mail-archive.com/wget@sunsite.dk/msg02342.html
> http://www.mail-archive.com/wget@sunsite.dk/msg02345.html
> http://www.mail-archive.com/wget@sunsite.dk/msg02353.html
> http://www.mail-archive.com/wget@sunsite.dk/msg02354.html
> http://www.mail-archive.com/wget@sunsite.dk/msg02355.html
> http://www.mail-archive.com/wget@sunsite.dk/msg02358.html
> http://www.mail-archive.com/wget@sunsite.dk/msg02356.html
>
> The main problem seems to be the (non-)portability of FIONBIO on older systems that Wget is supposed to work on, but the above messages go into it (and possible work-arounds) in more detail. It might be better to use alarm() and a signal handler as Daniel Stenberg suggested in one of the above messages.

Thanks, Ian. Will do so. I won't get to this until next week, so if anybody else out there has intentions to do it sooner, please post so we don't duplicate effort.

-- Alan Eldridge
Dave's not here, man.