Re: robots.txt

2002-06-08 Thread Alan E

On Sat, Jun 08, 2002 at 10:30:02PM -0500, Amy Rupp wrote:
 On some sites I cannot download files.
 On one such site I found this file robots.txt.
 Is this file the cause for wget not downloading the files?

Why not just put robots=off in your .wgetrc?

Read the docs, people! That's why people spent many hours preparing,
correcting, and keeping them up to date as best they can.

-- 
AlanE



Re: robots.txt

2002-06-08 Thread Alan E

On Sat, Jun 08, 2002 at 10:58:16PM -0500, Amy Rupp wrote:
 
 Why not just put robots=off in your .wgetrc?

With all due respect, I *DID* read the documentation, which I quoted,
and I attempted to find the latest version available.  I didn't post
the original question, and *MY* question still stands:  if you

robots=off does just that: wget then neither reads nor honors robots.txt.

As for User Agent, most sites like to see a string with WinXX or IE or
Explorer in it.

If you are using Windows, go to a friend's web site and then ask them
to mail you the user agent string your Explorer sent and use that.

If you are using Unix and run Apache, look through your own logs to see
what strings people send you.

-- 
AlanE



Re: Extend timeout to DNS lookups

2002-04-14 Thread Alan E

On Mon, Apr 15, 2002 at 04:20:58AM +0200, Hrvoje Niksic wrote:
As suggested by Alan E, this patch extends the meaning of timeout to
include DNS lookups.  After this patch, I can't think of any network
operation still allowed to take more than the specified timeout
period.
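
One common way to bound a blocking call such as a resolver lookup is to
arm SIGALRM and jump out of the call when it fires. The sketch below
illustrates that idea only; it is not taken from the patch, and the
names run_with_timeout, do_lookup and struct lookup are hypothetical.
Jumping out of getaddrinfo() from a signal handler can leave resolver
state or memory behind, so treat this as a rough outline.

```c
#include <netdb.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static sigjmp_buf timeout_env;

/* SIGALRM handler: abandon whatever was running and return to sigsetjmp(). */
static void abort_on_alarm(int signo)
{
    (void)signo;
    siglongjmp(timeout_env, 1);
}

/* Hypothetical helper: run fn(arg) for at most `seconds` seconds.
   Returns 0 if fn finished, 1 if it was abandoned, -1 on setup error.
   A value of 0 for `seconds` means "no timeout". */
int run_with_timeout(unsigned seconds, void (*fn)(void *), void *arg)
{
    struct sigaction sa, old_sa;
    int timed_out;

    if (seconds == 0) {
        fn(arg);
        return 0;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = abort_on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(timeout_env, 1) == 0) {
        alarm(seconds);
        fn(arg);
        alarm(0);              /* finished in time: cancel the alarm */
        timed_out = 0;
    } else {
        timed_out = 1;         /* arrived here via siglongjmp() */
    }

    sigaction(SIGALRM, &old_sa, NULL);
    return timed_out;
}

/* Hypothetical request record for one lookup. */
struct lookup {
    const char *host;
    struct addrinfo *result;
    int status;                /* getaddrinfo() return value */
};

/* Callback that performs the (possibly slow) name lookup. */
void do_lookup(void *p)
{
    struct lookup *l = p;
    struct addrinfo hints;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    l->status = getaddrinfo(l->host, NULL, &hints, &l->result);
}
```

A caller would fill in a struct lookup, call
run_with_timeout(timeout, do_lookup, &request), and treat a return
value of 1 as a lookup failure.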

Cool. Thanks. That's gonna save a lot of complaints and bug reports, and
make for more satisfied customers. Is the automatic subscription payment
code about ready? Hehehe.

-- 
AlanE



How about leaving the headers from the spam messages?

2002-04-13 Thread Alan E

It would be a lot easier to report this shit to SpamCop if the mailing
list software didn't strip the incoming headers. 
-- 
AlanE



Re: Proposal for despamming the list

2002-04-13 Thread Alan E

On Sunday 14 April 2002 01:23, you wrote:
 Alan E [EMAIL PROTECTED] writes:

 Does it allow custom rules, such as bonus points for mails that
 mention wget or debug log in the body?

AFAIK, yes. You can put your rules in the local configuration file.

-- 
AlanE




Re: Timeout Bug

2002-02-21 Thread Alan E

On Thursday 21 February 2002 05:44, Ian Abbott wrote:
 On 21 Feb 2002 at 1:31, Alan Eldridge wrote:
  You can't get it to work for timing out a socket connection, because
  that is a bit of code that hasn't been implemented yet.
 
  If no one else wants to, I can work up a patch for this next week.
  It's pretty standard coding, right out of Stevens. ;)

 Before you do that, have a look at this thread from December:

 Subject: Patch for wget hanging on connect() call
 http://www.mail-archive.com/wget@sunsite.dk/msg02342.html
 http://www.mail-archive.com/wget@sunsite.dk/msg02345.html
 http://www.mail-archive.com/wget@sunsite.dk/msg02353.html
 http://www.mail-archive.com/wget@sunsite.dk/msg02354.html
 http://www.mail-archive.com/wget@sunsite.dk/msg02355.html
 http://www.mail-archive.com/wget@sunsite.dk/msg02358.html
 http://www.mail-archive.com/wget@sunsite.dk/msg02356.html

 The main problem seems to be the (non-)portability of FIONBIO on
 older systems that Wget is supposed to work on, but the above
 messages go into it (and possible work-arounds) in more detail. It
 might be better to use alarm() and a signal handler as Daniel
 Stenberg suggested in one of the above messages.

Thanks, Ian. Will do so. I won't get to this until next week, so if anybody
else out there has intentions to do it sooner, please post so we don't 
duplicate effort.

-- 
Alan Eldridge
Dave's not here, man.