According to Joe R. Jah:
> Retriever.cc-h.diff is the culprit; I backed it off, and everything worked
> without a problem.
Aw, nuts! I wanted that patch. Maybe we should just try for a simpler
tactic. Rather than blocking all pushes to a down server, push all URLs
regardless, but when you pop them, check if the server is OK, and if not,
don't call RetrieveHTTP(). I'd just have to take the check for _bad_server
out of push(), and add a method to allow Retriever to query _bad_server.
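
In miniature, the idea looks something like this (a toy sketch with stand-in containers, not the actual Retriever code; the real queue holds URLRefs and the check would go against _bad_server):

    // Toy model of the pop-time check: push every URL, and only when a
    // URL is popped decide whether its server is usable.
    #include <queue>
    #include <set>
    #include <string>
    #include <iostream>

    int main()
    {
        std::queue<std::string> urls;        // stands in for the retriever queue
        std::set<std::string>   bad_server;  // stands in for _bad_server

        bad_server.insert("down.example.com");
        urls.push("http://ok.example.com/index.html");
        urls.push("http://down.example.com/index.html");

        while (!urls.empty())
        {
            std::string url = urls.front();
            urls.pop();
            // crude host extraction, just enough for the sketch
            std::string host = url.substr(7, url.find('/', 7) - 7);

            if (bad_server.count(host))      // server marked bad: skip it
            {
                std::cout << "skipping (server down): " << url << "\n";
                continue;                    // i.e. don't call RetrieveHTTP()
            }
            std::cout << "would RetrieveHTTP(): " << url << "\n";
        }
        return 0;
    }
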
Can anyone see a problem with this approach? It seems more efficient than
checking twice to see if the file is local, but it will chew up more queue
entries.
I also want to change RetrieveLocal() to allow local retrieval of .txt and
.pdf files, at the very least, and use it to check for robots.txt - that
would allow completely HTTP-free local indexing for most users.
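
Something along these lines is what I have in mind for the extension check (the helper names are made up for illustration; this isn't the current RetrieveLocal() code):

    #include <string.h>
    #include <stdio.h>

    /* Sketch only: the helper names and the way they would hook into
     * RetrieveLocal() are guesses, not the existing ht://Dig code. */
    static int ends_with(const char *path, const char *suffix)
    {
        size_t plen = strlen(path), slen = strlen(suffix);
        return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
    }

    /* Should this path be eligible for local (non-HTTP) retrieval? */
    static int locally_retrievable(const char *path)
    {
        return ends_with(path, ".html") || ends_with(path, ".htm")
            || ends_with(path, ".txt")  || ends_with(path, ".pdf")
            || ends_with(path, "/robots.txt");
    }

    int main(void)
    {
        const char *tests[] = { "/www/docs/manual.pdf",
                                "/www/docs/notes.txt",
                                "/www/robots.txt",
                                "/www/cgi-bin/search" };
        for (size_t i = 0; i < sizeof(tests) / sizeof(tests[0]); i++)
            printf("%s -> %s\n", tests[i],
                   locally_retrievable(tests[i]) ? "local" : "HTTP");
        return 0;
    }
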
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930