I once wrote a spider program that ran into the same problem. The way I
fixed it there was to add an option for the maximum URL length. This should
prevent such a loop. The default could be infinite, or just a really huge
number.
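A minimal sketch of what I mean, in Python rather than htdig's own code; the crawler, link extractor, and the 1024-character default are all hypothetical, just to show how a length cap breaks the loop:

```python
# Hypothetical sketch: capping URL length so a symlink loop cannot
# grow the queue forever. Not htdig code; names are made up.
from collections import deque

MAX_URL_LENGTH = 1024  # configurable; "infinite" would mean skipping the check


def crawl(start_url, get_links, max_url_length=MAX_URL_LENGTH):
    """Breadth-first crawl that drops any URL longer than max_url_length."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            if len(link) > max_url_length:
                continue  # a symlink loop keeps lengthening the path; drop it here
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Simulate a directory whose "./" entry links back to itself, so every
# visit yields a strictly longer self-referencing URL.
def fake_links(url):
    return [url + "/./"]


# Without the cap this loop never terminates; with it, the crawl stops.
print(crawl("http://example.com/dir", fake_links, max_url_length=40))
```

The cap is a blunt instrument, but it guarantees termination even when duplicate detection fails, because each trip around the loop makes the URL longer until it crosses the limit.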
Jon
On Fri, 28 Jul 2000, Geoff Hutchison wrote:
> At 9:45 PM -0400 7/27/00, [EMAIL PROTECTED] wrote:
> >I think I found a bug: if a file is linked to ./, htdig will
> >continue forever.
> >I had told it to index a backup CD of my old Linux filesystem. When I
> >checked in on it, it was indexing something like:
>
> I don't know that I can really consider this a bug. It's very easy to
> make infinite loops with symlinks. You can trap all sorts of
> command-line scripts and so on in them.
>
> >Also, I am still very confused about what causes htdig to revert back to
> >the web server from a local URL.
> >I understand it should use it for files like .cgi and .shtml, but going
> >back for every file that's not .html doesn't make sense. Is there a way to
> >make it _only_ use the web server for .cgi and .shtml?
>
> I don't understand what you mean by "every file that's not .html."
> The 3.1.5 version indexes .txt and .pdf files. Beyond that, it would
> have a difficult time working out the MIME-type of the file without
> either a mime.types file or something like mod_mime_magic in the
> Apache server. If it doesn't know what type of file it is, it *has*
> to go to the server!
>
> -Geoff
>
>
> ------------------------------------
> To unsubscribe from the htdig3-dev mailing list, send a message to
> [EMAIL PROTECTED]
> You will receive a message to confirm this.