On Wed, 16 Jun 1999, Neil Mansilla wrote:

> Can someone help me identify the file and subroutine that
> is the FIRST to see and strip out the HREFs?  I think that
> this is the best place to lowercase a URL, before it gets
> any further in the spidering process.

I would assume that Retriever::got_href is the first to see HREFs.
Strictly speaking the HTML parser sees them first, since it is what
extracts them, but the Retriever class is better suited for
lowercasing URLs.

> At that location, we'll check the conf["case_sensitive"]
> (or pass that value to that subroutine) and IMMEDIATELY
> lowercase the URL(s).
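
Roughly, the idea could be sketched like this. It is only a standalone
illustration, not ht://Dig code: normalize_href is a hypothetical helper,
and the case_sensitive flag is assumed to have been read from the
configuration by the caller. How it would hook into Retriever::got_href
is left open.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of ht://Dig. Returns the URL unchanged
// when case_sensitive is true, otherwise an all-lowercase copy.
std::string normalize_href(const std::string &url, bool case_sensitive)
{
    if (case_sensitive)
        return url;

    std::string lowered(url);
    // Go through unsigned char so std::tolower never sees a negative value.
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return lowered;
}
```

Calling this as early as possible means every later stage only ever
sees one spelling of each URL.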

Do you want to submit a patch?

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
