I've been battling with the case_sensitive issue for a while now. It seems that declaring "case_sensitive: false" automatically lowercases the URLs (this is done in ../htlib/URL.cc). That sounds like a great idea, but I think a more logical procedure would be to leave the URL's case alone from the get-go and only lowercase it temporarily when comparing it to previously crawled/queued URLs.

Basically, what is happening is that the university's web server uses Apache's mod_speling. Upon a URL case mismatch (ex: http://www.foo.com/DOCUMENT is the request, but http://www.foo.com/document is the true document name), the module automatically sends a 301 Moved Permanently response, which htdig does NOT follow, regardless of the case_sensitive setting.

Long story short: where/how can the code be modified so that the actual URL is NOT lowercased automatically, but is only lowercased temporarily when doing a comparison to other queued/crawled URLs (which would also be temporarily lowercased during the comparison)?

Thanks,
Patrick
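
P.S. To make the idea concrete, here is a rough sketch of the behaviour I have in mind. It is NOT htdig code: lowercase_copy and SeenUrls are hypothetical names of my own, and a standard map stands in for whatever structure htdig really uses to track crawled/queued URLs. The point is only that the lowercased form is used as the comparison key while the URL itself is stored and requested in its original case.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical helper: returns a lowercased copy and leaves the input untouched.
std::string lowercase_copy(const std::string &url) {
    std::string key(url);
    std::transform(key.begin(), key.end(), key.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return key;
}

// Hypothetical stand-in for the list of crawled/queued URLs.
// Keys are lowercased copies; values keep the URL exactly as first seen.
class SeenUrls {
public:
    // Returns true if the URL was new and has been recorded,
    // false if a case-insensitive match was already present.
    bool add(const std::string &url) {
        return seen_.emplace(lowercase_copy(url), url).second;
    }

    // Case-insensitive membership test.
    bool contains(const std::string &url) const {
        return seen_.count(lowercase_copy(url)) != 0;
    }

    // Returns the first-seen URL in its original case, or an empty string.
    std::string original(const std::string &url) const {
        auto it = seen_.find(lowercase_copy(url));
        return it == seen_.end() ? std::string() : it->second;
    }

private:
    std::unordered_map<std::string, std::string> seen_;
};
```

With something like this, adding http://www.foo.com/DOCUMENT and then http://www.foo.com/document would queue only the first one, and the request sent to the server would still use the original case.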
