Begin forwarded message:
From: "David Adams" <[EMAIL PROTECTED]>
Date: 24 March 2004 9:18:17 PM
To: <[EMAIL PROTECTED]>, "Toby Thain" <[EMAIL PROTECTED]>
Subject: Re: [htdig] query parameters should be ignored by extension filter?

I am also using ht://Dig version 3.1.6 and for me it IS indexing URLs like
http://www.soton.ac.uk/~lopsoc/gallery.php?gallery=sorcerer1&photo=CNV00023.jpg
even though I have .jpg in my bad_extensions: list.
I suggest that you take a hard look at your configuration file and check that one of:
exclude_urls: limit_urls_to: bad_querystr: url_rewrite_rules:
isn't excluding them.
David,
Thanks for your suggestions.
I am not using any of those directives; the .conf is vanilla except for customising the search results wrapper.
I did need to add .swf and .ico to the bad_extensions list. IMHO these should really be in there by default (maybe this is fixed in a later version?)
Adding "valid_extensions: .php3 .html" did not help either; the URLs are still not being indexed. Even adding a fake "&q" to the end of the URL doesn't stop htdig rejecting it - a sample rejection from rundig -vvv:
-----
href: http://stegbar.intranet/php/photo.php3?f=s_rc_pl_wd_t_aw_1.jpg&q (Thumbnail: windows_and_doors
Enlarge)
Rejected: Extension is not valid!
-----
Toby
Personally, I don't need those ~lopsoc/...jpg files and will be adding them to exclude_urls: if they publish many more of them!
David Adams
Corporate Information Services
Information Systems Services
University of Southampton
----- Original Message -----
From: "Toby Thain" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, March 24, 2004 9:58 AM
Subject: [htdig] query parameters should be ignored by extension filter?
List,
I noticed today that htdig is not indexing URLs like:
/foo/page.php3?f=bar.jpg
because it notices the URL ends with ".jpg". I am surprised that it's not smart enough to realise that the fetched object is actually a ".php3", and I definitely want that URL followed.
Is this fixed in a recent version (I am using ht://Dig 3.1.6)? Or is there a simple configuration fix?
Toby
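
To illustrate the difference being asked about, here is a minimal Python sketch. It is not taken from the ht://Dig source; the function names and the BAD_EXTENSIONS set are hypothetical, chosen only to contrast an extension check that looks at the whole URL string with one that looks at the path alone.

```python
from urllib.parse import urlsplit
import posixpath

# Hypothetical stand-in for a bad_extensions list (illustrative values only).
BAD_EXTENSIONS = {".jpg", ".swf", ".ico"}


def naive_extension(url):
    """Everything from the last '.' in the whole URL, query string included."""
    dot = url.rfind(".")
    return url[dot:].lower() if dot != -1 else ""


def path_extension(url):
    """Extension of the path component only; the query string is ignored."""
    path = urlsplit(url).path
    return posixpath.splitext(path)[1].lower()


def is_rejected(url, extension_of):
    """Reject the URL if the extracted extension is in the bad list."""
    return extension_of(url) in BAD_EXTENSIONS


url = "/foo/page.php3?f=bar.jpg"
print(naive_extension(url), is_rejected(url, naive_extension))
print(path_extension(url), is_rejected(url, path_extension))
```

Under these assumptions the whole-URL check sees ".jpg" and rejects the link, while the path-only check sees ".php3" and lets it through. Whether ht://Dig 3.1.6 actually behaves like the first function is exactly the open question in this thread.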
-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of GenToo technologies. Learn everything from fundamentals to system administration.
http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
ht://Dig general mailing list: <[EMAIL PROTECTED]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general

