According to fx:
> it s very strange ...
> I show you my conf
> -------------------------------------------------------------
> database_dir: /home/web/inerd/htdig/db
> database_base: ${database_dir}/inerd
> #allow_virtual_hosts: true
> valid_extensions: .html .htm .shtml .php .php3 .asp .php
> start_url: http://192.168.0.2
> limit_urls_to: http://192.168.0.2
> exclude_urls: /cgi-bin/ .cgi
> bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif\
> .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
> maintainer: inerd
> max_head_length: 10000
> max_doc_size: 200000
> no_excerpt_show_top: false
> search_algorithm: exact:1 synonyms:0.5 endings:0.1
> search_results_wrapper: /home/web/inerd/www/htdig/wrapper_inerd.html
> nothing_found_file: /home/web/inerd/www/htdig/nomatch_inerd.html
> ----------------------------------------------------
> the result of the htdig -i -vvv
>
> ...
> pushing http://192.168.0.2/index.php3
> +A tag: pos = 2, position = =/news/index.php3?idnews=3 class=news>
> href: http://192.168.0.2/news/index.php3?idnews=3 (La troisième)
>
> Rejected: Extension is not valid!
This error, like the one below, means the URL was rejected because it
doesn't match any of the extensions listed in valid_extensions. The
check takes everything after the last dot in the URL as the extension,
CGI parameters included, so it compares ".php3?idnews=3" rather than
".php3" and the match fails. I think this is a bug, which the patch
below should fix.
> ...
>
> ...
> *A tag: pos = 2, position = ="/services" class="navig1">
> href: http://192.168.0.2/services (services)
>
> Rejected: Extension is not valid!
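In this case, the URL is rejected because of a bug in the new
valid_extensions attribute handling, as was pointed out by Warren
Jones about a month ago. The last dot in this URL is in the host
address, not in the final component of the path, so ".2/services"
gets treated as the extension.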
> ...
>
> do you have any suggestion ?
> (I ve really tried a lot of things ... a real mystery)
>
> thanx
>
> ps : I use 3.1.4
> and my directory index is good :
> DirectoryIndex index.html index.htm index.shtml index.cgi index.php3
Here is a patch which I hope will fix both problems. Please let me know
if it works.
--- htdig/Retriever.cc.valextbug Thu Dec 9 18:28:44 1999
+++ htdig/Retriever.cc Tue Feb 1 09:16:04 2000
@@ -702,9 +702,14 @@ Retriever::IsValidURL(char *u)
//
char *ext = strrchr(url, '.');
String lowerext;
+ if (ext && strchr(ext, '/')) // Ignore a dot if it's not in the
+ ext = NULL; // final component of the path.
if (ext)
{
lowerext = ext;
+ int parm = lowerext.indexOf('?'); // chop off URL parameter
+ if (parm >= 0)
+ lowerext.chop(lowerext.length() - parm);
lowerext.lowercase();
if (invalids->Exists(lowerext))
{
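
For anyone who wants to see the effect of the patch without rebuilding
htdig, here is a minimal standalone sketch of the extension extraction
before and after the change. It uses std::string instead of htdig's own
String class, and the function names (extension_before_patch,
extension_after_patch, check_examples) are made up for illustration;
they are not part of htdig. An empty result stands for "no extension".

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Lowercase a string in place (stand-in for String::lowercase()).
static void lowercase(std::string &s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
}

// Hypothetical helper: what the unpatched code treats as the extension,
// i.e. everything from the last dot in the URL to the end.
std::string extension_before_patch(const std::string &url)
{
    std::string::size_type dot = url.rfind('.');
    if (dot == std::string::npos)
        return "";
    std::string ext = url.substr(dot);
    lowercase(ext);
    return ext;
}

// Hypothetical helper mirroring the two changes in the patch.
std::string extension_after_patch(const std::string &url)
{
    std::string::size_type dot = url.rfind('.');
    if (dot == std::string::npos)
        return "";
    std::string ext = url.substr(dot);

    // Ignore a dot if it's not in the final component of the path.
    if (ext.find('/') != std::string::npos)
        return "";

    // Chop off the URL parameter, if any.
    std::string::size_type parm = ext.find('?');
    if (parm != std::string::npos)
        ext.erase(parm);

    lowercase(ext);
    return ext;
}

// The two URLs from the htdig -vvv output quoted above.
void check_examples()
{
    const std::string news = "http://192.168.0.2/news/index.php3?idnews=3";
    const std::string services = "http://192.168.0.2/services";

    assert(extension_before_patch(news) == ".php3?idnews=3");
    assert(extension_before_patch(services) == ".2/services");

    assert(extension_after_patch(news) == ".php3");
    assert(extension_after_patch(services).empty());
}
```

With the patched logic the first URL yields ".php3", which is in the
valid_extensions list from the quoted configuration, and the second
yields no extension at all. This sketch only computes the extension
string; it does not model the rest of the checks in IsValidURL().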
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930