Hi,

My website uses URLs like

     http://website/dir/index.jsp?content=foo/bar.html

The result of a request like the one above is text/html containing
relative URLs like

     <a href="foo/foobar.html">foo</a>

When udmsearch tries to index foo/foobar.html, it constructs the
absolute URL of the document incorrectly, resulting in a request to

     http://website/dir/index.jsp?content=foo/foo/foobar.html

This happens because a '/' is encountered in the query part of
the original URL, which should *not* be considered part of the
URL's path (cf. src/parseurl.c). The appended patch gets rid of
that problem by temporarily cutting the query string off the
file part while the path and file parts are parsed.
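
To make the idea easier to follow outside of parseurl.c, here is a
minimal standalone sketch of the same technique. The function name
split_path and its parameters are hypothetical and not part of
udmsearch; the sketch only assumes that the path buffer may be
modified in place.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not udmsearch API): splits `path` in place.
 * The directory part stays in `path`, the file part (including any
 * query string) is copied to `filename`. */
static void split_path(char *path, char *filename, size_t filename_size)
{
    char *query, *file;
    int has_file;

    filename[0] = '\0';

    /* Hide the query string so strrchr() only sees the real path. */
    query = strchr(path, '?');
    if (query)
        *query = '\0';

    /* Last '/' of the path; ignore it if nothing follows it. */
    file = strrchr(path, '/');
    has_file = (file != NULL && file[1] != '\0');

    /* Restore the '?' so the file part keeps its query string. */
    if (query)
        *query = '?';

    if (has_file) {
        snprintf(filename, filename_size, "%s", file + 1);
        file[1] = '\0';   /* path now ends at the last real '/' */
    }
}
```

Called with a buffer holding /dir/index.jsp?content=foo/bar.html,
this leaves /dir/ as the path and index.jsp?content=foo/bar.html as
the file part, so the '/' inside the query no longer matters.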

The patched version constructs the correct document URL:

     http://website/dir/foo/foobar.html
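
For illustration, the expected resolution can also be sketched end to
end without modifying the base URL. The function resolve_simple below
is a hypothetical helper, not udmsearch code. It assumes a base URL
with a path component and a plain relative href (no leading '/', no
"../", no scheme).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration only. Resolves a simple
 * relative href against an absolute base URL.
 * Returns 0 on success, -1 if the result does not fit into `out`. */
static int resolve_simple(const char *base, const char *href,
                          char *out, size_t out_size)
{
    /* The path ends where the query ('?') or fragment ('#') begins. */
    size_t path_end = strcspn(base, "?#");
    size_t dir_len = path_end;
    int n;

    /* Walk back to the last '/' that belongs to the path. */
    while (dir_len > 0 && base[dir_len - 1] != '/')
        dir_len--;

    /* Directory part of the base URL followed by the relative href. */
    n = snprintf(out, out_size, "%.*s%s", (int)dir_len, base, href);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With the base URL and the href from the example above, this yields
the document URL shown, because the search for the last '/' stops
before the query string.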

Any comments are appreciated.

Jörg Zanger

------
SAP Basis Administration, Abt. IT
Schöck Bauteile GmbH, D-76534 Baden-Baden
Tel.: +49 7223 967-355, Fax: +49 7223 967-352

____________________________________________

--- parseurl.c.orig Thu Jul  6 12:26:59 2000
+++ parseurl.c Wed Aug 16 18:38:48 2000
@@ -9,7 +9,7 @@
 #include "udm_utils.h"

 int UdmParseURL(UDM_URL *url,char *s){
-char *schema,*anchor,*file;
+char *schema,*anchor,*file,*query;

     if(strlen(s)>=UDM_URLSIZE)
          return(UDM_PARSEURL_LONG);
@@ -130,10 +130,16 @@
               strcpy(url->filename,url->path);
          strcpy(url->path,"");
     }
+
+    /* temporarily truncate path at the query string */
+    if((query=strchr(url->path,'?')))
+         *(query) = 0;
     if((file=strrchr(url->path,'/'))&&(strcmp(file,"/"))){
+         if(query) *(query) = '?';
          strcpy(url->filename,file+1);
          *(file+1)=0;
     }
+    if(query) *(query) = '?';
     return(0);
 }

