According to Benjelloun Adnane:
> When I execute "rundig" it doesn't parse Hyperlinks correctly
> 
> This URL :
> index.epl?menu=menug&ances=root_1000&selec=1000&href=file1.html
> 
> Is parsed as :
> 
> index.epl?menu=menug=root_1000=1000=file1.htm
> 
> 
> Please msg. me at ([EMAIL PROTECTED]) if you have any help

Yes, this is a known problem with the 3.1.3 release.  It turns out that
the fix for handling &foo; SGML entities in HTML tag parameters had a
couple of problems.  First of all, it didn't handle &amp; if translate_amp
was false, even though this was one of the main motivations for the fix.
Secondly, it mangled bare &'s in tag parameters, stripping out the & and
the parameter name that follows it, as in the URL above.

Torsten posted a patch for this a few days after 3.1.3 was released.
I'm now posting what I think is an improvement on it.  Torsten's
patch had a couple of potential problems that I think this patch avoids.
(The most serious of these was that a parameter after a bare "&" could
still get stripped out if translate_amp was true.)  Please give this
patch a try and let me know if there are any problems.

--- htdig/HTML.cc.orig  Wed Sep 22 11:18:40 1999
+++ htdig/HTML.cc       Thu Oct 14 15:08:24 1999
@@ -1114,7 +1114,15 @@ HTML::transSGML(char *str)
     while (*text)
     {
        if (*text == '&')
-           convert << SGMLEntities::translateAndUpdate(text);
+       {
+           if (strncmp((char *)text, "&amp;", 5) == 0) 
+           {
+               // We MUST convert these in URLs, regardless of translate_amp.
+               convert << '&';
+               text += 5;
+           } else
+               convert << SGMLEntities::translateAndUpdate(text);
+       }
        else
            convert << *text++;
     }
--- htdig/SGMLEntities.cc.orig  Wed Sep 22 11:18:41 1999
+++ htdig/SGMLEntities.cc       Thu Oct 14 15:08:31 1999
@@ -280,5 +280,11 @@ SGMLEntities::translateAndUpdate(unsigne
     
     if (*entityStart == ';')
        entityStart++;          // A final ';' is used up.
-    return translate(entity);
+    unsigned char e = translate(entity);
+    if (e == ' ' && strncmp((char *)orig, "&#32", 4) != 0)
+    {
+       entityStart = orig + 1; // Catch unrecognized entities...
+       return '&';
+    }
+    return e;
 }
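
As a rough illustration of the behaviour the patch is aiming for, here
is a small standalone sketch.  It is not htdig code: translateParam and
the tiny kEntities table are hypothetical names made up for this example,
and unlike htdig it insists on the closing ';' and ignores numeric
entities.  The point is only that &amp; always becomes &, a recognized
entity is translated, and anything else starting with & is passed
through untouched.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for htdig's entity table (assumption: only a
// handful of named entities, no numeric ones).
static const std::map<std::string, char> kEntities = {
    {"amp", '&'}, {"lt", '<'}, {"gt", '>'}, {"quot", '"'},
};

// Translate entities in a tag parameter such as an href value.
// A recognized "&name;" is replaced by its character; a bare '&' or an
// unrecognized entity is copied through literally, so nothing after it
// gets stripped.
std::string translateParam(const std::string &in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        if (in[i] != '&')
        {
            out += in[i++];
            continue;
        }
        // Scan the candidate entity name that follows the '&'.
        std::size_t j = i + 1;
        while (j < in.size() &&
               std::isalnum(static_cast<unsigned char>(in[j])))
            ++j;
        auto it = kEntities.find(in.substr(i + 1, j - i - 1));
        if (it != kEntities.end() && j < in.size() && in[j] == ';')
        {
            out += it->second;  // Known entity: emit its character.
            i = j + 1;          // The final ';' is used up.
        }
        else
        {
            out += '&';         // Bare '&' or unknown entity: keep it.
            ++i;                // Resume right after the '&'.
        }
    }
    return out;
}
```

With this, the URL from the original report comes back unchanged, while
"a.cgi?x=1&amp;y=2" becomes "a.cgi?x=1&y=2".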


-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

