I have just started to use htdig.  I have htdig-3.0.8b2, and have had a
problem with URLs like:
      http:/ITS
In directory htlib, lines 151 to 152 of URL.cc are:
      if (hasService && ((strncmp(ref, "http:/", 6) == 0) ||
           (strncmp(ref, "http:", 5) != 0)))
and this condition is satisfied for the above URL.  This leads to a call
of parse, which in turn executes lines 302 to 306 of URL.cc:
      _host = 0;
      _port = 0;
      _url = 0;
      _path = p;
      _normal = 1;
      return;
which leaves the URL with no host and leads to nonsense.
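
To check the claim about the condition, it can be evaluated on its own.
This is only a sketch: the helper name takesParseBranch is made up for
illustration, and hasService is passed in as a parameter because the code
that computes it is not quoted in this message.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not part of htdig): mirrors the condition quoted
// from lines 151 to 152 of URL.cc.  hasService is an input here because
// its real computation is not shown in this message.
static bool takesParseBranch(const char *ref, bool hasService)
{
    assert(ref != nullptr);
    return hasService && ((std::strncmp(ref, "http:/", 6) == 0) ||
                          (std::strncmp(ref, "http:", 5) != 0));
}
```

Assuming hasService is true for this reference,
takesParseBranch("http:/ITS", true) returns true, because the first six
characters of "http:/ITS" match "http:/".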

If you change line 151 to:
      if (hasService && ((strncmp(ref, "http://", 7) == 0) ||
then a URL like:
      http:/ITS
makes the condition on line 151 false.  Execution then eventually
reaches lines 171 to 178 of URL.cc, which are:
 if (*ref == '/')
 {
     //
     // The reference is on the same server as the parent, but
     // an absolute path was given...
     //
     _path = ref;
 }
which I believe is exactly what's required.
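
The effect of the suggested change can be sketched in the same way.  Again
the helper name proposedCondition is made up, hasService is simply passed
in, and the second half of the condition is copied unchanged from line 152.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not part of htdig): the condition from lines 151
// to 152 of URL.cc with the suggested "http://" test in place of "http:/".
// hasService is an input because its real computation is not shown here.
static bool proposedCondition(const char *ref, bool hasService)
{
    assert(ref != nullptr);
    return hasService && ((std::strncmp(ref, "http://", 7) == 0) ||
                          (std::strncmp(ref, "http:", 5) != 0));
}
```

With hasService true, proposedCondition("http:/ITS", true) returns false,
while a complete URL such as "http://www.durham.ac.uk/" still gives true
and so would still be passed to parse.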

In the cases where *ref starts with "http" (which is most common), I think
it's possible to prove that (with the current code) the condition on line
171, i.e., *ref == '/', is never true and so the statement on line 177 is
unreachable.

I would appreciate someone confirming the above.  If it's true, I'm
puzzled as to why it hasn't been detected previously.  Is the "http:/"
code on line 151 new?

--
Barry Cornelius                      Telephone: (0191 or +44 191) 374 4717
User Services, Information Technology Service,            Office: 374 2892   
Science Site, University of Durham, Durham, DH1 3LE, UK      Fax: 374 7759
http://www.durham.ac.uk/~dcl0bjc       mailto:[EMAIL PROTECTED]
