According to Geoff Hutchison:
> At 1:18 PM +0300 11/22/99, Alexis Mikhailov wrote:
> >Can I propose a patch? It allows better handling of local files through
> >using of file://localhost urls. And there was a little bug in String.cc.
> >This patch was created to allow KDevelop to use Ht://Dig as its
> >search engine, work sponsored through Cosource.
>
> Hi!
>
> I actually had something like this in mind for local_urls. I've taken
> a quick glance and it generally looks OK, though it would be nice to
> have a ChangeLog entry for all of the changes (since some don't look
> related).
I've looked through, and the patch contains a number of different (sometimes
loosely related) fixes and enhancements:
1) htcommon/DocumentDB.cc & htdig/Retriever.cc: allow file:... as well
as http:... URLs. (This doesn't change anything in htlib/URL.cc, so I'm
not sure how well it'll handle hrefs in documents from the file:
service. I think fully qualified file:/path/doc URLs will work fine,
but it seems that relative ones which contain the "file:" service name
will not be treated as relative, while relative ones without the service
name will default to "http:".)
2) htdig/HTML.cc: add support for an ignore_noindex attribute. This is
undocumented and no default is defined, but I think the behaviour is
pretty obvious from the code. I'd question the desirability/need for
this, but it seems harmless enough. The value should be set in a static
variable, though, to minimise performance impact. I'd call this a work
in progress.
3) htdig/Retriever.cc & htdig/Server.cc: modified to allow local file
access to persist even if the HTTP server is down. Looks good to me.
4) htdig/htdig.cc & htlib/String.cc: allow htdig to read the URL list from
stdin if htdig is given an argument of "-". The >> operator added to the
String class uses a lot of stuff I've never seen before, so I don't know
how to judge it for myself. It seems more complicated than it would
need to be for the simple task at hand - all you need to do is load all
of stdin into one string - but I guess the line by line approach would
allow bigger lists (less string overhead). Undocumented.
5) htlib/String.cc: fix a bug in write() method, which is currently unused.
6) htlib/cgi.cc & htsearch/htsearch.cc: add a -a option to htsearch, to
add name=value parameters to those in query string. This is undocumented
as well. I'm not sure how it relates to the other changes, but it seems
simple enough.
7) htsearch/htsearch.cc: set alarm() time earlier in execution than was
done before. Didn't know that the initial processing could lead to hangs.
8) htsearch/Display.cc: compare nPages to maximum_pages earlier than was
done before, presumably so PAGES variable won't be allowed to exceed it.
Looks good to me.
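
To illustrate the static-variable suggestion in (2), here is a minimal
sketch, not taken from the patch. The Config struct, its Boolean() method
and the ignoreNoindex() helper are all hypothetical stand-ins of my own;
only the attribute name ignore_noindex comes from the patch. The idea is
just that the parser looks the attribute up once and reuses the result,
with an explicit default when it is unset.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the configuration object; not the real htdig class.
struct Config {
    std::map<std::string, std::string> values;
    int lookups;  // counts lookups so the saving is visible

    Config() : lookups(0) {}

    // Returns the attribute as a boolean, or `fallback` when it is unset.
    bool Boolean(const std::string &name, bool fallback) {
        ++lookups;
        std::map<std::string, std::string>::const_iterator it = values.find(name);
        if (it == values.end()) return fallback;
        return it->second == "true" || it->second == "yes" || it->second == "1";
    }
};

// Meant to be called for every tag parsed; only the first call reads the config.
bool ignoreNoindex(Config &config) {
    static const bool cached = config.Boolean("ignore_noindex", false);
    return cached;
}

// Usage check: many calls, one lookup.
void checkIgnoreNoindexCaching() {
    Config config;
    config.values["ignore_noindex"] = "true";
    for (int i = 0; i < 1000; ++i) {
        bool ignore = ignoreNoindex(config);
        assert(ignore);
    }
    assert(config.lookups == 1);
}
```

The trade-off is that the value is fixed for the life of the process, so
this only suits attributes that cannot change between documents.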
I'm all for these fixes, but some documentation explaining the less
clear aspects, or even a note explaining the rationale for these,
would be appreciated. I do have some concerns about the correctness of
(1) above, given the lack of changes to URL.cc. Maybe this would be a
better fit in 3.2. I'm also concerned about the portability of the >>
operator added to the String class in (4) above, given the way it pokes
into internals of library classes, and the unfamiliar cast construct.
Separate patches for these would be a big help, as fixes are generally
committed one at a time.
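
On the portability concern in (4), a sketch of what I would consider the
simple approach, assuming the standard iostream and string classes are
acceptable here (the patch itself extends htlib's String class instead).
readUrlList() is a made-up name, not anything in htdig. It reads one URL
per line through the public std::getline() interface only, without
reaching into stream internals.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of htdig: collect one URL per line from a stream.
std::vector<std::string> readUrlList(std::istream &in) {
    std::vector<std::string> urls;
    std::string line;
    while (std::getline(in, line)) {
        // Strip a trailing carriage return left by DOS-style line endings.
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);
        // Skip blank lines rather than queueing empty URLs.
        if (line.empty())
            continue;
        urls.push_back(line);
    }
    return urls;
}

// Usage check with an in-memory stream standing in for stdin.
void checkReadUrlList() {
    std::istringstream in("http://a.example/\r\n\nfile:/tmp/doc.html\n");
    std::vector<std::string> urls = readUrlList(in);
    assert(urls.size() == 2);
    assert(urls[0] == "http://a.example/");
    assert(urls[1] == "file:/tmp/doc.html");
}
```

With an argument of "-", the caller would pass std::cin to the same
function; anything else about how htdig consumes the list is left out.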
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930