According to Vikram Lele:
> I am looking for a search engine for our intranet news group.
> We are using HyperNews as our intranet newsgroup.
> 
> Is htDig suitable for such applications? As you know, newsgroup
> contents change on an almost hourly basis, so the search engine must be
> capable of incremental indexing.
> 
> From your site I couldn't figure out whether incremental indexing is
> supported. Please advise.

A number of ht://Dig users use it for indexing mailing lists, which
would be quite similar, so perhaps some of them would care to comment
on the issues involved.

The htdig program is capable of incremental indexing, but it tends to
recheck all indexed documents to see if they've changed. That in itself
can mean a lot of overhead when you have lots of static content and a
smaller proportion of new content.  There are a few ways to avoid this.
One is to make sure you index via the local filesystem rather than via
HTTP, so that the modification time checks for unchanged files are very
quick.  Another is to break up your index into multiple databases for
various "ages" of documents (e.g. one per month or one per year, depending
on how far back you go), so that only the most recent database needs to
be updated; you can then merge them all to get a full search database.
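
To make the two ideas concrete, here is a rough Python sketch.  It is not
htdig code and does not use any htdig API; the function names
(changed_files, month_bucket) and the dict of recorded modification times
are hypothetical, chosen only for illustration.  It shows a cheap
modification time check on local files, and how a document could be
assigned to a per-month database.

```python
import os
from datetime import datetime, timezone


def changed_files(root, previous_mtimes):
    """Return the files under root that are new or whose mtime has changed.

    previous_mtimes is a hypothetical record (path -> mtime) saved by the
    last indexing run.  Only a stat() per file is needed, no HTTP request.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime
            if previous_mtimes.get(path) != mtime:
                changed.append(path)
    return sorted(changed)


def month_bucket(mtime):
    """Name the per-month database (e.g. "1999-03") for a given mtime (UTC)."""
    return datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m")
```

With something like this, only the files returned by changed_files would
need re-indexing, and only the database named by month_bucket for the
current month would need to be rebuilt before merging.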

Depending on the size of the data you're indexing, the overhead of
rechecking unchanged documents may or may not be an issue.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
