I have added a couple of features to ht://Dig to suit my requirements.
At present, I am in the process of testing them. If there is any
interest, I will be happy to post a patch. (Source baseline is
htdig-3.1.3)
Here is a brief description of the features:
1. Multiple databases (collections)
It appears that several people have raised this issue. The
requirement is to perform searches across multiple databases.
It was suggested that multiple databases can be merged into one
large database. Along with the "restrict_url" property, one can
simulate multiple collections. Another approach was to write a
wrapper to merge search results.
With the first approach, database sizes can become large, resulting
in slower searches. The second approach will probably have problems
with sorting and pagination.
With suitable modifications, htsearch can now accept several
databases to search against. It searches the specified databases
independently and then handles the aggregate results.
In a way, it is similar to the second approach above, except that
the merging of results happens within htsearch itself. Since the
merged result list can be sorted and paginated, it avoids problems
with a wrapper script.
One potential downside is that more (database) files may be
open during the search. The modified htsearch seems to be working
satisfactorily.
This change is restricted to htsearch only. htdig and htmerge are
unaffected.
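To illustrate the idea of merging inside the search step, here is a rough
sketch in Python (not the actual htsearch code, which is C++). The Hit
record, the function names and the assumption that scores from different
databases are directly comparable are all hypothetical; the patch may
handle ranking differently.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    # Hypothetical result record; htsearch's real match structure differs.
    url: str
    score: float
    database: str


def merge_results(per_db_hits):
    """Flatten the per-database hit lists and sort them as one list.

    Assumes scores are comparable across databases. Ties are broken by
    URL so that the ordering is stable from one page request to the next.
    """
    merged = [hit for hits in per_db_hits.values() for hit in hits]
    merged.sort(key=lambda h: (-h.score, h.url))
    return merged


def paginate(hits, page, per_page):
    """Return the hits for a 1-based page number."""
    start = (page - 1) * per_page
    return hits[start:start + per_page]
```

Because sorting happens once over the combined list, page 2 always
continues where page 1 ended, which is the part that is awkward to get
right in an external wrapper script.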
2. Indexing newsgroup articles
I implemented support for the NNTP protocol to index and search
newsgroup articles. With this feature, news:// URLs are treated
similarly to http:// URLs. Newsgroups can be added to the start_url
property. News articles can be incrementally indexed.
This change is restricted to htdig and htmerge only. htsearch is not
affected.
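As a rough illustration of the two ideas above (again in Python, not the
actual htdig code), the sketch below splits a news:// start URL into
server and newsgroup, and picks out only the articles that have not been
indexed yet. The news://server/group URL form, the function names and the
"last indexed article number" bookkeeping are assumptions made for the
example; the patch may do this differently.

```python
from urllib.parse import urlsplit


def parse_news_url(url):
    """Split a news:// URL into (server, newsgroup).

    Assumes the form news://server/group, which is a guess at how a
    newsgroup would be written in start_url.
    """
    parts = urlsplit(url)
    if parts.scheme != "news":
        raise ValueError("not a news:// URL: " + url)
    return parts.hostname, parts.path.lstrip("/")


def articles_to_index(available, last_indexed):
    """Return article numbers above the last one already indexed.

    Hypothetical incremental scheme: remember the highest article number
    seen per group and only fetch what came after it.
    """
    return [n for n in sorted(available) if n > last_indexed]
```

A start_url entry could then be dispatched on its scheme, with news://
entries going to the NNTP code path and everything else to the existing
HTTP one.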
Raj Inamdar
[EMAIL PROTECTED]