We are running into situations where htsearch, when a fairly common word has
been searched for, takes long enough to cause problems (timeouts in the
invoking process).
From the documentation, there does not appear to be any option, either in the
conf file or in the parameters passed to htsearch, to limit the number of
matches that are located and sorted. If several thousand documents match the
specified words, all of them have to participate in sorting; there is no way
to cap how many do.
Use of "bad_words" operates as documented, but this prevents any matches from
being processed.
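(For reference, and assuming I have the attribute name right, this is the
mechanism I mean; the path shown is just the usual default:

    bad_word_list:      ${common_dir}/bad_words

with bad_words being a plain text file listing one word per line. Any word in
that file is dropped outright, which is why no matches for it ever come back.)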
It appears to me that I could inspect the .wordlist file produced by htdig,
locate the records that are producing the unwanted matches, and remove them
before running htmerge.
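As a rough sketch of what I have in mind (assuming the word is the first
whitespace-separated field of each .wordlist line, and taking "too common" to
simply mean "has more than some cutoff number of records"; the file names and
cutoff below are made up):

    #!/usr/bin/env python
    # Sketch only: strip the records for over-common words out of the
    # wordlist file that htdig writes, before handing it to htmerge.
    # Assumes each record is one line whose first whitespace-separated
    # field is the word itself.

    from collections import Counter

    WORDLIST = "db.wordlist"          # file written by htdig (name illustrative)
    TRIMMED  = "db.wordlist.trimmed"  # file to hand to htmerge instead
    CUTOFF   = 5000                   # arbitrary "too common" threshold

    # Pass 1: count how many records each word has.
    counts = Counter()
    with open(WORDLIST) as src:
        for line in src:
            fields = line.split(None, 1)
            if fields:
                counts[fields[0]] += 1

    # Pass 2: copy across only the records for words under the cutoff.
    dropped = set()
    with open(WORDLIST) as src, open(TRIMMED, "w") as dst:
        for line in src:
            fields = line.split(None, 1)
            word = fields[0] if fields else ""
            if counts[word] > CUTOFF:
                dropped.add(word)
            else:
                dst.write(line)

    print("dropped %d over-common words" % len(dropped))

The trimmed file would then be renamed over the original before htmerge runs.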
I'm hoping that doing this will merely result in a smaller .words.db file, and
that the .words.db entries which DO get written will still be processed
correctly (i.e., that the inconsistency introduced by the missing entries will
not cause any unforeseen problems).
Steven P Haver/602-242-9708