Chandramohan wrote:
perform such a cull again, you might make several distinct indexes (one per
day, per week, per whatever) during that reindexing so the next time will be
much easier.
How would you search and consolidate the results
across multiple indexes? Hits from each index will
have independent scoring.
Frankly, I ignore the scores in my application. The data itself isn't English
prose, so the TF/IDF calculations are stretched at best, as a measure of
relevance. I presort the documents to be in "relevance" order (a popularity
metric), then specify index ordering for the results.
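To make the presorting idea concrete, here is a minimal sketch in plain Python rather than against any real search library. The "popularity" field, build_index and search are all made-up names for illustration; the only point is that if documents are added in descending popularity order, returning hits in index (document id) order gives the ranking for free, with no scores involved.

```python
def build_index(docs):
    """Order docs by descending popularity; list position acts as the doc id."""
    return sorted(docs, key=lambda d: d["popularity"], reverse=True)


def search(index, term):
    """Return ids of matching docs in index order (ascending id), ignoring scores."""
    return [doc_id for doc_id, doc in enumerate(index) if term in doc["text"]]


# Hypothetical, non-prose data of the kind where TF/IDF means little.
docs = [
    {"text": "part 1001 bolt", "popularity": 3},
    {"text": "part 2002 bolt", "popularity": 9},
    {"text": "part 3003 nut", "popularity": 5},
]

index = build_index(docs)
hits = search(index, "bolt")
ranked = [index[i]["text"] for i in hits]
```

Because the ordering is fixed at indexing time, a change in the popularity metric only shows up after a reindex.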
If that wouldn't work for your application, it seems to me that large-enough
sub-sections *would* produce equivalent scores. That is, if the sub-indexes
were big enough, one could directly compare scores, so a simple merge would
work. If the total document corpus is small, then the need for sub-indexes
isn't there anyhow.
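For what I mean by a "simple merge", a rough sketch follows, again in plain Python and not tied to any particular search API. It assumes each sub-index hands back its hits as (score, doc_id) pairs already sorted by descending score, and that the scores really are comparable across sub-indexes, which is the big assumption above. merge_hits and the sample ids are invented for the example.

```python
import heapq
from itertools import islice


def merge_hits(per_index_hits, limit=10):
    """Merge per-index hit lists (each sorted by descending score) into one list.

    Assumes scores from different sub-indexes are directly comparable.
    """
    merged = heapq.merge(*per_index_hits, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, limit))


# Hypothetical hits from two weekly sub-indexes.
week1 = [(0.92, "w1-doc7"), (0.40, "w1-doc2")]
week2 = [(0.85, "w2-doc3"), (0.61, "w2-doc9"), (0.10, "w2-doc1")]

top = merge_hits([week1, week2], limit=3)
```

The merge is lazy, so only as many hits as the limit asks for are pulled from each sub-index's list.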
--MDC