I could have up to 2 million documents and growing. On Tue, Apr 22, 2008 at 7:29 PM, Karl Wettin <[EMAIL PROTECTED]> wrote:
> Jonathan Ariel skrev: > > Is there any way to execute a MoreLikeThis over a subset of documents? I > > need to retrieve a set of interesting keywords from a subset of > > documents > > and not the entire index (imagine that my index has documents > > categorized as > > A, B and C and I just want to work with those categorized as A). Right > > now > > it is using docFreq from the IndexReader. So I looked into the > > FilterIndexReader to see if I can override the docFreq behavior, but I'm > > not > > sure if it's possible. > > > > What do you think? > > > > It might be tricky. > > How many documents do you have in the subset? Could you measure the > distance between the term vectors and cluster them in real time? > > > karl > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >