Smart idea, but it won't help me. I have almost 50 categories and eventually I would like to "filter" not just on category but maybe also on language, etc. Karl: what do you mean by measure the distance between the term vectors and cluster them in real time?
On Tue, Apr 22, 2008 at 7:39 PM, Glen Newton <[EMAIL PROTECTED]> wrote: > Sorry, I misunderstood the problem. My mistake. > > While not optimal and rather expensive space-wise, you could have - in > addition to existing keyword field - a field for each category. If > the document being indexed is in category A, only add the text to the > catA field. Now do MoreLikeThis on catA. This assumes you know the > categories at index time, of course. > Redundant but workable. > > -Glen > > 2008/4/22 Jonathan Ariel <[EMAIL PROTECTED]>: > > Is there any way to execute a MoreLikeThis over a subset of documents? I > > need to retrieve a set of interesting keywords from a subset of > documents > > and not the entire index (imagine that my index has documents > categorized as > > A, B and C and I just want to work with those categorized as A). Right > now > > it is using docFreq from the IndexReader. So I looked into the > > FilterIndexReader to see if I can override the docFreq behavior, but > I'm not > > sure if it's possible. > > > > What do you think? > > > > Jonathan > > > > > > -- > > - > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >