Re: MoreLikeThis over a subset of documents

Jonathan Ariel Tue, 22 Apr 2008 15:52:57 -0700

Smart idea, but it won't help me. I have almost 50 categories and eventually
I would like to "filter" not just on category but maybe also on language,
etc.
Karl: what do you mean by measure the distance between the term vectors and
cluster them in real time?


On Tue, Apr 22, 2008 at 7:39 PM, Glen Newton <[EMAIL PROTECTED]> wrote:

> Sorry, I misunderstood the problem. My mistake.
>
> While not optimal and rather expensive space-wise, you could have - in
> addition to existing keyword field - a field for each category.  If
> the document being indexed is in category A, only add the text to the
> catA field. Now do MoreLikeThis on catA. This assumes you know the
> categories at index time, of course.
> Redundant but workable.
>
> -Glen
>
> 2008/4/22 Jonathan Ariel <[EMAIL PROTECTED]>:
> > Is there any way to execute a MoreLikeThis over a subset of documents? I
> >  need to retrieve a set of interesting keywords from a subset of
> documents
> >  and not the entire index (imagine that my index has documents
> categorized as
> >  A, B and C and I just want to work with those categorized as A). Right
> now
> >  it is using docFreq from the IndexReader. So I looked into the
> >  FilterIndexReader to see if I can override the docFreq behavior, but
> I'm not
> >  sure if it's possible.
> >
> >  What do you think?
> >
> >  Jonathan
> >
>
>
>
> --
>
> -
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

Re: MoreLikeThis over a subset of documents

Reply via email to