"Mike Klaas" <[EMAIL PROTECTED]> wrote on 16/03/2007 14:26:46:

> On 3/15/07, karl wettin <[EMAIL PROTECTED]> wrote:
> > I propose a change of the current IndexReader.getTermFreqVector/s
> > code so that it /always/ returns the vector space model of a document,
> > even when fields are set as Field.TermVector.NO.
> >
> > Is that crazy? Could be really slow, but except for that.. And if it
> > is cached then that information is known by inspecting the fields.
> > People don't go fetching term vectors without knowing what they are
> > doing, are they?
>
> The highlighting contrib code does this: attempt to retrieve the
> termvector, catch InvalidArgumentException, fall back to re-analysis
> of the data.

This way makes more sense to me.  IndexReader.getTermFreqVector() means "it's
there, just bring it", while the fall-back is more of a
computeTermFreqVector(), which takes much more time.  Users would likely
prefer getting an exception from the get() (oops, term vectors were not
saved..) rather than silently falling back to an expensive computation.

The fall-back computation seems better suited to a reusable utility, perhaps
in contrib?
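
Roughly the split I have in mind, as a sketch only.  All names here are made
up for illustration (VectorStore, TermVectorUtil and their methods are not
existing Lucene API), and the whitespace/punctuation tokenizing is a naive
stand-in for running a real Analyzer over the stored field:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical utility; none of these types are Lucene classes. */
final class TermVectorUtil {

    /** Stand-in for an index that may or may not hold stored term vectors. */
    interface VectorStore {
        /** Stored term frequencies, or null if none were saved at index time. */
        Map<String, Integer> storedVector(int docId, String field);

        /** Stored raw text of the field, or null if the field was not stored. */
        String storedText(int docId, String field);
    }

    /** Cheap path: the vector is either there or the caller gets an exception. */
    static Map<String, Integer> getTermFreqVector(VectorStore store, int docId, String field) {
        Map<String, Integer> vector = store.storedVector(docId, field);
        if (vector == null) {
            throw new IllegalArgumentException(
                "no term vector stored for field '" + field + "' of doc " + docId);
        }
        return vector;
    }

    /** Expensive path: re-tokenize the stored text and count the terms. */
    static Map<String, Integer> computeTermFreqVector(VectorStore store, int docId, String field) {
        String text = store.storedText(docId, field);
        if (text == null) {
            throw new IllegalArgumentException(
                "field '" + field + "' of doc " + docId + " has no stored text to re-analyze");
        }
        Map<String, Integer> freqs = new LinkedHashMap<>();
        // Naive tokenizer (assumption): lowercase, split on non-word characters.
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    /** Explicit opt-in fall-back, for callers who accept the extra cost. */
    static Map<String, Integer> getOrCompute(VectorStore store, int docId, String field) {
        Map<String, Integer> vector = store.storedVector(docId, field);
        return vector != null ? vector : computeTermFreqVector(store, docId, field);
    }
}
```

Callers who want the cheap, strict behavior use getTermFreqVector(); only
code that asks for getOrCompute() ever pays for the re-analysis.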

>
> I'm not sure if that is crazy, but that is what is currently implemented.
>
> -Mike

