I think that what Yonik wants is a higher-level response. *Why* do you want to process the tokens later? What is the use case you're trying to satisfy?
Best,
Erick

On Dec 20, 2007 1:37 AM, Rishabh Joshi <[EMAIL PROTECTED]> wrote:
>
> > What are you trying to do with the tokens?
>
> Yonik, we wanted a "tokenizer" that would tokenize the content of a
> document as per our requirements, and then store the tokens in the index
> so that we could retrieve them at search time for further processing in
> our application.
>
> Regards,
> Rishabh
>
> On Dec 19, 2007 10:02 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > On Dec 19, 2007 10:59 AM, Rishabh Joshi <[EMAIL PROTECTED]> wrote:
> > > I have created my own Tokenizer and I am indexing the documents
> > > using the same.
> > >
> > > I wanted to know if there is a way to retrieve the tokens (created
> > > by my custom tokenizer) from the index.
> >
> > If you want the tokens in the index, see the Luke request handler.
> >
> > If you want the tokens for a specific document, it's more
> > complicated... Lucene maintains an *inverted* index: terms point to
> > documents, so by default there is no way to ask for all of the terms
> > in a certain document. One could ask Lucene to store the terms for
> > certain fields (called term vectors), but that requires extra space
> > in the index, and Solr doesn't yet have a way to ask that they be
> > retrieved.
> >
> > What are you trying to do with the tokens?
> >
> > -Yonik
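Yonik's point about the inverted index can be sketched in a few lines. This is a minimal illustration in plain Python, not Lucene or Solr code: the dictionary names (`inverted`, `term_vectors`) and the toy documents are assumptions made for the example, and "term vectors" here just means keeping an extra forward (document-to-terms) mapping, which is the trade-off Yonik describes.

```python
# Illustrative sketch only (not the Lucene API): why an inverted index
# alone cannot cheaply answer "which terms are in document N", and how
# storing term vectors (a forward, per-document mapping) fills that gap.
from collections import defaultdict

# Toy tokenized documents, doc id -> token list (hypothetical data).
docs = {
    0: ["custom", "tokenizer", "output"],
    1: ["tokenizer", "index", "terms"],
}

# Inverted index: term -> set of doc ids. Efficient for search
# (term -> documents), but the doc -> terms direction is lost.
inverted = defaultdict(set)
for doc_id, tokens in docs.items():
    for tok in tokens:
        inverted[tok].add(doc_id)

# Recovering doc 1's terms from the inverted index alone means scanning
# every term's postings list — linear in the vocabulary size.
terms_of_doc1_slow = sorted(t for t, ids in inverted.items() if 1 in ids)

# Term vectors amount to also storing the forward mapping doc -> terms,
# trading extra index space for a direct per-document lookup.
term_vectors = {doc_id: sorted(tokens) for doc_id, tokens in docs.items()}
terms_of_doc1_fast = term_vectors[1]

assert terms_of_doc1_slow == terms_of_doc1_fast
```

Lucene can store such term vectors per field at indexing time, at the cost of extra index space; as Yonik notes, Solr (as of this thread) had no request parameter to return them.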