I think that what Yonik wants is a higher-level response. *Why* do you want to process the tokens later? What is the use case you're trying to satisfy?
Best,
Erick

On Dec 20, 2007 1:37 AM, Rishabh Joshi <[EMAIL PROTECTED]> wrote:
>
> > What are you trying to do with the tokens?
>
> Yonik, we wanted a "tokenizer" that would tokenize the content of a
> document as per our requirements, and then store the tokens in the index
> so that we could retrieve them at search time for further processing in
> our application.
>
> Regards,
> Rishabh
>
> On Dec 19, 2007 10:02 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > On Dec 19, 2007 10:59 AM, Rishabh Joshi <[EMAIL PROTECTED]> wrote:
> > > I have created my own Tokenizer and I am indexing the documents
> > > using the same.
> > >
> > > I wanted to know if there is a way to retrieve the tokens (created
> > > by my custom tokenizer) from the index.
> >
> > If you want the tokens in the index, see the Luke request handler.
> >
> > If you want the tokens for a specific document, it's more
> > complicated... Lucene maintains an *inverted* index: terms point to
> > documents, so by default there is no way to ask for all of the terms
> > in a certain document. One could ask Lucene to store the terms for
> > certain fields (called term vectors), but that requires extra space
> > in the index, and Solr doesn't yet have a way to ask that they be
> > retrieved.
> >
> > What are you trying to do with the tokens?
> >
> > -Yonik
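Yonik's point about the inverted index can be sketched in a few lines. This is a minimal illustration in plain Python, not Lucene or Solr code: the dictionary names (`inverted`, `term_vectors`) and the toy documents are assumptions made for the example, and "term vectors" here just means keeping an extra forward (document-to-terms) mapping, which is the trade-off Yonik describes.

```python
# Illustrative sketch only (not the Lucene API): why an inverted index
# alone cannot cheaply answer "which terms are in document N", and how
# storing term vectors (a forward, per-document mapping) fills that gap.
from collections import defaultdict

# Toy tokenized documents, doc id -> token list (hypothetical data).
docs = {
    0: ["custom", "tokenizer", "output"],
    1: ["tokenizer", "index", "terms"],
}

# Inverted index: term -> set of doc ids. Efficient for search
# (term -> documents), but the doc -> terms direction is lost.
inverted = defaultdict(set)
for doc_id, tokens in docs.items():
    for tok in tokens:
        inverted[tok].add(doc_id)

# Recovering doc 1's terms from the inverted index alone means scanning
# every term's postings list — linear in the vocabulary size.
terms_of_doc1_slow = sorted(t for t, ids in inverted.items() if 1 in ids)

# Term vectors amount to also storing the forward mapping doc -> terms,
# trading extra index space for a direct per-document lookup.
term_vectors = {doc_id: sorted(tokens) for doc_id, tokens in docs.items()}
terms_of_doc1_fast = term_vectors[1]

assert terms_of_doc1_slow == terms_of_doc1_fast
```

Lucene can store such term vectors per field at indexing time, at the cost of extra index space; as Yonik notes, Solr (as of this thread) had no request parameter to return them.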