subject:"multiple tokens at the same position"

Re: multiple tokens at the same position

2007-05-25 Thread Mark Miller

Another (obvious) option is to use two indexes and direct the query to the appropriate index depending on the search specification. Of course you double your space requirements, but you're basically going to do that anyway if you use two fields. I chose this for the slight benefit of fewer fields on

Re: multiple tokens at the same position

2007-05-25 Thread Enis Soztutar

On 5/25/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : Yes, indeed we could but it brings other problems, for example increasing : the index size, and extending the query to search for multiple fields, etc. 1) if you index both the raw and stemmed forms your index is going to grow to roughly

Re: multiple tokens at the same position

2007-05-25 Thread Chris Hostetter

: Yes, indeed we could but it brings other problems, for example increasing : the index size, and extending the query to search for multiple fields, etc. 1) if you index both the raw and stemmed forms your index is going to grow to roughly the same size regardless of whether the stem and the raw a

Re: multiple tokens at the same position

2007-05-25 Thread Erick Erickson

I can only speak to the "avoid matching stemmed or canonical forms" part... Yes, but you've got to do some fancy dancing when you index, something like adding a special signifier to, say, the original token. I'll ignore the canonical part of your question for the sake of brevity. Consider inde
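One possible reading of the "special signifier" suggestion above, as a rough sketch only: index every word twice at the same position, once stemmed and once in its original form with a marker appended, so that an exact-match query looks up only the marked term. This is plain Python, not the Lucene API. The `$` marker, `toy_stem`, `index_terms`, and `search` are all hypothetical names chosen for illustration, and the stemmer is a deliberately crude stand-in for a real one.

```python
EXACT_MARKER = "$"  # assumed marker; any character that never occurs in real terms would do


def toy_stem(term):
    """Crude stand-in for a real stemmer: strip a few common suffixes."""
    for suffix in ("ing", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def index_terms(words):
    """Map each indexed term to the set of word positions where it occurs.

    Every word contributes two terms at the same position:
    its stem, and its original form with EXACT_MARKER appended.
    """
    postings = {}
    for pos, word in enumerate(words):
        original = word.lower()
        for term in (toy_stem(original), original + EXACT_MARKER):
            postings.setdefault(term, set()).add(pos)
    return postings


def search(postings, word, exact=False):
    """Return sorted positions; exact=True matches only the original form."""
    term = word.lower()
    key = term + EXACT_MARKER if exact else toy_stem(term)
    return sorted(postings.get(key, set()))
```

With this layout a stemmed query for "walking" matches every inflection that shares its stem, while `exact=True` matches only positions where the word appeared literally. Because both forms live in one field, the query side only has to decide whether to append the marker.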

Re: multiple tokens at the same position

2007-05-25 Thread Enis Soztutar

Yes, indeed we could but it brings other problems, for example increasing the index size, and extending the query to search for multiple fields, etc. On 5/25/07, Steven Rowe <[EMAIL PROTECTED]> wrote: Hi Enis, Enis Soztutar wrote: > In nutch we have a use case in which we need to store tokens

Re: multiple tokens at the same position

2007-05-25 Thread Steven Rowe

Hi Enis, Enis Soztutar wrote: > In nutch we have a use case in which we need to store tokens with their > original text plus their stemmed form plus their canonical form (through > some asciifization). From my understanding of lucene, it makes sense to > write a tokenstream which generates several

multiple tokens at the same position

2007-05-25 Thread Enis Soztutar

Hi, In nutch we have a use case in which we need to store tokens with their original text plus their stemmed form plus their canonical form (through some asciifization). From my understanding of lucene, it makes sense to write a tokenstream which generates several tokens for each "word", but p
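The idea in the question, several tokens per word sharing one position, can be sketched independently of Lucene. The snippet below is a minimal illustration and not Lucene's TokenStream API: `asciify`, `toy_stem`, `expand`, and `positions` are hypothetical helpers, and the stemmer is a crude placeholder. The first variant of each word advances the position by 1 and every further variant uses a position increment of 0, which is the convention Lucene uses to stack tokens at the same position.

```python
import unicodedata


def asciify(term):
    """Strip diacritics: decompose the string, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_stem(term):
    """Crude stand-in for a real stemmer: strip a few common suffixes."""
    for suffix in ("ing", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def expand(words):
    """Yield (term, position_increment) pairs for each input word.

    The original form gets increment 1; distinct stemmed and
    asciified variants follow with increment 0 (same position).
    """
    for word in words:
        original = word.lower()
        variants = [original]
        for variant in (toy_stem(original), asciify(original)):
            if variant not in variants:  # skip variants identical to an earlier one
                variants.append(variant)
        for i, term in enumerate(variants):
            yield term, (1 if i == 0 else 0)


def positions(tokens):
    """Turn (term, increment) pairs into (absolute_position, term) pairs."""
    pos = -1
    out = []
    for term, increment in tokens:
        pos += increment
        out.append((pos, term))
    return out
```

Skipping duplicate variants keeps the index from growing when the stem or the asciified form equals the original, which is the common case for many words.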

7 matches
