Re: read more tokens during analysis

2010-02-12 Thread Ahmet Arslan
> I want to consider the current word & the next as a single term.
>
> When analyzing "Arun Kumar", I want my analyzer to consider "Arun" and "Arun Kumar" as synonyms.
>
> In the tokenstream method, how do we read the next token "Kumar"?
> I am going through the setPositionIncrements meth
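
For illustration only, the one-token lookahead asked about above can be sketched in plain Python. This is not Lucene code and not the code from the reply, which this excerpt does not show. The function name `with_next_word` and the `(term, position_increment)` tuple shape are hypothetical. The sketch assumes the usual Lucene convention that a position increment of 0 stacks a term at the same position as the previous one, which is how synonyms are normally represented.

```python
from typing import Iterable, Iterator, Tuple


def with_next_word(tokens: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (term, position_increment) pairs.

    Each word is emitted with increment 1. When a following word exists,
    the two-word term "current following" is emitted right after it with
    increment 0, so it shares the position of the current word.
    """
    it = iter(tokens)
    try:
        current = next(it)  # buffer the first token
    except StopIteration:
        return  # empty input: nothing to emit

    for following in it:  # 'following' is the one-token lookahead
        yield current, 1
        yield f"{current} {following}", 0
        current = following

    # The last word has no successor, so only the word itself is emitted.
    yield current, 1


if __name__ == "__main__":
    for term, increment in with_next_word("Arun Kumar".split()):
        print(term, increment)
```

The key point is that the filter has to buffer one token: it cannot emit the combined term until it has read the next word, and it must still emit that next word afterwards as a token of its own.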

Can you use reduced sized test indexes to predict performance gains for a larger index?

2010-02-12 Thread Chris Harris
I'd like to try some experiments to see if I can improve search performance by changing analysis (e.g. adding/removing word bigrams or commongrams), or by changing how I map my source records into Lucene documents. The problem is that my index currently is about 1TB in size and takes about 2-3 week

Re: read more tokens during analysis

2010-02-12 Thread Rohit Banga
Thanks, will try the code and get back if I have any problems.

Rohit Banga

On Fri, Feb 12, 2010 at 10:38 PM, Ahmet Arslan wrote:
> > I want to consider the current word & the next as a single term.
> >
> > When analyzing "Arun Kumar"
> >
> > I want my analyzer to consider "Arun", "Arun