RE: n-gram indexing

2005-08-01 Thread Chris Hostetter
: Hi Chris, : The method you suggested is definitely a good solution. However, there is one more reason I would like to do n-gram generation at indexing time.

RE: n-gram indexing

2005-08-01 Thread Rajesh Munavalli
There might be other ways to do this that I am not aware of. Let me know your thoughts on this. I would really appreciate any suggestions you might have. Thanks, Rajesh Munavalli

RE: n-gram indexing

2005-07-29 Thread Chris Hostetter
stions like this... : Your intuition is right, but I can't think of any reason why you need to add the n-grams at indexing time -- or why using phrase q

RE: n-gram indexing

2005-07-29 Thread Rajesh Munavalli
I was wondering how Lucene's phrase query would work in the case of n-gram indexing. There are two scenarios for position increments when adding n-grams to the index. For example, consider the tri-grams of "united states of america". Scenario 1: Index position token 0
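The snippet above is cut off before the two scenarios are listed, so the following is only a rough sketch of how position increments turn into absolute positions, not a reconstruction of the poster's scenarios. It is plain Python rather than Lucene code. The two streams (`consecutive` and `stacked`) and the helper name `positions_from_increments` are assumptions made up for illustration. The one Lucene-specific fact relied on is that a token's position is the running sum of position increments, so an increment of 0 places a token at the same position as the previous one.

```python
from typing import List, Tuple

def positions_from_increments(stream: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Turn (term, position_increment) pairs into (term, absolute_position) pairs."""
    position = -1  # the first token with increment 1 lands on position 0
    placed = []
    for term, increment in stream:
        position += increment
        placed.append((term, position))
    return placed

# Hypothetical scheme A: index only the tri-grams, each one step after the last.
consecutive = [
    ("united states of", 1),
    ("states of america", 1),
]

# Hypothetical scheme B: index single words and stack each tri-gram
# (increment 0) on the word it starts with.
stacked = [
    ("united", 1),
    ("united states of", 0),
    ("states", 1),
    ("states of america", 0),
    ("of", 1),
    ("america", 1),
]

for term, position in positions_from_increments(stacked):
    print(position, term)
```

Under scheme B the single words keep the positions 0 to 3 they would have without any n-grams, so the distance between words that a phrase query sees is unchanged. Under scheme A positions count tri-grams rather than words.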

Re: n-gram indexing

2005-07-24 Thread Sebastian Marius Kirsch
Hi Rajeev, I wrote a filter for generating n-grams a while back; I intended to use it for statistics, but I guess you can also use it for search. I also thought of the "boosting effect" you describe when I implemented it, though I never actually tried whether it works that way. It's in the Lucene

RE: n-gram indexing

2005-07-24 Thread Dave Kor
Quoting Rajesh Munavalli <[EMAIL PROTECTED]>: > Let me explain a scenario where I would need to add the n-grams at > indexing time. I see your point and I do agree. As it stands, Lucene does not innately support n-gram indexing. However it is not impossible to adapt Lucene to serve

RE: n-gram indexing

2005-07-19 Thread Rajesh Munavalli
se queries. I am not sure if there is a better way to achieve the same effect. Thanks, Rajesh

Re: n-gram indexing

2005-07-18 Thread Chris Lamprecht
d without using phrase queries. I am not sure if there is a better way to achieve the same effect. Thanks, Rajesh

Re: n-gram indexing

2005-07-18 Thread Andy Roberts
On Monday 18 Jul 2005 22:06, Rajesh Munavalli wrote: > The intuition behind adding n-grams is to boost naturally occurring larger phrases versus using phrase queries. For example, if I am searching for "united states of america", I want the search results to return the documents ordered as follows

RE: n-gram indexing

2005-07-18 Thread Chris Hostetter
e same effect. Thanks, Rajesh

RE: n-gram indexing

2005-07-18 Thread Rajesh Munavalli
On Monday 18 Jul 2005 21:27, Rajesh Munavalli wrote: > At what point do I add n-grams? Does the order in which I add n-grams affec

Re: n-gram indexing

2005-07-18 Thread Andy Roberts
On Monday 18 Jul 2005 21:27, Rajesh Munavalli wrote: > At what point do I add n-grams? Does the order in which I add n-grams affect exact phrase queries later? My questions are: (1) Should I add all the 1-grams, followed by 2-grams, followed by 3-grams, etc., sentence by sentence, OR (2) Add

n-gram indexing

2005-07-18 Thread Rajesh Munavalli
At what point do I add n-grams? Does the order in which I add n-grams affect exact phrase queries later? My questions are: (1) Should I add all the 1-grams, followed by 2-grams, followed by 3-grams, etc., sentence by sentence, OR (2) add all the 1-grams of the entire document first before starting 2-grams
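The two orderings in the question can be compared with a small sketch. This is plain Python and does not use Lucene. The function names, the choice not to let n-grams cross sentence boundaries, and the convention of recording each n-gram at the position of its first word are all assumptions for illustration. It shows only that both orderings yield the same (position, n-gram) pairs in a different emission order. How an index treats that order depends on how positions are assigned when the tokens are added.

```python
from typing import Iterator, List, Tuple

Entry = Tuple[int, str]  # (position of the n-gram's first word, n-gram text)

def word_ngrams(tokens: List[str], n: int, offset: int = 0) -> Iterator[Entry]:
    """Yield every run of n consecutive words with its document-level position."""
    for start in range(len(tokens) - n + 1):
        yield offset + start, " ".join(tokens[start:start + n])

def sentence_offsets(sentences: List[List[str]]) -> List[int]:
    """Position of the first word of each sentence within the document."""
    offsets, total = [], 0
    for tokens in sentences:
        offsets.append(total)
        total += len(tokens)
    return offsets

def per_sentence(sentences: List[List[str]], max_n: int) -> List[Entry]:
    """Option (1): 1-grams, 2-grams, ... of one sentence, then the next sentence."""
    out: List[Entry] = []
    for tokens, offset in zip(sentences, sentence_offsets(sentences)):
        for n in range(1, max_n + 1):
            out.extend(word_ngrams(tokens, n, offset))
    return out

def per_document(sentences: List[List[str]], max_n: int) -> List[Entry]:
    """Option (2): all 1-grams of the document, then all 2-grams, ..."""
    out: List[Entry] = []
    offsets = sentence_offsets(sentences)
    for n in range(1, max_n + 1):
        for tokens, offset in zip(sentences, offsets):
            out.extend(word_ngrams(tokens, n, offset))
    return out

doc = [["united", "states", "of", "america"], ["new", "york"]]
first = per_sentence(doc, 2)
second = per_document(doc, 2)
print(first == second)                  # False: the emission order differs
print(sorted(first) == sorted(second))  # True: same (position, n-gram) pairs
```

If positions are taken from explicit values like these, the two options index the same information. If positions are instead derived from the order in which tokens are emitted, the two options would record different positions.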