Phrase indexing and searching

2013-12-18 Thread Manjula Wijewickrema
Dear list, My Lucene programme can index single words and retrieve the corpus documents that best match an input document (based on term frequencies). Now I want to index two-word phrases and retrieve the matching corpus documents (based on phrase frequencies) for the input do

Phrase indexing and searching

2013-12-22 Thread Manjula Wijewickrema
Dear All, My Lucene programme can index single words and retrieve the corpus documents that best match an input document (based on term frequencies). Now I want to index two-word phrases and retrieve the matching corpus documents (based on phrase frequencies) for the input doc

Re: Phrase indexing and searching

2013-12-23 Thread Steve Rowe
Hi Manjula, Sounds like ShingleFilter will do what you want: < http://lucene.apache.org/core/4_6_0/analyzers-common/org/apache/lucene/analysis/shingle/ShingleFilter.html > Steve www.lucidworks.com On Dec 22, 2013 11:25 PM, "Manjula Wijewickrema" wrote: > Dear All, > > My Lucene programme is abl
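
For readers unfamiliar with the term, a "shingle" is a token built from adjacent words, so a two-word shingle is exactly the two-word phrase asked about. The following is a minimal plain-Java sketch of that idea and does not call Lucene at all; the class name ShingleSketch and the method twoWordShingles are hypothetical names chosen for illustration, and the tokenisation (lower-casing, splitting on non-alphanumeric characters) is an assumption that may differ from whatever analyzer is actually in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: shows what two-word shingles look like.
// This is NOT the Lucene ShingleFilter API.
class ShingleSketch {

    // Hypothetical helper: lower-case the text, split it into words,
    // and join each pair of adjacent words into one "phrase term".
    static List<String> twoWordShingles(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        List<String> shingles = new ArrayList<>();
        for (int i = 0; i + 1 < words.size(); i++) {
            shingles.add(words.get(i) + " " + words.get(i + 1));
        }
        return shingles;
    }

    public static void main(String[] args) {
        // Prints the adjacent word pairs of the thread subject.
        System.out.println(twoWordShingles("Phrase indexing and searching"));
    }
}
```

If tokens like these are indexed as terms, phrase frequencies can be counted the same way term frequencies are counted for single words.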

Re: Phrase indexing and searching

2013-12-23 Thread Manjula Wijewickrema
Hi Steve, Thanks for the reply. Could you please let me know how to embed ShingleFilter in the code for both indexing and searching? Different people have suggested different code snippets, and none of them did the job. Thanks, Manjula. On Mon, Dec 23, 2013 at 8:42 PM, Steve Rowe wro
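
The point behind "both indexing and searching" is that the input document and the corpus documents must be broken into phrases the same way, otherwise the phrase terms never line up. The sketch below shows that symmetry in plain Java, without Lucene: the same hypothetical bigramCounts method is applied to both sides, and documents are ranked by a simple dot product of phrase counts. The class name, method names, sample texts, and the scoring formula are all assumptions made for illustration; Lucene's own scoring is different.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustration only: phrase-frequency matching with the same
// two-word phrase extraction on the "index" and "search" side.
class PhraseFrequencySketch {

    // Count every pair of adjacent words in the text.
    static Map<String, Integer> bigramCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        String previous = null;
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (word.isEmpty()) {
                continue;
            }
            if (previous != null) {
                counts.merge(previous + " " + word, 1, Integer::sum);
            }
            previous = word;
        }
        return counts;
    }

    // Assumed score: dot product of phrase counts (not Lucene's formula).
    static int score(Map<String, Integer> input, Map<String, Integer> doc) {
        int total = 0;
        for (Map.Entry<String, Integer> e : input.entrySet()) {
            total += e.getValue() * doc.getOrDefault(e.getKey(), 0);
        }
        return total;
    }

    // Name of the highest-scoring document; the first one in
    // iteration order wins on ties.
    static String bestMatch(String inputText, Map<String, String> corpus) {
        Map<String, Integer> input = bigramCounts(inputText);
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, String> doc : corpus.entrySet()) {
            int s = score(input, bigramCounts(doc.getValue()));
            if (s > bestScore) {
                bestScore = s;
                best = doc.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up corpus, for illustration only.
        Map<String, String> corpus = new LinkedHashMap<>();
        corpus.put("docA", "the sun is bright and the sun is hot");
        corpus.put("docB", "sirius is bright the sun is far");
        System.out.println(bestMatch("the sun is bright", corpus));
    }
}
```

In a real Lucene setup the equivalent requirement would be to use the same shingle-producing analysis chain when writing the index and when building the query.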

Phrase indexing and searching with Lucene

2009-02-18 Thread Nada Mimouni
Hello everybody, In my research work, I use Lucene to index and search text documents. At present, I index and search only single words. I want to extend this to phrases (or n-grams). Could anyone please give me more details on how to do it and also point me to some useful references o

Phrase indexing and searching with Lucene

2009-02-18 Thread Nada Mimouni
Hello everybody, I use Lucene to index and search text documents. At present, I index and search only single words. I want to extend this to phrases (or n-grams). Could anyone please give me details on how to index phrases and then make a phrase search? Thank you very much in advanc

Re: Phrase indexing and searching with Lucene

2009-02-18 Thread Erick Erickson
Have you tried the built-in phrase processing with double quotes? e.g. "this is a phrase"? See the Term section at http://lucene.apache.org/java/2_4_0/queryparsersyntax.html Best Erick On Wed, Feb 18, 2009 at 5:57 AM, Nada Mimouni < mimo...@tk.informatik.tu-darmstadt.de> wrote: > > > Hello ever

RE: Phrase indexing and searching with Lucene

2009-02-18 Thread Nada Mimouni
.com] Sent: Wed 2/18/2009 2:10 PM To: java-user@lucene.apache.org Subject: Re: Phrase indexing and searching with Lucene Have you tried the built-in phrase processing with double quotes? e.g. "this is a phrase"? See the Term section at http://lucene.apache.org/java/2_4_0/queryparsers

Re: Phrase indexing and searching with Lucene

2009-02-18 Thread Erick Erickson
> What I need to do: > - Index phrases ("multi" words) in addition to terms (single words) > - Search for both : phrases and terms > > > Is there any idea on how to proceed? > > Regards > Nada > > > -Original Message- > From: Erick Erickson [mailto

RE: Phrase indexing and searching with Lucene

2009-02-19 Thread Nada Mimouni
kson [mailto:erickerick...@gmail.com] Sent: Wed 2/18/2009 3:24 PM To: java-user@lucene.apache.org Subject: Re: Phrase indexing and searching with Lucene I'm still not clear why the built-in phrase query syntax won't work. If I index the following terms (erick, erickson, thinks, small, th

Re: Phrase indexing and searching with Lucene

2009-02-19 Thread Erick Erickson
> 898 7567 1 (relevant) > > > In this example, Lucene matches document 7567 as relevant to the query > (since it contains all query terms); however, "bright" here refers to > Sirius (what we need is to get "sun bright"). > > > > > Best >
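
The difference described in the quoted example, between a document that merely contains all query terms and one that contains them as a phrase, can be sketched in plain Java as follows. This is not Lucene code: the class PhraseMatchSketch, its methods, and the sample sentences are made up for illustration. It mimics an exact phrase match, where the words must appear next to each other and in order, which is how a double-quoted phrase behaves in the Lucene query parser when no slop is given.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustration only: term match versus exact phrase match.
class PhraseMatchSketch {

    // Assumed tokenisation: lower-case, split on non-alphanumerics.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                out.add(w);
            }
        }
        return out;
    }

    // True if every query word occurs somewhere in the document.
    static boolean containsAllTerms(String doc, String query) {
        return tokens(doc).containsAll(tokens(query));
    }

    // True only if the phrase words occur adjacent and in order.
    static boolean containsPhrase(String doc, String phrase) {
        return Collections.indexOfSubList(tokens(doc), tokens(phrase)) >= 0;
    }

    public static void main(String[] args) {
        // Made-up sentence in which "bright" describes Sirius, not the sun.
        String doc = "sirius is bright and the sun is far away";
        System.out.println(containsAllTerms(doc, "sun bright")); // all terms present
        System.out.println(containsPhrase(doc, "sun bright"));   // but not as a phrase
    }
}
```

A term-only query accepts the sample sentence, while the phrase check rejects it, which is the behaviour the original poster is asking for.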

Re: Phrase indexing and searching with Lucene

2009-02-23 Thread Chris Hostetter
: Subject: Phrase indexing and searching with Lucene : References: <499be497.2060...@mapmyindia.com> : <18248.30864...@web26005.mail.ukl.yahoo.com> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing lis