I'd assumed that FuzzyQuery wouldn't ignore case, but I could be wrong. What would be the edit distance between "mer" and "merlot"? Would it be less than 1.5, which I reckon would be the value of length(term)*0.5 as detailed in the javadocs? Seems unlikely, but I don't really know anything about the Levenshtein (edit distance) algorithm as used by FuzzyQuery. Wouldn't a PrefixQuery be more appropriate here?
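For what it's worth, the distance itself is easy to check without Lucene. Below is a minimal sketch in plain Java, assuming the textbook Levenshtein definition (insert, delete and substitute each cost 1). The class and method names are made up for illustration, and this is not FuzzyQuery's actual code, which may well score matches differently:

```java
// Hypothetical helper, not part of Lucene: textbook Levenshtein distance.
final class EditDistanceSketch {

    // Returns the minimum number of single-character inserts, deletes
    // and substitutions needed to turn a into b (case-sensitive).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty string into the first j chars of b takes j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String term = "mer";
        int distance = levenshtein(term, "merlot");
        // Threshold as quoted from the javadocs above: length(term) * 0.5
        double threshold = term.length() * 0.5;
        System.out.println("distance  = " + distance);   // 3 (three inserts: l, o, t)
        System.out.println("threshold = " + threshold);  // 1.5
        System.out.println("Mer/merlot = " + levenshtein("Mer", "merlot")); // 4, case counts here
    }
}
```

If that textbook definition is anywhere close to what FuzzyQuery uses, a distance of 3 is well above 1.5, so "mer" would not be expected to match "merlot" whatever the casing.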
--
Ian.

On Tue, May 3, 2011 at 10:10 AM, Clemens Wyss <clemens...@mysign.ch> wrote:
> Unfortunately lowercasing doesn't help.
> Also, doesn't the FuzzyQuery ignore casing?
>
>> -----Original Message-----
>> From: Ian Lea [mailto:ian....@gmail.com]
>> Sent: Tuesday, May 3, 2011 11:06
>> To: java-user@lucene.apache.org
>> Subject: Re: "fuzzy prefix" search
>>
>> Mer != mer. The latter will be what is indexed because StandardAnalyzer
>> calls LowerCaseFilter.
>>
>> --
>> Ian.
>>
>>
>> On Tue, May 3, 2011 at 9:56 AM, Clemens Wyss <clemens...@mysign.ch> wrote:
>> > Sorry for coming back to my issue. Can anybody explain why my "simple"
>> > unit test below fails? Any hint/help appreciated.
>> >
>> > Directory directory = new RAMDirectory();
>> > IndexWriter indexWriter = new IndexWriter( directory,
>> >     new StandardAnalyzer( Version.LUCENE_31 ),
>> >     IndexWriter.MaxFieldLength.UNLIMITED );
>> > Document document = new Document();
>> > document.add( new Field( "test", "Merlot", Field.Store.YES, Field.Index.ANALYZED ) );
>> > indexWriter.addDocument( document );
>> > IndexReader indexReader = indexWriter.getReader();
>> > IndexSearcher searcher = new IndexSearcher( indexReader );
>> > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f, 0, 10 );
>> > // or Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f);
>> > TopDocs result = searcher.search( q, 10 );
>> > Assert.assertEquals( 1, result.totalHits );
>> >
>> > - Clemens
>> >
>> >> -----Original Message-----
>> >> From: Clemens Wyss [mailto:clemens...@mysign.ch]
>> >> Sent: Monday, May 2, 2011 23:01
>> >> To: java-user@lucene.apache.org
>> >> Subject: RE: "fuzzy prefix" search
>> >>
>> >> Is it the combination of FuzzyQuery and Term which makes the search
>> >> go for "word boundaries"?
>> >>
>> >> > -----Original Message-----
>> >> > From: Clemens Wyss [mailto:clemens...@mysign.ch]
>> >> > Sent: Monday, May 2, 2011 14:13
>> >> > To: java-user@lucene.apache.org
>> >> > Subject: RE: "fuzzy prefix" search
>> >> >
>> >> > I tried this too, but unfortunately I only get hits when the search
>> >> > term is at least as long as the word to be looked up.
>> >> >
>> >> > E.g.:
>> >> > ...
>> >> > Directory directory = new RAMDirectory();
>> >> > IndexWriter indexWriter = new IndexWriter( directory,
>> >> >     IndexManager.getIndexingAnalyzer( LOCALE_DE ),
>> >> >     IndexWriter.MaxFieldLength.UNLIMITED );
>> >> >
>> >> > Document document = new Document();
>> >> > document.add( new Field( "test", "Merlot", Field.Store.YES, Field.Index.ANALYZED ) );
>> >> > indexWriter.addDocument( document );
>> >> >
>> >> > IndexReader indexReader = indexWriter.getReader();
>> >> > IndexSearcher searcher = new IndexSearcher( indexReader );
>> >> >
>> >> > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.6f, 1 );
>> >> > TopDocs result = searcher.search( q, 10 );
>> >> > Assert.assertEquals( 1, result.totalHits );
>> >> > ...
>> >> >
>> >> > > -----Original Message-----
>> >> > > From: Uwe Schindler [mailto:u...@thetaphi.de]
>> >> > > Sent: Monday, May 2, 2011 13:50
>> >> > > To: java-user@lucene.apache.org
>> >> > > Subject: RE: "fuzzy prefix" search
>> >> > >
>> >> > > Hi,
>> >> > >
>> >> > > You can pass an integer to FuzzyQuery which defines the number of
>> >> > > characters that are seen as prefix. So all terms must match this
>> >> > > prefix and the rest of each term is matched using fuzzy.
>> >> > >
>> >> > > Uwe
>> >> > >
>> >> > > -----
>> >> > > Uwe Schindler
>> >> > > H.-H.-Meier-Allee 63, D-28213 Bremen
>> >> > > http://www.thetaphi.de
>> >> > > eMail: u...@thetaphi.de
>> >> > >
>> >> > > > -----Original Message-----
>> >> > > > From: Clemens Wyss [mailto:clemens...@mysign.ch]
>> >> > > > Sent: Monday, May 02, 2011 1:47 PM
>> >> > > > To: java-user@lucene.apache.org
>> >> > > > Subject: "fuzzy prefix" search
>> >> > > >
>> >> > > > I'd like to search fuzzily but not on a full term.
>> >> > > > E.g.
>> >> > > > I have a text "Merlot del Ticino"
>> >> > > > I'd like
>> >> > > > "mer", "merr", "melo", ... to match.
>> >> > > >
>> >> > > > If I use FuzzyQuery only "merlot", "merlott" hit. What
>> >> > > > Query-combination should I use?
>> >> > > >
>> >> > > > Thx
>> >> > > > Clemens
>> >> > > >
>> >> > > > ---------------------------------------------------------------------
>> >> > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> >> > > > For additional commands, e-mail: java-user-h...@lucene.apache.org