> PrefixQuery

I'd like the combination of prefix and fuzzy ;-) because people could also type "menlo" or "märl", and in any of these cases I'd like to get a hit on Merlot (for suggesting Merlot).
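
To make the idea concrete, here is a rough plain-Java sketch of what I mean by "fuzzy prefix". It does not use any Lucene API; the class FuzzyPrefixSketch and its methods are made-up names for illustration only. It compares the typed input against every prefix of a candidate term and keeps the smallest Levenshtein distance, whereas a whole-term comparison of "mer" and "merlot" costs three edits. Both sides are lowercased, on the assumption that the indexed terms went through a lowercasing analyzer.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: illustrates "fuzzy prefix" matching.
class FuzzyPrefixSketch {

    // Last row of the Levenshtein table: row[j] = distance(a, first j chars of b).
    static int[] lastRow(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev;
    }

    // Plain edit distance between two whole strings.
    static int editDistance(String a, String b) {
        return lastRow(a, b)[b.length()];
    }

    // Smallest edit distance between the input and any prefix of the term.
    static int prefixEditDistance(String input, String term) {
        int best = Integer.MAX_VALUE;
        for (int d : lastRow(input, term)) {
            best = Math.min(best, d);
        }
        return best;
    }

    // True if the input is within maxEdits of some prefix of the term.
    static boolean suggests(String input, String term, int maxEdits) {
        String in = input.toLowerCase(Locale.ROOT);
        String t = term.toLowerCase(Locale.ROOT);
        return prefixEditDistance(in, t) <= maxEdits;
    }

    public static void main(String[] args) {
        // "m\u00e4rl" is "märl" written with a Unicode escape.
        String[] inputs = { "Mer", "merr", "melo", "menlo", "m\u00e4rl" };
        for (String input : inputs) {
            System.out.println(input + " -> " + suggests(input, "Merlot", 1));
        }
    }
}
```

With at most one edit allowed, all five inputs above come out as a match for "Merlot". Whether something equivalent can be expressed with the existing query classes is exactly what I am trying to find out.
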
> -----Original Message-----
> From: Ian Lea [mailto:[email protected]]
> Sent: Tuesday, May 3, 2011 11:22
> To: [email protected]
> Subject: Re: "fuzzy prefix" search
>
> I'd assumed that FuzzyQuery wouldn't ignore case but I could be wrong.
> What would be the edit distance between "mer" and "merlot"? Would it be
> less than 1.5 which I reckon would be the value of length(term)*0.5 as
> detailed in the javadocs? Seems unlikely, but I don't really know anything
> about the Levenshtein (edit distance) algorithm as used by FuzzyQuery.
> Wouldn't a PrefixQuery be more appropriate here?
>
>
> --
> Ian.
>
> On Tue, May 3, 2011 at 10:10 AM, Clemens Wyss <[email protected]>
> wrote:
> > Unfortunately lowercasing doesn't help.
> > Also, doesn't the FuzzyQuery ignore casing?
> >
> >> -----Original Message-----
> >> From: Ian Lea [mailto:[email protected]]
> >> Sent: Tuesday, May 3, 2011 11:06
> >> To: [email protected]
> >> Subject: Re: "fuzzy prefix" search
> >>
> >> Mer != mer. The latter will be what is indexed because
> >> StandardAnalyzer calls LowerCaseFilter.
> >>
> >> --
> >> Ian.
> >>
> >>
> >> On Tue, May 3, 2011 at 9:56 AM, Clemens Wyss <[email protected]>
> >> wrote:
> >> > Sorry for coming back to my issue. Can anybody explain why my
> >> > "simple" unit test below fails? Any hint/help appreciated.
> >> >
> >> > Directory directory = new RAMDirectory();
> >> > IndexWriter indexWriter = new IndexWriter( directory,
> >> >     new StandardAnalyzer( Version.LUCENE_31 ),
> >> >     IndexWriter.MaxFieldLength.UNLIMITED );
> >> > Document document = new Document();
> >> > document.add( new Field( "test", "Merlot",
> >> >     Field.Store.YES, Field.Index.ANALYZED ) );
> >> > indexWriter.addDocument( document );
> >> > IndexReader indexReader = indexWriter.getReader();
> >> > IndexSearcher searcher = new IndexSearcher( indexReader );
> >> > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f, 0, 10 );
> >> > // or Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f);
> >> > TopDocs result = searcher.search( q, 10 );
> >> > Assert.assertEquals( 1, result.totalHits );
> >> >
> >> > - Clemens
> >> >
> >> >> -----Original Message-----
> >> >> From: Clemens Wyss [mailto:[email protected]]
> >> >> Sent: Monday, May 2, 2011 23:01
> >> >> To: [email protected]
> >> >> Subject: AW: "fuzzy prefix" search
> >> >>
> >> >> Is it the combination of FuzzyQuery and Term which makes the
> >> >> search go for "word boundaries"?
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: Clemens Wyss [mailto:[email protected]]
> >> >> > Sent: Monday, May 2, 2011 14:13
> >> >> > To: [email protected]
> >> >> > Subject: AW: "fuzzy prefix" search
> >> >> >
> >> >> > I tried this too, but unfortunately I only get hits when the
> >> >> > search term is at least as long as the word to be looked up.
> >> >> >
> >> >> > E.g.:
> >> >> > ...
> >> >> > Directory directory = new RAMDirectory();
> >> >> > IndexWriter indexWriter = new IndexWriter( directory,
> >> >> >     IndexManager.getIndexingAnalyzer( LOCALE_DE ),
> >> >> >     IndexWriter.MaxFieldLength.UNLIMITED );
> >> >> >
> >> >> > Document document = new Document();
> >> >> > document.add( new Field( "test", "Merlot",
> >> >> >     Field.Store.YES, Field.Index.ANALYZED ) );
> >> >> > indexWriter.addDocument( document );
> >> >> >
> >> >> > IndexReader indexReader = indexWriter.getReader();
> >> >> > IndexSearcher searcher = new IndexSearcher( indexReader );
> >> >> >
> >> >> > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.6f, 1 );
> >> >> > TopDocs result = searcher.search( q, 10 );
> >> >> > Assert.assertEquals( 1, result.totalHits );
> >> >> > ...
> >> >> >
> >> >> > > -----Original Message-----
> >> >> > > From: Uwe Schindler [mailto:[email protected]]
> >> >> > > Sent: Monday, May 2, 2011 13:50
> >> >> > > To: [email protected]
> >> >> > > Subject: RE: "fuzzy prefix" search
> >> >> > >
> >> >> > > Hi,
> >> >> > >
> >> >> > > You can pass an integer to FuzzyQuery which defines the number
> >> >> > > of characters that are seen as prefix. So all terms must match
> >> >> > > this prefix and the rest of each term is matched using fuzzy.
> >> >> > >
> >> >> > > Uwe
> >> >> > >
> >> >> > > -----
> >> >> > > Uwe Schindler
> >> >> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> >> >> > > http://www.thetaphi.de
> >> >> > > eMail: [email protected]
> >> >> > >
> >> >> > > > -----Original Message-----
> >> >> > > > From: Clemens Wyss [mailto:[email protected]]
> >> >> > > > Sent: Monday, May 02, 2011 1:47 PM
> >> >> > > > To: [email protected]
> >> >> > > > Subject: "fuzzy prefix" search
> >> >> > > >
> >> >> > > > I'd like to search fuzzily but not on a full term.
> >> >> > > > E.g.
> >> >> > > > I have a text "Merlot del Ticino"
> >> >> > > > I'd like
> >> >> > > > "mer", "merr", "melo", ... to match.
> >> >> > > >
> >> >> > > > If I use FuzzyQuery only "merlot", "merlott" hit. What
> >> >> > > > Query-combination should I use?
> >> >> > > >
> >> >> > > > Thx
> >> >> > > > Clemens

--------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
