Thanks for your reply, Mark.

This was my original code for constructing my query using FuzzyQuery:

    BooleanQuery query = new BooleanQuery();
    if (artist.length() > 0) {
        FuzzyQuery artist_query = new FuzzyQuery(new Term("artist", artist));
        query.add(artist_query, BooleanClause.Occur.MUST);
    }
    if (song.length() > 0) {
        FuzzyQuery song_query = new FuzzyQuery(new Term("song", song));
        query.add(song_query, BooleanClause.Occur.MUST);
    }

This is my first attempt to use FuzzyLikeThisQuery (with no success):

    FuzzyLikeThisQuery query = new FuzzyLikeThisQuery(2, new SimpleAnalyzer());
    if (artist.length() > 0) {
        query.addTerms(artist, "artist", 0.5f, 0);
    }
    if (song.length() > 0) {
        query.addTerms(song, "song", 0.5f, 0);
    }

This is my second attempt to use FuzzyLikeThisQuery (with no success):

    BooleanQuery query = new BooleanQuery();
    if (artist.length() > 0) {
        FuzzyLikeThisQuery artist_query = new FuzzyLikeThisQuery(1, new SimpleAnalyzer());
        artist_query.addTerms(artist, "artist", 0.5f, 0);
        query.add(artist_query, BooleanClause.Occur.MUST);
    }
    if (song.length() > 0) {
        FuzzyLikeThisQuery song_query = new FuzzyLikeThisQuery(1, new SimpleAnalyzer());
        song_query.addTerms(song, "song", 0.5f, 0);
        query.add(song_query, BooleanClause.Occur.MUST);
    }

I think it's my lack of understanding of how to use FuzzyLikeThisQuery
that is giving me irrelevant results.

Could you tell me what's wrong here, please?

Thank you.

On Mon, 2008-06-23 at 11:28 +0000, mark harwood wrote:
> >>I do have serious problems with the relevance of the results with fuzzy
> >>queries.
>
> Please take the time to read my response here:
>
> http://www.gossamer-threads.com/lists/lucene/java-user/62050#62050
>
> I had a work colleague come up with exactly the same problem this week and
> the solution is the same.
>
> Just tested my index with a standard Lucene FuzzyQuery for "Paul~" - this
> gives "Phul", "Saul", and "Paulo" before ANY "Paul" records due to IDF issues.
> Using FuzzyLikeThisQuery puts all the "Paul" records ahead of the variants.
>
> ----- Original Message ----
> From: László Monda <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Cc: [EMAIL PROTECTED]
> Sent: Monday, 23 June, 2008 12:10:05 PM
> Subject: Re: Getting irrelevant results using fuzzy query
>
> On Wed, 2008-06-18 at 21:10 +0200, Daniel Naber wrote:
> > On Wednesday, 18 June 2008, László Monda wrote:
> >
> > > Additional info: Lucene seems to do the right thing when only few
> > > documents are present, but goes crazy when there is about 1.5 million
> > > documents in the index.
> >
> > Lucene works well with more documents (currently using it with 9 million),
> > but the fuzzy query requires iteration over all terms which makes this
> > query slow. This can be avoided by setting the prefixLength parameter of the
> > FuzzyQuery constructor to 1 or 2. Or maybe you should use an n-gram index,
> > see the spellchecker in the contrib area.
>
> Thanks for the suggestion, but I don't have any performance problems
> yet, but I do have serious problems with the relevance of the results
> with fuzzy queries.
>

-- 
Laci <http://monda.hu>
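
A note on the "IDF issues" Mark mentions in the quoted reply: the effect can be illustrated without Lucene at all. The sketch below is only a rough illustration, assuming the classic Lucene similarity formula idf = 1 + ln(numDocs / (docFreq + 1)). The class name `IdfSketch` and the document frequencies are hypothetical values invented for the example; only the 1.5 million index size comes from this thread. It ignores every other scoring factor, so it shows the tendency, not the actual scores.

```java
public class IdfSketch {
    // Classic Lucene-style inverse document frequency:
    // idf = 1 + ln(numDocs / (docFreq + 1)).
    // The rarer a term is in the index, the larger its idf.
    public static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        int numDocs = 1_500_000; // index size mentioned in the thread

        // Hypothetical document frequencies, invented for illustration:
        // the correctly spelled name is common, the variants are rare.
        String[] terms = {"paul", "paulo", "saul", "phul"};
        int[] docFreqs = {20_000, 300, 150, 2};

        for (int i = 0; i < terms.length; i++) {
            System.out.printf("%-6s docFreq=%-6d idf=%.2f%n",
                    terms[i], docFreqs[i], idf(docFreqs[i], numDocs));
        }
    }
}
```

Because a fuzzy query expands to every term within the edit-distance threshold and each expanded term is weighted by its own idf, a rare misspelling can outscore the common exact match. That is consistent with Mark's observation that "Phul" ranks ahead of "Paul" with a plain FuzzyQuery.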