Using Lucene 3.5, I created a query parser based on the dismax parser, but in order to get matches on misspellings etc. I additionally do a fuzzy search and a wildcard search:

http://svn.musicbrainz.org/search_server/trunk/servlet/src/main/java/org/musicbrainz/search/servlet/DismaxQueryParser.java

So a search for 'echo bunneymen' searches over three fields (alias, sortname, artist) and becomes one disjunction search per term across these fields, plus a phrase search:

custom(+((
alias:echo~0.5^0.71999997 | alias:echo*^0.71999997 | alias:echo^0.9
| sortname:echo~0.5^0.88000005 | sortname:echo*^0.88000005 | sortname:echo^1.1
| artist:echo~0.5^1.04 | artist:echo*^1.04 | artist:echo^1.3)~0.1
 (
alias:bunneymen~0.5^0.71999997 | alias:bunneymen*^0.71999997 | alias:bunneymen^0.9
| sortname:bunneymen~0.5^0.88000005 | sortname:bunneymen*^0.88000005 | sortname:bunneymen^1.1
| artist:bunneymen~0.5^1.04 | artist:bunneymen*^1.04 | artist:bunneymen^1.3)~0.1)
 (alias:"echo bunneymen"^0.2 | sortname:"echo bunneymen"^0.2 | artist:"echo bunneymen"^0.2)~0.1)
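
To make the shape of that expansion easier to follow, here is a rough, Lucene-free sketch that rebuilds the same query string for any input. It is not the code in DismaxQueryParser.java. The class and method names are made up for illustration, and the 0.8 factor between the exact boost and the fuzzy/wildcard boost is only inferred from the numbers in the dump above (0.9 -> 0.72, 1.1 -> 0.88, 1.3 -> 1.04).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: builds the textual form of the expanded query.
class DismaxExpansionSketch {

    // Exact-match boost per field, copied from the query dump.
    static final Map<String, Float> FIELD_BOOSTS = new LinkedHashMap<>();
    static {
        FIELD_BOOSTS.put("alias", 0.9f);
        FIELD_BOOSTS.put("sortname", 1.1f);
        FIELD_BOOSTS.put("artist", 1.3f);
    }

    // Assumption: fuzzy and wildcard clauses get 0.8 x the exact boost.
    static final float INEXACT_FACTOR = 0.8f;
    static final String MIN_SIMILARITY = "0.5"; // the ~0.5 on fuzzy clauses
    static final String TIE_BREAKER = "0.1";    // the ~0.1 on each disjunction
    static final String PHRASE_BOOST = "0.2";

    /** Fuzzy, wildcard (prefix) and exact clause for one term, in every field. */
    static List<String> termClauses(String term) {
        List<String> clauses = new ArrayList<>();
        for (Map.Entry<String, Float> e : FIELD_BOOSTS.entrySet()) {
            String field = e.getKey();
            float exact = e.getValue();
            float inexact = exact * INEXACT_FACTOR;
            clauses.add(field + ":" + term + "~" + MIN_SIMILARITY + "^" + inexact);
            clauses.add(field + ":" + term + "*^" + inexact);
            clauses.add(field + ":" + term + "^" + exact);
        }
        return clauses;
    }

    /** Joins clauses the way a disjunction-max query prints: (a | b | c)~tie. */
    static String disjunction(List<String> clauses) {
        return "(" + String.join(" | ", clauses) + ")~" + TIE_BREAKER;
    }

    /** One disjunction per term, plus one phrase disjunction over all fields. */
    static String expand(String userQuery) {
        String trimmed = userQuery.trim();
        StringBuilder perTerm = new StringBuilder();
        for (String term : trimmed.split("\\s+")) {
            if (perTerm.length() > 0) {
                perTerm.append(' ');
            }
            perTerm.append(disjunction(termClauses(term)));
        }
        List<String> phrases = new ArrayList<>();
        for (String field : FIELD_BOOSTS.keySet()) {
            phrases.add(field + ":\"" + trimmed + "\"^" + PHRASE_BOOST);
        }
        return "custom(+(" + perTerm + ") " + disjunction(phrases) + ")";
    }

    public static void main(String[] args) {
        System.out.println(expand("echo bunneymen"));
    }
}
```

The point of the sketch is the clause count: every query term costs nine clauses (three per field), and six of those nine are multi-term queries (fuzzy or wildcard) that have to be expanded against the term dictionary.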

This query gives me exactly the results and scoring that I want; the trouble is that it's TOO SLOW.

As recommended, I tried using a different rewrite method, new MultiTermQuery.TopTermsBoostOnlyBooleanQueryRewrite(100). With it, scoring no longer considers the query idf, which makes sense so that rare query terms aren't boosted. But neither does it consider the idf or field norm of the matching document, which seems wrong because these still seem relevant. More problematically, the fuzzy query scores are so much lower than those of normal and phrase matches that it doesn't seem to work when fuzzy queries are mixed in with other queries. Is there a better option, or even some better documentation on the rewrite methods so I can understand them better?

Alternatively, is there an analyzer I can use to analyse the fields using the fuzzy/Levenshtein logic, so that I can do this at index time instead and then just use a normal term query with the same analyzer instead of a fuzzy query?
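
For reference, the per-term comparison that a fuzzy query performs at search time, and that any index-time approach would have to approximate, can be sketched in plain Java as below. The class and method names are hypothetical. Dividing by the length of the shorter term is my understanding of how Lucene 3.x turns an edit distance into the similarity compared against the 0.5 threshold (with a prefix length of 0); treat that as an assumption rather than a statement of the library's exact behaviour.

```java
// Hypothetical illustration only: not Lucene code.
class FuzzySimilaritySketch {

    /** Levenshtein edit distance using two rolling rows of the DP table. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalisation: 1 - distance / length of the shorter term. */
    static float similarity(String queryTerm, String indexedTerm) {
        int shorter = Math.min(queryTerm.length(), indexedTerm.length());
        if (shorter == 0) {
            return queryTerm.length() == indexedTerm.length() ? 1.0f : 0.0f;
        }
        return 1.0f - (float) editDistance(queryTerm, indexedTerm) / shorter;
    }

    public static void main(String[] args) {
        // One edit (the extra 'e') over a shorter length of 8.
        System.out.println(similarity("bunneymen", "bunnymen"));
    }
}
```

Because this comparison depends on the query term, it cannot simply be precomputed per indexed term, which is why I am asking whether some analyzer gives an index-time approximation of it.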

Paul


