Re: phrase, inidividual term, prefix, fuzzy and stemming search

Otis Gospodnetic Fri, 04 Feb 2011 03:51:15 -0800

Hi,

I'll admit I didn't read your email closely, but the first part makes me think 
that ngrams, which I don't think you mentioned, might be handy for you here, 
allowing for misspellings without the implementation complexity.
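
To illustrate why ngrams tolerate misspellings, here is a minimal standalone
sketch in Python. It is not Solr configuration, and the function names
(char_ngrams, ngram_similarity), the "_" padding and the Jaccard scoring are
assumptions made for the example. In Solr itself the grams would be produced
at analysis time by an n-gram filter or tokenizer in the field type.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams of `text`.

    The text is lowercased and padded with '_' on both sides so that
    the first and last characters also contribute a gram.
    """
    padded = "_" + text.lower() + "_"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=2):
    """Jaccard overlap between the n-gram sets of two strings (0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    # A misspelled term still shares some grams with the intended one,
    # while an unrelated term shares none.
    print(ngram_similarity("deim", "demi"))
    print(ngram_similarity("deim", "hanks"))
```

The misspelled "deim" still shares the grams "_d" and "de" with "demi", so it
gets a partial match without any fuzzy query, while an unrelated name shares
nothing.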


Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: cyang2010 <ysxsu...@hotmail.com>
> To: solr-user@lucene.apache.org
> Sent: Mon, January 31, 2011 5:22:19 PM
> Subject: phrase, inidividual term, prefix, fuzzy and stemming search
> 
> 
> My current project needs to support searches in which the user enters any
> number of terms across a few index fields (movie title, actor, director).
> 
> To maximize results, I plan to support all the search types listed in
> the subject: phrase, individual term, prefix, fuzzy, and stemming. Of
> course, ranking results in the right relevance order is also important.
> 
> I have considered using the dismax query. However, it does not support prefix
> queries. I am not sure whether it supports fuzzy queries; my guess is that it
> does not.
> 
> Therefore, I still need to use the standard query. For example, if someone
> searches for "deim moer" (a typo for "demi moore"), I compare the phrase and
> terms against each searchable field (title, actor, director):
> 
> 
> title_display: "deim moer"~30   <-- OR
> actors: "deim moer"~30
> directors: "deim moer"~30
> 
> title_display:  deim    <-- OR
> actors: deim 
> directors: deim 
> 
> title_display: deim*   <-- OR
> actors: deim* 
> directors:  deim* 
> 
> title_display: deim~0.6   <-- OR
> actors: deim~0.6 
> directors: deim~0.6 
> 
> title_display: moer    <--  OR
> actors: moer 
> directors: moer 
> 
> title_display: moer*    <-- OR
> actors: moer* 
> directors: moer* 
> 
> title_display:  moer~0.6    <-- OR
> actors: moer~0.6 
> directors:  moer~0.6
> 
> The Solr relevance score is the sum over all those OR clauses. That way, I can
> make sure relevance scores are in order. For example, an exact match ("deim
> moer") will match the phrase, term, prefix, and fuzzy queries all at the same
> time. Therefore, it will score higher than a document that matches only the
> term, prefix, or fuzzy query. At the same time, I can apply a boost to a
> particular search field if the requirements call for it.
> 
> 
> Does this sound right to you? Are there better ways to achieve the same thing?
> My concern is that my query is not going to perform well, since it tries to do
> too much. But isn't that what people want (maximized results) when they
> just type in a few search words?
> 
> Another question: can I combine the results of two queries?
> For example, first I query for phrase and term matches, then I query for prefix
> matches. Can I just append the results for the prefix match to those for the
> phrase/term match? I thought the two queries have different queryNorm values,
> so their scores are not comparable and cannot be combined. Is
> that correct?
> 
> 
> Thanks. I'd love to hear your thoughts.
> 
> 
> -- 
> View this message in context: 
>http://lucene.472066.n3.nabble.com/phrase-inidividual-term-prefix-fuzzy-and-stemming-search-tp2391111p2391111.html
>
> Sent  from the Solr - User mailing list archive at Nabble.com.
>
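
For reference, the combined query described in the quoted message could be
assembled roughly as in the Python sketch below. The helper name build_query
and its parameters are hypothetical; the field names, the slop of 30 and the
fuzzy similarity of 0.6 are taken from the quoted example. The sketch does no
escaping of query-syntax special characters, so it assumes plain alphanumeric
input.

```python
# Searchable fields, as named in the quoted message.
FIELDS = ["title_display", "actors", "directors"]


def build_query(user_input, fields=FIELDS, slop=30, fuzzy=0.6):
    """Build one OR query covering phrase, term, prefix and fuzzy matches.

    Hypothetical helper: assumes `user_input` holds only plain words
    (no characters that need escaping in the query syntax).
    """
    terms = user_input.split()
    clauses = []

    # Sloppy phrase clause per field, only meaningful for multi-term input.
    if len(terms) > 1:
        phrase = " ".join(terms)
        clauses += [f'{field}:"{phrase}"~{slop}' for field in fields]

    # For every term: exact term, prefix and fuzzy clause on every field.
    for term in terms:
        for suffix in ("", "*", f"~{fuzzy}"):
            clauses += [f"{field}:{term}{suffix}" for field in fields]

    return " OR ".join(clauses)


if __name__ == "__main__":
    print(build_query("deim moer"))
```

For two terms and three fields this yields 21 clauses, which gives a sense of
how quickly the query grows with the number of input terms.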
