Hi Sravan,
Edismax has a 'sow' (split on whitespace) parameter; with sow=false, edismax
passes the whole query string to field analysis instead of splitting it on
whitespace first, but I am not sure how it will work with fuzzy search. What you
might do is use the _query_ syntax to separate the shingle and non-shingle
queries, e.g.
q=_query_:"{!edismax sow=false qf=title_bigrams v=$qq}" OR _query_:"{!edismax qf=title v=$qq}"&qq=some movie title
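
For reference, here is a rough Python sketch of how such a request could be put together on the client side. It assumes the nested-query form _query_:"{!...}", a request parameter named qq holding the user's text (the name is arbitrary, just for illustration), and defType=lucene so that the top-level q is read by the standard parser. The field names are the ones from the example and may differ in your schema.

```python
from urllib.parse import urlencode, parse_qs


def build_params(user_query, shingle_field="title_bigrams", plain_field="title"):
    """Build request parameters that run one user query through two edismax sub-queries."""
    # Each sub-query dereferences the same request parameter (qq, a name chosen
    # here for illustration), so the user text is supplied only once.
    q = (
        f'_query_:"{{!edismax sow=false qf={shingle_field} v=$qq}}" OR '
        f'_query_:"{{!edismax qf={plain_field} v=$qq}}"'
    )
    # defType=lucene is an assumption: it makes the standard parser handle the
    # top-level q so the _query_ clauses are treated as nested queries.
    return {"defType": "lucene", "q": q, "qq": user_query}


params = build_params("pursuit of happyness")
query_string = urlencode(params)  # percent-encodes quotes, braces and spaces
print(query_string)
```

The encoded string can be appended to the /select URL of your collection. Whether fuzzy operators behave as you want inside the sow=false sub-query is still something to test.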

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 7 Feb 2018, at 10:55, Sravan Kumar <sra...@caavo.com> wrote:
> 
> We have the following two fields for our movie title search
> - title without symbols
> a custom analyser with WordDelimiterFilterFactory, SynonymFilterFactory and
> other filters to retain only alpha numeric characters.
> - title with word bi grams
> a custom analyser with solr.ShingleFilterFactory to generate "bi gram" word
> tokens with '_' as separator.
> 
> A custom similarity class is used to make tf & idf values as 1.
> 
> Edismax query parser is used to perform all searches. Phrase boosting (pf)
> is also used.
> 
> There are a couple of issues while searching:
> 1>  BiGram field doesn't generate bi grams if the white spaces in the query
> are not escaped.
> - For example, if the query is "pursuit of happyness", then bi grams are
> not generated.  This is due to the fact that the edismax query parser
> tokenizes on whitespace before passing the string to the
> analyser (correct me if I am wrong).
> But in the case of "pursuit\ of\ happyness", they are generated, because the
> string passed to the analyser includes the whitespace.
> 
> 2>  Fuzzy search doesn't work in whitespace-escaped queries.
> Ex: "pursuit~2\ of\ happiness~1"
> 
> 3> Edismax's Phrase boosting doesn't work the way it should in
> non-whitespace escaped fuzzy queries.
> 
> If the query is "pursuit~2 of happiness~1" (without escaping whitespaces)
> 
> fuzzy queries are generated
> (title_name:pursuit~2), (title_name:happiness~1) in the parsed query.
> But edismax pf (phrase boost) generates a query like
> title_name:"pursuit (2 pursuit2) of happiness (1 happiness1)"
> This means the analyser received the original query string, including the
> fuzzy operators, for phrase boosting.
> 
> 
> 1> How should whitespace be handled with filters like
> solr.ShingleFilterFactory so that bi grams are generated?
> 2> If generating bi grams requires escaped whitespace and fuzzy searches
> do not, how do we accommodate both in a single Solr request so that they are
> scored together?
> 
> 
> 
> -
> -- 
> Regards,
> Sravan
