Thank you for your answer.

I agree that I can manage predictable values through synonyms.

However, most of the data in this index are company and product names, which
sometimes have rather strange syntax (mixed upper/lower case, misplaced dashes
or spaces). One purpose of using Solr was to help find potential
duplicates before data insertion.

On the other hand, I could write a custom tokenizer/filter and a custom query
builder that would test many combinations. However, I have the feeling that
this is an inefficient approach.
That is:
Indexing: "chelsea soccer club" =>
"chelsea", "soccer", "club", "chelseasoccer", "soccerclub", "chelseasoccerclub"
Searching: "chelsea soccerclub" => ("chelsea" AND "soccerclub") OR
"chelseasoccerclub"
While search expressions are generally short, indexing will be a
nightmare...
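
To make the idea concrete, here is a minimal sketch in plain Python. It is not Solr code, and the function names index_tokens and query_alternatives are made up for illustration. It assumes that only adjacent words are ever glued together and that everything is lowercased first.

```python
def index_tokens(text):
    """Return every contiguous run of words, joined without spaces."""
    words = text.lower().split()
    tokens = []
    # Emit single words first, then pairs, then longer runs.
    for size in range(1, len(words) + 1):
        for start in range(len(words) - size + 1):
            tokens.append("".join(words[start:start + size]))
    return tokens


def query_alternatives(text):
    """Return the two alternatives to OR together at search time:
    all words as typed (to be ANDed), or all words glued into one token."""
    words = text.lower().split()
    return [words, ["".join(words)]]


print(index_tokens("chelsea soccer club"))
print(query_alternatives("chelsea soccerclub"))
```

With this scheme a name of n words produces n(n+1)/2 index tokens, so 3 words give 6 tokens and 10 words give 55. That growth is what worries me on the indexing side.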


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Find-results-with-or-without-whitespace-tp3117144p3117581.html
Sent from the Solr - User mailing list archive at Nabble.com.
