Thanks for the education, Chris.
I pasted the characters into the Index and Query fields on the Analysis panel.

The index and query analyzers are almost the same.
On both, the non-Greek characters drop out after WordDelimiterFilterFactory.
The index analyzer shows a grey background on the words that seem to make it
through all the filters.

WhitespaceTokenizerFactory <-  ∠ ψ Σ • ≤ ≠ • ≥ μ ω φ θ ¢ β √ Ω ° ± Δ #  
SynonymFilterFactory (query only) <- ditto
StopFilterFactory    <- ditto
WordDelimiterFilterFactory  <- ψ Σ μ ω φ θ β Ω Δ  now only greeks
LowerCaseFilterFactory      <- ψ σ μ ω φ θ β ω δ  lower case Greeks only
SnowballPorterFilterFactory <- ψ σ μ ω φ θ β ω δ

So I'm thinking I need to either change the WordDelimiterFilterFactory properties
{catenateWords=0, catenateNumbers=0, splitOnCaseChange=1, catenateAll=0,
generateNumberParts=1, generateWordParts=1, splitOnNumerics=0}

or copy these strings into a different field name/type without the word
delimiter filter, so that I don't affect how existing text is searched.
Does that sound right?
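
For the second option, a minimal schema.xml sketch might look like the
following. The names "text_symbols" and "description" are placeholders I made up
for illustration, not names from my actual schema, and the filter chain is only
an assumption about what would be enough to keep the symbols:

```xml
<!-- Hypothetical field type: same whitespace tokenizer as the existing
     chain, but with no WordDelimiterFilterFactory, so symbol-only tokens
     such as ∠ or ≤ should survive analysis. -->
<fieldType name="text_symbols" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical search-only field that receives a copy of the text. -->
<field name="text_symbols" type="text_symbols" indexed="true" stored="false" multiValued="true"/>

<!-- "description" stands in for whichever existing field holds the symbols. -->
<copyField source="description" dest="text_symbols"/>
```

Queries for the special characters would then go against the new field, and the
existing text field and its analyzers would stay untouched. A reindex would be
needed before the copied field contains anything.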

Allan Tegelberg





-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: Thursday, January 24, 2013 3:46 PM
To: solr-user@lucene.apache.org
Subject: Re: solr parsed query dropping special chars

: When I search for these characters in the admin query, I can only find the Greeks.
: debug shows the parsed query only has greek chars like omega, delta, sigma
: but does not contain others like degree, angle, cent, bullet, less_equal…

This is most likely because of the analyzer you are using for your text field,
an assumption which can be verified using the Analysis tool in the admin UI to
see how the various pieces of your query analyzer deal with the input.

My guess is you are using a tokenizer which ignores punctuation.

Don't forget to check your index analyzer as well -- you may not even be
indexing these punctuation symbols either...

: the response dumps the document and shows me the chars exist in the document..
: <str>angle (∠)</str>

...that's the stored value, the *indexed* text may not contain those terms.


-Hoss
