Ivan,
The hyphen character (-) is a Solr query operator that excludes results
matching the term that follows it. You can strip hyphens at both index and
query time. There are different ways to make it work if you need to retain
them; this is the approach I am using:
1. Excerpt from my schema.xml (you may not need all filters):
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <!-- <tokenizer class="solr.WhitespaceTokenizerFactory"/> -->
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldtype>
2. Query:
I am removing the hyphens before appending the value to the q= query string,
which is working fine for me:
http://localhost:9080/solr/custdatacore/select/?q=PHONE_NUMBER:239083*
Note: Since my field is a text field and the value is numeric, I am just
appending * at the end.
The actual data stored in the index is 111-123-9083.
The spell check request below is sent without stripping the hyphens:
suggest/?spellcheck.q=111-123-9083&spellcheck=true
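
For anyone doing the same on the client side, here is a minimal Python sketch
of the "strip hyphens, then append *" step. The helper name build_phone_query
is made up for illustration, the base URL is simply the one from my example
above, and the URL-encoding of the q parameter is my own addition; treat it as
a sketch under those assumptions rather than a recommended client.

```python
from urllib.parse import urlencode

# Base URL taken from the example request above; host, port and core name
# will differ in other installations.
SOLR_SELECT = "http://localhost:9080/solr/custdatacore/select/"


def build_phone_query(raw_value, field="PHONE_NUMBER"):
    """Hypothetical helper: drop hyphens, add a trailing wildcard, encode q."""
    digits = raw_value.replace("-", "")   # "111-123-9083" -> "1111239083"
    q = f"{field}:{digits}*"              # prefix query on the text field
    return SOLR_SELECT + "?" + urlencode({"q": q})


if __name__ == "__main__":
    print(build_phone_query("111-123-9083"))
```

urlencode percent-encodes the ":" and "*" characters, which Solr decodes
back before parsing the query.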
Thanks
Ram M Marpaka
From: Lance Norskog goks...@gmail.com
To: solr-user@lucene.apache.org
Sent: Monday, 23 July 2012 7:14 PM
Subject: Re: Search special chars
The Whitespace Tokenizer does this. It breaks text apart only on spaces, tabs
and newlines, so the hyphen stays inside the token. You can use the whitespace
tokenizer in the query analyzer of your field type.
Another option is to create a regular expression CharFilter that removes the
hyphen, turning non-* into non*.
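
To illustrate both options outside Solr, here is a rough Python sketch. It
only mimics the behaviour: str.split() stands in for whitespace tokenization,
and the regular expression is an assumed pattern for "a hyphen between two
word characters", not something taken from a real schema. In Solr the
equivalent would presumably be a solr.PatternReplaceCharFilterFactory with
its pattern and replacement attributes; the function names below are
hypothetical.

```python
import re

# Assumed pattern: a hyphen with a word character on both sides.
# A leading hyphen (the exclusion operator, as in "-term") is left alone.
INTRA_WORD_HYPHEN = re.compile(r"(?<=\w)-(?=\w)")


def strip_intra_word_hyphens(text):
    """Mimic a regex char filter that turns non-* into non*."""
    return INTRA_WORD_HYPHEN.sub("", text)


def whitespace_tokenize(text):
    """Mimic whitespace tokenization: split on spaces, tabs and newlines only."""
    return text.split()


if __name__ == "__main__":
    # Option 1: whitespace tokenization keeps the hyphenated word whole.
    print(whitespace_tokenize("Non-taxable income"))
    # Option 2: remove the hyphen before tokenizing.
    print(whitespace_tokenize(strip_intra_word_hyphens("Non-taxable income")))
```

Whichever option is used, the index and query analyzers need to agree,
otherwise the indexed token and the query token will not match.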
On Mon, Jul 23, 2012 at 7:10 PM, Li, Qiang qiang...@msci.com wrote:
Hi All,
I want to search for keywords like Non-taxable, which has a - in the
word. Can I make this work in Solr through some configuration? Or are there
any other ways?
Thanks & Regards,
Ivan
--
Lance Norskog
goks...@gmail.com