searching for q terms that start with a dash/hyphen being interpreted as prohibited clauses

2013-01-17 Thread geeky2
hello

environment: solr 3.5

problem statement:

i have a requirement to search for part numbers that start with a dash /
hyphen.

example q= term: *-0004A-0436*

example query:

http://some_url:some_port/some_core/select?facet=false&sort=score+desc%2C+rankNo+asc%2C+partCnt+desc&start=0&q=*-0004A-0436*+itemType%3A1&wt=xml&qt=itemModelNoProductTypeBrandSearch&rows=4

what is happening: query is returning a huge results set.  in reality there
is one (1) and only one record in the database with this part number.

i believe this is happening because the dash is being interpreted by the
query parser as a prohibited clause and the effective result is, give me
everything that does NOT have this part number.
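One way to check this interpretation, sketched here under assumptions: Solr's debugQuery=true parameter adds the parsed form of q to the response (the parsedquery entry), so a prohibited clause would show up there with a leading "-". The host, port and core below are the placeholders from the example query, and build_debug_url is a hypothetical helper name, not part of Solr.

```python
from urllib.parse import urlencode


def build_debug_url(base_url: str, q: str) -> str:
    """Return a select URL that asks Solr to report how it parsed q."""
    params = {
        "q": q,
        "qt": "itemModelNoProductTypeBrandSearch",  # handler from solrconfig.xml
        "wt": "xml",
        "rows": 4,
        "debugQuery": "true",  # adds a debug section with the parsed query
    }
    return base_url + "?" + urlencode(params)


# Placeholder host/port/core taken from the example query above.
url = build_debug_url("http://some_url:some_port/some_core/select",
                      "-0004A-0436 itemType:1")
print(url)
```

Comparing the parsedquery output for the raw term and for an escaped or quoted term should show whether the dash is being read as an operator.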

how is this handled so that the search is conducted for the actual part:
-0004A-0436

thx
mark

more information:

request handler in solrconfig.xml

<requestHandler name="itemModelNoProductTypeBrandSearch" class="solr.SearchHandler" default="false">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="echoParams">all</str>
    <int name="rows">10</int>
    <str name="qf">itemModelNoExactMatchStr^30 itemModelNo^.9 divProductTypeDesc^.8 plsBrandDesc^.5</str>
    <str name="q.alt">*:*</str>
    <str name="sort">score desc, rankNo desc, partCnt desc</str>
    <str name="facet">true</str>
    <str name="facet.field">itemModelDescFacet</str>
    <str name="facet.field">plsBrandDescFacet</str>
    <str name="facet.field">divProductTypeIdFacet</str>
  </lst>
  <lst name="appends">
  </lst>
  <lst name="invariants">
  </lst>
</requestHandler>


field information from schema.xml (if helpful)

<field name="itemModelNoExactMatchStr" type="text_general_trim" indexed="true" stored="true"/>

<field name="itemModelNo" type="text_en_splitting" indexed="true" stored="true" omitNorms="true"/>

<field name="divProductTypeDesc" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="true"/>

<field name="plsBrandDesc" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="true"/>


<fieldType name="text_general_trim" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="\." replacement="" replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15" side="front"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_SHC.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>






--
View this message in context: 
http://lucene.472066.n3.nabble.com/searching-for-q-terms-that-start-with-a-dash-hyphen-being-interpreted-as-prohibited-clauses-tp4034310.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: searching for q terms that start with a dash/hyphen being interpreted as prohibited clauses

2013-01-17 Thread Erick Erickson
I think all you need to do is escape the hyphen, or have you tried that already?
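A minimal sketch of what escaping could look like on the client side, assuming the query string is built in Python. The helper name escape_term is hypothetical, and the character set below is an assumption based on the operators listed in the Lucene query syntax ("/" only became special in later releases, but escaping it should be harmless). The backslash also has to be URL-encoded, which urlencode does here.

```python
import re
from urllib.parse import urlencode

# Single characters that the Lucene/Solr query syntax can treat as operators.
_SPECIAL_CHARS = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_term(term: str) -> str:
    """Prefix every query-syntax character in term with a backslash."""
    return _SPECIAL_CHARS.sub(r'\\\1', term)


escaped = escape_term("-0004A-0436")  # the part number from the question

params = {
    "q": escaped + " itemType:1",  # only the part number is escaped
    "qt": "itemModelNoProductTypeBrandSearch",
    "wt": "xml",
    "rows": 4,
}
query_string = urlencode(params)
print(escaped)
print(query_string)
```

Only the user-supplied term goes through escape_term; the itemType:1 clause keeps its colon so it is still parsed as a field query.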

Best
Erick

On Thu, Jan 17, 2013 at 1:38 PM, geeky2 gee...@hotmail.com wrote:
 how is this handled so that the search is conducted for the actual part:
 -0004A-0436


Re: searching for q terms that start with a dash/hyphen being interpreted as prohibited clauses

2013-01-17 Thread Jack Krupansky

Or put the term in quotes.
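A comparable sketch for the quoting approach, again assuming a Python client. quote_term is a hypothetical helper: it wraps the term in double quotes so the leading dash is no longer read as an operator, and it escapes any embedded quote or backslash.

```python
from urllib.parse import urlencode


def quote_term(term: str) -> str:
    """Wrap term in double quotes, escaping embedded backslashes and quotes."""
    inner = term.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + inner + '"'


quoted = quote_term("-0004A-0436")  # the part number from the question

params = {
    "q": quoted + " itemType:1",
    "qt": "itemModelNoProductTypeBrandSearch",
    "wt": "xml",
    "rows": 4,
}
query_string = urlencode(params)
print(quoted)
print(query_string)
```

Whether the quoted form then matches only the one expected record still depends on how the qf fields are analyzed at index and query time.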

-- Jack Krupansky

-Original Message- 
From: Erick Erickson

Sent: Thursday, January 17, 2013 6:59 PM
To: solr-user@lucene.apache.org
Subject: Re: searching for q terms that start with a dash/hyphen being 
interpreted as prohibited clauses


I think all you need to do is escape the hyphen, or have you tried that 
already?


Best
Erick
