Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-17 Thread Chamnap Chhorn
Is there no other way to meet this requirement?


Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-14 Thread Erick Erickson
This might work:

Define your field to use WhitespaceTokenizer and LowerCaseFilterFactory

Use a filter query referencing this field.

If you want the words to appear in their exact order, you could just
define the pf (phrase fields) parameter in your dismax request.
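
A minimal sketch of how such a request might be assembled, assuming the keyphrase field and the localhost URL that appear elsewhere in this thread. The boost value of 10 and the helper name build_dismax_url are hypothetical, chosen only for illustration; qf, pf, defType and debugQuery are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_dismax_url(base_url, user_query):
    """Build a dismax request that searches and phrase-boosts one field."""
    params = {
        "defType": "dismax",
        "q": user_query,
        "qf": "keyphrase",      # field searched term by term
        "pf": "keyphrase^10",   # phrase boost; the value 10 is an assumption
        "debugQuery": "on",
    }
    # urlencode joins the parameters with "&" and escapes spaces and "^".
    return base_url + "?" + urlencode(params)


url = build_dismax_url("http://localhost:8081/solr/select", "printing house")
print(url)
```

Because pf only boosts documents where the whole query appears as a phrase, documents that match single words should still be returned, just lower in the ranking.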

Best
Erick


Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-14 Thread Chamnap Chhorn
Ahh, thanks guys for helping me!

Adam's solution doesn't work for me. Here are my field, field type, and
Solr query:

<fieldType name="text_keyword" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory"
            maxShingleSize="4" outputUnigrams="true"
            outputUnigramIfNoNgram="false"/>
  </analyzer>
</fieldType>

<field name="keyphrase" type="text_keyword" indexed="true" stored="false"
       multiValued="true"/>

http://localhost:8081/solr/select?q=printing%20house&qf=keyphrase&debugQuery=on&defType=dismax

<str name="parsedquery">
+((DisjunctionMaxQuery((keyphrase:smart))
DisjunctionMaxQuery((keyphrase:mobile)))~2) ()
</str>
<str name="parsedquery_toString">+(((keyphrase:smart) (keyphrase:mobile))~2) ()</str>
<lst name="explain"/>

No results are found.

Erick's solution works for me. However, I can't use a filter query,
since this is part of a full-text search. If I add fq, only documents that
match the query exactly are returned. I want documents that match exactly
to appear at the top, followed by documents that match partially.

The problem is that when the user searches for a single word (e.g. "printing"
from the keyword "printing house"), that document is also included in the
search results. The other problem is that if the user searches in the reverse
order (e.g. "house printing"), the document is also found.
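
The ordering described above (exact matches first, partial matches after) can be illustrated with a toy ranking function. This is only a sketch of the desired behaviour, not Solr's scoring: the score values and the function names are made up, and tokenization is plain lowercased whitespace splitting. It also shows why whitespace-tokenized matching alone treats "printing" and "house printing" as hits.

```python
def tokens(text):
    """Lowercase and split on whitespace (a stand-in for the field analyzer)."""
    return text.lower().split()


def score(query, keyphrase):
    """2 = same words in the same order, 1 = some shared word, 0 = no match."""
    q, k = tokens(query), tokens(keyphrase)
    if not set(q) & set(k):
        return 0
    if q == k:
        return 2
    return 1


def rank(query, keyphrases):
    """Return matching keyphrases, exact matches before partial ones."""
    hits = [(score(query, k), k) for k in keyphrases]
    hits = [(s, k) for s, k in hits if s > 0]
    # sorted() is stable, so ties keep their original relative order.
    return [k for s, k in sorted(hits, key=lambda pair: -pair[0])]


stored = ["House Printing", "Printing House", "Smart Mobile", "Printing"]
print(rank("printing house", stored))
```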

Cheers


Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-13 Thread Adam Estrada
Hi,

the following seems to work pretty well.

<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory"
            maxShingleSize="4" outputUnigrams="true"
            outputUnigramIfNoNgram="false"/>
  </analyzer>
</fieldType>

<!-- A text field that uses WordDelimiterFilter to enable splitting and matching of
     words on case-change, alpha numeric boundaries, and non-alphanumeric chars,
     so that a query of "wifi" or "wi fi" could match a document containing "Wi-Fi".
     Synonyms and stopwords are customized by external files, and stemming is enabled.
     The attribute autoGeneratePhraseQueries="true" (the default) causes words that get split to
     form phrase queries. For example, WordDelimiterFilter splitting text:pdp-11 will cause the parser
     to generate text:"pdp 11" rather than (text:PDP OR text:11).
     NOTE: autoGeneratePhraseQueries="true" tends to not work well for non whitespace delimited languages.
-->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<copyField source="cat" dest="text"/>
<copyField source="subject" dest="text"/>
<copyField source="summary" dest="text"/>
<copyField source="cause" dest="text"/>
<copyField source="status" dest="text"/>
<copyField source="urgency" dest="text"/>

I ingest the source fields as text_ws (I know I've changed it a bit) and
then copy the field to text. This seems to do what you are asking for.
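
As a rough illustration of what shingling does to a token stream, here is a simplified model of word n-gram generation. It is not Lucene's implementation, the output ordering is simplified, and the function name and parameters are hypothetical. One thing the sketch suggests is that when a keyword-style tokenizer emits the whole value as a single token, a shingle step has nothing to combine and may add nothing.

```python
def shingles(tokens, max_size=4, output_unigrams=True):
    """Return word n-grams of up to max_size tokens, joined by single spaces."""
    out = []
    for start in range(len(tokens)):
        for size in range(1, max_size + 1):
            if start + size > len(tokens):
                break  # not enough tokens left for a shingle of this size
            if size == 1 and not output_unigrams:
                continue
            out.append(" ".join(tokens[start:start + size]))
    return out


# Whitespace tokenization gives the shingle step several tokens to combine.
print(shingles("printing house".split()))
# A keyword-style tokenizer yields one token, so only that token comes out.
print(shingles(["printing house"]))
```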

Adam



Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-13 Thread Chamnap Chhorn
Thanks for your reply. However, it doesn't work for my case at all. I think
the problem is with the query parser or something else. It forces me to put
double quotes around the search query in order to get any results.

<str name="rawquerystring">sim 010</str>
<str name="querystring">sim 010</str>
<str name="parsedquery">+DisjunctionMaxQuery((keyphrase:sim 010)) ()</str>
<str name="parsedquery_toString">+(keyphrase:sim 010) ()</str>

<str name="rawquerystring">smart mobile</str>
<str name="querystring">smart mobile</str>
<str name="parsedquery">
+((DisjunctionMaxQuery((keyphrase:smart))
DisjunctionMaxQuery((keyphrase:mobile)))~2) ()
</str>
<str name="parsedquery_toString">+(((keyphrase:smart) (keyphrase:mobile))~2) ()</str>

The intent here is to do a full-text search, part of which searches the
keyword field, so I can't put quotes around the query.
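
The parsed query for the unquoted input shows two separate keyphrase clauses, which suggests the query string is split on whitespace before the field's analyzer sees it. A rough model of why that defeats a keyword-tokenized field is sketched below. It uses Python's shlex.split as a stand-in for the query parser's splitting (an assumption made for illustration, not how Solr is implemented), and the helper names are hypothetical.

```python
import shlex


def split_like_query_parser(user_query):
    """Split on whitespace, keeping double-quoted phrases together."""
    return shlex.split(user_query)


def matches_keyword_field(user_query, indexed_keyphrase):
    """True if any query chunk equals the whole indexed value, ignoring case."""
    indexed_term = indexed_keyphrase.lower()
    chunks = split_like_query_parser(user_query)
    return any(chunk.lower() == indexed_term for chunk in chunks)


print(split_like_query_parser('smart mobile'))    # two separate chunks
print(split_like_query_parser('"smart mobile"'))  # one chunk
print(matches_keyword_field('smart mobile', 'Smart Mobile'))
print(matches_keyword_field('"smart mobile"', 'Smart Mobile'))
```

In this model the unquoted query never produces a chunk equal to the whole stored value, which matches the observation that results only appear when the query is quoted.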


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-13 Thread Estrada Groups
Ahhh...the fun of open source software ;-). Requires a ton of trial and error! 
I found what worked for me and figured it was worth passing it along. If you 
don't mind...when you sort everything out on your end, please post results for 
the rest of us to take a gander at. 

Cheers,
Adam


Multi-word exact keyword case-insensitive search suggestions

2011-01-12 Thread Chamnap Chhorn
Hi all,

I've been stuck on exact keyword matching for several days and hope you can
help me. Here is the scenario:

   1. The field must match multi-word keywords, case-insensitively.
   2. Partial-word or single-word matches against this field are not allowed.

I'd like to know the field type definition for this field and a sample Solr
query. I need to combine this search with my full-text search, which uses
the dismax query parser.
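
To make the two requirements concrete, here is a small sketch of the intended matching behaviour in plain Python. It is only a model of the goal, not a Solr configuration: the function names and sample values are hypothetical, and trimming surrounding whitespace is an extra assumption. Treating the whole value as one lowercased token is roughly what a keyword-style tokenizer followed by lowercasing would produce.

```python
from typing import Iterable


def normalize(phrase: str) -> str:
    """Treat the whole phrase as one token: trim it and lowercase it."""
    return phrase.strip().lower()


def exact_keyword_match(query: str, keyphrases: Iterable[str]) -> bool:
    """True only when the whole query equals one whole stored keyphrase."""
    wanted = normalize(query)
    return any(normalize(k) == wanted for k in keyphrases)


doc_keyphrases = ["Printing House", "Smart Mobile"]  # hypothetical field values
print(exact_keyword_match("printing house", doc_keyphrases))   # whole phrase
print(exact_keyword_match("printing", doc_keyphrases))         # single word
print(exact_keyword_match("house printing", doc_keyphrases))   # reversed order
```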

Thanks
-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/