subject:"Text field case sensitivity problem"

Re: Text field case sensitivity problem

2011-06-30 Thread Jamie Johnson

I'm not familiar with CharFilters; I'll look into those now.

Is it expected that solr.LowerCaseFilterFactory does not handle wildcards,
or is this a bug?


Re: Text field case sensitivity problem

2011-06-30 Thread Jamie Johnson

I think my answer is here...

"On wildcard and fuzzy searches, no text analysis is performed on the
search word."

taken from http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Analyzers
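
The effect of that wiki sentence can be illustrated with a toy model. This is only a sketch, not Solr code: the analyze function stands in for the whole analyzer chain and models lowercasing alone (no stemming, stop words or word splitting), and every name other than Kristine is made up for illustration.

```python
# Toy model of the behaviour described above; this is NOT Solr code.
# Only the lowercasing step of the analysis chain is modelled.

def analyze(text: str) -> str:
    """Stand-in for the analyzer chain: it just lowercases."""
    return text.lower()

# Index time: every value goes through the analyzer (names are illustrative).
indexed_terms = {analyze(name) for name in ["Kristine", "Kristofer", "Karl"]}

def term_query(text: str) -> bool:
    """Plain term query: the query text is analyzed before the lookup."""
    return analyze(text) in indexed_terms

def wildcard_query(raw: str) -> list:
    """Trailing-* query: the raw prefix is compared as typed, unanalyzed."""
    prefix = raw.rstrip("*")
    return sorted(term for term in indexed_terms if term.startswith(prefix))

print(term_query("Kristine"))    # True
print(wildcard_query("kris*"))   # ['kristine', 'kristofer']
print(wildcard_query("Kris*"))   # []
```

In this model Kristine matches because it is lowercased before the lookup, while Kris* is compared as typed against terms that were lowercased at index time.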



Re: Text field case sensitivity problem

2011-06-30 Thread Mike Sokolov

Yes, after posting that response, I read some more and came to the same 
conclusion... there seems to be some interest on the dev list in 
building a capability to specify an analysis chain for use with wildcard 
and related queries, but it doesn't exist now.


-Mike


Re: Text field case sensitivity problem

2011-06-30 Thread Erik Hatcher

Jamie - there is a JIRA about this, at least one: 
https://issues.apache.org/jira/browse/SOLR-218

Erik
 

Re: Text field case sensitivity problem

2011-06-30 Thread Mike Sokolov


Yes, and this too: https://issues.apache.org/jira/browse/SOLR-219


Re: Text field case sensitivity problem

2011-06-15 Thread Jamie Johnson

So simply lowercasing works, but it can get complex.  The query that I'm
executing may contain things like ranges, which require some words to be
upper case (e.g. TO).  I think this would be much better solved on Solr's
end.  Is there a JIRA about this?
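
One way to handle the operator problem on the application side is sketched below. It is a minimal illustration, not a full query parser: the function name and the operator list are assumptions, field names are recognized only by a trailing colon, and the Last_Name field in the usage line is hypothetical.

```python
import re

# Words that the Lucene query syntax expects in upper case (assumed list).
OPERATORS = {"AND", "OR", "NOT", "TO"}

def lowercase_query(query: str) -> str:
    """Lowercase search terms but keep operators and field names as typed."""
    def fix(match):
        token = match.group(0)
        # A token ending in ':' is a field name; operators must stay upper case.
        if token.endswith(":") or token in OPERATORS:
            return token
        return token.lower()
    # Each match is a run of word characters, optionally followed by ':'.
    return re.sub(r"\w+:?", fix, query)

print(lowercase_query("Person_Name:Kris* AND Last_Name:[Adams TO Baker]"))
# Person_Name:kris* AND Last_Name:[adams TO baker]
```

Two caveats: lowercasing every term before Solr sees it also means splitOnCaseChange in WordDelimiterFilterFactory no longer has any case change to split on at query time, and values that must keep upper-case letters (for example date strings) would need to be excluded.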

On Tue, Jun 14, 2011 at 5:33 PM, Mike Sokolov soko...@ifactory.com wrote:

 oops, please s/Highlight/Wildcard/



Re: Text field case sensitivity problem

2011-06-15 Thread Mike Sokolov

I wonder whether CharFilters are applied to wildcard terms?  I suspect
they might be.  If that's the case, you could use the MappingCharFilter
to perform lowercasing (and strip diacritics too, if you want that).


-Mike


Text field case sensitivity problem

2011-06-14 Thread Jamie Johnson

I am using the following for my text field:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

I have a field defined as
   <field name="Person_Name" type="text" stored="true" indexed="true" />

When I go to the following URL I get results:
http://localhost:8983/solr/select?defType=lucene&q=Person_Name:kris*
but if I use
http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kris*
I get nothing.  I thought the LowerCaseFilterFactory would have handled
lowercasing both the query and what is being indexed.  Am I missing
something?
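
For anyone reproducing the two requests from a script rather than a browser, the sketch below builds the same URLs with the parameters properly separated and escaped. The host, port and parameter values come from the message above; the helper name select_url is made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host and port as used in the message above.
BASE = "http://localhost:8983/solr/select"

def select_url(query: str) -> str:
    """Build a /select URL with defType=lucene and the given q parameter."""
    return BASE + "?" + urlencode({"defType": "lucene", "q": query})

for q in ("Person_Name:kris*", "Person_Name:Kris*"):
    url = select_url(q)
    # Decode the query string again to show the parameters survive intact.
    print(url, parse_qs(urlsplit(url).query))
```

The encoding step changes nothing about how Solr treats the wildcard term; it only rules out a malformed URL as the cause of the empty result.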

Re: Text field case sensitivity problem

2011-06-14 Thread Jamie Johnson

Also of interest to me: this returns results
http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kristine



Re: Text field case sensitivity problem

2011-06-14 Thread Mike Sokolov

Wildcard queries aren't analyzed, I think?  I'm not completely sure what 
the best workaround is here: perhaps simply lowercasing the query terms 
yourself in the application.  Also - I hope someone more knowledgeable 
will say that the new HighlightQuery in trunk doesn't have this 
restriction, but I'm not sure about that.


-Mike

On 06/14/2011 05:13 PM, Jamie Johnson wrote:

Also of interest to me is this returns results
http://localhost:8983/solr/select?defType=luceneq=Person_Name:Kristine


On Tue, Jun 14, 2011 at 5:08 PM, Jamie Johnsonjej2...@gmail.com  wrote:

   

I am using the following for my text field:

<fieldType name="text" class="solr.TextField"
    positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
        synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
      add enablePositionIncrements=true in both the index and query
      analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
        ignoreCase="true"
        words="stopwords.txt"
        enablePositionIncrements="true"
        />
    <filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="1"
        catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
        protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
        ignoreCase="true"
        words="stopwords.txt"
        enablePositionIncrements="true"
        />
    <filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="0"
        catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
        protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

I have a field defined as
<field name="Person_Name" type="text" stored="true" indexed="true" />

When I go to the following URL I get results:
http://localhost:8983/solr/select?defType=lucene&q=Person_Name:kris*
but if I do
http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kris*
I get nothing.  I thought the LowerCaseFilterFactory would have handled
lowercasing both the query and what is being indexed; am I missing
something?

RE: Text field case sensitivity problem

2011-06-14 Thread Bob Sandiford

Unfortunately, wild card search terms don't get processed by the analyzers.

One suggestion that's fairly common is to make sure you lower case your wild 
card search terms yourself before issuing the query.
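
As a rough illustration of that suggestion, here is a minimal Python sketch
of client-side lowercasing. The function name lowercase_wildcard_terms and
the Age field in the usage line are hypothetical, made up for this example;
nothing here is part of Solr or of any Solr client library. It lowercases
only whitespace-separated tokens that contain * or ?, so field names such as
Person_Name and upper-case keywords such as TO, AND and OR are left alone.
It does not try to handle quoted phrases or backslash-escaped characters.

```python
def lowercase_wildcard_terms(query: str) -> str:
    """Lowercase only the terms that contain a wildcard (* or ?).

    Hypothetical helper for illustration; not a Solr API.
    Field names, operators (AND, OR, NOT, TO) and non-wildcard
    terms are returned unchanged.
    """
    rewritten = []
    for token in query.split(" "):
        # Split off an optional "field:" prefix so the field name,
        # which is case sensitive, is never lowercased.
        field, colon, term = token.rpartition(":")
        if "*" in term or "?" in term:
            token = field + colon + term.lower()
        rewritten.append(token)
    return " ".join(rewritten)


if __name__ == "__main__":
    # "Age" is an assumed field, used only to show that TO survives.
    print(lowercase_wildcard_terms("Person_Name:Kris* AND Age:[10 TO 20]"))
    # prints: Person_Name:kris* AND Age:[10 TO 20]
```

The rewritten string still has to be URL-encoded before it is placed in the
q parameter.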

Bob Sandiford | Lead Software Engineer | SirsiDynix
P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
www.sirsidynix.com

 -Original Message-
 From: Jamie Johnson [mailto:jej2...@gmail.com]
 Sent: Tuesday, June 14, 2011 5:13 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Text field case sensitivity problem
 
 Also of interest to me is that this returns results:
 http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kristine
 
 
 On Tue, Jun 14, 2011 at 5:08 PM, Jamie Johnson jej2...@gmail.com
 wrote:
 
  I am using the following for my text field:
 
  <fieldType name="text" class="solr.TextField"
      positionIncrementGap="100" autoGeneratePhraseQueries="true">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- in this example, we will only use synonyms at query time
      <filter class="solr.SynonymFilterFactory"
          synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
      -->
      <!-- Case insensitive stop word removal.
        add enablePositionIncrements=true in both the index and query
        analyzers to leave a 'gap' for more accurate phrase queries.
      -->
      <filter class="solr.StopFilterFactory"
          ignoreCase="true"
          words="stopwords.txt"
          enablePositionIncrements="true"
          />
      <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1" catenateWords="1"
          catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory"
          protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory"
          ignoreCase="true"
          words="stopwords.txt"
          enablePositionIncrements="true"
          />
      <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1" catenateWords="0"
          catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory"
          protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
  </fieldType>
 
  I have a field defined as
  <field name="Person_Name" type="text" stored="true" indexed="true" />

  When I go to the following URL I get results:
  http://localhost:8983/solr/select?defType=lucene&q=Person_Name:kris*
  but if I do
  http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kris*
  I get nothing.  I thought the LowerCaseFilterFactory would have handled
  lowercasing both the query and what is being indexed; am I missing
  something?

Re: Text field case sensitivity problem

2011-06-14 Thread Mike Sokolov


Oops, please s/Highlight/Wildcard/

On 06/14/2011 05:31 PM, Mike Sokolov wrote:
Wildcard queries aren't analyzed, I think?  I'm not completely sure 
what the best workaround is here: perhaps simply lowercasing the query 
terms yourself in the application.  Also - I hope someone more 
knowledgeable will say that the new HighlightQuery in trunk doesn't 
have this restriction, but I'm not sure about that.


-Mike

On 06/14/2011 05:13 PM, Jamie Johnson wrote:

Also of interest to me is that this returns results:
http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kristine


On Tue, Jun 14, 2011 at 5:08 PM, Jamie Johnson jej2...@gmail.com wrote:

I am using the following for my text field:

<fieldType name="text" class="solr.TextField"
    positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
        synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
      add enablePositionIncrements=true in both the index and query
      analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
        ignoreCase="true"
        words="stopwords.txt"
        enablePositionIncrements="true"
        />
    <filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="1"
        catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
        protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
        ignoreCase="true"
        words="stopwords.txt"
        enablePositionIncrements="true"
        />
    <filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="0"
        catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
        protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

I have a field defined as
<field name="Person_Name" type="text" stored="true" indexed="true" />

When I go to the following URL I get results:
http://localhost:8983/solr/select?defType=lucene&q=Person_Name:kris*
but if I do
http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kris*
I get nothing.  I thought the LowerCaseFilterFactory would have handled
lowercasing both the query and what is being indexed; am I missing
something?
