Re: Solr failing on y charakter in string?

2009-08-03 Thread gateway0

OK, thanks, you're right.

But the thing is, my users will often search for partial expressions such as
"Harr" or "har".

So I thought I would automatically append the wildcard * to every request.

If that also gets me into trouble (Harr* = no result, harry* = no result),
what should I do?




-- 
View this message in context: 
http://www.nabble.com/Solr-failing-on-%22y%22-charakter-in-string--tp24783211p24789070.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr failing on y charakter in string?

2009-08-03 Thread Avlesh Singh
The easiest thing to do would be to create a new field in your schema which
only has a LowerCaseFilter applied to it. When searching, query across both
fields and you should get the desired results.

You can use the copyField directive in your schema.xml to copy data
from your original field into the new field.

Cheers
Avlesh


Re: Solr failing on y charakter in string?

2009-08-03 Thread gateway0

Ok still not working with new field text_two:
<str name="q">text:Har* text_two:Har*</str>
==> result 0

Schema Updates:

<fieldType name="text_two" class="solr.TextField"
positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


<field name="text_two" type="text_two" indexed="true" stored="false"
multiValued="true"/>

<copyField source="text" dest="text_two"/>


This is what you suggested, right?

kind regards, S.



gateway0 wrote:
 
 Hi,
 
 I have the following setting in schema.xml:
 <field name="kunde" type="text" indexed="true" stored="true" />
 The "text" field type was updated with the preserveOriginal="1" option in
 the schema.
 
 I have the following string indexed in the field kunde:
 Harry Heim KG
 
 Now when I search for kunde:harry* I get an empty result.
 
 When I search for kunde:harry I get the right result. kunde:harr* also
 works just fine.
 
 The strange thing is that with every other string (for example
 kunde:heim*) I get the right result.
 
 So why not for harry*, with a y* at the end?
 
 kind regards, S.
 




RE: Solr failing on y charakter in string?

2009-08-03 Thread Ensdorf Ken
 Ok still not working with new field text_two:
 <str name="q">text:Har* text_two:Har*</str>
 ==> result 0


I'm pretty sure the query string needs to be lower-case, since a wildcard query 
is not analyzed.

I think what Avlesh was suggesting was more like this:

<str name="q">text:Har text_two:har*</str>

So the original field would be for a regular query containing whatever the user 
entered and would undergo the usual analysis for searching, and the secondary 
field would be used to construct a wildcard query which would strictly serve 
the begins-with case.
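
As a rough illustration of the query construction described above, here is a minimal Python sketch. The function name build_query is hypothetical, the field names text and text_two are taken from the schema posted in this thread, and the sketch assumes plain alphanumeric input: it does not escape query-syntax characters, so treat it as a starting point rather than a finished client.

```python
def build_query(user_input, analyzed_field="text", prefix_field="text_two"):
    """Build one regular clause and one lower-cased wildcard clause per term.

    The regular clause goes to the analyzed field unchanged, so the usual
    query-time analysis applies to it. The wildcard clause is lower-cased
    here in the client, because a wildcard term is not analyzed.
    """
    clauses = []
    for term in user_input.split():
        clauses.append(f"{analyzed_field}:{term}")
        clauses.append(f"{prefix_field}:{term.lower()}*")
    return " ".join(clauses)


if __name__ == "__main__":
    print(build_query("Har"))
```

The design choice is simply to do the lower-casing on the client side for the wildcard clause only, leaving whatever the user typed untouched for the regular clause.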

-Ken


Re: Solr failing on y charakter in string?

2009-08-03 Thread Bill Au
I have a Solr text field, and when I use Solr's field analysis tool it shows
that wildcard queries are being stemmed.  But query results indicate that they
are not.  It looks like there is a bug in the tool.

Bill


Re: Solr failing on y charakter in string?

2009-08-03 Thread Avlesh Singh

 I have a Solr text field and when I use Solr's field analysis tool, it
 shows that wildcard queries are being stemmed.  But query results indicate
 that it is not.  It looks like there is a bug in the tool.

I am in agreement. Seems like a bug to me.

Cheers
Avlesh


Re: Solr failing on y charakter in string?

2009-08-02 Thread Otis Gospodnetic
I believe it's because wildcard queries are not stemmed.  During indexing 
"harry" probably got stemmed to "harr", so now harry* doesn't match, because 
there is no "harry" token indexed for that string, only "harr".  Why wildcard 
queries are not analyzed is described in the Lucene FAQ on the Lucene Wiki.

You could also try searching for kunde:Harr*, for example (note the upper-case 
H).  I bet it won't result in a hit, for the same reason: at index time you 
probably lower-case tokens with LowerCaseFilter(Factory), and if you search for 
Harr*, the lower-casing won't happen because a query string containing a 
wildcard character isn't analyzed.
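
To make the explanation above concrete, here is a toy Python model of the mismatch. It is not how Lucene is implemented. The token set is an assumption: "harr" is the stem guessed in this message, while "heim" and "kg" are assumed to pass through the analyzer as plain lower-cased tokens. The helper name prefix_matches is made up for the illustration.

```python
# Tokens assumed to be indexed for "Harry Heim KG" after lower-casing and
# stemming. "harr" is the stem suggested in this thread; the other two are
# assumptions made for the sake of the example.
INDEXED_TOKENS = {"harr", "heim", "kg"}


def prefix_matches(prefix, tokens=INDEXED_TOKENS):
    """Return True if any indexed token starts with the raw prefix.

    The prefix is compared exactly as typed (no lower-casing, no stemming),
    mimicking the fact that a wildcard term is not analyzed.
    """
    return any(token.startswith(prefix) for token in tokens)


if __name__ == "__main__":
    for prefix in ("harry", "harr", "Harr", "heim"):
        print(f"{prefix}* ->", prefix_matches(prefix))
```

Under these assumptions the model reproduces what was reported: harr* and heim* match, while harry* and Harr* do not.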

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: gateway0 reiterwo...@yahoo.de
 To: solr-user@lucene.apache.org
 Sent: Sunday, August 2, 2009 7:30:19 PM
 Subject: Solr failing on y charakter in string?
 
 
 Hi,
 
 I have the following setting in schema.xml:
 <field name="kunde" type="text" indexed="true" stored="true" />
 The "text" field type was updated with the preserveOriginal="1" option in
 the schema.
 
 I have the following string indexed in the field kunde:
 Harry Heim KG
 
 Now when I search for kunde:harry* I get an empty result.
 
 When I search for kunde:harry I get the right result. kunde:harr* also
 works just fine.
 
 The strange thing is that with every other string (for example
 kunde:heim*) I get the right result.
 
 So why not for harry*, with a y* at the end?
 
 kind regards, S.
 -- 
 View this message in context: 
 http://www.nabble.com/Solr-failing-on-%22y%22-charakter-in-string--tp24783211p24783211.html
 Sent from the Solr - User mailing list archive at Nabble.com.