Re: Wildcard queries on whole words

2012-06-27 Thread Jack Krupansky
I would understand if you had said that "Klostermeyer*" returned nothing, 
because the presence of a wildcard used to suppress analysis, including 
the lowercase filter, so the capital "K" term would never match an 
indexed term. But I would have expected "klostermeyer*" to match 
"klostermeyer", since the unanalyzed wildcard prefix would have the same 
term value as "klostermeyer" when it is indexed. So this is a mystery to 
me.
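
As a rough illustration of the reasoning above, here is a toy Python model 
(not Solr's actual code). The helper names index_terms and prefix_matches 
are made up for this sketch, and the "analysis" is reduced to a whitespace 
split plus lowercasing.

```python
def index_terms(text):
    """Toy stand-in for index-time analysis: whitespace split plus lowercasing."""
    return {token.lower() for token in text.split()}


def prefix_matches(indexed, wildcard_query, analyze_prefix=False):
    """Return the indexed terms that match a trailing-* query.

    analyze_prefix=False mimics a wildcard term that bypasses analysis,
    so its case is compared exactly as typed.
    """
    prefix = wildcard_query.rstrip("*")
    if analyze_prefix:
        prefix = prefix.lower()
    return sorted(term for term in indexed if term.startswith(prefix))


indexed = index_terms("Michael Klostermeyer")
print(prefix_matches(indexed, "Klostermeyer*"))  # unanalyzed capital K
print(prefix_matches(indexed, "klostermeyer*"))
print(prefix_matches(indexed, "kloster*"))
```

In this simplified model the lowercase whole-word prefix does match, which 
is why the reported behavior is surprising.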


-- Jack Krupansky

-----Original Message-----
From: Klostermeyer, Michael
Sent: Wednesday, June 27, 2012 11:14 AM
To: solr-user@lucene.apache.org
Subject: Wildcard queries on whole words

I am researching an issue w/ wildcard searches on complete words in 3.5. 
For example, searching for "kloster*" returns "klostermeyer", but 
"klostermeyer*" returns nothing.


The field being queried has the following analysis chain (standard 
'text_general'):


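<fieldType name="text_general" class="solr.TextField" 
positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" 
    words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" 
    words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" 
    ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>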


I see that wildcard queries are not analyzed at query time, which could be 
the source of my issue, but I read conflicting advice on the interwebs.  I 
read also that this might have changed in 3.6, but I am unable to determine 
if my specific issue is addressed.


My questions:

1.   Why am I getting these search results with my current config?

2.   How do I fix it in 3.5?  Would upgrading to 3.6 also "fix" my 
issue?


Thanks!

Mike Klostermeyer



Re: Wildcard queries on whole words

2012-06-27 Thread Michael Della Bitta
We're doing:

?'kloster'^2 OR kloster*

This is for a homegrown autocomplete index based on a database of
context-free terms, so we have kind of a weird use case.

Note that wildcard matches will all be scored the same, so you might
need to do something to order them to suit your needs. In our case,
we're storing the value length and sorting on that, among other
things, but YMMV.
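
A minimal sketch of that ordering step, done client-side in Python. The 
field names value and value_length are assumptions for illustration, not 
the real schema, and in practice the same ordering could be requested from 
Solr with a sort on the stored length field.

```python
def order_by_length(docs):
    """Shortest stored value first; ties are broken alphabetically."""
    return sorted(docs, key=lambda doc: (doc["value_length"], doc["value"]))


# Hypothetical wildcard hits, which would all come back with the same score.
values = ["klostermeyer", "kloster", "klosters"]
matches = [{"value": v, "value_length": len(v)} for v in values]

for doc in order_by_length(matches):
    print(doc["value"], doc["value_length"])
```

Sorting on length puts the closest completion of the typed prefix first, 
which suits an autocomplete list.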

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com


On Wed, Jun 27, 2012 at 2:16 PM, Klostermeyer, Michael wrote:
> Interesting solution.  Can you then explain to me for a given query:
>
> ?q='kloster' OR kloster*
>
> How the "exact match" part of that is boosted (assuming the above is how you 
> formulated your query)?
>
> Thanks!
>
> Mike
>


Re: Wildcard queries on whole words

2012-06-27 Thread Erick Erickson
q=kloster^3 OR kloster*

On Wed, Jun 27, 2012 at 2:16 PM, Klostermeyer, Michael wrote:
> Interesting solution.  Can you then explain to me for a given query:
>
> ?q='kloster' OR kloster*
>
> How the "exact match" part of that is boosted (assuming the above is how you 
> formulated your query)?
>
> Thanks!
>
> Mike
>


RE: Wildcard queries on whole words

2012-06-27 Thread Klostermeyer, Michael
Interesting solution.  Can you then explain to me for a given query:

?q='kloster' OR kloster*

How the "exact match" part of that is boosted (assuming the above is how you 
formulated your query)?

Thanks!

Mike



Re: Wildcard queries on whole words

2012-06-27 Thread Michael Della Bitta
Hi Michael,

I solved a similar issue by reformatting my query to do an OR across
an exact match or a wildcard query, with the exact match boosted.
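
For illustration, the query string could be assembled like this in Python. 
This is only a sketch: the function names, the default boost of 2, and the 
localhost URL are assumptions, not taken from a real setup.

```python
from urllib.parse import urlencode


def exact_or_prefix(term, boost=2):
    """Build '<term>^<boost> OR <term>*'.

    Only the exact-term clause carries the boost; the wildcard clause
    keeps its default weight.
    """
    return f"{term}^{boost} OR {term}*"


def select_url(base, term, boost=2):
    """Append the URL-encoded query as the q parameter of a select URL."""
    return base + "?" + urlencode({"q": exact_or_prefix(term, boost)})


# Hypothetical Solr endpoint, used only to show the encoding step.
print(exact_or_prefix("kloster"))
print(select_url("http://localhost:8983/solr/select", "kloster"))
```

Documents containing the exact term match both clauses and pick up the 
boost, so they rank above documents matched by the wildcard alone.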

HTH,

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com

