Re: wildcard search doesn't fetch results when field has whi

2019-03-31 Thread Erick Erickson

Yes. But you haven’t told us what _type_ of field you’re working with. 
If it’s a “string” type, then ComplexPhraseQueryParser won’t work. Looking 
again at your example, it looks as though you are using strings. In that case, try
abc\ d*

Adding debug=query to your url will show you how the query gets parsed and may 
help considerably.
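
A minimal sketch of that advice in Python, assuming a Solr core at http://localhost:8983/solr/mycore and a string field called name_s (both names are made up for illustration and are not from this thread). It only builds the request URL: the space is backslash-escaped so the term stays in one piece, and debug=query is added so the response shows how the query was parsed.

```python
from urllib.parse import urlencode


def escape_whitespace(term: str) -> str:
    """Backslash-escape spaces so the query parser keeps the term as one token."""
    return term.replace(" ", "\\ ")


def build_debug_url(base_url: str, field: str, prefix: str) -> str:
    # prefix + "*" is a trailing-wildcard query against a single-token field.
    q = f"{field}:{escape_whitespace(prefix)}*"
    # debug=query asks Solr to include the parsed query in the response.
    return f"{base_url}/select?{urlencode({'q': q, 'debug': 'query'})}"


# Hypothetical core and field names, for illustration only.
url = build_debug_url("http://localhost:8983/solr/mycore", "name_s", "abc d")
print(url)
```

The parsedquery entry in the debug section of the response is the part to compare against what is actually in the index.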

Best,
Erick
 

> On Mar 31, 2019, at 7:24 AM, Ahemad Ali  
> wrote:
> 
> Erick, I tried complexqueryparser, still no result. Escape white space, do you 
> mean to say using "\"? Thanks, Ahemad 
> 
>  On Sun, Mar 31, 2019 at 1:22, Erick Erickson wrote: 
>   Try complexphrasequeryparser. If (and only if) you always want to search
> from the beginning of the content, you might be able to use string rather
> than text-based Fields but make sure to escape whitespace...
> 
> Best,
> Erick
> 
> On Sat, Mar 30, 2019, 10:33 ahemad.sh...@yahoo.com.INVALID
>  wrote:
> 
>> Hi,
>> I have a field with white spaces and special characters that needs to be
>> indexed so that wildcard queries can run against it.
>> It works for most of the scenarios with wildcard search.
>> e.g. if my data is "ali.abc" and "abc_pqr" and "ali abc" and "ahemad ali"
>> then a search with ali* gives these three results.
>> 
>> But I am not able to search with, say, ali a*
>> 
>> A search with query q="ali abc" gives an exact match and the desired result.
>> 
>> I want to do a wildcard search where the criteria can include spaces, for
>> example "ahemad a*" or ahemad a*
>> 
>> 
>> i.e. if a space is present then I am not able to do a wildcard search.
>> 
>> Is there any way to make wildcard search work even if a space
>> is present in the token?
>> 
>> The field type I have is below:
>> 
>> > sortMissingLast="true">
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> > replacement=""replace="all" />
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> > replacement=""replace="all" />
>> 
>> 
>> 
>> 
>> Any help would be great.
>> Thanks,Ahemad Ali
>

Re: wildcard search doesn't fetch results when field has whi

2019-03-31 Thread Ahemad Ali

Erick, I tried complexqueryparser, still no result. Escape white space, do you 
mean to say using "\"? Thanks, Ahemad 

  On Sun, Mar 31, 2019 at 1:22, Erick Erickson wrote:  
 Try complexphrasequeryparser. If (and only if) you always want to search
from the beginning of the content, you might be able to use string rather
than text-based Fields but make sure to escape whitespace...

Best,
Erick

On Sat, Mar 30, 2019, 10:33 ahemad.sh...@yahoo.com.INVALID
 wrote:

> Hi,
> I have a field with white spaces and special characters that needs to be
> indexed so that wildcard queries can run against it.
> It works for most of the scenarios with wildcard search.
> e.g. if my data is "ali.abc" and "abc_pqr" and "ali abc" and "ahemad ali"
> then a search with ali* gives these three results.
>
> But I am not able to search with, say, ali a*
>
> A search with query q="ali abc" gives an exact match and the desired result.
>
> I want to do a wildcard search where the criteria can include spaces, for
> example "ahemad a*" or ahemad a*
>
>
> i.e. if a space is present then I am not able to do a wildcard search.
>
> Is there any way to make wildcard search work even if a space
> is present in the token?
>
> The field type I have is below:
>
>     sortMissingLast="true">
>
>    
>
>        
>
>        
>
>         replacement=""replace="all" />
>
>        
>
>    
>
>    
>
>        
>
>        
>
>         replacement=""replace="all" />
>
>    
>
> 
> Any help would be great.
> Thanks,Ahemad Ali

Re: wildcard search doesn't fetch results when field has white spaces and special charecters

2019-03-30 Thread Erick Erickson

Try complexphrasequeryparser. If (and only if) you always want to search
from the beginning of the content, you might be able to use string rather
than text-based Fields but make sure to escape whitespace...

Best,
Erick

On Sat, Mar 30, 2019, 10:33 ahemad.sh...@yahoo.com.INVALID
 wrote:

> Hi,
> I have a field with white spaces and special characters that needs to be
> indexed so that wildcard queries can run against it.
> It works for most of the scenarios with wildcard search.
> e.g. if my data is "ali.abc" and "abc_pqr" and "ali abc" and "ahemad ali"
> then a search with ali* gives these three results.
>
> But I am not able to search with, say, ali a*
>
> A search with query q="ali abc" gives an exact match and the desired result.
>
> I want to do a wildcard search where the criteria can include spaces, for
> example "ahemad a*" or ahemad a*
>
>
> i.e. if a space is present then I am not able to do a wildcard search.
>
> Is there any way to make wildcard search work even if a space
> is present in the token?
>
> The field type I have is below:
>
>  sortMissingLast="true">
>
> 
>
> 
>
> 
>
>  replacement=""replace="all" />
>
> 
>
> 
>
> 
>
> 
>
> 
>
>  replacement=""replace="all" />
>
> 
>
> 
> Any help would be great.
> Thanks,Ahemad Ali

RE: Wildcard search not working

2016-08-12 Thread Ribeaud, Christian (Ext)

Hi Ahmet, Hi Upayavira,

OK, it seems that I have to dive a bit deeper into the Solr filters and 
tokenizers. I've just realized that my command of them is too limited.
Thanks a lot so far for the help, guys. Cheers and have a nice day,

christian

-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: Friday, 12 August 2016 07:41
To: solr-user@lucene.apache.org; Ribeaud, Christian (Ext)
Subject: Re: Wildcard search not working

Hi Christian,

Please use the following filter before/above the stemmer.


Plus, you may want to add :


  
  
  

Ahmet



On Thursday, August 11, 2016 9:31 PM, "Ribeaud, Christian (Ext)" 
<christian.ribe...@novartis.com> wrote:
Hi Ahmet,

Many thanks for your reply. I had a look at the URL you pointed out but, 
honestly, I have to admit that I did not fully understand you.
Let's be a bit more concrete. Here is the schema snippet for the 
corresponding field:

...




 









...

What is wrong with this schema? Or rather, what should I change to be able 
to correctly do wildcard searches?

Many thanks for your time. Cheers,

christian
--
Christian Ribeaud
Software Engineer (External)
NIBR / WSJ-310.5.17
Novartis Campus
CH-4056 Basel



-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: Thursday, 11 August 2016 16:00
To: solr-user@lucene.apache.org; Ribeaud, Christian (Ext)
Subject: Re: Wildcard search not working

Hi Christian,

The query r?che may not return at least the same number of matches as roche, 
depending on your analysis chain.
The difference is that roche is analyzed but r?che is not. Wildcard queries are 
executed against the indexed/analyzed terms.
For example, if roche is indexed/analyzed as roch, the query r?che won't match 
it.

Please see : https://wiki.apache.org/solr/MultitermQueryAnalysis

Ahmet



On Thursday, August 11, 2016 4:42 PM, "Ribeaud, Christian (Ext)" 
<christian.ribe...@novartis.com> wrote:
Hi,

What could cause wildcard searches with the Lucene Query Parser to NOT 
work?

We are using Solr 5.4.1 and, using the admin console, I am triggering, for 
instance, searches with the term 'roche' in a specific core. Everything is fine: I am 
getting, for instance, two matches. I would expect at least the same number of 
matches with the term 'r?che'. However, this does NOT happen. I am getting zero 
matches. The same problem occurs with 'r*che'. 'roch?' does not work either, but 
'roch*' works.

Switching on debug mode brings the following output:

"debug": {
"rawquerystring": "roch?",
"querystring": "roch?",
"parsedquery": "text:roch?",
"parsedquery_toString": "text:roch?",
"explain": {},
"QParser": "LuceneQParser",
...

Any idea? Thanks and cheers,

christian

Re: Wildcard search not working

2016-08-12 Thread Ahmet Arslan

Hi Christian,

Please use the following filter before/above the stemmer.


Plus, you may want to add :


  
  
  

Ahmet



On Thursday, August 11, 2016 9:31 PM, "Ribeaud, Christian (Ext)" 
<christian.ribe...@novartis.com> wrote:
Hi Ahmet,

Many thanks for your reply. I had a look at the URL you pointed out but, 
honestly, I have to admit that I did not fully understand you.
Let's be a bit more concrete. Here is the schema snippet for the 
corresponding field:

...




 









...

What is wrong with this schema? Or rather, what should I change to be able 
to correctly do wildcard searches?

Many thanks for your time. Cheers,

christian
--
Christian Ribeaud
Software Engineer (External)
NIBR / WSJ-310.5.17
Novartis Campus
CH-4056 Basel



-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: Thursday, 11 August 2016 16:00
To: solr-user@lucene.apache.org; Ribeaud, Christian (Ext)
Subject: Re: Wildcard search not working

Hi Christian,

The query r?che may not return at least the same number of matches as roche, 
depending on your analysis chain.
The difference is that roche is analyzed but r?che is not. Wildcard queries are 
executed against the indexed/analyzed terms.
For example, if roche is indexed/analyzed as roch, the query r?che won't match 
it.

Please see : https://wiki.apache.org/solr/MultitermQueryAnalysis

Ahmet



On Thursday, August 11, 2016 4:42 PM, "Ribeaud, Christian (Ext)" 
<christian.ribe...@novartis.com> wrote:
Hi,

What could cause wildcard searches with the Lucene Query Parser to NOT 
work?

We are using Solr 5.4.1 and, using the admin console, I am triggering, for 
instance, searches with the term 'roche' in a specific core. Everything is fine: I am 
getting, for instance, two matches. I would expect at least the same number of 
matches with the term 'r?che'. However, this does NOT happen. I am getting zero 
matches. The same problem occurs with 'r*che'. 'roch?' does not work either, but 
'roch*' works.

Switching on debug mode brings the following output:

"debug": {
"rawquerystring": "roch?",
"querystring": "roch?",
"parsedquery": "text:roch?",
"parsedquery_toString": "text:roch?",
"explain": {},
"QParser": "LuceneQParser",
...

Any idea? Thanks and cheers,

christian

Re: Wildcard search not working

2016-08-11 Thread Upayavira

You have a stemming filter in your analysis chain. Go to the analysis
tab, select the 'text' field, and put "Roche" into both boxes. Click
analyse. I bet you will see Roch, not Roche, because of the
stemming filter shown below.

That's what Ahmet shrewdly identified above.
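
The same check can be scripted instead of clicked through. The following is only a sketch: it assumes the field analysis request handler is reachable at /analysis/field on the core and takes the analysis.fieldname, analysis.fieldvalue and analysis.query parameters, and the host and core name are invented for illustration. It builds the URL and leaves the actual HTTP call to the reader.

```python
from urllib.parse import urlencode


def analysis_url(core_url: str, field_name: str, index_text: str, query_text: str) -> str:
    """Build a URL that asks Solr to run both index-time and query-time analysis."""
    params = {
        "analysis.fieldname": field_name,   # field whose analyzer chain is used
        "analysis.fieldvalue": index_text,  # text run through the index analyzer
        "analysis.query": query_text,       # text run through the query analyzer
        "wt": "json",
    }
    return f"{core_url}/analysis/field?{urlencode(params)}"


# Hypothetical host and core name, for illustration only.
url = analysis_url("http://localhost:8983/solr/mycore", "text", "Roche", "Roche")
print(url)
```

The last stage listed in the index analysis output is the term a wildcard pattern has to match.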

Upayavira

On Thu, 11 Aug 2016, at 08:31 PM, Ribeaud, Christian (Ext) wrote:
> Hi Ahmet,
> 
> Many thanks for your reply. I had a look at the URL you pointed out but,
> honestly, I have to admit that I did not fully understand you.
> Let's be a bit more concrete. Here is the schema snippet for the
> corresponding field:
> 
> ...
>  required="false" multiValued="false" />
> 
> 
>  positionIncrementGap="100">
>  
> 
> 
>  words="lang/stopwords_de.txt" format="snowball" />
> 
> 
> 
> 
> 
> 
> ...
> 
> What is wrong with this schema? Or rather, what should I change to be
> able to correctly do wildcard searches?
> 
> Many thanks for your time. Cheers,
> 
> christian
> --
> Christian Ribeaud
> Software Engineer (External)
> NIBR / WSJ-310.5.17
> Novartis Campus
> CH-4056 Basel
> 
> 
> -----Original Message-
> From: Ahmet Arslan [mailto:iori...@yahoo.com] 
> Sent: Thursday, 11 August 2016 16:00
> To: solr-user@lucene.apache.org; Ribeaud, Christian (Ext)
> Subject: Re: Wildcard search not working
> 
> Hi Christian,
> 
> The query r?che may not return at least the same number of matches as
> roche, depending on your analysis chain.
> The difference is that roche is analyzed but r?che is not. Wildcard queries are
> executed against the indexed/analyzed terms.
> For example, if roche is indexed/analyzed as roch, the query r?che won't
> match it.
> 
> Please see : https://wiki.apache.org/solr/MultitermQueryAnalysis
> 
> Ahmet
> 
> 
> 
> On Thursday, August 11, 2016 4:42 PM, "Ribeaud, Christian (Ext)"
> <christian.ribe...@novartis.com> wrote:
> Hi,
> 
> What could cause wildcard searches with the Lucene Query Parser to NOT
> work?
> 
> We are using Solr 5.4.1 and, using the admin console, I am triggering, for
> instance, searches with the term 'roche' in a specific core. Everything is fine:
> I am getting, for instance, two matches. I would expect at least the same
> number of matches with the term 'r?che'. However, this does NOT happen. I am
> getting zero matches. The same problem occurs with 'r*che'. 'roch?' does not
> work either, but 'roch*' works.
> 
> Switching on debug mode brings the following output:
> 
> "debug": {
> "rawquerystring": "roch?",
> "querystring": "roch?",
> "parsedquery": "text:roch?",
> "parsedquery_toString": "text:roch?",
> "explain": {},
> "QParser": "LuceneQParser",
> ...
> 
> Any idea? Thanks and cheers,
> 
> christian

RE: Wildcard search not working

2016-08-11 Thread Ribeaud, Christian (Ext)

Hi Ahmet,

Many thanks for your reply. I had a look at the URL you pointed out but, 
honestly, I have to admit that I did not fully understand you.
Let's be a bit more concrete. Here is the schema snippet for the 
corresponding field:

...




 









...

What is wrong with this schema? Or rather, what should I change to be able 
to correctly do wildcard searches?

Many thanks for your time. Cheers,

christian
--
Christian Ribeaud
Software Engineer (External)
NIBR / WSJ-310.5.17
Novartis Campus
CH-4056 Basel


-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: Thursday, 11 August 2016 16:00
To: solr-user@lucene.apache.org; Ribeaud, Christian (Ext)
Subject: Re: Wildcard search not working

Hi Christian,

The query r?che may not return at least the same number of matches as roche, 
depending on your analysis chain.
The difference is that roche is analyzed but r?che is not. Wildcard queries are 
executed against the indexed/analyzed terms.
For example, if roche is indexed/analyzed as roch, the query r?che won't match 
it.

Please see : https://wiki.apache.org/solr/MultitermQueryAnalysis

Ahmet



On Thursday, August 11, 2016 4:42 PM, "Ribeaud, Christian (Ext)" 
<christian.ribe...@novartis.com> wrote:
Hi,

What could cause wildcard searches with the Lucene Query Parser to NOT 
work?

We are using Solr 5.4.1 and, using the admin console, I am triggering, for 
instance, searches with the term 'roche' in a specific core. Everything is fine: I am 
getting, for instance, two matches. I would expect at least the same number of 
matches with the term 'r?che'. However, this does NOT happen. I am getting zero 
matches. The same problem occurs with 'r*che'. 'roch?' does not work either, but 
'roch*' works.

Switching on debug mode brings the following output:

"debug": {
"rawquerystring": "roch?",
"querystring": "roch?",
"parsedquery": "text:roch?",
"parsedquery_toString": "text:roch?",
"explain": {},
"QParser": "LuceneQParser",
...

Any idea? Thanks and cheers,

christian

Re: Wildcard search not working

2016-08-11 Thread Ahmet Arslan

Hi Christian,

The query r?che may not return at least the same number of matches as roche, 
depending on your analysis chain.
The difference is that roche is analyzed but r?che is not. Wildcard queries are 
executed against the indexed/analyzed terms.
For example, if roche is indexed/analyzed as roch, the query r?che won't match 
it.

Please see : https://wiki.apache.org/solr/MultitermQueryAnalysis
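
To make the mismatch concrete, here is a toy model in Python. It is a sketch under loud assumptions: toy_stem is a made-up stand-in that only drops a trailing "e" (it is not the stemmer from the schema), and fnmatchcase stands in for Lucene's wildcard matching. The point is only that plain terms are analyzed before lookup while wildcard patterns are compared with the stored terms as they are.

```python
from fnmatch import fnmatchcase


def toy_stem(term: str) -> str:
    """Made-up stand-in for a real stemmer: drops one trailing 'e'."""
    return term[:-1] if term.endswith("e") else term


# Index time: "roche" goes through the analysis chain and is stored as "roch".
indexed_terms = {toy_stem("roche")}


def term_query(word: str) -> bool:
    # A plain term is analyzed (stemmed) before it is looked up.
    return toy_stem(word) in indexed_terms


def wildcard_query(pattern: str) -> bool:
    # A wildcard pattern is matched against the stored terms without stemming.
    return any(fnmatchcase(term, pattern) for term in indexed_terms)


print(term_query("roche"))      # True
print(wildcard_query("r?che"))  # False
print(wildcard_query("r*che"))  # False
print(wildcard_query("roch?"))  # False
print(wildcard_query("roch*"))  # True
```

Under these assumptions the toy model reproduces the pattern reported in the original question: roche and roch* match, the other three do not.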

Ahmet



On Thursday, August 11, 2016 4:42 PM, "Ribeaud, Christian (Ext)" 
 wrote:
Hi,

What could cause wildcard searches with the Lucene Query Parser to NOT 
work?

We are using Solr 5.4.1 and, using the admin console, I am triggering, for 
instance, searches with the term 'roche' in a specific core. Everything is fine: I am 
getting, for instance, two matches. I would expect at least the same number of 
matches with the term 'r?che'. However, this does NOT happen. I am getting zero 
matches. The same problem occurs with 'r*che'. 'roch?' does not work either, but 
'roch*' works.

Switching on debug mode brings the following output:

"debug": {
"rawquerystring": "roch?",
"querystring": "roch?",
"parsedquery": "text:roch?",
"parsedquery_toString": "text:roch?",
"explain": {},
"QParser": "LuceneQParser",
...

Any idea? Thanks and cheers,

christian

RE: wildcard search for string having spaces

2016-06-15 Thread Roshan Kamble

Great.
The first option worked for me. I was trying q=abc\sp*, but it should be
q=abc\ p*

Thanks

-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Wednesday, June 15, 2016 6:25 PM
To: solr-user@lucene.apache.org; Roshan Kamble
Subject: Re: wildcard search for string having spaces

Hi Roshan,

I think there are two options:

1) escape the space q=abc\ p*
2) use prefix query parser q={!prefix f=my_string}abc p

Ahmet

On Wednesday, June 15, 2016 3:48 PM, Roshan Kamble 
<roshan.kam...@smartstreamrdu.com> wrote:
Hello,

I have the custom field type below defined for Solr 6.0.0

I am using the above field to ensure that the entire string is treated as a single 
token and that search is case insensitive.

It works for most of the scenarios with wildcard search.
e.g. if my data is "abc.pqr" and "abc_pqr" and "abc pqr" then a search with abc* 
gives these three results.

But I am not able to search with, say, abc p*

A search with query q="abc pqr" gives an exact match and the desired result.

I want to do a wildcard search where the criteria can include spaces, as in the above 
example

i.e. if a space is present then I am not able to do a wildcard search.

Is there any way to make wildcard search work even if a space is 
present in the token?

Regards,
Roshan

The information in this email is confidential and may be legally privileged. It 
is intended solely for the addressee. Access to this email by anyone else is 
unauthorised. If you are not the intended recipient, any disclosure, copying, 
distribution or any action taken or omitted to be taken in reliance on it, is 
prohibited and may be unlawful.


Re: wildcard search for string having spaces

2016-06-15 Thread Ahmet Arslan

Hi Roshan,

I think there are two options:

1) escape the space q=abc\ p*
2) use prefix query parser q={!prefix f=my_string}abc p
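
A short Python sketch of the two options, building only the q strings. The field name my_string comes from the message above; the helper names are made up for illustration.

```python
def escaped_wildcard_query(field: str, prefix: str) -> str:
    """Option 1: backslash-escape each space, then append the wildcard."""
    return f"{field}:{prefix.replace(' ', chr(92) + ' ')}*"


def prefix_parser_query(field: str, prefix: str) -> str:
    """Option 2: hand the raw prefix to the prefix query parser.

    No escaping and no trailing * are needed, because the whole remainder
    of the string is taken as the prefix.
    """
    return f"{{!prefix f={field}}}{prefix}"


print(escaped_wildcard_query("my_string", "abc p"))  # my_string:abc\ p*
print(prefix_parser_query("my_string", "abc p"))     # {!prefix f=my_string}abc p
```

Either string still has to be URL-encoded before it is sent as the q parameter.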

Ahmet


On Wednesday, June 15, 2016 3:48 PM, Roshan Kamble 
 wrote:
Hello,

I have the custom field type below defined for Solr 6.0.0

   

  
  


  
  

  


I am using the above field to ensure that the entire string is treated as a single 
token and that search is case insensitive.

It works for most of the scenarios with wildcard search.
e.g. if my data is "abc.pqr" and "abc_pqr" and "abc pqr" then a search with abc* 
gives these three results.

But I am not able to search with, say, abc p*

A search with query q="abc pqr" gives an exact match and the desired result.

I want to do a wildcard search where the criteria can include spaces, as in the above 
example


i.e. if a space is present then I am not able to do a wildcard search.

Is there any way to make wildcard search work even if a space is 
present in the token?

Regards,
Roshan


Re: Wildcard search makes no sense!!

2014-10-02 Thread waynemailinglist

Many many thanks for the replies - it was helpful for me to start
understanding how this works.

I'm using 3.5 so this goes to explain a lot. What I have done is if the
query contains a * I make the query lowercase before sending to solr. This
seems to have solved this issue given your explanation above. Many thanks 
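
For readers on the same version, the workaround can be sketched in Python as follows. This is an illustration only: prepare_query is a made-up helper, the indexed term is assumed to have been lowercased at index time, and fnmatchcase stands in for the server-side wildcard match.

```python
from fnmatch import fnmatchcase


def prepare_query(q: str) -> str:
    """Lowercase the query on the client, but only when it contains a wildcard."""
    return q.lower() if ("*" in q or "?" in q) else q


# What a lowercasing index analyzer would store for the title "Κώστας".
indexed_term = "Κώστας".lower()

raw = "Κώστα*"
print(fnmatchcase(indexed_term, raw))                 # False: capital Κ vs stored κ
print(fnmatchcase(indexed_term, prepare_query(raw)))  # True once the pattern is lowercased
print(prepare_query("Κώστας"))                        # unchanged, the server analyzes plain terms
```

This only mirrors filters as simple as lowercasing; it cannot imitate a stemmer.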

Something that is still not clear in my mind is how this tokenising works.
For example with the filters I have when I run the analyser I get:
Field: Hello You

Hello|You
Hello|You
Hello|You
hello|you
hello|you


Does this mean that the index is stored as 'hello|you' (the final one) and
that when I run a query and it goes through the filters whatever the end
result of that is must match the 'hello|you' in order to return a result?






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-makes-no-sense-tp4162069p4162284.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search makes no sense!!

2014-10-02 Thread Shawn Heisey

On 10/2/2014 4:33 AM, waynemailinglist wrote:
 Something that is still not clear in my mind is how this tokenising works.
 For example with the filters I have when I run the analyser I get:
 Field: Hello You
 
 Hello|You
 Hello|You
 Hello|You
 hello|you
 hello|you
 
 
 Does this mean that the index is stored as 'hello|you' (the final one) and
 that when I run a query and it goes through the filters whatever the end
 result of that is must match the 'hello|you' in order to return a result?

The index has two terms for this field if this is the whole input --
hello and you -- which can be searched for individually.  The tokenizer
does the initial job of separating the input into tokens (terms) ...
some filters can create additional terms, depending on exactly what's
left when the tokenizer is done.
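
A toy version of that pipeline in Python may help. It is a sketch that assumes a whitespace tokenizer followed by a lowercase filter, which is simpler than the chain in the question. The index ends up with two separate terms, not one combined "hello|you" entry.

```python
def whitespace_tokenize(text: str) -> list:
    """Tokenizer: split the input into tokens on whitespace."""
    return text.split()


def lowercase_filter(tokens: list) -> list:
    """Filter: transform each token; this one creates no extra tokens."""
    return [token.lower() for token in tokens]


def analyze(text: str) -> list:
    return lowercase_filter(whitespace_tokenize(text))


index = {}  # term -> set of document ids containing it


def add_document(doc_id: int, text: str) -> None:
    for term in analyze(text):
        index.setdefault(term, set()).add(doc_id)


add_document(1, "Hello You")
print(sorted(index))     # ['hello', 'you']
print(index.get("you"))  # {1}
```

A query for either hello or you alone finds document 1, because each term has its own entry.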

Thanks,
Shawn

Re: Wildcard search makes no sense!!

2014-10-02 Thread Erick Erickson

Right, prior to 3.6, the standard way to handle wildcards was to,
essentially, pre-analyze the terms that had wildcards. This works
fine for simple filters, things like lowercasing for instance, but
doesn't work so well for things like stemming.

So you're doing what can be done at this point, but moving to 4.x (or
even 3.6) would solve it better.

Best,
Erick

On Thu, Oct 2, 2014 at 6:29 AM, Shawn Heisey apa...@elyograg.org wrote:
 On 10/2/2014 4:33 AM, waynemailinglist wrote:
 Something that is still not clear in my mind is how this tokenising works.
 For example with the filters I have when I run the analyser I get:
 Field: Hello You

 Hello|You
 Hello|You
 Hello|You
 hello|you
 hello|you


 Does this mean that the index is stored as 'hello|you' (the final one) and
 that when I run a query and it goes through the filters whatever the end
 result of that is must match the 'hello|you' in order to return a result?

 The index has two terms for this field if this is the whole input --
 hello and you -- which can be searched for individually.  The tokenizer
 does the initial job of separating the input into tokens (terms) ...
 some filters can create additional terms, depending on exactly what's
 left when the tokenizer is done.

 Thanks,
 Shawn

Re: Wildcard search makes no sense!!

2014-10-02 Thread waynemailinglist

Ok, I think I understand your points there. Just to clarify: say the term was
"Large increased" and my filters went something like:

Large|increased
Large|increase|increased
large|increase|increased

the final tokens indexed would be large|increase|increased  ?

Once again thanks for all the help.


On Thu, Oct 2, 2014 at 2:30 PM, Shawn Heisey-2 [via Lucene] 
ml-node+s472066n4162306...@n3.nabble.com wrote:

 On 10/2/2014 4:33 AM, waynemailinglist wrote:

  Something that is still not clear in my mind is how this tokenising
 works.
  For example with the filters I have when I run the analyser I get:
  Field: Hello You
 
  Hello|You
  Hello|You
  Hello|You
  hello|you
  hello|you
 
 
  Does this mean that the index is stored as 'hello|you' (the final one)
 and
  that when I run a query and it goes through the filters whatever the end
  result of that is must match the 'hello|you' in order to return a
 result?

 The index has two terms for this field if this is the whole input --
 hello and you -- which can be searched for individually.  The tokenizer
 does the initial job of separating the input into tokens (terms) ...
 some filters can create additional terms, depending on exactly what's
 left when the tokenizer is done.

 Thanks,
 Shawn








--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-makes-no-sense-tp4162069p4162349.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search makes no sense!!

2014-10-01 Thread Ahmet Arslan

Hi,

You probably have a stemmer, and it is reducing Capital to capit. That's the 
reason.
Either remove the stemmer from the analyzer chain or add a keyword repeat filter.

Ahmet



On Wednesday, October 1, 2014 2:16 PM, Wayne W waynemailingli...@gmail.com 
wrote:
Hi,

I don't understand this at all. We are indexing some contact names. When we
do a standard query:

query 1: capi*
result: Capital Health

query 2: capit*
result: Capital Health

query 3: capita*
result: no results

query 4: capital*
result: no results

I understand (as we are using Solr 3.5) that the wildcard search does not
actually return the query without the wildcard so I understand at least why
query 4 is not working (I need to use: capital* OR capital). What I don't
understand is why query 3 is not working.

Also if we place in the text field the following 3 contacts:

j...@capitalhealth.com
f...@capitalhealth.com
Capital Heath

When searching for:

query A: capita*
result: j...@capitalhealth.com, f...@capitalhealth.com

query B: capit*
result: j...@capitalhealth.com, f...@capitalhealth.com, Capital Heath


What is going on and how can I solve this?
many thanks as I'm really stuck on this

Re: Wildcard search makes no sense!!

2014-10-01 Thread Toke Eskildsen

On Wed, 2014-10-01 at 13:16 +0200, Wayne W wrote:
 query 2: capit*
 result: Capital Health
 
 query 3: capita*
 result: no results

You are likely using a stemmer for the field: Capital Health gets
indexed as capit and health, so there are no tokens starting with
capita.

Turn off the stemmer or add a non-stemmed copy-field for truncated
searches.
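
A small Python sketch of the copy-field idea. The stemmed forms are the ones quoted in this thread (capit, health) and are hard-coded, not computed by a real stemmer. The unstemmed set is an assumption about what a lowercased, non-stemmed copy of the field would contain.

```python
# Terms for "Capital Health" as the stemmed field would hold them (from the thread).
stemmed_terms = {"capit", "health"}
# Terms a lowercased, non-stemmed copy of the same field would hold (assumed).
unstemmed_terms = {"capital", "health"}


def prefix_match(terms: set, prefix: str) -> list:
    """Return the stored terms that start with the given prefix."""
    return sorted(term for term in terms if term.startswith(prefix))


for prefix in ["capi", "capit", "capita", "capital"]:
    print(prefix, prefix_match(stemmed_terms, prefix), prefix_match(unstemmed_terms, prefix))
```

Against the stemmed terms, capita and capital find nothing, which matches queries 3 and 4 in the original question. Against the unstemmed copy, all four prefixes find capital.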


(sanity-checked at http://9ol.es/porter_js_demo.html)


- Toke Eskildsen, State and University Library, Denmark

Re: Wildcard search makes no sense!!

2014-10-01 Thread Jack Krupansky

The presence of a wildcard in a query term short circuits some portions of 
the analysis process. Some token filters like lower case can still be 
performed on the query terms, but others, like stemming, cannot. So, either 
simplify the analysis (be more selective of what token filters you use), or 
you will have to modify your query terms so that you manually simulate the 
token transformations that your text analysis is performing.


Take one of your indexed terms that you think should match and send it 
through the Solr Admin UI analysis page for the query field and see what the 
source token gets analyzed into - that's what your wildcard prefix must 
match. Sometimes (usually!) you will be surprised.


-- Jack Krupansky

-Original Message- 
From: Wayne W

Sent: Wednesday, October 1, 2014 7:16 AM
To: solr-user@lucene.apache.org
Subject: Wildcard search makes no sense!!

Hi,

I don't understand this at all. We are indexing some contact names. When we
do a standard query:

query 1: capi*
result: Capital Health

query 2: capit*
result: Capital Health

query 3: capita*
result: no results

query 4: capital*
result: no results

I understand (as we are using Solr 3.5) that the wildcard search does not
actually return the query without the wildcard so I understand at least why
query 4 is not working (I need to use: capital* OR capital). What I don't
understand is why query 3 is not working.

Also if we place in the text field the following 3 contacts:

j...@capitalhealth.com
f...@capitalhealth.com
Capital Heath

When searching for:

query A: capita*
result: j...@capitalhealth.com, f...@capitalhealth.com

query B: capit*
result: j...@capitalhealth.com, f...@capitalhealth.com, Capital Heath


What is going on and how can I solve this?
many thanks as I'm really stuck on this

Re: Wildcard search makes no sense!!

2014-10-01 Thread waynemailinglist

Ahmet - many thanks - I removed the EnglishPorterFilterFactory and reindexed,
and this seems to behave as expected now.

Jack - thanks as well - I'm very much a noob with this, and that's a great
tip.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-makes-no-sense-tp4162069p4162086.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search makes no sense!!

2014-10-01 Thread waynemailinglist

I'm still stuck on this actually. I would really appreciate any pointers. 
If I search for :
query 1: Κώστας
result: Κώστας

query 2: Κώστα*
result: no result

I've looked at the analyser but I don't really understand what I'm looking
at if I'm honest. It gives the output:
Field (name): title
Field value: Κώστας
Field value (query): Κώστα*

Index Analyzer
Κώστας
Κώστας
Κώστας
κώστας
κώστας
Query Analyzer
Κώστα*
Κώστα*
Κώστα*
Κώστα
κώστα
κώστα


In my schema I have defined
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/> (only used in query)
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="0" catenateNumbers="0"
catenateAll="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>


I tried adding ASCIIFoldingFilterFactory but that didn't make any difference
after reindexing.

Any ideas?

many thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-makes-no-sense-tp4162069p4162150.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search makes no sense!!

2014-10-01 Thread Alexandre Rafalovitch

If you use * you go down the Multiterm analysis path, which is semi-hidden
and a lot more limited than what is done with normal tokens:
https://wiki.apache.org/solr/MultitermQueryAnalysis

The Analyzer components that are NOT multiterm aware cannot be used
that way. Looking at: http://www.solr-start.com/info/analyzers/ , you
can see that only the LowerCase one is multiterm aware (marked with (multi)
in the brackets). So, the rest are not used.

You may switch to EdgeNGrams or similar instead.
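
The edge n-gram idea can be sketched in Python like this. It is a toy model, not Solr's filter: edge_ngrams and its min_gram and max_gram parameters are made-up names for illustration. Each indexed term is expanded into its leading substrings, so a prefix search becomes an ordinary term lookup and the query needs no wildcard at all.

```python
def edge_ngrams(term: str, min_gram: int = 1, max_gram: int = 15) -> list:
    """Return the leading substrings of term, from min_gram to max_gram characters."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]


# Index time: lowercase the value, then store every leading substring as a term.
index_terms = set(edge_ngrams("Κώστας".lower()))

# Query time: no wildcard, so the normal analysis (here just lowercasing) applies.
user_input = "Κώστα"
print(user_input.lower() in index_terms)  # True
print(edge_ngrams("abc"))                 # ['a', 'ab', 'abc']
```

The trade-off is a larger index, since every term is stored once per prefix length.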

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 1 October 2014 13:10, waynemailinglist waynemailingli...@gmail.com wrote:
 I'm still stuck on this actually. I would really appreciate any pointers.
 If I search for :
 query 1: Κώστας
 result: Κώστας

 query 2: Κώστα*
 result: no result

 I've looked at the analyser but I don't really understand what I'm looking
 at if I'm honest. It gives the output:
 Field (name): title
 Field value: Κώστας
 Field value (query): Κώστα*

 Index Analyzer
 Κώστας
 Κώστας
 Κώστας
 κώστας
 κώστας
 Query Analyzer
 Κώστα*
 Κώστα*
 Κώστα*
 Κώστα
 κώστα
 κώστα


 In my schema I have defined
 <tokenizer class="solr.WhitespaceTokenizerFactory"/>
 <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
         ignoreCase="true" expand="true"/> (only used in query)
 <filter class="solr.StopFilterFactory" ignoreCase="true"
         words="stopwords.txt"/>
 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
         generateNumberParts="1" catenateWords="0" catenateNumbers="0"
         catenateAll="0"/>
 <filter class="solr.LowerCaseFilterFactory"/>
 <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>


 I tried adding ASCIIFoldingFilterFactory but that didn't make any difference
 after reindexing.

 Any ideas?

 many thanks



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Wildcard-search-makes-no-sense-tp4162069p4162150.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search makes no sense!!

2014-10-01 Thread Erick Erickson

Two things:

1) What version of Solr are you using? If it's prior to 3.6, then the
bits that apply LowerCaseFilter to wildcards aren't in the
code.

2) What do you see if you add debug=query?

I just tried it with your analysis chain and it seemed to work. Did
you completely blow your index away when trying this? I did get into a
state where my terms didn't show up. When you change the schema,
sometimes some information about the fields is written into the index
and is incompatible with later changes.

By completely blow away I mean
stop Solr
rm -rf blah/collection/data
start Solr
reindex
test


Best,
Erick


Re: Wildcard search not working with search term having special characters and digits

2014-04-29 Thread Geepalem

Can someone help me out with this issue please?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385p4133770.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working with search term having special characters and digits

2014-04-28 Thread Geepalem

Can someone please help me with this? I am stuck on this issue.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385p4133478.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working with search term having special characters and digits

2014-04-28 Thread Jack Krupansky

Wildcard query only works for single terms. Any embedded special characters 
will cause a term to be split into multiple terms at index time. The use of 
a wildcard in a query term with embedded special characters will bypass 
normal analysis - you need to enter the term exactly as it would be analyzed 
at index time for wildcard to work.


Ditto if your field type uses the word delimiter filter with the split
digits option enabled: the alpha and numeric portions will generate
separate terms and cause a wildcard to fail.
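
To make the splitting concrete, here is a toy Python model of the effect. It is not Solr code: it assumes a simplified word-delimiter rule that splits on non-alphanumeric characters and on letter/digit boundaries, and the helper names are made up for illustration.

```python
import re
from fnmatch import fnmatchcase


def toy_word_delimiter(text):
    """Simplified stand-in for a word delimiter filter (assumption):
    split on non-alphanumerics and on letter/digit boundaries, then lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+|[0-9]+", text)]


def wildcard_hits(pattern, indexed_terms):
    """A wildcard pattern is compared against each indexed term as a whole."""
    return [t for t in indexed_terms if fnmatchcase(t, pattern)]


# "an-138" is stored as two separate terms in this toy model.
terms = toy_word_delimiter("an-138")

# The pattern spans both terms, so no single term can match it.
no_hits = wildcard_hits("an-13*", terms)

# A pattern that stays inside one term does match.
hits = wildcard_hits("13*", terms)
```

In this model the query term has to look like one of the indexed terms for the wildcard to match, which is the point made above about entering the term exactly as it would be analyzed at index time.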


-- Jack Krupansky

-Original Message- 
From: Geepalem

Sent: Sunday, April 27, 2014 3:30 PM
To: solr-user@lucene.apache.org
Subject: Wildcard search not working with search term having special 
characters and digits


Hi,

The query below, without a wildcard, returns results:
http://localhost:8080/solr/master/select?q=page_title_t:an-138;

But the query below, with a wildcard, returns no results:
http://localhost:8080/solr/master/select?q=page_title_t:an-13*;

The query below, with a wildcard and no digits, returns results:
http://localhost:8080/solr/master/select?q=page_title_t:an-*;

I tried adding WordDelimiterFilter, but with no luck:
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="0" splitOnCaseChange="1"/>


Please suggest how to make wildcard search work with special
characters and digits.

Appreciate immediate response!!

Thanks,
G. Naresh Kumar






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working with search term having special characters and digits

2014-04-28 Thread Geepalem

Thanks, Jack, for the prompt response!

So is there any way to make this scenario work?
Or do wildcards simply not work with special characters and numerics?

Thanks,
G. Naresh Kumar



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385p4133554.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working if the query contains numbers along with special characters.

2014-03-05 Thread Kashish

Hi, Pls help me with this.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-if-the-query-conatins-numbers-along-with-special-characters-tp4119608p4121457.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working if the query contains numbers along with special characters.

2014-03-05 Thread Ahmet Arslan

Hi Kashish,

This is confusing. You gave the following example:

query 1999/99* should return ARABIAN NIGHTS #01 (1999/99)

However, you said "I cannot ignore parenthesis or other special characters..."

These two statements contradict each other.

Since you are after autocomplete you might be interested in this 
http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/

Ahmet



Re: Wildcard search not working if the query contains numbers along with special characters.

2014-03-05 Thread Kashish

Hi Ahmet,

Let me explain with another scenario .

There is a title -  ARABIAN NIGHTS - 1999/99

Now in autocomplete, if I enter 1999/99, the backend appends an asterisk
to it and forms the Solr URL this way:

q=titleName:1999/99*

I get the above-mentioned title, so this works perfectly.

Now let's add another title:

- JULIUS CAESER (1999/99)

If I pass the same query parameter, I would expect both titles to come up,
but the new one doesn't (because of the parentheses).

I can add PatternReplaceFilter, but then I will never be able to
search specifically for the title I want as 1999/99.

Hope you get what I am trying to achieve. Is my understanding wrong
somewhere?

Thanks.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-if-the-query-conatins-numbers-along-with-special-characters-tp4119608p4121512.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working if the query contains numbers along with special characters.

2014-03-05 Thread Ahmet Arslan

Hi,

Forget about PatternReplaceCharFilter for a moment. Your example is clearer
this time.

q=titleName:1999/99*

should return the following two docs:

d1) JULIUS CAESER (1999/99)

d2) ARABIAN NIGHTS - 1999/99

This is achievable with the following type.

1) MappingCharFilterFactory with mappings.txt

"(" => ""
")" => ""

2) WhitespaceTokenizerFactory
3) LowerCaseFilterFactory

I don't understand your sentence: "I will never be able to specifically search
the title I want as 1999/99."
But please try the above and test it. I also suggest using the prefix query parser.

https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-PrefixQueryParser
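
As a rough illustration of why the mapping step matters, the following Python sketch imitates the proposed chain (drop parentheses, split on whitespace, lowercase) and then does a plain prefix comparison per token. It is only a toy model under those assumptions; the function names are invented here and nothing in it calls Solr.

```python
def analyze(text, strip_parens):
    """Toy version of the chain: optional paren removal (the char mapping),
    whitespace tokenization, then lowercasing."""
    if strip_parens:
        text = text.replace("(", "").replace(")", "")
    return [tok.lower() for tok in text.split()]


def prefix_matches(prefix, title, strip_parens):
    """True if any analyzed token of the title starts with the prefix."""
    return any(tok.startswith(prefix.lower())
               for tok in analyze(title, strip_parens))


titles = ["JULIUS CAESER (1999/99)", "ARABIAN NIGHTS - 1999/99"]

# Without the mapping, the first title is indexed with the token "(1999/99)".
without_mapping = [t for t in titles
                   if prefix_matches("1999/99", t, strip_parens=False)]

# With the mapping, both titles contain the token "1999/99".
with_mapping = [t for t in titles
                if prefix_matches("1999/99", t, strip_parens=True)]
```

Without the mapping only the second title matches, because the first one carries the parentheses inside its token.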

Ahmet


Re: Wildcard search not working if the query contains numbers along with special characters.

2014-03-04 Thread Kashish

Hi Erick,

I understand what you are pointing out, but this is for an
autocomplete feature. I cannot ignore parentheses or other special
characters: with certain titles like 'A Team of five', if the user types 'a
team', then titles containing a-team and the rest also come up, and this one
gets lost because we show only the top 6 results (the user can drill down to
get closer to the result he wants).

I modified my fieldType: at index time I added WordDelimiterFilter and at
query time I added a pattern replace filter, but if I use an asterisk I still
get no records for 1999/99*, although I do get them without the asterisk.
This is not what I want because, by default, we append an asterisk to
whatever the user enters for autocomplete search.

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-if-the-query-conatins-numbers-along-with-special-characters-tp4119608p4121205.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working if the query conatins numbers along with special characters.

2014-02-25 Thread Ahmet Arslan

Hi Kashish,


What happens when you use this: q={!prefix f=title_autocomplete}1999/99

I suspect the '/' character is a special query parser character and therefore
needs to be escaped.

Ahmet


On Tuesday, February 25, 2014 9:55 PM, Kashish itzz.me.kash...@gmail.com 
wrote:
Hi,

I have a very weird problem. The wildcard search works fine for all
scenarios but one. It doesn't seem to give any result for the query 1999/99*.
I checked the debug query and it is formed perfectly.

<str name="rawquerystring">title_autocomplete:1999/99*</str>
<str name="querystring">title_autocomplete:1999/99*</str>
<str name="parsedquery">(+title_autocomplete:1999/99* ())/no_coord</str>
<str name="parsedquery_toString">+title_autocomplete:1999/99* ()</str>

This is my fieldType

<fieldType name="text_general_Title" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Please help me with this.

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-if-the-query-conatins-numbers-along-with-special-characters-tp4119608.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working if the query contains numbers along with special characters.

2014-02-25 Thread Kashish

Hi Ahmet,

Thanks for your reply.

Yes. I pass my query this way -  q=title_autocomplete:1999%2f99

I tried your way too. But no luck. :( 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-if-the-query-conatins-numbers-along-with-special-characters-tp4119608p4119615.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working if the query conatins numbers along with special characters.

2014-02-25 Thread Erick Erickson

What does it say happens on your admin/analysis page
for that field?

And did you by any chance change your schema without
reindexing everything?

Also, try the TermsComponent to see what tokens are actually
_in_ your index. Schema-browser from the admin page can
help here too.
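
For what it's worth, a TermsComponent request can be assembled as in the Python sketch below. The host, port and core name are placeholders, and it assumes a /terms request handler is configured in solrconfig.xml; only the terms, terms.fl, terms.prefix and terms.limit parameters are taken from the TermsComponent itself.

```python
from urllib.parse import urlencode


def terms_url(base_url, field, prefix, limit=10):
    """Build a TermsComponent URL listing indexed terms that start with
    `prefix` in `field`. base_url is a hypothetical core URL."""
    params = {
        "terms": "true",
        "terms.fl": field,        # field whose indexed terms to list
        "terms.prefix": prefix,   # only terms starting with this string
        "terms.limit": limit,     # cap on the number of terms returned
    }
    return f"{base_url}/terms?{urlencode(params)}"


# Placeholder host and core; looks for terms indexed with a leading paren.
url = terms_url("http://localhost:8983/solr/collection1",
                "title_autocomplete", "(1999")
```

If a term such as (1999/99) comes back, the parentheses made it into the index as part of the token.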

Best,
Erick



Re: Wildcard search not working if the query contains numbers along with special characters.

2014-02-25 Thread Ahmet Arslan

Hi,

By escaping I mean this: q=title_autocomplete:1999\/99*
This is different from URL encoding.

http://lucene.apache.org/core/4_6_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters
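
The two steps can be sketched separately in Python. This is a minimal client-side illustration, assuming the special-character list from the query parser page linked above; the helper names are hypothetical and the user input is treated as a single term.

```python
from urllib.parse import quote

# Characters the classic query parser treats as syntax (the two-character
# operators && and || are covered by escaping & and | individually).
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(text):
    """Step 1: backslash-escape query parser syntax inside the user's input."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


def prefix_query(field, user_input):
    """Escape the input, then append the wildcard unescaped."""
    return f"{field}:{escape_term(user_input)}*"


q = prefix_query("title_autocomplete", "1999/99")

# Step 2: URL-encode the whole q value for transport over HTTP.
encoded = quote(q, safe="")
```

Passing 1999%2f99 alone only performs step 2: Solr decodes it back to 1999/99 before the query parser sees it, so the slash still arrives unescaped.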

If the prefix query parser didn't return what you want, then it must be
something with the indexed terms.

Can you give an example of a raw document text that you expect to retrieve
with this query?




Re: Wildcard search not working if the query contains numbers along with special characters.

2014-02-25 Thread Kashish

Hi Ahmet/Erick,

I tried escaping as well. Still no luck.

The title I am looking for is - ARABIAN NIGHTS #01 (1999/99)

I figured out that if I pass the query as *1999/99* (i.e. with an asterisk
not only at the end but at the beginning as well), it works.

The problem is the parentheses. I can change my field type and add

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="0" splitOnCaseChange="1" preserveOriginal="1"/>

But this will show too many results in autocomplete.

Is there a better way to handle this? Or should I pass an asterisk before and
after the query?

Thanks.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-if-the-query-conatins-numbers-along-with-special-characters-tp4119608p4119678.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search not working if the query contains numbers along with special characters.

2014-02-25 Thread Erick Erickson

The admin/analysis page is your friend. Taking some time to
get acquainted with that page will save you lots and lots and
lots of time. In this case, you'd have seen that your input
is actually tokenized as (1999/99), parentheses and all as a
_single_ token, so of course searching for 1999/99 wouldn't work.

Searching for *1999/99* is generally a bad idea. It'll work, but it's
a kludge.

What you _do_ need to do is define your use-cases. Let's
assume that you _never_ want parentheses to be relevant. You
could use PatternReplaceCharFilterFactory or PatternReplaceFilterFactory
in both index and query parts of your analysis chain to remove
parens. Or really any kinds of extraneous characters you decided
were unimportant.

But you need to decide what's important and enforce that.

Best,
Erick


Re: Wildcard-Search Solr 3.5.0

2012-06-03 Thread Erick Erickson

Chiming in late here, just back from vacation. But off the top of my
head, I don't see any reason SnowballPorterFilterFactory shouldn't
be MultiTermAware.

I've created https://issues.apache.org/jira/browse/SOLR-3503 as
a placeholder.

Erick


Re: Wildcard-Search Solr 3.5.0

2012-06-03 Thread Erick Erickson

And I closed the JIRA, see the comments. The short form is that
it's not worth the effort because of the edge cases. Jack writes
up some of them; for example, what should stemming
do with a term like organiz*? Sure, it would produce one token (which is
the main restriction on a MultiTermAware filter), but the output
might not be anything equivalent to the stem of organization, maybe
not even organize. Better to avoid that rat-hole; it seems like one of those
problems that could suck up enormous amounts of time and _still_ not
do what's expected.

If you _really_ want to try this, you could always define your own
multiterm analysis component that included the stemmer, see:
http://www.lucidimagination.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/
But don't say I didn't warn you <G>...

Best
Erick


RE: Wildcard-Search Solr 3.5.0

2012-05-25 Thread spring

Oh, thx for the update! I hadn't noticed that Solr 3.6 has a text_de field
type. These two options... less / more aggressive. Aggressive in terms of
what?

Thank you!


Re: Wildcard-Search Solr 3.5.0

2012-05-25 Thread Jack Krupansky

I don't know the specific rules in these specific stemmers, but generally a 
less aggressive stemming (e.g., plural-only) of paintings would be 
painting, while a more aggressive stemming would be paint. For some 
aggressive stemmers the stemmed word is not even a word.


It would be nice to have doc with some example words for each stemmer.

-- Jack Krupansky


RE: Wildcard-Search Solr 3.5.0

2012-05-25 Thread spring

 I don't know the specific rules in these specific stemmers, 
 but generally a 
 less aggressive stemming (e.g., plural-only) of 
 paintings would be 
 painting, while a more aggressive stemming would be 
 paint. For some 
 aggressive stemmers the stemmed word is not even a word.

Sounds logical :)

 It would be nice to have doc with some example words for each stemmer.

Absolutely!

Thx alot!

Re: Wildcard-Search Solr 3.5.0

2012-05-24 Thread Jack Krupansky

I tried it and it does appear to be the SnowballPorterFilterFactory that 
normally does the accent folding but can't here because it is not multi-term 
aware. I did notice that the text_de field type that comes in the Solr 3.6 
example schema handles your case fine. It uses the 
GermanNormalizationFilterFactory to fold accented characters and is 
multi-term aware. Any particular reason you're not using the stock text_de 
field type? It also has three stemming options which might be sufficient for 
your needs.


In any case, try to make your text_de field type closer to the stock 
version, and try to use GermanNormalizationFilterFactory, and that may be 
good enough for your situation.
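
The mismatch can be reproduced with a small Python model. It is a toy, not Solr's implementation: it assumes the index-time chain lowercases and folds accents (so Bär is stored as bar, as reported elsewhere in this thread), and compares a wildcard chain that only lowercases with one that also folds. All function names are made up for the sketch.

```python
import unicodedata
from fnmatch import fnmatchcase


def fold_accents(term):
    """Strip combining marks after NFD decomposition (ä -> a)."""
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_chain(term):
    """Toy index-time analysis: lowercase, then fold accents."""
    return fold_accents(term.lower())


def wildcard_lowercase_only(pattern):
    """Toy multiterm path where only lowercasing is applied."""
    return pattern.lower()


def wildcard_with_folding(pattern):
    """Toy multiterm path where a folding step also runs."""
    return fold_accents(pattern.lower())


indexed = index_chain("Bär")

# bä* is compared against the stored term bar and fails.
miss = fnmatchcase(indexed, wildcard_lowercase_only("Bä*"))

# ba* is compared against bar and succeeds.
hit = fnmatchcase(indexed, wildcard_with_folding("Bä*"))
```

The second case corresponds to having a multi-term aware folding step, which is the role GermanNormalizationFilterFactory plays in the stock text_de type described above.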


-- Jack Krupansky


RE: Wildcard-Search Solr 3.5.0

2012-05-23 Thread spring

Does no one have an idea?

Thx.


  The text may contain FooBar.
  
  When I do a wildcard search like this: Foo* - no hits.
  When I do a wildcard search like this: foo* - doc is
  found.
 
 Please see http://wiki.apache.org/solr/MultitermQueryAnalysis


Well, it works in 3.6. With one exception: If I use german umlauts it does
not work anymore.

Text: Bär

Bä* - no hits
Bär - hits

What can I do in this case?

Thank you

Re: Wildcard-Search Solr 3.5.0

2012-05-23 Thread Dmitry Kan

What about bä*? Any hits?

-- Dmitry





-- 
Regards,

Dmitry Kan

RE: Wildcard-Search Solr 3.5.0

2012-05-23 Thread spring

No. No hits for bä*.
It's something with the umlauts but I have no idea what...


Re: Wildcard-Search Solr 3.5.0

2012-05-23 Thread Dmitry Kan

Do umlauts arrive properly on the server side, with no encoding issues? Check
the query params echoed in the response (xml/json/...). Set debugQuery to true
as well to see if it produces any useful diagnostic info.





-- 
Regards,

Dmitry Kan

RE: Wildcard-Search Solr 3.5.0

2012-05-23 Thread spring

 -Original Message-
 From: Dmitry Kan [mailto:dmitry@gmail.com] 
 Sent: Mittwoch, 23. Mai 2012 14:02
 To: solr-user@lucene.apache.org
 Subject: Re: Wildcard-Search Solr 3.5.0

 do umlauts arrive properly on the server side, no encoding 
 issues?

Yes, works fine.

It must, since I have hits for Bär or bär.
It's just the combination of umlauts and wildcards.
It must be something with the automagic multiterm feature in Solr 3.6.

Re: Wildcard-Search Solr 3.5.0

2012-05-23 Thread Jens Grivolla

Maybe a filter like ISOLatin1AccentFilter that doesn't get applied when 
using wildcards? How do the terms actually appear in the index?


Jens


RE: Wildcard-Search Solr 3.5.0

2012-05-23 Thread spring

 Maybe a filter like ISOLatin1AccentFilter that doesn't get 
 applied when 
 using wildcards? How do the terms actually appear in the index?

Bär gets indexed as bar.

I don't use ISOLatin1AccentFilter. My field definition is this:

<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>
</types>

RE: Wildcard-Search Solr 3.5.0

2012-05-23 Thread Michael Ryan

I'd guess that this is because SnowballPorterFilterFactory does not implement 
MultiTermAwareComponent. Not sure, though.

-Michael

RE: Wildcard-Search Solr 3.5.0

2012-05-23 Thread spring

 I'd guess that this is because SnowballPorterFilterFactory 
 does not implement MultiTermAwareComponent. Not sure, though.

Yes, I think this keeps the automagic multiterm awareness from doing its
job.
Could a separate analyzer chain with <analyzer type="multiterm"> help? As
described (very, very briefly, too briefly...) here:
http://wiki.apache.org/solr/MultitermQueryAnalysis

RE: Wildcard-Search Solr 3.5.0

2012-05-22 Thread spring

  The text may contain FooBar.
  
  When I do a wildcard search like this: Foo* - no hits.
  When I do a wildcard search like this: foo* - doc is
  found.
 
 Please see http://wiki.apache.org/solr/MultitermQueryAnalysis


Well, it works in 3.6. With one exception: If I use german umlauts it does
not work anymore.

Text: Bär

Bä* - no hits
Bär - hits

What can I do in this case?

Thank you

Re: Wildcard-Search Solr 3.5.0

2012-05-20 Thread Ahmet Arslan

 The text may contain FooBar.
 
 When I do a wildcard search like this: Foo* - no hits.
 When I do a wildcard search like this: foo* - doc is
 found.

Please see http://wiki.apache.org/solr/MultitermQueryAnalysis

RE: Wildcard-Search Solr 3.5.0

2012-05-20 Thread spring

Hi Ahmet,

 Please see http://wiki.apache.org/solr/MultitermQueryAnalysis

so your advice is to upgrade to 3.6? 

Thank you

RE: Wildcard-Search Solr 3.5.0

2012-05-20 Thread Ahmet Arslan


 so your advice is to upgrade to 3.6? 

Or, as a workaround, you can lowercase wildcard queries on the client side.
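
A minimal sketch of that client-side workaround in Python, assuming queries are plain strings built by the client before they are sent to Solr. The function name lowercase_wildcards is hypothetical, and the regular expression is deliberately simple: it ignores escaped wildcards and quoted phrases.

```python
import re

# An optional "field:" prefix followed by a term that contains * or ?
WILDCARD_TERM = re.compile(r"(?P<field>\w+:)?(?P<term>[^\s:]*[*?][^\s:]*)")


def lowercase_wildcards(query: str) -> str:
    """Lowercase only the wildcard terms; field names and other terms stay as typed."""
    def repl(match):
        return (match.group("field") or "") + match.group("term").lower()
    return WILDCARD_TERM.sub(repl, query)


print(lowercase_wildcards("Foo*"))
print(lowercase_wildcards("title:Foo* AND Bar"))
```

Operators such as AND and non-wildcard terms are left untouched, since those still go through the normal query analysis.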

Re: Wildcard search not working if full word is queried

2011-07-01 Thread Celso Pinto

Hi François,

it is indeed being stemmed, thanks a lot for the heads up. It appears
that stemming is also configured for the query so it should work just
the same, no?

Thanks again.

Regards,
Celso


2011/6/30 François Schiettecatte fschietteca...@gmail.com:
 I would run that word through the analyzer, I suspect that the word 'teste' 
 is being stemmed to 'test' in the index, at least that is the first place I 
 would check.

 François

 On Jun 30, 2011, at 2:21 PM, Celso Pinto wrote:

 Hi everyone,

 I'm having some trouble figuring out why a query with an exact word
 followed by the * wildcard, eg. teste*, returns no results while a
 query for test* returns results that have the word teste in them.

 I've created a couple of pasties:

 Exact word with wildcard : http://pastebin.com/n9SMNsH0
 Similar word: http://pastebin.com/jQ56Ww6b

 Parameters other than title, description and content have no effect
 other than filtering out unwanted results. In two of the four
 results, the title has the complete word teste. On the other two,
 the word appears in the other fields.

 Does anyone have any insights about what I'm doing wrong?

 Thanks in advance.

 Regards,
 Celso

Re: Wildcard search not working if full word is queried

2011-07-01 Thread Celso Pinto

Hi again,

read (past tense) TFM :-) and:

On wildcard and fuzzy searches, no text analysis is performed on the
search word.

Thanks a lot François!

Regards,
Celso

On Fri, Jul 1, 2011 at 10:02 AM, Celso Pinto cpi...@yimports.com wrote:
 Hi François,

 it is indeed being stemmed, thanks a lot for the heads up. It appears
 that stemming is also configured for the query so it should work just
 the same, no?

 Thanks again.

 Regards,
 Celso


 2011/6/30 François Schiettecatte fschietteca...@gmail.com:
 I would run that word through the analyzer, I suspect that the word 'teste' 
 is being stemmed to 'test' in the index, at least that is the first place I 
 would check.

 François

 On Jun 30, 2011, at 2:21 PM, Celso Pinto wrote:

 Hi everyone,

 I'm having some trouble figuring out why a query with an exact word
 followed by the * wildcard, eg. teste*, returns no results while a
 query for test* returns results that have the word teste in them.

 I've created a couple of pasties:

 Exact word with wildcard : http://pastebin.com/n9SMNsH0
 Similar word: http://pastebin.com/jQ56Ww6b

 Parameters other than title, description and content have no effect
 other than filtering out unwanted results. In two of the four
 results, the title has the complete word teste. On the other two,
 the word appears in the other fields.

 Does anyone have any insights about what I'm doing wrong?

 Thanks in advance.

 Regards,
 Celso

Re: Wildcard search not working if full word is queried

2011-07-01 Thread François Schiettecatte

Celso

You are very welcome and yes I should have mentioned that wildcard searches are 
not analyzed (which is a recurring theme). This also means that they are not 
downcased, so the search TEST* will probably not find anything either in  your 
set up.
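
As a rough illustration of both effects, here is a toy model in Python. It is not how Lucene works internally: toy_stem is a hypothetical stand-in for a real stemmer, and fnmatchcase stands in for case-sensitive wildcard matching against the indexed terms.

```python
from fnmatch import fnmatchcase


def toy_stem(word: str) -> str:
    """Crude stand-in for a stemmer: lowercase and drop a trailing 'e'."""
    lowered = word.lower()
    return lowered[:-1] if lowered.endswith("e") else lowered


# Index time: every word goes through the analyzer, so 'teste' is stored as 'test'.
index = {toy_stem(word) for word in ["teste", "Test"]}


def wildcard_search(pattern: str) -> list:
    """Query time: the wildcard pattern is used as typed, with no analysis."""
    return sorted(term for term in index if fnmatchcase(term, pattern))


print(wildcard_search("teste*"))  # pattern is longer than the stored stem
print(wildcard_search("test*"))
print(wildcard_search("TEST*"))   # not downcased, so it cannot match 'test'
```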

Cheers

François

On Jul 1, 2011, at 5:16 AM, Celso Pinto wrote:

 Hi again,
 
 read (past tense) TFM :-) and:
 
 On wildcard and fuzzy searches, no text analysis is performed on the
 search word.
 
 Thanks a lot François!
 
 Regards,
 Celso
 
 On Fri, Jul 1, 2011 at 10:02 AM, Celso Pinto cpi...@yimports.com wrote:
 Hi François,
 
 it is indeed being stemmed, thanks a lot for the heads up. It appears
 that stemming is also configured for the query so it should work just
 the same, no?
 
 Thanks again.
 
 Regards,
 Celso
 
 
 2011/6/30 François Schiettecatte fschietteca...@gmail.com:
 I would run that word through the analyzer, I suspect that the word 'teste' 
 is being stemmed to 'test' in the index, at least that is the first place I 
 would check.
 
 François
 
 On Jun 30, 2011, at 2:21 PM, Celso Pinto wrote:
 
 Hi everyone,
 
 I'm having some trouble figuring out why a query with an exact word
 followed by the * wildcard, eg. teste*, returns no results while a
 query for test* returns results that have the word teste in them.
 
 I've created a couple of pasties:
 
 Exact word with wildcard : http://pastebin.com/n9SMNsH0
 Similar word: http://pastebin.com/jQ56Ww6b
 
 Parameters other than title, description and content have no effect
 other than filtering out unwanted results. In two of the four
 results, the title has the complete word teste. On the other two,
 the word appears in the other fields.
 
 Does anyone have any insights about what I'm doing wrong?
 
 Thanks in advance.
 
 Regards,
 Celso

Re: Wildcard search not working if full word is queried

2011-06-30 Thread François Schiettecatte

I would run that word through the analyzer, I suspect that the word 'teste' is 
being stemmed to 'test' in the index, at least that is the first place I would 
check.

François

On Jun 30, 2011, at 2:21 PM, Celso Pinto wrote:

 Hi everyone,
 
 I'm having some trouble figuring out why a query with an exact word
 followed by the * wildcard, eg. teste*, returns no results while a
 query for test* returns results that have the word teste in them.
 
 I've created a couple of pasties:
 
 Exact word with wildcard : http://pastebin.com/n9SMNsH0
 Similar word: http://pastebin.com/jQ56Ww6b
 
 Parameters other than title, description and content have no effect
 other than filtering out unwanted results. In two of the four
 results, the title has the complete word teste. On the other two,
 the word appears in the other fields.
 
 Does anyone have any insights about what I'm doing wrong?
 
 Thanks in advance.
 
 Regards,
 Celso

Re: wildcard search

2011-06-11 Thread Thomas Fischer

Hi Ahmet,

 so I created a fake license
 ComplexPhrase-LICENSE-MIT.txt
 for ComplexPhrase and tried again, which ran through
 successfully, I hope this is OK.
 
 I haven't used it with Solr 3.2. I will look into it. 
 
 So your GOK field already contains the list as multivalued. Then you can use 
 prefix query parser plugin for this. Just make sure that field type of GOK is 
 string not text.  
 q={!prefix f=GOK}IA 3   should be equivalent to  {!complexphrase}GOK:IA 3*

I'll try that. But my search requests come from a pazpar2 system and are 
directed against different clients, which all get requests of the form 
GOK:IA 32*, so in some sense this is better for me.

I found two problems:

– In the solr 1.4.2 version I'm testing the request IA 32* works, but GOK:IA 
32* will not. Is this somehow related to the indexing of that field?

– The other is that IA320 (on 1.4.2) and GOK:IA320 (on 3.2) will throw an 
exception:
description
The server encountered an internal error
(Unknown query type org.apache.lucene.search.PhraseQuery found in phrase 
query string IA620 java.lang.IllegalArgumentException: Unknown query type 
org.apache.lucene.search.PhraseQuery found in phrase query string IA620 at 
org.apache.lucene.queryParser.ComplexPhraseQueryParser

Cheers
Thomas

Re: wildcard search

2011-06-10 Thread Thomas Fischer

Hi Ahmet,

 I don't use it myself  (but I will soon), so I
 may be wrong, but did you try
 to use the ComplexPhraseQueryParser :
 
 ComplexPhraseQueryParser
   QueryParser which
 permits complex phrase query syntax eg (john
 jon jonathan~) peters*.
 
 It seems that you could do such type of queries :
 
 GOK:IA 38*
 
 yes that sounds interesting.
 But I don't know how to get and install it into solr. Cam
 you give me a hint?
 
 https://issues.apache.org/jira/browse/SOLR-1604

I tried to follow this recipe, adapting it to the solr 3.2 I am testing right 
now.
The first try gave me a message

 [java] !!! Couldn't get license file for 
/Installer/solr/apache-solr-3.2.0/solr/lib/ComplexPhrase-1.0.jar
 [java] At least one file does not have a license, or it's license name is 
not in the proper format.  See the logs.

BUILD FAILED

so I created a fake license
ComplexPhrase-LICENSE-MIT.txt
for ComplexPhrase and tried again, which ran through successfully, I hope this 
is OK.

I registered the queryparser not in solrhome/conf/solrconfig.xml (no such file, 
I'm running multiple cores) but in
solrhome/cores/lit/conf/solrconfig.xml
and could search successfully for
{!complexphrase}GOK:IC 62*

 But it seems that you can achieve what you want with vanilla solr.
 
 I don't follow the multivalued part in your example but you can tokenize 
 IA 300; IC 330; IA 317; IA 318 into these 4 tokens 
 
 IA 300
 IC 330
 IA 314
 IA 318

I didn't have to split them up, they are already separated as field with 
multiValued=true.
But I need to be able to search for IA 310 - IA 319 with one call,
{!complexphrase}GOK:IA 31?
will do this now, or even for 
{!complexphrase}GOK:IA 3*
to catch all those in one go.

Thanks, this helped a lot
Thomas

Re: wildcard search

2011-06-10 Thread Ahmet Arslan

 I tried to follow this recipe, adapting it to the solr 3.2
 I am testing right now.
 The first try gave me a message
 
      [java] !!! Couldn't get
 license file for
 /Installer/solr/apache-solr-3.2.0/solr/lib/ComplexPhrase-1.0.jar
      [java] At least one file does not
 have a license, or it's license name is not in the proper
 format.  See the logs.
 
 BUILD FAILED
 
 so I created a fake license
 ComplexPhrase-LICENSE-MIT.txt
 for ComplexPhrase and tried again, which ran through
 successfully, I hope this is OK.

I haven't used it with Solr 3.2. I will look into it. 

  IA 300
  IC 330
  IA 314
  IA 318
 
 I didn't have to split them up, they are already separated
 as field with multiValued=true.
 But I need to be able to search for IA 310 - IA 319 with
 one call,
 {!complexphrase}GOK:IA 31?
 will do this now, or even for 
 {!complexphrase}GOK:IA 3*
 to catch all those in one go.

So your GOK field already contains the list as multivalued. Then you can use 
the prefix query parser plugin for this. Just make sure that the field type of 
GOK is string, not text.  
q={!prefix f=GOK}IA 3   should be equivalent to  {!complexphrase}GOK:IA 3*
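
For a client that builds the request itself, the prefix form could be sent roughly like this. This is only a sketch in Python: the base URL is an assumption (adjust host, port and core), and prefix_query_url is a hypothetical helper, not part of any Solr client library.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def prefix_query_url(base: str, field: str, prefix: str) -> str:
    """Build a select URL whose q parameter uses the {!prefix} local-params syntax."""
    params = {"q": "{!prefix f=%s}%s" % (field, prefix)}
    return base + "?" + urlencode(params)


# Assumed base URL, for illustration only.
url = prefix_query_url("http://localhost:8983/solr/select", "GOK", "IA 3")

# Decoding the query string again shows the value Solr would receive for q.
received_q = parse_qs(urlsplit(url).query)["q"][0]
print(url)
print(received_q)
```

The point of urlencode here is that the braces, the exclamation mark and the space inside the prefix all survive the trip to the server unchanged.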

Re: wildcard search

2011-06-08 Thread Thomas Fischer

Hi Erick,

I have a multivalued field GOK (local classification scheme) with separate 
entries of the sort
 IA 300; IC 330; IA 317; IA 318, i.e. 1 to 3 capital characters, space, 3 
digits.
I want to be able to perform a truncated search on that field:
either just the string before the space, or a combination of that string with 1 
or 2 digits, something like:
GOK:IA
or
GOK:IA 3*
or
GOK:IA 31?
My problem is the clash between the phrase (GOK:IA 317 works) and the 
wildcards.

As a start I tried as type
<fieldType name="text" class="solr.TextField" positionIncrementGap="100" 
autoGeneratePhraseQueries="true">
from the solr 3.2 distribution schema
(apache-solr-3.2.0/example/solr/conf/schema.xml),
the field is just
<field name="GOK" type="text" multiValued="true"/>

BTW, I have another field DDC with entries of the form t1:086643 with 
analogous requirements which yields similar problems due to the colon, also 
indexed as text.
Here also 
DDC:T1\:086643
works, but not 
DDC:T1\:08664?

Thanks in advance
Thomas

 Yes there is, but you haven't provided enough information to
 make a suggestion. What is the fieldType definition? What is
 the field definition?
 
 Two resources that'll help you greatly are:
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
 
 and the admin/analysis page...
 
 Best
 Erick
 
 On Tue, Jun 7, 2011 at 6:23 PM, Thomas Fischer fischer...@aon.at wrote:
 Hello,
 
 I am testing solr 3.2 and have problems with wildcards.
 I am indexing values like IA 300; IC 330; IA 317; IA 318 in a field GOK, 
 and can't find a way to search with wildcards.
 I want to use a wild card search to match something like IA 31? but cannot 
 find a way to do so.
 GOK:IA\ 38* doesn't work with the contents of GOK indexed as text.
 Is there a way to index and search that would meet my requirements?
 
 Thomas
 
 
 

Mit freundlichen Grüßen
Thomas Fischer

Re: wildcard search

2011-06-08 Thread Erick Erickson

Hmmm, have you tried EdgeNGrams? This works for me (at the expense
of a somewhat larger index, of course)...

<fieldType name="edge" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="4"
            maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

and a field of type "edge" named "thomasfield".


Now searches like
thomasfield:"GOK IA 3"
(include quotes!) should work. The various parameters (min/max gram size)
I chose arbitrarily; you'll want to tweak them.

I include a LowerCaseFilter for safety's sake, in case people are actually
going to type things in...

It's probably instructive to look at the admin/analysis page to see how
this all plays out
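
To see which prefixes such a field could match, here is a small Python sketch of front edge n-grams. It only imitates the index-time chain described above (whole value as one token, lowercased, then grams of 4 to 15 characters); edge_ngrams and matches are hypothetical helpers, and the sample values are the GOK entries from this thread.

```python
def edge_ngrams(value: str, min_gram: int = 4, max_gram: int = 15) -> list:
    """Front edge n-grams of the lowercased value, treated as a single token."""
    token = value.lower()
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


# Index time: every gram of every value becomes a searchable term.
indexed = set()
for entry in ["IA 300", "IC 330", "IA 317", "IA 318"]:
    indexed.update(edge_ngrams(entry))


def matches(phrase: str) -> bool:
    """Query time: the quoted phrase is lowercased and looked up as one term."""
    return phrase.lower() in indexed


print(edge_ngrams("IA 317"))
print(matches("IA 31"), matches("IA 32"), matches("IA"))
```

Note that in this sketch a bare "IA" finds nothing, because it is shorter than the minimum gram size of 4; searching on the letters alone would need a smaller minimum.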

Best
Erick


On Wed, Jun 8, 2011 at 9:29 AM, Thomas Fischer fischer...@aon.at wrote:
 Hi Erick,

 I have a multivalued field GOK (local classification scheme) with separate 
 entries of the sort
  IA 300; IC 330; IA 317; IA 318, i.e. 1 to 3 capital characters, space, 3 
 digits.
 I want to be able to perform a truncated search on that field:
 either just the string before the space, or a combination of that string with 
 1 or 2 digits, something like:
 GOK:IA
 or
 GOK:IA 3*
 or
 GOK:IA 31?
 My problem is the clash between the phrase (GOK:IA 317 works) and the 
 wildcards.

 As a start I tried as type
 <fieldType name="text" class="solr.TextField" positionIncrementGap="100" 
 autoGeneratePhraseQueries="true">
 from the solr 3.2 distribution schema
 (apache-solr-3.2.0/example/solr/conf/schema.xml),
 the field is just
 <field name="GOK" type="text" multiValued="true"/>

 BTW, I have another field DDC with entries of the form t1:086643 with 
 analogous requirements which yields similar problems due to the colon, also 
 indexed as text.
 Here also
 DDC:T1\:086643
 works, but not
 DDC:T1\:08664?

 Thanks in advance
 Thomas

 Yes there is, but you haven't provided enough information to
 make a suggestion. What is the fieldType definition? What is
 the field definition?

 Two resources that'll help you greatly are:
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

 and the admin/analysis page...

 Best
 Erick

 On Tue, Jun 7, 2011 at 6:23 PM, Thomas Fischer fischer...@aon.at wrote:
 Hello,

 I am testing solr 3.2 and have problems with wildcards.
 I am indexing values like IA 300; IC 330; IA 317; IA 318 in a field 
 GOK, and can't find a way to search with wildcards.
 I want to use a wild card search to match something like IA 31? but 
 cannot find a way to do so.
 GOK:IA\ 38* doesn't work with the contents of GOK indexed as text.
 Is there a way to index and search that would meet my requirements?

 Thomas




 Mit freundlichen Grüßen
 Thomas Fischer

Re: wildcard search

2011-06-08 Thread lboutros

Hi Thomas,

I don't use it myself (but I will soon), so I may be wrong, but did you try
to use the ComplexPhraseQueryParser:

ComplexPhraseQueryParser
  QueryParser which permits complex phrase query syntax, e.g. "(john
jon jonathan~) peters*".

It seems that you could do this type of query:

GOK:IA 38*

Ludovic.


-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/memory-leak-during-undeploying-tp2620093p3039561.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: wildcard search

2011-06-08 Thread Thomas Fischer

Hi Ludovic,


 I don't use it myself  (but I will soon), so I may be wrong, but did you try
 to use the ComplexPhraseQueryParser :
 
 ComplexPhraseQueryParser
  QueryParser which permits complex phrase query syntax eg (john
 jon jonathan~) peters*.
 
 It seems that you could do such type of queries :
 
 GOK:IA 38*

yes that sounds interesting.
But I don't know how to get and install it into solr. Can you give me a hint?

Thanks
Thomas

Re: wildcard search

2011-06-08 Thread Ahmet Arslan


  I don't use it myself  (but I will soon), so I
 may be wrong, but did you try
  to use the ComplexPhraseQueryParser :
  
  ComplexPhraseQueryParser
           QueryParser which
 permits complex phrase query syntax eg (john
  jon jonathan~) peters*.
  
  It seems that you could do such type of queries :
  
  GOK:IA 38*
 
 yes that sounds interesting.
 But I don't know how to get and install it into solr. Cam
 you give me a hint?

https://issues.apache.org/jira/browse/SOLR-1604

But it seems that you can achieve what you want with vanilla solr.

I don't follow the multivalued part in your example, but you can tokenize 
IA 300; IC 330; IA 317; IA 318 into these 4 tokens 

IA 300
IC 330
IA 317
IA 318

using PatternTokenizerFactory. And you can use PrefixQParserPlugin for 
searching.

http://lucene.apache.org/solr/api/org/apache/solr/search/PrefixQParserPlugin.html

Re: wildcard search

2011-06-07 Thread Erick Erickson

Yes there is, but you haven't provided enough information to
make a suggestion. What is the fieldType definition? What is
the field definition?

Two resources that'll help you greatly are:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

and the admin/analysis page...

Best
Erick

On Tue, Jun 7, 2011 at 6:23 PM, Thomas Fischer fischer...@aon.at wrote:
 Hello,

 I am testing solr 3.2 and have problems with wildcards.
 I am indexing values like IA 300; IC 330; IA 317; IA 318 in a field GOK, 
 and can't find a way to search with wildcards.
 I want to use a wild card search to match something like IA 31? but cannot 
 find a way to do so.
 GOK:IA\ 38* doesn't work with the contents of GOK indexed as text.
 Is there a way to index and search that would meet my requirements?

 Thomas

Re: wildcard search inconsistencies

2011-04-01 Thread lboutros

'conditional' seems to be stemmed into the word 'condit' in the index.

So your results are normal.

As you said, mixing wildcard searching and stemmed fields is not
recommended.

Ludovic.

2011/4/1 Melanie Drake [via Lucene]
ml-node+2763787-65059921-383...@n3.nabble.com

I noticed an inconsistency in results when performing wildcard searches.
When searching on variations of conditional the following results
occurred:

conditional - hits
conditional* - hits
conditi* - hits
condit* - hits
con*al - no hits
c?nditional - no hits
c*ld - hits (on a different word: child)

I don't see an obvious pattern to when the wildcard searches work.

In a response to another post, I read that stemming will cause wildcard
searches to behave strangely. I believe we may be using stemming, although
the only configuration I see is the list of words protected against stemming
defined in protwords.txt.

Also, I'm not sure if it's helpful, but I see a vague Solr error in my
server log (jboss) any time I perform a search (whether successful or not):
ERROR [STDERR] <timestamp> org.apache.solr.core.SolrCore execute
INFO: [core0] webapp=null path=/select
params={q=condi*al&fq=url%3A%28%22%2F<long list of application-specific IDs
used for filtering>&fl=score&hl=true&hl.fragsize=50&hl.snippets=3} hits=0
status=0 QTime=0

The developer who implemented our search solution is no longer with our
company, so I'm just looking for any information useful to investigate this
issue. I apologize if I omitted any necessary information. Thanks!


-
Jouve
France.

Re: wildcard search inconsistencies

2011-04-01 Thread lboutros

And to be more helpful: you can activate debug mode (debugQuery=on in the
query) to see the transformed query.

For instance, for 'field:conditional':

rawquerystring: field:conditional
querystring: field:conditional
parsedquery: field:condit
parsedquery_toString: field:condit

for 'field:conditional*':

rawquerystring: field:conditional*
querystring: field:conditional*
parsedquery: field:conditional*
parsedquery_toString: field:conditional*

and for 'field:con*al':

rawquerystring: field:con*al
querystring: field:con*al
parsedquery: field:con*al
parsedquery_toString: field:con*al

But in the field's index the word 'conditional' is stored as 'condit', which
is not matched by 'con*al'.
The words 'conceal' (stored as is) and 'congealable' (stored as 'congeal'),
however, are matched and retrieved (and highlighted if configured properly).
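
The same reasoning as a tiny Python check. This is only a toy model: the term list is assumed from the examples in this thread ('child' being the word that c*ld hit), and fnmatchcase stands in for wildcard matching against the stemmed terms in the index.

```python
from fnmatch import fnmatchcase

# Assumed index contents after stemming:
# 'conditional' -> 'condit', 'conceal' -> 'conceal',
# 'congealable' -> 'congeal', 'child' -> 'child'
indexed_terms = ["condit", "conceal", "congeal", "child"]


def wildcard_hits(pattern: str) -> list:
    """Match the unanalyzed wildcard pattern against the stored (stemmed) terms."""
    return [term for term in indexed_terms if fnmatchcase(term, pattern)]


for pattern in ["condit*", "con*al", "c?nditional", "c*ld"]:
    print(pattern, wildcard_hits(pattern))
```

Patterns that only fit the original word, such as con*al or c?nditional, have nothing to match once the index holds 'condit'.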

Ludovic.

-
Jouve
France.

Re: wildcard search inconsistencies

2011-04-01 Thread Melanie Drake

Thanks, Ludovic.  That was it.  I added the word conditional to the
protected words file and I no longer see the odd search results when using
wildcards.  I will try to disable stemming altogether.  

Thanks again!


Re: Wildcard search in phrase query using spanquery

2010-04-21 Thread Maddy.Jsh


I used eclipse-jee-galileo-SR2-win32 to build the ant and selected dist-war
for execution in build. I got the following message.

Buildfile: D:\apache-solr-1.4.0\build.xml
init-forrest-entities:
compile-solrj:
compile:
[javac] Compiling 1 source file to D:\apache-solr-1.4.0\build\solr
[javac] Note:
D:\apache-solr-1.4.0\src\java\org\apache\solr\search\DocSetHitCollector.java
uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
make-manifest:
 [exec] Execute failed: java.io.IOException: Cannot run program
svnversion: CreateProcess error=2, The system cannot find the file
specified
dist-jar:
  [jar] Building jar:
D:\apache-solr-1.4.0\dist\apache-solr-core-1.4.1-dev.jar
dist-solrj:
  [jar] Building jar:
D:\apache-solr-1.4.0\dist\apache-solr-solrj-1.4.1-dev.jar
dist-war:
  [war] Building war:
D:\apache-solr-1.4.0\dist\apache-solr-1.4.1-dev.war
BUILD SUCCESSFUL
Total time: 5 second

Solr performed as usual, and when I tried adding defType=complexphrase
to the search URL the previous error showed up again. I performed this task on
a fresh solr-1.4.0. There was an error while executing make-manifest; did it
have anything to do with the error, or am I missing something else?

Re: Wildcard search in phrase query using spanquery

2010-04-21 Thread Ahmet Arslan


 I used eclipse-jee-galileo-SR2-win32 to build the ant and
 selected dist-war
 for execution in build. I got the following message.

I use command prompt to invoke ant so I am not sure about this.

 The solr performed as usual and when I tried adding
 defType=complexphrase
 to search url the previous error showed up again. I
 performed this task to
 fresh solr-1.4.0. There was error while executing
 make-manifest, did it have
 anything to do with the error or am I missing something
 else?

Are you running solr using java -jar start.jar?
If yes you need to re-name 
apache-solr-1.4.0\dist\apache-solr-1.4.1-dev.war to solr.war and put it under 
apache-solr-1.4.0\example\webapps

Also you may need to delete the folder under \apache-solr-1.4.0\example\work.

Re: Wildcard search in phrase query using spanquery

2010-04-21 Thread Maddy.Jsh


I used command line to build ant this time.


Ahmet Arslan wrote:
 
 
 Are you running solr using java -jar start.jar?
 If yes you need to re-name 
 apache-solr-1.4.0\dist\apache-solr-1.4.1-dev.war to solr.war and put it
 under apache-solr-1.4.0\example\webapps
 
 Also you may need to delete the folder under
 \apache-solr-1.4.0\example\work.
 
 
 

Yes I ran solr using java -jar start.jar. I did the above mentioned tasks
but the results were the same. I even tried by removing all the jar files of
version 1.4.0 under apache-solr-1.4.0\dist\ but still I get the same result.

Re: Wildcard search in phrase query using spanquery

2010-04-21 Thread Ahmet Arslan

 
 I used command line to build ant this time.
 

Before calling 'ant dist' where did you copy the ComplexPhrase-1.0.jar?

apache-solr-1.4.0\lib or apache-solr-1.4.0\example\lib?


 Yes I ran solr using java -jar start.jar. I did the above
 mentioned tasks
 but the results were the same. 

can you delete the \apache-solr-1.4.0\example\solr\lib\ComplexPhrase-1.0.jar  
and test again? If classnotfound exception comes then it means that your new 
solr.war does not contain it.

Re: Wildcard search in phrase query using spanquery

2010-04-21 Thread Maddy.Jsh



Ahmet Arslan wrote:
 
 
 Before calling 'ant dist' where did you copy the ComplexPhrase-1.0.jar?
 
 apache-solr-1.4.0\lib or apache-solr-1.4.0\example\lib?
 
 
 

I tried it by placing ComplexPhrase-1.0.jar in apache-solr-1.4.0\lib ;
apache-solr-1.4.0\example\lib ; and apache-solr-1.4.0\example\solr\lib with
the same error

HTTP ERROR: 500

tried to access field org.apache.lucene.queryParser.QueryParser.field from
class
org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery



Ahmet Arslan wrote:
 
 
 
 can you delete the
 \apache-solr-1.4.0\example\solr\lib\ComplexPhrase-1.0.jar  and test again?
 If classnotfound exception comes then it means that your new solr.war does
 not contain it.
 
   
 
 

Yes I tried that as well but classnotfound exception did not arise. 

Re: Wildcard search in phrase query using spanquery

2010-04-21 Thread Ahmet Arslan


 I tried it by placing ComplexPhrase-1.0.jar in
 apache-solr-1.4.0\lib ;
 apache-solr-1.4.0\example\lib ; and
 apache-solr-1.4.0\example\solr\lib with
 the same error

You need to copy it to only apache-solr-1.4.0\lib

Maybe it is better to get a fresh copy of apache-solr-1.4.0.zip and continue 
step by step.

copy ComplexPhrase-1.0.jar to only apache-solr-1.4.0\lib

run 'ant clean dist'

move \apache-solr-1.4.0\dist\apache-solr-1.4.1-dev.war to 
apache-solr-1.4.0\example\webapps\solr.war

add the line 
<queryParser name="complexphrase" 
class="org.apache.solr.search.ComplexPhraseQParserPlugin" />
to apache-solr-1.4.0\example\solr\conf\solrconfig.xml

java -jar start.jar

java -jar post.jar *.xml

http://localhost:8983/solr/select/?q=features%3A%22s*+c*%22&version=2.2&start=0&rows=10&indent=on&defType=complexphrase&hl=true&hl.fl=features
 
should return 4 documents with highlighting.
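
If you prefer to build that test URL from its parts, here is a short Python sketch. The parameter names and values are the ones from the URL above; the variable names are only for illustration, and urlencode may percent-encode a few more characters (such as *) than the hand-written URL does, which decodes to the same values.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://localhost:8983/solr/select/"  # example Solr started with start.jar

params = {
    "q": 'features:"s* c*"',     # phrase with a wildcard in each word
    "version": "2.2",
    "start": "0",
    "rows": "10",
    "indent": "on",
    "defType": "complexphrase",  # name registered in solrconfig.xml
    "hl": "true",
    "hl.fl": "features",
}

test_url = BASE + "?" + urlencode(params)

# Decode again to check what the server would see.
decoded = parse_qs(urlsplit(test_url).query)
print(test_url)
print(decoded["q"][0], decoded["defType"][0])
```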

Re: Wildcard search in phrase query using spanquery

2010-04-21 Thread Maddy.Jsh


Thanks, it works now. I think running a clean ant build is what made the
difference. I'll work on this a bit more and give you feedback.
-- 
View this message in context: 
http://n3.nabble.com/Wildcard-search-in-phrase-query-using-spanquery-tp729275p741988.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search in phrase query using spanquery

2010-04-20 Thread Maddy.Jsh


I tried that and got the following result. Do I have to do anything other
than the mentioned instructions to make it work?

HTTP ERROR: 500

tried to access field org.apache.lucene.queryParser.QueryParser.field from
class
org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery

java.lang.IllegalAccessError: tried to access field
org.apache.lucene.queryParser.QueryParser.field from class
org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery
at
org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery.parsePhraseElements(ComplexPhraseQueryParser.java:216)
at org.apache.lucene.queryParser.ComplexPhraseQueryParser.parse(ComplexPhraseQueryParser.java:114)
at org.apache.solr.search.ComplexPhraseQParser.parse(ComplexPhraseQParserPlugin.java:82)
at org.apache.solr.search.QParser.getQuery(QParser.java:131)
at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:89)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

RequestURI=/solr/select/

Powered by Jetty://

-- 
View this message in context: 
http://n3.nabble.com/Wildcard-search-in-phrase-query-using-spanquery-tp729275p731654.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard search in phrase query using spanquery

2010-04-20 Thread Ahmet Arslan


 I tried that and got the following result. Do I have to do
 anything other
 than the mentioned instructions to make it work?
 
 HTTP ERROR: 500
 
 tried to access field
 org.apache.lucene.queryParser.QueryParser.field from
 class
 org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery
 
 java.lang.IllegalAccessError: tried to access field
 org.apache.lucene.queryParser.QueryParser.field from class
 org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery

I had never noticed this because I use this class with a custom superclass. 
Thanks for reporting it. 

The reason seems to be:

A class loaded by one classloader can't
access package-visible members in a class loaded by a different
classloader even if they are nominally in the same package. [1]

[1] http://www.pubbs.net/grails/200911/37981/

I think the easiest thing to do is to place ComplexPhrase-1.0.jar (created by mvn 
package) under the apache-solr-1.4.0\lib directory, and then create a new 
apache-solr-1.4.0\dist\apache-solr-1.4.1-dev.war by invoking ant dist. I tested 
this solution and it works as a workaround.

It would be great if you give us feedback after using it.

Re: Wildcard search in phrase query using spanquery

2010-04-19 Thread Ahmet Arslan

 I need to perform wildcard search in phrase query. I have 2
 documents
 containing text how do impair and how to improve. I
 want to be able to
 search both documents by searching (how to im*). There is a
 provision in
 lucene which allows me to perform this operation using
 SpanWildcardQuery and
 keeping span length to 0. 
 
 http://mail-archives.apache.org/mod_mbox//lucene-java-user/200707.mbox/%3c469df09f.9030...@gmail.com%3e
 
 
 I tried proximity search in solr but it didn't work with
 wildcard. Is there
 any other provision to perform wildcard search in phrase
 query?

With https://issues.apache.org/jira/browse/SOLR-1604 you can use the * operator 
inside phrases, e.g. "how to im*"
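
As a rough illustration of how such a request might be sent, here is a minimal
Python sketch that builds the select URL. It assumes the parser from SOLR-1604
has been registered in solrconfig.xml under the name complexphrase (the name is
whatever you chose when registering the plugin) and that the phrase is searched
against the default search field. The host and path are the usual example
values, not something taken from this thread.

```python
from urllib.parse import urlencode

def complex_phrase_url(base_url, phrase, parser_name="complexphrase"):
    """Build a select URL that hands a quoted phrase containing a wildcard
    to a query parser selected with the {!name} local-params prefix.
    parser_name is an assumption: use the name registered in solrconfig.xml."""
    query = '{!%s}"%s"' % (parser_name, phrase)
    # urlencode takes care of the quotes, spaces and the * character
    return base_url + "?" + urlencode({"q": query})

url = complex_phrase_url("http://localhost:8983/solr/select", "how to im*")
print(url)
```

Letting urlencode do the escaping avoids hand-encoding the double quotes, which
are easy to lose when the URL is typed into a browser.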

Re: Wildcard Search and Filter in Solr

2010-02-01 Thread ashokcz

hey thanks ravi , ahmed and Erik for your reply.
though its tough to change my solr version , still let me try out at 1.4
and see.

Erik Hatcher-4 wrote:

Note that the query analyzer output is NOT doing query _parsing_, but
rather taking the string you passed and running it through the query
analyzer only. When using the default query parser, Inte* will be a
search for terms that begin with inte. It is odd that you're not
finding it. But you're using a pretty old version of Solr and quite
likely something here has been fixed since.

Give Solr 1.4 a try.

Erik

On Jan 27, 2010, at 12:56 AM, ashokcz wrote:

Hi just looked at the analysis.jsp and found out what it does during
index /
query

Index Analyzer
Intel
intel
intel
intel
intel
intel

Query Analyzer
Inte*
Inte*
inte*
inte
inte
inte
int

I think somewhere my configuration or my definition of the type
text is
wrong.
This is my configuration .

<fieldType class="solr.TextField" name="text">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
            class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
            generateWordParts="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true"
            ignoreCase="true"
            synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
            class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
            generateWordParts="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

I think i am missing some basic configuration for doing wildcard
searches .
but could not figure it out .
can someone help please

Ahmet Arslan wrote:

Hi ,
I m trying to use wildcard keywords in my search term and
filter term . but
i didnt get any results.
Searched a lot but could not find any lead .
Can someone help me in this.
i m using solr 1.2.0 and have few records indexed with
vendorName value as
Intel

In solr admin interface i m trying to do the search like
this

http://localhost:8983/solr/select?indent=on&version=2.2&q=intel&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=

and i m getting the result properly

but when i use q=inte* no records are returned.

the same is the case for Filter Query on using
fq=VendorName:Intel i get
my results.

but on using fq=VendorName:Inte* no results are
returned.

I can guess i doing mistake in few obvious things , but
could not figure it
out ..
Can someone pls help me out :) :)

If q=intel returns documents while q=inte* does not, it means that
fieldType of your defaultSearchField is reducing the token intel into
something.

Can you find out it by using /admin/anaysis.jsp what happens to
Intel
intel at index and query time?

What is your defaultSearchField? Is it VendorName?

It is expected that fq=VendorName:Intel returns results while
fq=VendorName:Inte* does not. Because prefix queries are not
analyzed.

But it is strange that q=inte* does not return anything. Maybe your
index
analyzer is reducing Intel into int or ıntel?

I am not 100% sure but solr 1.2.0 may use default locale in
lowercase
operation. What is your default locale?

It is better to see what happens word Intel using analysis.jsp page.

--
View this message in context:
http://old.nabble.com/Wildcard-Search-and-Filter-in-Solr-tp27306734p27334486.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
View this message in context:
http://old.nabble.com/Wildcard-Search-and-Filter-in-Solr-tp27306734p27405151.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard Search and Filter in Solr

2010-01-27 Thread Ravi Gidwani

Ashok:
May be this will help:
http://gravi2.blogspot.com/2009/05/solr-wildcards-and-omitnorms.html

~Ravi

On Tue, Jan 26, 2010 at 9:56 PM, ashokcz ashokkumar.gane...@tcs.com wrote:

Hi just looked at the analysis.jsp and found out what it does during index
/
query

Index Analyzer
Intel
intel
intel
intel
intel
intel

Query Analyzer
Inte*
Inte*
inte*
inte
inte
inte
int

I think somewhere my configuration or my definition of the type text is
wrong.
This is my configuration .

filter class=solr.StopFilterFactory/
filter class=solr.TrimFilterFactory/
filter class=solr.PorterStemFilterFactory/
/analyzer

/fieldType

I think i am missing some basic configuration for doing wildcard searches .
but could not figure it out .
can someone help please

Ahmet Arslan wrote:

In solr admin interface i m trying to do the search like
this

http://localhost:8983/solr/select?indent=on&version=2.2&q=intel&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=

and i m getting the result properly

but when i use q=inte* no records are returned.

the same is the case for Filter Query on using
fq=VendorName:Intel i get
my results.

but on using fq=VendorName:Inte* no results are
returned.

I can guess i doing mistake in few obvious things , but
could not figure it
out ..
Can someone pls help me out :) :)

If q=intel returns documents while q=inte* does not, it means that
fieldType of your defaultSearchField is reducing the token intel into
something.

Can you find out it by using /admin/anaysis.jsp what happens to Intel
intel at index and query time?

What is your defaultSearchField? Is it VendorName?

It is expected that fq=VendorName:Intel returns results while
fq=VendorName:Inte* does not. Because prefix queries are not analyzed.

But it is strange that q=inte* does not return anything. Maybe your index
analyzer is reducing Intel into int or ıntel?

I am not 100% sure but solr 1.2.0 may use default locale in lowercase
operation. What is your default locale?

It is better to see what happens word Intel using analysis.jsp page.

--
View this message in context:
http://old.nabble.com/Wildcard-Search-and-Filter-in-Solr-tp27306734p27334486.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard Search and Filter in Solr

2010-01-27 Thread Ahmet Arslan


 Hi just looked at the analysis.jsp and found out what it
 does during index /
 query
 
 Index Analyzer
 Intel 
 intel 
 intel 
 intel 
 intel 
 intel 

If the resultant token is intel, then q=inte* should return documents.
What does the debug output say when you add debugQuery=on to your search URL?
And why are you using an old version of Solr?
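
For anyone following along, a minimal Python sketch of building such a debug
request is shown here. The base URL is the usual example address and the helper
name is made up for illustration; only the debugQuery=on and indent=on
parameters come from this thread.

```python
from urllib.parse import urlencode

def debug_select_url(base_url, query):
    """Build a select URL with debugQuery=on so the response includes the
    parsed form of the query. debug_select_url is a hypothetical helper."""
    params = {"q": query, "debugQuery": "on", "indent": "on"}
    return base_url + "?" + urlencode(params)

url = debug_select_url("http://localhost:8983/solr/select", "inte*")
print(url)
```

The parsed query in the debug section of the response shows which field and
which term the wildcard was actually run against.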

Re: Wildcard Search and Filter in Solr

2010-01-27 Thread Erik Hatcher

Give Solr 1.4 a try.

Erik

On Jan 27, 2010, at 12:56 AM, ashokcz wrote:

Hi just looked at the analysis.jsp and found out what it does during
index /

query

Index Analyzer
Intel
intel
intel
intel
intel
intel

Query Analyzer
Inte*
Inte*
inte*
inte
inte
inte
int

I think somewhere my configuration or my definition of the type
text is

wrong.
This is my configuration .

filter class=solr.StopFilterFactory/
filter class=solr.TrimFilterFactory/
filter class=solr.PorterStemFilterFactory/
/analyzer

analyzer type=query
tokenizer class=solr.WhitespaceTokenizerFactory/
filter class=solr.SynonymFilterFactory expand=true
ignoreCase=true

synonyms=synonyms.txt/
filter class=solr.LowerCaseFilterFactory/
filter catenateAll=0 catenateNumbers=0 catenateWords=0
class=solr.WordDelimiterFilterFactory generateNumberParts=1
generateWordParts=1/
filter class=solr.StopFilterFactory/
filter class=solr.TrimFilterFactory/
filter class=solr.PorterStemFilterFactory/
/analyzer

/fieldType

I think i am missing some basic configuration for doing wildcard
searches .

but could not figure it out .
can someone help please

Ahmet Arslan wrote:

In solr admin interface i m trying to do the search like
this

http://localhost:8983/solr/select?indent=on&version=2.2&q=intel&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=

and i m getting the result properly

but when i use q=inte* no records are returned.

the same is the case for Filter Query on using
fq=VendorName:Intel i get
my results.

but on using fq=VendorName:Inte* no results are
returned.

I can guess i doing mistake in few obvious things , but
could not figure it
out ..
Can someone pls help me out :) :)

If q=intel returns documents while q=inte* does not, it means that
fieldType of your defaultSearchField is reducing the token intel into
something.

Can you find out it by using /admin/anaysis.jsp what happens to
Intel

intel at index and query time?

What is your defaultSearchField? Is it VendorName?

It is expected that fq=VendorName:Intel returns results while
fq=VendorName:Inte* does not. Because prefix queries are not
analyzed.

But it is strange that q=inte* does not return anything. Maybe your
index

analyzer is reducing Intel into int or ıntel?

I am not 100% sure but solr 1.2.0 may use default locale in
lowercase

operation. What is your default locale?

It is better to see what happens word Intel using analysis.jsp page.

--
View this message in context:
http://old.nabble.com/Wildcard-Search-and-Filter-in-Solr-tp27306734p27334486.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard Search and Filter in Solr

2010-01-26 Thread ashokcz


Hi just looked at the analysis.jsp and found out what it does during index /
query

Index Analyzer
Intel 
intel 
intel 
intel 
intel 
intel 

Query Analyzer
Inte* 
Inte* 
inte* 
inte 
inte 
inte 
int 

I think somewhere my configuration or my definition of the type text is
wrong.
This is my configuration . 

<fieldType class="solr.TextField" name="text">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
            class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
            generateWordParts="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true"
            ignoreCase="true"
            synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
            class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
            generateWordParts="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

I think i am missing some basic configuration for doing wildcard searches .
but could not figure it out .
can someone help please


Ahmet Arslan wrote:
 
 
 Hi , 
 I m trying to use wildcard keywords in my search term and
 filter term . but
 i didnt get any results.
 Searched a lot but could not find any lead .
 Can someone help me in this.
 i m using solr 1.2.0 and have few records indexed with
 vendorName value as
 Intel
 
 In solr admin interface i m trying to do the search like
 this 
 
 http://localhost:8983/solr/select?indent=on&version=2.2&q=intel&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=
 
 and i m getting the result properly 
 
 but when i use q=inte* no records are returned.
 
 the same is the case for Filter Query on using
 fq=VendorName:Intel i get
 my results.
 
 but on using fq=VendorName:Inte* no results are
 returned.
 
 I can guess i doing mistake in few obvious things , but
 could not figure it
 out ..
 Can someone pls help me out :) :)
 
 If q=intel returns documents while q=inte* does not, it means that
 fieldType of your defaultSearchField is reducing the token intel into
 something. 
 
 Can you find out it by using /admin/anaysis.jsp what happens to Intel
 intel at index and query time?
 
 What is your defaultSearchField? Is it VendorName?
 
 It is expected that fq=VendorName:Intel returns results while
 fq=VendorName:Inte* does not. Because prefix queries are not analyzed.
  
 
 But it is strange that q=inte* does not return anything. Maybe your index
 analyzer is reducing Intel into int or ıntel? 
 
 I am not 100% sure but solr 1.2.0  may use default locale in lowercase
 operation. What is your default locale?
 
 It is better to see what happens word Intel using analysis.jsp page.
 
 
 
 
 

-- 
View this message in context: 
http://old.nabble.com/Wildcard-Search-and-Filter-in-Solr-tp27306734p27334486.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard Search and Filter in Solr

2010-01-25 Thread Ahmet Arslan

In solr admin interface i m trying to do the search like
this

http://localhost:8983/solr/select?indent=on&version=2.2&q=intel&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=

and i m getting the result properly

but when i use q=inte* no records are returned.

the same is the case for Filter Query on using
fq=VendorName:Intel i get
my results.

but on using fq=VendorName:Inte* no results are
returned.

I can guess i doing mistake in few obvious things , but
could not figure it
out ..
Can someone pls help me out :) :)

If q=intel returns documents while q=inte* does not, it means that the fieldType
of your defaultSearchField is reducing the token intel into something else.

Can you use /admin/analysis.jsp to find out what happens to Intel and intel
at index and query time?

What is your defaultSearchField? Is it VendorName?

It is expected that fq=VendorName:Intel returns results while
fq=VendorName:Inte* does not, because prefix queries are not analyzed.

But it is strange that q=inte* does not return anything. Maybe your index
analyzer is reducing Intel into int or ıntel?

I am not 100% sure, but solr 1.2.0 may use the default locale in its lowercase
operation. What is your default locale?

It is better to see what happens to the word Intel using the analysis.jsp page.
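
The point about prefix queries not being analyzed can be illustrated with a
small Python sketch. This is a toy model, not Solr code: analyze is a made-up
stand-in for an index-time chain that only splits on whitespace and lowercases,
and prefix_matches compares the prefix exactly as typed.

```python
def analyze(text):
    """Toy stand-in for an index-time analyzer: whitespace split + lowercase."""
    return [token.lower() for token in text.split()]

def prefix_matches(index_terms, raw_prefix):
    """Prefix queries skip analysis, so the prefix is compared as typed."""
    return sorted(term for term in index_terms if term.startswith(raw_prefix))

index_terms = set(analyze("Intel"))          # the index holds only "intel"
upper = prefix_matches(index_terms, "Inte")  # as in fq=VendorName:Inte*
lower = prefix_matches(index_terms, "inte")  # as in fq=VendorName:inte*
print(upper, lower)
```

In this model the capitalized prefix finds nothing while the lowercase one
finds intel, which is the behavior described for fq=VendorName:Inte*.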

Re: wildcard search and hierarchical faceting

2010-01-25 Thread Erik Hatcher


There are some approaches outlined here that might be of interest:

http://wiki.apache.org/solr/HierarchicalFaceting


On Jan 24, 2010, at 2:54 AM, Andy wrote:


I'd like to provide a hierarchical faceting functionality.

An example would be location drill down such as USA - New York -  
New York City - SoHo


The number of levels can be arbitrary. One way to handle this could  
be to use a special character as separator, store values such as  
USA|New York|New York City|SoHo and use wildcard search. So if  
USA has been selected, the fq would be USA*


I read somewhere that when using wildcard search, no stemming or  
tokenization will be performed. So USA will not match 'usa. Is  
there any way to work around that?


Or would you recommend a different way to handle hierarchical  
faceting?
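
One way to picture the separator idea from the question, together with a
workaround for the case problem, is the Python sketch below. It is only an
illustration under assumptions: the field name location is made up, the field
is assumed to be an untokenized string type, and lowercasing is done in client
code on both the indexing and the query side so that USA and usa end up as the
same term.

```python
import re

def hierarchy_tokens(levels, sep="|"):
    """Return one lowercased value per depth, e.g. usa, usa|new york, ..."""
    return [sep.join(level.lower() for level in levels[:depth])
            for depth in range(1, len(levels) + 1)]

def escape_term(text):
    """Backslash-escape whitespace and | so the query parser keeps the
    prefix together as a single term."""
    return re.sub(r"([\s|])", r"\\\1", text)

def prefix_filter(field, selected_levels, sep="|"):
    """Build an fq value matching everything below the selected levels."""
    prefix = sep.join(level.lower() for level in selected_levels)
    return f"{field}:{escape_term(prefix)}*"

tokens = hierarchy_tokens(["USA", "New York", "New York City", "SoHo"])
fq_top = prefix_filter("location", ["USA"])
fq_city = prefix_filter("location", ["USA", "New York"])
print(tokens, fq_top, fq_city)
```

Because the wildcard term is not analyzed, normalizing the case before the
value reaches Solr is what makes the prefix line up with the indexed terms.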

Re: wildcard search is not working

2009-08-06 Thread Avlesh Singh

Go through this thread first - http://markmail.org/message/bannl2fpblt5sqlw
If it still does not help, post back your field type definition in
schema.xml

Cheers
Avlesh

On Thu, Aug 6, 2009 at 3:46 PM, Radha C. cra...@ceiindia.com wrote:

 Hi,


 I have documents that contain the words healthcare articles. I need to match
 these documents for query strings such as health, articles...

 I tried q=health*, q=helath*, q=heath*articles but every one returns an
 empty result. When I try q=healthcare artilces, the search returns the proper
 documents.

 Can anyone tell me what is wrong with my query string?

Re: Wildcard Search

2009-05-08 Thread Erick Erickson

Are you by any chance stemming the field when you index?

Erick

On Fri, May 8, 2009 at 2:29 AM, dabboo ag...@sapient.com wrote:


 Hi,

 I am facing a weird issue while searching.

 When I search for *system*, it displays all the records which contain
 system, systems etc. But when I search for *systems*, it returns only
 those records which have systems-, systems/ etc. It is treating the
 wildcard as one or more characters rather than zero or more.

 So it is not returning records which have systems as a single word. Is there
 any way to resolve this?

 Please suggest.

 Thanks,
 Amit Garg
 --
 View this message in context:
 http://www.nabble.com/Wildcard-Search-tp23440795p23440795.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard Search

2009-05-08 Thread dabboo


Yes, that's correct. I have applied EnglishPorterFilterFactory at index time
as well. Do you think I should remove it and do the indexing again?


Erick Erickson wrote:
 
 Are you by any chance stemming the field when you index?
 
 Erick
 
 On Fri, May 8, 2009 at 2:29 AM, dabboo ag...@sapient.com wrote:
 

 Hi,

 I am facing a n wierd issue while searching.

 I am searching for word *sytem*, it displays all the records which
 contains
 system, systems etc. But when I tried to search *systems*, it only
 returns
 me those records, which have systems-, systems/ etc etc. It is
 considering
 wildcard as 1 or more character and not zero character.

 So, it is not returning records which has systems has one word. Is there
 any
 way to resolve this.

 Please suggest.

 Thanks,
 Amit Garg
 --
 View this message in context:
 http://www.nabble.com/Wildcard-Search-tp23440795p23440795.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 

-- 
View this message in context: 
http://www.nabble.com/Wildcard-Search-tp23440795p23445966.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard Search

2009-05-08 Thread Erick Erickson

My *guess* is that what you're seeing is that wildcard searches are
not analyzed, in this case not run through the stemmer. So your
index only contains system and the funky variants (e.g. systems/).
I don't really understand why you'd get systems/ in your index, but
I'm assuming that your filter chain doesn't remove things like slashes.

So, you have system and systems/ in your index, but not
systems due to stemming, so searching for systems*
translates into systems/ OR systems- OR ..., and since
no documents have the bare term systems, you don't get them as hits.

All that said, you need to revisit your indexing parameters to make
what happens fit your expectations. I'd advise getting a copy of Luke
and pointing it at your index in order to see what *really* gets put in
it.
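
To make the guess above concrete, here is a toy Python model of it. It is not
how the Porter stemmer works: toy_stem is a made-up stand-in that only strips a
trailing s from purely alphabetic tokens, which is enough to show why systems
disappears from the index while systems/ and systems- survive.

```python
def toy_stem(token):
    """Very rough stand-in for a stemmer: strip one trailing 's' from purely
    alphabetic tokens. Real stemmers do much more than this."""
    if token.isalpha() and token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token

def index_terms(docs):
    """Lowercase, split on whitespace and stem every token of every doc."""
    terms = set()
    for doc in docs:
        for token in doc.lower().split():
            terms.add(toy_stem(token))
    return terms

terms = index_terms(["Systems overview", "the system", "systems/ admin", "systems- list"])
# A wildcard query is not stemmed, so systems* expands over the raw terms.
expanded = sorted(term for term in terms if term.startswith("systems"))
print(expanded)
```

In this model the expansion contains only the variants with trailing
punctuation, matching the symptom reported in the original question.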

Best
Erick



You need to either introduce filters that remove odd stuff like slashes

On Fri, May 8, 2009 at 9:25 AM, dabboo ag...@sapient.com wrote:


 Yes, thats correct. I have applied EnglishPorterFilterFactory at index time
 as well. Do you think, I should remove it and do the indexing again.


 Erick Erickson wrote:
 
  Are you by any chance stemming the field when you index?
 
  Erick
 
  On Fri, May 8, 2009 at 2:29 AM, dabboo ag...@sapient.com wrote:
 
 
  Hi,
 
  I am facing a n wierd issue while searching.
 
  I am searching for word *sytem*, it displays all the records which
  contains
  system, systems etc. But when I tried to search *systems*, it only
  returns
  me those records, which have systems-, systems/ etc etc. It is
  considering
  wildcard as 1 or more character and not zero character.
 
  So, it is not returning records which has systems has one word. Is there
  any
  way to resolve this.
 
  Please suggest.
 
  Thanks,
  Amit Garg
  --
  View this message in context:
  http://www.nabble.com/Wildcard-Search-tp23440795p23440795.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 

 --
 View this message in context:
 http://www.nabble.com/Wildcard-Search-tp23440795p23445966.html
 Sent from the Solr - User mailing list archive at Nabble.com.

RE: Wildcard search with q query

2009-02-24 Thread Jana, Kumar Raja

Hi Amit,
Leading wildcard searches are not allowed in Solr out of the box. There
are other techniques to overcome this limitation.
Searching for "Leading Wildcards" in the user mailing list archives will
return the relevant mail threads, which might be pretty useful to you.
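
The message does not say which techniques are meant. One commonly described
approach is to index each term reversed as well, so that a leading wildcard
becomes an ordinary prefix match on the reversed form. The Python sketch below
shows only that idea, with made-up helper names; it covers *tes style queries
and does not handle a wildcard at both ends such as *tes*.

```python
def build_reversed_index(terms):
    """Store every term reversed, e.g. 'latest' becomes 'tsetal'."""
    return sorted(term[::-1] for term in terms)

def leading_wildcard(reversed_terms, suffix):
    """Answer a *suffix query as a prefix match on the reversed terms."""
    reversed_prefix = suffix[::-1]
    return sorted(term[::-1] for term in reversed_terms
                  if term.startswith(reversed_prefix))

reversed_terms = build_reversed_index(["test", "latest", "tester", "contest"])
hits = leading_wildcard(reversed_terms, "test")   # behaves like *test
print(hits)
```

The cost of this approach is a larger index, since every term is stored twice
(once as-is for normal searches and once reversed).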

-Kumar


-Original Message-
From: dabboo [mailto:ag...@sapient.com] 
Sent: Tuesday, February 24, 2009 4:04 PM
To: solr-user@lucene.apache.org
Subject: Wildcard search with q query


Hi,

I am trying to perform wildcard search using q query. 

e.g. If I give tes* as the query, it works fine and returns the results
as
expected.

But if I give *tes*, then it throws an exception saying that we cant
have
wildcards in front of any string.

Please suggest.

Thanks,
Amit Garg
-- 
View this message in context:
http://www.nabble.com/Wildcard-search-with-q-query-tp22179548p22179548.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: wildcard search issue

2009-02-05 Thread Jana, Kumar Raja

Hi Mahendra,

Wildcard searches are case-sensitive in Solr. I faced the same issue and 
handled it by converting the query string to lower case in my own code. The 
filters and analyzers are not applied to wildcard queries.


FYI
You will also face problems when you use keywords/operators in upper case in 
your queries. Queries like Jack AND Jill, or ones containing words like OR, 
NOT, etc., will throw errors from the SolrQueryParser. You will have to convert 
such strings to lower case too. In this case, the exception is thrown during 
the parsing stage itself. Again, the filters and analyzers have no effect 
(since they never come into the picture here either).
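
A minimal Python sketch of the client-side workaround described above follows.
The helper name is made up, and it assumes the field is lowercased at index
time (as with LowerCaseFilterFactory), so lowercasing the user's input makes
the wildcard term line up with what is in the index.

```python
def lowercase_wildcard_query(field, user_input):
    """Lowercase the user's term in client code before sending it, because
    the wildcard term is not passed through the field's analyzer."""
    term = user_input.strip().lower()
    return f"{field}:{term}"

query = lowercase_wildcard_query("UserName", "Cust*")
print(query)
```

The field name is left untouched because field names are case-sensitive; only
the term the user typed is lowercased.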

-Kumar

-Original Message-
From: mahendra mahendra [mailto:mahendra_featu...@yahoo.com] 
Sent: Friday, February 06, 2009 12:34 PM
To: solr-user@lucene.apache.org
Subject: wildcard search issue

Hi,
 
Case-insensitive wildcard search is not working for my TextField type.
I tried searching for UserName:cust* and it gave results, but 
UserName:Cust* didn't give any results. How can I make it work?
I have defined my TextField in the following way.

<fieldtype name="textTight" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>

Any help would be appreciated!

Thanks & Regards,
Mahendra

Re: Wildcard search question

2008-06-24 Thread Jon Drukman


Norberto Meijome wrote:
ok well let's say that i can live without john/jon in the short term. 
what i really need today is a case insensitive wildcard search with 
literal matching (no fancy stemming.  bobby is bobby, not bobbi.)


what are my options?

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

define your own type (or modify text / string... but I find that it gets 
confusing to have variations of text / string ...) to perform the operations on 
the content as needed.

There are also other tokenizer/analysers available that *may* help in the 
partial searches (ngram , edgengram ), but there isn't much documentation on 
them yet (that I could find) - I am only getting into them myself i'll see 
how it goes..


thanks, that got me on the right track.  i came up with this:

<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

now searching for user_name:bobby* works as i wanted.

my next question: is there a way that i can score matches that are at 
the start of the string higher than matches in the middle?  for example, 
if i search for steve, i get kelly stevenson before steve jobs.  i'd 
like steve jobs to come first.


-jsd-

Re: Wildcard search question

2008-06-23 Thread Erik Hatcher


Jon,

You provided a lot of nice details, thanks for helping us help you :)

The one missing piece is the definition of the text field type.   In  
Solr's _example_ schema, bobby gets analyzed (stemmed) to  
bobbi[1].   When you query for bobby*, the query parser is not  
running an analyzer on the wildcard query, thus literally searching  
for terms that begin with bobby[2].


As for steve , same story, but it analyzes to steve, which is  
found with a steve* query.


Erik


[1] I used Solr's admin/analysis.jsp to double-check the text field  
type behavior.


[2] I requested http://localhost:8983/solr/select/?q=bobby*&debugQuery=true  
and saw the parsed query in the debug output as text:bobby*


On Jun 23, 2008, at 4:13 PM, Jon Drukman wrote:


When I search with q=bobby I get the following record:

<doc>
  <date name="date">2008-06-23T07:06:40Z</date>
  <str name="description">http://farm1.static.flickr.com/117/...</str>
  <int name="id">9</int>
  <str name="name">Bobby Gaza</str>
  <str name="user">[EMAIL PROTECTED]</str>
</doc>

When I search with bobby* I get nothing.

When I search with steve* I get Steve Ballmer and Steve Jobs...  
What's going on?



The relevant part of my schema.xml is:

<fields>
  <field name="id" type="integer" indexed="true" stored="true" required="true" />
  <field name="user_id" type="integer" indexed="true" stored="true" />
  <field name="name" type="text" indexed="true" stored="true"/>
  <field name="description" type="text" indexed="true" stored="true"/>
  <field name="tags" type="text" indexed="true" stored="true"/>
  <field name="user" type="text" indexed="true" stored="true"/>
  <field name="date" type="date" indexed="true" stored="true"/>
  <field name="type" type="string" indexed="true" stored="false"/>
  <field name="type_id" type="string" indexed="true" stored="false"/>
  <field name="thumb_url" type="string" indexed="true" stored="true"/>
</fields>

<!-- Field to use to determine and enforce document uniqueness.
     Unless this field is marked with required="false", it will be a required field
  -->
<uniqueKey>type_id</uniqueKey>

<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>name</defaultSearchField>

Re: Wildcard search question

2008-06-23 Thread Jon Drukman


Erik Hatcher wrote:

Jon,

You provided a lot of nice details, thanks for helping us help you :)

The one missing piece is the definition of the text field type.   In 
Solr's _example_ schema, bobby gets analyzed (stemmed) to 
bobbi[1].   When you query for bobby*, the query parser is not running 
an analyzer on the wildcard query, thus literally searching for terms 
that begin with bobby[2].


As for steve , same story, but it analyzes to steve, which is found 
with a steve* query.


so, what's the solution?  if i change the field to string, will it be 
able to find bobby* ?  eventually it would be nice to be able to use 
fuzzy matching, to find 'jon' from 'john', for example.


thanks
-jsd-

Re: Wildcard search question

2008-06-23 Thread Erik Hatcher



On Jun 23, 2008, at 4:45 PM, Jon Drukman wrote:

Erik Hatcher wrote:

Jon,
You provided a lot of nice details, thanks for helping us help you :)
The one missing piece is the definition of the text field type.
In Solr's _example_ schema, bobby gets analyzed (stemmed) to  
bobbi[1].   When you query for bobby*, the query parser is not  
running an analyzer on the wildcard query, thus literally searching  
for terms that begin with bobby[2].
As for steve , same story, but it analyzes to steve, which is  
found with a steve* query.


so, what's the solution?


it depends(tm) ;)


 if i change the field to string, will it be able to find bobby* ?


No, because the original data is <str name="name">Bobby Gaza</str>, so  
Bobby* would match, but not bobby*. The string type (in the example  
schema, to be clear) does effectively no analysis, leaving the  
original string indexed as-is, case and all.


 eventually it would be nice to be able to use fuzzy matching, to  
find 'jon' from 'john', for example.


you could search for john~ to do that.  or bobby~ would match bobbi.

stemming and wildcard term queries aren't quite compatible, as you've  
found, but it does depend on how much of the prefix is provided.  bob*  
matches bobbi, for example.


Erik
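
For intuition about why john~ can reach jon and bobby~ can reach bobbi: fuzzy
matching is based on edit distance between terms. The Python sketch below is a
plain Levenshtein distance, written for illustration only; it is not Lucene's
implementation and says nothing about how Lucene turns the distance into a
similarity threshold.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance: the minimum number
    of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

print(edit_distance("john", "jon"), edit_distance("bobby", "bobbi"))
```

Both pairs are a single edit apart, which is why a fuzzy query can bridge the
gap that stemming opens between bobby and the indexed term bobbi.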

Re: Wildcard search question

2008-06-23 Thread Jon Drukman


Erik Hatcher wrote:
No, because the original data is str name=nameBobby Gaza/str, so 
Bobby* would match, but not bobby*. string type (in the example 
schema, to be clear) does effectively no analysis, leaving the original 
string indexed as-is, case and all.

[...]

stemming and wildcard term queries aren't quite compatible, as you've 
found, but it does depend on how much of the prefix is provided.  bob* 
matches bobbi, for example.


ok well let's say that i can live without john/jon in the short term. 
what i really need today is a case insensitive wildcard search with 
literal matching (no fancy stemming.  bobby is bobby, not bobbi.)


what are my options?

-jsd-

Re: Wildcard search question

2008-06-23 Thread Norberto Meijome

On Mon, 23 Jun 2008 14:23:14 -0700
Jon Drukman [EMAIL PROTECTED] wrote:

 ok well let's say that i can live without john/jon in the short term. 
 what i really need today is a case insensitive wildcard search with 
 literal matching (no fancy stemming.  bobby is bobby, not bobbi.)
 
 what are my options?

Jon,
read 
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

define your own type (or modify text / string... but I find that it gets 
confusing to have variations of text / string ...) to perform the operations on 
the content as needed.

There are also other tokenizer/analysers available that *may* help in the 
partial searches (ngram , edgengram ), but there isn't much documentation on 
them yet (that I could find) - I am only getting into them myself i'll see 
how it goes..
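
As a rough picture of what the edge n-gram idea does for partial matching, here
is a small Python sketch. It is a toy, not the Solr filter: the function name
and the min_gram / max_gram parameters are made up for illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=5):
    """Return the prefixes of a token between min_gram and max_gram
    characters long, e.g. bobby gives b, bo, bob, bobb, bobby."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

indexed = set(edge_ngrams("bobby".lower(), min_gram=2, max_gram=4))
print(sorted(indexed))
```

With the prefixes themselves stored as terms, a plain term query for bob can
match without any wildcard, at the cost of a larger index.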

B

_
{Beto|Norberto|Numard} Meijome

There are no stupid questions, but there are a LOT of inquisitive idiots.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.
