Re: query with @ and *

2017-09-14 Thread Erick Erickson
See: 
https://lucidworks.com/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/

It discusses the general problem of particular filters being able to
cope with wildcards or not. Generally any filter that could
potentially produce more than one output token per input token is
skipped when wildcards are encountered.
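
A minimal sketch of that rule, in plain Python rather than Solr code. The component names and the `CHAIN` structure are hypothetical stand-ins, and the wildcard character itself is assumed to have been stripped by the caller:

```python
import re

# Hypothetical stand-ins for analysis components; each one maps a single
# input string to a list of output tokens.
def split_on_punctuation(text):
    return [t for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def lowercase(text):
    return [text.lower()]

# (component, can_emit_several_tokens) pairs, in chain order.
CHAIN = [(split_on_punctuation, True), (lowercase, False)]

def analyze(text, wildcard=False):
    """Run text through the chain; skip multi-output steps for wildcard terms."""
    tokens = [text]
    for component, multi_output in CHAIN:
        if wildcard and multi_output:
            continue  # a step that may emit several tokens is skipped
        tokens = [out for tok in tokens for out in component(tok)]
    return tokens

print(analyze("Test@One.com"))             # full analysis: split, then lowercase
print(analyze("Test@One", wildcard=True))  # only lowercasing is applied
```

In this sketch the wildcard prefix keeps its @ character, while the indexed value is split into separate terms, which is the mismatch discussed further down the thread.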

Best,
Erick



Re: query with @ and *

2017-09-14 Thread Susheel Kumar
You may want to use UAX29URLEmailTokenizerFactory tokenizer into your
analysis chain.
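
To illustrate why that could help, here is a rough Python imitation of a tokenizer that keeps an email address as one token. It is only a sketch: the real tokenizer follows the UAX#29 rules and also recognizes URLs, the regular expressions below are simplified assumptions, and the sample text is made up.

```python
import re

# Very rough patterns, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
WORD = re.compile(r"[A-Za-z0-9]+")

def tokenize_keep_email(text):
    """Emit each email address as a single token, other text as plain words."""
    tokens, pos = [], 0
    for match in EMAIL.finditer(text):
        tokens += WORD.findall(text[pos:match.start()])  # words before the address
        tokens.append(match.group())                     # the address, unsplit
        pos = match.end()
    tokens += WORD.findall(text[pos:])                   # words after the last address
    return [t.lower() for t in tokens]

terms = tokenize_keep_email("Contact test@one.com today")
print(terms)
# A prefix such as test@one now has a term it can match.
print([t for t in terms if t.startswith("test@one")])
```

One trade-off to keep in mind: with the address stored as a single term, a plain search for test would no longer match it, although test* still would.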

Thanks,
Susheel




Re: query with @ and *

2017-09-14 Thread Shawn Heisey
On 9/14/2017 5:06 AM, Mannott, Birgit wrote:
> I have a problem when searching on email addresses.
> @ seems to be handled as a special character but I don't find anything about 
> it in the documentation.
>
> This is my test data
> t...@one.com
> t...@two.com

Chances are that you have analysis defined on this field, and that the
analysis includes a tokenizer or tokenizer/filter combination that
splits on punctuation.  This means that for both entries, you have
three terms.  For the first one, those terms are test, one, and com.
For the second one, they are test, two, and com.  The rest of what I'm
writing assumes that this is the case.
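
A small Python sketch of that assumed index-time behavior. The `tokenize` function is a hypothetical stand-in for the real analysis chain, and the full addresses are an assumption based on the terms listed above, since the archive shows them only in shortened form:

```python
import re

def tokenize(text):
    """Split on every run of non-alphanumeric characters and lowercase."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

# Assumed test data; the @ and . characters never reach the index.
docs = {1: "test@one.com", 2: "test@two.com"}
index = {doc_id: tokenize(value) for doc_id, value in docs.items()}
print(index)
```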

> searching for test* returns both, ok.

This matches the term "test" in both entries.

> searching for t...@one.com returns the correct one, ok.

Query analysis probably splits the same way index analysis does, so the 
actual search is for all three terms.

> searching for test returns both, which I didn't expect, but it's ok.

In this case, it matches the simple term "test" that's in the index on
both documents.

> searching for test@one* returns none, and that's the problem.

When you include wildcards in a query, most query analysis is skipped, 
so it's looking for the literal text "test@one" followed by any
characters.  Because the index analysis removed the @ character and
split the things around it into separate terms, this will not match any
of the terms in the index.
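
The mismatch can be sketched in a few lines of Python. This is a simplified model, not what Solr does internally: `tokenize` and `prefix_search` are hypothetical helpers, and the addresses are the assumed test data.

```python
import re

def tokenize(text):
    """Index-time analysis as assumed above: split on punctuation, lowercase."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

# Assumed test data, stored as separate terms per document.
index = {
    "doc1": tokenize("test@one.com"),
    "doc2": tokenize("test@two.com"),
}

def prefix_search(prefix):
    """Return ids of documents with at least one term starting with prefix."""
    return sorted(doc_id for doc_id, terms in index.items()
                  if any(term.startswith(prefix) for term in terms))

print(prefix_search("test"))      # query test*     -> ['doc1', 'doc2']
print(prefix_search("test@one"))  # query test@one* -> []
```

No indexed term contains the @ character, so no term can start with test@one.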

Wildcards, while they do work in many cases, are often not the correct
way to do queries.

Thanks,
Shawn



Re: query with @ and *

2017-09-14 Thread Atita Arora
Hi,

Can you give us a little information about the query parser you are using in
your handler?

Thanks,
Ati


On Thu, Sep 14, 2017 at 4:36 PM, Mannott, Birgit wrote:

> Hi,
>
> I have a problem when searching on email addresses.
> @ seems to be handled as a special character but I don't find anything
> about it in the documentation.
>
> This is my test data
> t...@one.com
> t...@two.com
>
> searching for test* returns both, ok.
> searching for t...@one.com returns the correct one, ok.
> searching for test returns both, which I didn't expect, but it's ok.
> searching for test@one* returns none, and that's the problem.
>
> Escaping the @ character doesn't change it.
> It seems that every query containing @ and * has no result.
>
> Does anyone have an idea how to change this?
>
> Thanks,
> Birgit
>
>
>
>
>
>


Re: Query with AND|OR operator with Dismaxrequest

2009-05-19 Thread dabboo

Hi,

Thanks for the information. I would appreciate it if somebody could share the
URL of the JIRA issue.


-- 
View this message in context: 
http://www.nabble.com/Query-with-AND%7COR-operator-with-Dismaxrequest-tp23594592p23612577.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query with AND|OR operator with Dismaxrequest

2009-05-18 Thread Otis Gospodnetic

Prerna,

Yes, DisMax doesn't take in queries with Boolean operators.  But I believe 
there is a patch in JIRA that makes that possible.
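
One commonly used workaround, which is not necessarily what that patch does, is to move the Boolean clauses into a filter query (fq), since filter queries are parsed with the standard Lucene syntax, and let DisMax handle only the scored part. A hedged Python sketch that builds such a request follows; it assumes the Boolean part is needed only to restrict results and not to influence scoring, and it reuses the handler name and field names from the original post:

```python
from urllib.parse import urlencode

# Boolean structure copied from the original query, written unencoded.
boolean_filter = (
    "(intAgeFrom_product_i:[0 TO 3] AND intAgeTo_product_i:[3 TO *]) OR "
    "(intAgeFrom_product_i:[0 TO 3] AND intAgeTo_product_i:[0 TO 3]) OR "
    "(ageFrom_product_s:Adult)"
)

params = {
    "qt": "dismaxrequest",  # handler name taken from the original post
    "q.alt": "*:*",         # match all documents when no user text is supplied
    "fq": boolean_filter,   # restricts results; does not affect scoring
}

query_string = urlencode(params)
print(query_string)
```

Boosting would then be configured through the usual DisMax parameters on the handler, for example qf or bq, while the filter query only narrows the result set.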

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message -----
> From: prerna07 
> To: solr-user@lucene.apache.org
> Sent: Monday, May 18, 2009 6:05:50 AM
> Subject: Query with AND|OR operator with Dismaxrequest
> 
> 
> 
> Hi,
> 
> I am not getting correct results with a query that has multiple AND | OR
> operators.
> 
> Query Format q=((A AND B) OR (C OR D) OR E) 
> 
> ?q=((intAgeFrom_product_i:[0+TO+3]+AND+intAgeTo_product_i:[3+TO+*])+OR+(intAgeFrom_product_i:[0+TO+3]+AND+intAgeTo_product_i:[0+TO+3])+OR+(ageFrom_product_s:Adult))&qt=dismaxrequest
> 
> The query returns correct results without Dismaxrequest, but incorrect
> results with Dismaxrequest.
> 
> I have to use dismaxrequest because I need boosting of search results.
> 
> According to some posts, there are issues with the AND | OR operators with
> dismaxrequest.
> Please let me know if anyone has faced the same problem and if there is any
> way to make the query work with dismaxrequest.
> 
> Thanks,
> Prerna
> 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Query-with-AND%7COR-operator-with-Dismaxrequest-tp23594592p23594592.html
> Sent from the Solr - User mailing list archive at Nabble.com.