Re: AW: Leading wildcards

2007-04-20 Thread Maarten . De . Vilder
Thanks, this worked like a charm!

We built a custom "QueryParser" and integrated the *foo** trick into it, so
we can now search with leading wildcards, trailing wildcards, and both.

The only annoying thing is the max Boolean clauses limit, but I'm going to
look into that after the weekend.

For the next release of Solr:
do not make this the default (too many risks),
but do add an option in the config to enable it. It's a very nice feature.

Thanks everybody for the help, and have a nice weekend,
maarten


"Burkamp, Christian" <[EMAIL PROTECTED]>
19/04/2007 12:37
Please respond to: solr-user@lucene.apache.org
Subject: AW: Leading wildcards

Hi there,

Solr does not support leading wildcards because it uses Lucene's standard
QueryParser class without changing the defaults. You can easily change
this by inserting the line

parser.setAllowLeadingWildcard(true);

in QueryParsing.java at line 92, right after the QueryParser instance is
created in QueryParsing.parseQuery(...).

This obviously means that you have to change Solr's source code. It
would be nice to have an option in the schema to switch leading wildcards
on or off per field. Leading wildcards really make no sense on richly
populated fields, because such queries tend to fail with "too many clauses"
exceptions most of the time.
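
To illustrate why this happens: in Lucene versions of this era, a wildcard query is rewritten into a BooleanQuery with one clause per matching indexed term, and the default limit is 1024 clauses. The sketch below is a toy model in plain Java, not Lucene code. The class WildcardExpansionSketch, its methods, and the sample terms are hypothetical and only mimic the expansion step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Toy model only: mimics how a wildcard expands to one clause per matching term.
class WildcardExpansionSketch {
    // Lucene's default BooleanQuery clause limit.
    static final int DEFAULT_MAX_CLAUSES = 1024;

    // Translate Lucene-style wildcards (* and ?) into a Java regex.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        wildcard.codePoints().forEach(cp -> {
            if (cp == '*') {
                regex.append(".*");
            } else if (cp == '?') {
                regex.append('.');
            } else {
                // Everything else is matched literally.
                regex.append(Pattern.quote(new String(Character.toChars(cp))));
            }
        });
        return Pattern.compile(regex.toString());
    }

    // Collect every indexed term the wildcard matches, failing as soon as
    // the number of clauses would exceed maxClauses.
    static List<String> expand(String wildcard, List<String> indexedTerms, int maxClauses) {
        Pattern pattern = toPattern(wildcard);
        List<String> clauses = new ArrayList<>();
        for (String term : indexedTerms) {
            if (pattern.matcher(term).matches()) {
                if (clauses.size() == maxClauses) {
                    throw new IllegalStateException("too many clauses (limit " + maxClauses + ")");
                }
                clauses.add(term);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("elegance", "legal", "apple");
        System.out.println(expand("*lega*", terms, DEFAULT_MAX_CLAUSES));
    }
}
```

On a field with many distinct terms, a pattern such as *a* matches far more than 1024 of them, which is the situation the exception reports.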

This works for leading wildcards. Unfortunately, it does not enable
searches with both leading AND trailing wildcards. For example, searching
for "*lega*" finds nothing even if the term "elegance" is in the index. If
you put a second asterisk at the end (search for "*lega**"), the term
"elegance" is found.
Can anybody explain this? It seems to be more of a Lucene QueryParser
issue.

-- Christian

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Thursday, 19 April 2007 08:35
To: solr-user@lucene.apache.org
Subject: Leading wildcards


Hi,

We have been trying to get leading wildcards to work.

We have been looking around the Solr website, the Lucene website, wikis,
and the mailing lists, but we found a lot of contradictory information.

So we have a few questions:
- Is the latest version of Lucene capable of handling leading wildcards?
- Is the latest version of Solr capable of handling leading wildcards?
- Do we need to make adjustments to the Solr source code?
- If we need to adjust the Solr source, what do we need to change?

Thanks in advance!
Maarten




Re: AW: Leading wildcards

2007-04-20 Thread Michael Kimsal

Maarten:

Would you mind sharing your custom query parser?



--
Michael Kimsal
http://webdevradio.com


Re: AW: Leading wildcards

2007-04-23 Thread Maarten . De . Vilder
Hey,

I'm sorry for the confusion: our "custom query parser" is not a Lucene
query parser.

It is something we built for the client side of Solr.

It basically transforms some search arguments into a Solr query URL.

Example: the method query( searchID, searchQuery, category, ) returns
http://solrhost/solr/select/?q=id%3AsearchString+OR+query%3AsearchString&version=2.2&start=0&rows=10&indent=on
(that is what I mean by "query parsing").
This method performs a series of operations on the keywords and returns
a working Solr query.
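
A minimal sketch of such a URL-building step, assuming plain Java and java.net.URLEncoder. The class SolrUrlBuilder and its method names are hypothetical, not our actual framework code. The parameter values are copied from the example URL above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: turns a search string into a Solr select URL.
class SolrUrlBuilder {
    // Builds a query over the "id" and "query" fields, as in the example URL.
    static String selectUrl(String baseUrl, String searchString, int start, int rows) {
        String q = "id:" + searchString + " OR query:" + searchString;
        return baseUrl + "/select/?q=" + encode(q)
                + "&version=2.2&start=" + start
                + "&rows=" + rows
                + "&indent=on";
    }

    // URL-encodes a parameter value (':' becomes %3A, a space becomes '+').
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectUrl("http://solrhost/solr", "searchString", 0, 10));
    }
}
```

Note that URLEncoder leaves * unencoded, so wildcard characters in the search string pass through to Solr unchanged.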

We are using the Java Solr client, and we built a framework around it to
simplify our work.

Example for the wildcards:
we check (using regular expressions) whether there is a keyword that starts
and ends with an *, and if such a keyword is found, we add a second * at
the end.
By doing this we make sure we send a working query to the Solr server.

We also escape special characters and other wildcards this way.
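
The double-asterisk check could look roughly like the sketch below. This is an illustration under assumptions, not our actual code. The class WildcardWorkaround and its methods are hypothetical. Keywords are assumed to be whitespace-separated with no field prefix, and escaping of special characters is left out.

```java
import java.util.regex.Pattern;

// Hypothetical client-side rewrite: "*lega*" becomes "*lega**".
class WildcardWorkaround {
    // Matches a keyword that starts with '*' and ends with exactly one '*',
    // e.g. "*lega*", but not "*lega**", "lega*" or a bare "*".
    private static final Pattern LEADING_AND_TRAILING =
            Pattern.compile("^\\*[^*]+\\*$");

    // Appends a second '*' only when the keyword needs the workaround.
    static String fixKeyword(String keyword) {
        return LEADING_AND_TRAILING.matcher(keyword).matches()
                ? keyword + "*"
                : keyword;
    }

    // Applies the fix to every whitespace-separated keyword in a query.
    static String fixQuery(String query) {
        StringBuilder out = new StringBuilder();
        for (String keyword : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(fixKeyword(keyword));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fixQuery("elegant *lega* foo*"));
    }
}
```

Because a keyword that already ends in ** does not match the pattern, running the rewrite twice leaves the query unchanged.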

We also built in highlighting for wildcard queries:
if we see that the user is using wildcards, we don't use the standard
Solr highlighting (which doesn't work with wildcards).
Instead we use a regular expression to highlight the results after we get
them back from the server.
Example:
*foo* in the Solr query becomes .*foo.* in the regular expression (.* means
any series of characters in a regular expression).
Then we check whether our result matches this regular expression and put
some -tags around the matching words,
and before we knew it, our wildcard searches were highlighted.
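
A rough sketch of this kind of regex highlighting, under stated assumptions. The class WildcardHighlighter is hypothetical, and the <em> tag is only a placeholder for whatever markup is really used. Words are taken to be whitespace-separated tokens, and matching is case-insensitive.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical post-processing step: highlight words that match a wildcard term.
class WildcardHighlighter {
    // Converts a wildcard term to a regex: '*' -> ".*", '?' -> ".",
    // letters and digits are kept, anything else is quoted literally.
    static String toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        wildcard.codePoints().forEach(cp -> {
            if (cp == '*') {
                regex.append(".*");
            } else if (cp == '?') {
                regex.append('.');
            } else if (Character.isLetterOrDigit(cp)) {
                regex.appendCodePoint(cp);
            } else {
                regex.append(Pattern.quote(new String(Character.toChars(cp))));
            }
        });
        return regex.toString();
    }

    // Wraps every whitespace-separated word that fully matches the wildcard
    // term in <em> tags (placeholder tag), leaving the spacing untouched.
    static String highlight(String text, String wildcard) {
        Pattern term = Pattern.compile(toRegex(wildcard), Pattern.CASE_INSENSITIVE);
        Matcher words = Pattern.compile("\\S+").matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (words.find()) {
            out.append(text, last, words.start());
            String word = words.group();
            if (term.matcher(word).matches()) {
                out.append("<em>").append(word).append("</em>");
            } else {
                out.append(word);
            }
            last = words.end();
        }
        out.append(text.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("the elegance of legacy code", "*lega*"));
    }
}
```

Matching word by word keeps .* from running across the whole result text, so only the words that contain the search term are wrapped.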

Whether this is a good way of handling these things is open for discussion.
If we have more time, we might actually change the Solr server code to fix
these things.
It's just a foolproof workaround at the moment.

grts,m



Re: AW: Leading wildcards

2007-04-23 Thread Maarten . De . Vilder
Hey,

We've stumbled on something weird while using wildcards.

We enabled leading wildcards in Solr (see the previous message from
Christian Burkamp).

When we search on a nonexistent field, we get a SolrException:
undefined field
(this was for the query "nonfield:test").

But when we use wildcards in our query, we don't get the undefined field
exception, so the query "nonfield:*test" works fine, just with zero results.

Is this normal behaviour?




Re: AW: Leading wildcards

2007-04-27 Thread Chris Hostetter

: when we do a search on a nonexisting field, we get a  SolrException:
: undefined field
: (this was for query "nonfield:test")
:
: but when we use wildcards in our query, we dont get the undefined field
: exception,
: so the query "nonfield:*test" works fine ... just zero results...
:
: is this normal behaviour ?

the error about undefined fields comes up because the Lucene QueryParser
is attempting to analyze the field, and the Solr IndexSchema
complains if it can't find the field it's asked to provide an analyzer
for.

for wildcard (and fuzzy and prefix) queries, the input is not analyzed
(the Lucene FAQ explains this a bit) so the Solr IndexSchema is never
consulted about the field.


It is certainly an odd bit of behavior, and we should try to be
consistent.  I think it would be fairly straightforward to make the
SolrQueryParser *always* test that the field is "viable" according to the
IndexSchema ... would you mind opening a bug in Jira for this?
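
To make the asymmetry concrete, here is a toy model in plain Java. It is not Solr or Lucene code. The class FieldValidationSketch, its methods, and the field names are hypothetical, and only the * and ? wildcard characters are modelled. It contrasts the behaviour reported in this thread with the "always check the field" idea.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: shows why a wildcard query can skip the field check.
class FieldValidationSketch {
    private final Set<String> schemaFields;

    FieldValidationSketch(String... fields) {
        this.schemaFields = new HashSet<>(Arrays.asList(fields));
    }

    static boolean isWildcard(String text) {
        return text.indexOf('*') >= 0 || text.indexOf('?') >= 0;
    }

    // Stand-in for asking the schema for a field's analyzer.
    private void requireDefined(String field) {
        if (!schemaFields.contains(field)) {
            throw new IllegalArgumentException("undefined field " + field);
        }
    }

    // Reported behaviour: the schema is consulted only when the text is analyzed.
    String parseAsReported(String field, String text) {
        if (!isWildcard(text)) {
            requireDefined(field);
        }
        return field + ":" + text;
    }

    // Proposed behaviour: validate the field for every kind of query.
    String parseWithUniformCheck(String field, String text) {
        requireDefined(field);
        return field + ":" + text;
    }

    public static void main(String[] args) {
        FieldValidationSketch parser = new FieldValidationSketch("id", "query");
        System.out.println(parser.parseAsReported("nonfield", "*test"));
    }
}
```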



-Hoss



Re: AW: Leading wildcards

2007-04-27 Thread Paul Fryer

PLEASE REMOVE ME FROM THIS MAILING LIST!!!

Whoever manages this list, can you please remove me? I have tried sending
emails to the unsubscribe address, but I just keep getting more emails. This
is really an issue for me, so your help would be great!


Thanks,

Paul







RE: AW: Leading wildcards

2007-04-28 Thread Jery Cook
I just can't figure this out... or do I have to do this programmatically?

I have a facet on a field in a document called estimatedRepairs. It is
declared in the schema.xml as



I execute a query with the parameters below:
 

q=state%3Avirgina;
&facet.query=estimatedRepairs:[*+TO+1000.0]
&facet.query=estimatedRepairs:[1000.0+TO+*]
&facet=true
&facet.field=state
&facet.field=country
&facet.field=zip
&facet.field=estimatedProfit
&facet.field=marketValue
&facet.field=numberOfBaths
&facet.field=numberOfBeds
&facet.field=price
&facet.field=type
&facet.limit=10
&facet.zeros=false
&facet.missing=false
&version=2.2
&debugQuery=true

However, my results show:

facet name: [estimatedRepairs] value count: [10]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [24153.0] , count: [7]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [1469.0] , count: [6]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [4249.0] , count: [6]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [16444.0] , count: [6]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [21555.0] , count: [6]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [23132.0] , count: [6]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [25669.0] , count: [6]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [26160.0] , count: [6]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [27058.0] , count: [6]
[ListingApp] INFO [main] ListingManagerImplTest.testCreateQueryFromSearchParams(186) |  count Name: [171.0] , count: [5]


But I don't want this. I want it to show something like this,
by estimated repairs:

1 to 1000 [23]
1000 - 2000 [53]

I thought facet.query allows me to do this? If not, what will let Solr
generate the counts for the results in intervals of 1000?
 

Jeryl Cook
^ Pharaoh ^
http://pharaohofkush.blogspot.com/