Re: Facet.query

2007-04-20 Thread Erik Hatcher


On Apr 19, 2007, at 10:41 PM, Ge, Yao ((Y.)) wrote:

When multiple facet queries are specified, are they combined as OR or
AND?


Neither, if you're referring to facet.query=...

Each facet.query is reported separately in the response, like this (in Ruby
response format):


{
 'responseHeader'=>{
  'status'=>0,
  'QTime'=>105,
  'params'=>{
    'wt'=>'ruby',
    'rows'=>'0',
    'facet.query'=>['ant',
     'lucene'],
    'facet'=>'on',
    'indent'=>'on',
    'q'=>'erik hatcher'}},
 'response'=>{'numFound'=>3,'start'=>0,'docs'=>[]
 },
 'facet_counts'=>{
  'facet_queries'=>{
    'ant'=>1,
    'lucene'=>1},
  'facet_fields'=>{}}}

The query was this:  ?q=erik%20hatcher&facet=on&facet.query=ant&facet.query=lucene&wt=ruby&indent=on&rows=0
on our library metadata which, pleasantly, has copies of both the Ant  
book (yes, I'm looking into that JUnit issue, Ryan and Yonik :) and  
the Lucene book.


If you mean filter queries (fq=...), then those are logically
ANDed when multiple are present.
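
For example (an illustrative request with made-up field names, not from the
original thread), a request like

  ?q=solr&fq=inStock:true&fq=cat:book

returns only documents matching q=solr that also satisfy both filters, i.e.
inStock:true AND cat:book, and each fq entry is cached independently in the
filter cache.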


Erik





Avoiding caching of special filter queries

2007-04-20 Thread Burkamp, Christian
Hi,

I'm using filter queries to implement document level security with solr.
The caching mechanism for filters separate from queries comes in handy
and the system performs well once all the filters for the users of the
system are stored in the cache.
However, I'm storing full document content in the index for the purpose
of highlighting. In addition to the standard snippet highlighting I
would like to offer a feature that displays the highlighted full
document content. I can add a filter query to select just the needed
document by ID, but this filter would go into the filter cache as well,
possibly throwing out some of the other useful filters.
Is there a way to get the single document with highlighting info but
without polluting the filter cache?

-- Christian



AW: Avoiding caching of special filter queries

2007-04-20 Thread Burkamp, Christian
Hi Erik,

No, what I need to do is 

q=my funny query&fq=user:erik&fq=id:<doc id>&hl=on ...

This is because the StandardRequestHandler needs the original query to do 
proper highlighting.
The user gets his paginated result page with his next 10 hits. He can then 
select one document for highlighting. Then I just repeat the last request with 
an additional filter query to select this one document and add the highlighting 
parameters.

-- Christian

-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
Sent: Friday, April 20, 2007 15:43
To: solr-user@lucene.apache.org
Subject: Re: Avoiding caching of special filter queries



On Apr 20, 2007, at 7:11 AM, Burkamp, Christian wrote:
 I'm using filter queries to implement document level security with
 solr.
 The caching mechanism for filters separate from queries comes in handy
 and the system performs well once all the filters for the users of the
 system are stored in the cache.
 However, I'm storing full document content in the index for the  
 purpose
 of highlighting. In addition to the standard snippet highlighting I
 would like to offer a feature that displays the highlighted full
 document content. I can add a filter query to select just the needed
 document by ID, but this filter would go into the filter cache as well,
 possibly throwing out some of the other useful filters.
 Is there a way to get the single document with highlighting info but
 without polluting the filter cache?

Correct me if I'm wrong, but here's my understanding...

q=id:<doc id>&fq=user:erik

is what you'd want to do.  q=id:<doc id> won't go into the filter cache,
but rather the query cache and the document itself into the document  
cache.  So you won't risk bumping things out of the filter cache by  
using queries.
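
To make that concrete (a hypothetical request; the highlighting parameters and
the content field are just illustrative):

  ?q=id:<doc id>&fq=user:erik&hl=on&hl.fl=content

Here only fq=user:erik goes through the filter cache; the id query goes through
the query cache, and the fetched document through the document cache.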

Erik



Re: AW: Leading wildcards

2007-04-20 Thread Maarten . De . Vilder
thanks, this worked like a charm !!

we built a custom QueryParser and integrated the *foo** approach in it, so
basically we can now search with leading, trailing, and both wildcards ...

the only crappy thing is the max Boolean clauses limit, but I'm going to look into
that after the weekend

for the next release of Solr:
do not make this the default (too many risks),
but do make an option in the config to enable it; it's a very nice feature


thanks everybody for the help and have a nice weekend,
maarten





Burkamp, Christian [EMAIL PROTECTED] 
19/04/2007 12:37
Please respond to
solr-user@lucene.apache.org


To
solr-user@lucene.apache.org
cc

Subject
AW: Leading wildcards






Hi there,

Solr does not support leading wildcards, because it uses Lucene's standard 
QueryParser class without changing the defaults. You can easily change 
this by inserting the line

parser.setAllowLeadingWildcard(true);

in QueryParsing.java line 92. (This is after creating a QueryParser 
instance in QueryParsing.parseQuery(...))

This obviously means that you have to change Solr's source code. It
would be nice to have an option in the schema to switch leading wildcards
on or off per field. Leading wildcards really make no sense on richly
populated fields because queries tend to result in "too many clauses"
exceptions most of the time.
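
To make the placement concrete, here is a rough sketch of the change (the
surrounding lines are from memory and may not match the actual
QueryParsing.parseQuery(...) source exactly; only the setAllowLeadingWildcard
call is the point):

  // inside org.apache.solr.search.QueryParsing.parseQuery(...)
  QueryParser parser = new SolrQueryParser(schema, defaultField);
  parser.setAllowLeadingWildcard(true);  // added line: permit terms such as *foo
  Query query = parser.parse(qs);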

This works for leading wildcards. Unfortunately it does not enable
searches with leading AND trailing wildcards. E.g. searching for *lega*
does not find results even if the term "elegance" is in the index. If you
put a second asterisk at the end, the term "elegance" is found (search
for *lega** to get hits).
Can anybody explain this, even though it seems to be more of a Lucene
QueryParser issue?

-- Christian

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 19, 2007 08:35
To: solr-user@lucene.apache.org
Subject: Leading wildcards


hi,

we have been trying to get the leading wildcards to work.

we have been looking around the Solr website, the Lucene website, wikis
and the mailing lists etc ...
but we found a lot of contradictory information.

so we have a few questions:
- is the latest version of Lucene capable of handling leading wildcards?
- is the latest version of Solr capable of handling leading wildcards?
- do we need to make adjustments to the Solr source code?
- if we need to adjust the Solr source, what do we need to change?

thanks in advance !
Maarten




Re: AW: Leading wildcards

2007-04-20 Thread Michael Kimsal

Maarten:

Would you mind sharing your custom query parser?


On 4/20/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:


thanks, this worked like a charm !!

we built a custom QueryParser and integrated the *foo** approach in it, so
basically we can now search with leading, trailing, and both wildcards ...



--
Michael Kimsal
http://webdevradio.com


Re: Avoiding caching of special filter queries

2007-04-20 Thread Erik Hatcher


On Apr 20, 2007, at 7:11 AM, Burkamp, Christian wrote:
I'm using filter queries to implement document level security with  
solr.

The caching mechanism for filters separate from queries comes in handy
and the system performs well once all the filters for the users of the
system are stored in the cache.
However, I'm storing full document content in the index for the  
purpose

of highlighting. In addition to the standard snippet highlighting I
would like to offer a feature that displays the highlighted full
document content. I can add a filter query to select just the needed
document by ID, but this filter would go into the filter cache as well,
possibly throwing out some of the other useful filters.
Is there a way to get the single document with highlighting info but
without polluting the filter cache?


Correct me if I'm wrong, but here's my understanding...

   q=id:<doc id>&fq=user:erik

is what you'd want to do.  q=id:<doc id> won't go into the filter cache,
but rather the query cache and the document itself into the document  
cache.  So you won't risk bumping things out of the filter cache by  
using queries.


Erik



Re: Multiple Solr Cores

2007-04-20 Thread Henrib


Updated (forgot the patch for Servlet).
http://www.nabble.com/file/7996/solr-trunk-src.patch solr-trunk-src.patch 

The change should still be compatible with the trunk it is based upon.


Henrib wrote:
 
 Following up on a previous thread in the Solr-User list, here is a patch
 that allows managing multiple cores in the same VM (thus multiple
 config/schemas/indexes).
 The SolrCore.core singleton has been changed to a Map<String, SolrCore>;
 the current singleton behavior is keyed as 'null'. (Which is used by
 SolrInfoRegistry).
 All static references to either a Config or a SolrCore have been removed;
 this implies that some classes now do refer to either a SolrCore or a
 SolrConfig (some ctors have been modified accordingly).
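
Purely to illustrate the data structure being described (hypothetical class and
method names, simplified; this is not the actual patch):

  // sketch of a keyed core registry; assumes Solr's SolrCore is on the classpath
  import java.util.HashMap;
  import java.util.Map;
  import org.apache.solr.core.SolrCore;

  class CoreRegistry {
    // one SolrCore (config + schema + index) per key; the legacy singleton
    // behavior corresponds to the entry stored under the null key
    private static final Map<String, SolrCore> cores = new HashMap<String, SolrCore>();

    static SolrCore getCore(String name) { return cores.get(name); }
    static void registerCore(String name, SolrCore core) { cores.put(name, core); }
  }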
 
 I haven't tried to modify anything above the 'jar' (script, admin &
 servlet are unaware of the multi-core part).
 
 The 2 patch files are the src/ & the test/ patches.
  http://www.nabble.com/file/7971/solr-test.patch solr-test.patch 
  http://www.nabble.com/file/7972/solr-src.patch solr-src.patch 
 
 This being my first attempt at a contribution, I will humbly welcome any
 comment.
 Regards,
 Henri
 

-- 
View this message in context: 
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10106126
Sent from the Solr - User mailing list archive at Nabble.com.



sorting by matched field, then title alpha

2007-04-20 Thread Simon Kahl

Hi.  I have some result ordering requirements which I cannot solve by
searching the docs and forums so far.  Perhaps what I am trying does not
belong in Solr.  Can anyone offer any hint or suggestion before I
disappear into a vortex of terror?

I need to conditionally order results of phrase searches like this:
* First show all docs with the phrase in field A - regardless of other
occurrences of the phrase in the doc - ordered alphabetically by field X
* Next show all docs with phrase in field B - ... (as above)
* Then same again for fields C, D and E
* Then show remaining matching docs as per default scoring

Perhaps I will have to do this clientside (ie. do multiple searches and
concatenate results), but I'm hoping there is some way I can do this in a
single search.

Thanks,
Simon


Re: Avoiding caching of special filter queries

2007-04-20 Thread Mike Klaas

On 4/20/07, Burkamp, Christian [EMAIL PROTECTED] wrote:

Hi Erik,

No, what I need to do is

q=my funny query&fq=user:erik&fq=id:<doc id>&hl=on ...

This is because the StandardRequestHandler needs the original query to do 
proper highlighting.
The user gets his paginated result page with his next 10 hits. He can then 
select one document for highlighting. Then I just repeat the last request with 
an additional filter query to select this one document and add the highlighting 
parameters.


Erik posted the way to do this that works with OOB Solr.  If you want
to do it with no additional querying (not even for the docid filter),
you can use an approach like this (from a previous email):

- turn on lazy field loading.  For best effect, compress the main text field.
- create a new request handler that is similar to dismax, but uses
the query for highlighting only.  A separate parameter allows the
specification of document keys to highlight.
- highlighting requires the internal Lucene document id, not the
document key, and it can be slow to execute queries to get the ids.  I
created a custom cache that maps doc keys -> doc ids, populated it
during the main query, and grabbed ids from the cache during the
highlighting step (a rough sketch of such a cache follows below).
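
A very rough sketch of that last idea (hypothetical class and method names; the
request handler and searcher plumbing are omitted):

  // hypothetical sketch: cache mapping unique document keys to internal Lucene doc ids
  import java.util.HashMap;
  import java.util.Map;

  class DocKeyIdCache {
    // note: Lucene doc ids are only valid for a particular index reader/searcher,
    // so a real cache has to be scoped to (and discarded with) that searcher
    private final Map<String, Integer> keyToDocId = new HashMap<String, Integer>();

    // populated while collecting the main query's results, where both values are known
    void put(String docKey, int luceneDocId) { keyToDocId.put(docKey, luceneDocId); }

    // used by the highlighting step; null means falling back to a lookup query
    Integer get(String docKey) { return keyToDocId.get(docKey); }
  }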

-Mike


Re: Solr performance warnings

2007-04-20 Thread Mike Klaas

On 4/20/07, Michael Thessel [EMAIL PROTECTED] wrote:

Hey Erik,

thanks for the fast reply. Yes this could be possible. I currently got
solr running for the indexing of a forum with 100k users. It could
definitely be possible that two commits overlap. But I need to commit
all changes because the new posts must be available in the search as
soon as they are posted.

Do you think there is a way to optimize this?


"As soon as" is a rather vague requirement.  If you can specify the
minimum acceptable delay, then you can use Solr's autocommit
functionality to trigger commits.

-Mike


Re: sorting by matched field, then title alpha

2007-04-20 Thread Mike Klaas

On 4/20/07, Simon Kahl [EMAIL PROTECTED] wrote:


I need to conditionally order results of phrase searches like this:
* First show all docs with the phrase in field A - regardless of other
occurrences of the phrase in the doc - ordered alphabetically by field X
* Next show all docs with phrase in field B - ... (as above)
* Then same again for fields C, D and E
* Then show remaining matching docs as per default scoring

Perhaps I will have to do this clientside (ie. do multiple searches and
concatenate results), but I'm hoping there is some way I can do this in a
single search.


You can approximate it by doing something like:
A:phrase^10 B:phrase^1 C:phrase^1000 D:phrase^100
E:phrase^30 <rest of query>

HTH,
-Mike


Re: Solr performance warnings

2007-04-20 Thread Michael Thessel

Mike Klaas wrote:

On 4/20/07, Michael Thessel [EMAIL PROTECTED] wrote:

Hey Erik,

thanks for the fast reply. Yes this could be possible. I currently got
solr running for the indexing of a forum with 100k users. It could
definitely be possible that two commits overlap. But I need to commit
all changes because the new posts must be available in the search as
soon as they are posted.

Do you think there is a way to optimize this?


"As soon as" is a rather vague requirement.  If you can specify the
minimum acceptable delay, then you can use Solr's autocommit
functionality to trigger commits.

-Mike


I didn't know about the timed commits. That's perfect for me.

Thanks,

Michael


Re: Solr performance warnings

2007-04-20 Thread Michael Thessel

Michael Thessel wrote:


I didn't know about the timed commits. That's perfect for me.

Thanks,

Michael


The timed commits don't work for me. The web interface says 0 commits
since the server was restarted, and there is nothing in the logs either.


I use:
apache-solr-1.1.0-incubating


My updateHandler section from solrconfig.xml:

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>1</maxTime>
  </autoCommit>
</updateHandler>

I also tried <maxTime>10</maxTime> in case it's seconds and not ms.

Cheers Michael