Re: Facet.query
On Apr 19, 2007, at 10:41 PM, Ge, Yao (Y.) wrote:

> When multiple facet queries are specified, are they booleaned as OR or AND?

Neither, if you're referring to facet.query=... The facet.query counts are all appended to the response, like this (in Ruby response format):

{
 'responseHeader'=>{
  'status'=>0,
  'QTime'=>105,
  'params'=>{
   'wt'=>'ruby',
   'rows'=>'0',
   'facet.query'=>['ant', 'lucene'],
   'facet'=>'on',
   'indent'=>'on',
   'q'=>'erik hatcher'}},
 'response'=>{'numFound'=>3,'start'=>0,'docs'=>[]
 },
 'facet_counts'=>{
  'facet_queries'=>{
   'ant'=>1,
   'lucene'=>1},
  'facet_fields'=>{}}}

The query was this:

?q=erik%20hatcher&facet=on&facet.query=ant&facet.query=lucene&wt=ruby&indent=on&rows=0

on our library metadata, which, pleasantly, has copies of both the Ant book (yes, I'm looking into that JUnit issue, Ryan and Yonik :) and the Lucene book.

If you mean the filter queries, fq=..., then those are logically ANDed when multiple are present.

Erik
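As a rough illustration of the request above, the following Python sketch builds a query string that repeats facet.query once per facet query. The helper name build_facet_query_string is made up for this example; only the parameter names (q, facet, facet.query, fq, wt, indent, rows) come from the message.

```python
from urllib.parse import urlencode, parse_qs


def build_facet_query_string(q, facet_queries, filter_queries=None):
    """Build a Solr-style query string (hypothetical helper, not a Solr API).

    Each facet.query is sent as its own repeated parameter; each one yields
    an independent count in the response rather than being ORed or ANDed.
    """
    params = [("q", q), ("facet", "on")]
    params += [("facet.query", fq) for fq in facet_queries]
    # Multiple fq parameters, by contrast, are ANDed together by Solr.
    params += [("fq", fq) for fq in (filter_queries or [])]
    params += [("wt", "ruby"), ("indent", "on"), ("rows", "0")]
    return urlencode(params)


query_string = build_facet_query_string("erik hatcher", ["ant", "lucene"])
print(query_string)
```

A list of pairs is used instead of a dict so that the same parameter name can appear more than once.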
Avoiding caching of special filter queries
Hi,

I'm using filter queries to implement document-level security with Solr. The caching mechanism for filters, separate from queries, comes in handy, and the system performs well once all the filters for the users of the system are stored in the cache.

However, I'm storing full document content in the index for the purpose of highlighting. In addition to the standard snippet highlighting, I would like to offer a feature that displays the highlighted full document content. I can add a filter query to select just the needed document by ID, but this filter would go into the filter cache as well, possibly throwing out some of the other useful filters.

Is there a way to get the single document with highlighting info but without polluting the filter cache?

-- Christian
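The eviction concern can be pictured with a toy least-recently-used cache. This is only a sketch: ToyLRUCache is a made-up class, the size of 3 is arbitrary, and it assumes the filter cache evicts the least recently used entry; it is not Solr's cache implementation.

```python
from collections import OrderedDict


class ToyLRUCache:
    """Minimal LRU cache used to illustrate eviction (not Solr code)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # drop least recently used

    def keys(self):
        return list(self._entries)


cache = ToyLRUCache(max_size=3)
for user in ("alice", "bob", "carol"):
    cache.put(f"user:{user}", f"docset for {user}")

# A one-off filter for a single document pushes out a per-user filter.
cache.put("id:doc42", "docset for one document")
print(cache.keys())
```

In this toy run the per-user filter for "alice" is gone after the single-document filter is cached, which is the effect the message wants to avoid.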
AW: Avoiding caching of special filter queries
Hi Erik,

No, what I need to do is

q=my funny query&fq=user:erik&fq=id:<doc Id>&hl=on ...

This is because the StandardRequestHandler needs the original query to do proper highlighting. The user gets his paginated result page with his next 10 hits. He can then select one document for highlighting. Then I just repeat the last request with an additional filter query to select this one document and add the highlighting parameters.

-- Christian

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Friday, April 20, 2007 15:43
To: solr-user@lucene.apache.org
Subject: Re: Avoiding caching of special filter queries

On Apr 20, 2007, at 7:11 AM, Burkamp, Christian wrote:

> I'm using filter queries to implement document-level security with Solr. The caching mechanism for filters, separate from queries, comes in handy, and the system performs well once all the filters for the users of the system are stored in the cache.
>
> However, I'm storing full document content in the index for the purpose of highlighting. In addition to the standard snippet highlighting, I would like to offer a feature that displays the highlighted full document content. I can add a filter query to select just the needed document by ID, but this filter would go into the filter cache as well, possibly throwing out some of the other useful filters.
>
> Is there a way to get the single document with highlighting info but without polluting the filter cache?

Correct me if I'm wrong, but here's my understanding...

q=id:<doc id>&fq=user:erik

is what you'd want to do. q=id:<doc id> won't go into the filter cache, but rather the query cache, and the document itself into the document cache. So you won't risk bumping things out of the filter cache by using queries.

Erik
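The "repeat the last request with one more filter" step Christian describes might look like the sketch below. The function name highlight_request and the document ID "doc42" are invented for illustration; q, fq, and hl are the parameters named in the message.

```python
from urllib.parse import urlencode


def highlight_request(last_params, doc_id, id_field="id"):
    """Return a query string that repeats the previous request, restricted
    to one document and with highlighting switched on (hypothetical helper).
    """
    params = list(last_params)  # keep the original q so highlighting works
    params.append(("fq", f"{id_field}:{doc_id}"))  # narrow to one document
    params.append(("hl", "on"))
    return urlencode(params)


last_request = [("q", "my funny query"), ("fq", "user:erik")]
highlight_qs = highlight_request(last_request, "doc42")
print(highlight_qs)
```

Note that the extra fq is exactly the part that would land in the filter cache, which is what the rest of the thread tries to work around.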
Re: AW: Leading wildcards
thanks, this worked like a charm!

we built a custom QueryParser and integrated the *foo** workaround in it, so basically we can now search with leading wildcards, trailing wildcards, and both.

the only crappy thing is the max Boolean clauses limit, but i'm going to look into that after the weekend.

for the next release of Solr: do not make this the default (too many risks), but do make it an option in the config. it's a very nice feature.

thanks everybody for the help and have a nice weekend,
maarten

Burkamp, Christian [EMAIL PROTECTED]
19/04/2007 12:37
Please respond to solr-user@lucene.apache.org
To: solr-user@lucene.apache.org
cc:
Subject: AW: Leading wildcards

> Hi there,
>
> Solr does not support leading wildcards, because it uses Lucene's standard QueryParser class without changing the defaults. You can easily change this by inserting the line
>
> parser.setAllowLeadingWildcards(true);
>
> in QueryParsing.java line 92. (This is after creating a QueryParser instance in QueryParsing.parseQuery(...).) It obviously means that you have to change Solr's source code. It would be nice to have an option in the schema to switch leading wildcards on or off per field. Leading wildcards really make no sense on richly populated fields, because queries tend to result in "too many clauses" exceptions most of the time.
>
> This works for leading wildcards. Unfortunately, it does not enable searches with leading AND trailing wildcards. (E.g. searching for *lega* does not find results even if the term elegance is in the index. If you put a second asterisk at the end, the term elegance is found: search for *lega** to get hits.) Can anybody explain this, though it seems to be more of a Lucene QueryParser issue?
>
> -- Christian
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 19, 2007 08:35
> To: solr-user@lucene.apache.org
> Subject: Leading wildcards
>
> hi,
>
> we have been trying to get the leading wildcards to work. we have been looking around the Solr website, the Lucene website, wikis, the mailing lists etc., but we found a lot of contradictory information.
>
> so we have a few questions:
> - is the latest version of Lucene capable of handling leading wildcards?
> - is the latest version of Solr capable of handling leading wildcards?
> - do we need to make adjustments to the Solr source code?
> - if we need to adjust the Solr source, what do we need to change?
>
> thanks in advance!
> Maarten
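To see why wildcard searches run into the Boolean clause limit, here is a toy model of wildcard expansion: the pattern is matched against every indexed term and each match becomes one clause. It assumes a clause limit of 1024 (Lucene's default for BooleanQuery) and uses Python's fnmatch for matching; expand_wildcard and the sample terms are invented, and the sketch does not reproduce the *lega* parser behaviour discussed in the thread.

```python
from fnmatch import fnmatchcase

MAX_CLAUSES = 1024  # assumed default clause limit


class TooManyClauses(Exception):
    """Raised when a wildcard expands to more terms than allowed."""


def expand_wildcard(pattern, indexed_terms, max_clauses=MAX_CLAUSES):
    """Return the indexed terms matching the pattern (toy model).

    A leading wildcard cannot use the sorted term order to skip ahead,
    so every term has to be checked, and on a richly populated field
    the number of matches can easily exceed the clause limit.
    """
    matches = [term for term in indexed_terms if fnmatchcase(term, pattern)]
    if len(matches) > max_clauses:
        raise TooManyClauses(f"{len(matches)} terms match {pattern!r}")
    return matches


terms = ["elegance", "legal", "delegate", "lucene", "solr"]
print(expand_wildcard("*lega*", terms))
```

Lowering max_clauses below the number of matching terms makes the same call raise TooManyClauses, which mirrors the exceptions mentioned above.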
Re: AW: Leading wildcards
Maarten:

Would you mind sharing your custom query parser?

On 4/20/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

> thanks, this worked like a charm!
>
> we built a custom QueryParser and integrated the *foo** workaround in it, so basically we can now search with leading wildcards, trailing wildcards, and both.
>
> the only crappy thing is the max Boolean clauses limit, but i'm going to look into that after the weekend.
>
> for the next release of Solr: do not make this the default (too many risks), but do make it an option in the config. it's a very nice feature.
>
> thanks everybody for the help and have a nice weekend,
> maarten

-- Michael Kimsal http://webdevradio.com
Re: Avoiding caching of special filter queries
On Apr 20, 2007, at 7:11 AM, Burkamp, Christian wrote:

> I'm using filter queries to implement document-level security with Solr. The caching mechanism for filters, separate from queries, comes in handy, and the system performs well once all the filters for the users of the system are stored in the cache.
>
> However, I'm storing full document content in the index for the purpose of highlighting. In addition to the standard snippet highlighting, I would like to offer a feature that displays the highlighted full document content. I can add a filter query to select just the needed document by ID, but this filter would go into the filter cache as well, possibly throwing out some of the other useful filters.
>
> Is there a way to get the single document with highlighting info but without polluting the filter cache?

Correct me if I'm wrong, but here's my understanding...

q=id:<doc id>&fq=user:erik

is what you'd want to do. q=id:<doc id> won't go into the filter cache, but rather the query cache, and the document itself into the document cache. So you won't risk bumping things out of the filter cache by using queries.

Erik
Re: Multiple Solr Cores
Updated (forgot the patch for Servlet).

solr-trunk-src.patch: http://www.nabble.com/file/7996/solr-trunk-src.patch

The change should still be compatible with the trunk it is based upon.

Henrib wrote:

> Following up on a previous thread in the Solr-User list, here is a patch that allows managing multiple cores in the same VM (thus multiple config/schemas/indexes).
>
> The SolrCore.core singleton has been changed to a Map<String, SolrCore>; the current singleton behavior is keyed as 'null' (which is used by SolrInfoRegistry). All static references to either a Config or a SolrCore have been removed; this implies that some classes now refer to either a SolrCore or a SolrConfig (some ctors have been modified accordingly).
>
> I haven't tried to modify anything above the 'jar' (the scripts and the admin servlet are unaware of the multi-core part).
>
> The 2 patch files are the src/ and the test/ patches.
>
> solr-test.patch: http://www.nabble.com/file/7971/solr-test.patch
> solr-src.patch: http://www.nabble.com/file/7972/solr-src.patch
>
> This being my first attempt at a contribution, I will humbly welcome any comment.
>
> Regards,
> Henri
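The idea of replacing a singleton with a map whose "null" key holds the old default can be sketched in a few lines. This is a loose Python analogue under stated assumptions, not the patch itself: ToyCore, CoreRegistry, and the paths are invented, and None stands in for Java's null key.

```python
class ToyCore:
    """Stand-in for a core with its own config and index (illustrative)."""

    def __init__(self, config_path):
        self.config_path = config_path


class CoreRegistry:
    """Maps core names to cores; the None key holds the default core,
    so callers that never pass a name keep the old singleton behaviour."""

    def __init__(self):
        self._cores = {}

    def register(self, core, name=None):
        self._cores[name] = core

    def get(self, name=None):
        return self._cores[name]  # raises KeyError for unknown names


registry = CoreRegistry()
registry.register(ToyCore("conf/solrconfig.xml"))               # default core
registry.register(ToyCore("books/conf/solrconfig.xml"), "books")

print(registry.get().config_path)
print(registry.get("books").config_path)
```

Keeping the default under a reserved key means existing single-core callers do not need to change, which appears to be the compatibility goal described in the message.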
sorting by matched field, then title alpha
Hi.

I have some result ordering requirements which I have not been able to solve by searching the docs and forums so far. Perhaps what I am trying does not belong in Solr. Can anyone offer any hint or suggestion before I disappear into a vortex of terror?

I need to conditionally order results of phrase searches like this:

* First show all docs with the phrase in field A, regardless of other occurrences of the phrase in the doc, ordered alphabetically by field X
* Next show all docs with the phrase in field B ... (as above)
* Then the same again for fields C, D and E
* Then show the remaining matching docs as per default scoring

Perhaps I will have to do this client-side (i.e. do multiple searches and concatenate the results), but I'm hoping there is some way I can do this in a single search.

Thanks,
Simon
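The client-side fallback mentioned at the end of the message (several searches, concatenated) could be sketched as follows. Everything here is illustrative: tiered_results is a made-up helper, the hit lists stand in for the results of separate per-field searches, and each document is assumed to be a dict with "id" and "X" keys.

```python
def tiered_results(hits_by_field, default_hits, field_order, sort_key):
    """Concatenate per-field result lists into one tiered list.

    Each tier is sorted by sort_key; a document appears only in the first
    tier that contains it. Remaining hits keep their default (score) order.
    """
    seen = set()
    ordered = []
    for field in field_order:
        tier = sorted(hits_by_field.get(field, []), key=sort_key)
        for doc in tier:
            if doc["id"] not in seen:
                seen.add(doc["id"])
                ordered.append(doc)
    for doc in default_hits:
        if doc["id"] not in seen:
            seen.add(doc["id"])
            ordered.append(doc)
    return ordered


doc1 = {"id": 1, "X": "Apple"}
doc2 = {"id": 2, "X": "Mango"}
doc3 = {"id": 3, "X": "Zebra"}
doc4 = {"id": 4, "X": "Banana"}

merged = tiered_results(
    hits_by_field={"A": [doc3, doc1], "B": [doc1, doc2]},
    default_hits=[doc2, doc4, doc3],
    field_order=["A", "B"],
    sort_key=lambda doc: doc["X"],
)
print([doc["id"] for doc in merged])
```

The cost of this approach is one search per field plus one for the remainder, and pagination has to be handled on the client.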
Re: Avoiding caching of special filter queries
On 4/20/07, Burkamp, Christian [EMAIL PROTECTED] wrote:

> Hi Erik,
>
> No, what I need to do is
>
> q=my funny query&fq=user:erik&fq=id:<doc Id>&hl=on ...
>
> This is because the StandardRequestHandler needs the original query to do proper highlighting. The user gets his paginated result page with his next 10 hits. He can then select one document for highlighting. Then I just repeat the last request with an additional filter query to select this one document and add the highlighting parameters.

Erik posted the way to do this that works with OOB (out-of-the-box) Solr. If you want to do it with no additional querying (not even for the docid filter), you can use an approach like this (from a previous email):

- Turn on lazy field loading. For best effect, compress the main text field.
- Create a new request handler that is similar to dismax, but uses the query for highlighting only. A separate parameter allows the specification of document keys to highlight.
- Highlighting requires the internal Lucene document id, not the document key, and it can be slow to execute queries to get the ids. I created a custom cache that maps doc keys -> doc ids, populate it during the main query, and grab ids from the cache during the highlighting step.

-Mike
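The last point, a cache from document keys to internal ids that is filled during the main query and read during highlighting, might be modelled like this. It is a toy sketch, not Mike's handler: DocIdCache and its methods are invented names, and the clear() step rests on the assumption that internal ids are only valid for the index view they were read from.

```python
class DocIdCache:
    """Toy mapping from document keys to internal document ids."""

    def __init__(self):
        self._key_to_id = {}

    def populate(self, hits):
        """Record (doc_key, internal_id) pairs seen during the main query."""
        for doc_key, internal_id in hits:
            self._key_to_id[doc_key] = internal_id

    def lookup(self, doc_keys):
        """Split requested keys into cached ids and keys needing a query."""
        found, missing = {}, []
        for key in doc_keys:
            if key in self._key_to_id:
                found[key] = self._key_to_id[key]
            else:
                missing.append(key)
        return found, missing

    def clear(self):
        """Drop all entries, e.g. when the index view changes (assumption)."""
        self._key_to_id.clear()


id_cache = DocIdCache()
id_cache.populate([("doc-a", 17), ("doc-b", 42)])  # from the main query
found, missing = id_cache.lookup(["doc-b", "doc-z"])
print(found, missing)
```

Only the keys reported as missing would need a fallback query to resolve their ids.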
Re: Solr performance warnings
On 4/20/07, Michael Thessel [EMAIL PROTECTED] wrote:

> Hey Erik,
>
> thanks for the fast reply. Yes, this could be possible. I currently have Solr running for the indexing of a forum with 100k users. It could definitely be possible that two commits overlap. But I need to commit all changes because the new posts must be available in the search as soon as they are posted.
>
> Do you think there is a way to optimize this?

"As soon as" is a rather vague requirement. If you can specify the minimum acceptable delay, then you can use Solr's autocommit functionality to trigger commits.

-Mike
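The general idea behind a time-based autocommit is to batch updates and commit once the oldest uncommitted document has waited for the acceptable delay, instead of committing after every post. The sketch below is a toy model with an injected clock, not Solr's implementation; ToyAutoCommitter and its parameters are invented for illustration.

```python
class ToyAutoCommitter:
    """Commit pending documents once the oldest one is max_time_ms old."""

    def __init__(self, max_time_ms, commit, clock_ms):
        self.max_time_ms = max_time_ms
        self._commit = commit        # callable receiving the pending batch
        self._clock_ms = clock_ms    # callable returning current time in ms
        self._pending = []
        self._first_pending_at = None

    def add(self, doc):
        if self._first_pending_at is None:
            self._first_pending_at = self._clock_ms()
        self._pending.append(doc)

    def tick(self):
        """Call periodically; returns True if a commit was triggered."""
        if self._first_pending_at is None:
            return False
        if self._clock_ms() - self._first_pending_at < self.max_time_ms:
            return False
        self._commit(list(self._pending))
        self._pending.clear()
        self._first_pending_at = None
        return True


now = [0]          # fake clock so the example is deterministic
committed = []
committer = ToyAutoCommitter(1000, committed.append, lambda: now[0])

committer.add("post-1")
now[0] = 400
committer.add("post-2")
early = committer.tick()   # only 400 ms have passed
now[0] = 1000
late = committer.tick()    # 1000 ms since post-1 was added
print(early, late, committed)
```

With a one-second delay, many posts arriving close together end up in a single commit, so overlapping commits become much less likely.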
Re: sorting by matched field, then title alpha
On 4/20/07, Simon Kahl [EMAIL PROTECTED] wrote:

> I need to conditionally order results of phrase searches like this:
>
> * First show all docs with the phrase in field A, regardless of other occurrences of the phrase in the doc, ordered alphabetically by field X
> * Next show all docs with the phrase in field B ... (as above)
> * Then the same again for fields C, D and E
> * Then show the remaining matching docs as per default scoring
>
> Perhaps I will have to do this client-side (i.e. do multiple searches and concatenate the results), but I'm hoping there is some way I can do this in a single search.

You can approximate it by doing something like:

A:phrase^10 B:phrase^1 C:phrase^1000 D:phrase^100 E:phrase^30 <rest of query>

HTH,
-Mike
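One way to read the suggestion is that each field gets a boost large enough to outweigh the fields after it. The sketch below builds such a query string with boosts that shrink by a factor of ten per field; the helper name, the starting boost of 10000, and the factor are assumptions for illustration, not values from the message, and the phrase is assumed to contain no double quotes.

```python
def boosted_phrase_query(phrase, fields, top_boost=10000, step=10):
    """Build 'field:"phrase"^boost' clauses with descending boosts
    (hypothetical helper; boost values are illustrative)."""
    clauses = []
    boost = top_boost
    for field in fields:
        clauses.append(f'{field}:"{phrase}"^{boost}')
        boost = max(boost // step, 1)  # never drop below a boost of 1
    return " ".join(clauses)


boosted = boosted_phrase_query("solr rocks", ["A", "B", "C", "D", "E"])
print(boosted)
```

This remains an approximation: scores within a tier still vary with term statistics, so it does not give the alphabetical ordering by field X that the question asks for.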
Re: Solr performance warnings
Mike Klaas wrote:

> On 4/20/07, Michael Thessel [EMAIL PROTECTED] wrote:
>
>> Hey Erik,
>>
>> thanks for the fast reply. Yes, this could be possible. I currently have Solr running for the indexing of a forum with 100k users. It could definitely be possible that two commits overlap. But I need to commit all changes because the new posts must be available in the search as soon as they are posted.
>>
>> Do you think there is a way to optimize this?
>
> "As soon as" is a rather vague requirement. If you can specify the minimum acceptable delay, then you can use Solr's autocommit functionality to trigger commits.
>
> -Mike

I didn't know about the timed commits. That's perfect for me.

Thanks,
Michael
Re: Solr performance warnings
Michael Thessel wrote:

> Mike Klaas wrote:
>
>> On 4/20/07, Michael Thessel [EMAIL PROTECTED] wrote:
>>
>>> Hey Erik,
>>>
>>> thanks for the fast reply. Yes, this could be possible. I currently have Solr running for the indexing of a forum with 100k users. It could definitely be possible that two commits overlap. But I need to commit all changes because the new posts must be available in the search as soon as they are posted.
>>>
>>> Do you think there is a way to optimize this?
>>
>> "As soon as" is a rather vague requirement. If you can specify the minimum acceptable delay, then you can use Solr's autocommit functionality to trigger commits.
>>
>> -Mike
>
> I didn't know about the timed commits. That's perfect for me.
>
> Thanks,
> Michael

The timed commits don't work for me. The web interface says 0 commits since the server was restarted, and there is nothing in the logs either.

I use: apache-solr-1.1.0-incubating

My updateHandler section from solrconfig.xml:

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>1</maxTime>
  </autoCommit>
</updateHandler>

I also tried <maxTime>10</maxTime> in case it's seconds and not ms.

Cheers,
Michael