Inconsistent facet ranges when using distributed search in Solr 4.3
Hi all,

I am seeing inconsistent behavior with facets, specifically range facets, on Solr 4.3. Running the same distributed query several times (pressing F5 in the browser) produces different facet ranges, because some of the buckets are sometimes missing. The search results are always correct as far as I can tell; it is only the range facets that sometimes miss ranges.

Has anyone seen this behavior in Solr before? Any recommendations on how to troubleshoot this issue? Here are some details and an example.

As an example of what I am seeing, take this query, in which I facet on the docnumber field:

    http://SERVER:8081/solr/shard1/myhandler?
        shards=SERVER:8081/solr/shard1,SERVER:8081/solr/shard2,SERVER:8081/solr/shard3
        &shards.qt=myhandler
        &facet=true
        &facet.field=docnumber
        &f.docnumber.facet.sort=index
        &facet.range=docnumber
        &f.docnumber.facet.range.start=0
        &f.docnumber.facet.range.gap=100
        &f.docnumber.facet.range.end=10
        &f.docnumber.facet.limit=1000
        &facet.mincount=1
        &q=type:document
        &wt=xml

When I run it, I get one of the following three responses, seemingly at random (I haven't been able to notice a pattern so far):

1. 859 results (correct), but nothing in the facet ranges:

    ...
    <result name="response" numFound="859" start="0" maxScore="8.006225">
    ...
    <lst name="facet_ranges">
      <lst name="docnumber">
        <lst name="counts"/>
        <int name="gap">100</int>
        <int name="start">0</int>
        <int name="end">10</int>
      </lst>
    </lst>

2. 859 results (correct), and the correct counts in the facet ranges (118+109+119+122+134+100+100+57=859):

    ...
    <result name="response" numFound="859" start="0" maxScore="8.006225">
    ...
    <lst name="facet_ranges">
      <lst name="docnumber">
        <lst name="counts">
          <int name="0">118</int>
          <int name="100">109</int>
          <int name="200">119</int>
          <int name="300">122</int>
          <int name="400">134</int>
          <int name="500">100</int>
          <int name="600">100</int>
          <int name="700">57</int>
        </lst>
        <int name="gap">100</int>
        <int name="start">0</int>
        <int name="end">10</int>
      </lst>
    </lst>

3. 859 results (correct), and only some of the facet ranges (118+109+119+122+134=602 vs. 859 results):

    ...
    <result name="response" numFound="859" start="0" maxScore="8.006225">
    ...
    <lst name="facet_ranges">
      <lst name="docnumber">
        <lst name="counts">
          <int name="0">118</int>
          <int name="100">109</int>
          <int name="200">119</int>
          <int name="300">122</int>
          <int name="400">134</int>
        </lst>
        <int name="gap">100</int>
        <int name="start">0</int>
        <int name="end">10</int>
      </lst>
    </lst>

I am using Solr 4.3 (4.3.0 1477023), with these parameters.

Facet-related:

    facet=true
    facet.field=docnumber
    f.docnumber.facet.sort=index
    facet.range=docnumber
    f.docnumber.facet.range.start=0
    f.docnumber.facet.range.gap=100
    f.docnumber.facet.range.end=10
    f.docnumber.facet.limit=1000
    facet.mincount=1

For distributed search (the environment has 3 cores on the same box):

    shards=SERVER:8081/solr/shard1,SERVER:8081/solr/shard2,SERVER:8081/solr/shard3
    shards.qt=myhandler

And the query:

    q=type:document
    wt=xml

It is also worth noting that the facet field section does come up with the correct facets; the issue seems to be related only to the facet ranges (unless I am missing something). In the responses for all three examples above, the facet_fields list has all the values for docnumber, from 1 to 756, even when the facet ranges are missing buckets:

    <lst name="facet_fields">
      <lst name="docnumber">
        <int name="1">1</int>
        <int name="2">2</int>
        ... (continues on from 3 to 754) ...
        <int name="755">1</int>
        <int name="756">1</int>
      </lst>
    </lst>

Thanks,
Jose.
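One way to narrow this down is to hit the same URL repeatedly from a script and flag every response whose range buckets do not add up to numFound. The sketch below covers only the checking step. It assumes, as in the example above, that every matching document has exactly one docnumber inside the faceted range, and that facet_ranges appears somewhere below the response root. The function name is made up for illustration.

```python
import xml.etree.ElementTree as ET


def range_bucket_total(response_xml, field):
    """Return (numFound, sum of range-facet bucket counts) for one response.

    Assumption: each matching document has exactly one value for `field`
    inside the faceted range, so a complete response has equal numbers.
    """
    root = ET.fromstring(response_xml)
    result = root.find(".//result[@name='response']")
    num_found = int(result.get("numFound"))
    counts = root.find(
        ".//lst[@name='facet_ranges']"
        f"/lst[@name='{field}']/lst[@name='counts']"
    )
    if counts is None:
        return num_found, 0
    # An empty <lst name="counts"/> has no children and sums to 0.
    total = sum(int(bucket.text) for bucket in counts.findall("int"))
    return num_found, total
```

A response is suspect whenever the two numbers differ. Logging the raw response for those cases, together with the per-shard responses for the same query, should show whether a particular shard is the one dropping buckets.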
Timeout when calling Luke request handler after migrating from Solr 3.5 to 3.6.1
Hi all,

As part of our business logic, we query the Luke request handler from our code to extract the fields in the index, using the following URL:

    http://server:8080/solr/admin/luke?wt=json&numTerms=0

This worked fine with Solr 3.5, but with 3.6.1 the call never returns. It hangs, and there is no error message in the server logs. Has anyone seen this, or does anyone have an idea of what may be causing it?

The Luke request handler is configured by default; we didn't change its configuration. solr/admin/stats.jsp shows:

    name: /admin/luke
    class: org.apache.solr.handler.admin.LukeRequestHandler
    version: $Revision: 1242152 $
    description: Lucene Index Browser. Inspired and modeled after Luke: http://www.getopt.org/luke/
    stats:
        handlerStart : 1353373022984
        requests : 0
        errors : 0
        timeouts : 0
        totalTime : 0
        avgTimePerRequest : NaN
        avgRequestsPerSecond : 0.0

We are running Apache Tomcat 6.0.35 with JDK 1.7.0_03, in case that rings a bell. The index has about

Alternatively, our requirement is to get the list of fields in the index, including dynamic fields. Is there any other way to obtain this at runtime? The application runs in a separate process from Solr, and may even run on a separate box, hence the Luke call.

Thank you for any help you can provide.

Jose.
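For reference, the client side of this call can be sketched as below, with an explicit timeout so that a hanging handler raises an error in the caller instead of blocking it indefinitely. The function names are hypothetical, and the sketch assumes the JSON response carries a top-level "fields" object keyed by field name. If your Solr version renders that section differently, adjust field_names accordingly.

```python
import json
import urllib.parse
import urllib.request


def luke_url(base_url, num_terms=0):
    """Build the Luke request URL used in the post (wt=json, numTerms=0)."""
    query = urllib.parse.urlencode({"wt": "json", "numTerms": num_terms})
    return f"{base_url}/admin/luke?{query}"


def field_names(luke_payload):
    """Extract field names from a parsed Luke response.

    Assumption: "fields" is a JSON object keyed by field name.
    """
    return sorted(luke_payload.get("fields", {}))


def fetch_field_names(base_url, timeout_seconds=10.0):
    """Call Luke; the timeout applies to each blocking socket operation."""
    with urllib.request.urlopen(luke_url(base_url), timeout=timeout_seconds) as resp:
        return field_names(json.load(resp))
```

The timeout does not explain why 3.6.1 hangs, but it keeps the calling application responsive while the cause is investigated.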
Re: Problem with spellchecker
Thank you for your help, the whole team overlooked this simple error. It was driving us crazy! :)

Thanks!!
Jose.

On 10/2/12 1:23 AM, Markus Jelsma markus.jel...@openindex.io wrote:

The problem is your stray double quote:

    str name=queryAnalyzerFieldTypetext_general_fr/str

I'd think this would throw an exception somewhere.

-Original message-
From: Jose Aguilar jagui...@searchtechnologies.com
Sent: Tue 02-Oct-2012 01:40
To: solr-user@lucene.apache.org
Subject: Problem with spellchecker

We have configured 2 spellcheckers, English and French, in Solr 4 BETA. Each spellchecker works with a specific search handler. The English spellchecker works as expected with any word, regardless of case. The French spellchecker, on the other hand, works only with lowercase words. If the first letter is uppercase, the spellchecker does not return any suggestion unless we add the spellcheck.q parameter with that term.

To further clarify, this doesn't return any corrections:

    http://localhost:8984/solr/collection1/handler?wt=xml&q=Systme

But this one works as expected:

    http://localhost:8984/solr/collection1/handler?wt=xml&q=Systme&spellcheck.q=Systme

According to this page (http://wiki.apache.org/solr/SpellCheckComponent#q_OR_spellcheck.q), the spellcheck.q parameter shouldn't be required: "If spellcheck.q is defined, then it is used, otherwise the original input query is used."

Are we missing something? We double-checked the configuration settings for English, which is working fine, and it seems well configured.

Here is an extract of the spellcheck component configuration for French:

    searchComponent name=spellcheckfr class=solr.SpellCheckComponent
      str name=queryAnalyzerFieldTypetext_general_fr/str
      lst name=spellchecker
        str name=namedefault/str
        str name=fieldSpellingFr/str
        str name=classnamesolr.DirectSolrSpellChecker/str
        str name=distanceMeasureinternal/str
        float name=accuracy0.5/float
        int name=maxEdits2/int
        int name=minPrefix1/int
        int name=maxInspections5/int
        int name=minQueryLength4/int
        float name=maxQueryFrequency0.01/float
        str name=buildOnCommittrue/str
      /lst
    /searchComponent

Thanks for any help
Problem with spellchecker
We have configured 2 spellcheckers, English and French, in Solr 4 BETA. Each spellchecker works with a specific search handler. The English spellchecker works as expected with any word, regardless of case. The French spellchecker, on the other hand, works only with lowercase words. If the first letter is uppercase, the spellchecker does not return any suggestion unless we add the spellcheck.q parameter with that term.

To further clarify, this doesn't return any corrections:

    http://localhost:8984/solr/collection1/handler?wt=xml&q=Systme

But this one works as expected:

    http://localhost:8984/solr/collection1/handler?wt=xml&q=Systme&spellcheck.q=Systme

According to this page (http://wiki.apache.org/solr/SpellCheckComponent#q_OR_spellcheck.q), the spellcheck.q parameter shouldn't be required: "If spellcheck.q is defined, then it is used, otherwise the original input query is used."

Are we missing something? We double-checked the configuration settings for English, which is working fine, and it seems well configured.

Here is an extract of the spellcheck component configuration for French:

    searchComponent name=spellcheckfr class=solr.SpellCheckComponent
      str name=queryAnalyzerFieldTypetext_general_fr/str
      lst name=spellchecker
        str name=namedefault/str
        str name=fieldSpellingFr/str
        str name=classnamesolr.DirectSolrSpellChecker/str
        str name=distanceMeasureinternal/str
        float name=accuracy0.5/float
        int name=maxEdits2/int
        int name=minPrefix1/int
        int name=maxInspections5/int
        int name=minQueryLength4/int
        float name=maxQueryFrequency0.01/float
        str name=buildOnCommittrue/str
      /lst
    /searchComponent

Thanks for any help
Enforce overall Solr timeout
Hi all,

Is there a setting to enforce an overall timeout for Solr? For example, we set timeAllowed=2000 in solrconfig.xml (using version 3.5), but as far as I can tell, that only applies to the search phase: if the search takes more than 2 seconds, it returns partial results with partialResults=true. The rest of the processing time (faceting, highlighting, etc.) is not covered by the timeAllowed setting.

Is there something that can be done so that, for example, if a Solr call takes more than, say, 5 seconds overall, the request is killed and an error, an empty response, or something similar is returned?

--
Jose Aguilar.
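Short of a server-side setting, one workaround is to enforce the overall deadline in the client. The sketch below is not a Solr feature. It wraps any blocking call (for example, the HTTP request to Solr) in a worker thread and stops waiting after a fixed number of seconds. The names call_with_deadline and DeadlineExceeded are made up for illustration. Note the limitation: the caller gets control back, but the request keeps running inside Solr until it finishes on its own.

```python
import concurrent.futures


class DeadlineExceeded(Exception):
    """Raised when the wrapped call does not finish within the deadline."""


def call_with_deadline(fn, deadline_seconds, *args, **kwargs):
    """Run fn(*args, **kwargs) and give up waiting after deadline_seconds.

    The worker thread is abandoned, not killed: whatever it is doing
    (e.g. waiting on a Solr response) continues in the background.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=deadline_seconds)
    except concurrent.futures.TimeoutError:
        raise DeadlineExceeded(
            f"call did not finish within {deadline_seconds} seconds"
        ) from None
    finally:
        # Do not block here waiting for the abandoned worker.
        executor.shutdown(wait=False)
```

With a 5-second deadline, the application can return an error or an empty result to its own caller as soon as DeadlineExceeded is raised, which matches the behavior asked for above from the client's point of view.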
Using sort_values (fsv=true parameter) and Field Collapsing (group=true) at the same time
Hi all,

I am using Solr 4.0 trunk with the Field Collapsing feature (http://wiki.apache.org/solr/FieldCollapsing), and I notice that when it is used together with the fsv=true parameter, the sort_values section is missing from the response. I haven't found much information about the fsv parameter, so I turned to the list to see if someone here can help us out, or shed some light on whether there is an incompatibility between the two features (which is what I think is happening, because of the field collapse implementation). Or maybe give us some pointers on how to achieve a similar effect.

We use fsv=true to help debug why one document was sorted above another when using certain sort orders in our application, so it is a great way to visualize this and save us debugging time.

To clarify further, we send this query to Solr expecting the grouped, sort_values and debug tags in the response, with the sort_values arrays corresponding to the first element of each group:

    http://localhost:8983/solr/select?wt=xml&fl=*&q=solr+memory&group=true&group.field=manu_exact&fsv=true&debugQuery=on…

But we don't get the sort_values part back; we only get the following top-level tags in the response:

    <response>
      <lst name="responseHeader">…
      <lst name="grouped">…
      <lst name="debug">…
    </response>

If we don't use Field Collapsing, and instead send in something like this:

    http://localhost:8983/solr/select?wt=xml&fl=*&q=solr+memory&fsv=true&debugQuery=on…

Then we do get the sort_values element in the response:

    <response>
      <lst name="responseHeader">
      <lst name="sort_values">
      <result name="response">…
      <lst name="debug">
    </response>

Is there some incompatibility between the two features? Is there any other way to retrieve this information that would be compatible with field collapsing?

Thanks,
Jose Aguilar
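To compare the two responses quickly, a small script can list the top-level sections of each one and report whether sort_values is among them. This is only a checking aid, assuming the wt=xml layout described above. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def top_level_sections(response_xml):
    """Return the name attribute (or the tag, if unnamed) of each child of <response>."""
    root = ET.fromstring(response_xml)
    return [child.get("name", child.tag) for child in root]


def has_sort_values(response_xml):
    """True if the response carries a top-level sort_values section."""
    return "sort_values" in top_level_sections(response_xml)
```

Running it on the grouped and ungrouped responses for the same query makes the difference explicit and easy to attach to a bug report.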