Re: 2 solr dataImport requests on a single core at the same time
please help me
Re: Dismax query response field number
No, I'm talking about fields. In my schema I've got about 15 fields with stored="true", like this:

  <field name="city" type="text" indexed="true" stored="true"/>

But when I run a query it returns only 10 of the fields; the last 4 or 5 are not in the response??

-----Original Message-----
From: Lance Norskog goks...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thu, Jul 22, 2010 2:47 am
Subject: Re: Dismax query response field number

Fields or documents? It will return all of the fields that are 'stored'. The default number of documents to return is 10. Returning all of the documents is very slow, so you have to request that with the rows= parameter.

On Wed, Jul 21, 2010 at 3:32 PM, scr...@asia.com wrote:

Hi, It seems that not all fields are returned in the query response when I use dismax? Only the first 10?? Any idea? Here is my solrconfig:

  <requestHandler name="dismax" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="echoParams">explicit</str>
      <str name="fl">*</str>
      <float name="tie">0.01</float>
      <str name="qf">text^0.5 content^1.1 title^1.5</str>
      <str name="pf">text^0.2 content^1.1 title^1.5</str>
      <str name="bf">recip(price,1,1000,1000)^0.3</str>
      <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>
      <int name="ps">100</int>
      <str name="q.alt">*:*</str>
      <!-- example highlighter config, enable per-query with hl=true -->
      <str name="hl.fl">text features name</str>
      <!-- for this field, we want no fragmenting, just highlighting -->
      <str name="f.name.hl.fragsize">0</str>
      <!-- instructs Solr to return the field itself if no query terms are found -->
      <str name="f.name.hl.alternateField">name</str>
      <str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
    </lst>
  </requestHandler>

--
Lance Norskog goks...@gmail.com
Re: Dismax query response field number
Do you have data in that field? Solr only returns fields that have data.
Re: Clustering results limit?
Hi,

I am attempting to cluster a query. It kinda works, but where my (regular) query returns 500 results, the cluster only shows 1-10 hits for each cluster (5 clusters). Never more than 10 docs, and I know it's not right. What could be happening here? It should be showing dozens of documents per cluster.

Just to clarify -- how many documents do you see in the response (the <result name="response" ...> section)? Clustering is performed on the search results (in real time), so if you request 10 results, clustering will apply only to those 10 results. To get a larger number of clusters you'd need to request more results, e.g. 50, 100, 200 etc. Obviously, the trade-off here is that it will take longer to fetch the documents from the index, and clustering time will also increase.

For some guidance on choosing the clustering algorithm, you can take a look at the following section of the Carrot2 manual: http://download.carrot2.org/stable/manual/#section.advanced-topics.fine-tuning.choosing-algorithm

Cheers, Staszek
Re: 2 solr dataImport requests on a single core at the same time
DataImportHandler does not support parallel execution of several requests. You should either send your requests sequentially, or register several DIH handlers in solrconfig.xml and use them in parallel.

On Thu, Jul 22, 2010 at 11:20 AM, kishan mklpra...@gmail.com wrote:

please help me
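If it helps, a minimal sketch of registering two DIH handlers in solrconfig.xml (the handler names and config file names here are just examples, not anything prescribed):

  <requestHandler name="/dataimport-a" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config-a.xml</str>
    </lst>
  </requestHandler>

  <requestHandler name="/dataimport-b" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config-b.xml</str>
    </lst>
  </requestHandler>

Each handler can then be invoked independently, e.g. /dataimport-a?command=full-import and /dataimport-b?command=full-import.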
Re: Dismax query response field number
Yes, I've data... maybe my query is wrong?

  select?q=moto&qt=dismax&q=city:Paris

Field city is not showing?

-----Original Message-----
From: Grijesh.singh pintu.grij...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thu, Jul 22, 2010 10:07 am
Subject: Re: Dismax query response field number

Do you have data in that field? Solr only returns fields that have data.
Re: Securing Solr 1.4 in a glassfish container AS NEW THREAD
Are you using the same instance of CommonsHttpSolrServer for all the requests?

On Wed, Jul 21, 2010 at 4:50 PM, Sharp, Jonathan jsh...@coh.org wrote:

Some further information -- I tried indexing a batch of PDFs with the client and Solr Cell, setting the credentials in the httpclient. For some reason, after successfully indexing several hundred files I start getting a SolrException: Unauthorized, and an info message (for every subsequent file):

  INFO basic authentication scheme selected
  org.apache.commons.httpclient.HttpMethodDirector processWWWAuthChallenge
  INFO Failure authenticating with BASIC 'realm'@host:port

I increased the session timeout in web.xml with no change. I'm looking through the httpclient authentication now.

-Jon

-----Original Message-----
From: Sharp, Jonathan
Sent: Friday, July 16, 2010 8:59 AM
To: 'solr-user@lucene.apache.org'
Subject: RE: Securing Solr 1.4 in a glassfish container AS NEW THREAD

Hi Bilgin, Thanks for the snippet -- that helps a lot. -Jon

-----Original Message-----
From: Bilgin Ibryam [mailto:bibr...@gmail.com]
Sent: Friday, July 16, 2010 1:31 AM
To: solr-user@lucene.apache.org
Subject: Re: Securing Solr 1.4 in a glassfish container AS NEW THREAD

Hi Jon,

SolrJ (CommonsHttpSolrServer) internally uses Apache HttpClient to connect to Solr. You can check there for some documentation. I secured Solr also with the BASIC auth-method and use the following snippet to access it from SolrJ:

  // set username and password
  ((CommonsHttpSolrServer) server).getHttpClient().getParams().setAuthenticationPreemptive(true);
  Credentials defaultcreds = new UsernamePasswordCredentials("username", "secret");
  ((CommonsHttpSolrServer) server).getHttpClient().getState().setCredentials(
      new AuthScope("localhost", 80, AuthScope.ANY_REALM), defaultcreds);

HTH
Bilgin Ibryam

On Fri, Jul 16, 2010 at 2:35 AM, Sharp, Jonathan jsh...@coh.org wrote:

Hi All, I am considering securing Solr with basic auth in glassfish using the container, by adding to web.xml and adding a sun-web.xml file to the distributed WAR as below. If using SolrJ to index files, how can I provide the credentials for authentication to the http-client (or can someone point me in the direction of the right documentation to do that, or anything that will help me make the appropriate modifications)? Also, any comment on the below is appreciated.
Add this to web.xml
---
  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>SomeRealm</realm-name>
  </login-config>

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Admin Pages</web-resource-name>
      <url-pattern>/admin</url-pattern>
      <url-pattern>/admin/*</url-pattern>
      <http-method>GET</http-method> <http-method>POST</http-method> <http-method>PUT</http-method> <http-method>TRACE</http-method> <http-method>HEAD</http-method> <http-method>OPTIONS</http-method> <http-method>DELETE</http-method>
    </web-resource-collection>
    <auth-constraint>
      <role-name>SomeAdminRole</role-name>
    </auth-constraint>
  </security-constraint>

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Update Servlet</web-resource-name>
      <url-pattern>/update/*</url-pattern>
      <http-method>GET</http-method> <http-method>POST</http-method> <http-method>PUT</http-method> <http-method>TRACE</http-method> <http-method>HEAD</http-method> <http-method>OPTIONS</http-method> <http-method>DELETE</http-method>
    </web-resource-collection>
    <auth-constraint>
      <role-name>SomeUpdateRole</role-name>
    </auth-constraint>
  </security-constraint>

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Select Servlet</web-resource-name>
      <url-pattern>/select/*</url-pattern>
      <http-method>GET</http-method> <http-method>POST</http-method> <http-method>PUT</http-method> <http-method>TRACE</http-method> <http-method>HEAD</http-method> <http-method>OPTIONS</http-method> <http-method>DELETE</http-method>
    </web-resource-collection>
    <auth-constraint>
      <role-name>SomeSearchRole</role-name>
    </auth-constraint>
  </security-constraint>
---
Also add this as sun-web.xml

  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE sun-web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Application Server 9.0 Servlet 2.5//EN" "http://www.sun.com/software/appserver/dtds/sun-web-app_2_5-0.dtd">
  <sun-web-app error-url="">
    <context-root>/Solr</context-root>
    <jsp-config>
      <property name="keepgenerated" value="true">
        <description>Keep a copy of the generated servlet class' java code.</description>
      </property>
    </jsp-config>
    <security-role-mapping>
      <role-name>SomeAdminRole</role-name>
      <group-name>SomeAdminGroup</group-name>
    </security-role-mapping>
    <security-role-mapping>
      <role-name>SomeUpdateRole</role-name>
Re: Dismax query response field number
Maybe it's too simple, but did you try rows=20, or something greater, as Lance suggested? => select?rows=20&qt=dismax

Regards, Peter.

Yes, I've data... maybe my query is wrong? select?q=moto&qt=dismax&q=city:Paris Field city is not showing?

-----Original Message-----
From: Grijesh.singh pintu.grij...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thu, Jul 22, 2010 10:07 am
Subject: Re: Dismax query response field number

Do you have data in that field? Solr only returns fields that have data.

--
http://karussell.wordpress.com/
Re: solrconfig.xml and xinclude
Hi,

I am trying to do a similar thing within the schema.xml (using Solr 1.4.1), having a (super)schema that is common to 2 instances plus specific fields I would like to include (with XInclude). Something like this:

  <schema name="dummy">
    ...
    <field name="A" type="string" indexed="true" stored="false" required="false" multiValued="true"/>
    <field name="B" type="string" indexed="true" stored="false" required="false" multiValued="true"/>
    <field name="C" type="string" indexed="true" stored="true" required="false"/>
    <!-- xincluding here -->
    <xi:include href="solr/conf/specific_fields_1.xml" parse="xml">
      <xi:fallback>
        <xi:include href="solr/conf/specific_fields_2.xml" parse="xml"/>
      </xi:fallback>
    </xi:include>
    ...
  </schema>

and it works when specific_fields_1.xml (or specific_fields_2.xml) contains a single element like the following:

  <field name="first_specific_field" type="string" indexed="true" stored="true" required="false"/>

but it stops working when I add more than one field in the included XML:

  <fields>
    <field name="first_specific_field" type="string" indexed="true" stored="true" required="false"/>
    <field name="second_specific_field" type="string" indexed="true" stored="false" required="false"/>
  </fields>

and consequently modify the including element as follows:

  <xi:include href="solr/conf/specific_fields_1.xml" parse="xml" xpointer="/fields/field">
    <xi:fallback>
      <xi:include href="solr/conf/specific_fields_2.xml" parse="xml" xpointer="/fields/field"/>
    </xi:fallback>
  </xi:include>

I tried to modify the xpointer attribute value to:

  fields/field
  fields/*
  /fields/*
  element(/fields/field)
  element(/fields/*)
  element(fields/field)
  element(fields/*)

but I had no luck.

Fiedzia, I think that xpointer="xpointer(something)" won't work, as you can read in the last sentence of the page regarding SolrConfigXml [1]. I took a look at the Solr source code and found a JUnit test for XInclusion that tests the inclusion documented in the wiki [2][3]. I also found an entry on the Lucid Imagination website at [4], but couldn't fix my issue.

Please, could someone help us regarding the right way to configure XInclude inside Solr? Thanks in advance for your time.

Cheers, Tommaso

[1]: http://wiki.apache.org/solr/SolrConfigXml
[2]: http://wiki.apache.org/solr/SolrConfigXml#XInclude
[3]: http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/test/org/apache/solr/core/TestXIncludeConfig.java
[4]: http://www.lucidimagination.com/search/document/31a60b7ccad76de1/is_it_possible_to_use_xinclude_in_schema_xml

2010/7/21 fiedzia fied...@gmail.com

I am trying to export some config options common to all cores into a single file, which would be included using XInclude. The only problem is how to include the children of a given node.

common_solrconfig.xml looks like this:

  <?xml version="1.0" encoding="UTF-8" ?>
  <config>
    <lib dir="/solr/lib" />
  </config>

solrconfig.xml looks like this:

  <?xml version="1.0" encoding="UTF-8" ?>
  <config>
    <!-- xinclude here -->
  </config>

Now, all of the following attempts have failed:

  <xi:include href="/solr/common_solrconfig.xml" xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
  <xi:include href="/solr/common_solrconfig.xml" xpointer="config/*" xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
  <xi:include href="/solr/common_solrconfig.xml" xpointer="xpointer(config/*)" xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
  <xi:include href="/solr/common_solrconfig.xml" xpointer="element(config/*)" xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
Re: Dismax query response field number
Is this a typo in your query or in your e-mail? You have the q parameter twice. Use fq for query inputs that mention a field explicitly when using dismax. So it should be:

  select?q=moto&qt=dismax & fq=city:Paris

(the whitespace is only for visualization)

Chantal

On Thu, 2010-07-22 at 11:03 +0200, scr...@asia.com wrote:

Yes, I've data... maybe my query is wrong? select?q=moto&qt=dismax&q=city:Paris Field city is not showing?

-----Original Message-----
From: Grijesh.singh pintu.grij...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thu, Jul 22, 2010 10:07 am
Subject: Re: Dismax query response field number

Do you have data in that field? Solr only returns fields that have data.
Re: Dismax query response field number
Thanks, that was the problem!

  select?q=moto&qt=dismax&fq=city:Paris

-----Original Message-----
From: Chantal Ackermann chantal.ackerm...@btelligent.de
To: solr-user@lucene.apache.org
Sent: Thu, Jul 22, 2010 12:47 pm
Subject: Re: Dismax query response field number

Is this a typo in your query or in your e-mail? You have the q parameter twice. Use fq for query inputs that mention a field explicitly when using dismax. So it should be: select?q=moto&qt=dismax & fq=city:Paris (the whitespace is only for visualization)

Chantal

[...]
Re: Clustering results limit?
Staszek,

Thank you. The cluster response has a maximum of 10 documents in each cluster. I didn't set this limit, and the query by itself returns 500+ documents. There should be many more than 10 in each cluster. Does it default to 10, maybe? Or is there a way to say: cluster every result in the query?

Thank you, I will read the links again,
Darren

On Thu, 2010-07-22 at 10:15 +0200, Stanislaw Osinski wrote:

Just to clarify -- how many documents do you see in the response (the <result name="response" ...> section)? Clustering is performed on the search results (in real time), so if you request 10 results, clustering will apply only to those 10 results. [...]
Re: Clustering results limit?
I set rows=50 on my clustering URL in a browser and it returns more. In my SolrJ code I used ModifiableSolrParams and set ("rows", 50), but it still returns less than 10 documents for each cluster. Is there a way to set the rows wanted with ModifiableSolrParams?

Thanks, and sorry for the double post.
Darren

On Thu, 2010-07-22 at 10:15 +0200, Stanislaw Osinski wrote:

Just to clarify -- how many documents do you see in the response (the <result name="response" ...> section)? Clustering is performed on the search results (in real time), so if you request 10 results, clustering will apply only to those 10 results. [...]
Re: Dismax query response field number
scrapy, what version of Solr are you using? I'd like to do fq=city:Paris, but it doesn't seem to work for me (Solr 1.4), and the docs seem to suggest it's a feature that is coming but not there yet? Or maybe I misunderstood?

On Thu, Jul 22, 2010 at 6:00 AM, scr...@asia.com wrote:

Thanks, that was the problem! select?q=moto&qt=dismax&fq=city:Paris

-----Original Message-----
From: Chantal Ackermann chantal.ackerm...@btelligent.de
To: solr-user@lucene.apache.org
Sent: Thu, Jul 22, 2010 12:47 pm
Subject: Re: Dismax query response field number

Is this a typo in your query or in your e-mail? You have the q parameter twice. Use fq for query inputs that mention a field explicitly when using dismax. So it should be: select?q=moto&qt=dismax & fq=city:Paris (the whitespace is only for visualization)

Chantal

[...]
Re: Dismax query response field number
I'm using Solr 1.4.1

-----Original Message-----
From: Justin Lolofie jta...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thu, Jul 22, 2010 2:57 pm
Subject: Re: Dismax query response field number

scrapy, what version of Solr are you using? I'd like to do fq=city:Paris, but it doesn't seem to work for me (Solr 1.4), and the docs seem to suggest it's a feature that is coming but not there yet? Or maybe I misunderstood?

[...]
Using Solr to perform range queries in Dspace
I'm trying to use DSpace to search across a range of indexes created and stored using the DSIndexer.java class. I have seen that Solr can be used to perform numerical range queries using either the TrieIntField, TrieDoubleField, TrieLongField, etc. classes defined in Solr's API, or SortableIntField.java, SortableLongField.java, SortableDoubleField.java. I would like to know how to implement these classes in DSpace so that I can perform numerical range queries. Any help would be greatly appreciated.
Getting FileNotFoundException with repl command=backup?
Informational

Hi,

This information is for anyone who might be running into problems when performing explicit periodic backups of Solr indexes. I encountered this problem, and hopefully this might be useful to others. A related Jira issue is SOLR-1475.

The issue: when you execute a 'command=backup' request, the snapshot starts, but then fails later on with file-not-found errors. This aborts the snapshot, and you end up with no backup.

This error occurs if, during the backup process, Solr performs more commits than its 'maxCommitsToKeep' setting in solrconfig.xml. If you don't commit very often, you probably won't see this problem. If, however, like me, you have Solr committing very often, the commit point files for the backup can get deleted before the backup finishes. This is particularly true of larger indexes, where the backup can take some time.

Workaround 1: One workaround is to set 'maxCommitsToKeep' to a number higher than the total number of commits that can occur during the time it takes to do a backup. Sounds like a finger-in-the-air number? Well, yes it is. If you commit every 20 secs, and a full backup takes 10 mins, you'll want a value of at least 31. The trouble is, how long will a backup take? This can vary hugely as the index grows, the system gets busy, disk fragmentation sets in, etc. (my environment takes ~13 mins to back up a 5.5GB index to a local folder).

An inefficiency of this approach that needs to be considered is that the higher the 'maxCommitsToKeep' number, the more files you're going to have lounging around in your index data folder -- the majority of which never get used. The collective size of these commit point files can be significant. If you have a high mergeFactor, the number of files will increase as well. You can set 'maxCommitAge' to delete old commit points after a certain time -- as long as it's not shorter than the worst-case backup time.

I set my 'maxCommitsToKeep' to 2400, and the file-not-found errors disappeared (note that 2400 is a hugely conservative number, to cater for a backup taking 24 hrs). My mergeFactor is 25, so I get a high number of files in the index folder; they are generally small in size, but significant extra storage can be required. If you're willing to trade off some (ok, potentially a lot of) extraneous disk usage to keep commit points around waiting for a backup command, this approach addresses the problem.

Workaround 2: A preferable method (IMHO), if you have an extra box, is to set up a read-only replica, and then back up from the replica. You can then tune the slave to suit your needs.

Coding: I'm not very familiar with the repl/backup code, but a coded way to address this might be to save a commit point's index version files when a backup command is received, then release them for deletion when complete. Perhaps someone with good knowledge of this part of Solr could comment more succinctly.

Thanks, Peter
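For reference, these settings live in the <deletionPolicy> section (inside <mainIndex>) of solrconfig.xml. A sketch using the numbers from this post -- tune the values to your own commit rate and backup window:

  <deletionPolicy class="solr.SolrDeletionPolicy">
    <!-- keep enough commit points to outlive the slowest conceivable backup -->
    <str name="maxCommitsToKeep">2400</str>
    <!-- optionally expire old commit points by age instead -->
    <str name="maxCommitAge">1DAY</str>
  </deletionPolicy>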
Tree Faceting in Solr 1.4
Hi Solr Community,

If I have:

  COUNTRY  CITY
  Germany  Berlin
  Germany  Hamburg
  Spain    Madrid

Can I do faceting like:

  Germany
    Berlin
    Hamburg
  Spain
    Madrid

I tried to apply SOLR-792 to the current trunk but it does not seem to be compatible. Maybe there is a similar feature existing in the latest builds?

Thanks
Regards
Eric
Re: Tree Faceting in Solr 1.4
Perhaps the following article can help: http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html

-S

On Jul 22, 2010, at 5:39 PM, Eric Grobler wrote:

Hi Solr Community,

If I have:

  COUNTRY  CITY
  Germany  Berlin
  Germany  Hamburg
  Spain    Madrid

Can I do faceting like:

  Germany
    Berlin
    Hamburg
  Spain
    Madrid

I tried to apply SOLR-792 to the current trunk but it does not seem to be compatible. Maybe there is a similar feature existing in the latest builds?

Thanks
Regards
Eric
Re: Clustering results limit?
Hi,

In my SolrJ, I used ModifiableSolrParams and I set ("rows", 50) but it still returns less than 10 for each cluster.

Oh, the number of documents per cluster very much depends on the characteristics of your documents; it often happens that the algorithms create larger numbers of smaller clusters. However, all returned documents should get assigned to some cluster(s) -- the "Other Topics" one in the worst case. Does that hold in your case?

If you'd like to tune clustering a bit, you can try the Carrot2 tools: http://download.carrot2.org/stable/manual/#section.getting-started.solr and then: http://download.carrot2.org/stable/manual/#chapter.tuning

Cheers, S.
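To answer the earlier SolrJ question concretely, here is a minimal sketch of setting rows through ModifiableSolrParams (the Solr URL, the "/clustering" handler name and the query are assumptions -- adapt them to your setup):

  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.params.CommonParams;
  import org.apache.solr.common.params.ModifiableSolrParams;

  public class ClusteringQuery {
      public static void main(String[] args) throws Exception {
          SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

          ModifiableSolrParams params = new ModifiableSolrParams();
          params.set(CommonParams.QT, "/clustering"); // name of your clustering handler
          params.set(CommonParams.Q, "some query");
          params.set(CommonParams.ROWS, 150);         // cluster the top 150 hits, not the default 10

          QueryResponse response = server.query(params);
          System.out.println("Docs returned: " + response.getResults().size());
      }
  }

If rows still seems stuck at 10, it's worth checking that nothing later overwrites the parameter, and that the handler's defaults in solrconfig.xml don't force rows.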
Delta import processing duration
I'm using Solr to index data from our data warehouse. The data is imported through text files. I've written a custom FileImportDataImportHandler that extends DataSource, and it works fine -- I've tested it with 280,000 records and it manages to build the index in about 3 minutes.

My problem is that doing a delta update seems to take a really long time. I've written a custom FileUpdateDataImportHandler which takes two files, one for deletes and one for updates. I've tested with an update file containing 18,000 records and a delete file containing 30 records -- my custom handler whizzed through them in a few seconds, but the page at /solr/admin/dataimport.jsp says the command is still running (it's been running nearly an hour). What's taking so long? Could there be some kind of inefficiency in the way my update handler works?
Re: Tree Faceting in Solr 1.4
Thank you for the link. I was not aware of the multi-faceting syntax -- this will enable me to run 1 less query on the main page! However, this is not a tree faceting feature.

Thanks
Eric

On Thu, Jul 22, 2010 at 4:51 PM, SR r.steve@gmail.com wrote:

Perhaps the following article can help: http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html

-S

[...]
Solr on iPad?
Dear Solr community, does anyone know whether it may be possible or has already been done to bring Solr to the Apple iPad so that applications may use a local search engine? Greetings, Stephan
Re: Solr on iPad?
Stephan Schwab wrote:

Dear Solr community, does anyone know whether it may be possible or has already been done to bring Solr to the Apple iPad so that applications may use a local search engine?

huh? Solr requires Java. iPad does not support Java. Solr is memory and cpu intensive... nothing that fits with the concept of a tablet pc.

-aj
Re: a bug of solr distributed search
As the comments suggest, it's not a bug, but just the best we can do for now, since our priority queues don't support removal of arbitrary elements. I guess we could rebuild the current priority queue if we detect a duplicate, but that would have an obvious performance impact. Any other suggestions?

-Yonik
http://www.lucidimagination.com

On Wed, Jul 21, 2010 at 3:13 AM, Li Li fancye...@gmail.com wrote:

In QueryComponent.mergeIds, it will remove a document whose uniqueKey is duplicated with another's. The current implementation uses the first one encountered:

  String prevShard = uniqueDoc.put(id, srsp.getShard());
  if (prevShard != null) {
    // duplicate detected
    numFound--;
    collapseList.remove(id + "");
    docs.set(i, null);  // remove it
    // For now, just always use the first encountered since we can't currently
    // remove the previous one added to the priority queue. If we switched
    // to the Java5 PriorityQueue, this would be easier.
    continue;
    // make which duplicate is used deterministic based on shard
    // if (prevShard.compareTo(srsp.shard) >= 0) {
    //   // TODO: remove previous from priority queue
    //   continue;
    // }
  }

It iterates over the shard responses with:

  for (ShardResponse srsp : sreq.responses)

But the order of sreq.responses may differ between runs -- that is, shard1's result and shard2's result may interchange positions. So when a uniqueKey (such as a URL) occurs in both shard1 and shard2, which one will be used is unpredictable. But the scores of these 2 docs are different because of different idf, so the same query can get different results. One possible solution is to sort ShardResponse srsp by shard name.
Re: Solr on iPad?
Hi Stephan,

On a lark, I hacked up Solr running under a small-footprint servlet engine on my jailbroken iPad. You can see the console here: http://imgur.com/tHRh3

It's not a particularly practical solution, though, since Apple would never approve a Java-based app for the App Store. Or a background service, for that matter. So it would only ever run on a jailbroken iPad. Even if you're willing to live with that, keeping the process running in the background all the time would have a devastating impact on battery life.

Michael
Providing token variants at index time
Is there a tokenizer that supports providing variants of the tokens at index time? I'm looking for something that could take a syntax like:

  International|I Business|B Machines|M

which would take each pipe-delimited token and preserve its position so that phrase queries work properly. The above would match queries for "International Business Machines" as well as "I B M" or any variants. The point is that the variants would be generated externally as part of the indexing process, so they may not be as simple as the above. Any ideas, or do I have to write a custom tokenizer to do this?

Thanks, Paul
calling other core from request handler
I have a multi-core environment and a custom request handler. However, I have one place where I would like my request handler on coreA to query coreB. This is not distributed search -- this is just an independent query to get some additional data. I am also guaranteed that each server will have the same core set, and that I will not be reloading cores (just indexes).

It looks like I can call coreA.getCoreDescriptor().getCoreContainer().getCore("coreB") and then get the Searcher and release it when I am done. Is there a better way?

It also appears that during the inform or init methods of my requestHandler, coreB is NOT guaranteed to already exist?
Re: Providing token variants at index time
I think the Synonym filter should actually do exactly what you want, no?

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

Hmm, maybe not exactly what you want as you describe it. It comes close, maybe good enough. Do you REALLY need to support "I Business M" or "I B Machines" as source/query? Your spec suggests yes; the synonym filter won't easily do that. But if you just want International Business Machines == IBM, keeping positions intact for subsequent terms, I think the synonym filter will do it. If not, I suppose you could look at its source to write your own. Or maybe there's some way to combine the PositionFilter with something else to do it, but I can't figure one out.

Jonathan

Paul Dlug wrote:

Is there a tokenizer that supports providing variants of the tokens at index time? I'm looking for something that could take a syntax like: International|I Business|B Machines|M which would take each pipe-delimited token and preserve its position so that phrase queries work properly. The above would match queries for "International Business Machines" as well as "I B M" or any variants. The point is that the variants would be generated externally as part of the indexing process, so they may not be as simple as the above. Any ideas, or do I have to write a custom tokenizer to do this?

Thanks, Paul
Re: a bug of solr distributed search
: As the comments suggest, it's not a bug, but just the best we can do
: for now since our priority queues don't support removal of arbitrary

FYI: I updated the DistributedSearch wiki to be more clear about this -- it previously didn't make it explicitly clear that docIds were supposed to be unique across all shards, and it suggested that there was specific, well-defined behavior when they weren't.

-Hoss
Re: Providing token variants at index time
On Thu, Jul 22, 2010 at 4:01 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

I think the Synonym filter should actually do exactly what you want, no? http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory Hmm, maybe not exactly what you want as you describe it. It comes close, maybe good enough. Do you REALLY need to support "I Business M" or "I B Machines" as source/query? Your spec suggests yes; the synonym filter won't easily do that. But if you just want International Business Machines == IBM, keeping positions intact for subsequent terms, I think the synonym filter will do it. If not, I suppose you could look at its source to write your own. Or maybe there's some way to combine the PositionFilter with something else to do it, but I can't figure one out.

The synonym approach won't work, as I'd need to provide the variants in a file. The variants may be more dynamic and not known in advance; the process creating the documents to index does have that logic, and could easily put them into the document in a format a tokenizer could pull apart later.

--Paul
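For what it's worth, a custom filter along these lines is fairly small. Below is a hypothetical sketch (the class name and details are mine, not an existing Solr filter), written against the attribute API of the Lucene version Solr 1.4 ships with: it splits pipe-delimited tokens and re-emits each variant with a position increment of 0, so phrase queries line up across variants:

  import java.io.IOException;
  import java.util.LinkedList;

  import org.apache.lucene.analysis.TokenFilter;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
  import org.apache.lucene.analysis.tokenattributes.TermAttribute;

  /** Splits tokens like "International|I" and emits each variant at the same position. */
  public final class PipeVariantFilter extends TokenFilter {
    private final TermAttribute termAtt = addAttribute(TermAttribute.class);
    private final PositionIncrementAttribute posIncrAtt = addAttribute(PositionIncrementAttribute.class);
    private final LinkedList<String> variants = new LinkedList<String>();

    public PipeVariantFilter(TokenStream input) {
      super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
      if (!variants.isEmpty()) {
        // Emit a stacked variant: position increment 0 keeps it in the same
        // slot, so phrase queries match through either variant.
        termAtt.setTermBuffer(variants.removeFirst());
        posIncrAtt.setPositionIncrement(0);
        return true;
      }
      if (!input.incrementToken()) {
        return false;
      }
      String[] parts = termAtt.term().split("\\|");
      if (parts.length > 1) {
        termAtt.setTermBuffer(parts[0]); // emit the first variant now, queue the rest
        for (int i = 1; i < parts.length; i++) {
          variants.add(parts[i]);
        }
      }
      return true;
    }

    @Override
    public void reset() throws IOException {
      super.reset();
      variants.clear();
    }
  }

You'd still need a tiny factory (subclassing BaseTokenFilterFactory) to reference it from schema.xml.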
RE: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req
: I would like to get the total count of the facet.field response values
:
: I'm pretty sure there's no way to get Solr to do that -- other than not
: setting a facet.limit, getting every value back in the response, and
: counting them yourself (not feasible for very large counts). I've
: looked at trying to patch Solr to do it, because I could really use it
: too; it's definitely possible, but made trickier because there are now
: several different methods that Solr can use to do faceting, with
: separate code paths. It seems like an odd omission to me too.

Beyond just having multiple facet algorithms for performance making it difficult to add this feature, the other issue is the performance of computing the number: in some algorithms it's relatively cheap (on a single server), but in others it's more expensive than computing the facet counts being returned (consider the case where we are sorting in term order -- once we have collected counts for ${facet.limit} constraints, we can stop iterating over terms, but to compute the total number of constraints (ie: terms) we would have to keep going and test every one of them against ${facet.mincount}).

With distributed searching it becomes even more prohibitive -- your description of using an infinite facet.limit and asking for every value back to count them is exactly what would have to be done internally in a distributed faceting situation -- except they couldn't just be counted, they'd have to be deduped and then counted.

To do this efficiently, other data structures (denormalized beyond just the inverted index level) would need to be built.

-Hoss
Re: stats on a field with no values
:
: When I use the stats component on a field that has no values in the result set
: (ie, stats.missing == rowCount), I'd expect that 'min' and 'max' would be
: blank.
:
: Instead, they seem to be the smallest and largest float values or something:
: min = 1.7976931348623157E308, max = 4.9E-324.
:
: Is this a bug?

Off the top of my head, it sounds like it... would you mind opening an issue in Jira, please?

-Hoss
Re: How to get the list of all available fields in a (sharded) index
: I cannot find any info on how to get the list of current fields in an index
: (possibly sharded). With dynamic fields, I cannot simply parse the schema to

There isn't one -- the LukeRequestHandler can tell you what fields *actually* exist in your index, but you'd have to query it on each shard to know the full set of concrete fields in the entire distributed index.

-Hoss
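As a sketch of what that per-shard query looks like (host and port are whatever each shard runs on):

  http://shard-host:8983/solr/admin/luke?numTerms=0

The response lists every concrete field present in that shard's index; union the field lists across shards to get the distributed view.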
Re: about warm up
: I want to load full text into an external cache, so I added some code
: in newSearcher, where I found the warm up takes place. I added my code
...
:   public void newSearcher(SolrIndexSearcher newSearcher,
:                           SolrIndexSearcher currentSearcher) {
:     warmTextCache(newSearcher, warmTextCache, new String[]{"title", "content"});
...
: in warmTextCache I need a reader to get some docs
...
: So I need a reader. When I construct a reader by myself like:
:   IndexReader reader = IndexReader.open(...);
: or by core.getSearcher().get().getReader()

Don't do that -- the readers/searchers are reference counted by the SolrCore, so unless you release your references cleanly you are likely to get into some interesting situations.

The newSearcher method you are implementing directly gives you the SolrIndexSearcher (the newSearcher arg) that will be used along with your cache. Why don't you use it to get the reader (the getReader() method) instead of jumping through the hoops you've been trying?

-Hoss
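In other words, something like this hypothetical listener (the field name and cache are placeholders; the point is using the searcher that Solr hands you, not a reader you open yourself):

  import java.io.IOException;
  import java.util.Map;
  import java.util.concurrent.ConcurrentHashMap;

  import org.apache.lucene.document.Document;
  import org.apache.lucene.index.IndexReader;
  import org.apache.solr.common.util.NamedList;
  import org.apache.solr.core.SolrEventListener;
  import org.apache.solr.search.SolrIndexSearcher;

  public class TextCacheWarmer implements SolrEventListener {
    // doc ids are per-searcher, so this gets repopulated on every newSearcher event
    private final Map<Integer, String> textCache = new ConcurrentHashMap<Integer, String>();

    public void init(NamedList args) {}
    public void postCommit() {}

    public void newSearcher(SolrIndexSearcher newSearcher, SolrIndexSearcher currentSearcher) {
      // Use the searcher Solr hands us -- no reference counting to manage.
      IndexReader reader = newSearcher.getReader();
      try {
        for (int docId = 0; docId < reader.maxDoc(); docId++) {
          if (reader.isDeleted(docId)) continue;
          Document doc = reader.document(docId); // a FieldSelector could load fewer fields
          textCache.put(docId, doc.get("content"));
        }
      } catch (IOException e) {
        // log and continue; a failed warm-up shouldn't break searcher rotation
      }
    }
  }

It would be wired up with a <listener event="newSearcher" .../> entry in solrconfig.xml.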
commit is taking very very long time
Hi,

I am not sure why some commits take a very long time. I have a batch indexing job which commits just once after it completes the indexing. I tried to index just 36 rows, but the total time taken was about 12 minutes. The indexing itself took only some 30 seconds; the remaining time was spent in the commit.

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">0</int>
    </lst>
    <lst name="initArgs">
      <lst name="defaults">
        <str name="config">dataimportHydrogen.xml</str>
      </lst>
    </lst>
    <str name="status">idle</str>
    <str name="importResponse"/>
    <lst name="statusMessages">
      <str name="Total Requests made to DataSource">4</str>
      <str name="Total Rows Fetched">36</str>
      <str name="Total Documents Skipped">0</str>
      <str name="Full Dump Started">2010-07-22 15:42:28</str>
      <str name="">Indexing completed. Added/Updated: 4 documents. Deleted 0 documents.</str>
      <str name="Committed">2010-07-22 15:54:49</str>
      <str name="Optimized">2010-07-22 15:54:49</str>
      <str name="Total Documents Processed">4</str>
      <str name="Time taken ">0:12:21.632</str>
    </lst>
    <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
  </response>

I even set the autowarm count to 0 in solrconfig.xml, but to no avail. Any reason why the commit takes more time? Also, is there a way to reduce the time it takes? I have attached my solrconfig / log for your reference.

http://lucene.472066.n3.nabble.com/file/n988220/SOLRerror.log SOLRerror.log
http://lucene.472066.n3.nabble.com/file/n988220/solrconfig.xml solrconfig.xml

Thanks, BB
Re: Novice seeking help to change filters to search without diacritics
: I am new to Solr and seeking your help to change filter from
: ISOLatin1AccentFilterFactory to ASCIIFoldingFilterFactory files. I am not

According to the files you posted, you aren't using the ISOLatin1AccentFilterFactory -- so problem solved w/o making any changes.

: sure what change is to be made and where exactly this change is to be made.
: And finally, what would replace mapping-ISOLatin1Accent.txt file? I would

I think what's confusing you is that you are using the MappingCharFilterFactory with that file in your "text" field type to convert any ISOLatin1 accented characters to their base characters (i'm sure there is a more precise term for it, but i'm not a charset savant like rmuir so i don't know what it's called).

: like Solr to search both with and without diacritics found in
: transliteration of Indian languages with characters such as Ā ś ṛ ṇ, etc.

Your existing usage should allow that on any fields using the "text" type -- if you index those characters they will get flattened, and if someone searches on those characters they will get flattened. It's just like using LowerCaseFilter: as long as you do it at index and query time, everything is consistent.

If you want docs to score higher when even the accents match, just index and query across two fields: one with that charfilter and one without.

-Hoss
Re: Providing token variants at index time
Paul Dlug wrote:

On Thu, Jul 22, 2010 at 4:01 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

The synonym approach won't work, as I'd need to provide the variants in a file. The variants may be more dynamic and not known in advance; the process creating the documents to index does have that logic, and could easily put them into the document in a format a tokenizer could pull apart later.

Then maybe look at the source code of the synonym filter, and build your own filter, copying the parts that do the real work (or even sub-classing) -- but instead of using a file, using the transient state information that is, for some reason, only available at indexing time? I don't entirely understand your use case; if you give some more explicit examples, others might have other ideas.

Jonathan
Re: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req
Chris Hostetter wrote:

computing the number: in some algorithms it's relatively cheap (on a single server), but in others it's more expensive than computing the facet counts being returned (consider the case where we are sorting in term order -- once we have collected counts for ${facet.limit} constraints, we can stop iterating over terms, but to compute the total number of constraints (ie: terms) we would have to keep going and test every one of them against ${facet.mincount})

I've been told this before, but it still doesn't really make sense to me. How can you possibly find the top N constraints without having at least examined all the constraints? How do you know which are the top N if there are some you haven't looked at? And if you've looked at them all, it's no problem to increment a counter as you look at each one. Although I guess the facet.mincount test does possibly put a crimp in things; I don't ever use that param myself with anything other than 1, so I hadn't considered it.

But I may be missing something. I've examined only one of the code paths/methods for faceting in the source code -- the one (if my reading was correct) that ends up used for high-cardinality multi-valued fields. In that method, it looked like it should add no work at all to give you a facet unique value (result set value cardinality) count (with facet.mincount of 1, anyway). But I may have been mis-reading, or it may be that other methods are more troublesome.

At any rate, if I need it badly enough, I'll try to write my own facet component that does it (perhaps a subclass of the existing SimpleFacets), and see what happens. It does seem to be something a variety of people's use cases could use; I see it mentioned periodically in the list archives.

Jonathan
WordDelimiterFilter and phrase queries?
Hi All,

A question about the WordDelimiterFilter and position increments / phrase queries. I have a string like:

  3-diphenyl-propanoic

When indexed, it gets broken up into the following tokens:

  pos  token              offset
  1    3                  0-1
  2    diphenyl           2-10
  3    propanoic          11-20
  3    diphenylpropanoic  2-20

The WordDelimiterFilter has catenateWords set to 1, which causes it to emit 'diphenylpropanoic'. Note that the position for this term is '3'. (catenateAll is set to 0.)

Say someone enters the query string:

  3-diphenylpropanoic

The query parser I'm using transforms this into a phrase query, and the indexed form is missed because the positions of the terms '3' and 'diphenylpropanoic' indicate they are not adjacent. Is this intended behavior? I expect that the catenated word 'diphenylpropanoic' should have a position of 2, based on the position of the first term in the concatenation, but perhaps I'm missing something. This seems to be present in both 1.4.1 and the current trunk.

- Drew
Re: calling other core from request handler
: It looks like I can
: call coreA.getCoreDescriptor().getCoreContainer().getCore("coreB") and then get
: the Searcher and release it when I am done.
:
: Is there a better way?

not really ... not unless you want to do it via HTTP to localhost

: And it also appears that during the inform or init methods of my requestHandler,
: coreB is NOT guaranteed to already exist?

correct ... your RequestHandler shouldn't make any assumptions about the order that cores are initialized in.

-Hoss
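For anyone following along, a sketch of the borrow-and-release pattern being discussed (the core name "coreB" is of course specific to this setup; the important part is releasing both references):

  import org.apache.solr.core.CoreContainer;
  import org.apache.solr.core.SolrCore;
  import org.apache.solr.search.SolrIndexSearcher;
  import org.apache.solr.util.RefCounted;

  // coreA is the SolrCore your handler was initialized with.
  void queryOtherCore(SolrCore coreA) {
    CoreContainer container = coreA.getCoreDescriptor().getCoreContainer();
    SolrCore coreB = container.getCore("coreB"); // increments coreB's reference count
    if (coreB == null) return;                   // core may not exist yet
    try {
      RefCounted<SolrIndexSearcher> ref = coreB.getSearcher();
      try {
        SolrIndexSearcher searcher = ref.get();
        // ... run the side query against coreB here ...
      } finally {
        ref.decref(); // release the searcher
      }
    } finally {
      coreB.close(); // balances getCore()
    }
  }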
Duplicates
Hi, Is it possible to remove duplicates in search results by a given field? Thanks. -- Pavel Minchenkov
Re: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req
: being returned (consider the case where we are sorting in term order -- once
: we have collected counts for ${facet.limit} constraints, we can stop
: iterating over terms, but to compute the total number of constraints (ie:
: terms) we would have to keep going and test every one of them against
: ${facet.mincount})
:
: I've been told this before, but it still doesn't really make sense to me. How
: can you possibly find the top N constraints without having at least examined
: all the constraints? How do you know which are the top N if there are some you

that's exactly my point: in the scenario where you've asked for facet.mincount=N & facet.limit=M & facet.sort=index, you don't have to find the top constraints, you just have to find the first M terms in index order that have a mincount of N.

: But I may be missing something. I've examined only one of the code
: paths/methods for faceting in the source code, ...

in any case where you are sorting by *counts*, then yes, all of the constraints have to be checked, so you can count them as you go -- but that doesn't scale in distributed faceting: you can't just add the counts up from each shard, because you don't know what the overlap is -- hence my comment about how to dedup them. there are some simple usecases where it's feasible, but in general it's a very hard problem.

-Hoss
Re: boosting particular field values
I believe this came up on IRC, and the end result was that the bq was working fine; Justin just wasn't noticing because he added it to his solrconfig.xml (and not to the query URL) and his browser was still caching the page -- so he didn't see his boost affect anything. (but i may be confusing justin with someone else)

: I'm using dismax request handler, solr 1.4.
:
: I would like to boost the weight of certain fields according to their
: values... this appears to work:
:
:   bq=category:electronics^5.5
:
: However, I think this boosting only affects sorting the results that
: have already matched? So if I only get 10 rows back, I might not get
: any records back that are category electronics. If I get 100 rows, I
: can see that bq is working. However, I only want to get 10 rows.
:
: How does one affect the kinds of results that are matched to begin
: with? bq is the wrong thing to use, right?
:
: Thanks for any help,
: Justin

-Hoss
Re: Clustering results limit?
Yeah, my results count is 151, and only 21 documents appear in 6 clusters. This is true whether I use the URL or SolrJ. When I use the Carrot2 workbench and point it at my Solr using local clustering, the workbench has numerous clusters and all documents are placed.

On Thu, 2010-07-22 at 18:06 +0200, Stanislaw Osinski wrote:

Oh, the number of documents per cluster very much depends on the characteristics of your documents; it often happens that the algorithms create larger numbers of smaller clusters. However, all returned documents should get assigned to some cluster(s) -- the "Other Topics" one in the worst case. Does that hold in your case? [...]
Re: Duplicates
If the field is a single token, just define the uniqueKey on it in your schema. Otherwise, this may be of interest: http://wiki.apache.org/solr/Deduplication

Haven't used it myself, though...

Best,
Erick

On Thu, Jul 22, 2010 at 6:14 PM, Pavel Minchenkov char...@gmail.com wrote:

Hi, Is it possible to remove duplicates in search results by a given field? Thanks.

--
Pavel Minchenkov
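The Deduplication wiki page boils down to an update processor chain like the following sketch (the field names here are illustrative; this collapses documents whose "name" field produces the same signature, at index time):

  <updateRequestProcessorChain name="dedupe">
    <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <str name="signatureField">signature</str>
      <bool name="overwriteDupes">true</bool>
      <str name="fields">name</str>
      <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

You'd then point your update handler at the "dedupe" chain and add a "signature" field to schema.xml.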
DIH stalling, how to debug?
Hi,

When I run my DIH script, it says it's busy, but the "Total Requests made to DataSource" and "Total Rows Fetched" counts remain unchanged at 4 and 6. It hasn't reported a failure. How can I debug what is blocking the DIH?

--
@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on research interests: http://gradschoolnow.com
Re: DIH stalling, how to debug?
Ok, it was a runaway SQL query which wasn't using an index.

@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on research interests: http://gradschoolnow.com

On 7/22/10 4:26 PM, Tommy Chheng wrote:

Hi, When I run my DIH script, it says it's busy, but the "Total Requests made to DataSource" and "Total Rows Fetched" counts remain unchanged at 4 and 6. It hasn't reported a failure. How can I debug what is blocking the DIH?
Re: filter query on timestamp slowing query???
: You are correct; first of all i haven't moved yet to the TrieDateField, but i
: am still waiting to find out a bit more information about it, and there's
: not a lot of info, other than in the xml file.

In general, TrieFields are a way of trading disk space for range query speed. they are explained fairly well if you look at the docs...

http://lucene.apache.org/solr/api/org/apache/solr/schema/TrieField.html
http://lucene.apache.org/java/2_9_0/api/all/org/apache/lucene/search/NumericRangeQuery.html

...although i realize now that TrieDateField's docs don't actually link to TrieField, where the explanation is provided.

As for your use case...

: I'll explain my use case, so you'll know a bit more. I have an index that's
: being updated regularly (every second i have 10 to 50 new documents, most
: of them small).
:
: Every 30 minutes, i ask the index what are the documents that were added to
: it, since the last time i queried it, that match a certain criteria.
: From time to time, once a week or so, i ask the index for ALL the documents
: that match that criteria. (i also do this for not only one query, but
: several.) This is why i need the timestamp filter.
:
: The queries that don't have any time range take a few seconds to finish,
: while the ones with a time range take a few minutes.
: Hope that helps understanding my situation, and i am open to any suggestion
: how to change the way things work, if it will improve performance.

you keep saying you run simple queries, and gave an example of "myStrField:foo", and you say you "ask the index what are the documents that were added to it, since the last time i queried it" ... but you've never given any concrete example of a full Solr request that incorporates the timestamp filtering, so we can see *exactly* what your requests look like. Even with an index the size you are describing, and even with the slower performance of DateField compared to TrieDateField, i find it hard to believe that a query for "myStrField:foo" would go from a few seconds to several minutes by adding an fq range query for a span of ~30 minutes.

are you by any chance also *sorting* the documents by that timestamp field when you do this?

My best guess is that either:

a) your raw query performance is generally really bad, but you don't notice when you do your simple queries because of solr's queryResultCache -- but this can't be used when you add the fq, so you see the bad performance then. If this is the situation, I have no real suggestions.

b) when you do your individual requests that filter by your timestamp field, you are also sorting by your timestamp field -- a field you don't ever sort on in any other queries, so the FieldCache needed for sorting needs to be built before those queries can be returned. if you stop sorting on this timestamp field (or add a newSearcher warming query that does the same sort) then the problem should go away.

-Hoss
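The warming approach in (b) is a solrconfig.xml event listener. A sketch, assuming the timestamp field is literally named "timestamp":

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="sort">timestamp desc</str>
      </lst>
    </arr>
  </listener>

This runs the sort once per new searcher, so the field cache is populated before a real user query needs it.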
Re: Novice seeking help to change filters to search without diacritics
Hoss, thank you for your helpful response!

: i think what's confusing you is that you are using the
: MappingCharFilterFactory with that file in your text field type to
: convert any ISOLatin1 accented characters to their base characters

The problem is that a large range of characters are not getting converted to their base characters. The ASCIIFoldingFilterFactory handles this conversion for the entire Latin character set, including the extended sets, without having to specify individual characters and their equivalent base characters. Is there a way for me to switch to ASCIIFoldingFilterFactory? If so, what changes do I need to make to these files? I would appreciate your help!
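For what it's worth, the switch generally amounts to replacing the char filter line with an ASCIIFoldingFilterFactory entry in the analyzer chain of the relevant field type in schema.xml. A sketch (the tokenizer and other filters here are placeholders -- keep whatever your existing "text" type already uses, make sure it applies to both index and query analysis, and reindex afterwards):

  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

ASCIIFoldingFilter covers Latin Extended-A and Latin Extended Additional, so characters like Ā, ś, ṛ and ṇ fold to A, s, r and n.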
Re: Tree Faceting in Solr 1.4
I am also looking for the same feature in Solr, and I am very keen to know whether it supports this kind of tree faceting -- or whether we are forced to index in a path-encoded format like:

  1/2/3/4
  1/2/3
  1/2
  1

What I found is that multi-level faceting gives only a two-level facet tree. If I give a query with country India, state Karnataka and city Bangalore, all I want is:

  1) the facet count for the condition above,
  2) the number of states in that country, and
  3) the number of cities in that state.

Like:

  Country: India, State: Karnataka, City: Bangalore (1)
  State: Karnataka, Kerala, Tamilnadu, Andhra Pradesh... and so on
  City: Mysore, Hubli, Mangalore, Coorg and so on...

If I give:

  facet=on&facet.field={!ex=State}State&fq={!tag=State}State:Karnataka

all it gives me is facets on State, excluding only that filter query. But I was not able to do the same at the third level, like: also give me the counts of cities in state Karnataka.

Let me know the solution for this.

Regards,
Rajani Maski

On Thu, Jul 22, 2010 at 10:13 PM, Eric Grobler impalah...@googlemail.com wrote:

Thank you for the link. I was not aware of the multi-faceting syntax -- this will enable me to run 1 less query on the main page! However, this is not a tree faceting feature.

Thanks
Eric

[...]
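One common workaround, sketched below, is exactly the depth-prefixed path encoding mentioned above: index every ancestor path (with a depth prefix) into one multiValued string field -- hypothetically called "location_path" here -- and drill down one level per request with facet.prefix. It is not built-in tree faceting, but it yields the per-level counts being asked for:

  Indexed values for a Bangalore document:
    0/India
    1/India/Karnataka
    2/India/Karnataka/Bangalore

  Top level (countries):
    q=*:*&facet=on&facet.field=location_path&facet.prefix=0/

  After the user picks India (states within India):
    q=*:*&fq=location_path:"0/India"&facet=on&facet.field=location_path&facet.prefix=1/India/

  After the user picks Karnataka (cities within Karnataka):
    q=*:*&fq=location_path:"1/India/Karnataka"&facet=on&facet.field=location_path&facet.prefix=2/India/Karnataka/

Because every document carries all of its ancestor paths, the fq on the previously selected level matches the right documents, and facet.prefix restricts the facet to the next level down.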