RE: Building a facet query in SolrJ
Thanks! I actually found a page online that explained this.

-Rich

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Wednesday, August 10, 2011 4:01 PM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: RE: Building a facet query in SolrJ

: query.addFacetQuery(MyField + ":" + "\"" + uri + "\"");
...
: But when I examine queryResponse.getFacetFields, it's an empty list, if

facet.query constraints and counts do not come back in the facet.field section of the response. They come back in the facet.query section of the response (look at the XML in your browser and you'll see what I mean)...

https://lucene.apache.org/solr/api/org/apache/solr/client/solrj/response/QueryResponse.html#getFacetQuery%28%29

-Hoss
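A minimal sketch of reading the facet.query counts described above, written in plain Java so it runs without Solr on the classpath. The class FacetQueryCounts and its methods are hypothetical helpers, not part of SolrJ. The map argument stands in for the Map<String, Integer> returned by QueryResponse.getFacetQuery(), and the sketch assumes that map is keyed by the facet.query strings exactly as they were passed to addFacetQuery.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of SolrJ.
class FacetQueryCounts {

    // Builds the facet.query string for one value. It must be identical to the
    // string passed to SolrQuery.addFacetQuery, because the lookup below uses it as the key.
    static String facetQueryFor(String field, String value) {
        return field + ":\"" + value + "\"";
    }

    // facetQueryCounts stands in for QueryResponse.getFacetQuery()
    // (assumption: keyed by the facet.query strings as sent).
    static Map<String, Integer> countsByValue(String field, List<String> values,
                                              Map<String, Integer> facetQueryCounts) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String value : values) {
            Integer count = facetQueryCounts.get(facetQueryFor(field, value));
            // A value with no entry in the response is reported as zero.
            result.put(value, count == null ? 0 : count);
        }
        return result;
    }

    public static void main(String[] args) {
        // Simulated facet.query section of a response.
        Map<String, Integer> response = new LinkedHashMap<>();
        response.put("MyField:\"http://place.org/abc/def\"", 7);

        List<String> wanted = Arrays.asList("http://place.org/abc/def", "http://place.org/abc/xyz");
        System.out.println(countsByValue("MyField", wanted, response));
    }
}
```

With a real response, the third argument would be queryResponse.getFacetQuery(), and no addFacetField call is needed to get counts for just the listed values.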
Building a facet query in SolrJ
Hi - I'm trying to do a (I think) simple facet query, but I'm not getting the results I expect. I have a field, MyField, and I want to get facets for specific values of that field. That is, I want a FacetField if MyField is ABC, DEF, etc. (a specific list of values), but not if MyField is any other value.

If I build my query like this:

    SolrQuery query = new SolrQuery( luceneQueryStr );
    query.setStart( request.getStartIndex() );
    query.setRows( request.getMaxResults() );
    query.setFacet(true);
    query.setFacetMinCount(1);
    query.addFacetField(MYFIELD);
    for (String fieldValue : desiredFieldValues) {
        query.addFacetQuery(MYFIELD + ":" + fieldValue);
    }

queryResponse.getFacetFields returns facets for ALL values of MyField. I figured that was because setting the facet field with addFacetField caused Solr to examine all values. But if I take out that line, then getFacetFields returns an empty list.

I'm sure I'm doing something simple wrong, but I'm out of ideas right now.

-Rich
RE: Building a facet query in SolrJ
Oops. I think I found it. My desiredFieldValues list has the wrong info. Knew there was something simple wrong.

From: Simon, Richard T
Sent: Wednesday, August 10, 2011 10:55 AM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: Building a facet query in SolrJ

[...]
RE: Building a facet query in SolrJ
I take it back. I didn't find it. I corrected my values and the facet queries still don't find what I want.

The values I'm looking for are URIs, so they look like: http://place.org/abc/def

I add the facet query like so:

    query.addFacetQuery(MyField + ":" + "\"" + uri + "\"");

I print the query, just to see what it is:

    Facet Query: MyField:"http://place.org/abc/def"

But when I examine queryResponse.getFacetFields, it's an empty list if I do not set the facet field. If I set the facet field to MyField, then I get facets for ALL the values of MyField, not just the ones in the facet queries.

Can anyone help here?

Thanks.

From: Simon, Richard T
Sent: Wednesday, August 10, 2011 11:07 AM
To: Simon, Richard T; solr-user@lucene.apache.org
Subject: RE: Building a facet query in SolrJ

[...]
RE: Building a facet query in SolrJ
Hi -- I do get facets for all the values of MyField when I specify the facet field, but that's not what I want. I just want facets for a subset of the values of MyField. That's why I'm trying to use the facet queries, to just get facets for those values.

-Rich

-----Original Message-----
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: Wednesday, August 10, 2011 2:04 PM
To: solr-user@lucene.apache.org
Subject: Re: Building a facet query in SolrJ

Try making your queries manually, to see this closer in action...

    q=MyField:uri

and see what you get. In this case, because your URI contains characters that make the default query parser unhappy, do this sort of query instead:

    {!term f=MyField}uri

That way the query is parsed properly into a single term query.

I am a little confused below, since you're faceting on MyField entirely (addFacetField), where you'd get the values of each URI facet query in that list anyway.

Erik

On Aug 10, 2011, at 13:42, Simon, Richard T wrote:

[...]
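The two ways of writing the facet query discussed in this thread can be sketched as plain string builders. The class FacetQueryStrings and its method names are hypothetical. The first method produces the {!term f=MyField}uri form suggested in the reply above; the second is an assumed alternative that phrase-quotes the value for the default parser and escapes backslashes and embedded quotes.

```java
// Hypothetical helper, not part of SolrJ.
class FacetQueryStrings {

    // Builds the local-params form from the reply above: {!term f=FIELD}VALUE.
    // The value is appended verbatim, with no quoting or escaping.
    static String termQuery(String field, String value) {
        return "{!term f=" + field + "}" + value;
    }

    // Alternative (an assumption, not from the thread): wrap the value in double
    // quotes for the default query parser, escaping backslashes and quotes first.
    static String quotedQuery(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        String uri = "http://place.org/abc/def";
        System.out.println(termQuery("MyField", uri));
        System.out.println(quotedQuery("MyField", uri));
    }
}
```

Either string could then be passed to query.addFacetQuery(...); the term form avoids having to think about which characters in a URI the default parser treats specially.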
Highlighting map use unique key field?
Hi - A simple yes or no question, I think. I want to retrieve highlighting results from a QueryResponse. I know to use the following:

    Map<String, Map<String, List<String>>> highlighting = resp.getHighlighting();

Most of the examples I've seen use the document uid to extract the results, like so:

    String key = (String) resultDoc.getFieldValue(UID_FIELD);
    Map<String, List<String>> map = highlighting.get(key);

I think this is the right way to go. However, I did see one code example that did things a bit differently: they defined a field to query and then used that field as the id field, like so:

    solrQuery.setParam("fl", "query");
    ... perform query ...
    String id = (String) resultDoc.getFieldValue("query");
    Map<String, List<String>> highlightSnippets = queryResponse.getHighlighting().get(id);

Our documents have no unique field right now. I can create one rather easily. However, because of the above example, I've been asked to confirm that the map returned by highlighting requires/uses the unique key field defined in the schema.

So, yes or no: does the highlighting map require a unique key field? ("Yes" could mean "well, there are obscure ways to avoid it, but using the unique key is easier/better/more common".)

Thanks,

-Rich
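A small sketch of the lookup pattern from the examples quoted above, in plain Java. HighlightLookup and snippetsFor are hypothetical names. The outer map stands in for the result of QueryResponse.getHighlighting(), and the sketch assumes its keys are the documents' unique key values, which is what the first example relies on; the thread itself does not confirm that.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of SolrJ.
class HighlightLookup {

    // highlighting: document key -> (field name -> snippets).
    // Returns an empty list when the document or the field has no snippets.
    static List<String> snippetsFor(Map<String, Map<String, List<String>>> highlighting,
                                    String docKey, String field) {
        Map<String, List<String>> perField = highlighting.get(docKey);
        if (perField == null) {
            return Collections.emptyList();
        }
        List<String> snippets = perField.get(field);
        return snippets == null ? Collections.<String>emptyList() : snippets;
    }

    public static void main(String[] args) {
        // Simulated highlighting section keyed by a unique id (assumed).
        Map<String, List<String>> fields = new HashMap<>();
        fields.put("entity_label", Arrays.asList("a <em>match</em> here"));
        Map<String, Map<String, List<String>>> highlighting = new HashMap<>();
        highlighting.put("doc-1", fields);

        System.out.println(snippetsFor(highlighting, "doc-1", "entity_label"));
        System.out.println(snippetsFor(highlighting, "doc-2", "entity_label"));
    }
}
```

If two documents shared the same key value, their entries would collide in the outer map, which is one practical reason to key on a field that is unique per document.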
RE: getFieldValue always returns an ArrayList?
Interesting. You guessed right. I changed multivalued to multiValued and all of a sudden I get Strings. But doesn't multivalued default to false? In my schema, I originally did not set multivalued. I only put in multivalued=false after I experienced this issue.

-Rich

For the record, I had a number of fields which never had settings for multivalued, because none of them were multivalued and I expected the default to be false. When I experienced this problem, I added multivalued=false to all of them. I still had the problem. So I added a method to deal with the returned ArrayLists:

    private Object getFieldValue(String field, SolrDocument document) {
        ArrayList<?> list = (ArrayList<?>) document.getFieldValue(field);
        return list.get(0);
    }

I deliberately did not test whether the returned Object was an ArrayList, because I wanted to get an exception if any of them were Strings; I got no exceptions, so they were all returned as ArrayLists. I then changed one of the fields to use multiValued=false, and I got an exception trying to cast String to ArrayList! So I changed all the troublesome fields to use multiValued, and changed my helper method to look like this:

    private Object getFieldValue(String field, SolrDocument document) {
        Object o = document.getFieldValue(field);
        if (o instanceof ArrayList) {
            System.out.println("### Field " + field + " is an instance of ArrayList.");
            ArrayList<?> list = (ArrayList<?>) o;
            return list.get(0);
        } else {
            if (!(o instanceof String)) {
                System.out.println("## ERROR");
            } else {
                System.out.println("### Field " + field + " is an instance of String.");
            }
            return o;
        }
    }

Here's the output, interspersed with the schema definitions of the fields:

    <field name="uri" type="string" indexed="true" stored="true" multiValued="false" required="true" />
    ### Field uri is an instance of String.

    <field name="entity_label" type="string" indexed="false" stored="true" required="false" />
    ### Field entity_label is an instance of ArrayList.

    <field name="institution_uri" type="string" indexed="true" stored="true" required="false" />
    ### Field institution_uri is an instance of ArrayList.

    <field name="asserted_type_uri" type="string" indexed="true" stored="true" required="false" />
    ### Field asserted_type_uri is an instance of ArrayList.

    <field name="asserted_type_label" type="text_eaglei" indexed="true" stored="true" required="false" />
    ### Field asserted_type_label is an instance of ArrayList.

    <field name="provider_uri" type="string" indexed="true" stored="true" multiValued="false" required="false" />
    ### Field provider_uri is an instance of String.

    <field name="provider_label" type="string" indexed="true" stored="true" multiValued="false" required="false" />
    ### Field provider_label is an instance of String.

As you can see, the ones with no declaration for multivalued are returned as ArrayLists, while the ones with multiValued=false are returned as Strings.

So it looks like there are two problems here: multivalued (small v) is not recognized, since using that in the schema still causes all fields to be returned as ArrayLists; and multivalued does not default to false (or, at least, not setting it causes a field to be returned as an ArrayList, as though it were set to true).

-Rich

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, June 15, 2011 4:25 PM
To: solr-user@lucene.apache.org
Subject: Re: getFieldValue always returns an ArrayList?

Hmmm, I admit I'm not using embedded, and I'm using 3.2, but I'm not seeing the behavior you are.

My question about reindexing could have been better stated. I was just making sure you didn't have some leftover cruft where your field was multi-valued from previous experiments, but if you're reindexing each time that's not the problem.

Arrrh, camel case may be striking again. Try multiValued, not multivalued.

If that's still not it, can we see the code?

Best
Erick

On Wed, Jun 15, 2011 at 3:47 PM, Simon, Richard T <richard_si...@hms.harvard.edu> wrote:

[...]
RE: getFieldValue always returns an ArrayList?
FYI: Using multiValued=false for all string fields results in the following output:

    ### Field uri is an instance of String.
    ### Field entity_label is an instance of String.
    ### Field institution_uri is an instance of String.
    ### Field asserted_type_uri is an instance of String.
    ### Field asserted_type_label is an instance of String.
    ### Field provider_uri is an instance of String.
    ### Field provider_label is an instance of String.

-Rich

-----Original Message-----
From: Simon, Richard T
Sent: Thursday, June 16, 2011 10:08 AM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: RE: getFieldValue always returns an ArrayList?

[...]
RE: getFieldValue always returns an ArrayList?
We haven't changed Solr versions. We've been using 3.1.0 all along.

Plus, I have some code that runs during indexing and retrieves the fields from a SolrInputDocument, rather than a SolrDocument. That code gets Strings without any problem, and always has, even without saying multiValued=false.

-Rich

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Thursday, June 16, 2011 2:18 PM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: RE: getFieldValue always returns an ArrayList?

: and all of a sudden I get Strings. But, doesn't multivalued default to
: false? In my schema, I originally did not set multivalued. I only put in
: multivalued=false after I experienced this issue.

That's dependent on the version of Solr, and this is where the version property of the schema comes in. (As the default behavior in Solr changes, it does so dependent on what version you specify in your schema, to prevent radical behavior changes if you upgrade but keep the same configs)...

    <schema name="example" version="1.4">
      <!-- attribute "name" is the name of this schema and is only used for display purposes.
           Applications should change this to reflect the nature of the search collection.
           version="1.4" is Solr's version number for the schema syntax and semantics.  It should
           not normally be changed by applications.
           1.0: multiValued attribute did not exist, all fields are multiValued by nature
           1.1: multiValued attribute introduced, false by default
           1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields.
           1.3: removed optional field compress feature
           1.4: default auto-phrase (QueryParser feature) to off
         -->

-Hoss
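The version rule in the schema comment above can be modeled in a few lines. This is a simplified sketch of the described behavior, not Solr's actual code; SchemaDefaults and effectiveMultiValued are hypothetical names. A null "declared" value represents a field with no multiValued attribute, which per this thread also covers a misspelled multivalued attribute that Solr does not recognize.

```java
// Simplified model of the rule quoted above, not Solr's implementation.
class SchemaDefaults {

    // schemaVersion: the version attribute of <schema>, e.g. "1.0" or "1.4".
    // declared: the field's multiValued attribute, or null when it is absent.
    static boolean effectiveMultiValued(String schemaVersion, Boolean declared) {
        if (declared != null) {
            return declared;
        }
        // Per the comment: before 1.1 all fields are multiValued by nature;
        // from 1.1 on, multiValued is false by default.
        return Float.parseFloat(schemaVersion) < 1.1f;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMultiValued("1.0", null));          // undeclared, old schema version
        System.out.println(effectiveMultiValued("1.4", null));          // undeclared, newer schema version
        System.out.println(effectiveMultiValued("1.0", Boolean.FALSE)); // explicitly multiValued="false"
    }
}
```

Under this model, a schema declaring version 1.0 returns undeclared fields as multi-valued, which matches the ArrayList results reported earlier in the thread.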
RE: getFieldValue always returns an ArrayList?
Ah! That was the problem. The version was 1.0. I'll change it to 1.2.

Thanks!

-Rich

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Thursday, June 16, 2011 2:33 PM
To: Simon, Richard T
Cc: solr-user@lucene.apache.org
Subject: RE: getFieldValue always returns an ArrayList?

: We haven't changed Solr versions. We've been using 3.1.0 all along.

But that's not what I'm talking about. I'm talking about the schema version ... a specific property declared in your schema.xml file. Did you check it?

(Even when people start with Solr X, they sometimes are using schema.xml files provided by external packages -- Drupal, WordPress, etc. -- and don't notice that those are from older versions.)

: Plus, I have some code that runs during indexing and retrieves the
: fields from a SolrInputDocument, rather than a SolrDocument. That code
: gets Strings without any problem, and always has, even without saying
: multiValued=false.

SolrInputDocuments are irrelevant. They are used to index data, but they don't know anything about the schema. A SolrInputDocument might be completely invalid because of multiple values for single-valued fields, or missing values for required fields, etc. What comes back from a search *is* consistent with the schema (even when there is only one value stored in a multiValued field).

-Hoss
getFieldValue always returns an ArrayList?
Hi - I am examining a SolrDocument I retrieved through a query. The field I am looking at is declared this way in my schema:

    <field name="uri" type="string" indexed="true" stored="true" multivalued="false" required="true" />

I know multivalued defaults to false, but I set it explicitly because I'm seeing some unexpected behavior. I retrieve the value of the field like so:

    final String resource = (String) document.getFieldValue("uri");

However, I get an exception because an ArrayList is returned. I confirmed that the returned ArrayList has one element with the correct value, but I thought getFieldValue would return a String if the field is single-valued. When I index the document, I have some code that retrieves the same field in the same way from the SolrInputDocument, and that code works.

I looked at the code for SolrDocument.setField, and it looks like the only way a field should be set to an ArrayList is if one is passed in by the code creating the SolrDocument. Why would it do that if the field is not multivalued?

Is this behavior expected?

-Rich
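A defensive way to read such a field, sketched in plain Java: accept whatever getFieldValue returned and take the first element when it is a collection. FieldValues, firstValue and firstString are hypothetical names, and the Object argument stands in for the result of document.getFieldValue("uri"). SolrJ's SolrDocument also has a getFirstValue(String) method that, as far as I know, serves a similar purpose.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper, not part of SolrJ.
class FieldValues {

    // raw stands in for the Object returned by SolrDocument.getFieldValue(name).
    // Returns the first element of a collection, the value itself otherwise,
    // and null for a null value or an empty collection.
    static Object firstValue(Object raw) {
        if (raw instanceof Collection) {
            Collection<?> values = (Collection<?>) raw;
            return values.isEmpty() ? null : values.iterator().next();
        }
        return raw;
    }

    // Convenience wrapper for string fields.
    static String firstString(Object raw) {
        Object value = firstValue(raw);
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstString(Arrays.asList("http://place.org/abc/def")));
        System.out.println(firstString("http://place.org/abc/def"));
    }
}
```

This hides the symptom rather than the cause, so it is best treated as a stopgap while the schema declaration is being checked.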
RE: getFieldValue always returns an ArrayList?
We rebuild the index from scratch each time we start (for now). The fields in question are not multi-valued; in fact, I explicitly set multi-valued to false, just to be sure.

Yes, this is SolrJ, using the embedded server, if that matters.

Using Solr/Lucene 3.1.0.

-Rich

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, June 15, 2011 3:44 PM
To: solr-user@lucene.apache.org
Subject: Re: getFieldValue always returns an ArrayList?

Did you perhaps change the schema but not re-index? I'm grasping at straws here, but something like this might happen if part of your index has that field as a multi-valued field.

If that's not the problem, what version of Solr are you using? I presume this is SolrJ?

Best
Erick

On Wed, Jun 15, 2011 at 2:21 PM, Simon, Richard T <richard_si...@hms.harvard.edu> wrote:

[...]
Solr Newbie: Starting embedded server with multicore
I'm just starting with Solr. I'm using Solr 3.1.0, and I want to use EmbeddedSolrServer with a multicore setup, even though I currently have only one core (various documents I read suggest starting that way even if you have one core, to get the better administrative tools supported by multicore).

I have two questions:

1. Does the first code sample below start the server with multicore or not?
2. Why does the first sample work when the second does not?

My solr.xml looks like this:

    <solr persistent="true">
      <cores adminPath="/admin/cores" defaultCoreName="mycore" sharedLib="lib">
        <core name="mycore" instanceDir="mycore" />
      </cores>
    </solr>

It's in a directory called solrhome in war/WEB-INF.

I can get the server to come up cleanly if I follow an example in the Packt Solr book (p. 231), but I'm not sure if this enables multicore or not:

    File solrXML = new File("war/WEB-INF/solrhome/solr.xml");
    String solrHome = solrXML.getParentFile().getAbsolutePath();
    String dataDir = solrHome + "/data";
    coreContainer = new CoreContainer(solrHome);
    SolrConfig solrConfig = new SolrConfig(solrHome, "solrconfig.xml", null);
    CoreDescriptor coreDescriptor = new CoreDescriptor(coreContainer, "mycore", solrHome);
    SolrCore solrCore = new SolrCore("mycore", dataDir + "/" + "mycore", solrConfig, null, coreDescriptor);
    coreContainer.register(solrCore, false);
    embeddedSolr = new EmbeddedSolrServer(coreContainer, "mycore");

The documentation on the Solr wiki says I should configure the EmbeddedSolrServer for multicore like this:

    File home = new File( "/path/to/solr/home" );
    File f = new File( home, "solr.xml" );
    CoreContainer container = new CoreContainer();
    container.load( "/path/to/solr/home", f );
    EmbeddedSolrServer server = new EmbeddedSolrServer( container, "core name as defined in solr.xml" );

When I try to do this, I get an error saying that it cannot find solrconfig.xml:

    File solrXML = new File("war/WEB-INF/solrhome/solr.xml");
    String solrHome = solrXML.getParentFile().getAbsolutePath();
    coreContainer = new CoreContainer();
    coreContainer.load(solrHome, solrXML);
    embeddedSolr = new EmbeddedSolrServer(coreContainer, "mycore");

The message says it is looking in an odd place (I removed my user name from this). Why is it looking in solrhome/mycore/conf for solrconfig.xml? Both that and my schema.xml are in solrhome/conf. How can I point it at the right place? I tried adding REMOVED\workspace-Solr\institution-webapp\war\WEB-INF\solrhome\conf to the classpath, but got the same result:

    SEVERE: java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'REMOVED\workspace-Solr\institution-webapp\war\WEB-INF\solrhome\mycore\conf/', cwd=REMOVED\workspace-Solr\institution-webapp
        at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:268)
        at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:234)
        at org.apache.solr.core.Config.<init>(Config.java:141)
        at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:132)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:430)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
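The path in the error message is consistent with the container resolving each core's configuration as solr home + instanceDir + conf. The sketch below only illustrates that path arithmetic with java.nio; CoreConfigPath and expectedSolrConfig are hypothetical names, and the resolution rule is an assumption inferred from the error text rather than taken from Solr's source.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of where solrconfig.xml would be looked up,
// assuming the rule <solrHome>/<instanceDir>/conf/solrconfig.xml.
class CoreConfigPath {

    static Path expectedSolrConfig(String solrHome, String instanceDir) {
        return Paths.get(solrHome)
                .resolve(instanceDir)
                .resolve("conf")
                .resolve("solrconfig.xml")
                .normalize(); // collapses "." segments such as instanceDir="."
    }

    public static void main(String[] args) {
        // instanceDir="mycore", as in the solr.xml shown above.
        System.out.println(expectedSolrConfig("war/WEB-INF/solrhome", "mycore"));
        // instanceDir=".", which would point at solrhome/conf under this assumption.
        System.out.println(expectedSolrConfig("war/WEB-INF/solrhome", "."));
    }
}
```

Under that assumption, either moving conf into solrhome/mycore or setting instanceDir to "." in solr.xml would make the lookup path match the files on disk; neither is confirmed in this thread. The first code sample presumably works because it hands solrHome to SolrConfig and CoreDescriptor directly, so solr.xml and its instanceDir are never consulted.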