Error while sorting by geo_distance in Solr 1.4
Hi All, I am using Solr 1.4 (10 November 2009 release), Lucene core 2.9.2, LocalSolr 2.0, LocalLucene 2.0, Tomcat 5.5. I get the following error when I try to sort the result by geo_distance. Here is the stacktrace... SEVERE: java.lang.NullPointerException at org.apache.lucene.search.SortField.getComparator(SortField.java:496) at org.apache.lucene.search.FieldValueHitQueue$OneComparatorFieldValueHitQueue.<init>(FieldValueHitQueue.java:79) at org.apache.lucene.search.FieldValueHitQueue.create(FieldValueHitQueue.java:192) at org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:886) at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:981) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884) at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341) at org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:1161) at com.pjaol.search.solr.component.LocalSolrQueryComponent.process(LocalSolrQueryComponent.java:286) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) I commented the following line in org.apache.lucene.search.SortField: return comparatorSource.newComparator(field, numHits, sortPos, reverse); and added the following line temporarily: return new FieldComparator.DoubleComparator(numHits, field, parser); I also changed the geo_distance type to double from sdouble in my schema.xml. Then it works. But I think it's a bug in the Lucene 2.9.2 code; it will always give an NPE. Here are the original two lines from the file org.apache.lucene.search.SortField [getComparator() method, Lucene 2.9.2]...
assert factory == null && comparatorSource != null; return comparatorSource.newComparator(field, numHits, sortPos, reverse); And factory = com.pjaol.search.geo.utils.distancesortsou...@1568654 while comparatorSource = null always. So it always calls newComparator() on null. Can anyone help me with this, please? Thank you very much for your support. Regards, Sandeep -- View this message in context: http://n3.nabble.com/Error-while-sorting-by-geo-distance-in-Solr-1-4-tp715415p715415.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to make SOLR display empty value attributes also
I guess you can achieve this with DataImportHandler Transformers.
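For what it's worth, the heart of such a transformer is just a per-row fix-up: put an empty string into any expected column that came back null, so the field gets indexed and returned even when the source value is empty. Here is a pure-Java sketch of only that logic (the column names are invented; a real version would extend org.apache.solr.handler.dataimport.Transformer and be referenced from the entity's transformer attribute):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of what a DIH transformer's transformRow logic could do:
// ensure every expected column is present, substituting an empty
// string for missing/null values so Solr indexes and returns it.
public class EmptyValueFiller {
    // hypothetical column names; replace with your entity's fields
    static final String[] EXPECTED = {"id", "name", "description"};

    public static Map<String, Object> transformRow(Map<String, Object> row) {
        for (String col : EXPECTED) {
            if (row.get(col) == null) {
                row.put(col, "");
            }
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<String, Object>();
        row.put("id", "1");
        row.put("name", null);
        System.out.println(transformRow(row));
    }
}
```

Whether an empty string (as opposed to omitting the field) is the right sentinel depends on your schema and your clients.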
SolRJ 1.4 : java.lang.VerifyError: class org.apache.solr.search.SolrIndexReader overrides final method setNorm.(ILjava/lang/String;B)V
Hi, I'm starting to use Solr 1.4 queried by SolrJ 1.4 (all official releases that I've downloaded from the main link on the web site: http://www.apache.org/dyn/closer.cgi/lucene/solr/ mirror : http://apache.multidist.com/lucene/solr/1.4.0/ downloaded the .zip file.) The servers start OK and the class CommonsHttpSolrServer works fine. But when running the class EmbeddedSolrServer to do the same basic test, it fails with the following compilation-like exception: 09:55:59,859 INFO ~ Adding 'file:/C:/Projets_RD/WorkspacePlay/apache-solr-1.4.0/contrib/clustering/lib/jackson-mapper-asl-0.9.9-6.jar' to classloader 09:55:59,859 INFO ~ Adding 'file:/C:/Projets_RD/WorkspacePlay/apache-solr-1.4.0/contrib/clustering/lib/log4j-1.2.14.jar' to classloader Exception in thread "main" java.lang.VerifyError: class org.apache.solr.search.SolrIndexReader overrides final method setNorm.(ILjava/lang/String;B)V at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(Unknown Source) at java.security.SecureClassLoader.defineClass(Unknown Source) at java.net.URLClassLoader.defineClass(Unknown Source) at java.net.URLClassLoader.access$000(Unknown Source) at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClassInternal(Unknown Source) at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:166) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:134) at util.UtilSolR.getEmbeddedSolRServer(UtilSolR.java:65) at util.UtilSolR.main(UtilSolR.java:29) Am I doing something wrong? I'm not a newbie in Java, and this exception seems more like a compilation problem than a config or usage problem.
In my classpath, I've imported the jars from the folder \apache-solr-1.4.0\dist. The class SolrIndexReader extends the class FilterIndexReader. It effectively overrides the following method: @Override public void setNorm(int doc, String field, byte value) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException { in.setNorm(doc, field, value); } The overridden method seems to be in a Lucene jar: org.apache.lucene.index.FilterIndexReader or org.apache.lucene.index.IndexReader. Does this problem come from a build/packaging error, like an incompatibility between jars that have been packaged together?? I do not have the source code of the Lucene library... so I can't check where the problem exactly comes from... Can someone help me please?? I really don't know what to do to make that work... I'm going to try to download from other mirrors, but I guess that obviously they will provide me with the exact same package. Thanks. Damien PS: Here is my test code: package util; import java.io.IOException; import java.net.MalformedURLException; import java.util.ArrayList; import java.util.Collection; import javax.xml.parsers.ParserConfigurationException; import org.apache.solr.client.solrj.SolrServer; import org.apache.solr.client.solrj.SolrServerException; import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer; import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer; import org.apache.solr.common.SolrInputDocument; import org.apache.solr.core.CoreContainer; import org.xml.sax.SAXException; public class UtilSolR { private static EmbeddedSolrServer embeddedSolrServer = null; private static SolrServer httpSolrServer = null; /** * @param args */ public static void main(String[] args) { //SolrServer server = getHttpSolRServer(); SolrServer server = getEmbeddedSolRServer(); SolrInputDocument doc1 = new SolrInputDocument(); doc1.addField( "id", "id1", 1.0f ); doc1.addField( "name", "doc1", 1.0f ); doc1.addField( "price", 10 );
SolrInputDocument doc2 = new SolrInputDocument(); doc2.addField( "id", "id2", 1.0f ); doc2.addField( "name", "doc2", 1.0f ); doc2.addField( "price", 20 ); Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>(); docs.add( doc1 ); docs.add( doc2 ); try { server.add( docs ); server.commit(); } catch (SolrServerException e) { // TODO Auto-generated catch block e.printStackTrace(); } catch (IOException e) { // TODO Auto-generated catch block e.printStackTrace(); } System.out.println("Done !!"); } public static EmbeddedSolrServer getEmbeddedSolRServer(){ if(embeddedSolrServer == null){ CoreContainer coreContainer; System.setProperty("solr.solr.home", "C:\\Projets_RD\\WorkspacePlay\\apache-solr-1.4.0\\example\\solr"); CoreContainer.Initializer initializer = new CoreContainer.Initializer();
Re: Multi-core memory problem
Thanks! We enlarged the max heap size and it looks ok so far. On Fri, Apr 9, 2010 at 4:23 AM, Lance Norskog goks...@gmail.com wrote: Since the facet cache is hard-allocated and has no eviction policy, you could do a facet query on each core as part of the warm-up. This way, the facets will not fail. At that point, you can tune the Solr cache sizes. Solr caches documents, searches, and filter queries. Filter queries are sets of documents cached as a bitmap, one bit for every document (I think). Searches are cached as a sorted list of document numbers (they include the relevance order, where filters don't care). Documents cache all of the fields in a document. On Thu, Apr 8, 2010 at 3:46 AM, Victoria Kagansky victoria.kagan...@gmail.com wrote: I noticed now that the OutOfMemory exception occurs upon faceting queries. Queries without facets do return successfully. There are two log types upon the exception. The queries causing them differ only in the q parameter; the faceting and sorting parameters are the same. I guess this has something to do with the result set size influencing the faceting mechanism.
1) Apr 8, 2010 9:18:21 AM org.apache.solr.common.SolrException log SEVERE: java.lang.OutOfMemoryError: Java heap space 2) Apr 8, 2010 9:18:13 AM org.apache.solr.common.SolrException log SEVERE: java.lang.OutOfMemoryError: Java heap space at org.apache.solr.request.UnInvertedField.uninvert(UnInvertedField.java:191) at org.apache.solr.request.UnInvertedField.<init>(UnInvertedField.java:178) at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:839) at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:250) at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283) at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166) at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:567) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:859) at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:574) at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1527) at java.lang.Thread.run(Thread.java:619) On Thu, Apr 8, 2010 at 9:51 AM, Victoria Kagansky victoria.kagan...@gmail.com wrote: The queries do require sorting (on int) and faceting. They should fetch the first 200 docs. The current problematic core has 10 entries in fieldCache and 5 entries in filterCache. The other caches are empty. Is there any way to know how much memory a specific cache takes? The problem is that one core behaves well, while the other one throws OutOfMemory exceptions right from the restart. This behavior is consistent if I switch the order of the cores' initialization. It feels like the core initialized second has no memory resources assigned. On Thu, Apr 8, 2010 at 4:26 AM, Lance Norskog goks...@gmail.com wrote: Sorting takes memory. What data types are the fields sorted on? If they're strings, that could be a space-eater. If they are ints or dates, not a problem. Do the queries pull all of the documents found? Or do they just fetch, for example, the first 10 documents? What are the cache statistics like? Can they be shrunk? The stats are shown on the Statistics page off of the main solr/admin page. Facets come from something called the Lucene Field Cache, which is not controlled out of Solr. It has no eviction policy. When you do a facet request, the memory used to load up the facets
Internal Server Error
Hello, I have the following piece of code: ContentStreamUpdateRequest contentUpdateRequest = new ContentStreamUpdateRequest("/update/extract"); contentUpdateRequest.addFile(new File(contentFileName)); contentUpdateRequest.setParam("extractOnly", "true"); NamedList result = solrServerSession.request(contentUpdateRequest); This is throwing the following error: org.apache.solr.common.SolrException: Internal Server Error Internal Server Error request: http://localhost:8080/solr/update/extract?extractOnly=true&wt=javabin&version=1 [Apr 13, 2010 4:25:23 PM (IndexThread-1_9)]:at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424) [Apr 13, 2010 4:25:23 PM (IndexThread-1_9)]:at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243) I have Solr 1.4 set up on Tomcat 6.0.26. There is no detailed stack trace / logs available. Could somebody please let me know what might be the issue? Thanks, Sandhya
change get to post ??
Hello. My client uses my autocompletion with a normal HTTP request to Solr, like this: http://XXX/solr/suggestpg/select/?q=harry So, when I want to search in a category with all its childs, my request is too long. How can I change from GET to POST? My request to Solr looks like this, in short ;) http://XXX/solr/suggestpg/select/?q=harry&fq=cat_id:7994+OR+cat_id:7995+OR+cat_id:8375+OR+cat_id:8465+OR+cat_id:8757+OR+cat_id:8766+OR+cat_id:8792+OR+cat_id:8843 ... thx for fast support ;) it's very important ^^
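One way around the URL length limit: Solr's select handler also accepts the same parameters as an application/x-www-form-urlencoded POST body. A JDK-only sketch (the helper class is hypothetical; the field name cat_id and ids come from the request above) that builds the body you would POST to http://XXX/solr/suggestpg/select:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Build the q/fq parameters as a POST body instead of a query string.
// POSTing this body avoids the container's URL length limit.
public class PostBodyBuilder {
    public static String buildBody(String q, int[] catIds) throws UnsupportedEncodingException {
        StringBuilder fq = new StringBuilder();
        for (int i = 0; i < catIds.length; i++) {
            if (i > 0) fq.append(" OR ");
            fq.append("cat_id:").append(catIds[i]);
        }
        // URLEncoder produces x-www-form-urlencoded output (space -> '+')
        return "q=" + URLEncoder.encode(q, "UTF-8")
             + "&fq=" + URLEncoder.encode(fq.toString(), "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildBody("harry", new int[]{7994, 7995}));
        // -> q=harry&fq=cat_id%3A7994+OR+cat_id%3A7995
    }
}
```

The resulting string can be written to an HttpURLConnection opened on the select URL with Content-Type: application/x-www-form-urlencoded; with SolrJ, passing SolrRequest.METHOD.POST to query() should achieve the same without hand-encoding.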
Combining Dismax and payload boosting
Hi, We are using payloads for score boosting. For this purpose we've implemented a custom boosting QueryParser and similarity function. We followed http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ . On the other hand, we'd like to use dismax query handling because of its benefits when searching across several fields. How can we make dismax use our custom QueryParser? Thanks!
Re: change get to post ??
Hi, the problem is not the GET request type; the problem is that you build a far too complicated query. This won't scale well and looks rather weird. Why don't you just add all parent category ids to every document at index time? Then you could simply filter your request with the topmost category id, and no more. If you want additional queries filtered by some single category only, then you maybe should define two fields for the category ids: one with a single id token, and another with the ids of all sub-categories. -Kuli Am 13.04.2010 13:07, schrieb stockii: Hello. My client uses my autocompletion with an normal http-Request to solr. like this: http://XXX/solr/suggestpg/select/?q=harry so, when i want to search in a category with all his childs, my request is too long. How can i change from GET to POST ?? my request to solr looks like this. in short ;) http://XXX/solr/suggestpg/select/?q=harry&fq=cat_id:7994+OR+cat_id:7995+OR+cat_id:8375+OR+cat_id:8465+OR+cat_id:8757+OR+cat_id:8766+OR+cat_id:8792+OR+cat_id:8843 . thx, for fast support ;) its very important ^^
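To make the index-time suggestion concrete: if the importer walks the category tree upward and stores the category plus all of its ancestors in a multi-valued field (call it cat_path; the name is made up), then a single filter such as fq=cat_path:5 matches every item in category 5 or any of its descendants. A pure-Java sketch of the ancestor walk:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// For each document, index the category itself plus all of its ancestors
// in a multi-valued field; then fq=cat_path:<id> matches the whole subtree.
public class CategoryAncestors {
    // parentOf maps child id -> parent id; top-level categories are absent.
    // Assumes the category graph is a tree (no cycles).
    static List<Integer> ancestorsAndSelf(int catId, Map<Integer, Integer> parentOf) {
        List<Integer> ids = new ArrayList<Integer>();
        Integer cur = catId;
        while (cur != null) {
            ids.add(cur);
            cur = parentOf.get(cur);
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> parentOf = new HashMap<Integer, Integer>();
        parentOf.put(5, 4); // category 5's parent is 4
        parentOf.put(4, 3);
        System.out.println(ancestorsAndSelf(5, parentOf)); // [5, 4, 3]
    }
}
```

Each document would then be indexed with every id from this list in cat_path, so the filter query stays one term no matter how deep the tree is.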
[Job] Software Developers With Solr Experience (Cambridge, MA USA)
Hi folks, Our company, a funded startup based in Cambridge, is hiring senior developers with Solr experience. See our ad posted on Startuply: http://www.startuply.com/Jobs/Software_Developer_Server__1973_1.aspx Thanks, Chris
Re: change get to post ??
You need to change the way your data is imported, or look for an alternative way to build your query. It depends on your data model and your import mechanism. Do you really have hundreds of categories? BTW, childs is amusing! ;-) -Michael Am 13.04.2010 14:12, schrieb stockii: hi. thx for reply =) okay i think im little bit stupid. i dont know how can i filter the right categorys. i get only the id of one category. i imported every childs for each item and the parent_category_id. but it ist only one parent for each item and not for each category. so an example for dummies ;-) DOC: id:1; cat_id:5; cat_parent:4; cat_childs: 3,2,1 so how can i filter so that i get all items of categories 1,2,3,4 and 5 when my request is only: q=harry potter; fq=cat_id:5 thx i dont see it XD
Re: Combining Dismax and payload boosting
Victoria, An example of specifically what types of queries you'd like to do would be helpful. Using nested queries you can leverage dismax and your custom query parser together, which may be what you're looking for. See this article for details on nested queries: http://www.lucidimagination.com/blog/2009/03/31/nested-queries-in-solr/ Also, I'm curious about your query parser that uses payloads. How does it work? There's a PayloadTermQuery parser attached to the following issue, and I'm wondering how your work might align with that implementation: https://issues.apache.org/jira/browse/SOLR-1485 Erik On Apr 13, 2010, at 7:14 AM, Victoria Kagansky wrote: Hi, We are using payloads for score boosting. For this purpose we've implemented custom boosting QueryParser and similarity function. We followed http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ . On the other hand, we'd like to use dismax query handling because of its benefits in several fields search. How can we make dismax use our custom QueryParser? Thanks!
Re: change get to post ??
heya... childs... ^^ hehe, not my schema of the database :P we have 2447 categories and it's gonna be more and more... some cats have 300 child-categories. what do you think is the best way to solve this problem? the problem is that the APP (iPhone app) doesn't know all the sub-categories. so, the app only sends in which cat he is and what he is searching for. i think i need to write my own handler?? or, how can i import the cat-data?
Re: change get to post ??
Hi, Am 13.04.2010 14:52, schrieb stockii: some cat, have 300 child-categories. And that's the reason why you shouldn't add them all to your filter query. or, how can i import the cat-data ? Again: How do you do it NOW? -Michael
limit rows by field
Hi, for a preview of results, I need to display up to 3 documents per category. Is it possible to limit the number of rows of the Solr response by field values? What I mean is:
rows: 9
- (sub)rows of field:cat1 : 3
- (sub)rows of field:cat2 : 3
- (sub)rows of field:cat3 : 3
If not, is there a workaround or do I have to send three queries? Thanks! felix
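Out of the box, Solr 1.4 has no per-field-value row limit (field collapsing exists only as a patch, SOLR-236), so besides sending one query per category, one workaround is to over-fetch a single result set and trim it client-side. A JDK-only sketch of that trimming (the helper is hypothetical; each entry pairs a category value with a doc id):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Client-side workaround: keep at most `limit` documents per category
// from one (over-fetched) result list, preserving result order.
public class PerCategoryLimit {
    public static List<String[]> limitPerCategory(List<String[]> docs, int limit) {
        // each doc is {categoryValue, docId}; counts tracks docs kept so far
        Map<String, Integer> counts = new HashMap<String, Integer>();
        List<String[]> kept = new ArrayList<String[]>();
        for (String[] doc : docs) {
            int seen = counts.containsKey(doc[0]) ? counts.get(doc[0]) : 0;
            if (seen < limit) {
                kept.add(doc);
                counts.put(doc[0], seen + 1);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> docs = new ArrayList<String[]>();
        docs.add(new String[]{"cat1", "d1"});
        docs.add(new String[]{"cat1", "d2"});
        docs.add(new String[]{"cat1", "d3"});
        docs.add(new String[]{"cat1", "d4"}); // dropped: cat1 already has 3
        docs.add(new String[]{"cat2", "d5"});
        System.out.println(limitPerCategory(docs, 3).size()); // 4
    }
}
```

The trade-off is fetching more rows than you display; with only three known categories, three filtered queries may well be simpler and cheaper.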
Re: change get to post ??
heya. okay, NOW i import from database with DIH. every item has a cat_id, nothing more. for the normal search it works to use fq and POST the search. but for my autosuggestion it didn't work, because our app does not use the autosuggestion with our API. because of better performance ...
Re: closest terms, sentence, boosting 'business' keywords instead of field ?
Do you have an example of what you are trying to do? For instance, a request like "tomcat servlet" should return a document which has "tomcat is a servlet container" rather than a document that has "tomcat offers the last specification implementation of the servlet technology"; at least the latter should not come before the former in the results. Boost in the query or during indexing? I think boosting in the query is more relevant. Thanks On 4/13/10, Grant Ingersoll gsing...@apache.org wrote: On Apr 12, 2010, at 7:57 PM, Abdelhamid ABID wrote: Hi, - I'm a bit confused about how the analyzer applies filters to a query. I know that they are applied in the order in which they are declared, but still, does the search result include only the final digest of the filter chain, or at each filter step does Solr add the matched term to the result set? It is the final output. - Does the Dismax request handler support quoted keywords? If not, how can I search for an exact sentence using dismax? It does. - How to match a request with the documents that only have keywords that appear in the closest positions. Do you have an example of what you are trying to do? - How can I boost a set of keywords instead of fields? This would be useful in the case of a document with one single searchable field, which is of type text and where boosting a field makes no sense. Boost in the query or during indexing? -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search -- Abdelhamid ABID Software Engineer- J2EE / WEB / ESB MULE
Displaying fieldValueCache stats in Solr 1.4 admin/stats page
Hi All, FieldValueCache stats are not getting displayed on the http://./solr/admin/stats.jsp page. I configured it in solrconfig.xml (Solr 1.4) as <fieldValueCache class="solr.FastLRUCache" size="512" autowarmCount="128" showItems="32"/> Any inputs on this? Thank you. Regards, Sandeep
Re: Error while sorting by geo_distance in Solr 1.4
Hi Sandeep, You're probably better off asking on the LocalSolr mailing list (I think there is one) or trying out the Solr trunk, which has much of this functionality incorporated in a more native manner. For docs on that, refer to http://wiki.apache.org/solr/SpatialSearch, but note it is not yet complete. -Grant On Apr 13, 2010, at 3:15 AM, SandeepTagore wrote: Hi All, I am using Solr 1.4 (10 November 2009 release), Lucene core 2.9.2, LocalSolr 2.0, LocalLucene 2.0, Tomcat 5.5 I get the following error when I try to sort the result by geo_distance. [...] Can anyone help me with this, please? Thank you very much for your support. Regards, Sandeep -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
Re: closest terms, sentence, boosting 'business' keywords instead of field ?
On Apr 13, 2010, at 10:02 AM, Abdelhamid ABID wrote: Do you have an example of what you are trying to do? For instance, a request like "tomcat servlet" should return a document which has "tomcat is a servlet container" rather than a document that has "tomcat offers the last specification implementation of the servlet technology"; at least the latter should not come before the former in the results. Using a phrase query with a high slop value should handle this, as the closer the terms are, the better the score for that match. Boost in the query or during indexing? I think boosting in the query is more relevant. Thanks On 4/13/10, Grant Ingersoll gsing...@apache.org wrote: On Apr 12, 2010, at 7:57 PM, Abdelhamid ABID wrote: Hi, - I'm a bit confused about how the analyzer applies filters to a query. I know that they are applied in the order in which they are declared, but still, does the search result include only the final digest of the filter chain, or at each filter step does Solr add the matched term to the result set? It is the final output. - Does the Dismax request handler support quoted keywords? If not, how can I search for an exact sentence using dismax? It does. - How to match a request with the documents that only have keywords that appear in the closest positions. Do you have an example of what you are trying to do? - How can I boost a set of keywords instead of fields? This would be useful in the case of a document with one single searchable field, which is of type text and where boosting a field makes no sense. Boost in the query or during indexing? You should be able to just boost the keyword using the ^ operator. -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
Re: Displaying fieldValueCache stats in Solr 1.4 admin/stats page
This is an implicit cache (if you don't define it, it will still exist and show up on stats.jsp). Can you be more specific about "FieldValueCache stats are not getting displayed"? If you start the example server, go to the stats page, and search for fieldValueCache, is it there? Or do you mean that it's there but you think the numbers are wrong? -Yonik Apache Lucene Eurocon 2010 18-21 May 2010 | Prague On Tue, Apr 13, 2010 at 10:07 AM, SandeepTagore sandeep.tag...@gmail.com wrote: Hi All, FieldValueCache stats are not getting displayed on the http://./solr/admin/stats.jsp page. I configured it in solrconfig.xml (Solr 1.4) as <fieldValueCache class="solr.FastLRUCache" size="512" autowarmCount="128" showItems="32"/> Any inputs on this? Thank you. Regards, Sandeep
Re: problem with RegexTransformer and delimited data
Thanks guys. Unfortunately, neither pattern works. I tried various combos including these: ([^|]*)\|([^|]*) with replaceWith=$1 (.*?)(\|.*) with replaceWith=$1 (.*?)\|.* with and without replaceWith=$1 (.*?)\| with and without replaceWith=$1 As previously mentioned, I have tried many other ways without success. I did notice that if I don't do the stripBy, it removes everything from the last |^ onwards, leaving me with something like dataA1|^dataA2|?dataB1|^dataB2|?dataC1. To me this doesn't look like a regex pattern issue; it looks more like a solr/lucene issue with regex. Any other suggestions welcome; otherwise, I will have to create a custom transformer.
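One more thing worth checking: RegexTransformer delegates to plain java.util.regex, so a pattern that misbehaves inside DIH should misbehave identically in a ten-line standalone test. A self-contained check against the sample data above (the record delimiter |? and field delimiter |^ are read off the dataA1|^dataA2|?... snippet; the interpretation of the delimiters is an assumption):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// RegexTransformer uses java.util.regex underneath, so patterns can be
// sanity-checked standalone. Note '|', '?' and '^' are regex
// metacharacters; Pattern.quote() escapes them so they match literally.
public class DelimiterCheck {
    public static void main(String[] args) {
        String raw = "dataA1|^dataA2|?dataB1|^dataB2|?dataC1";

        // split records on the literal two-character delimiter "|?"
        String[] records = raw.split(Pattern.quote("|?"));
        System.out.println(Arrays.toString(records));
        // [dataA1|^dataA2, dataB1|^dataB2, dataC1]

        // within a record, keep only the part before "|^"
        String first = records[0].replaceAll("\\|\\^.*", "");
        System.out.println(first); // dataA1
    }
}
```

If these succeed standalone but fail inside DIH, the problem is more likely in how splitBy/regex/replaceWith interact than in the pattern itself.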
Re: problem with RegexTransformer and delimited data
Thanks guys. Unfortunately, neither pattern works. I tried various combos including these: ([^|]*)\|([^|]*) with replaceWith=$1 (.*?)(\|.*) with replaceWith=$1 (.*?)\|.* with and without replaceWith=$1 (.*?)\| with and without replaceWith=$1 As previously mentioned, I have tried many other ways without success. I did notice that if I dont do the stripBy that it removes the everything from the last |^ onwards leaving me something like dataA1|^dataA2|?dataB1|^dataB2|?dataC1. to me this doesnt look like a regex pattern issue; instead looks more like a solr/lucene issue with regex. any other suggestions welcome. otherwise, will have to create custom transformer -- View this message in context: http://n3.nabble.com/problem-with-RegexTransformer-and-delimited-data-tp713846p716206.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: change get to post ??
I wouldn't do autosuggestion with normal queries anyway. Because of better performance... :-) I don't use DIH, so I can't tell what to do then. For us, we import data with a simple PHP script, which was rather easy to write. So we have full control on Solr's data structure. You somehow have to add information about the category branch into every document for fast queries. Michael Am 13.04.2010 15:22, schrieb stockii: heya. okay NOW. i import from database with DIH. every item have cat_id, more not. for the normal search it works to use fq and Post the search. but for my autosuggestion, it didnt work, because our app does not use the autosuggestion with our API. Because of better performance ...
Re: change get to post ??
what do you mean by normal queries? our autosuggestion is on an extra core ;-) and the performance is really good. does anyone know the best way to search in many categories? does solr have something like a tree component or similar?
Re: Snapshooter shooting after commit or optimize
On Mon, Apr 12, 2010 at 7:02 PM, Bill Au bill.w...@gmail.com wrote: The lines you have enclosed are commented out by the <!-- and --> Bill On Mon, Apr 12, 2010 at 1:32 PM, william pink will.p...@gmail.com wrote: Hi, I am running Solr 1.2 (I will be updating in due course). I am having a few issues with doing the snapshots after a postCommit or postOptimize; neither appear to work. In my solrconfig.xml I have the following:
<!-- A postCommit event is fired after every commit or optimize command
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">/opt/solr/bin</str>
  <bool name="wait">true</bool>
  <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
  <arr name="env"> <str>MYVAR=val1</str> </arr>
</listener>
-->
<!-- A postOptimize event is fired only after every optimize command, useful in conjunction with index distribution to only distribute optimized indices
<listener event="postOptimize" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">/opt/solr/bin</str>
  <bool name="wait">true</bool>
</listener>
-->
But a snapshot never gets taken. It's most likely something that I haven't spotted, but I can't seem to work it out. It's quite possible the path to snapshooter might be the issue, but I have tried a few different things and none have worked. Any tips appreciated, Thanks Will Ahh, silly me. I have removed those comments and restarted, but it still isn't working correctly. I have just set up snapshooter in cron, though, so it isn't a major problem, but it would be nice to have. I am now using snappuller and snapinstaller on the slave, but I am getting the following error: /bin/sh: solr.solr.home: No such file or directory I have set it as an option when Solr starts, but I can't seem to find any jetty xml configs to put it in, nor does the wiki tell me what file I need to edit, so I am a bit lost. Thanks again, Will
SChema change with an additional copyField
Hi, after indexing a lot of data I found that my schema is missing the copyField declaration for my spell field :( The question is: do I have to reindex all the documents? I'm asking because the new field is just a copy of an existing one, so I was wondering if Solr is able to understand that and create (at startup) that field by copying the other... Regards, Andrea
RE: problem with RegexTransformer and delimited data
AWESOME. It may take me some time to understand the regex pattern, but it worked. And many thanks for looking into RegexTransformer.process(). Nice to know that splitBy can't be used with regex or replaceWith etc. Many thanks Steve. -- View this message in context: http://n3.nabble.com/problem-with-RegexTransformer-and-delimited-data-tp713846p716749.html Sent from the Solr - User mailing list archive at Nabble.com.
indexversion not updating on master
I'm having trouble with replication, and I believe it's because the indexversion isn't updating on master. My solrconfig.xml on master:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">optimize</str>
    <!-- <str name="backupAfter">optimize</str> -->
    <str name="confFiles">solrconfig-slave.xml:solrconfig.xml,schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

BTW, I am certain that this does NOT work: <str name="replicateAfter">startup,commit,optimize</str> -- it MUST be separate elements.

My solrconfig.xml on slave:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://my_host:8983/solr/replication</str>
    <!-- Format is HH:mm:ss -->
    <str name="pollInterval">00:15:00</str>
  </lst>
</requestHandler>

/replication?command=details on master (I don't understand why there are two indexVersion and two generation entries in this data):

<lst name="details">
  <str name="indexSize">19.91 GB</str>
  <str name="indexPath">/data/solr/index</str>
  <arr name="commits">
    <lst>
      <long name="indexVersion">1270535894533</long>
      <long name="generation">32</long>
      <arr name="filelist">
        <str>_1xv.fdt</str>
        ...
        <str>_1xv.frq</str>
        <str>segments_w</str>
      </arr>
    </lst>
  </arr>
  <str name="isMaster">true</str>
  <str name="isSlave">false</str>
  <long name="indexVersion">1270535894534</long>
  <long name="generation">33</long>
</lst>

master log shows the commit:

INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true,expungeDeletes=false)
Apr 12, 2010 4:00:54 PM org.apache.solr.search.SolrIndexSearcher init
INFO: Opening searc...@31dd7736 main
Apr 12, 2010 4:00:54 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Apr 12, 2010 4:00:54 PM org.apache.solr.search.SolrIndexSearcher warm

but indexversion is the OLD one, not the NEW one:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <long name="indexversion">1270535894533</long>
  <long name="generation">32</long>
</response>

What's going on? - Naomi
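For reference, my understanding of the two ReplicationHandler commands involved here is that they report different things, which would explain the pair of indexVersion entries: `indexversion` returns the newest commit point the master will actually serve to slaves, while `details` additionally dumps the searcher's current view at the top level, so the value under <arr name="commits"> and the top-level value can disagree:

```
http://master:8983/solr/replication?command=indexversion
http://master:8983/solr/replication?command=details
```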
Re: indexversion not updating on master
Does it matter that my last index update did NOT add any new documents and did NOT delete any existing documents? (For testing, I just re-ran the last update.) - Naomi On Apr 13, 2010, at 11:09 AM, Naomi Dushay wrote: I'm having trouble with replication, and I believe it's because the indexversion isn't updating on master. [... full message quoted above snipped ...] What's going on? - Naomi
Re: SChema change with an additional copyField
On Apr 13, 2010, at 1:29 PM, Andrea Gazzarini wrote: Hi, after indexing a lot of data I found that my schema is missing the copyField declaration for my spell field :( The question is: do I have to reindex all the documents? Unfortunately, yes, you do, unless your spell field uses the same analysis as the original (which I doubt it does). I'm asking because the new field is just a copy of an existing one, so I was wondering if Solr is able to understand that and create (at startup) that field by copying the other... Regards, Andrea
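For the archives: copyField is applied at index time, as each document passes through the update chain, which is why existing documents can't grow the new field retroactively. A minimal schema.xml sketch (the field and type names are made up, not from Andrea's schema):

```
<field name="title" type="text"      indexed="true" stored="true"/>
<field name="spell" type="textSpell" indexed="true" stored="false"/>

<copyField source="title" dest="spell"/>
```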
Re: Experience with indexing billions of documents?
Hey there, We've actually been tackling this problem at Drawn to Scale. We'd really like to get our hands on LuceHBase to see how it scales. Our faceting still needs to be done in-memory, which is kinda tricky, but it's worth exploring. On Mon, Apr 12, 2010 at 7:27 AM, Thomas Koch tho...@koch.ro wrote: Hi, could I interest you in this project? http://github.com/thkoch2001/lucehbase The aim is to store the index directly in HBase, a database system modelled after Google's Bigtable to store data in the region of tera- or petabytes. Best regards, Thomas Koch Lance Norskog: The 2B limitation is within one shard, due to using a signed 32-bit integer. There is no limit in that regard in sharding - Distributed Search uses the stored unique document id rather than the internal docid. On Fri, Apr 2, 2010 at 10:31 AM, Rich Cariens richcari...@gmail.com wrote: A colleague of mine is using native Lucene + some home-grown patches/optimizations to index over 13B small documents in a 32-shard environment, which is around 406M docs per shard. If there's a 2B doc id limitation in Lucene then I assume he's patched it himself. On Fri, Apr 2, 2010 at 1:17 PM, dar...@ontrenet.com wrote: My guess is that you will need to take advantage of Solr 1.5's upcoming cloud/cluster renovations and use multiple indexes to comfortably achieve those numbers. Hypothetically, in that case, you won't be limited by the single-index docid limitations of Lucene. We are currently indexing 5 million books in Solr, scaling up over the next few years to 20 million. However we are using the entire book as a Solr document. We are evaluating the possibility of indexing individual pages as there are some use cases where users want the most relevant pages regardless of what book they occur in. However, we estimate that we are talking about somewhere between 1 and 6 billion pages and have concerns over whether Solr will scale to this level. Does anyone have experience using Solr with 1-6 billion Solr documents?
The lucene file format document (http://lucene.apache.org/java/3_0_1/fileformats.html#Limitations) mentions a limit of about 2 billion document ids. I assume this is the lucene internal document id and would therefore be a per index/per shard limit. Is this correct? Tom Burton-West. Thomas Koch, http://www.koch.ro -- Bradford Stephens, Founder, Drawn to Scale drawntoscalehq.com 727.697.7528 http://www.drawntoscalehq.com -- The intuitive, cloud-scale data solution. Process, store, query, search, and serve all your data. http://www.roadtofailure.com -- The Fringes of Scalability, Social Media, and Computer Science
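The per-shard arithmetic above checks out against Lucene's signed 32-bit docid ceiling; a quick sanity check:

```java
public class ShardMath {
    public static void main(String[] args) {
        long totalDocs = 13000000000L; // the 13B-document deployment mentioned above
        int shards = 32;
        long perShard = totalDocs / shards;
        System.out.println(perShard); // 406250000, the "around 406M docs per shard" figure
        // Lucene's internal docids are signed 32-bit ints, hence the ~2B per-index limit:
        System.out.println(Integer.MAX_VALUE); // 2147483647
        System.out.println(perShard < Integer.MAX_VALUE); // true: each shard is well under
    }
}
```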
Re: limit rows by field
I believe you're talking about field collapsing. It's available as a patch, although I'm not sure how well it applies to the current trunk. For more info check out: http://wiki.apache.org/solr/FieldCollapsing Geert-Jan 2010/4/13 Felix Zimmermann feliz...@gmx.de Hi, for a preview of results, I need to display up to 3 documents per category. Is it possible to limit the number of rows of the solr response by field values? What I mean is: rows: 9 - (sub)rows of field:cat1 : 3 - (sub)rows of field:cat2 : 3 - (sub)rows of field:cat3 : 3 If not, is there a workaround or do I have to send three queries? Thanks! felix
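Until field collapsing lands, the three-queries workaround Felix mentions is straightforward: one request per category with a filter query and rows=3. A sketch; the host, core, query, and the `category` field name are placeholders, not from the thread:

```java
import java.net.URLEncoder;

public class PerCategoryPreview {
    public static void main(String[] args) throws Exception {
        // One request per category, each limited to 3 rows.
        String base = "http://localhost:8983/solr/select";
        String q = URLEncoder.encode("some query", "UTF-8");
        for (String cat : new String[] {"cat1", "cat2", "cat3"}) {
            String url = base + "?q=" + q
                       + "&fq=" + URLEncoder.encode("category:" + cat, "UTF-8")
                       + "&rows=3";
            System.out.println(url); // fire each of these and merge the 3x3 results client-side
        }
    }
}
```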
Re: Internal Server Error
The extracting stuff can use a lot of memory for large documents. Your app may be running out of memory. Tomcat by default has logging for Tomcat but not for Tomcat apps. If you configure Tomcat's log4j to log org.apache.solr classes it will tell you what is wrong. On Tue, Apr 13, 2010 at 5:22 AM, Andrea Gazzarini andrea.gazzar...@atcult.it wrote: Some problem with extraction (Tika, etc...)? My suggestion is: try to extract the document manually... I had a lot of problems with Tika and pdf extraction... Cheers, Andrea On 13/04/2010 13:05, Sandhya Agarwal wrote: Hello, I have the following piece of code:

ContentStreamUpdateRequest contentUpdateRequest = new ContentStreamUpdateRequest("/update/extract");
contentUpdateRequest.addFile(new File(contentFileName));
contentUpdateRequest.setParam("extractOnly", "true");
NamedList result = solrServerSession.request(contentUpdateRequest);

This is throwing the following error:

org.apache.solr.common.SolrException: Internal Server Error
Internal Server Error
request: http://localhost:8080/solr/update/extract?extractOnly=true&wt=javabin&version=1
[Apr 13, 2010 4:25:23 PM (IndexThread-1_9)]: at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
[Apr 13, 2010 4:25:23 PM (IndexThread-1_9)]: at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)

I have solr 1.4 set up on tomcat 6.0.26. There is no detailed stack trace / logs available. Could somebody please let me know what might be the issue. Thanks, Sandhya -- Lance Norskog goks...@gmail.com
RE: Internal Server Error
Thanks all. I realized the issue is because of my solr home path. I created a new directory, copied all the config files there, and set that as my solr home. However, I failed to notice that solrconfig.xml uses relative paths for all the jars, which were no longer available. Could somebody please tell me how I can get remote streaming working using the solrj API? Thanks, Sandhya -Original Message- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Wednesday, April 14, 2010 8:57 AM To: solr-user@lucene.apache.org Subject: Re: Internal Server Error [... full message quoted above snipped ...] -- Lance Norskog goks...@gmail.com
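On the remote-streaming question: as far as I know, streaming first has to be enabled server-side in solrconfig.xml:

```
<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048" />
```

Then, instead of addFile(), the request can carry a stream.file (or stream.url) parameter, e.g. contentUpdateRequest.setParam("stream.file", contentFileName), so the server reads the content itself. Treat this as a sketch to verify: remote streaming also requires the file or URL to be reachable from the server's side, not the client's.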
Re: problem with RegexTransformer and delimited data
It would be nice if the RegexTransformer logged that the user does not know how to use the different parameters... On 4/13/10, Gerald gerald.deco...@topproducer.com wrote: AWESOME. It may take me some time to understand the regex pattern, but it worked. And many thanks for looking into RegexTransformer.process(). Nice to know that splitBy can't be used with regex or replaceWith etc. Many thanks Steve. -- View this message in context: http://n3.nabble.com/problem-with-RegexTransformer-and-delimited-data-tp713846p716749.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
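For anyone landing here later, my reading of the DataImportHandler docs is that splitBy and regex/replaceWith are alternative modes on a <field>, not combinable on the same one. A DIH sketch with made-up column names:

```
<entity name="item" transformer="RegexTransformer" query="...">
  <!-- mode 1: split a delimited source column into a multivalued field -->
  <field column="tags" splitBy="," sourceColName="raw_tags"/>
  <!-- mode 2: regex + replaceWith rewrites the value instead -->
  <field column="phone" regex="-" replaceWith="" sourceColName="raw_phone"/>
</entity>
```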