Re: Your attention is needed! Solr to be used with a web application.
Hi Noble Paul,

I am a beginner with Solr, so please bear with me if I am wrong. By "multicore", do you mean using a database and an HTTP data source (like a web site or RSS feed), or any other combination of data sources, to get data? If so, that is not what I am trying to do. I am only trying to configure an HTTP data source (a web application which has a web page in it) with Solr. Your suggestions about configuring the web app with Solr would be useful. If my understanding of multicore is wrong, can you please direct me to the right resource to understand it?

Thanks,
Udaya

Noble Paul നോബിള് नोब्ळ् wrote:
> do you really need a multicore configuration? start with a single core first.
>
> On Wed, Apr 1, 2009 at 10:31 AM, Udaya <ukvign...@gmail.com> wrote:
>> Shalin Shekhar Mangar wrote:
>>> On Tue, Mar 31, 2009 at 7:04 PM, Udaya <ukvign...@gmail.com> wrote:
>>>> 3. The Solr configuration XML files are placed inside the directory
>>>> structure C:\web1\solr1\test\DIH\conf.
>>>> 5. I have set the Java option in the Tomcat configuration as
>>>> -Dsolr.solr.home=C:\web1\solr1\test (this is where solr.xml and the
>>>> DIH folder, which contains the conf folder, are located).
>>>> When I run the apache-solr-1.3.war deployed in Tomcat, I get the
>>>> "Welcome to Solr" page with a "Solr Admin" hyperlink. Clicking on
>>>> "Solr Admin" results in an error page: HTTP Status 404 - missing core
>>>> name in path.
>>>
>>> It seems you need only a single Solr index. The solr.xml is needed only
>>> when you want to use multiple Solr indices -- the example-DIH provides
>>> a db and an rss example, therefore it uses a solr.xml. Point your solr
>>> home to the test/DIH directory which contains the conf directory.
>>>
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
Hi Shalin,

Thank you. I added the Java option to the Tomcat configuration as you suggested, i.e. -Dsolr.solr.home=C:\web1\solr1\test\DIH. After setting it, when I tried running apache-solr-1.3 from Tomcat I got the following exceptions:

1. org.xml.sax.SAXParseException: Content is not allowed in prolog.
       at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
       at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)

2. org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context Processing Document #
       at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:176)
       at org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:93)
       at ..

3. org.apache.solr.common.SolrException: FATAL: Could not create importer. DataImporter config invalid
       at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:114)
       at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:311)
       at

My Tomcat is protected with a password, i.e. we have to give a username and password when trying to access the web applications deployed in it. My doubt is: how do we overcome this when Solr tries to access a resource from Tomcat? I tried adding the username and password in the dataSource tag of dataconfig.xml as follows:

    <dataSource type="HttpDataSource" user="admin" password="password"/>

Even then the exceptions occur. Suggestions would be of great help.

Thanks,
Udaya

--
View this message in context: http://www.nabble.com/Your-attention-is-needed%21-Solr-to-be-used-with-a-web-application.-tp22804930p22819854.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
--Noble Paul

--
View this message in context: http://www.nabble.com/Your-attention-is-needed%21-Solr-to-be-used-with-a-web-application.-tp22804930p22820368.html
Sent from the Solr - User mailing list archive at Nabble.com.
Runtime exception when adding documents using solrj
Hi All,

I am trying to index documents using the solrj client. I have written the simple code below:

    CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8080/solr/update");
    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("id", "id1", 1.0f);
    doc1.addField("name", "doc1", 1.0f);
    doc1.addField("price", 10);
    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField("id", "id2", 1.0f);
    doc2.addField("name", "doc2", 1.0f);
    doc2.addField("price", 20);
    Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
    docs.add(doc1);
    docs.add(doc2);
    server.add(docs);
    server.commit();

But I am getting the error below. Can anyone tell me what is wrong with the above code?

    Exception in thread "main" java.lang.RuntimeException: Invalid version or the data in not in 'javabin' format
        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:98)
        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:470)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245)
        at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:243)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
        at SolrIndexTest.main(SolrIndexTest.java:46)
    Java Result: 1
Re: Runtime exception when adding documents using solrj
which version of Solr are you using?

On Wed, Apr 1, 2009 at 12:01 PM, Radha C. <cra...@ceiindia.com> wrote:
> Hi All, I am trying to index documents using the solrj client. I have
> written a simple code below [...]

--
--Noble Paul
RE: Runtime exception when adding documents using solrj
I am using Solr version 1.3.

_____

From: Noble Paul നോബിള് नोब्ळ् [mailto:noble.p...@gmail.com]
Sent: Wednesday, April 01, 2009 12:16 PM
To: solr-user@lucene.apache.org; cra...@ceiindia.com
Subject: Re: Runtime exception when adding documents using solrj

which version of Solr are you using?

On Wed, Apr 1, 2009 at 12:01 PM, Radha C. <cra...@ceiindia.com> wrote:
> Hi All, I am trying to index documents using the solrj client. I have
> written a simple code below [...]

--
--Noble Paul
Re: Defining DataDir in Multi-Core
I'm using the latest released one -- Solr 1.3. The wiki says passing dataDir to the CREATE action (web service) should work, but that doesn't seem to be working.

-vivek

2009/3/31 Noble Paul നോബിള് नोब्ळ् <noble.p...@gmail.com>:
> which version of Solr are you using? if you are using one from trunk,
> you can pass the dataDir as an extra parameter.
>
> On Wed, Apr 1, 2009 at 7:41 AM, vivek sar <vivex...@gmail.com> wrote:
>> Hi,
>>
>> I'm trying to set up cores dynamically. I want to use the same schema.xml
>> and solrconfig.xml for all the created cores, so I plan to pass the same
>> instance directory but a different data directory. Here is what I have in
>> solr.xml by default (I didn't want to define any core here, but it looks
>> like we have to have at least one core defined before we start Solr):
>>
>>   <solr persistent="true">
>>     <cores adminPath="/admin/cores">
>>       <core name="core0" instanceDir="./"/>
>>     </cores>
>>   </solr>
>>
>> Now I run the following URL in the browser (as described on the wiki --
>> http://wiki.apache.org/solr/CoreAdmin):
>>
>> http://localhost:8080/solr/admin/cores?action=CREATE&name=20090331_1&instanceDir=/Users/opal/temp/chat/solr&dataDir=/Users/opal/temp/chat/solr/data/20090331_1
>>
>> I get a response:
>>
>>   <str name="saved">/Users/opal/temp/chat/solr/solr.xml</str>
>>
>> Now when I check solr.xml I see:
>>
>>   <?xml version='1.0' encoding='UTF-8'?>
>>   <solr persistent='true'>
>>     <cores adminPath='/admin/cores'>
>>       <core name='core0' instanceDir='./'/>
>>       <core name='20090331_2' instanceDir='/Users/opal/temp/afterchat/solr/'/>
>>     </cores>
>>   </solr>
>>
>> Note, there is NO dataDir specified. When I check the status
>> (http://localhost:8080/solr/admin/cores?action=STATUS) I see:
>>
>>   <str name="name">core0</str>
>>   <str name="instanceDir">/Users/opal/temp/afterchat/solr/./</str>
>>   <str name="dataDir">/Users/opal/temp/afterchat/solr/./data/</str>
>>   ...
>>   <str name="name">20090331_2</str>
>>   <str name="instanceDir">/Users/opal/temp/afterchat/solr/</str>
>>   <str name="dataDir">/Users/opal/temp/afterchat/solr/data/</str>
>>
>> Both cores are pointing to the same data directory.
>>
>> My question is: how can I create cores on the fly and have them point to
>> different data directories, so that each core writes its index in a
>> different location?
>>
>> Thanks,
>> -vivek

--
--Noble Paul
Re: Merging Solr Indexes
Thanks Otis.

Could you write to the same core (same index) from multiple threads at the same time? I thought each writer would lock the index so others cannot write at the same time. I'll try it though.

Another reason for putting indexes in separate cores was to limit the index size. Our index can grow up to 50G a day, so I was hoping that writing to smaller indexes in separate cores would be faster, and if needed I can merge them at a later point (like end of day). I want to keep daily cores. Isn't this a good idea? How else can I limit the index size (besides multiple instances or separate boxes)?

Thanks,
-vivek

On Tue, Mar 31, 2009 at 8:28 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> Let me start with 4) Have you tried simply using multiple threads to send
> your docs to a single Solr instance/core? You should get about the same
> performance as what you are trying with your approach below, but without
> the headache of managing multiple cores and index merging (not yet
> possible to do programmatically).
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> From: vivek sar <vivex...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, March 31, 2009 1:59:01 PM
> Subject: Merging Solr Indexes
>
> Hi,
>
> As part of speeding up the index process I'm thinking of spawning
> multiple threads which will write to different temporary SolrCores. Once
> the index process is done I want to merge all the indexes in the
> temporary cores into a master core. For example, if I want one SolrCore
> per day, then every index cycle I'll spawn 4 threads which will index
> into some temporary index, and once they are done I want to merge all of
> these into the day core. My questions:
>
> 1) I want to use the same schema and solrconfig.xml for all cores without
> duplicating them -- how do I do that?
> 2) How do I merge the temporary Solr cores into one master core
> programmatically? I've read the wiki on MergingSolrIndexes, but I want to
> do it programmatically (like Lucene's writer.addIndexes(..)) once the
> temporary indices are done.
> 3) Can I remove the temporary indices once the merge process is done?
> 4) Is this the right strategy to speed up indexing?
>
> Thanks,
> -vivek
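Since Solr 1.3 has no core-merge API, the Lucene-level merge the question alludes to can be sketched directly against the index directories. This is a sketch, not an official Solr mechanism: the paths are hypothetical, Solr 1.3 bundles Lucene 2.4, and no Solr core may hold a writer on these indexes while the merge runs.

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MergeCores {
    public static void main(String[] args) throws Exception {
        // Hypothetical paths: the master core's index plus temporary per-thread indexes.
        Directory master = FSDirectory.getDirectory(new File("/indexes/day/data/index"));
        Directory[] temps = {
            FSDirectory.getDirectory(new File("/indexes/tmp1/data/index")),
            FSDirectory.getDirectory(new File("/indexes/tmp2/data/index"))
        };
        // create=false: append to the existing master index rather than wiping it.
        IndexWriter writer = new IndexWriter(master, new StandardAnalyzer(), false,
                IndexWriter.MaxFieldLength.UNLIMITED);
        writer.addIndexesNoOptimize(temps); // merge the temporary segments in
        writer.close();
    }
}
```

After the merge the temporary index directories can be deleted, but the Solr core sitting on the master index typically needs a core reload before its searcher sees the new segments.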
RE: Runtime exception when adding documents using solrj
Can anyone please tell me what the issue is with the below Java code.

-----Original Message-----
From: Radha C. [mailto:cra...@ceiindia.com]
Sent: Wednesday, April 01, 2009 12:28 PM
To: solr-user@lucene.apache.org
Subject: RE: Runtime exception when adding documents using solrj

I am using Solr version 1.3.

> which version of Solr are you using?
>
> On Wed, Apr 1, 2009 at 12:01 PM, Radha C. <cra...@ceiindia.com> wrote:
>> Hi All, I am trying to index documents using the solrj client. I have
>> written a simple code below [...]

--
--Noble Paul
Re: Defining DataDir in Multi-Core
On Wed, Apr 1, 2009 at 1:48 PM, vivek sar <vivex...@gmail.com> wrote:
> I'm using the latest released one -- Solr 1.3. The wiki says passing
> dataDir to the CREATE action (web service) should work, but that doesn't
> seem to be working.

That is a Solr 1.4 feature (not released yet).

--
Regards,
Shalin Shekhar Mangar.
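For reference once 1.4 is available: dataDir is just an extra parameter on the CoreAdmin CREATE URL. A small stdlib-only sketch of building that URL (the core name and paths are made up for illustration):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class CreateCoreUrl {
    // Builds a CoreAdmin CREATE URL; the dataDir parameter is honored from Solr 1.4 on.
    static String buildCreateUrl(String solrBase, String name,
                                 String instanceDir, String dataDir)
            throws UnsupportedEncodingException {
        return solrBase + "/admin/cores?action=CREATE"
                + "&name=" + URLEncoder.encode(name, "UTF-8")
                + "&instanceDir=" + URLEncoder.encode(instanceDir, "UTF-8")
                + "&dataDir=" + URLEncoder.encode(dataDir, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical daily core pointing at its own data directory.
        System.out.println(buildCreateUrl("http://localhost:8080/solr",
                "20090401_1",
                "/Users/opal/temp/chat/solr",
                "/Users/opal/temp/chat/solr/data/20090401_1"));
    }
}
```

Fetching that URL (in a browser, or via curl) creates the core; on 1.3 the same call succeeds but silently ignores dataDir, which matches the behavior seen above.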
Re: Runtime exception when adding documents using solrj
the url is wrong. try this:

    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/");

On Wed, Apr 1, 2009 at 2:04 PM, Radha C. <cra...@ceiindia.com> wrote:
> Can anyone please tell me what the issue is with the below Java code.
>
> -----Original Message-----
> From: Radha C. [mailto:cra...@ceiindia.com]
> Sent: Wednesday, April 01, 2009 12:28 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Runtime exception when adding documents using solrj
>
> I am using Solr version 1.3.
>
>> which version of Solr are you using?
>>
>> On Wed, Apr 1, 2009 at 12:01 PM, Radha C. <cra...@ceiindia.com> wrote:
>>> Hi All, I am trying to index documents using the solrj client. I have
>>> written a simple code below [...]

--
--Noble Paul
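Putting Noble's fix into context: the constructor takes the base URL of the Solr webapp, and SolrJ appends handler paths such as /update itself. A complete sketch of the corrected program under Solr 1.3 SolrJ (field names taken from the original post; they must exist in schema.xml):

```java
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrIndexTest {
    public static void main(String[] args) throws Exception {
        // Base URL of the Solr webapp only -- SolrJ adds /update itself.
        CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8080/solr");

        SolrInputDocument doc1 = new SolrInputDocument();
        doc1.addField("id", "id1", 1.0f);
        doc1.addField("name", "doc1", 1.0f);
        doc1.addField("price", 10);

        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add(doc1);

        server.add(docs);   // a "Bad Request" here usually means a field is not declared in schema.xml
        server.commit();
    }
}
```

With the /update suffix in the constructor, SolrJ ends up posting to /update/update and tries to parse the resulting error page as javabin, which produces the "Invalid version or the data in not in 'javabin' format" exception seen above.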
performance tests with DataImportHandler and full-import
Hey there,

I am doing performance tests with the full-import command from DataImportHandler. I have configured 20 cores with a 1 GB index each (about a million docs per index). If I start doing full-import indexing from a MySQL table SEQUENTIALLY, with cron jobs as frequently as possible, it works fine for about 17 to 20 full-import jobs in total. After that, each full import starts taking much, much longer (from 30 min at the beginning to 4 hours for the last full-import before crashing), until I get an OutOfMemoryError: Java heap space.

Apr 1 10:30:47 indexer-03 solr: 63534480 [Thread-536] ERROR org.apache.solr.handler.dataimport.DataImporter - Full Import failed
Apr 1 10:34:04 indexer-03 solr: 63581229 [http-8080-Processor92] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space
Apr 1 10:34:23 indexer-03 solr: 63562764 [Thread-546] WARN org.apache.solr.handler.dataimport.DocBuilder - GC overhead limit exceeded
Apr 1 10:36:11 indexer-03 solr: 63739903 [http-8080-Processor99] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space
Apr 1 10:36:11 indexer-03 solr: 63876821 [http-8080-Processor95] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space
Apr 1 10:36:20 indexer-03 solr: 63787787 [Thread-546] ERROR org.apache.solr.handler.dataimport.DataImporter - Full Import failed
Apr 1 10:40:02 indexer-03 solr: 64073790 [http-8080-Processor100] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space
Apr 1 10:40:45 indexer-03 solr: 63991575 [http-8080-Processor96] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space

Even though I index sequentially, I have tried using the concurrent garbage collector (-XX:+UseConcMarkSweepGC), but nothing seems to change (which is fairly logical, as I don't index concurrently).

I am running on Debian 2.6.26-1-amd64 with MySQL for the database. Java version:

java version "1.6.0_12"
Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)

Any idea why this could be happening?

--
View this message in context: http://www.nabble.com/performance-tests-with-DataImportHandler-and-full-import-tp22823145p22823145.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: Runtime exception when adding documents using solrj
Thanks Paul,

I changed the URL, but now I am getting another error -- Bad Request. Any help will be appreciated.

    Exception in thread "main" org.apache.solr.common.SolrException: Bad Request

    Bad Request

    request: http://localhost:8080/solr/update?wt=javabin
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:428)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245)
        at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:243)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
        at SolrIndexTest.main(SolrIndexTest.java:47)
    Java Result: 1

-----Original Message-----
From: Noble Paul നോബിള് नोब्ळ् [mailto:noble.p...@gmail.com]
Sent: Wednesday, April 01, 2009 2:26 PM
To: solr-user@lucene.apache.org; cra...@ceiindia.com
Subject: Re: Runtime exception when adding documents using solrj

the url is wrong. try this:

    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/");

On Wed, Apr 1, 2009 at 2:04 PM, Radha C. <cra...@ceiindia.com> wrote:
> Can anyone please tell me what the issue is with the below Java code. [...]

--
--Noble Paul
RE: Runtime exception when adding documents using solrj
Thanks Paul,

I resolved it. I had missed one field declaration in schema.xml. Now I have added it, and it works.

-----Original Message-----
From: Noble Paul നോബിള് नोब्ळ् [mailto:noble.p...@gmail.com]
Sent: Wednesday, April 01, 2009 3:52 PM
To: solr-user@lucene.apache.org; cra...@ceiindia.com
Subject: Re: Runtime exception when adding documents using solrj

Can you take a look at the Solr logs and see what is happening?

On Wed, Apr 1, 2009 at 3:19 PM, Radha C. <cra...@ceiindia.com> wrote:
> Thanks Paul,
>
> I changed the URL, but now I am getting another error -- Bad Request.
> Any help will be appreciated. [...]

--
--Noble Paul
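For context, the "Bad Request" here came from a field used in the documents but not declared in schema.xml. For the three fields in the posted code, the declarations would look roughly like this (the type names are illustrative and depend on the field types declared in the schema):

```xml
<field name="id"    type="string" indexed="true" stored="true" required="true"/>
<field name="name"  type="text"   indexed="true" stored="true"/>
<field name="price" type="sfloat" indexed="true" stored="true"/>
```

As Noble suggests, the Solr log names the exact missing field, which is quicker than diffing the schema by hand.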
Re: Runtime exception when adding documents using solrj
Can you take a look at the Solr logs and see what is happening?

On Wed, Apr 1, 2009 at 3:19 PM, Radha C. <cra...@ceiindia.com> wrote:
> Thanks Paul,
>
> I changed the URL, but now I am getting another error -- Bad Request.
> Any help will be appreciated.
>
>     Exception in thread "main" org.apache.solr.common.SolrException: Bad Request
>
>     request: http://localhost:8080/solr/update?wt=javabin
>         at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:428)
>         at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:243)
>         at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
>         at SolrIndexTest.main(SolrIndexTest.java:47)
>     Java Result: 1
>
> [...]

--
--Noble Paul
Re: performance tests with DataImportHandler and full-import
I guess Solr itself is hogging more memory. Maybe you can try reloading the core before each import. On Wed, Apr 1, 2009 at 3:19 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I am doing performance tests with the full-import command from DataImportHandler. I have configured 20 cores with a 1 GB index each (about a million docs per index). If I start doing full-imports, indexing from a MySQL table SEQUENTIALLY with cron jobs as frequently as possible, it works fine for about 17 to 20 full-import jobs in total. After that, each full-import starts taking much, much longer (from 30 min at the beginning to 4 hours for the last full-import before crashing), until I get an OutOfMemoryError: Java heap space.

Apr 1 10:30:47 indexer-03 solr: 63534480 [Thread-536] ERROR org.apache.solr.handler.dataimport.DataImporter - Full Import failed
Apr 1 10:34:04 indexer-03 solr: 63581229 [http-8080-Processor92] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space
Apr 1 10:34:23 indexer-03 solr: 63562764 [Thread-546] WARN org.apache.solr.handler.dataimport.DocBuilder - GC overhead limit exceeded
Apr 1 10:36:11 indexer-03 solr: 63739903 [http-8080-Processor99] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space
Apr 1 10:36:11 indexer-03 solr: 63876821 [http-8080-Processor95] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space
Apr 1 10:36:20 indexer-03 solr: 63787787 [Thread-546] ERROR org.apache.solr.handler.dataimport.DataImporter - Full Import failed
Apr 1 10:40:02 indexer-03 solr: 64073790 [http-8080-Processor100] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space
Apr 1 10:40:45 indexer-03 solr: 63991575 [http-8080-Processor96] ERROR org.apache.solr.servlet.SolrDispatchFilter - java.lang.OutOfMemoryError: Java heap space

Even though I index sequentially, I have tried using the concurrent garbage collector (-XX:+UseConcMarkSweepGC), but nothing seems to change (which is logical, as I don't index concurrently). I am running on Debian 2.6.26-1-amd64, with MySQL as the database. Java version: java version 1.6.0_12 Java(TM) SE Runtime Environment (build 1.6.0_12-b04) Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode). Any idea why this could be happening? -- View this message in context: http://www.nabble.com/performance-tests-with-DataImportHandler-and-full-import-tp22823145p22823145.html Sent from the Solr - User mailing list archive at Nabble.com. -- --Noble Paul
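If reloading cores before each import is worth trying, the CoreAdmin handler can do it over HTTP without restarting the servlet container; a sketch of the request (host, port, and core name are assumptions for your setup):

```shell
# Reload one core in place; repeat per core before its full-import run.
curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0'
```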
RE: Problems with synonyms
Hi Leonardo, I've been using the synonym filter at index time (expand = true) and it works just fine. Also use OR as the default operator. Once you do it at index time, there is no point doing it at query time (which in fact is likely the cause of your problems). Have a look at the wiki page Yonik sent about it. Cheers, Daniel From: Leonardo Dias [mailto:leona...@catho.com.br] Sent: 31 March 2009 20:40 To: solr-user@lucene.apache.org Subject: Re: Problems with synonyms Hi, Vernon! We tried both approaches: OR and AND. In both cases, the results were smaller when the synonym was set up, with no change at all when it comes to synonyms. Any other ideas? Is it likely to be a bug? Best, Leonardo Vernon Chapman wrote: Leonardo, I am no expert but I would check to make sure that the DefaultOperator parameter in your schema.xml file is set to OR rather than AND. Vernon On 3/31/09 3:24 PM, Leonardo Dias leona...@catho.com.br wrote: Hello there. How are you guys? We're having problems with synonyms here and I thought that maybe you guys could help us with how SOLR works for synonyms. The problem is the following: I'd like to set up a synonym like dba, database administrator. Instead of increasing the number of results for the keyword dba, the results got smaller, and it only brought me back results that had both the keywords dba and database administrator at the same time, instead of bringing back both dba and database administrator as expected, since our synonym configuration is using expand=true. Since in the past this was not the expected behavior, I'd like to know whether something changed in the solr/lucene internals so that this functionality is now lost, or if I'm doing something wrong with my setup. Currently all fields pass through the synonym filter factory. The analysis shows me that it tries to search for database administrator and DBA.
A debug query also shows me that the query it's trying to do is something like this: +DisjunctionMaxQuery((title:(dba datab) administr)~0.1) DisjunctionMaxQuery((title:(dba datab) administr^10.0 | observation:(dba datab) administr^10.0 | description:(dba datab) administr^10.0 | company:(dba datab) administr)~0.1) The problem is: when I search like this, I get 5 results. When I search for dba only, without the 'dba, database administrator' line in the synonyms.txt file, I get more than 100 results. Do you guys know why this is happening? Thank you, Leonardo -- Leonardo Dias Gerente de Processos Estratégicos leona...@catho.com.br Tel.:(11) 3177.0742 Ramal: 742 www.catho.com.br
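For reference, the index-time setup Daniel describes looks roughly like this in schema.xml (a sketch only; the tokenizer and field type name are assumptions, not Leonardo's actual configuration):

```xml
<!-- synonyms.txt contains the multi-word mapping under discussion:
     dba, database administrator -->
<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- expand="true" indexes both "dba" and "database administrator" -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- no synonym filter at query time, per Daniel's advice -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```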
Indexing fields of xml file using solrj
Hi All, I want to index the document fields in an XML file using solrj. I know how to index document fields using doc.addField(). But I don't know how to post the XML document instead of adding each field in solrj. Can I index an XML file using solrj? Can anyone help me with how to do this? Thanks,
DIH Date conversion from a source column skews time
I have noticed that setting a dynamic date field from a source column changes the time within the date. Can anyone confirm this? For example, the document I import has the following xml field:

<field name="original_air_date_d">2002-12-18T00:00:00Z</field>

In my data-import-config file I define the following instructions:

<field column="temp_original_air_date_s" xpath="/add/doc/field[@name='original_air_date_d']" />
<field column="original_air_year_s" sourceColName="temp_original_air_date_s" regex="([0-9][0-9][0-9][0-9])[- /.][0-9][0-9][- /.][0-9][0-9][T][0-9][0-9][:][0-9][0-9][:][0-9][0-9][Z]" replaceWith="$1" />
<field column="original_air_date_d" sourceColName="temp_original_air_date_s" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss'Z'" />

What is set in my index is the following:

<arr name="temp_original_air_date_s"><str>2002-12-18T00:00:00Z</str></arr>
<arr name="original_air_year_s"><str>2002</str></arr>
<arr name="original_air_date_d"><date>2002-12-18T05:00:00Z</date></arr>

You'll notice that the hour (HH) in original_air_date_d is set to 05. It should still be 00. I have noticed that it changes to either 04 or 05 in all cases within my index. In my schema the dynamic field *_d is:

<dynamicField name="*_d" type="date" indexed="true" stored="true"/>

Thanks, Wesley.
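The 04/05 alternation matches US Eastern daylight/standard offsets, which suggests the JVM's default time zone is being applied when the date string is parsed: in a SimpleDateFormat pattern, the quoted 'Z' is a literal character, not a zone designator. A minimal stdlib sketch of that behavior, assuming an America/New_York server zone (the zone is an assumption, and this illustrates the parsing behavior rather than reproducing DIH itself):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class DihDateShift {
    static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    // Parse the input in the given zone (standing in for the JVM default zone),
    // then render the resulting instant back in UTC, as the index stores it.
    static String parseThenRenderUtc(String input, String parseZone) {
        try {
            SimpleDateFormat parser = new SimpleDateFormat(PATTERN);
            parser.setTimeZone(TimeZone.getTimeZone(parseZone));
            Date d = parser.parse(input);
            SimpleDateFormat utc = new SimpleDateFormat(PATTERN);
            utc.setTimeZone(TimeZone.getTimeZone("UTC"));
            return utc.format(d);
        } catch (ParseException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String src = "2002-12-18T00:00:00Z";
        // December is EST (UTC-5), so the stored instant gains 5 hours:
        System.out.println(parseThenRenderUtc(src, "America/New_York")); // 2002-12-18T05:00:00Z
        // Parsing in UTC keeps the hour at 00:
        System.out.println(parseThenRenderUtc(src, "UTC")); // 2002-12-18T00:00:00Z
    }
}
```

Running the JVM with -Duser.timezone=UTC (or normalizing the value before the date conversion) would keep the hour unchanged.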
Re: Indexing fields of xml file using solrj
Hello, I believe what you want is DirectXMLRequest. http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/request/DirectXmlRequest.html Cheers, Giovanni On 4/1/09, Radha C. cra...@ceiindia.com wrote: Hi All, I want to index the document fields in a xml file to index using solrj. I know how to index the document fields using doc.addfield(). But I dont know how to post the xml document instead of adding each field in solrj. Can I index xml file using solrj? Can anyone help me in how to do this? Thanks,
Re: Indexing fields of xml file using solrj
On Wed, Apr 1, 2009 at 5:17 PM, Radha C. cra...@ceiindia.com wrote: Hi All, I want to index the document fields in a xml file to index using solrj. I know how to index the document fields using doc.addfield(). But I dont know how to post the xml document instead of adding each field in solrj. Can I index xml file using solrj? Can anyone help me in how to do this? Solr will only accept XML files which are in Solr's update XML format. You cannot post arbitrary XML (though you can convert it using XSLT). You can also parse it yourself and use solrj to add the documents. There's DataImportHandler too, which can parse XML using XPath. -- Regards, Shalin Shekhar Mangar.
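For reference, Solr's update XML format that Shalin mentions looks like this (the field names here are only illustrative):

```xml
<add>
  <doc>
    <field name="id">doc1</field>
    <field name="title">An example title</field>
  </doc>
</add>
```

Deletes and commits use the same format, e.g. <delete><id>doc1</id></delete> and <commit/>.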
multicore
Hi, I need to create multiple cores for my project. I need to know: how do I have multiple cores? Can we start all cores from a single startup file, or do we need to start them all independently? I need a way by which I can start all of them in one go. Thanks Neha DISCLAIMER == This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.
Re: multicore
Hello, the starting point is here: http://wiki.apache.org/solr/CoreAdmin Cheers, Giovanni On 4/1/09, Neha Bhardwaj neha_bhard...@persistent.co.in wrote: Hi, I need to create multiple cores for my project. I need to know: how to have multiple cores ? can we start all cores from single startup file or we need to start all independently? I need a way by which I can start all of them in one go. Thanks Neha
RE: Indexing fields of xml file using solrj
Thanks shalin, I need to index XML which is in Solr's format only. I want to index that XML directly using solrj, the same way we post using curl. Is there any API class available for that? Can you please provide me a reference link? -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, April 01, 2009 6:07 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj On Wed, Apr 1, 2009 at 5:17 PM, Radha C. cra...@ceiindia.com wrote: Hi All, I want to index the document fields in a xml file to index using solrj. I know how to index the document fields using doc.addfield(). But I dont know how to post the xml document instead of adding each field in solrj. Can I index xml file using solrj? Can anyone help me in how to do this? Solr will only accept xml files which are in the solr's update xml format. You cannot post any arbitrary xml (you can convert using the xslt). You can also parse it yourself and use solrj for adding the document. There's DataImportHandler too which can parse XML using xpath. -- Regards, Shalin Shekhar Mangar.
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Grant, Redoing the work with your patch applied does not seem to make a difference! Is this the expected result? I did run it again using the full file, this time using my iMac:

  643465 (2008-04-01)  took 22min 14sec
  734796 (2009-01-15)  took 73min 58sec
  758795 (2009-03-26)  took 70min 55sec

Again using only the first 1M records, with commit=false&overwrite=true:

  643465    (2008-04-01)  took 2m51.516s
  734796    (2009-01-15)  took 7m29.326s
  758795    (2009-03-26)  took 8m18.403s
  SOLR-1095               took 7m41.699s

This time with commit=true&overwrite=true:

  643465    (2008-04-01)  took 2m49.200s
  734796    (2009-01-15)  took 8m27.414s
  758795    (2009-03-26)  took 9m32.459s
  SOLR-1095               took 7m58.825s

This time with commit=false&overwrite=false:

  643465    (2008-04-01)  took 2m46.149s
  734796    (2009-01-15)  took 3m29.909s
  758795    (2009-03-26)  took 3m26.248s
  SOLR-1095               took 2m49.997s

-- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===
Re: Indexing fields of xml file using solrj
I understand Shalin is a guru and I am nobody but... http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/request/DirectXmlRequest.html Is what you need if you want to use Solrj... :-) On 4/1/09, Radha C. cra...@ceiindia.com wrote: Thanks shalin, I need to index the xml which is in solr's format only. I want to index that xnl directly using solrj same like how we post using curl. Is there any API class is available for that? Can you please provide me any reference link? -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, April 01, 2009 6:07 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj On Wed, Apr 1, 2009 at 5:17 PM, Radha C. cra...@ceiindia.com wrote: Hi All, I want to index the document fields in a xml file to index using solrj. I know how to index the document fields using doc.addfield(). But I dont know how to post the xml document instead of adding each field in solrj. Can I index xml file using solrj? Can anyone help me in how to do this? Solr will only accept xml files which are in the solr's update xml format. You cannot post any arbitrary xml (you can convert using the xslt). You can also parse it yourself and use solrj for adding the document. There's DataImportHandler too which can parse XML using xpath. -- Regards, Shalin Shekhar Mangar.
FW: multicore
From: Neha Bhardwaj [mailto:neha_bhard...@persistent.co.in] Sent: Wednesday, April 01, 2009 6:52 PM To: 'solr-user@lucene.apache.org' Subject: multicore Hi, I need to create multiple cores for my project. I need to know: how to have multiple cores ? can we start all cores from single startup file or we need to start all independently? I need a way by which I can start all of them in one go. Thanks Neha
RE: Indexing fields of xml file using solrj
Hey No, actually I did not look at your response email. But I saw your email after I responded to Shalin, and you gave me the correct answer. Thanks a lot. So I started coding it; I thought to reply to you once I executed it successfully. Here is my code:

DirectXmlRequest xmlreq = new DirectXmlRequest("/update", xml.toString());
server.request(xmlreq);
server.commit();

But I am having trouble identifying the xml location. I have the input xml in $solrhome/inputdata/example.xml. Do you have any idea how to get the solr home location dynamically using any solrj API class? _ From: Giovanni De Stefano [mailto:giovanni.destef...@gmail.com] Sent: Wednesday, April 01, 2009 7:30 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj I understand Shalin is a guru and I am nobody but... http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/request/DirectXmlRequest.html is what you need if you want to use Solrj... :-) On 4/1/09, Radha C. cra...@ceiindia.com wrote: Thanks shalin, I need to index the xml which is in solr's format only. I want to index that xnl directly using solrj same like how we post using curl. Is there any API class is available for that? Can you please provide me any reference link? -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, April 01, 2009 6:07 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj On Wed, Apr 1, 2009 at 5:17 PM, Radha C. cra...@ceiindia.com wrote: Hi All, I want to index the document fields in a xml file to index using solrj. I know how to index the document fields using doc.addfield(). But I dont know how to post the xml document instead of adding each field in solrj.
You cannot post any arbitrary xml (you can convert using the xslt). You can also parse it yourself and use solrj for adding the document. There's DataImportHandler too which can parse XML using xpath. -- Regards, Shalin Shekhar Mangar.
Re: how to improve concurrent request performance and stress testing
Thanks for all this help. But I guess it can't be optimal with a lot of updates: my slave gets back from the master 20,000 updated docs every 20 minutes. Is it worth trying to warm up and keep a big cache so that everything stays fast with that amount of updates, I guess...? zqzuk wrote: Hi, try firstly having a look at http://wiki.apache.org/solr/SolrCaching the section on firstSearcher and warming. Search engines rely on caching, so first searches will be slow. I think for fair testing it is necessary to warm up the search engine by sending the most frequently used and/or most costly queries, then start your stress testing. I used this tool http://code.google.com/p/httpstone/ to do stress testing. It allows you to create multiple threads sending queries to a server simultaneously, and records the time taken to process each query in each thread. Hope it helps. sunnyfr wrote: Hi, I'm trying as well to stress test solr. I would love some advice to manage it properly. I'm using solr 1.3 and tomcat55. Thanks a lot, zqzuk wrote: Hi, I am doing stress testing of my solr application to see how many concurrent requests it can handle and how long it takes. But I am not sure if I have done it in the proper way... responses seem to be very slow. My configuration: 1 Solr instance, using the default settings distributed in the example code, with two changes:

<useColdSearcher>true</useColdSearcher>
<maxWarmingSearchers>10</maxWarmingSearchers>

As I thought, the more searchers, the more concurrent requests can be dealt with? There are 1.1 million documents indexed, and the platform is WinXP SP2, a dual-core 1.8 GHz machine with 2 GB of RAM. I used httpstone, a simple server load testing tool, to create 100 workers (so 100 threads), each issuing the same query to the server. To deal with a single request of this query it took solr 2 seconds (with facet counts), and 7 documents are returned.
I was assuming that only the first request would take a longer time and the following requests should be almost instantaneous, as the query is the same. But strangely, the first response took as long as 20 seconds. It looked like the 100 workers sent the same request to solr and then all of a sudden the solr server went silent. Only after 20 seconds did some of these workers start to receive responses, and still very slowly. Clearly I must have done something wrong configuring the solr server... Could you give me some pointers on how to improve the performance please? Many thanks! -- View this message in context: http://www.nabble.com/how-to-improve-concurrent-request-performance-and-stress-testing-tp15299687p22827717.html Sent from the Solr - User mailing list archive at Nabble.com.
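For the warming zqzuk mentions, solrconfig.xml can register warm-up queries that run before the first searcher goes live; a sketch (the query values are placeholders to replace with your own frequent or costly queries):

```xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">your most frequent query</str>
      <str name="start">0</str>
      <str name="rows">10</str>
    </lst>
  </arr>
</listener>
```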
Re: FW: multicore
how to have multiple cores ? You need a solr.xml file in the root of your solr home. In this solr.xml you initialize the cores. In this same folder you will have a folder per core, each with its own /conf and /data. Every core has its own solrconfig.xml and schema.xml. If you grab a nightly build you will see a config example in there. Everything is properly explained in the Solr core wiki: http://wiki.apache.org/solr/CoreAdmin can we start all cores from single startup file or we need to start all independently? I need a way by which I can start all of them in one go. Once you have your cores configured in your webapp, all of them will be loaded automatically when you start your server. Neha Bhardwaj wrote: From: Neha Bhardwaj [mailto:neha_bhard...@persistent.co.in] Sent: Wednesday, April 01, 2009 6:52 PM To: 'solr-user@lucene.apache.org' Subject: multicore Hi, I need to create multiple cores for my project. I need to know: how to have multiple cores ? can we start all cores from single startup file or we need to start all independently? I need a way by which I can start all of them in one go. Thanks Neha -- View this message in context: http://www.nabble.com/FW%3A-multicore-tp22827267p22827926.html Sent from the Solr - User mailing list archive at Nabble.com.
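A minimal solr.xml of the kind described above, assuming two cores named core0 and core1 living in folders of the same name under solr home (the names are placeholders):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- each instanceDir holds its own conf/ (solrconfig.xml, schema.xml) and data/ -->
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
```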
Re: Indexing fields of xml file using solrj
Hello, you can try with: SolrConfig.config.getResourceLoader().getInstanceDir() Let me know if it works. Cheers, Giovanni On 4/1/09, Radha C. cra...@ceiindia.com wrote: Hey No, Actually I did not look at your response email. But I saw your email after I responded to shalin and you gave me a correct answer. Thanks a lot. So I started coding it also, I thought to reply to u once I executed successfully. Here is my code, DirectXmlRequest xmlreq = new DirectXmlRequest( /update, xml.toString() ); server.request( xmlreq ); server.commit(); But I am having trouble in identifying the xml location, I am having the input xml in $solrhome/inputdata/example.xml, Do you have any idea about how to get solrhome location dynamically by using any solrj API class? _ From: Giovanni De Stefano [mailto:giovanni.destef...@gmail.com] Sent: Wednesday, April 01, 2009 7:30 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj I understand Shalin is a guru and I am nobody but... http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/request/Direc tXmlRequest.html Is what you need if you want to use Solrj... :-) On 4/1/09, Radha C. cra...@ceiindia.com wrote: Thanks shalin, I need to index the xml which is in solr's format only. I want to index that xnl directly using solrj same like how we post using curl. Is there any API class is available for that? Can you please provide me any reference link? -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, April 01, 2009 6:07 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj On Wed, Apr 1, 2009 at 5:17 PM, Radha C. cra...@ceiindia.com wrote: Hi All, I want to index the document fields in a xml file to index using solrj. I know how to index the document fields using doc.addfield(). But I dont know how to post the xml document instead of adding each field in solrj. 
Can I index xml file using solrj? Can anyone help me in how to do this? Solr will only accept xml files which are in the solr's update xml format. You cannot post any arbitrary xml (you can convert using the xslt). You can also parse it yourself and use solrj for adding the document. There's DataImportHandler too which can parse XML using xpath. -- Regards, Shalin Shekhar Mangar.
RE: getting solr home path dynamically
Hi, No luck. I tried as follows; my solr home is outside of my solrj client. I think it is looking for the config at the CWD:

SolrConfig config = new SolrConfig();
String dir = config.getResourceLoader().getDataDir(); // I used getInstanceDir() also
//File f = new File( home, "solr.xml" );
System.out.println("solr home : " + dir);

But got an exception:

Apr 1, 2009 8:20:41 PM org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: JNDI not configured for solr (NoInitialContextEx)
Apr 1, 2009 8:20:41 PM org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: solr home defaulted to 'solr/' (could not find system property or JNDI)
Apr 1, 2009 8:20:41 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to 'solr/'
Apr 1, 2009 8:20:41 PM org.apache.solr.core.SolrResourceLoader createClassLoader
INFO: Reusing parent classloader
Exception in thread "main" java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/conf/', cwd=D:\Lucene\solrjclient
at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:197)
at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:165)
at org.apache.solr.core.Config.<init>(Config.java:101)
at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:111)
at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:68)
at SolrDeleteTest.main(SolrDeleteTest.java:30)
Java Result: 1

Anybody have any idea? _ From: Giovanni De Stefano [mailto:giovanni.destef...@gmail.com] Sent: Wednesday, April 01, 2009 8:19 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj Hello, you can try with: SolrConfig.config.getResourceLoader().getInstanceDir() Let me know if it works. Cheers, Giovanni On 4/1/09, Radha C. cra...@ceiindia.com wrote: Hey No, Actually I did not look at your response email. But I saw your email after I responded to shalin and you gave me a correct answer. Thanks a lot.
So I started coding it also, I thought to reply to u once I executed successfully. Here is my code, DirectXmlRequest xmlreq = new DirectXmlRequest( /update, xml.toString() ); server.request( xmlreq ); server.commit(); But I am having trouble in identifying the xml location, I am having the input xml in $solrhome/inputdata/example.xml, Do you have any idea about how to get solrhome location dynamically by using any solrj API class? _ From: Giovanni De Stefano [mailto:giovanni.destef...@gmail.com] Sent: Wednesday, April 01, 2009 7:30 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj I understand Shalin is a guru and I am nobody but... http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/request/Direc tXmlRequest.html Is what you need if you want to use Solrj... :-) On 4/1/09, Radha C. cra...@ceiindia.com wrote: Thanks shalin, I need to index the xml which is in solr's format only. I want to index that xnl directly using solrj same like how we post using curl. Is there any API class is available for that? Can you please provide me any reference link? -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, April 01, 2009 6:07 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Indexing fields of xml file using solrj On Wed, Apr 1, 2009 at 5:17 PM, Radha C. cra...@ceiindia.com wrote: Hi All, I want to index the document fields in a xml file to index using solrj. I know how to index the document fields using doc.addfield(). But I dont know how to post the xml document instead of adding each field in solrj. Can I index xml file using solrj? Can anyone help me in how to do this? Solr will only accept xml files which are in the solr's update xml format. You cannot post any arbitrary xml (you can convert using the xslt). You can also parse it yourself and use solrj for adding the document. There's DataImportHandler too which can parse XML using xpath. 
-- Regards, Shalin Shekhar Mangar.
Re: getting solr home path dynamically
On Apr 1, 2009, at 10:55 AM, Radha C. wrote: But I am having trouble in identifying the xml location, I am having the input xml in $solrhome/inputdata/example.xml, Do you have any idea about how to get solrhome location dynamically by using any solrj API class? Using SolrJ remotely, you can hit the /admin/system request handler to get the Solr home directory. In a URL it'd be http://localhost:8983/solr/admin/system . You get this sort of thing in the response:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">80</int>
  </lst>
  <lst name="core">
    <str name="schema">example</str>
    <str name="host">10.0.1.193</str>
    <date name="now">2009-04-01T15:24:32.428Z</date>
    <date name="start">2009-04-01T15:21:20.765Z</date>
    <lst name="directory">
      <str name="instance">/Users/erikhatcher/dev/solr/example/solr</str>
      <str name="data">/Users/erikhatcher/dev/solr/example/./solr/data</str>
      <str name="index">/Users/erikhatcher/dev/solr/example/solr/data/index</str>
    </lst>
  </lst>
  ...

And you can navigate the SolrJ response to get to the directory/instance value. I question whether it's a good idea to do this from a SolrJ client though, as that directory is only useful on the server itself and a client on the same machine, but not actually remotely. But you can get at it at least :) Erik
Re: Encoding problem
Thanks, I detected that same problem. I have CP 1252 system file encoding and was saving the data-config.xml file in UTF-8. DIH was reading it using the default encoding. One possible workaround was using InputStream and OutputStream like DIH does, but then the files won't be in UTF-8 if the system has a different encoding (not really good for XML files). I will get the latest 1.4 build and keep the files in UTF-8. On Fri, Mar 27, 2009 at 9:37 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Sat, Mar 28, 2009 at 12:51 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: I see that you are specifying the topologyname's value in the query itself. It might be a bug in DataImportHandler because it reads the data-config as a string from an InputStream. If your default platform encoding is not UTF-8, this may be the cause. I've opened SOLR-1090 to fix this issue. https://issues.apache.org/jira/browse/SOLR-1090 -- Regards, Shalin Shekhar Mangar.
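The mismatch is easy to reproduce with the stdlib alone: UTF-8 bytes decoded with the platform default (here windows-1252, as in the report) come back mangled. A small sketch:

```java
import java.nio.charset.Charset;

public class EncodingDemo {
    // Decode bytes with an explicit charset, the way a config reader should,
    // instead of relying on the platform default encoding.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String xml = "<query>r\u00e9gion</query>"; // file content containing a non-ASCII char
        byte[] utf8Bytes = xml.getBytes(Charset.forName("UTF-8")); // bytes as written to disk in UTF-8
        System.out.println(decode(utf8Bytes, "UTF-8"));        // round-trips correctly
        System.out.println(decode(utf8Bytes, "windows-1252")); // mangled: the é comes back as Ã©
    }
}
```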
Re: Merging Solr Indexes
Hi, Yes, you can write to the same index from multiple threads. You still need to keep track of the index size manually, whether you create 1 or N indices/cores. I'd go with a single index first. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: vivek sar vivex...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, April 1, 2009 4:26:04 AM Subject: Re: Merging Solr Indexes Thanks Otis. Could you write to the same core (same index) from multiple threads at the same time? I thought each writer would lock the index so others could not write at the same time. I'll try it though. Another reason for putting indexes in separate cores was to limit the index size. Our index can grow up to 50G a day, so I was hoping writing to smaller indexes would be faster in separate cores, and if needed I can merge them at a later point (like end of day). I want to keep daily cores. Isn't this a good idea? How else can I limit the index size (besides multiple instances or separate boxes)? Thanks, -vivek On Tue, Mar 31, 2009 at 8:28 PM, Otis Gospodnetic wrote: Let me start with 4) Have you tried simply using multiple threads to send your docs to a single Solr instance/core? You should get about the same performance as what you are trying with your approach below, but without the headache of managing multiple cores and index merging (not yet possible to do programmatically). Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: vivek sar To: solr-user@lucene.apache.org Sent: Tuesday, March 31, 2009 1:59:01 PM Subject: Merging Solr Indexes Hi, As part of speeding up the index process I'm thinking of spawning multiple threads which will write to different temporary SolrCores. Once the index process is done I want to merge all the indexes in the temporary cores into a master core.
For example, if I want one SolrCore per day, then every index cycle I'll spawn 4 threads which will index into some temporary index, and once they are done I want to merge all these into the day core. My questions: 1) I want to use the same schema and solrconfig.xml for all cores without duplicating them - how do I do that? 2) How do I merge the temporary Solr cores into one master core programmatically? I've read the wiki on MergingSolrIndexes, but I want to do it programmatically (like in Lucene - writer.addIndexes(..)) once the temporary indices are done. 3) Can I remove the temporary indices once the merge process is done? 4) Is this the right strategy to speed up indexing? Thanks, -vivek
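One non-programmatic fallback worth knowing: Lucene's contrib IndexMergeTool can merge the per-thread index directories into the day index from the command line (the paths and jar names below are assumptions for your layout):

```shell
# First argument is the destination index; the rest are source indexes.
java -cp lucene-core.jar:lucene-misc.jar \
  org.apache.lucene.misc.IndexMergeTool \
  /indexes/day/index /indexes/tmp1/index /indexes/tmp2/index
```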
Re: SingleInstanceLock: write.lock
Hi, Are you sure there really is/was enough free space? Were you monitoring the disk space when this error happened? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: davidb dav...@mate1inc.com To: solr-user@lucene.apache.org Sent: Tuesday, March 31, 2009 5:31:42 PM Subject: SingleInstanceLock: write.lock Hi, I am new to Solr and am having an issue with the following SingleInstanceLock: write.lock. We have solr 1.3 running under tomcat 1.6.0_11. We have an index of users that are online at any given time (Usually around 4000 users). The records from solr are deleted and repopulated at around 30 second intervals (we would like this to be as fast as possible). The server runs fine for a period of time, then we get the following:

SEVERE: auto commit error...
java.io.FileNotFoundException: /data/solr-oni/data/index/_3w.prx (No space left on device)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(Unknown Source)
at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:639)
at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:442)
at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:104)
at org.apache.lucene.index.TermsHash.flush(TermsHash.java:145)
at org.apache.lucene.index.DocInverter.flush(DocInverter.java:74)
at org.apache.lucene.index.DocFieldConsumers.flush(DocFieldConsumers.java:75)
at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:60)
at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:574)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3615)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3524)
at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1709)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1674)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1648)
at
org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:153) at org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:175) at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:359) at org.apache.solr.update.DirectUpdateHandler2$CommitTracker.run(DirectUpdateHandler2.java:515) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) 26-Mar-2009 6:19:50 PM org.apache.solr.common.SolrException log SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SingleInstanceLock: write.lock at org.apache.lucene.store.Lock.obtain(Lock.java:85) at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1140) at org.apache.lucene.index.IndexWriter.&lt;init&gt;(IndexWriter.java:938) at org.apache.solr.update.SolrIndexWriter.&lt;init&gt;(SolrIndexWriter.java:116) at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:122) at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:167) at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:221) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59) at com.pjaol.search.solr.update.LocalUpdaterProcessor.processAdd(LocalUpdateProcessorFactory.java:148) at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:196) at 
org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at
Re: Wildcard searches
Hi, Another option for 1) is to use n-grams with token begin/end symbols. Then you won't need to use wildcards at all, but you'll have a larger index. 2) may be added to Lucene in the near future, actually - I saw a related JIRA issue. But in the meantime, yes, you could implement it via a custom QParserPlugin. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Vauthrin, Laurent laurent.vauth...@disney.com To: solr-user@lucene.apache.org Sent: Monday, March 30, 2009 5:45:30 PM Subject: Wildcard searches Hello again, I'm in the process of converting one of our services that was previously using Lucene to use Solr instead. The main focus here is to preserve backwards compatibility (even if some searches are not as efficient). There are currently two scenarios that are giving me problems right now. 1. Leading wildcard searches/suffix searches (e.g. *ickey) I've looked at https://issues.apache.org/jira/browse/SOLR-218. Is the best approach to create a QParserPlugin and change the parser to allow leading wildcards - setAllowLeadingWildcard(true)? At the moment we're trying to avoid indexing terms in reverse order. 2. Phrase searches with wildcards (e.g. "Mickey Mou*") From what I understand, Solr/Lucene doesn't support this, but we used to get results with the following code: new WildcardQuery(new Term("U_name", "Mickey Mou*")) Is it possible for me to allow this capability in a QParserPlugin? Is there another way for me to do it? Thanks, Laurent Vauthrin
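Otis's first suggestion can be sketched with plain JDK code. This is only a minimal illustration of the n-gram-with-boundary-symbols idea, not Solr's actual NGramTokenizer; the class name and the `^`/`$` marker characters are my own choices. Because the grams carry boundary markers, a suffix query like *ickey can be answered by matching the end-anchored grams of "ickey" (e.g. "key", "ey$") instead of scanning terms with a leading wildcard:

```java
import java.util.ArrayList;
import java.util.List;

public class BoundaryNGrams {
    // Wrap the term in ^ / $ boundary markers, then emit every n-gram.
    // "mickey" with n=3 yields: ^mi, mic, ick, cke, key, ey$
    public static List<String> grams(String term, int n) {
        String marked = "^" + term + "$";
        List<String> out = new ArrayList<String>();
        for (int i = 0; i + n <= marked.length(); i++) {
            out.add(marked.substring(i, i + n));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(grams("mickey", 3));
    }
}
```

The trade-off Otis mentions is visible here: six indexed grams for one six-letter term, hence the larger index.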
Quick Indexing Method???
Hello, I am new to Solr. I looked at the getting started document. Can somebody show me how to index a text file? I've tried another method, but it just takes too much time. I am aware that Solr takes XML files. I'm trying to find the *quickest* method to index a text, binary, or pcap file. Preferably, a text file. Thanx, Alex V.
Re: Quick Indexing Method???
What about building an XML file with text fields, as everyone does? :) On Wed, Apr 1, 2009 at 6:17 PM, Alex Vu alex.v...@gmail.com wrote: Hello, I am new to Solr. I looked at the getting started document. Can somebody show me how to index a text file? I've tried another method, but it just takes too much time. I am aware that Solr takes XML files. I'm trying to find the *quickest* method to index a text, binary, or pcap file. Preferably, a text file. Thanx, Alex V.
Re: Defining DataDir in Multi-Core
Thanks Shalin. Is it available in the latest nightly build? Is there any other way I can create cores dynamically (using CREATE service) which will use the same schema.xml and solrconfig.xml, but write to different data directories? Thanks, -vivek On Wed, Apr 1, 2009 at 1:55 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Apr 1, 2009 at 1:48 PM, vivek sar vivex...@gmail.com wrote: I'm using the latest released one - Solr 1.3. The wiki says passing dataDir to CREATE action (web service) should work, but that doesn't seem to be working. That is a Solr 1.4 feature (not released yet). -- Regards, Shalin Shekhar Mangar.
Solr Stats and Their Meaning
Is there any documentation for some of the Solr admin stats? I'm not sure if these numbers are good or bad. We have a pretty small index (numDocs : 25253, maxDoc : 25312) org.apache.solr.handler.StandardRequestHandler requests : 11 errors : 0 avgTimePerRequest : 80.36364 avgRequestsPerSecond : 1.4128508E-4 org.apache.solr.handler.DisMaxRequestHandler requests : 28025 errors : 1 avgTimePerRequest : 74.93038 avgRequestsPerSecond : 0.35995585 I'd be interested to know if there are other stats I should look at. Thanks - Mike
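For what it's worth, those two averages are cumulative since the handler started, so they can be decoded rather than just eyeballed. A small JDK-only sketch under that assumption (the class and method names are mine; I'm assuming avgRequestsPerSecond = totalRequests / uptimeSeconds, which is how these counters are computed):

```java
public class SolrStats {
    // Assuming avgRequestsPerSecond = totalRequests / uptimeSeconds,
    // the handler's uptime can be recovered from the two reported numbers.
    public static double impliedUptimeSeconds(long requests, double avgReqPerSec) {
        return requests / avgReqPerSec;
    }

    public static void main(String[] args) {
        // DisMaxRequestHandler figures from the stats above:
        double uptime = impliedUptimeSeconds(28025, 0.35995585);
        // ~77857 s, i.e. roughly 21.6 hours of uptime: about one query
        // every 2.8 s, each taking ~75 ms on average.
        System.out.printf("implied uptime: %.0f s (%.1f h)%n", uptime, uptime / 3600.0);
    }
}
```

Read this way, a tiny avgRequestsPerSecond (like the 1.4e-4 on the StandardRequestHandler) just means the handler is rarely used, not that it is slow; avgTimePerRequest (in ms) is the latency number to watch.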
Re: Defining DataDir in Multi-Core
Hi, I tried the latest nightly build (04-01-09) - it takes the dataDir property now, but it's creating the data dir at the wrong location. For example, I have the following in solr.xml:

&lt;solr persistent="true"&gt;
  &lt;cores adminPath="/admin/cores"&gt;
    &lt;core name="core0" instanceDir="/Users/opal/temp/chat/solr" dataDir="/Users/opal/temp/afterchat/solr/data/core0"/&gt;
  &lt;/cores&gt;
&lt;/solr&gt;

but it always seems to create the solr/data directory in the cwd (where I started Tomcat from). Here is the log from Catalina.out:

Apr 1, 2009 10:47:21 AM org.apache.solr.core.SolrCore init
INFO: [core2] Opening new SolrCore at /Users/opal/temp/chat/solr/, dataDir=./solr/data/
..
Apr 1, 2009 10:47:21 AM org.apache.solr.core.SolrCore initIndex
WARNING: [core2] Solr index directory './solr/data/index' doesn't exist. Creating new index...

I've also tried relative paths, but to no avail. Is this a bug? Thanks, -vivek On Wed, Apr 1, 2009 at 9:45 AM, vivek sar vivex...@gmail.com wrote: Thanks Shalin. Is it available in the latest nightly build? Is there any other way I can create cores dynamically (using the CREATE service) which will use the same schema.xml and solrconfig.xml, but write to different data directories? Thanks, -vivek On Wed, Apr 1, 2009 at 1:55 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Apr 1, 2009 at 1:48 PM, vivek sar vivex...@gmail.com wrote: I'm using the latest released one - Solr 1.3. The wiki says passing dataDir to the CREATE action (web service) should work, but that doesn't seem to be working. That is a Solr 1.4 feature (not released yet). -- Regards, Shalin Shekhar Mangar.
Re: DIH Date conversion from a source column skews time
Was there any follow-up to this issue I found? Is this a legitimate bug with the time of day changing? I could try to solve this by executing the same xpath statement twice:

&lt;field column="original_air_date_d" xpath="/add/doc/field[@name='original_air_date_d']" /&gt;
&lt;field column="temp_original_air_date_s" xpath="/add/doc/field[@name='original_air_date_d']" /&gt;

However, when I do that, the first field original_air_date_d does not make it into the index. It seems that you cannot have two identical xpath statements in the data import config file. Is this by design? On 4/1/09 7:45 AM, Small, Wesley wesley.sm...@mtvstaff.com wrote: I have noticed that setting a dynamic date field from a source column changes the time within the date. Can anyone confirm this? For example, the document I import has the following xml field:

&lt;field name="original_air_date_d"&gt;2002-12-18T00:00:00Z&lt;/field&gt;

In my data-import-config file I define the following instructions:

&lt;field column="temp_original_air_date_s" xpath="/add/doc/field[@name='original_air_date_d']" /&gt;
&lt;field column="original_air_year_s" sourceColName="temp_original_air_date_s" regex="([0-9][0-9][0-9][0-9])[- /.][0-9][0-9][- /.][0-9][0-9][T][0-9][0-9][:][0-9][0-9][:][0-9][0-9][Z]" replaceWith="$1" /&gt;
&lt;field column="original_air_date_d" sourceColName="temp_original_air_date_s" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss'Z'"/&gt;

What is set in my index is the following:

&lt;arr name="temp_original_air_date_s"&gt;&lt;str&gt;2002-12-18T00:00:00Z&lt;/str&gt;&lt;/arr&gt;
&lt;arr name="original_air_year_s"&gt;&lt;str&gt;2002&lt;/str&gt;&lt;/arr&gt;
&lt;arr name="original_air_date_d"&gt;&lt;date&gt;2002-12-18T05:00:00Z&lt;/date&gt;&lt;/arr&gt;

You'll notice that the hour (HH) in original_air_date_d is set to 05. It should still be 00. I have noticed that it changes to either 04 or 05 in all cases within my index. In my schema the dynamic field *_d is:

&lt;dynamicField name="*_d" type="date" indexed="true" stored="true"/&gt;

Thanks, Wesley.
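A likely culprit for the 00 -&gt; 05 shift: DIH's DateFormatTransformer parses with java.text.SimpleDateFormat, and in a pattern like yyyy-MM-dd'T'HH:mm:ss'Z' the quoted 'Z' is a literal character, so the string is parsed in the JVM's default time zone and then printed back in UTC. A JDK-only sketch of that round trip (the class and method names are mine; I'm assuming a US Eastern JVM, which also matches the 04-vs-05 EDT/EST observation above):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class DateSkewDemo {
    static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    // Parse the input in the given zone (the quoted 'Z' in the pattern is a
    // literal, so SimpleDateFormat supplies its own zone), then print in UTC
    // the way Solr renders date fields.
    public static String roundTrip(String in, TimeZone parseZone) {
        try {
            SimpleDateFormat parser = new SimpleDateFormat(PATTERN);
            parser.setTimeZone(parseZone);
            Date d = parser.parse(in);
            SimpleDateFormat printer = new SimpleDateFormat(PATTERN);
            printer.setTimeZone(TimeZone.getTimeZone("UTC"));
            return printer.format(d);
        } catch (ParseException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String in = "2002-12-18T00:00:00Z";
        // US Eastern is UTC-5 in December: midnight becomes 05:00Z (04:00Z in summer).
        System.out.println(roundTrip(in, TimeZone.getTimeZone("America/New_York")));
        // Parsing explicitly in UTC keeps the value unchanged.
        System.out.println(roundTrip(in, TimeZone.getTimeZone("UTC")));
    }
}
```

If this is the cause, running the JVM with -Duser.timezone=UTC would be one way to make the value round-trip unchanged.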
Re: DIH; Hardcode field value/replacement based on source column
Thanks for the feedback. The TemplateTransformer is a pretty straightforward solution. Perfect. Wesley. On 4/1/09 12:14 AM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote: use TemplateTransformer

&lt;field column="content_type_s" template="Video" /&gt;

On Tue, Mar 31, 2009 at 9:20 PM, Wesley Small wesley.sm...@mtvstaff.com wrote: I am trying to find a clean way to *hardcode* a field/column to a specific value during the DIH process. It does seem to be possible, but I am getting a slightly invalid constant value in my index:

&lt;field column="content_type_s" sourceColName="title_t" regex="(.*)" replaceWith="Video" /&gt;

However, the value in the index was set to VideoVideo for all documents. Any idea why this DIH instruction would make the constant value appear twice? Thanks, Wesley. -- --Noble Paul
Re: Merging Solr Indexes
There is a jira issue on supporting index merge: https://issues.apache.org/jira/browse/SOLR-1051. But I agree with Otis that you should go with a single index first. Cheers, Ning On Wed, Apr 1, 2009 at 12:06 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Hi, Yes, you can write to the same index from multiple threads. You still need to keep track of the index size manually, whether you create 1 or N indices/cores. I'd go with a single index first. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: vivek sar vivex...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, April 1, 2009 4:26:04 AM Subject: Re: Merging Solr Indexes Thanks Otis. Could you write to the same core (same index) from multiple threads at the same time? I thought each writer would lock the index so others could not write at the same time. I'll try it though. Another reason for putting indexes in separate cores was to limit the index size. Our index can grow up to 50G a day, so I was hoping writing to smaller indexes in separate cores would be faster, and if needed I can merge them at a later point (like end of day). I want to keep daily cores. Isn't this a good idea? How else can I limit the index size (besides multiple instances or separate boxes)? Thanks, -vivek On Tue, Mar 31, 2009 at 8:28 PM, Otis Gospodnetic wrote: Let me start with 4) Have you tried simply using multiple threads to send your docs to a single Solr instance/core? You should get about the same performance as what you are trying with your approach below, but without the headache of managing multiple cores and index merging (not yet possible to do programmatically). 
Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: vivek sar To: solr-user@lucene.apache.org Sent: Tuesday, March 31, 2009 1:59:01 PM Subject: Merging Solr Indexes Hi, As part of speeding up the index process I'm thinking of spawning multiple threads which will write to different temporary SolrCores. Once the index process is done I want to merge all the indexes in temporary cores to a master core. For ex., if I want one SolrCore per day then every index cycle I'll spawn 4 threads which will index into some temporary index and once they are done I want to merge all these into the day core. My questions, 1) I want to use the same schema and solrconfig.xml for all cores without duplicating them - how do I do that? 2) How do I merge the temporary Solr cores into one master core programmatically? I've read the wiki on MergingSolrIndexes, but I want to do it programmatically (like in Lucene - writer.addIndexes(..)) once the temporary indices are done. 3) Can I remove the temporary indices once the merge process is done? 4) Is this the right strategy to speed up indexing? Thanks, -vivek
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
On Apr 1, 2009, at 9:39 AM, Fergus McMenemie wrote: Grant, Redoing the work with your patch applied does not seem to make a difference! Is this the expected result? No, I didn't expect SOLR-1095 to fix the problem. Overwrite=false + 1095 does, however, AFAICT by your last line, right? I did run it again using the full file, this time using my iMac:

643465 (2008-04-01): 22min 14sec
734796 (2009-01-15): 73min 58sec
758795 (2009-03-26): 70min 55sec

Again using only the first 1M records, with commit=false&amp;overwrite=true:

643465 (2008-04-01): 2m51.516s
734796 (2009-01-15): 7m29.326s
758795 (2009-03-26): 8m18.403s
SOLR-1095:           7m41.699s

This time with commit=true&amp;overwrite=true:

643465 (2008-04-01): 2m49.200s
734796 (2009-01-15): 8m27.414s
758795 (2009-03-26): 9m32.459s
SOLR-1095:           7m58.825s

This time with commit=false&amp;overwrite=false:

643465 (2008-04-01): 2m46.149s
734796 (2009-01-15): 3m29.909s
758795 (2009-03-26): 3m26.248s
SOLR-1095:           2m49.997s

-- === Fergus McMenemie Email: fer...@twig.me.uk Techmore Ltd Phone: (UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer === -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
Re: Quick Indexing Method???
On Wed, Apr 1, 2009 at 9:47 PM, Alex Vu alex.v...@gmail.com wrote: Hello, I am new to Solr. I looked at the getting started document. Can somebody show me how to index a text file? I've tried another method, but it just takes too much time. I am aware that Solr takes XML files. I'm trying to find the *quickest* method to index a text, binary, or pcap file. Preferably, a text file. One quick method is to use CSV files. Look at http://wiki.apache.org/solr/UpdateCSV -- Regards, Shalin Shekhar Mangar.
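The CSV route Shalin points to needs no XML at all: one header line naming the fields, one line per document, POSTed to the CSV update handler. A JDK-only sketch that builds such a payload (the field names, file name, and the curl line in the comment are illustrative, not taken from the thread):

```java
public class CsvBuilder {
    // Quote a value the way a default-configured CSV loader expects:
    // wrap fields containing commas, quotes, or newlines, doubling inner quotes.
    public static String field(String v) {
        if (v.contains(",") || v.contains("\"") || v.contains("\n")) {
            return "\"" + v.replace("\"", "\"\"") + "\"";
        }
        return v;
    }

    public static void main(String[] args) {
        StringBuilder csv = new StringBuilder();
        csv.append("id,name\n");  // header row names the schema fields
        csv.append(field("1")).append(',').append(field("Hello, Solr")).append('\n');
        System.out.print(csv);
        // Save this as books.csv and POST it, e.g.:
        //   curl 'http://localhost:8983/solr/update/csv?commit=true' \
        //        --data-binary @books.csv -H 'Content-type:text/plain; charset=utf-8'
    }
}
```

For plain text files this avoids both XML escaping and the per-document request overhead that makes naive indexing slow.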
Re: Runtime exception when adding documents using solrj
Hi, I'm trying to add a list of POJO objects (using annotations) with solrj, but server.addBeans(...) is throwing this exception: org.apache.solr.common.SolrException: Bad Request Bad Request request: http://localhost:8080/solr/core0/update?wt=javabin&amp;version=2.2 Note, I'm using multi-core. There is no other exception in the solr log. Related question - I'm trying to upgrade solrj from the nightly build, but I get a class-not-found exception (java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory). What are all the dependencies for Solrj 1.4 (the wiki has only up to 1.3 information)? Thanks, -vivek On Wed, Apr 1, 2009 at 3:30 AM, Radha C. cra...@ceiindia.com wrote: Thanks Paul, I resolved it, I missed one field declaration in schema.xml. Now I have added it, and it works. -Original Message- From: Noble Paul നോബിള് नोब्ळ् [mailto:noble.p...@gmail.com] Sent: Wednesday, April 01, 2009 3:52 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Runtime exception when adding documents using solrj Can you take a look at the Solr logs and see what is happening? On Wed, Apr 1, 2009 at 3:19 PM, Radha C. cra...@ceiindia.com wrote: Thanks Paul, I changed the URL but I am getting another error - Bad Request. Any help will be appreciated. 
Exception in thread "main" org.apache.solr.common.SolrException: Bad Request Bad Request request: http://localhost:8080/solr/update?wt=javabin at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:428) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245) at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:243) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48) at SolrIndexTest.main(SolrIndexTest.java:47) Java Result: 1 -Original Message- From: Noble Paul നോബിള് नोब्ळ् [mailto:noble.p...@gmail.com] Sent: Wednesday, April 01, 2009 2:26 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Runtime exception when adding documents using solrj The url is wrong, try this: CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/"); On Wed, Apr 1, 2009 at 2:04 PM, Radha C. cra...@ceiindia.com wrote: Can anyone please tell me what is the issue with the Java code below? -Original Message- From: Radha C. [mailto:cra...@ceiindia.com] Sent: Wednesday, April 01, 2009 12:28 PM To: solr-user@lucene.apache.org Subject: RE: Runtime exception when adding documents using solrj I am using Solr 1.3 version _ From: Noble Paul നോബിള് नोब्ळ् [mailto:noble.p...@gmail.com] Sent: Wednesday, April 01, 2009 12:16 PM To: solr-user@lucene.apache.org; cra...@ceiindia.com Subject: Re: Runtime exception when adding documents using solrj Which version of Solr are you using? On Wed, Apr 1, 2009 at 12:01 PM, Radha C. cra...@ceiindia.com wrote: Hi All, I am trying to index documents by using the solrj client. 
I have written the simple code below:

CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/update");
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField("id", "id1", 1.0f);
doc1.addField("name", "doc1", 1.0f);
doc1.addField("price", 10);
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField("id", "id2", 1.0f);
doc2.addField("name", "doc2", 1.0f);
doc2.addField("price", 20);
Collection&lt;SolrInputDocument&gt; docs = new ArrayList&lt;SolrInputDocument&gt;();
docs.add(doc1);
docs.add(doc2);
server.add(docs);
server.commit();

But I am getting the error below. Can anyone tell me what is wrong with the above code?

Exception in thread "main" java.lang.RuntimeException: Invalid version or the data in not in 'javabin' format
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:98)
at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:470)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245)
at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:243)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
at SolrIndexTest.main(SolrIndexTest.java:46)
Java Result: 1
-- --Noble Paul -- --Noble Paul -- --Noble Paul
Re: Runtime exception when adding documents using solrj
On Thu, Apr 2, 2009 at 1:13 AM, vivek sar vivex...@gmail.com wrote: Hi, I'm trying to add a list of POJO objects (using annotations) with solrj, but server.addBeans(...) is throwing this exception: org.apache.solr.common.SolrException: Bad Request Bad Request request: http://localhost:8080/solr/core0/update?wt=javabin&amp;version=2.2 Note, I'm using multi-core. There is no other exception in the solr log. Can you make sure all the cores' solrconfig.xml have the following line?

&lt;requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" /&gt;

The above is needed for the binary update format to work. I don't think the multi-core example solrconfig.xml in the solr nightly builds contains this line. Related question - I'm trying to upgrade solrj from the nightly build, but I get a class-not-found exception (java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory). What are all the dependencies for Solrj 1.4 (the wiki has only up to 1.3 information)? I think you need slf4j-api-1.5.5.jar and slf4j-jdk14-1.5.5.jar. Both can be found in solr's nightly downloads in the lib directory. -- Regards, Shalin Shekhar Mangar.
Re: Runtime exception when adding documents using solrj
Thanks Shalin. I added that in the solrconfig.xml, but now I get this exception: org.apache.solr.common.SolrException: Not Found Not Found request: http://localhost:8080/solr/core0/update?wt=javabin&amp;version=2.2 I do have core0 under the solr.home. The core0 directory also contains the conf and data directories. The solr.xml has the following in it:

&lt;cores adminPath="/admin/cores"&gt;
  &lt;core name="core0" instanceDir="core0" dataDir="data"/&gt;
&lt;/cores&gt;

Am I missing anything else? Thanks, -vivek On Wed, Apr 1, 2009 at 1:02 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Thu, Apr 2, 2009 at 1:13 AM, vivek sar vivex...@gmail.com wrote: Hi, I'm trying to add a list of POJO objects (using annotations) with solrj, but server.addBeans(...) is throwing this exception: org.apache.solr.common.SolrException: Bad Request Bad Request request: http://localhost:8080/solr/core0/update?wt=javabin&amp;version=2.2 Note, I'm using multi-core. There is no other exception in the solr log. Can you make sure all the cores' solrconfig.xml have the following line? &lt;requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" /&gt; The above is needed for the binary update format to work. I don't think the multi-core example solrconfig.xml in the solr nightly builds contains this line. Related question - I'm trying to upgrade solrj from the nightly build, but I get a class-not-found exception (java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory). What are all the dependencies for Solrj 1.4 (the wiki has only up to 1.3 information)? I think you need slf4j-api-1.5.5.jar and slf4j-jdk14-1.5.5.jar. Both can be found in solr's nightly downloads in the lib directory. -- Regards, Shalin Shekhar Mangar.
Additive filter queries
I have a design question for all of those who might be willing to provide an answer. We are looking for a way to do a type of additive filters. Our documents are comprised of a single item of a specified color. We will use shoes as an example. Each document contains a multivalued "size" field with all sizes and a multivalued "width" field for all widths available for a given color. Our issue is that the values are not linked to each other. This issue can be seen when a user chooses a size (e.g. 7) and we filter the options down to only size 7. When the width facet is displayed it will have all widths available for all documents that match on size 7 even though most don't come in a wide width. We are looking for strategies to filter facets based on other facets in separate queries. -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562
spectrum of Lucene queries in solr?
Hello list, I am surprised not to find any equivalent to the classical Lucene queries in Solr... I must have looked in the wrong place... E.g. where can I get a BooleanQuery, a PrefixQuery, a FuzzyQuery, or even a few SpanQueries? Thanks in advance, paul
java.lang.ClassCastException: java.lang.Long using Solrj
Hi, I'm using solrj (released v1.3) to add my POJO objects (server.addBeans(...)), but I'm getting this exception: java.lang.ClassCastException: java.lang.Long at org.apache.solr.common.util.NamedListCodec.unmarshal(NamedListCodec.java:89) at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:385) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183) at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48) at org.apache.solr.client.solrj.SolrServer.addBeans(SolrServer.java:57) I don't have any Long member variable in my Java object, so I'm not sure where this is coming from. I've checked the schema.xml to make sure the data types are ok. I'm adding 15K objects at a time - I'm assuming that should be ok. Any ideas? Thanks, -vivek
Re: spectrum of Lucene queries in solr?
Paul, I'm not sure I understand what you're looking for exactly. Solr supports Lucene's QueryParser by default for /select?q=... so you get the breadth of what it supports including boolean, prefix, fuzzy, and more. QueryParser has never supported span queries though. There is also a dismax parser available (defType=dismax to enable it), and numerous other parser plugins. Queries with Solr aren't created from the client as a Query object, but rather some string parameters come from the client that are then used to build a Query on the server side. You can also add your own QParserPlugin to build custom Lucene Query objects however you like. Erik On Apr 1, 2009, at 6:34 PM, Paul Libbrecht wrote: I am surprised not to find any equivalent to the classical Lucene queries in Solr... I must have badly looked... E.g. where can I get a BooleanQuery, a PrefixQuery, a FuzzyQuery, or even a few spanqueries? thanks in advance paul
Re: Runtime exception when adding documents using solrj
On Thu, Apr 2, 2009 at 2:34 AM, vivek sar vivex...@gmail.com wrote: Thanks Shalin. I added that in the solrconfig.xml, but now I get this exception: org.apache.solr.common.SolrException: Not Found Not Found request: http://localhost:8080/solr/core0/update?wt=javabin&amp;version=2.2 I do have core0 under the solr.home. The core0 directory also contains the conf and data directories. The solr.xml has the following in it:

&lt;cores adminPath="/admin/cores"&gt;
  &lt;core name="core0" instanceDir="core0" dataDir="data"/&gt;
&lt;/cores&gt;

Are you able to see the Solr admin dashboard at http://localhost:8080/solr/core0/admin/ ? Are there any exceptions in the Solr log? -- Regards, Shalin Shekhar Mangar.
Re: DIH Date conversion from a source column skews time
I guess dateFormat does the job properly, but the returned value is changed according to the timezone. Can you try this out: add an extra field which converts the date via toString():

&lt;field column="original_air_date_d_str" template="${entityname.original_air_date_d}"/&gt;

This would add an extra field, as a string, to the index. On Wed, Apr 1, 2009 at 11:31 PM, Wesley Small wesley.sm...@mtvstaff.com wrote: Was there any follow-up to this issue I found? Is this a legitimate bug with the time of day changing? I could try to solve this by executing the same xpath statement twice:

&lt;field column="original_air_date_d" xpath="/add/doc/field[@name='original_air_date_d']" /&gt;
&lt;field column="temp_original_air_date_s" xpath="/add/doc/field[@name='original_air_date_d']" /&gt;

However, when I do that, the first field original_air_date_d does not make it into the index. It seems that you cannot have two identical xpath statements in the data import config file. Is this by design? On 4/1/09 7:45 AM, Small, Wesley wesley.sm...@mtvstaff.com wrote: I have noticed that setting a dynamic date field from a source column changes the time within the date. Can anyone confirm this? For example, the document I import has the following xml field:

&lt;field name="original_air_date_d"&gt;2002-12-18T00:00:00Z&lt;/field&gt;

In my data-import-config file I define the following instructions:

&lt;field column="temp_original_air_date_s" xpath="/add/doc/field[@name='original_air_date_d']" /&gt;
&lt;field column="original_air_year_s" sourceColName="temp_original_air_date_s" regex="([0-9][0-9][0-9][0-9])[- /.][0-9][0-9][- /.][0-9][0-9][T][0-9][0-9][:][0-9][0-9][:][0-9][0-9][Z]" replaceWith="$1" /&gt;
&lt;field column="original_air_date_d" sourceColName="temp_original_air_date_s" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss'Z'"/&gt;

What is set in my index is the following:

&lt;arr name="temp_original_air_date_s"&gt;&lt;str&gt;2002-12-18T00:00:00Z&lt;/str&gt;&lt;/arr&gt;
&lt;arr name="original_air_year_s"&gt;&lt;str&gt;2002&lt;/str&gt;&lt;/arr&gt;
&lt;arr name="original_air_date_d"&gt;&lt;date&gt;2002-12-18T05:00:00Z&lt;/date&gt;&lt;/arr&gt;

You'll notice that the hour (HH) in original_air_date_d is set to 05. It should still be 00. I have noticed that it changes to either 04 or 05 in all cases within my index. In my schema the dynamic field *_d is:

&lt;dynamicField name="*_d" type="date" indexed="true" stored="true"/&gt;

Thanks, Wesley. -- --Noble Paul
Re: Unexpected sorting results when sorting with mutivalued filed
Shalin Shekhar Mangar wrote: On Tue, Mar 31, 2009 at 2:18 PM, tushar kapoor tushar_kapoor...@rediffmail.com wrote: I have indexes with a multivalued field authorLastName. I query them with sort=authorLastName asc and get the results as:

Index#  authorLastName
1       Antonakos
2       Keller
3       Antonakos Mansfield

However, Index#3 has a value starting with A (Antonakos). Shouldn't Index#3 precede Index#2 in the results? The last value is used for sorting in multi-valued fields. What is the reason behind sorting on a multi-valued field? -- Regards, Shalin Shekhar Mangar. Can't do much about it, that is the way our design is. Is there any way we can change this? -- View this message in context: http://www.nabble.com/Unexpected-sorting-results-when-sorting-with-mutivalued-filed-tp22800877p22840705.html Sent from the Solr - User mailing list archive at Nabble.com.
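To make Shalin's point concrete: if the sort key for each document is effectively its last value, the surprising ordering above falls out directly. A small JDK-only model of that behavior (this is not Solr's actual FieldCache code; the class and names are mine):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class MultiValuedSortDemo {
    // Per Shalin's explanation, a multi-valued field contributes a single
    // value per document to the sort -- the last one -- so model that here.
    public static String sortKey(List<String> values) {
        return values.get(values.size() - 1);
    }

    public static void main(String[] args) {
        List<List<String>> docs = new ArrayList<List<String>>();
        docs.add(Arrays.asList("Antonakos"));               // Index#1
        docs.add(Arrays.asList("Keller"));                  // Index#2
        docs.add(Arrays.asList("Antonakos", "Mansfield"));  // Index#3
        Collections.sort(docs, new Comparator<List<String>>() {
            public int compare(List<String> a, List<String> b) {
                return sortKey(a).compareTo(sortKey(b));
            }
        });
        // Index#3 sorts by "Mansfield", so it lands after "Keller",
        // even though its first value starts with "A".
        System.out.println(docs);
    }
}
```

If the design requires sorting by a specific author, one common workaround is to copy the intended sort value (e.g. the first last name) into a separate single-valued field at index time and sort on that instead.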