Re: how to config DataImport Scheduling
I think it should work with any version of Solr, because it is URL-based (see the config file). Note this point: "Successfully tested on Apache Tomcat v6 (should work on any other servlet container)".

From: Ahmet Arslan
To: solr-user@lucene.apache.org
Sent: Fri, December 17, 2010 3:22:37 AM
Subject: Re: how to config DataImport Scheduling

> I also have the same problem. I configured the
> dataimport.properties file as shown in
> http://wiki.apache.org/solr/DataImportHandler#dataimport.properties_example
> but no change occurs. Can anyone help me?

What version of Solr are you using? This seems to be a new feature, so it won't work on Solr 1.4.1.
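Because the scheduler is URL-based, the same effect can be sketched from outside Solr by requesting the DataImportHandler URL on a timer. This is a minimal sketch and not the wiki scheduler itself: the host, port, webapp name, core name and the /dataimport handler path are assumptions that depend on your own solrconfig.xml, while command=delta-import is one of the standard DataImportHandler commands.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def dataimport_url(host, port, webapp, core=None, command="delta-import"):
    """Build a DataImportHandler URL (the /dataimport handler path is assumed)."""
    path = f"/{webapp}/{core}/dataimport" if core else f"/{webapp}/dataimport"
    query = urlencode({"command": command, "clean": "false", "commit": "true"})
    return f"http://{host}:{port}{path}?{query}"


def trigger_import(url, timeout=30):
    """Request the URL once and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical values; only the URL is printed, nothing is sent.
    print(dataimport_url("localhost", 8080, "solr", core="core0"))
```

Calling trigger_import from Windows Task Scheduler or cron at a fixed interval would then give a simple schedule without adding a jar to Tomcat.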
Re: Solr & JVM performance issue after 2 days
Dear Erick,

Thanks for the advice. The index size across all cores is 35 GB for 35 million docs (3 weeks of indexed data).

Kind Regards,
Hamid

From: Erick Erickson
To: solr-user@lucene.apache.org
Sent: Sun, December 12, 2010 5:24:18 PM
Subject: Re: Solr & JVM performance issue after 2 days

Several things:

1> Your ramBufferSizeMB is probably too large. 128M is often the point of diminishing returns. Your situation may be different...
2> Your logs will show you what is happening with your autocommit properties. If you're really sending 200 docs/second to your index, your commits are happening every 10 seconds. Still too fast.
3> I'd really, really, really recommend that you use a master/slave configuration where the slaves are your searchers and your master is the indexer. Really. You're really hammering your machine. If you separate the machines, you can turn off all of the autowarming etc. on the indexer and control the frequency of slave updates. Really consider this.
4> You haven't given us any idea of the total index size.
5> I doubt separate JVMs are useful here. You're still operating on the same underlying hardware. Multiple cores are almost always preferable to multiple JVMs.

Best
Erick

On Sun, Dec 12, 2010 at 8:26 AM, Hamid Vahedi wrote:
> Hi
>
> Thanks for the suggestion.
> I made the following changes in solrconfig.xml:
>
> 256
>
> false
>
> 1
>
> 2000
> 30
>
> simple
>
> class="solr.LRUCache"
> size="512"
> initialSize="512"
> autowarmCount="0"/>
>
> class="solr.FastLRUCache"
> size="512"
> initialSize="512"
> autowarmCount="0"/>
>
> class="solr.LRUCache"
> size="512"
> initialSize="512"
> autowarmCount="0"/>
>
> After that, one server works fine (it hosts 3 cores for 3 languages),
> but the other server (3 cores for 3 other languages) has a problem after 52
> hours.
>
> I plan to follow your suggestion.
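Erick's point 2 is simple arithmetic: with a document-count autocommit threshold, the time between commits is the threshold divided by the indexing rate. A small sketch, assuming that the value 2000 in the quoted configuration is the autocommit document threshold (the element names did not survive in the message, so this is an assumption):

```python
def seconds_between_commits(max_docs, docs_per_second):
    """Time needed to accumulate max_docs pending documents at a steady rate."""
    if docs_per_second <= 0:
        raise ValueError("docs_per_second must be positive")
    return max_docs / docs_per_second


if __name__ == "__main__":
    # 2000 pending docs at 200 docs/second: one commit every 10 seconds.
    print(seconds_between_commits(2000, 200))
```

Raising the threshold or lowering the indexing rate lengthens the interval proportionally, which is the lever behind the "still too fast" remark.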
Re: Solr & JVM performance issue after 2 days
Hi

Thanks for the suggestion.
I made the following changes in solrconfig.xml:

256
false
1
2000
30
simple

After that, one server works fine (it hosts 3 cores for 3 languages), but the other server (3 cores for 3 other languages) has a problem after 52 hours.

I plan to follow your suggestion. I hope it helps.

Any better idea would be appreciated.

Kind Regards
Hamid

From: Peter Karich
To: solr-user@lucene.apache.org
Sent: Tue, December 7, 2010 8:26:01 PM
Subject: Re: Solr & JVM performance issue after 2 days

On 07.12.2010 13:01, Hamid Vahedi wrote:
> Hi Peter
>
> Thanks a lot for the reply. Actually I need real-time indexing and querying
> at the same time.
>
> It says here:
> "You can run multiple Solr instances in separate JVMs, with both having their
> solr.xml configured to use the same index folder."
>
> Now
> Q1: I'm using Tomcat now. Could you please tell me how to have separate JVMs
> with Tomcat?

Are you sure you don't want two servers and you really want real time?
Slowing down indexing + less cache should do the trick, I think.

I wouldn't recommend indexing AND querying on the same machine unless you have a lot of RAM and CPU.

You could even deploy two indices into one Tomcat... the read-only index refers to the data dir via:
/path/to/index/data
Then issue an empty (!!) commit to the read-only index every minute, so that the read-only index sees the changes from the feeding index.
(Again: see the wiki page!)

I wouldn't recommend setting up two Tomcats on one server either, but it's possible by copying tomcat into, say, tomcat2 and changing the shutdown port and the 8080 port in tomcat2/conf/server.xml.

> Q2: What should I set for LockType?

I'm using simple, but native should also be OK.
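Peter's "empty commit every minute" can be sketched as a small loop that posts a bare <commit/> message to the read-only core's update handler. This is only a sketch under assumptions: the URL used in the usage note is hypothetical, and the 60-second interval simply mirrors "every minute" from the message.

```python
import time
from urllib.request import Request, urlopen


def build_commit_request(update_url):
    """Build a POST request whose body is an empty <commit/> message."""
    return Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def commit_forever(update_url, interval_seconds=60):
    """Send an empty commit at a fixed interval until interrupted."""
    while True:
        with urlopen(build_commit_request(update_url), timeout=30) as response:
            response.read()  # drain the response; its content is not used here
        time.sleep(interval_seconds)
```

For example, commit_forever("http://localhost:8080/solr/readonly/update") would keep reopening the read-only index, where "readonly" is a made-up core name.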
Re: Solr & JVM performance issue after 2 days
Hi Peter

Thanks a lot for the reply. Actually I need real-time indexing and querying at the same time.

It says here:
"You can run multiple Solr instances in separate JVMs, with both having their solr.xml configured to use the same index folder."

Now
Q1: I'm using Tomcat now. Could you please tell me how to have separate JVMs with Tomcat?
Q2: What should I set for LockType?

Thanks in advance

From: Peter Karich
To: solr-user@lucene.apache.org
Sent: Tue, December 7, 2010 2:06:49 PM
Subject: Re: Solr & JVM performance issue after 2 days

Hi Hamid,

try to avoid autowarming when indexing (see solrconfig.xml: caches->autowarm + newSearcher + maxSearcher).
If you need to query and index at the same time, then you'll probably need one read-only core and one core for writing with no autowarming configured.
See: http://wiki.apache.org/solr/NearRealtimeSearchTuning

Or replicate from the indexing core to a different core with different settings.

Regards,
Peter.

> Hi,
>
> I am using multi-core Solr under Tomcat on 2 servers, 3 languages per server.
>
> I am adding documents to Solr at up to 200 docs/sec. When the update process
> starts, everything is fine (update time is at most 200 ms/doc, with about
> 800 MB of memory used and minimal CPU usage).
>
> After 15-17 hours it becomes very slow (more than 900 sec per update), used
> heap memory is about 15 GB, and GC time grows to more than one hour.
>
> I don't know what's wrong. Can anyone tell me what the problem is?
> Does it come from Solr or from the JVM?
>
> Note: when I stop updating, the CPU stays busy for 15-20 min, and when I start
> updating again I have the same issue. But when I stop the Tomcat service and
> start it again, everything is OK.
>
> I am using Tomcat 6 with 18 GB of memory on Windows Server 2008 x64, with Solr 1.4.1.
>
> Thanks in advance
> Hamid

--
http://jetwick.com twitter search prototype
Re: Solr & JVM performance issue after 2 days
Hi Sven,

No, only autocommit:

1000
1000

From: Sven Almgren
To: solr-user@lucene.apache.org
Sent: Tue, December 7, 2010 1:54:40 PM
Subject: Re: Solr & JVM performance issue after 2 days

Have you run any optimize requests yet?

/Sven

On Tue, Dec 7, 2010 at 08:40, Hamid Vahedi wrote:
> Hi,
>
> I am using multi-core Solr under Tomcat on 2 servers, 3 languages per server.
>
> I am adding documents to Solr at up to 200 docs/sec. When the update process
> starts, everything is fine (update time is at most 200 ms/doc, with about
> 800 MB of memory used and minimal CPU usage).
>
> After 15-17 hours it becomes very slow (more than 900 sec per update), used
> heap memory is about 15 GB, and GC time grows to more than one hour.
>
> I don't know what's wrong. Can anyone tell me what the problem is?
> Does it come from Solr or from the JVM?
>
> Note: when I stop updating, the CPU stays busy for 15-20 min, and when I start
> updating again I have the same issue. But when I stop the Tomcat service and
> start it again, everything is OK.
>
> I am using Tomcat 6 with 18 GB of memory on Windows Server 2008 x64, with Solr 1.4.1.
>
> Thanks in advance
> Hamid
Solr & JVM performance issue after 2 days
Hi,

I am using multi-core Solr under Tomcat on 2 servers, 3 languages per server.

I am adding documents to Solr at up to 200 docs/sec. When the update process starts, everything is fine (update time is at most 200 ms/doc, with about 800 MB of memory used and minimal CPU usage).

After 15-17 hours it becomes very slow (more than 900 sec per update), used heap memory is about 15 GB, and GC time grows to more than one hour.

I don't know what's wrong. Can anyone tell me what the problem is? Does it come from Solr or from the JVM?

Note: when I stop updating, the CPU stays busy for 15-20 min, and when I start updating again I have the same issue. But when I stop the Tomcat service and start it again, everything is OK.

I am using Tomcat 6 with 18 GB of memory on Windows Server 2008 x64, with Solr 1.4.1.

Thanks in advance
Hamid
how to config DataImport Scheduling
Hi

I want to configure DataImport scheduling, but I don't know how to do it. I have just created and compiled the scheduling classes with NetBeans, and I now have Scheduling.Jar.

Q: How do I set it up on Tomcat or Solr? (I am using Tomcat 6 on Windows 2008.)

Thanks in advance
Distributed Solr (Shard mode) performance issue
Hi all,

We are using multi-core Solr with 6 cores per server in shard mode (2 servers so far, so 12 cores in total), running on Tomcat on Windows 2008 with 18 GB of RAM assigned to it.

We add almost 6 million docs per day to Solr (up to 200 docs/sec), and they must appear in query results in real time. Currently more than 350 million docs are indexed.

Queries are very slow (about 4-32 sec), but update performance is very good.

Note 1: Results must be sorted by publish date, descending.
Note 2: Queries on a single shard are also sometimes slow (300 ms-2 s).
Note 3: We can't optimize the index because docs are always being added.

Can Solr help me? If yes, what is the best configuration? If not, what is the best solution?

Kind Regards,
Hamid
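For reference, a sharded query sorted by publish date is expressed with Solr's shards parameter plus a sort parameter. The sketch below only shows how such a request is put together and does not by itself address the slowness; the host names, core names and the field name pubdate are assumptions made for illustration.

```python
from urllib.parse import urlencode


def distributed_query_url(base_url, shards, query, sort_field="pubdate", rows=10):
    """Build a select URL that fans out to all shards and sorts newest first."""
    params = {
        "q": query,
        "shards": ",".join(shards),  # host:port/path entries, without "http://"
        "sort": f"{sort_field} desc",
        "start": 0,
        "rows": rows,
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical two-server layout with one core shown per server.
    print(distributed_query_url(
        "http://server1:8080/solr/core0",
        ["server1:8080/solr/core0", "server2:8080/solr/core0"],
        "title:solr",
    ))
```

Every shard listed has to answer before the merged, sorted page can be returned, so the slowest shard tends to set the overall response time.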
Re: OutOfMemoryError when using query with sort
I installed 64-bit Windows and my problem is solved. I am also using shard mode now (100M docs per machine with one Solr instance).

Is there a better solution? I insert at least 5M docs per day.

From: Koji Sekiguchi
To: solr-user@lucene.apache.org
Sent: Sun, May 2, 2010 9:08:42 PM
Subject: Re: OutOfMemoryError when using query with sort

Hamid Vahedi wrote:
> Hi, I am using Solr running on Windows Server 2008 32-bit.
> I added about 100 million articles to Solr without setting the stored attribute
> (only the document id is stored). The index size is about 164 GB.
> When I query without a sort, it returns doc ids within a few ms, but when I
> add a sort, I get the error below:
>
> HTTP Status 500 - Java heap space java.lang.OutOfMemoryError: Java heap space
> at

Since sort uses FieldCache and it consumes memory, you got OOM. I think a 100M docs/164GB index is considerably large for a 32-bit machine. Why don't you use distributed search?

Koji

--
http://www.rondhuit.com/en/
Solr date range problem - specific date problem
I indexed some data that includes a date field in Solr, but when I search for a specific date I get only some of the records (not all of them), along with some records from the next day. For example:

http://localhost:8080/solr/select/?q=pubdate:[2010-03-25T00:00:00Z TO 2010-03-25T23:59:59Z]&start=0&rows=10&indent=on&sort=pubdate desc

I have 625000 records for 2010-03-25, but the query above returns 325412 records, including 14 records from 2010-03-26.

I also tried the query below, but it does not give the right result either:

http://localhost:8080/solr/select/?q=pubdate:"2010-03-25T00:00:00Z"&start=0&rows=10&indent=on&sort=pubdate desc

How can I get the right result for a specific date? Could you please help me?

Thanks in advance
Hamid
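One way to express "all of one day" is to build the range bounds from the date and URL-encode the whole query, so that spaces and brackets are not altered in transit. The sketch below assumes pubdate is a Solr date field, which holds UTC values; if the documents were indexed from local timestamps, the UTC day boundaries differ from the local ones, which is one possible (unconfirmed) reason for records showing up under a neighbouring day. The function names are made up for this example.

```python
from datetime import date
from urllib.parse import urlencode


def day_range_query(field, day):
    """Return a Solr range query covering one whole UTC day, bounds inclusive."""
    start = f"{day.isoformat()}T00:00:00Z"
    end = f"{day.isoformat()}T23:59:59.999Z"  # include the last millisecond
    return f"{field}:[{start} TO {end}]"


def select_url(base_url, query, sort, rows=10):
    """URL-encode the parameters so the query text survives transport."""
    params = {"q": query, "start": 0, "rows": rows, "indent": "on", "sort": sort}
    return f"{base_url}/select/?{urlencode(params)}"


if __name__ == "__main__":
    q = day_range_query("pubdate", date(2010, 3, 25))
    print(q)
    print(select_url("http://localhost:8080/solr", q, "pubdate desc"))
```

An exact-match query on "2010-03-25T00:00:00Z" only finds documents stamped at precisely midnight, which is why a range is needed to cover the whole day.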
OutOfMemoryError when using query with sort
Hi,

I am using Solr running on Windows Server 2008 32-bit.
I added about 100 million articles to Solr without setting the stored attribute (only the document id is stored). The index size is about 164 GB.
When I query without a sort, it returns doc ids within a few ms, but when I add a sort, I get the error below:

HTTP Status 500 - Java heap space
java.lang.OutOfMemoryError: Java heap space
 at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:560)
 at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208)
 at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:525)
 at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:391)
 at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94)
 at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:245)
 at org.apache.lucene.search.Searcher.search(Searcher.java:171)
 at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
 at

Note: I set the max heap size to 1600 MB (the Tomcat service does not start with a larger heap size), but the problem is not solved.

I checked the heap dump file with MAT and saw this info:

org.apache.lucene.index.ReadOnlySegmentReader @ 0x253508e8
Shallow Size: 80 B Retained Size: 449,4 MB

Problem Suspect 1
One instance of "org.apache.lucene.index.ReadOnlySegmentReader" loaded by "org.apache.catalina.loader.WebappClassLoader @ 0x25350c80" occupies 471.244.848 (97,44%) bytes. The memory is accumulated in one instance of "org.apache.lucene.index.TermInfosReader" loaded by "org.apache.catalina.loader.WebappClassLoader @ 0x25350c80".

Keywords
org.apache.lucene.index.ReadOnlySegmentReader
org.apache.catalina.loader.WebappClassLoader @ 0x25350c80
org.apache.lucene.index.TermInfosReader

How can I decrease the segment file size to solve this problem?

Thanks in advance
Hamid
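The stack trace points at FieldCacheImpl$LongCache, that is, a sort on a long-valued field. The FieldCache keeps one value per document in an array, so a rough lower bound on the memory a single sort field needs is the document count times the value width. A back-of-the-envelope sketch, assuming 8 bytes per value (a Java long) and ignoring per-segment bookkeeping and every other consumer of the heap:

```python
def long_field_cache_bytes(max_doc, bytes_per_value=8):
    """Approximate size of the per-document long array for one sort field."""
    return max_doc * bytes_per_value


if __name__ == "__main__":
    needed = long_field_cache_bytes(100_000_000)
    print(f"{needed} bytes, about {needed / 1024**2:.0f} MiB")
```

For 100 million documents this gives 800,000,000 bytes, roughly half of a 1600 MB heap for one sort field alone, which fits the advice to move to a 64-bit JVM or to split the index across shards.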