Re: how to config DataImport Scheduling

2010-12-18 Thread Hamid Vahedi
I think it should work with any version of Solr, because it is URL-based (see 
the config file). 


Please note this point: it was successfully tested on Apache Tomcat v6 (and should 
work on any other servlet container).

 


From: Ahmet Arslan 
To: solr-user@lucene.apache.org
Sent: Fri, December 17, 2010 3:22:37 AM
Subject: Re: how to config DataImport Scheduling

> I also have the same problem, i configure
> dataimport.properties file as shown
> in 
> http://wiki.apache.org/solr/DataImportHandler#dataimport.properties_example
> but no change occur, can any one help me

What version of Solr are you using? This seems to be a new feature, so it won't 
work on Solr 1.4.1.
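
The scheduler in this thread is described as URL-based, which suggests it simply requests the DataImportHandler URL at intervals. Below is a minimal sketch of that idea, not the scheduler's actual code. The host, port, webapp name, and the `/dataimport` handler path are assumptions and must match your own deployment; `command`, `clean`, and `commit` are standard DataImportHandler request parameters.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_import_url(host="localhost", port=8080, webapp="solr", core=None,
                     handler="/dataimport", command="delta-import",
                     clean=False, commit=True):
    """Build a DataImportHandler request URL.

    host, port, webapp, core and handler are placeholders (assumptions);
    adjust them to the handler path registered in your solrconfig.xml.
    """
    path = f"/{webapp}" + (f"/{core}" if core else "") + handler
    query = urlencode({
        "command": command,
        "clean": str(clean).lower(),
        "commit": str(commit).lower(),
    })
    return f"http://{host}:{port}{path}?{query}"


def trigger_import(url, timeout=30):
    """Send the request and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status
```

Calling `trigger_import(build_import_url())` from any external scheduler (cron, Windows Task Scheduler) has the same effect on the Solr side and does not depend on the Solr version supporting a built-in scheduler.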


  

Re: Solr & JVM performance issue after 2 days

2010-12-12 Thread Hamid Vahedi
Dear Erick 


Thanks for the advice.

The index size across all cores is 35 GB for 35 million docs (3 weeks of indexing data).

Kind Regards,
Hamid



From: Erick Erickson 
To: solr-user@lucene.apache.org
Sent: Sun, December 12, 2010 5:24:18 PM
Subject: Re: Solr & JVM performance issue after 2 days

Several things:
1> Your ramBufferSizeMB is probably too large. 128M is often the
 point of diminishing returns. Your situation may be different...
2> Your logs will show you what is happening with your autocommit
   properties. If you're really sending 200 docs/second to your index,
   your commits are happening every 10 seconds. Still too fast.
3> I'd really, really, really recommend that you use a master/slave
configuration where the slaves are your searchers and your
master is the indexer. Really. You're really hammering your machine.
If you separate the machines, you can turn off all of the autowarming
etc on the indexer and control the frequency of slave updates. Really
consider this.
4> You haven't given us any idea of the total index size.
5> I doubt separate JVMs are useful here. You're still operating on the
 same underlying hardware. Multiple cores are preferable to
 multiple JVMs almost always.
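
The 10-second figure in point 2 can be checked with simple arithmetic. This sketch assumes the value 2000 in the quoted configuration is the autocommit document threshold (the XML tag names were lost in the archive, so this is an inference) and that documents arrive at a steady 200 per second.

```python
def seconds_between_commits(max_docs, docs_per_second):
    """Time for a document-count autocommit threshold to fill at a steady rate."""
    return max_docs / docs_per_second


# Assumed values: a 2000-doc threshold and the 200 docs/sec rate from the thread.
interval = seconds_between_commits(2000, 200)
print(f"a commit roughly every {interval:.0f} seconds")
```

Raising the threshold or lowering the indexing rate lengthens the interval proportionally.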

Best
Erick

On Sun, Dec 12, 2010 at 8:26 AM, Hamid Vahedi  wrote:

> Hi
>
> Thanks for suggestion.
> I do following changes in solrconfig.xml :
>
> 256
>
> false
>
> 1
>
> 
>  2000
>  30
> 
> simple
>
>   class="solr.LRUCache"
>  size="512"
>  initialSize="512"
>  autowarmCount="0"/>
>
>   class="solr.FastLRUCache"
>  size="512"
>  initialSize="512"
>  autowarmCount="0"/>
>
>   class="solr.LRUCache"
>  size="512"
>  initialSize="512"
>  autowarmCount="0"/>
>
> after that, i see one server works fine (that includes 3 cores for 3
> languages)
> but another server (3 cores for 3 other languages) has problem after 52
> hours.
>
>
> I will plan to do your suggestion. i hope it helps me
>
> any better idea would be appreciated
>
> Kind Regards
> Hamid
>
>
>
> 
> From: Peter Karich 
> To: solr-user@lucene.apache.org
> Sent: Tue, December 7, 2010 8:26:01 PM
> Subject: Re: Solr & JVM performance issue after 2 days
>
>  Am 07.12.2010 13:01, schrieb Hamid Vahedi:
> > Hi Peter
> >
> > Thanks a lot for reply. Actually I need real time indexing and query at
> the
> >same
> > time.
> >
> > Here  told:
> > "You  can run multiple Solr instances in separate JVMs, with both having
> their
> > solr.xml configured to use the same index folder."
> >
> > Now
> > Q1: I'm using Tomcat now, Could you please tell me how to have separate
> JVMs
> > with Tomcat?
>
> Are you sure you don't want two servers and you really want real time?
> Slow down indexing + less cache should do the trick I think.
>
> I wouldn't recommend indexing AND querying on the same machine unless
> you have a lot RAM and CPU.
>
> you could even deploy two indices into one tomcat... the read only index
> refers to the data dir via:
> /path/to/index/data
> then issue an empty (!!) commit to the read only index every minute. so
> that the read only index sees the changes from the feeding index.
> (again: see the wikipage!)
>
> I wouldn't recommend setting up two Tomcats on one server either, but it's
> possible by copying Tomcat into, say, tomcat2
> and changing the shutdown port and the 8080 port in tomcat2/conf/server.xml
>
> > Q2:What should  I set for LockType?
>
> I'm using simple, but native should also be ok.
>
> > Thanks in advanced
> >
> >
> >
> >
> > 
> > From: Peter Karich
> > To: solr-user@lucene.apache.org
> > Sent: Tue, December 7, 2010 2:06:49 PM
> > Subject: Re: Solr&  JVM performance issue after 2 days
> >
> >Hi Hamid,
> >
> > try to avoid autowarming when indexing (see solrconfig.xml:
> > caches->autowarm + newSearcher + maxSearcher).
> > If you need to query and indexing at the same time,
> > then probably you'll need one read-only core and one for writing with no
> > autowarming configured.
> > See: http://wiki.apache.org/solr/NearRealtimeSearchTuning
> >
> > Or replicate from the indexing-core to a different core with different
> > settings.
> >
> > Regards,
> > Peter.
> >
> >
> >

Re: Solr & JVM performance issue after 2 days

2010-12-12 Thread Hamid Vahedi
Hi 

Thanks for the suggestion.
I made the following changes in solrconfig.xml:

256

false

1


  2000
  30

simple







After that, one server works fine (the one with 3 cores for 3 languages),
but the other server (3 cores for 3 other languages) has the problem after 52 hours. 


I plan to follow your suggestion. I hope it helps.

Any better ideas would be appreciated.

Kind Regards
Hamid




From: Peter Karich 
To: solr-user@lucene.apache.org
Sent: Tue, December 7, 2010 8:26:01 PM
Subject: Re: Solr & JVM performance issue after 2 days

  Am 07.12.2010 13:01, schrieb Hamid Vahedi:
> Hi Peter
>
> Thanks a lot for reply. Actually I need real time indexing and query at the 
>same
> time.
>
> Here  told:
> "You  can run multiple Solr instances in separate JVMs, with both having  
their
> solr.xml configured to use the same index folder."
>
> Now
> Q1: I'm using Tomcat now, Could you please tell me how to have separate JVMs
> with Tomcat?

Are you sure you don't want two servers and you really want real time?
Slow down indexing + less cache should do the trick I think.

I wouldn't recommend indexing AND querying on the same machine unless 
you have a lot of RAM and CPU.

you could even deploy two indices into one tomcat... the read only index 
refers to the data dir via:
/path/to/index/data
then issue an empty (!!) commit to the read only index every minute. so 
that the read only index sees the changes from the feeding index.
(again: see the wikipage!)
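
A minimal sketch of the periodic empty commit described above. The update URL, including the core name `readonly`, is an assumption for illustration; the `<commit/>` message posted to the update handler is the standard Solr XML update command.

```python
import time
import urllib.request


def build_commit_request(update_url):
    """Build a POST request carrying an empty <commit/> (no documents)."""
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def commit_every(update_url, interval_seconds=60):
    """Send an empty commit to the read-only index once per interval, forever."""
    while True:
        with urllib.request.urlopen(build_commit_request(update_url)) as response:
            response.read()  # drain the response; Solr returns a small XML status
        time.sleep(interval_seconds)


# Hypothetical usage; the URL is a placeholder:
# commit_every("http://localhost:8080/solr/readonly/update")
```

The commit makes the read-only index reopen its searcher, which is how it picks up the segments written by the feeding index.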

I wouldn't recommend setting up two Tomcats on one server either, but it's 
possible by copying Tomcat into, say, tomcat2
and changing the shutdown port and the 8080 port in tomcat2/conf/server.xml

> Q2:What should  I set for LockType?

I'm using simple, but native should also be ok.

> Thanks in advanced
>
>
>
>
> 
> From: Peter Karich
> To: solr-user@lucene.apache.org
> Sent: Tue, December 7, 2010 2:06:49 PM
> Subject: Re: Solr&  JVM performance issue after 2 days
>
>Hi Hamid,
>
> try to avoid autowarming when indexing (see solrconfig.xml:
> caches->autowarm + newSearcher + maxSearcher).
> If you need to query and indexing at the same time,
> then probably you'll need one read-only core and one for writing with no
> autowarming configured.
> See: http://wiki.apache.org/solr/NearRealtimeSearchTuning
>
> Or replicate from the indexing-core to a different core with different
> settings.
>
> Regards,
> Peter.
>
>
>> Hi,
>>
>> I am using multi-core tomcat on 2 servers. 3 language per server.
>>
>> I am adding documents to solr up to 200 doc/sec. when updating process is
>> started, every thing is fine (update performance is max 200 ms/doc. with 
about
>> 800 MB memory used with minimal cpu usage).
>>
>> After 15-17 hours it's became so slow  (more that 900 sec for update), used
>> heap
>> memory is about 15GB, GC time is became more than one hour.
>>
>>
>> I don't know what's wrong with it? Can anyone describe me what's the problem?
>> Is that came from Solr or JVM?
>>
>> Note: when i stop updating, CPU busy within 15-20 min. and when start 
updating
>> again i have same issue. but when stop tomcat service and start it again, all
>> thing is OK.
>>
>> I am using tomcat 6 with 18 GB memory on windows 2008 server x64. Solr 1.4.1
>>
>> thanks in advanced
>> Hamid
>


-- 
http://jetwick.com twitter search prototype


  

Re: Solr & JVM performance issue after 2 days

2010-12-07 Thread Hamid Vahedi
Hi Peter

Thanks a lot for the reply. Actually, I need real-time indexing and querying at 
the same time. 

It says here: 
"You can run multiple Solr instances in separate JVMs, with both having their 
solr.xml configured to use the same index folder."

Now
Q1: I'm using Tomcat now. Could you please tell me how to run separate JVMs 
with Tomcat? 

Q2: What should I set for lockType?

Thanks in advance 





From: Peter Karich 
To: solr-user@lucene.apache.org
Sent: Tue, December 7, 2010 2:06:49 PM
Subject: Re: Solr & JVM performance issue after 2 days

  Hi Hamid,

try to avoid autowarming when indexing (see solrconfig.xml: 
caches->autowarm + newSearcher + maxSearcher).
If you need to query and index at the same time,
then you'll probably need one read-only core and one core for writing with no 
autowarming configured.
See: http://wiki.apache.org/solr/NearRealtimeSearchTuning

Or replicate from the indexing-core to a different core with different 
settings.

Regards,
Peter.


> Hi,
>
> I am using multi-core tomcat on 2 servers. 3 language per server.
>
> I am adding documents to solr up to 200 doc/sec. when updating process is
> started, every thing is fine (update performance is max 200 ms/doc. with about
> 800 MB memory used with minimal cpu usage).
>
> After 15-17 hours it's became so slow  (more that 900 sec for update), used 
>heap
> memory is about 15GB, GC time is became more than one hour.
>
>
> I don't know what's wrong with it? Can anyone describe me what's the problem?
> Is that came from Solr or JVM?
>
> Note: when i stop updating, CPU busy within 15-20 min. and when start updating
> again i have same issue. but when stop tomcat service and start it again, all
> thing is OK.
>
> I am using tomcat 6 with 18 GB memory on windows 2008 server x64. Solr 1.4.1
>
> thanks in advanced
> Hamid


-- 
http://jetwick.com twitter search prototype


  

Re: Solr & JVM performance issue after 2 days

2010-12-07 Thread Hamid Vahedi
hi Sven

no, only auto commit 


  1000
  1000








From: Sven Almgren 
To: solr-user@lucene.apache.org
Sent: Tue, December 7, 2010 1:54:40 PM
Subject: Re: Solr & JVM performance issue after 2 days

Have you run any optimize requests yet?

/Sven

On Tue, Dec 7, 2010 at 08:40, Hamid Vahedi  wrote:
> Hi,
>
> I am using multi-core tomcat on 2 servers. 3 language per server.
>
> I am adding documents to solr up to 200 doc/sec. when updating process is
> started, every thing is fine (update performance is max 200 ms/doc. with about
> 800 MB memory used with minimal cpu usage).
>
> After 15-17 hours it's became so slow  (more that 900 sec for update), used 
>heap
> memory is about 15GB, GC time is became more than one hour.
>
>
> I don't know what's wrong with it? Can anyone describe me what's the problem?
> Is that came from Solr or JVM?
>
> Note: when i stop updating, CPU busy within 15-20 min. and when start updating
> again i have same issue. but when stop tomcat service and start it again, all
> thing is OK.
>
> I am using tomcat 6 with 18 GB memory on windows 2008 server x64. Solr 1.4.1
>
> thanks in advanced
> Hamid
>
>
>



  

Solr & JVM performance issue after 2 days

2010-12-06 Thread Hamid Vahedi
Hi,

I am running multi-core Solr under Tomcat on 2 servers, with 3 languages per server.

I am adding documents to Solr at up to 200 docs/sec. When the update process 
starts, everything is fine (update time is at most 200 ms/doc, with about 
800 MB of memory used and minimal CPU usage). 

After 15-17 hours it becomes very slow (more than 900 sec per update), used 
heap memory is about 15 GB, and GC time grows to more than one hour. 


I don't know what is wrong. Can anyone tell me what the problem is? 
Does it come from Solr or from the JVM? 

Note: when I stop updating, the CPU stays busy for 15-20 min, and when I start 
updating again I have the same issue. But when I stop the Tomcat service and 
start it again, everything is OK.

I am using Tomcat 6 with 18 GB of memory on Windows Server 2008 x64, with Solr 1.4.1. 

Thanks in advance,
Hamid


  

how to config DataImport Scheduling

2010-12-06 Thread Hamid Vahedi
Hi,

I want to configure DataImport scheduling, but I don't know how to do it.
I have just created and compiled the scheduling classes with NetBeans, and now 
have Scheduling.jar. 

Q: How do I set it up on Tomcat or Solr? (I am using Tomcat 6 on Windows 2008.)

Thanks in advance



  

Distributed Solr (Shard mode) performance issue

2010-11-24 Thread Hamid Vahedi
Hi all,

We are using multi-core Solr with 6 cores per server in shard mode (2 servers 
so far, therefore 12 cores in total), running under Tomcat on Windows 2008 
with 18 GB of RAM assigned to it.

We add almost 6 million docs per day to Solr (up to 200 docs/sec), which must 
appear in query results in real time (currently more than 350 million docs are indexed).
Queries are very slow (about 4-32 sec), but update performance is very good.

Note 1: results must be sorted by publish date, descending.
Note 2: queries on a single shard are also sometimes slow (300 ms-2 s).
Note 3: we can't optimize the index because docs are always being added.

Can Solr help me? 
If yes, what is the best configuration? 
If not, what is the best solution?

Kind Regards,
Hamid



  

Re: OutOfMemoryError when using query with sort

2010-05-02 Thread Hamid Vahedi
I installed 64-bit Windows and my problem was solved. I am also using shard mode 
(100M docs per machine with one Solr instance).
Is there a better solution? I insert at least 5M docs per day.





From: Koji Sekiguchi 
To: solr-user@lucene.apache.org
Sent: Sun, May 2, 2010 9:08:42 PM
Subject: Re: OutOfMemoryError when using query with sort

Hamid Vahedi wrote:
> Hi, i using solr that running on windows server 2008 32-bit. 
> I add about 100 million article into solr without set store attribute. (only 
> store document id) (index file size about 164 GB)
> when try to get query without sort , it's return doc ids in some ms, but when 
> add sort command, i get below error:
> 
> TTP Status 500 - Java heap space java.lang.OutOfMemoryError: Java heap space 
> at  
Since sort uses FieldCache and it consumes memory, you got OOM.
I think 100M docs/164GB index is considerable large for 32 bit machine.
Why don't you use distributed search?

Koji

-- http://www.rondhuit.com/en/
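
A rough worked estimate of why sorting runs out of memory here. The stack trace in the original report goes through `FieldCacheImpl$LongCache`, which suggests the sort field is cached as one 8-byte long per document. The figures below are taken from the thread (about 100 million docs, 1600 MB max heap); treating the cache as exactly one long per document is a simplifying assumption.

```python
def field_cache_bytes(max_doc, bytes_per_value=8):
    """Rough size of a per-document value array: one entry for each document."""
    return max_doc * bytes_per_value


docs = 100_000_000                 # "about 100 million" articles, from the thread
heap_bytes = 1600 * 2**20          # the 1600 MB max heap mentioned in the thread
needed = field_cache_bytes(docs)   # bytes for one long-valued sort field
print(f"sort field cache: {needed / 2**20:.0f} MiB of a {heap_bytes / 2**20:.0f} MiB heap")
```

Under these assumptions one sort field alone takes about half of the heap, before counting the term index that the heap dump attributes to the segment reader. Splitting the index across shards reduces the per-instance document count and therefore the array size.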


  

Solr date range problem - specific date problem

2010-04-29 Thread Hamid Vahedi
I index some data that includes a date field in Solr,
but when I search for a specific date, I get only some of the records (not all 
of them), including some records from the next day. For example: 
http://localhost:8080/solr/select/?q=pubdate:[2010-03-25T00:00:00Z TO 2010-03-25T23:59:59Z]&start=0&rows=10&indent=on&sort=pubdate desc
I have 625000 records for 2010-03-25, but the above query returns 
325412, including 14 records from 2010-03-26. 
I also tried the query below, but did not get the right result:
http://localhost:8080/solr/select/?q=pubdate:"2010-03-25T00:00:00Z"&start=0&rows=10&indent=on&sort=pubdate desc
How can I get the right result for a specific date?

Could you please help me?

Thanks in advance,
Hamid
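
A minimal sketch of building the one-day range query with proper URL encoding. The field name `pubdate` and the host come from the message above; the helper names are hypothetical. Solr stores and compares dates in UTC, so one possible cause of missing or next-day records (not confirmed in this thread) is that the documents were indexed from local timestamps, which shifts the day boundaries.

```python
from urllib.parse import urlencode


def day_range_query(field, day):
    """Inclusive range covering one UTC day, to millisecond precision."""
    return f"{field}:[{day}T00:00:00Z TO {day}T23:59:59.999Z]"


def select_url(base, field, day, rows=10):
    """Encode the query so spaces, brackets and colons survive the HTTP request."""
    params = urlencode({
        "q": day_range_query(field, day),
        "start": 0,
        "rows": rows,
        "indent": "on",
        "sort": f"{field} desc",
    })
    return f"{base}?{params}"


print(select_url("http://localhost:8080/solr/select/", "pubdate", "2010-03-25"))
```

If the counts are still off with an encoded query, comparing the stored `pubdate` values of a few unexpected hits against their source timestamps would show whether a time zone offset is involved.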


  

OutOfMemoryError when using query with sort

2010-04-19 Thread Hamid Vahedi
Hi, I am using Solr running on Windows Server 2008 32-bit. 

I added about 100 million articles to Solr without setting the stored attribute 
(only the document id is stored); the index size is about 164 GB.
When I run a query without sort, it returns doc ids in a few ms, but when 
I add a sort, I get the error below:

HTTP Status 500 - Java heap space java.lang.OutOfMemoryError: Java heap space
 at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:560)
 at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208)
 at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:525)
 at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:391)
 at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94)
 at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:245)
 at org.apache.lucene.search.Searcher.search(Searcher.java:171)
 at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
 at 

Note: I set the max heap size to 1600 MB (the Tomcat service does not start with 
a larger heap size), but the problem was not solved.

I checked the heap dump file with MAT and saw this info:

org.apache.lucene.index.ReadOnlySegmentReader @ 0x253508e8  Shallow Size: 80 B 
Retained Size: 449,4 MB

Problem Suspect 1 
One instance of "org.apache.lucene.index.ReadOnlySegmentReader" loaded 
by "org.apache.catalina.loader.WebappClassLoader @ 0x25350c80" occupies 
471.244.848 (97,44%) bytes. The memory is accumulated in one instance of 
"org.apache.lucene.index.TermInfosReader" loaded by 
"org.apache.catalina.loader.WebappClassLoader @ 0x25350c80".

Keywords
org.apache.lucene.index.ReadOnlySegmentReader
org.apache.catalina.loader.WebappClassLoader @ 0x25350c80
org.apache.lucene.index.TermInfosReader

How can I decrease the segment file size to solve this problem? 

Thanks in advance,
Hamid