Re: OOM issue

2011-09-15 Thread abhijit bashetti
Hi Eric,

Thanks for the reply; it was very helpful.

For point 1: I do need 10 cores, and the count will keep increasing. I have
documents that belong to different workspaces, and the mapping is
1 workspace = 1 core, so I can't go with a single core. Currently I have
10 cores, but in future the count may grow to 40+.


For point 2: I have not given it any thought yet, but yes, I think in future
I may have to go for a master/slave setup.


For point 3: The current size for the document cache, filter cache
and query cache is 512 entries each.

The ramBufferSizeMB is 512MB. Shall I reduce it to 128MB?
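For reference, a reduced buffer would look like this in solrconfig.xml (128 here is a sketch value, not a recommendation from the thread; the right number depends on heap size and indexing load, and depending on the Solr version the setting lives under indexDefaults or mainIndex):

```xml
<!-- solrconfig.xml: cap the RAM used to buffer documents before a flush -->
<indexDefaults>
  <ramBufferSizeMB>128</ramBufferSizeMB>
</indexDefaults>
```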

For point 4: I did not get why I should use SolrJ with Tika. Do you mean
sending the new/updated documents to Tika for reindexing? I am already doing
that using data-config: I have written the query in data-config in such a way
that it picks up the paths of the updated/new documents.

Thanks in advance!

Regards,
Abhijit




Re: OOM issue

2011-09-13 Thread Erick Erickson
Multiple webapps will not help you; they still run on the same underlying
memory. In fact, it'll make matters worse, since they won't share
resources.

So the questions become:
1. Why do you have 10 cores? Putting 10 cores on the same machine
doesn't really buy you much. It can make lots of sense to put 10 cores on the
same machine for *indexing*, then replicate them out. But putting
10 cores on one machine in the hope of making better use of memory
isn't useful. It may be better to just go to one core.

2. Indexing, reindexing and searching on a single machine demands a
lot from that machine. You should really consider a master/slave
setup.

3. But assuming more hardware of any sort isn't in the cards, sure:
reduce your cache sizes, and look at ramBufferSizeMB and make it smaller.

4. Consider indexing with Tika via SolrJ and only sending the finished
document to Solr.
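A minimal sketch of point 4, assuming the Tika (tika-core/tika-parsers) and SolrJ jars are on the classpath; the URL, core name and field names are illustrative, not from the original thread:

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class TikaIndexer {
    public static void main(String[] args) throws Exception {
        // Extract the text on the client side, so a huge xlsx never
        // has to be parsed inside Solr's heap.
        BodyContentHandler handler = new BodyContentHandler(-1); // no write limit
        Metadata metadata = new Metadata();
        InputStream in = new FileInputStream(args[0]);
        try {
            new AutoDetectParser().parse(in, handler, metadata);
        } finally {
            in.close();
        }

        // Send only the finished, extracted fields to Solr.
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr/core0");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", args[0]);              // illustrative field names
        doc.addField("content", handler.toString());
        solr.add(doc);
        solr.commit();
    }
}
```

The point of the design is that a parse failure or memory spike stays in the client JVM, and Solr only ever sees plain fields.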

Best
Erick





OOM issue

2011-09-12 Thread abhijit bashetti
Hi,

I am getting the OOM error.

I am working with a multi-core Solr setup. I am using the DataImportHandler
(DIH) for indexing, and I have also integrated Tika for content extraction.

I am using an Oracle 10g DB.

In the solrconfig.xml, I have added:

<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>

<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="0"/>

<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>

<lockType>native</lockType>


My indexing server is on Linux with 8GB of RAM.
I am indexing a huge document set: there are 10 cores, and every core has
300,000 documents.

I got the OOM error for an xlsx document of 25MB.

On the indexing server I am doing indexing (first-time indexing for a newly
added core), re-indexing, and searching.

Do I need to create multiple Solr webapps to resolve the issue?

Or do I need to add more RAM to the system to avoid the OOM?
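Before adding hardware, it is worth checking how much heap the JVM actually gets: the servlet container's default -Xmx can be far below the 8GB the box has. A sketch, assuming the example Jetty start.jar that ships with Solr (paths, sizes and flags need adjusting for your container):

```shell
# Give the JVM an explicit heap, and dump the heap on OOM for later analysis
java -Xms2g -Xmx4g -XX:+HeapDumpOnOutOfMemoryError -jar start.jar
```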

Regards,
Abhijit


Re: OOM issue

2011-09-12 Thread Manish Bafna
Are you using Tika to do the content extraction?
You might be getting the OOM because of the huge xlsx file.

Try giving it more RAM and you might not hit the issue.




OOM issue

2011-09-12 Thread abhijit bashetti
I am facing the OOM issue.

Other than increasing the RAM, can we change some other parameters to
avoid it, such as minimizing the filter cache size, document cache size, etc.?

Can you suggest some other options to avoid the OOM issue?
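For instance, the cache entries in solrconfig.xml could be shrunk to something like this (the sizes are illustrative only; measure cache hit rates before settling on values):

```xml
<!-- solrconfig.xml: smaller caches trade some query speed for heap headroom -->
<filterCache      class="solr.FastLRUCache" size="128" initialSize="128" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache"     size="128" initialSize="128" autowarmCount="0"/>
<documentCache    class="solr.LRUCache"     size="128" initialSize="128" autowarmCount="0"/>
```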


Thanks in advance!


Regards,

Abhijit


Re: OOM issue

2011-09-12 Thread Manish Bafna
Reducing the cache sizes is definitely going to reduce heap usage.

Can you run those xlsx files separately through Tika and see whether you get
the OOM there?
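One way to run the file through Tika standalone is the tika-app command-line jar (the jar version and file name below are examples); capping -Xmx roughly reproduces the memory pressure inside the server:

```shell
# Extract text from the problem spreadsheet outside Solr, with a constrained heap
java -Xmx512m -jar tika-app-0.9.jar --text problem-file.xlsx > /dev/null
```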
