On 5/21/2013 9:22 PM, Umesh Prasad wrote:
>     This is our own implementation of a data source (canonical name
> com.flipkart.w3.solr.MultiSPCMSProductsDataSource), which pulls data
> from our downstream service and doesn't cache data in RAM. It fetches
> the data in batches of 200 and iterates over it when DIH asks for it. I
> will check for the possibility of a leak, but it's unlikely.
>        Could the OOM be happening because, during analysis, IndexWriter
> finds a document too large to fit in 100 MB of memory and can't flush
> it to disk?  Our analyzer chain doesn't make this easy, especially with
> one field that does a cross product of synonym terms.
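
If I follow that description, the shape of the data source is roughly
this (a sketch with invented names, since I haven't seen your code --
the point being that only one batch is ever held in memory):

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;

    /** Hypothetical client for the downstream service. */
    interface DownstreamClient {
        List<Map<String, Object>> fetch(int offset, int limit);
    }

    /** Streams rows to DIH in fixed-size batches without caching. */
    class BatchedRowIterator implements Iterator<Map<String, Object>> {
        private static final int BATCH_SIZE = 200;
        private final DownstreamClient client;
        private List<Map<String, Object>> batch = new ArrayList<>();
        private int pos = 0;
        private int offset = 0;
        private boolean exhausted = false;

        BatchedRowIterator(DownstreamClient client) {
            this.client = client;
        }

        public boolean hasNext() {
            if (pos < batch.size()) {
                return true;
            }
            if (exhausted) {
                return false;
            }
            // Replace the previous batch rather than accumulating rows,
            // so heap use stays bounded by one batch of 200.
            batch = client.fetch(offset, BATCH_SIZE);
            offset += batch.size();
            pos = 0;
            exhausted = batch.isEmpty();
            return !exhausted;
        }

        public Map<String, Object> next() {
            return batch.get(pos++);
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

If something in the real class accidentally keeps references to earlier
batches (a listener, a log buffer, a growing list), that would be the
leak to look for.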

If your documents are really large (hundreds of KB, or a few MB), you
might need a bigger ramBufferSizeMB value ... but if that were causing
problems, I would expect it to show up during import, not at commit time.
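
For reference, that setting lives in the indexConfig section of
solrconfig.xml; the 256 below is just an example value, not a
recommendation:

    <indexConfig>
      <!-- Default is 100.  Raise it if individual documents, or the
           analysis they trigger, need more indexing buffer. -->
      <ramBufferSizeMB>256</ramBufferSizeMB>
    </indexConfig>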

How much of your 32GB heap is in use during indexing?  Would you be able
to try with the heap at 31GB instead of 32GB?  One of Java's default
optimizations (UseCompressedOops) gets turned off at a 32GB heap,
because 32-bit compressed pointers can no longer address the whole heap,
and that might lead to strange things happening.
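
On a 64-bit HotSpot JVM (Oracle or OpenJDK), you can check whether
compressed oops are in effect for a given heap size like this:

    java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
    java -Xmx32g -XX:+PrintFlagsFinal -version | grep UseCompressedOops

At -Xmx31g the flag should come back true; at -Xmx32g it flips to false.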

Do you have the ability to try 4.3 instead of 4.2.1?

Thanks,
Shawn
