Jack Krupansky <jack <at> basetechnology.com> writes:

> 
> I vaguely recall some thread blocking issue with trying to parse too many 
> PDF files at one time in the same JVM.
> 
> Occasionally Tika (actually PDFBox) has been known to hang for some PDF 
> docs.
> 
> Do you have enough memory in the JVM? When the CPU is busy, is there much 
> memory available in the JVM? Maybe garbage collection is taking too much of 
> the CPU.
> 


Hi Jack,

Thanks for your quick response. Yes, I hope I have enough JVM memory. Here are
the memory settings:

-Xms11g -Xmx11g -XX:MaxPermSize=2g 

Is this a common issue with PDF extraction and indexing? Why am I not able
to process more than 1k documents per hour?
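
To check the garbage collection question, a minimal sketch along these lines
could be run inside the indexing JVM. It only uses the standard
java.lang.management beans. The class name GcCheck and its helper methods are
made up for illustration and are not part of Tika, PDFBox, or Solr.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports heap usage and the share of JVM uptime spent in GC.
public class GcCheck {

    // Total milliseconds spent in garbage collection since JVM start,
    // summed over all collectors that report a value.
    static long gcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Heap currently in use, in bytes.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Upper limit the heap may grow to (roughly the -Xmx value), in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long uptimeMs = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        long gcMs = gcTimeMillis();
        long mb = 1024L * 1024L;
        System.out.printf("heap used: %d MB of %d MB max%n",
                usedHeapBytes() / mb, maxHeapBytes() / mb);
        System.out.printf("GC time: %d ms of %d ms uptime (%.1f%%)%n",
                gcMs, uptimeMs, 100.0 * gcMs / uptimeMs);
    }
}
```

If the GC percentage stays small while the CPU is busy, garbage collection is
probably not the bottleneck and the time is going somewhere else, such as the
PDF parsing itself.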

Thanks,
Surendra.
