RE: Memory issues with PDF parser

2015-06-04 Thread Allison, Timothy B.
[mailto:mouthgalya.ganapa...@fitchratings.com] Sent: Thursday, June 04, 2015 10:20 AM To: Allison, Timothy B.; talli...@apache.org Cc: user@tika.apache.org; Sauparna Sarkar Subject: RE: Memory issues with PDF parser Hi Timothy, Thanks for the prompt reply. 1.) Wouldn't fixing the null pointer exception in turn

RE: Memory issues with PDF parser

2015-06-04 Thread Allison, Timothy B.
[mailto:mouthgalya.ganapa...@fitchratings.com] Sent: Thursday, June 04, 2015 2:55 PM To: Allison, Timothy B. Cc: user@tika.apache.org; Sauparna Sarkar Subject: RE: Memory issues with PDF parser Thanks for the update Timothy, I see that Tika 1.9-SNAPSHOT is available in the Maven repo. I am going to try

RE: Memory issues with PDF parser

2015-06-04 Thread Allison, Timothy B.
Hi Mouthgalya, We fixed that NPE in https://issues.apache.org/jira/browse/TIKA-1605, and the fix will be available in Tika 1.9, which should be out within a week. As for memory issues, we worked around a memory leak in PDFBox with static caching of fonts for Tika 1.7 (may have been 1.8), but