On 14.07.2015 at 21:37, Allison, Timothy B. wrote:
Interesting, yes: 781/781172.pdf, 490/490376.pdf and 029/029423.pdf. Are you running your own regression testing against govdocs1?
Yes, from time to time for the last few months.
Is it duplicated effort for me to do anything with 2.0.0?
Partly yes. The only difference is that I didn't do any text extraction.
Or is your point that I should wait until PDFBOX-2842 is completed?
Yes :-)

Tilman
Thank you!

Best,
Tim

-----Original Message-----
From: Tilman Hausherr [mailto:[email protected]]
Sent: Tuesday, July 14, 2015 12:47 PM
To: [email protected]
Subject: Re: first stack trace report from pdfbox 2.0.0 trunk

Hi Tim,

Currently there is at least one known regression, mentioned in PDFBOX-2842; it applies to 029423 but also to other files.

Tilman

On 10.07.2015 at 13:57, Allison, Timothy B. wrote:

All,

I just posted the first stacktrace report from my initial partial batch run against govdocs1 here:
https://issues.apache.org/jira/secure/attachment/12744700/pdfbox_reports_2_0_0_20150709.zip

Caveats/Notes

The run yesterday did not include the fixes that were made in PDFBOX-2370 or PDFBOX-2862.
I stopped the batch run early. This only covered ~50k pdfs.
I forgot to turn on accesspermission checking. Some of the pdfs in here would normally have been skipped.
I haven't reviewed any of the exceptions. They may be caused by code on the Tika side.

I'll plan to re-run with the latest trunk on Tuesday. I need to turn back to the actual eval code for a bit. :)

Cheers,
Tim

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
