> On 14 Jul 2015, at 13:49, Tilman Hausherr <thaush...@t-online.de> wrote:
>
> On 14.07.2015 at 22:35, John Hewson wrote:
>>> On 14 Jul 2015, at 13:20, Tilman Hausherr <thaush...@t-online.de> wrote:
>>>
>>> On 14.07.2015 at 21:37, Allison, Timothy B. wrote:
>>>> Interesting, yes: 781/781172.pdf, 490/490376.pdf and 029/029423.pdf. Are
>>>> you running your own regression testing against govdocs1?
>>> Yes, from time to time for the last few months.
>>>
>>>> Is it duplicated effort for me to do anything with 2.0.0?
>>> Partly yes. The only difference is that I didn't do any text extraction.
>>>
>>>> Or is your point that I should wait until PDFBOX-2842 is completed?
>>> Yes :-)
>> Good news, PDFBOX-2842 is now complete.
>
> No, the 029423 file is still throwing an exception :-(
>
Ok, I’ve just fixed this; hopefully it works.

-- John

> Tilman
>
>>
>> -- John
>>
>>> Tilman
>>>
>>>> Thank you!
>>>>
>>>> Best,
>>>>
>>>> Tim
>>>> -----Original Message-----
>>>> From: Tilman Hausherr [mailto:thaush...@t-online.de]
>>>> Sent: Tuesday, July 14, 2015 12:47 PM
>>>> To: dev@pdfbox.apache.org
>>>> Subject: Re: first stack trace report from pdfbox 2.0.0 trunk
>>>>
>>>> Hi Tim,
>>>>
>>>> Currently there is at least one known regression, mentioned in
>>>> PDFBOX-2842; it applies to 029423 but also to other files.
>>>>
>>>> Tilman
>>>>
>>>> On 10.07.2015 at 13:57, Allison, Timothy B. wrote:
>>>>> All,
>>>>> I just posted the first stacktrace report from my initial partial
>>>>> batch run against govdocs1 here:
>>>>> https://issues.apache.org/jira/secure/attachment/12744700/pdfbox_reports_2_0_0_20150709.zip
>>>>>
>>>>> Caveats/Notes
>>>>>
>>>>> The run yesterday did not include the fixes that were made in PDFBOX-2370
>>>>> or PDFBOX-2862.
>>>>>
>>>>> I stopped the batch run early. This only covered ~50k pdfs.
>>>>>
>>>>> I forgot to turn on accesspermission checking. Some of the pdfs in here
>>>>> would normally have been skipped.
>>>>>
>>>>> I haven't reviewed any of the exceptions. They may be caused by code on
>>>>> the Tika side.
>>>>>
>>>>> I'll plan to re-run with the latest trunk on Tuesday. I need to turn
>>>>> back to the actual eval code for a bit. :)
>>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: dev-h...@pdfbox.apache.org
>