> On 14 Jul 2015, at 13:20, Tilman Hausherr <thaush...@t-online.de> wrote:
>
> On 14.07.2015 at 21:37, Allison, Timothy B. wrote:
>> Interesting, yes: 781/781172.pdf, 490/490376.pdf and 029/029423.pdf. Are
>> you running your own regression testing against govdocs1?
>
> Yes, from time to time for the last few months.
>
>> Is it duplicated effort for me to do anything with 2.0.0?
>
> Partly yes. The only difference is that I didn't do any text extraction.
>
>> Or, is your point that I should wait until PDFBOX-2842 is completed?
>
> Yes :-)

Good news, PDFBOX-2842 is now complete.

John

>
> Tilman
>
>>
>> Thank you!
>>
>> Best,
>>
>> Tim
>>
>> -----Original Message-----
>> From: Tilman Hausherr [mailto:thaush...@t-online.de]
>> Sent: Tuesday, July 14, 2015 12:47 PM
>> To: dev@pdfbox.apache.org
>> Subject: Re: first stack trace report from pdfbox 2.0.0 trunk
>>
>> Hi Tim,
>>
>> Currently there is at least one known regression, mentioned in
>> PDFBOX-2842; it applies to 029423 but also to other files.
>>
>> Tilman
>>
>> On 10.07.2015 at 13:57, Allison, Timothy B. wrote:
>>> All,
>>> I just posted the first stacktrace report from my initial partial batch
>>> run against govdocs1 here:
>>> https://issues.apache.org/jira/secure/attachment/12744700/pdfbox_reports_2_0_0_20150709.zip
>>>
>>> Caveats/Notes
>>>
>>> The run yesterday did not include the fixes that were made in PDFBOX-2370
>>> or PDFBOX-2862.
>>>
>>> I stopped the batch run early. This only covered ~50k pdfs.
>>>
>>> I forgot to turn on accesspermission checking. Some of the pdfs in here
>>> would normally have been skipped.
>>>
>>> I haven't reviewed any of the exceptions. They may be caused by code on the
>>> Tika side.
>>>
>>> I'll plan to re-run with the latest trunk on Tuesday. I need to turn back
>>> to the actual eval code for a bit.
>>> :)
>>>
>>> Cheers,
>>>
>>> Tim
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
>> For additional commands, e-mail: dev-h...@pdfbox.apache.org