On 14.07.2015 at 21:37, Allison, Timothy B. wrote:
Interesting, yes: 781/781172.pdf, 490/490376.pdf and 029/029423.pdf. Are you
running your own regression testing against govdocs1?
Yes, from time to time for the last few months.
Is it duplicated effort for me to do anything with 2.0.0?
Partly yes. The only difference is that I didn't do any text extraction.
Or, is your point that I should wait until PDFBOX-2842 is completed?
Yes :-)
Tilman
Thank you!
Best,
Tim
-----Original Message-----
From: Tilman Hausherr [mailto:thaush...@t-online.de]
Sent: Tuesday, July 14, 2015 12:47 PM
To: dev@pdfbox.apache.org
Subject: Re: first stack trace report from pdfbox 2.0.0 trunk
Hi Tim,
Currently there is at least one known regression, mentioned in
PDFBOX-2842, it applies to 029423 but also to other files.
Tilman
On 10.07.2015 at 13:57, Allison, Timothy B. wrote:
All,
I just posted the first stacktrace report from my initial partial batch run
against govdocs1 here:
https://issues.apache.org/jira/secure/attachment/12744700/pdfbox_reports_2_0_0_20150709.zip
Caveats/Notes:
* The run yesterday did not include the fixes that were made in PDFBOX-2370 or
  PDFBOX-2862.
* I stopped the batch run early. This only covered ~50k pdfs.
* I forgot to turn on accesspermission checking. Some of the pdfs in here would
  normally have been skipped.
* I haven't reviewed any of the exceptions. They may be caused by code on the
  Tika side.
I plan to re-run with the latest trunk on Tuesday. I need to turn back to
the actual eval code for a bit. :)
Cheers,
Tim
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org