On 14.07.2015 at 21:37, Allison, Timothy B. wrote:
Interesting, yes: 781/781172.pdf, 490/490376.pdf and 029/029423.pdf.  Are you 
running your own regression testing against govdocs1?

Yes, from time to time for the last few months.

Is it duplicated effort for me to do anything with 2.0.0?

Partly yes. The only difference is that I didn't do any text extraction.

Or, is your point that I should wait until PDFBOX-2842 is completed?

Yes :-)

Tilman


Thank you!

Best,

           Tim
-----Original Message-----
From: Tilman Hausherr [mailto:thaush...@t-online.de]
Sent: Tuesday, July 14, 2015 12:47 PM
To: dev@pdfbox.apache.org
Subject: Re: first stack trace report from pdfbox 2.0.0 trunk

Hi Tim,

Currently there is at least one known regression, mentioned in
PDFBOX-2842, it applies to 029423 but also to other files.

Tilman

On 10.07.2015 at 13:57, Allison, Timothy B. wrote:
All,
    I just posted the first stacktrace report from my initial partial batch run
against govdocs1 here:
https://issues.apache.org/jira/secure/attachment/12744700/pdfbox_reports_2_0_0_20150709.zip

Caveats/Notes

The run yesterday did not include the fixes that were made in PDFBOX-2370 or 
PDFBOX-2862.

I stopped the batch run early. This only covered ~50k pdfs.

I forgot to turn on accesspermission checking. Some of the pdfs in here would 
normally have been skipped.

I haven't reviewed any of the exceptions. They may be caused by code on the 
Tika side.

I'll plan to re-run with the latest trunk on Tuesday.  I need to turn back to 
the actual eval code for a bit. :)


Cheers,

            Tim




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org


