https://home.snafu.de/tilman/tmp/reports_pdfbox_3.0.1_vs_3.0.2_2.tar.xz

Only one possible regression now:

bug_trackers/poppler/poppler-106962-0.zip-0.pdf

However, I found no difference when running ExtractText with PDFBox.

I also counted the two top tokens and the numbers match. So I guess this is a Tika error, maybe a sporadic one, because it wasn't in the first test.

TOP_N_TOKENS_A
漏洞: 164 | 数据: 112 (...)
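
For anyone who wants to repeat the token check, here is a minimal sketch of how the count could be done on the text that ExtractText writes out. It assumes a plain non-overlapping substring count is a fair stand-in for how the report tokenizes this text, which may not be exactly what Tika does; count_tokens and the file names in the note below are made up for illustration.

```python
def count_tokens(text, tokens):
    """Count non-overlapping occurrences of each token in the text.

    This is a plain substring count, not the tokenizer used for the
    report, so the result is only a rough cross-check.
    """
    return {token: text.count(token) for token in tokens}


def count_tokens_in_file(path, tokens, encoding="utf-8"):
    """Read a text file (e.g. ExtractText output) and count the tokens in it."""
    with open(path, encoding=encoding) as f:
        return count_tokens(f.read(), tokens)
```

Running count_tokens_in_file on the output of each version (say, hypothetical files out_3.0.1.txt and out_3.0.2.txt) with the tokens "漏洞" and "数据" and comparing the two results would show whether the extracted text really differs.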

Tilman
