On 30.03.2023 16:27, Tim Allison wrote:
Reports are here:
https://corpora.tika.apache.org/base/reports/pdfbox-2.0.27-v-2.0.28-SNAPSHOT.tgz
Thank you, Tim!
What I see is
1) Text missing in TOP_10_MORE_IN_B; some or all of these might be
related to the issue that Andreas reopened
2) Different Arabic text => PDFBOX-4531; hopefully these are improvements
3) Misc improvements; I'll add two of them to my own extraction
regression tests
Tilman