@Tilman
Thanks again for running the tests.

Looks good to me, so I'm planning to cut the release tomorrow evening, about 28 hours from now.

Andreas


On 06.07.24 at 19:17, Tilman Hausherr wrote:
Result:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32_3.tar.xz

to be compared against

https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32.tar.xz

I couldn't find any visible difference except the file sizes. This might be because of the path names or some metadata.

Tilman

On 06.07.2024 14:19, Tilman Hausherr wrote:
Hi,

I've just started a new "B" test.

Tilman

On 06.07.2024 13:29, Andreas Lehmkühler wrote:
Hi,

After closing https://issues.apache.org/jira/browse/PDFBOX-5838 I'd like to finally cut the 2.0.32 release.

Do we need a new regression test due to the latest changes?

There are some related changes such as https://issues.apache.org/jira/browse/PDFBOX-5843 and the recent refactoring in fontbox.

Andreas


On 14.06.24 at 13:03, Tilman Hausherr wrote:
Result:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32_2.tar.xz

From what I see, nothing to do.
And I know the time it takes: 3 hours for the A (or B) test, 1 hour to create the A vs B report (tika-eval).

Tilman

On 14.06.2024 08:47, Tilman Hausherr wrote:
I'll repeat the regression tests, locally reverting the change from PDFBOX-5790 but locally adding my proposed xmpbox change from PDFBOX-5835. This way we'll know whether there are other problems.

Tilman

On 13.06.2024 19:23, Tilman Hausherr wrote:
See https://issues.apache.org/jira/browse/PDFBOX-5838

I hope that it's all the same problem.

Tilman

On 13.06.2024 18:30, Andreas Lehmkühler wrote:
Thanks for running the tests.

The exceptions part looks good, but I'm afraid we have a text extraction issue.

commoncrawl3_refetched/JA/JA77WEHMKS2T5LCXM42OXFJ3OSBNRDTI

Some of the special characters changed. In 2.0.31 they were "omitted" and in 2.0.32 there is some special char. But the remaining part looks good to me.


cc-main-2021-31-pdf-untruncated/0085/0085885.pdf

It seems to contain some special characters as well, but 2.0.31 is able to extract them. 2.0.32 seems to mix up some of the content.

I guess it is somehow font related. I need to investigate more.

Andreas


On 12.06.24 at 20:23, Tilman Hausherr wrote:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32.tar.xz

No new exceptions but many content differences. I haven't investigated yet.

Tilman

On 12.06.2024 11:31, Tilman Hausherr wrote:
I've started the tests. If there aren't any troubles I'll have the results tomorrow.

Tilman

On 05.06.2024 08:07, Andreas Lehmkühler wrote:
Thanks for the update.

I'm going to postpone the release as I'll need any helping hand I can get.

Andreas

On 02.06.24 at 14:22, Tilman Hausherr wrote:
+1 but I won't be able to help with tests this time

Tilman

On 01.06.2024 12:15, Andreas Lehmkühler wrote:
Hi,

IMHO it is time to cut another 2.0.x release.

I'm planning to do so in a week or so.

Any objections or is there something we should add/fix first?

Andreas


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org

