It is still tricky but I'm on it.

Sorry for the noise

Andreas

On 08.07.24 at 20:06, Andreas Lehmkühler wrote:
There is an issue with the changes from https://issues.apache.org/jira/browse/PDFBOX-5789


I have to postpone the release to solve the issue first.

Sorry for the inconvenience

Andreas

On 08.07.24 at 19:02, Andreas Lehmkühler wrote:
Looks good to me, I'm starting the release process ...

On 08.07.24 at 08:43, Tilman Hausherr wrote:
Last one:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32_4.tar.xz

This is because of the last change I made yesterday.

Tilman

On 06.07.2024 19:17, Tilman Hausherr wrote:
Result:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32_3.tar.xz

to be compared against

https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32.tar.xz

I couldn't find a difference visually except the file sizes. This might be because of the path names or some metadata.

Tilman

On 06.07.2024 14:19, Tilman Hausherr wrote:
Hi,

I've just started a new "B" test.

Tilman

On 06.07.2024 13:29, Andreas Lehmkühler wrote:
Hi,

after closing https://issues.apache.org/jira/browse/PDFBOX-5838 I'd like to finally cut the 2.0.32 release.

Do we need a new regression test due to the latest changes?

There are some related changes such as https://issues.apache.org/jira/browse/PDFBOX-5843 and the recent refactoring in fontbox.

Andreas


On 14.06.24 at 13:03, Tilman Hausherr wrote:
Result:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32_2.tar.xz

From what I see, nothing to do.
And I know the time it takes: 3 hours for the A (or B) test, 1 hour to create the A vs B report (tika-eval).

Tilman

On 14.06.2024 08:47, Tilman Hausherr wrote:
I'll repeat the regression tests, locally reverting the change from PDFBOX-5790 but locally adding my proposed xmpbox change from PDFBOX-5835. This way we'll know whether there are other problems.

Tilman

On 13.06.2024 19:23, Tilman Hausherr wrote:
See https://issues.apache.org/jira/browse/PDFBOX-5838

I hope that it's all the same problem.

Tilman

On 13.06.2024 18:30, Andreas Lehmkühler wrote:
Thanks for running the tests.

The exceptions part looks good, but I'm afraid we have a text extraction issue.

commoncrawl3_refetched/JA/JA77WEHMKS2T5LCXM42OXFJ3OSBNRDTI

Some of the special characters changed. In 2.0.31 they were "omitted" and in 2.0.32 there is some special char. But the remaining part looks good to me.


cc-main-2021-31-pdf-untruncated/0085/0085885.pdf

It seems to contain some special characters as well, but 2.0.31 is able to extract them. 2.0.32 seems to mix up some of the content.

I guess it is somehow font related. Need to investigate more.

Andreas


On 12.06.24 at 20:23, Tilman Hausherr wrote:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.31_vs_2.0.32.tar.xz

No new exceptions but many content differences. I haven't investigated yet.

Tilman

On 12.06.2024 11:31, Tilman Hausherr wrote:
I've started the tests. If there isn't any trouble, I'll have the results tomorrow.

Tilman

On 05.06.2024 08:07, Andreas Lehmkühler wrote:
Thanks for the update.

I'm going to postpone the release as I'll need any helping hand I can get.

Andreas

On 02.06.24 at 14:22, Tilman Hausherr wrote:
+1 but I won't be able to help with tests this time

Tilman

On 01.06.2024 12:15, Andreas Lehmkühler wrote:
Hi,

IMHO it is time to cut another 2.0.x release.

I'm planning to do so in a week or so.

Any objections or is there something we should add/fix first?

Andreas


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org

