Re: 2.0.28 release?

2023-04-03 Thread Tilman Hausherr

On 30.03.2023 16:27, Tim Allison wrote:

> Reports are here:
> https://corpora.tika.apache.org/base/reports/pdfbox-2.0.27-v-2.0.28-SNAPSHOT.tgz


Thank you Tim!

What I see is:

1) Text missing in TOP_10_MORE_IN_B; these might (all?) be related to
the issue that Andreas reopened

2) Different Arabic text => PDFBOX-4531; hopefully these are improvements

3) Misc improvements; I'll add two of them to my own extraction
regression tests


Tilman


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: 2.0.28 release?

2023-04-01 Thread Tilman Hausherr

On 01.04.2023 11:41, Tilman Hausherr wrote:

> On 30.03.2023 16:27, Tim Allison wrote:
> > Reports are here:
> > https://corpora.tika.apache.org/base/reports/pdfbox-2.0.27-v-2.0.28-SNAPSHOT.tgz
>
> Thank you Tim!
>
> What I see is:
>
> 1) Text missing in TOP_10_MORE_IN_B; these might (all?) be related to
> the issue that Andreas reopened
>
> 2) Different Arabic text => PDFBOX-4531; hopefully these are improvements
>
> 3) Misc improvements; I'll add two of them to my own extraction
> regression tests
>
> Tilman

There is also some improved ligature text extraction, which might also be
related to the PDFBOX-4531 changes. It can be seen in govdocs file 433525.pdf,
on the first page: "Neutron radiation offers" (the "ff" now appears correctly).


Tilman





Re: 2.0.28 release?

2023-03-30 Thread Tim Allison
Reports are here:
https://corpora.tika.apache.org/base/reports/pdfbox-2.0.27-v-2.0.28-SNAPSHOT.tgz

On Tue, Mar 28, 2023 at 10:42 PM Tilman Hausherr wrote:
>
> Yes please!
>
> Thanks
>
> Tilman
>
> On 28.03.2023 19:22, Tim Allison wrote:
> > +1
> >
> > Should I run the regression tests now or is there anything else text
> > related that is still being worked on?
> >
> > On Tue, Mar 28, 2023 at 1:05 PM Tilman Hausherr wrote:
> >> +1
> >>
> >> Tilman
> >>
> >> On 28.03.2023 08:46, Andreas Lehmkuehler wrote:
> >>> Hi,
> >>>
> >>> how about cutting a 2.0.28 release next week on Monday?
> >>>
> >>> there is a bunch of solved tickets and the last release dates back 6
> >>> months
> >>>
> >>> Andreas



Re: 2.0.28 release?

2023-03-28 Thread Tilman Hausherr

Yes please!

Thanks

Tilman

On 28.03.2023 19:22, Tim Allison wrote:
> +1
>
> Should I run the regression tests now or is there anything else text
> related that is still being worked on?
>
> On Tue, Mar 28, 2023 at 1:05 PM Tilman Hausherr wrote:
> > +1
> >
> > Tilman
> >
> > On 28.03.2023 08:46, Andreas Lehmkuehler wrote:
> > > Hi,
> > >
> > > how about cutting a 2.0.28 release next week on Monday?
> > >
> > > there is a bunch of solved tickets and the last release dates back 6
> > > months
> > >
> > > Andreas



Re: 2.0.28 release?

2023-03-28 Thread Andreas Lehmkuehler

On 28.03.23 at 19:22, Tim Allison wrote:
> +1
>
> Should I run the regression tests now or is there anything else text
> related that is still being worked on?

I don't have any text-related TODOs on my list; please run the tests if nobody
else objects.

Andreas

> On Tue, Mar 28, 2023 at 1:05 PM Tilman Hausherr wrote:
> > +1
> >
> > Tilman
> >
> > On 28.03.2023 08:46, Andreas Lehmkuehler wrote:
> > > Hi,
> > >
> > > how about cutting a 2.0.28 release next week on Monday?
> > >
> > > there is a bunch of solved tickets and the last release dates back 6
> > > months
> > >
> > > Andreas



Re: 2.0.28 release?

2023-03-28 Thread Tim Allison
+1

Should I run the regression tests now or is there anything else text
related that is still being worked on?

On Tue, Mar 28, 2023 at 1:05 PM Tilman Hausherr wrote:
>
> +1
>
> Tilman
>
> On 28.03.2023 08:46, Andreas Lehmkuehler wrote:
> > Hi,
> >
> > how about cutting a 2.0.28 release next week on Monday?
> >
> > there is a bunch of solved tickets and the last release dates back 6
> > months
> >
> > Andreas



Re: 2.0.28 release?

2023-03-28 Thread Tilman Hausherr

+1

Tilman

On 28.03.2023 08:46, Andreas Lehmkuehler wrote:
> Hi,
>
> how about cutting a 2.0.28 release next week on Monday?
>
> there is a bunch of solved tickets and the last release dates back 6
> months
>
> Andreas




Re: 2.0.28 release?

2023-03-28 Thread sahy...@fileaffairs.de
+1
Maruan

On Tuesday, 28.03.2023 at 08:46 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> how about cutting a 2.0.28 release next week on Monday?
> 
> there is a bunch of solved tickets and the last release dates back 6
> months
> 
> Andreas