Thanks Maruan,
I got the explanation.
Slava
On Wed, Jan 22, 2020 at 12:18 PM Maruan Sahyoun wrote:
> Hi,
>
> please take a look at the FAQ at
>
> https://pdfbox.apache.org/2.0/faq.html#how-come-i-am-getting-gibberishg38g43g36g51g5-when-extracting-text
>
> BR
> Maruan
>
> > Hi,
> > I have PDF, wh
Hi,
I have a PDF which looks fine in readers, but when I try to extract
text I get garbage.
What am I doing wrong?
PDF is attached.
Thanks
Well, it seems that it'll be fixed in PDFBox 2.0.16.
On Wed, May 15, 2019 at 5:35 PM Slava G wrote:
> Will definitely try. Is this RC available via Maven?
>
> On Wed, May 15, 2019, 17:20 Tim Allison wrote:
>
>> Yay! Tilman and colleagues on PDFBox really are _that_ fast.
Got you.
Thanks
On Thu, May 16, 2019 at 6:42 AM Tilman Hausherr wrote:
> Am 15.05.2019 um 21:57 schrieb Slava G:
> > But I tried to extract text using 2.0.15 and immediately got an exception
> > and didn't get an OOM.
>
>
> I got slow response on the seco
But I tried to extract text using 2.0.15 and immediately got an exception and
didn't get an OOM.
On Wed, May 15, 2019, 22:52 Tilman Hausherr wrote:
> Am 15.05.2019 um 16:00 schrieb Slava G:
> > But it seems that it's already fixed in PDFBox 2.0.15, as when I run
> > tika-app
>
org/thread.html/2c027535156cc6862149490b289552d72ba5a9bff985fb7cce794e21@%3Cdev.tika.apache.org%3E
>
> On Wed, May 15, 2019 at 10:01 AM Slava G wrote:
>
> > Sure, I can share it privately.
> > But it seems that it's already fixed in PDFBox 2.0.15, as when I run
> > tika-app (1.20) it's
4:54 PM Tim Allison wrote:
> Sounds like it might be a bug.
>
> PDFBox colleagues, any recs?
>
> Slava, if you’re able to share the file even if only privately, that’ll
> help.
>
> On Wed, May 15, 2019 at 9:49 AM Slava G wrote:
>
> > I have small pdf file (142kb) whi
Tim, to what email address should I send the PDF?
Thanks
On Thu, Feb 28, 2019 at 3:57 PM Slava G wrote:
> I will, once I get the customer's approval.
> Meanwhile I can do any checks, if you can specify what to check.
> Thanks
>
> On Thu, Feb 28, 2019 at 3:56 PM Tim Allison
I will, once I get the customer's approval.
Meanwhile I can do any checks, if you can specify what to check.
Thanks
On Thu, Feb 28, 2019 at 3:56 PM Tim Allison wrote:
> Any chance you can share the file directly with me or someone else on the
> PDFBox team?
>
> On Wed, Feb 2
rehoster (e.g. filedropper.com) and put the file
> into an encrypted ZIP. Please send the link and the password to
> tilman at snafu dot de. Make sure you're not breaking any laws by
> sending the file.
>
> Tilman
>
>
> Am 27.02.2019 um 17:33 schrieb Slava G:
> > As this is c
's going on.
>
> If you can't share it, you'll have to investigate yourself by using the
> profiler. Before that, try with old 2.0.* versions to see if these are
> faster.
>
> Tilman
>
> Am 27.02.2019 um 17:23 schrieb Slava G:
> > After 3h 40m it
After 3h 40m it's still parsing using the PDFBox 2.0.14 app...
Thanks
On Wed, Feb 27, 2019 at 3:29 PM Slava G wrote:
> With 2.0.14 it has been running for 40 minutes with no result, still working...
> It seems that the issue is still there.
> Thanks
>
> On Wed, Feb 27, 2019 at 2:52 PM Slava G
With 2.0.14 it has been running for 40 minutes with no result, still working...
It seems that the issue is still there.
Thanks
On Wed, Feb 27, 2019 at 2:52 PM Slava G wrote:
> Checking with 2.0.14. Started as an app. Will update soon.
>
> On Wed, Feb 27, 2019 at 2:47 PM Tim Allison wrote:
>
>>
b 27, 2019 at 3:04 AM Slava G wrote:
>
>> Well, I ran the PDFBox app (as was suggested) to extract text; so far 2
>> hours and still counting...
>> It seems to be a PDFBox issue.
>>
>> On Wed, Feb 27, 2019 at 9:51 AM JB Data31 wrote:
>>
>>> Why
ed message -
>> > From: Tim Allison
>> > Date: Tue, Feb 26, 2019 at 12:13 PM
>> > Subject: Re: Very slow PDF parsing.
>> > To:
>> >
>> >
>> > Sorry...that's an OCR tool. One thing that can slow down processing
>>