Thank you all. You do a great service.
I am up and running.
Thanks,
Pulkit
On Thu, Feb 2, 2017 at 3:19 PM, Tilman Hausherr
wrote:
> On 02.02.2017 at 21:12, Pulkit Kapur wrote:
>
>> I am getting just the headers:
>> "2016 IEEE/RSJ International Conference on Intelligent Robots and Systems
>> (
On 02.02.2017 at 21:12, Pulkit Kapur wrote:
I am getting just the headers:
"2016 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS)
Daejeon Convention Center
October 9-14, 2016, Daejeon, Korea
978-1-5090-3761-2/16/$31.00 ©2016 IEEE 5324
5325
5326
5327
5328
5329
5330
5331
"
I did use the new file path:
javaaddpath('C:\Us
On 02.02.2017 at 20:26, Pulkit Kapur wrote:
Thanks. That's what I would expect to read.
Also, thanks for pointing to the latest version. I pointed to the
pdfbox-app-2.0.4.jar and the fontbox-2.0.4.jar files.
Since I want to read over 1000 PDF documents programmatically in MATLAB, I
am not using the command line, but using the Java library in
On 02.02.2017 at 19:59, Pulkit Kapur wrote:
My apologies. This was very careless of me. I did not realize Scribd would
want you to register to download.
I have uploaded the document here: http://www.filedropper.com/0024iros2016
My code is in MATLAB (not the command-line interface) and I am using
*PDFBox-0.7.3.jar* and *FontBox-0.1.0.jar*
I
On 02.02.2017 at 16:10, Pulkit Kapur wrote:
Hi
I have uploaded the pdf here:
https://www.scribd.com/document/338221804/0024-iros-2016
Hello Pulkit,
This site requires registration. This is a "don't" from the list:
https://pdfbox.apache.org/support.html
I don't want to register.
Please find
.com] On Behalf Of Pulkit
Kapur
Sent: Thursday, February 2, 2017 10:34 AM
To: users@pdfbox.apache.org
Subject: Re: Fwd: Trouble reading IEEE pdf
Thanks, Karl, for the reply.
That's helpful.
What confuses me is this: "very likely because usually such an XObject would
just be an image"
-
Karl,
Got it.
I understand the point about XObjects and how PDFBox might be missing the
XObject because typically they are images.
I am hoping someone here might have had luck making PDFBox get data from
XObject elements that contain text.
Thanks,
Pulkit
On Thu, Feb 2, 2017 at 10:36 AM, Karl He
Pulkit,
I did not say that in your document the XObjects are images, I said that
they usually are just images. When you analyze 100 random PDF documents,
chances are that most of them only use the XObject construct for
images and vector graphics, not for elements that contain text. Your
docume
Thanks, Karl, for the reply.
That's helpful.
What confuses me is this: "very likely because usually such an XObject would
just be an image"
-> I am able to select the underlying text in the XObject using Acrobat and
copy/paste it.
That's why I am confused about why PDFBox cannot access the XObject.
Perhaps
The document does not contain layers (or optional content groups as they
are called in PDF), the problem seems to be that the actual text of
the document is in an XObject - something that is completely legal in a PDF
file. I suspect that the text was created in one application, and then a
second ap
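
To illustrate the point about text living inside an XObject, here is a toy sketch in Java. It is not PDFBox code and it does not parse real PDF syntax; the `Op` class, the `extract` method, and the XObject name `Fm0` are all made up for this example. It only shows why an extractor that does not descend into XObjects would return just the text drawn directly on the page (headers and footers), while one that follows the references also returns the body.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT PDFBox code, and it does not parse real PDF syntax.
class XObjectTextDemo {

    // One content-stream operation: show a piece of text, or draw a named XObject.
    static final class Op {
        final boolean isText;
        final String arg; // the text itself, or the name of the XObject to draw

        Op(boolean isText, String arg) {
            this.isText = isText;
            this.arg = arg;
        }
    }

    // Collects text in stream order. XObject references are followed only when
    // followXObjects is true. Assumes XObjects never reference each other in a cycle.
    static List<String> extract(List<Op> stream, Map<String, List<Op>> xobjects,
                                boolean followXObjects) {
        List<String> out = new ArrayList<>();
        for (Op op : stream) {
            if (op.isText) {
                out.add(op.arg);
            } else if (followXObjects && xobjects.containsKey(op.arg)) {
                out.addAll(extract(xobjects.get(op.arg), xobjects, true));
            }
        }
        return out;
    }

    // Hypothetical page: header and footer drawn directly, body placed via XObject "Fm0".
    static List<Op> samplePage() {
        return Arrays.asList(new Op(true, "HEADER"), new Op(false, "Fm0"), new Op(true, "FOOTER"));
    }

    // Hypothetical resources: the XObject "Fm0" holds the body text.
    static Map<String, List<Op>> sampleXObjects() {
        Map<String, List<Op>> xobjects = new HashMap<>();
        xobjects.put("Fm0", Arrays.asList(new Op(true, "BODY TEXT")));
        return xobjects;
    }

    public static void main(String[] args) {
        // Without following XObjects: only the directly drawn text is found.
        System.out.println(extract(samplePage(), sampleXObjects(), false));
        // Following XObjects: the body text appears in its place in the stream.
        System.out.println(extract(samplePage(), sampleXObjects(), true));
    }
}
```

With the sample data, the first call yields only `HEADER` and `FOOTER`, and the second yields `HEADER`, `BODY TEXT`, `FOOTER`. Whether a given PDFBox version behaves like the first or the second case is not something this sketch can tell you.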
Hi
I have uploaded the pdf here:
https://www.scribd.com/document/338221804/0024-iros-2016
I did some more diagnosis last night and it seems that there are two layers
in the PDF: one with the content and the other with headers and
footers. PDFBox is only reading the headers and footers.
I sus
On 02.02.2017 at 05:55, Pulkit Kapur wrote:
Hi
I am trying to read some past years' IEEE conference proceedings that I have.
I can read the PDF using Acrobat and select the text.
But when I try to read the text using the readText function from the
PDFBox library, I only get the headers and footers in t