Mike Marchywka wrote:
> I would suggest you reiterate your comments to me about asking authors to
> retain
> logical structure if they want to turn information into a work of art.
That's exactly what this article is about:
http://www.govloop.com/forum/topics/humans-vs-machines-is-this-a
--
This
Hi,
On 03/18/2010 06:22 AM, Saurav Ranjit wrote:
> I am working on a project in which I need to extract the content from
> a PDF file. I am able to get the content in byte form, but I am unable
> to decode those bytes properly.
> Is there any iTextSharp class that can hel
I ran into an issue when there is a tab in a chunk: if there is an
underlined chunk after the tab, the underline is offset to the left. It took
me a while to figure out that there was a tab and it did that. Try the code
below and you'll see the issue. I saw a mention that tab support was done in
20
The text is probably in an XObject, depending on how you transformed the PDF.
Paulo
- Original Message -
From: ericvaleyev
To: itext-questions@lists.sourceforge.net
Sent: Thursday, March 18, 2010 8:40 PM
Subject: Re: [iText-questions] Problem reading the content from the PDF
You want to re-flow a PDF document. That can't be done. Not in iText, not
with other tools.
The closest you can come to what you're asking for is deleting blank pages.
But moving existing text and elements to new locations is not possible.
---mr. bean
pcole wrote:
>
> Hi
>
> I'm trying to
> From: lrose...@adobe.com
> To: itext-questions@lists.sourceforge.net
> Date: Thu, 18 Mar 2010 13:16:42 -0700
> Subject: Re: [iText-questions] (no subject)
>
> PDF doesn’t support a “table structure” – you will need to apply
Thanks, good idea to optimise the code. However, it doesn't help me find out
why PDFExtractByte works on the virgin PDFs but can't find the text when the
same code is run after the file has been through iTextSharp.
PDF doesn't support a "table structure" - you will need to apply advanced
heuristics to figure out what is (or isn't) a table and what are its "header",
"columns", etc.
Leonard
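
To give an idea of what such a heuristic might look like, here is a minimal
sketch in plain Java. It assumes you have already extracted positioned text
chunks (text plus x/y coordinates) with whatever library you use. The names
TableRowGrouper, Chunk and groupIntoRows are hypothetical and are not part of
iText or any other PDF library. It only groups chunks into rows by baseline
and orders each row left to right; deciding which row is a header is still a
guess you have to make yourself.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: none of these names come from iText or any PDF library.
class TableRowGrouper {

    // A positioned piece of text, as a PDF text extractor might report it.
    static final class Chunk {
        final String text;
        final float x; // left edge
        final float y; // baseline in PDF coordinates (larger y is higher on the page)

        Chunk(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Groups chunks whose baselines lie within yTolerance of the first chunk
    // of a row. Rows are returned top to bottom, cells left to right.
    static List<List<String>> groupIntoRows(List<Chunk> chunks, float yTolerance) {
        List<Chunk> sorted = new ArrayList<>(chunks);
        sorted.sort(Comparator.comparingDouble((Chunk c) -> -c.y)); // top of page first

        List<List<Chunk>> rows = new ArrayList<>();
        for (Chunk c : sorted) {
            List<Chunk> last = rows.isEmpty() ? null : rows.get(rows.size() - 1);
            if (last != null && Math.abs(last.get(0).y - c.y) <= yTolerance) {
                last.add(c); // same baseline, same row
            } else {
                List<Chunk> row = new ArrayList<>();
                row.add(c);
                rows.add(row);
            }
        }

        List<List<String>> result = new ArrayList<>();
        for (List<Chunk> row : rows) {
            row.sort(Comparator.comparingDouble((Chunk c) -> c.x));
            List<String> cells = new ArrayList<>();
            for (Chunk c : row) {
                cells.add(c.text);
            }
            result.add(cells);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Chunk> chunks = new ArrayList<>();
        chunks.add(new Chunk("Qty", 200f, 700f));
        chunks.add(new Chunk("Part", 50f, 700.5f));
        chunks.add(new Chunk("R100", 50f, 680f));
        chunks.add(new Chunk("12", 200f, 680f));
        // Treating the first row as the header is only an assumption.
        System.out.println(groupIntoRows(chunks, 2f));
    }
}
```

Real documents need more than this (column alignment, cells that wrap over
several lines, ruling lines), so treat it as a starting point only.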
From: Ahmad Amin [mailto:ahmad_a...@siliconexpert.com]
Sent: Thursday, March 18, 2010 5:17 PM
To: itext-questions@list
Hi! iText has an extractFdf method which works perfectly, and we are looking
for an extractXFDF solution. Any advice would be appreciated.
Thanks,
Sam
Hi
I'm trying to extract PDF text content automatically.
The problem is that when I encounter text in different table structures, I
can't differentiate between headers and column values.
I'm using Eclipse as a Java 2 IDE and the most popular PDF libraries (JPedal,
iText, PDFOne Java, PDFBox); all thes
I bet the content parser doesn't handle inline images - you'll need to fix the
parser.
From: Pamela Bondi [mailto:pamelabo...@siwebsrl.com]
Sent: Thursday, March 18, 2010 5:07 PM
To: itext-questions@lists.sourceforge.net
Subject: [iText-questions] '>' not expected at file pointer 23512
Hi,
I'm
What type of "content"? Text only? Raster images? Vector data? 3D? Movies?
Sounds? Other?
From: Saurav Ranjit [mailto:ranjitsau...@gmail.com]
Sent: Thursday, March 18, 2010 11:23 AM
To: itext-questions@lists.sourceforge.net
Subject: [iText-questions] How to Get Content from the PDF documen
Dear iTextSharp Team,
I am working on a project in which I need to extract the content from a
PDF file. I am able to get the content in byte form, but I am unable to
decode those bytes properly.
Is there any iTextSharp class that can help me extract the PDF file content
p
Hi,
I'm extracting text data from a PDF file using
the getTextFromPage(i) method of a PdfTextExtractor.
The error occurs when the PDF contains an image of type "BI / IM".
The error I get is: '>' not expected at file pointer 23512.
Is there a way to solve with iText or do
I would like to see a PDF that triggers this problem.
Paulo
From: Juan Antonio de la Puente [jpue...@dit.upm.es]
Sent: Thursday, March 18, 2010 5:36 PM
To: itext-questions@lists.sourceforge.net
Subject: [iText-questions] BaseFonts.getDocumentFonts crashes
Hi,
I have found that BaseFonts.getDocumentFonts throws a null pointer exception
with some PDF files. I have traced the exception to the end of the
recourseFonts method, in particular line 1449 of the BaseFonts source file:
1446 PdfDictionary xobj = resources.getAsDict(PdfName.XOBJECT);
1
Hi
I'm trying to remove the blank section at the end of a PDF I'm reading in, so
that when I display it on the screen the blank section of the existing PDF
is not shown.
For example, if my document is printed on two pages and one page is half full
of text, how do I find out in code that the se
I've deployed an ASP.NET web application to my web hotel. In the application I
use iTextSharp (the itextSharp.dll is located in the Bin directory). The ISP
uses Medium Trust, and therefore I've recompiled iTextSharp.dll to allow
partial trust.
Most of the time the application works perfectly. But occa
Have you read the PDF/A standard and its section on metadata? Have you read
the tech notes from the PDF/A Competence Center about metadata in PDF/A? If
not, you need to do so.
Short answer - there are those schemas that are predefined (which don't require
an extension schema) and those that a