Thanks very much for this information.
Maybe you could offer me some direction on how to solve my problem?
I need to parse PDF mobile phone bills. The information I require is
the itemized data that is in a table format. Is this possible with
itextpdf?
On 19 June 2010 08:44, 1T3XT info
Date: Mon, 21 Jun 2010 09:49:44 +0100
From: b...@benshort.co.uk
To: itext-questions@lists.sourceforge.net
Subject: Re: [iText-questions] NPE while Extracting text
Thanks very much for this information.
Maybe you could offer me some direction
The trick here is obtaining a mapping between the type 3 font glyphs and some
sort of encoded text. There are several ways that this can be done, and
they are fairly well supported by the text parser - but type 3 fonts, as has
been mentioned, don't *usually* have this sort of mapping
to Unicode values.
Leonard
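To make the mapping idea concrete, here is a minimal sketch in plain Java. It is not iText code: the class name GlyphMapSketch, the decode helper and the code-to-Unicode table are all hypothetical, and the delta/sigma assignments are made up for illustration. It only shows what "a mapping between glyph codes and encoded text" buys you, and what happens to codes that have no entry.

```java
import java.util.HashMap;
import java.util.Map;

public class GlyphMapSketch {

    /**
     * Decodes raw character codes through a code-to-Unicode table.
     * Codes without an entry become U+FFFD (the replacement character),
     * so missing mappings stay visible instead of silently vanishing.
     */
    static String decode(byte[] codes, Map<Integer, String> toUnicode) {
        StringBuilder sb = new StringBuilder();
        for (byte b : codes) {
            int code = b & 0xFF; // treat each byte as an unsigned character code
            String mapped = toUnicode.get(code);
            sb.append(mapped != null ? mapped : "\uFFFD");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical table for a font that only defines two glyphs.
        Map<Integer, String> toUnicode = new HashMap<>();
        toUnicode.put(1, "\u0394"); // assumption: code 1 draws a delta shape
        toUnicode.put(2, "\u03A3"); // assumption: code 2 draws a sigma shape

        // Code 3 has no mapping, so it is decoded as U+FFFD.
        System.out.println(decode(new byte[] {1, 2, 3}, toUnicode));
    }
}
```

Without such a table the extractor has only glyph codes to work with, which is why a Type 3 font that ships no mapping is hard to turn into readable text.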
-Original Message-
From: Kevin Day [mailto:ke...@trumpetinc.com]
Sent: Monday, June 21, 2010 5:52 PM
To: itext-questions@lists.sourceforge.net
Subject: Re: [iText-questions] NPE while Extracting text
The trick here is obtaining a mapping between the type 3 font
Ben Short wrote:
subType is /Type3
Does this help identify the problem?
Yes, but it doesn't bring us closer to a solution.
Type 3 fonts are user defined fonts.
See for instance:
http://itextpdf.com/examples/index.php?page=example&id=200
In that example, a 'delta' and 'sigma' shaped glyph was
Hi Kevin,
I'm happy to dig in to the code. Can you point me to a place to start debugging?
Ben
On 18 June 2010 00:04, Kevin Day ke...@trumpetinc.com wrote:
ok - most likely the font is using an encoding that we just don't have
support for yet. The encodings are a bit of a hack right now, so
Hi,
I have debugged and found that in the displayPdfString method of the
PdfContentStreamProcessor class the string parameter is valid but it
is decoded to a string of the same length but all bytes are set to 0.
private void displayPdfString(PdfString string){
String unicode =
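For anyone wanting to confirm the same symptom, a small check like the one below may help. It is only a sketch in plain Java, not part of iText: NulCheckSketch and isAllNul are hypothetical names, and you would have to call the helper yourself on whatever string the decoding step hands back.

```java
public class NulCheckSketch {

    /**
     * Returns true when the string is non-empty and every char is U+0000,
     * which is the "same length but all zeros" symptom described above.
     */
    static boolean isAllNul(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String decoded = new String(new char[5]); // five NUL chars, stands in for a bad decode
        System.out.println(isAllNul(decoded));
        System.out.println(isAllNul("Hello"));
    }
}
```

A string that passes this check has the right length but carries no usable text, so the problem lies in the code-to-character mapping rather than in reading the string from the content stream.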
Hi,
I downloaded and built the latest source code and the exception is no
longer thrown. Now I'm left with a file that's 101KB in size but shows
no content when opened in WordPad.
Am I missing something?
Ben
On 17 June 2010 09:08, Ben Short b...@benshort.co.uk wrote:
Hi Kevin,
Thats for
-Original Message-
From: Ben Short [mailto:b...@benshort.co.uk]
Sent: Thursday, June 17, 2010 2:47 PM
To: Post all your questions about iText here
Subject: Re: [iText-questions] NPE while Extracting text
Hi,
I downloaded and built the latest source code and the exception is no
longer thrown. Now
ok - most likely the font is using an encoding that we just don't have
support for yet. The encodings are a bit of a hack right now, so these
unusual cases are tough to deal with.
If you are willing to dig in to the code, I can provide assistance.
- K
I will add to Mark's (excellent) stream of consciousness analysis:
The next step is to see what the name of the font resource is that is
causing the problem. Then, load RUPS and dig into the page dictionary and
find the entry for that font resource - given what Mark is showing in the
source,
On Mark's advice I downloaded the source code from the 5.0.2 branch
and dug a little deeper...
The NPE is thrown on the following line of the DocumentFont constructor.
fontName = PdfName.decodeName(font.getAsName(PdfName.BASEFONT).toString());
It turns out that font.getAsName(PdfName.BASEFONT)
-Original Message-
From: Ben Short [mailto:b...@benshort.co.uk]
Sent: Wednesday, June 16, 2010 3:12 PM
To: Post all your questions about iText here
Subject: Re: [iText-questions] NPE while Extracting text
On Mark's advice I downloaded
ok - I ran into this issue myself a month or so ago. It's been fixed in the
5.0.3 codebase (which is the current HEAD in SVN).
/** Creates a new instance of DocumentFont */
DocumentFont(PRIndirectReference refFont) {
encoding = "";
fontSpecific = false;
Mark - FYI, basefont isn't required for Type3 fonts (or TrueType for that
matter). I had the same reaction when I first ran into this issue, but the
spec never lies, right? It just injects ambiguity and confusion.
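For readers hitting the same NPE, the general shape of a defensive lookup might look like the sketch below. This is not the 5.0.3 fix (that code is not shown in this thread) and it does not use the iText API: the font dictionary is modelled as a plain Map, and BaseFontSketch, fontName and the "Unnamed" fallback are hypothetical choices made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class BaseFontSketch {

    /**
     * Returns a usable font name even when the dictionary has no BaseFont
     * entry, instead of dereferencing a null lookup result.
     */
    static String fontName(Map<String, String> fontDict) {
        String baseFont = fontDict.get("BaseFont");
        if (baseFont != null) {
            return baseFont;
        }
        // No BaseFont entry: build a placeholder from the subtype, if any.
        String subtype = fontDict.get("Subtype");
        return "Unnamed" + (subtype != null ? subtype : "Font");
    }

    public static void main(String[] args) {
        Map<String, String> type3 = new HashMap<>();
        type3.put("Subtype", "Type3"); // a Type 3 font dictionary with no BaseFont

        Map<String, String> type1 = new HashMap<>();
        type1.put("Subtype", "Type1");
        type1.put("BaseFont", "Helvetica");

        System.out.println(fontName(type3));
        System.out.println(fontName(type1));
    }
}
```

The point is simply to test the lookup result before calling anything on it, since a Type 3 font dictionary may legitimately lack the entry.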
Hi,
I'm trying to use iText 5.0.2 to extract the text from a PDF file
using the following code...
PdfReader reader = new PdfReader("C:/development/May.pdf");
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
PrintWriter out = new
-Original Message-
From: Ben Short [mailto:b...@benshort.co.uk]
Sent: Tuesday, June 15, 2010 1:36 PM
To: itext-questions@lists.sourceforge.net
Subject: [iText-questions] NPE while Extracting text
Hi,
I'm