Re: [iText-questions] PDFTextExtractor returns an exception - 'Input string was not in a correct format" when parsing this file

2012-02-04 Thread Leonard Rosenthol
Just because Adobe Reader processes your file does NOT make it valid PDF. Reader is EXTREMELY lenient because the average user would have no way to fix such crappy PDFs. If you ran this file through the PDF validator in Acrobat, I would bet it would fail. Leonard From: RIchard Hammond [mailt

Re: [iText-questions] PDFTextExtractor returns an exception - 'Input string was not in a correct format" when parsing this file

2012-02-04 Thread Leonard Rosenthol
It's not valid PDF. Leonard -Original Message- From: Kevin Day [mailto:ke...@trumpetinc.com] Sent: Saturday, February 04, 2012 5:23 PM To: itext-questions@lists.sourceforge.net Subject: Re: [iText-questions] PDFTextExtractor returns an exception - 'Input string was not in a correct form

Re: [iText-questions] PDFTextExtractor returns an exception - 'Input string was not in a correct format" when parsing this file

2012-02-04 Thread RIchard Hammond
Kevin - the stack trace wasn't posted because the problem is easily reproducible. If you use Adobe Acrobat Reader to view the PDF, it looks fine; I tested this before I posted the query. The original file was produced by a power company (who should know what they are doing). Also, I have use

Re: [iText-questions] PDFTextExtractor returns an exception - 'Input string was not in a correct format" when parsing this file

2012-02-04 Thread Kevin Day
Next time, post the stack trace! For everyone's reference, here is the stack trace: java.lang.RuntimeException: - is not a valid number - java.lang.NumberFormatException: For input string: "-" at com.itextpdf.text.pdf.PdfNumber.<init>(PdfNumber.java:83) at com.itextpdf.text.pdf.PdfCont
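
The NumberFormatException in the trace above is what plain Java number parsing raises for a lone "-" token. The following is a minimal sketch of that behavior, not iText's actual code: the class `DashTokenDemo` and the helper `parsesAsNumber` are hypothetical names introduced only for illustration, and the assumption is that the malformed file contains a bare "-" where a numeric operand is expected.

```java
class DashTokenDemo {
    // Hypothetical helper (not part of iText): reports whether a token
    // can be parsed as a number by the standard library.
    static boolean parsesAsNumber(String token) {
        try {
            Double.parseDouble(token.trim());
            return true;
        } catch (NumberFormatException e) {
            // A lone "-" has a sign but no digits, so parsing fails here.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String token : new String[] {"-12.5", "0", "-"}) {
            System.out.println("\"" + token + "\" parses as number: " + parsesAsNumber(token));
        }
    }
}
```

A lenient viewer may simply skip or default such a token, which would be consistent with the file displaying in Reader while a stricter parser rejects it.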

Re: [iText-questions] Possible bug in PdfTextExtractor.GetTextFromPage [iTextSharp]

2012-02-04 Thread Kevin Day
I believe that the bug in LocationTextExtractionStrategy.GetResultantText() was fixed some time ago - did you experience this problem with the latest code in HEAD? For reference, the line in question in SVN has the following (And startsWithSpace and endsWithSpace have the null and empty conditions
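
As an illustration of the null and empty conditions mentioned above, here is a minimal sketch of what such guarded checks could look like. This is not the code from the iText repository; the class name `SpaceChecks` and both method bodies are assumptions written only to show how the guards avoid an index-out-of-bounds error on empty strings.

```java
class SpaceChecks {
    // Assumed shape of the guard: null and empty strings never start with a space.
    static boolean startsWithSpace(String s) {
        return s != null && !s.isEmpty() && s.charAt(0) == ' ';
    }

    // Same guard for the last character; without the isEmpty() check,
    // charAt(s.length() - 1) would throw on an empty string.
    static boolean endsWithSpace(String s) {
        return s != null && !s.isEmpty() && s.charAt(s.length() - 1) == ' ';
    }

    public static void main(String[] args) {
        System.out.println(startsWithSpace(" text"));
        System.out.println(endsWithSpace(""));
        System.out.println(startsWithSpace(null));
    }
}
```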

Re: [iText-questions] Possible bug in PdfTextExtractor.GetTextFromPage [iTextSharp]

2012-02-04 Thread newton
I have the same problem with index outside the bounds... I was converting the following PDF => http://zbierka.sk/ov/kapitoly/default.aspx?KapitolaID=64396&FileName=ov2012-018-01&Rocnik=2012&TypKapitolyID=1

[iText-questions] iTextSharp - XMLWorkerHelper - Parsing HTML into a list of Element objects

2012-02-04 Thread Dean McCarthy
Could somebody please provide an example of parsing HTML into a list of elements using XMLWorkerHelper in iTextSharp (C# or VB)? The Java version as given in the documentation is: XMLWorkerHelper.getInstance().parseXHtml(new ElementHandler() { public void add(final Writable w) { if (w instanceof

Re: [iText-questions] PdfPTable .setExtendLastRow() does not work - why?

2012-02-04 Thread 1T3XT BVBA
Hello Jens, please subscribe to the mailing list. If you don't, you risk this scenario: http://lowagie.com/itextml As for your question, I assume you don't want the first table to be extended. In that case, you should adapt your code like this: import java.io.FileOutputStream; import com.itextp

[iText-questions] PdfPTable .setExtendLastRow() does not work - why?

2012-02-04 Thread Jens Haberer
import com.itextpdf.text.Document; import com.itextpdf.text.PageSize; import com.itextpdf.text.Phrase; import com.itextpdf.text.pdf.PdfPCell; import com.itextpdf.text.pdf.PdfPTable; import com.itextpdf.text.pdf.PdfWriter; public class PDFTest { public static void main(String args[]) {