Dear All, I am using iTextSharp in my application and have found its text extraction capabilities excellent. I am facing one problem, though. I use the PdfTextExtractor.GetTextFromPage method, but pieces of text that are far apart on the page come back separated by only a single space. Take the following example (as displayed in Acrobat):
User name: abcdef                    Password: Cool1234

In the PDF there are no space characters between "abcdef" and "Password"; the two are simply positioned far apart. If I extract the above text using PdfTextExtractor.GetTextFromPage, I get the following result:

User name: abcdef Password: Cool1234

So the distance between the two words was cut down to a single space. What I need is for words that are separated by a larger distance, rather than by a space, to be separated by a TAB in the resulting text.

I am guessing that I should abandon PdfTextExtractor.GetTextFromPage and use the LocationTextExtractionStrategy class combined with TextRenderInfo, but I have no clue how. I'd be eternally grateful if anyone could point me in the right direction. C#, VB.NET, or Java samples are all appreciated.

Thank you in advance for your kind help.

Best Regards,
Daniel

_______________________________________________
iText-questions mailing list
https://lists.sourceforge.net/lists/listinfo/itext-questions
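
For reference, the gap-to-TAB behaviour asked for above might be sketched roughly as follows. This is only a minimal illustration in plain Java, not iText's own implementation: the `Chunk` class, the `GapTabJoiner` class, and the `TAB_FACTOR` threshold are all hypothetical names and values chosen for the example. In a real custom extraction strategy, the coordinates would presumably be taken from each `TextRenderInfo` (for instance the start and end points of its baseline and its single-space width), and the chunks are assumed here to already be on one line and sorted left to right.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for one piece of text on a single line.
// With iText, these values would have to be filled in from TextRenderInfo.
final class Chunk {
    final String text;
    final float startX;      // x coordinate of the chunk's left edge
    final float endX;        // x coordinate of the chunk's right edge
    final float spaceWidth;  // width of one space character in the chunk's font

    Chunk(String text, float startX, float endX, float spaceWidth) {
        this.text = text;
        this.startX = startX;
        this.endX = endX;
        this.spaceWidth = spaceWidth;
    }
}

final class GapTabJoiner {
    // Assumed threshold: a gap wider than this many space widths becomes a TAB.
    static final float TAB_FACTOR = 2.0f;

    // Joins chunks that are already sorted left to right on the same line.
    static String joinLine(List<Chunk> chunks) {
        StringBuilder sb = new StringBuilder();
        Chunk prev = null;
        for (Chunk c : chunks) {
            if (prev != null) {
                float gap = c.startX - prev.endX;
                if (gap > TAB_FACTOR * prev.spaceWidth) {
                    sb.append('\t');   // large horizontal distance: emit a TAB
                } else if (gap > prev.spaceWidth / 2f
                        && !prev.text.endsWith(" ")
                        && !c.text.startsWith(" ")) {
                    sb.append(' ');    // ordinary word gap: emit a single space
                }
            }
            sb.append(c.text);
            prev = c;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up coordinates that mimic the layout in the example above.
        List<Chunk> line = Arrays.asList(
                new Chunk("User name:", 72f, 122f, 3f),
                new Chunk("abcdef", 125f, 155f, 3f),
                new Chunk("Password:", 300f, 345f, 3f),
                new Chunk("Cool1234", 348f, 390f, 3f));
        System.out.println(joinLine(line).replace("\t", "<TAB>"));
    }
}
```

The threshold of two space widths is a guess and would likely need tuning per document, since fonts and justification affect what counts as a "large" gap.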
