Marco,

Schärfke Marco, Dr. wrote
> I use the TextMarginFinder to detect text area. In principle it works as
> expected but for some documents, the upper border is wrong (this is true
> for text with upper German Umlauten like Ä,Ü in the first text line).
iText uses the Ascent and Descent values from the font descriptor to determine the upper and lower boundaries of a text fragment. According to the PDF specification ISO 32000-1, Table 122:

    Ascent    number    (Required, except for Type 3 fonts) The maximum height above the baseline reached by glyphs in this font. The height of glyphs for accented characters shall be excluded.

Therefore, if the accented characters of a font (e.g. Ä, Ö, and Ü) are taller than any of its non-accented characters, they may be cut off if you use the returned rectangle as is. Thus, you may want to add a small margin just in case.

iText might also be extended to rely not on this global height information but on the individual glyph heights from the font. Unfortunately, this would increase the resource requirements of the text extraction framework considerably, i.e. slow it down.

Regards, Michael
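As a rough illustration of the "small margin" suggestion, the sketch below enlarges a bounding box at the top (and optionally at the bottom). It is plain Java with no iText dependency; the class name MarginPadding, the method pad, and all numbers are made up for illustration. With iText 5 the four coordinates would presumably come from the TextMarginFinder getters getLlx(), getLly(), getUrx(), and getUry(). How much padding is enough depends on the font and font size, so the value used here is only a guess.

```java
import java.util.Arrays;

public class MarginPadding {

    /**
     * Returns {llx, lly, urx, ury} with the upper edge raised by topPadding
     * and the lower edge lowered by bottomPadding (all values in PDF user units).
     * Hypothetical helper, not part of iText.
     */
    public static float[] pad(float llx, float lly, float urx, float ury,
                              float topPadding, float bottomPadding) {
        if (topPadding < 0 || bottomPadding < 0) {
            throw new IllegalArgumentException("padding must not be negative");
        }
        return new float[] { llx, lly - bottomPadding, urx, ury + topPadding };
    }

    public static void main(String[] args) {
        // Made-up coordinates standing in for the rectangle reported by the finder.
        float llx = 72f, lly = 90f, urx = 523f, ury = 770f;

        // Assumption: 4 units of headroom are enough for the dots of Ä, Ö, Ü.
        float[] padded = pad(llx, lly, urx, ury, 4f, 0f);

        System.out.println(Arrays.toString(padded));
    }
}
```

Only the upper edge is moved in this usage, since the Ascent value is the one that excludes accents; the bottom padding parameter is there in case a similar allowance is wanted below the baseline.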