Marco,

Schärfke Marco, Dr. wrote
> I use the TextMarginFinder to detect text area. In principle it works as
> expected but for some documents, the upper border is wrong (this is true
> for text with upper German Umlauten like Ä,Ü in the first text line).

iText uses the Ascent and Descent values from the font descriptor to
determine the upper and lower boundaries of a text fragment.

According to the PDF specification ISO 32000-1, Table 122:

    Ascent (number)
    (Required, except for Type 3 fonts) The maximum height above the
    baseline reached by glyphs in this font. The height of glyphs for
    accented characters shall be excluded.

If the accented characters of a font (e.g. Ä, Ö, and Ü) are taller than any
non-accented character, they may therefore be cut off if you use the
returned rectangle as is.

Thus, you may want to add a small margin just in case.
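
A minimal sketch of such a margin, assuming you have already read the four
corner values from the finder (in iText 5, via getLlx(), getLly(), getUrx()
and getUry() on TextMarginFinder). The class MarginPadding and its padTop
method are hypothetical helpers, not part of iText, and the size of the
padding is something you have to choose for your fonts:

```java
/** Hypothetical helper, not part of iText: raises the top edge of a text box. */
final class MarginPadding {

    /**
     * Returns {llx, lly, urx, ury} with the upper edge raised by topPad
     * (in user space units), but never above pageTop.
     */
    static float[] padTop(float llx, float lly, float urx, float ury,
                          float topPad, float pageTop) {
        if (topPad < 0) {
            throw new IllegalArgumentException("topPad must not be negative");
        }
        // Raise the upper boundary, clamped to the top of the page.
        float paddedTop = Math.min(ury + topPad, pageTop);
        return new float[] { llx, lly, urx, paddedTop };
    }
}
```

For example, padTop(36f, 50f, 559f, 700f, 3f, 842f) leaves the left, bottom
and right edges alone and moves the top edge from 700 to 703.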

iText could also be extended to use the individual glyph heights from the
font instead of this global height information. Unfortunately, this would
considerably increase the resource requirements of the text extraction
framework, i.e. slow it down.
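
To illustrate the idea (this is not how iText works, only a sketch): assume
you somehow have a map from code point to the top of that glyph's bounding
box, in the usual 1/1000 glyph space units. The class GlyphTop, the map and
the numbers in the example below are all hypothetical. The line height then
needs one lookup per character instead of one Ascent value per font, which
is where the extra cost comes from:

```java
import java.util.Map;

/** Hypothetical helper, not part of iText: line top from per-glyph heights. */
final class GlyphTop {

    /**
     * Returns the height above the baseline, in user space units, reached by
     * the tallest glyph in text. Glyphs missing from glyphTops fall back to
     * ascent, and the result is never lower than the ascent-based height.
     * glyphTops and ascent are in 1/1000 glyph space units.
     */
    static float lineTop(String text, Map<Integer, Integer> glyphTops,
                         int ascent, float fontSize) {
        int top = ascent;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            top = Math.max(top, glyphTops.getOrDefault(codePoint, ascent));
            i += Character.charCount(codePoint);
        }
        // Scale from glyph space (1/1000 of the font size) to user space.
        return top * fontSize / 1000f;
    }
}
```

With a made-up Ascent of 718, a made-up top of 900 for Ä and a font size of
10, a line starting with Ä reaches 9.0 above the baseline, while a line
without it stays at the Ascent-based 7.18.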

Regards,   Michael



--
View this message in context: 
http://itext-general.2136553.n4.nabble.com/MarginTextFinder-tp4660145p4660151.html
Sent from the iText - General mailing list archive at Nabble.com.
