On 08.08.2016 at 23:45, Melanie Freed wrote:
Hi. I'm using pdfbox-2.0.2 and am having trouble getting the height of
extracted text from a PDF with Type 3 fonts.
I've been able to get the height for Type 1 fonts by overriding the
writeString method of the PDFTextStripper class and using the maximum
font size in points as the height:
    float height = 0f;
    for (TextPosition textPosition : textPositions)
    {
        height = Math.max(height, textPosition.getFontSizeInPt());
    }
But this doesn't work for Type 3 fonts since they don't use sizes in the
same way. I tried to use the bounding box like this:
    PDFont font_obj = textPositions.get(0).getFont();
    BoundingBox bbox = font_obj.getBoundingBox();
    float height = bbox.getHeight();
But the results aren't what I would expect. For example, when I run it on
a document with a Type 1 font, I get a value of 7.0 as the font size in
points (using the first method) and the second method gives me a value of
1156.0.
Am I missing some kind of conversion from units of the bounding box to
points? Or just approaching this problem in the wrong way?

Please have a look at the DrawPrintTextLocations example in the source
code download; it has a solution for (some) Type 3 fonts:
    at.concatenate(font.getFontMatrix().createAffineTransform());
    if (font instanceof PDType3Font)
    {
        PDType3Font t3Font = (PDType3Font) font;
        PDType3CharProc charProc = t3Font.getCharProc(code);
        if (charProc != null)
        {
            PDRectangle glyphBBox = charProc.getGlyphBBox();
            if (glyphBBox != null)
            {
                path = glyphBBox.toGeneralPath();
            }
        }
    }
Later, at.createTransformedShape(path.getBounds2D()) gets you the bounds.
If the above doesn't make sense, just run the full programme and see
whether it draws cyan bounds around your type 3 font glyphs.
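
To see the transform step in isolation, here is a minimal sketch that uses
only java.awt.geom and no PDFBox classes. The names GlyphBoundsSketch and
toUserSpace and all numeric values are made up for illustration. It assumes
that "at" starts out as the text matrix before the font matrix is
concatenated; in the real example both matrices come from the TextPosition
and its font.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

class GlyphBoundsSketch
{
    // Maps a glyph bounding box from glyph space to user space:
    // text matrix first, then the font matrix concatenated onto it.
    static Rectangle2D toUserSpace(Rectangle2D glyphBBox,
                                   AffineTransform textMatrix,
                                   AffineTransform fontMatrix)
    {
        AffineTransform at = new AffineTransform(textMatrix);
        at.concatenate(fontMatrix);
        return at.createTransformedShape(glyphBBox).getBounds2D();
    }

    public static void main(String[] args)
    {
        // Hypothetical text matrix: 10x scaling, origin at (100, 500)
        AffineTransform textMatrix = new AffineTransform(10, 0, 0, 10, 100, 500);
        // Hypothetical Type 3 font matrix; Type 3 fonts define their own
        AffineTransform fontMatrix = AffineTransform.getScaleInstance(0.01, 0.01);
        // Hypothetical glyph bounding box of 60 x 80 glyph units
        Rectangle2D glyphBBox = new Rectangle2D.Double(0, 0, 60, 80);

        Rectangle2D bounds = toUserSpace(glyphBBox, textMatrix, fontMatrix);
        System.out.println("glyph height in user space: " + bounds.getHeight());
    }
}
```

The height of the returned rectangle is the glyph height in user space
units. PDF coordinates have y pointing up, so drawing the rectangle on
screen needs an additional flip that this sketch leaves out.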
This approach may not always work, because some Type 3 charprocs don't
have a bounding box.
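
On the units question: the font bounding box is given in glyph space, not
in points. Type 1 fonts conventionally use 1000 glyph units per text unit
(a font matrix scale of 0.001), so the value has to be multiplied by that
scale and by the font size. A rough sketch of the arithmetic follows; the
helper name glyphUnitsToPoints is hypothetical, and the 0.001 scale is an
assumption that should be checked against the font's actual font matrix.

```java
class GlyphUnitsSketch
{
    // Converts a glyph-space length to points:
    // glyph units * font matrix scale * font size in points.
    static double glyphUnitsToPoints(double glyphUnits,
                                     double fontMatrixScale,
                                     double fontSizeInPt)
    {
        return glyphUnits * fontMatrixScale * fontSizeInPt;
    }

    public static void main(String[] args)
    {
        // Values from the question: bounding box height 1156, font size 7 pt.
        // 0.001 is the conventional Type 1 scale, assumed here.
        double height = glyphUnitsToPoints(1156.0, 0.001, 7.0);
        System.out.println("bounding box height in points: " + height);
    }
}
```

With the numbers from the question this comes to roughly 8.09 pt. That is
larger than the 7 pt font size because the font bounding box encloses all
glyphs of the font, descenders included. Type 3 fonts define their own
font matrix, so the 0.001 assumption does not carry over to them.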
Tilman