Hi Tilman,

Thanks so much for your detailed response; it was a huge help. I was able to implement it and it worked perfectly for my PDF! Yay.
Best,
Melanie

On Tue, Aug 9, 2016 at 3:13 AM, Tilman Hausherr <[email protected]> wrote:

> On 08.08.2016 at 23:45, Melanie Freed wrote:
>
>> Hi. I'm using pdfbox-2.0.2 and am having trouble getting the height of
>> extracted text from a PDF with Type 3 fonts.
>>
>> I've been able to successfully get the height for Type 1 fonts by
>> overriding the writeString function in the PDFTextStripper class and using
>> the maximum font size in points as the height:
>>
>>     float height = 0f;
>>     for (TextPosition textPosition : textPositions)
>>     {
>>         height = Math.max(height, textPosition.getFontSizeInPt());
>>     }
>>
>> But this doesn't work for Type 3 fonts since they don't use sizes in the
>> same way. I tried to use the bounding box like this:
>>
>>     PDFont font_obj = textPositions.get(0).getFont();
>>     BoundingBox bbox = font_obj.getBoundingBox();
>>     float height = bbox.getHeight();
>>
>> But the results aren't what I would expect. For example, when I run it on
>> a document with a Type 1 font, I get a value of 7.0 as the font size in
>> points (using the first method) and the second method gives me a value of
>> 1156.0.
>>
>> Am I missing some kind of conversion from units of the bounding box to
>> points? Or just approaching this problem in the wrong way?
>
> Please have a look at the DrawPrintTextLocations example in the source
> code download, this has a solution for (some) type 3 fonts.
>
>     at.concatenate(font.getFontMatrix().createAffineTransform());
>     if (font instanceof PDType3Font)
>     {
>         PDType3Font t3Font = (PDType3Font) font;
>         PDType3CharProc charProc = t3Font.getCharProc(code);
>         if (charProc != null)
>         {
>             PDRectangle glyphBBox = charProc.getGlyphBBox();
>             if (glyphBBox != null)
>             {
>                 path = glyphBBox.toGeneralPath();
>             }
>         }
>     }
>
> later, at.createTransformedShape(path.getBounds2D()) gets you the bounds.
>
> If the above doesn't make sense, just run the full programme and see
> whether it draws cyan bounds around your type 3 font glyphs.
>
> It may not always work, because some type3 charprocs don't have a bounding
> box.
>
> Tilman
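
On the units question in the quoted message: a font bounding box is given in glyph space, and the font matrix maps it to text space before the font size is applied. The following is only a rough sketch of that conversion using plain java.awt.geom, without PDFBox. The class name GlyphHeightSketch, the helper heightInPoints, and the bounding box coordinates are hypothetical; only the height of 1156 and the size of 7.0 come from the thread. It assumes the usual Type 1 font matrix of 0.001 and ignores the text matrix and CTM, which the DrawPrintTextLocations example does take into account.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

// Hypothetical illustration class, not part of PDFBox.
class GlyphHeightSketch {

    /**
     * Maps a glyph-space bounding box to points.
     * fontMatrix converts glyph space to text space; fontSize then scales
     * text space. Assumption: no further text matrix or CTM is involved.
     */
    static double heightInPoints(Rectangle2D glyphBBox,
                                 AffineTransform fontMatrix,
                                 double fontSize) {
        AffineTransform at = AffineTransform.getScaleInstance(fontSize, fontSize);
        // Same order as in the quoted snippet: the font matrix is applied first.
        at.concatenate(fontMatrix);
        return at.createTransformedShape(glyphBBox).getBounds2D().getHeight();
    }

    public static void main(String[] args) {
        // Type 1 fonts conventionally use 1/1000 of a text-space unit.
        AffineTransform type1Matrix = AffineTransform.getScaleInstance(0.001, 0.001);
        // Made-up box whose height matches the 1156.0 reported in the thread.
        Rectangle2D bbox = new Rectangle2D.Double(0, -250, 1000, 1156);
        // Roughly 1156 * 0.001 * 7.0 points.
        System.out.println(heightInPoints(bbox, type1Matrix, 7.0));
    }
}
```

For a Type 3 font the matrix is whatever the font dictionary declares, so the same helper would take that matrix in place of the 0.001 scale, and the per-glyph box from the char proc in place of the font-wide box.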

