Hi,
On 15.02.2012 16:36, Tate, Hongliang Tian(田洪亮) wrote:
Hi,
Why is the height of some TextPosition objects generated by
PDFStreamEngine 0? In other words, I found that TextPosition.getHeight()
returns 0.
What I am trying to do is extract text, along with its position and size, from
a PDF, so I need to know the height of each TextPosition object. To reproduce
the problem, you can check the attached PDF. The heights of all TextPosition
objects that represent the title string "Accelerating SQL Database Operations
on a GPU with CUDA" are 0. After examining the debug information, I found that
PDFont.getFontHeight(byte[] c, int offset, int length) returned 0 for those
TextPosition objects, which is why the height of each TextPosition is 0.
So why do TextPosition.getHeight() and PDFont.getFontHeight() return
0? Is there a way to work around this?
Missing height values are a known issue. There are some improvements in the
current trunk, see PDFBOX-611 [1] for details. But there are still some cases
left where those values are missing.
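Until those cases are fixed, one possible stopgap on the caller's side is to
substitute a rough estimate whenever the reported height is 0. The sketch below
is only an illustration of that idea, not something PDFBOX-611 implements. The
class HeightFallback and its method are hypothetical names, and it assumes the
two inputs come from TextPosition.getHeight() and
TextPosition.getFontSizeInPt(). The font size is a coarse approximation and
usually overestimates the real glyph height.

```java
// Hypothetical helper, not part of PDFBox: picks a usable height for a
// text fragment when the library reports 0.
final class HeightFallback {

    private HeightFallback() {
        // static utility, no instances
    }

    /**
     * @param reportedHeight value assumed to come from TextPosition.getHeight()
     * @param fontSizeInPt   value assumed to come from TextPosition.getFontSizeInPt()
     * @return the reported height if it is positive, otherwise the font size
     *         as a rough estimate
     */
    static float effectiveHeight(float reportedHeight, float fontSizeInPt) {
        if (reportedHeight > 0f) {
            return reportedHeight;
        }
        // Assumption: the font size in points is an acceptable stand-in
        // for the missing glyph height.
        return fontSizeInPt;
    }
}
```

A caller would pass the two values read from each TextPosition and use the
result in place of getHeight().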
Thanks in advance.
BR
Andreas Lehmkühler
[1] https://issues.apache.org/jira/browse/PDFBOX-611