[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075169#comment-15075169
]
Tilman Hausherr commented on PDFBOX-3062:
-
This commit also improves the height v
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075164#comment-15075164
]
ASF subversion and git services commented on PDFBOX-3062:
-
Commit
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053474#comment-15053474
]
John Hewson commented on PDFBOX-3062:
-
To wrap this one up, revisiting my questions a
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053468#comment-15053468
]
John Hewson commented on PDFBOX-3062:
-
{quote}
Since PDF was not designed for text ex
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037704#comment-15037704
]
Maruan Sahyoun commented on PDFBOX-3062:
+1
> Text extraction and height differe
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037591#comment-15037591
]
Timo Boehme commented on PDFBOX-3062:
-
I would like to second the proposal of Tilman
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036261#comment-15036261
]
Tilman Hausherr commented on PDFBOX-3062:
-
Re reliability, see the test files and
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036215#comment-15036215
]
John Hewson commented on PDFBOX-3062:
-
{quote}
How is it not reliable?
{quote}
Why w
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034552#comment-15034552
]
Tilman Hausherr commented on PDFBOX-3062:
-
{quote}
BBox + CapHeight isn't reliabl
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034507#comment-15034507
]
John Hewson commented on PDFBOX-3062:
-
BBox + CapHeight isn't reliable either. Either
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034207#comment-15034207
]
Tilman Hausherr commented on PDFBOX-3062:
-
This isn't really about metrics, this
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034170#comment-15034170
]
John Hewson commented on PDFBOX-3062:
-
Then the next JIRA issue will be "Text extract
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029257#comment-15029257
]
Maruan Sahyoun commented on PDFBOX-3062:
Can't we use the current approach for 2.
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029202#comment-15029202
]
John Hewson commented on PDFBOX-3062:
-
Technically, the bbox is always correct - it's
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029195#comment-15029195
]
Tilman Hausherr commented on PDFBOX-3062:
-
If we use the visual bounds to get the
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029088#comment-15029088
]
ASF subversion and git services commented on PDFBOX-3062:
-
Commit
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029081#comment-15029081
]
ASF subversion and git services commented on PDFBOX-3062:
-
Commit
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029078#comment-15029078
]
ASF subversion and git services commented on PDFBOX-3062:
-
Commit
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029077#comment-15029077
]
Tilman Hausherr commented on PDFBOX-3062:
-
I ran a test on about 2500 files. The
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027434#comment-15027434
]
Tilman Hausherr commented on PDFBOX-3062:
-
I tried using CapHeight when available
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013930#comment-15013930
]
Tilman Hausherr commented on PDFBOX-3062:
-
PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZB
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997389#comment-14997389
]
John Hewson commented on PDFBOX-3062:
-
That's good news.
> Text extraction and heigh
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996376#comment-14996376
]
Tilman Hausherr commented on PDFBOX-3062:
-
They are already good anyway, probably
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996365#comment-14996365
]
Maruan Sahyoun commented on PDFBOX-3062:
{quote}
At the same time, I've noticed y
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996179#comment-14996179
]
Tilman Hausherr commented on PDFBOX-3062:
-
But the BBox height is huge... the res
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996114#comment-14996114
]
John Hewson commented on PDFBOX-3062:
-
Those BBox values are pretty reasonable though
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995206#comment-14995206
]
ASF subversion and git services commented on PDFBOX-3062:
-
Commit
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995200#comment-14995200
]
Tilman Hausherr commented on PDFBOX-3062:
-
After the latest changes (probably PDF
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979870#comment-14979870
]
John Hewson commented on PDFBOX-3062:
-
With my previous comment in mind, I've depreca
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979869#comment-14979869
]
ASF subversion and git services commented on PDFBOX-3062:
-
Commit
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979318#comment-14979318
]
Tilman Hausherr commented on PDFBOX-3062:
-
PDFBOX-3062-N2MOQ7YZICIYGTPLQJAWJ4HLN6
[
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977531#comment-14977531
]
John Hewson commented on PDFBOX-3062:
-
Height isn't calculated in a meaningful way in