[jira] [Commented] (TIKA-100) Structured PDF parsing

2013-03-01 Thread David vandendriessche (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590425#comment-13590425 ] David vandendriessche commented on TIKA-100: At the moment I'm using pdfbox to

[jira] [Commented] (TIKA-100) Structured PDF parsing

2011-09-04 Thread Malik Hemani (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096863#comment-13096863 ] Malik Hemani commented on TIKA-100: Since PDFTextStripper can extract at page level, here

[jira] [Commented] (TIKA-100) Structured PDF parsing

2011-06-06 Thread Gregory Kanevsky (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045184#comment-13045184 ] Gregory Kanevsky commented on TIKA-100: The issue with 'sortByPosition' is addressed

[jira] Commented: (TIKA-100) Structured PDF parsing

2010-08-26 Thread Gregory Kanevsky (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902897#action_12902897 ] Gregory Kanevsky commented on TIKA-100: This issue seems to be partially fixed. PDF2XH