[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-11-25 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: TIKA-1285_rev1641423.patch Updated patch to work with PDF & POI Snapshot builds as of

[jira] [Commented] (TIKA-1268) Extract images from PDF documents

2014-09-10 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128921#comment-14128921 ] Jeremy Anderson commented on TIKA-1268: --- Take a look at my last comment in TIKA-1285,

[jira] [Commented] (TIKA-1268) Extract images from PDF documents

2014-09-10 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128855#comment-14128855 ] Jeremy Anderson commented on TIKA-1268: --- I created the TIKA-1285 patch after making t

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-09-04 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: TIKA-1285.patch > Upgrade to PDFBox 2.0.0 when available > --

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-09-04 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: (was: TIKA-1285.patch) > Upgrade to PDFBox 2.0.0 when available > ---

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-09-04 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: (was: TIKA-1285.patch) > Upgrade to PDFBox 2.0.0 when available > ---

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-09-04 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: TIKA-1285.patch > Upgrade to PDFBox 2.0.0 when available > --

[jira] [Comment Edited] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-09-04 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121991#comment-14121991 ] Jeremy Anderson edited comment on TIKA-1285 at 9/4/14 10:31 PM: -

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-09-04 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: TIKA-1285.patch > Upgrade to PDFBox 2.0.0 when available > --

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-09-04 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: (was: TIKA-1285.patch) > Upgrade to PDFBox 2.0.0 when available > ---

[jira] [Commented] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-09-04 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121991#comment-14121991 ] Jeremy Anderson commented on TIKA-1285: --- Updated patch to include fixes as of revisio

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: TIKA-1285.patch > Upgrade to PDFBox 2.0.0 when available > -

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: (was: TIKA-1285.patch) > Upgrade to PDFBox 2.0.0 when available > --

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: TIKA-1285.patch AdobeFontMetricParser (small change) PDF2XHTML (removal of xobject

[jira] [Comment Edited] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985028#comment-13985028 ] Jeremy Anderson edited comment on TIKA-1285 at 4/30/14 1:15 AM: -

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Description: This issue is to track fixes required when upgrading the PDFBox dependency to 2.0.

[jira] [Created] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
Jeremy Anderson created TIKA-1285: - Summary: Upgrade to PDFBox 2.0.0 when available Key: TIKA-1285 URL: https://issues.apache.org/jira/browse/TIKA-1285 Project: Tika Issue Type: Improvement

[jira] [Comment Edited] (TIKA-1268) Extract images from PDF documents

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984984#comment-13984984 ] Jeremy Anderson edited comment on TIKA-1268 at 4/29/14 11:59 PM:

[jira] [Commented] (TIKA-1268) Extract images from PDF documents

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984984#comment-13984984 ] Jeremy Anderson commented on TIKA-1268: --- This fix will break when PDFBox 2.0.0 is rel

[jira] [Created] (TIKA-716) Upgrade apache-Mime4J to Version 0.7

2011-09-16 Thread Jeremy Anderson (JIRA)
Upgrade apache-Mime4J to Version 0.7 Key: TIKA-716 URL: https://issues.apache.org/jira/browse/TIKA-716 Project: Tika Issue Type: Wish Components: packaging, parser Affects Versions: 0.9

[jira] [Updated] (TIKA-704) PDF and Outlook docs embedded in MS Word documents not parsed

2011-09-07 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-704: - Attachment: LicensedTestWithPdf.docx LicensedTestWithOutlook.docx These are license

[jira] [Commented] (TIKA-704) PDF and Outlook docs embedded in MS Word documents not parsed

2011-09-07 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098920#comment-13098920 ] Jeremy Anderson commented on TIKA-704: -- Thanks for the fast attention to the fix. Did

[jira] [Updated] (TIKA-704) PDF and Outlook docs embedded in MS Word documents not parsed

2011-09-01 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-704: - Attachment: TestWithPdf.docx TestWithOutlook.docx recursiveUsage.txt

[jira] [Created] (TIKA-704) PDF and Outlook docs embedded in MS Word documents not parsed

2011-09-01 Thread Jeremy Anderson (JIRA)
PDF and Outlook docs embedded in MS Word documents not parsed - Key: TIKA-704 URL: https://issues.apache.org/jira/browse/TIKA-704 Project: Tika Issue Type: Bug Components:

[jira] [Updated] (TIKA-489) Embedded Documents within documents

2011-08-30 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-489: - Attachment: recursiveUsage.txt TestWithOutlook.docx TestWithPdf.docx

[jira] [Commented] (TIKA-489) Embedded Documents within documents

2011-08-30 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093869#comment-13093869 ] Jeremy Anderson commented on TIKA-489: -- Thanks for your prompt follow-up. I did some m

[jira] [Commented] (TIKA-489) Embedded Documents within documents

2011-08-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093184#comment-13093184 ] Jeremy Anderson commented on TIKA-489: -- It appears that the RecursiveParser does not wo