[jira] [Created] (TIKA-1837) HtmlEncodingDetector wrongly detects charset from commented meta

2016-01-19 Thread Pascal Essiembre (JIRA)
Pascal Essiembre created TIKA-1837: -- Summary: HtmlEncodingDetector wrongly detects charset from commented meta Key: TIKA-1837 URL: https://issues.apache.org/jira/browse/TIKA-1837 Project: Tika

[jira] [Commented] (TIKA-1799) Upgrade to POI 3.14-Beta1 when available

2016-01-19 Thread Andreas Beeker (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107690#comment-15107690 ] Andreas Beeker commented on TIKA-1799: -- I have no idea how osgi bundling works, but ad

[jira] [Commented] (TIKA-1799) Upgrade to POI 3.14-Beta1 when available

2016-01-19 Thread Bob Paulin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107395#comment-15107395 ] Bob Paulin commented on TIKA-1799: -- Actually I'd be careful using the wildcard here becaus

[jira] [Commented] (TIKA-1799) Upgrade to POI 3.14-Beta1 when available

2016-01-19 Thread Bob Paulin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107392#comment-15107392 ] Bob Paulin commented on TIKA-1799: -- So it's actually a pretty interesting question. If yo

[jira] [Commented] (TIKA-1799) Upgrade to POI 3.14-Beta1 when available

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107368#comment-15107368 ] Tim Allison commented on TIKA-1799: --- [~kiwiwings], looks like we have to specify packages

[jira] [Commented] (TIKA-1799) Upgrade to POI 3.14-Beta1 when available

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107308#comment-15107308 ] Tim Allison commented on TIKA-1799: --- Great. Thank you! That did it! Apologies for the

[jira] [Commented] (TIKA-1799) Upgrade to POI 3.14-Beta1 when available

2016-01-19 Thread Bob Paulin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107257#comment-15107257 ] Bob Paulin commented on TIKA-1799: -- [~talli...@mitre.org] Looks like the structure of org

[jira] [Commented] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107221#comment-15107221 ] Tim Allison commented on TIKA-1836: --- The better solution of course would be to add proper

[jira] [Commented] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107216#comment-15107216 ] Tim Allison commented on TIKA-1836: --- Y, done. I asked POI colleagues if they minded if w

[jira] [Commented] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Jorge Spinsanti (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107212#comment-15107212 ] Jorge Spinsanti commented on TIKA-1836: --- POI issue was reported on 2014-08-22. Perhaps

[jira] [Comment Edited] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Jorge Spinsanti (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107212#comment-15107212 ] Jorge Spinsanti edited comment on TIKA-1836 at 1/19/16 7:08 PM: -

[jira] [Comment Edited] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Jorge Spinsanti (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106919#comment-15106919 ] Jorge Spinsanti edited comment on TIKA-1836 at 1/19/16 7:04 PM: -

[jira] [Comment Edited] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107067#comment-15107067 ] Tim Allison edited comment on TIKA-1836 at 1/19/16 5:57 PM: I c

[jira] [Commented] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107080#comment-15107080 ] Tim Allison commented on TIKA-1836: --- Not already fixed in POI: this is still open: http

[jira] [Commented] (TIKA-1799) Upgrade to POI 3.14-Beta1 when available

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107077#comment-15107077 ] Tim Allison commented on TIKA-1799: --- [~bobpaulin], I hate to bother you with this, but do

[jira] [Comment Edited] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107067#comment-15107067 ] Tim Allison edited comment on TIKA-1836 at 1/19/16 5:50 PM: I c

[jira] [Commented] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107067#comment-15107067 ] Tim Allison commented on TIKA-1836: --- I concur with Ken, if I understand this correctly, w

[jira] [Commented] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Jorge Spinsanti (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106919#comment-15106919 ] Jorge Spinsanti commented on TIKA-1836: --- POI is a dependency of TIKA. I think TIKA ca

[jira] [Commented] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106908#comment-15106908 ] Ken Krugler commented on TIKA-1836: --- This seems to be an issue for POI, as per the messag

[jira] [Updated] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Jorge Spinsanti (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Spinsanti updated TIKA-1836: -- Attachment: test.doc File used to find the issue. > Convertion DOC->TXT failed due to POI issue

[jira] [Updated] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Jorge Spinsanti (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Spinsanti updated TIKA-1836: -- Component/s: parser > Convertion DOC->TXT failed due to POI issue > -

[jira] [Created] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-01-19 Thread Jorge Spinsanti (JIRA)
Jorge Spinsanti created TIKA-1836: - Summary: Convertion DOC->TXT failed due to POI issue Key: TIKA-1836 URL: https://issues.apache.org/jira/browse/TIKA-1836 Project: Tika Issue Type: Bug

[jira] [Updated] (TIKA-1835) LinkContentHandler skips iframe and rel tags

2016-01-19 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated TIKA-1835: Attachment: TIKA-1835.patch Patch for trunk. Adds support for iframe and link element link extraction

[jira] [Updated] (TIKA-1835) LinkContentHandler skips iframe and rel tags

2016-01-19 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated TIKA-1835: Flags: Patch,Important (was: Important) > LinkContentHandler skips iframe and rel tags > ---

[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106752#comment-15106752 ] Tim Allison commented on TIKA-1824: --- Thank you, [~bobpaulin]! Again, this is fantastic.

[jira] [Commented] (TIKA-1833) NoClassDefFoundError for POIXMLTypeLoader

2016-01-19 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106723#comment-15106723 ] Tim Allison commented on TIKA-1833: --- Ha. Ok. Great to hear. It doesn't surprise me tha

[jira] [Updated] (TIKA-1823) Support detecting DWF format

2016-01-19 Thread Luca Moretti (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Moretti updated TIKA-1823: --- Attachment: blocks_and_tables.dwf I found this file on the Autodesk website that could be a suitably li

[jira] [Created] (TIKA-1835) LinkContentHandler skips iframe and rel tags

2016-01-19 Thread Markus Jelsma (JIRA)
Markus Jelsma created TIKA-1835: --- Summary: LinkContentHandler skips iframe and rel tags Key: TIKA-1835 URL: https://issues.apache.org/jira/browse/TIKA-1835 Project: Tika Issue Type: Bug