[GitHub] tika pull request: tika_2.x

2016-01-14 Thread kulkarniachyut
GitHub user kulkarniachyut closed the pull request at: https://github.com/apache/tika/pull/70

[GitHub] tika pull request: tika_2.x

2016-01-14 Thread kulkarniachyut
GitHub user kulkarniachyut opened a pull request: https://github.com/apache/tika/pull/70 tika_2.x test You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/tika 2.x Alternatively you can review and apply these changes as the pa

WMF extraction

2016-01-14 Thread Andreas Beeker
Hi, POI will have a WMF module (org.apache.poi.hwmf.*) in the next beta. Looking over the govdocs collection, those embedded WMFs might contain interesting information for TIKA. Although my main goal is to integrate the rendering for common sl, it shouldn't be too laborious to provide something a

[jira] [Comment Edited] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098465#comment-15098465 ] Tim Allison edited comment on TIKA-1830 at 1/14/16 5:50 PM: Y,

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098515#comment-15098515 ] Tim Allison commented on TIKA-1830: --- Doh. Right. Thank you. > Upgrade to PDFBox 1.8.11

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098503#comment-15098503 ] Tilman Hausherr commented on TIKA-1830: --- Not that, but the change I mentioned https:/

[jira] [Comment Edited] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098465#comment-15098465 ] Tim Allison edited comment on TIKA-1830 at 1/14/16 5:37 PM: Y,

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098465#comment-15098465 ] Tim Allison commented on TIKA-1830: --- Y, 074531.pdf has uncovered a Tika issue. I can rep

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098427#comment-15098427 ] Tim Allison commented on TIKA-1830: --- I just tested casting a null object that started lif

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098418#comment-15098418 ] Tilman Hausherr commented on TIKA-1830: --- The line at {{BaseParser.java:1077}} is {cod

[jira] [Comment Edited] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096866#comment-15096866 ] Tilman Hausherr edited comment on TIKA-1830 at 1/14/16 5:05 PM: -

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098412#comment-15098412 ] Tilman Hausherr commented on TIKA-1830: --- Another possibility is that the change I men

[jira] [Comment Edited] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098401#comment-15098401 ] Tilman Hausherr edited comment on TIKA-1830 at 1/14/16 5:02 PM: -

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098401#comment-15098401 ] Tilman Hausherr commented on TIKA-1830: --- {quote} On PDFBOX-3193, you've set affected

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098393#comment-15098393 ] Tim Allison commented on TIKA-1830: --- Finished the rerun...and the results look the same.

[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098112#comment-15098112 ] Tim Allison commented on TIKA-1830: --- Argh...I'll rerun the 1.8.10 batch and see what we g

RE: Tika questions on StackOverflow

2016-01-14 Thread Nick Burch
On Wed, 13 Jan 2016, Allison, Timothy B. wrote: Are there other consumer lists we should be following? Elastic Search? I think Elastic Search only has a forum-type thingy, this probably should let you see Tika posts there (not that frequent) https://discuss.elastic.co/search?q=tika%20categor

[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-01-14 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097884#comment-15097884 ] Nick Burch commented on TIKA-1824: -- Tika already supports using a custom classloader for l