Re: Tika 2.0 and language detection

2016-02-04 Thread Mattmann, Chris A (3980)
Hey Ken, this is fine. I wanted to get going with our Julia/MIT-LL Text.jl based detector and turning LanguageIdentifier into an interface. Trevor (CC’ed) and I are working on it, but I’m not sure where we’re at, and it shouldn’t be a blocker to moving forward. Cheers, Chris

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133629#comment-15133629 ] Ken Krugler commented on TIKA-1851: --- I'm also curious why we have Groovy code and shell s

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133624#comment-15133624 ] Ken Krugler commented on TIKA-1851: --- Hi [~talli...@apache.org] - I'm also getting a local

Re: [VOTE] Apache Tika 1.12 Release Candidate #1

2016-02-04 Thread Lewis John Mcgibbney
Hi Chris, +1 to release this release candidate. Thanks, Lewis. On Tue, Feb 2, 2016 at 4:24 PM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote: > Hi Chris, > > Signatures all good. Verified using the scripts apachestuff. > mvn install and all tests pass fine on MacOSX 10.9.5 > Ran DRAT from

[jira] [Commented] (TIKA-1723) Integrate language-detector into Tika

2016-02-04 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132961#comment-15132961 ] Ken Krugler commented on TIKA-1723: --- Good idea re gathering input - I just emailed the de

Tika 2.0 and language detection

2016-02-04 Thread Ken Krugler
Hi all, Over at https://issues.apache.org/jira/browse/TIKA-1723, Tim & I have been discussing whether to focus these pending changes on the 2.0 branch, and leave 1.x as-is. As part of that, we could do a cut-and-run in 2.0, and not spend the time to port the current (Tika 1.x) language detecto

[jira] [Commented] (TIKA-1836) Convertion DOC->TXT failed due to POI issue

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132895#comment-15132895 ] Tim Allison commented on TIKA-1836: --- Committed workaround to log rather than throw an exc

[jira] [Created] (TIKA-1853) Upgrade to POI 3.14-final when available

2016-02-04 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1853: - Summary: Upgrade to POI 3.14-final when available Key: TIKA-1853 URL: https://issues.apache.org/jira/browse/TIKA-1853 Project: Tika Issue Type: Improvement

[jira] [Created] (TIKA-1852) Tika 2.0 - clean up unit tests to rely more on TikaTest

2016-02-04 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1852: - Summary: Tika 2.0 - clean up unit tests to rely more on TikaTest Key: TIKA-1852 URL: https://issues.apache.org/jira/browse/TIKA-1852 Project: Tika Issue Type: Task

[jira] [Commented] (TIKA-1723) Integrate language-detector into Tika

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132613#comment-15132613 ] Tim Allison commented on TIKA-1723: --- Agreed on the ease of building the new ld framework

[jira] [Reopened] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reopened TIKA-1851: --- wrong reason for resolving...need to fix > Tika 2.0 - Move test resources from core to test-resources > --

[jira] [Resolved] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1851. --- Resolution: Fixed > Tika 2.0 - Move test resources from core to test-resources > --

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132593#comment-15132593 ] Tim Allison commented on TIKA-1851: --- Dunno, but I should have mentioned that I'm getting

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132549#comment-15132549 ] Tim Allison commented on TIKA-1851: --- [~bobpaulin], any chance you could look into why we'

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132561#comment-15132561 ] Lewis John McGibbney commented on TIKA-1851: Are we using the most recent osgi/

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132542#comment-15132542 ] Hudson commented on TIKA-1851: -- UNSTABLE: Integrated in tika-2.x #18 (See [https://builds.apa

[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132507#comment-15132507 ] Tim Allison commented on TIKA-1824: --- Sorry, [~grossws], [~thaichat04] and [~lfcnassif] sh

[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132503#comment-15132503 ] Tim Allison commented on TIKA-1824: --- bq. Thanks so much for the feedback, these are grea

[jira] [Resolved] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-04 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1851. --- Resolution: Invalid Moved shared test resources to test-resources and did some other very small test c

[jira] [Commented] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132249#comment-15132249 ] Nick Burch commented on TIKA-1850: -- It's showing up for me in the snapshots repo - see ht