[jira] [Commented] (TIKA-3340) LanguageProfile for Myanmar

2021-03-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312520#comment-17312520 ] ASF GitHub Bot commented on TIKA-3340: -- kkrugler commented on pull request #421: URL:

[GitHub] [tika] kkrugler commented on pull request #421: [TIKA-3340] LanguageProfile for Myanmar

2021-03-31 Thread GitBox
kkrugler commented on pull request #421: URL: https://github.com/apache/tika/pull/421#issuecomment-811185369 @arky - re using UDHR text...that's fine, but as per the **Permissions** section on https://www.ohchr.org/EN/UDHR/Pages/Introduction.aspx, you would need to add attribution to the

[jira] [Commented] (TIKA-3328) PDFs detected as matlab

2021-03-31 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312484#comment-17312484 ] Hudson commented on TIKA-3328: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #1

[jira] [Commented] (TIKA-3340) LanguageProfile for Myanmar

2021-03-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312475#comment-17312475 ] ASF GitHub Bot commented on TIKA-3340: -- arky commented on pull request #421: URL: htt

[GitHub] [tika] arky commented on pull request #421: [TIKA-3340] LanguageProfile for Myanmar

2021-03-31 Thread GitBox
arky commented on pull request #421: URL: https://github.com/apache/tika/pull/421#issuecomment-811151917 @kkrugler Thanks for that information, I'll add a pull request to add an appropriate test case for Myanmar and a few other languages that were introduced. Any technical objections to us

[jira] [Commented] (TIKA-3340) LanguageProfile for Myanmar

2021-03-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312435#comment-17312435 ] ASF GitHub Bot commented on TIKA-3340: -- kkrugler commented on pull request #421: URL:

[GitHub] [tika] kkrugler commented on pull request #421: [TIKA-3340] LanguageProfile for Myanmar

2021-03-31 Thread GitBox
kkrugler commented on pull request #421: URL: https://github.com/apache/tika/pull/421#issuecomment-811120110 Hi @arky you also need to edit the `LanguageIdentifierTest.java` file, to add `my` to the list of languages, like this: ``` java private static final String[] languages

[jira] [Commented] (TIKA-3340) LanguageProfile for Myanmar

2021-03-31 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312417#comment-17312417 ] Tim Allison commented on TIKA-3340: --- Thank you for the list, [~arky]. It looks like the

[jira] [Commented] (TIKA-3343) Remove Tika custom lang detection for 2.x

2021-03-31 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312408#comment-17312408 ] Tim Allison commented on TIKA-3343: --- Terribly sorry. You're right. We did actually imp

[jira] [Commented] (TIKA-3343) Remove Tika custom lang detection for 2.x

2021-03-31 Thread Peter Kronenberg (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312405#comment-17312405 ] Peter Kronenberg commented on TIKA-3343: If I don't have Optimaize in my pom, then

[jira] [Commented] (TIKA-3343) Remove Tika custom lang detection for 2.x

2021-03-31 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312402#comment-17312402 ] Tim Allison commented on TIKA-3343: --- The LanguageHandler is currently hardcoded to use T

[jira] [Commented] (TIKA-3343) Remove Tika custom lang detection for 2.x

2021-03-31 Thread Peter Kronenberg (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312366#comment-17312366 ] Peter Kronenberg commented on TIKA-3343: I might be mistaken, due to my confusion

[jira] [Commented] (TIKA-3340) LanguageProfile for Myanmar

2021-03-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312187#comment-17312187 ] ASF GitHub Bot commented on TIKA-3340: -- arky commented on pull request #421: URL: htt

[GitHub] [tika] arky commented on pull request #421: [TIKA-3340] LanguageProfile for Myanmar

2021-03-31 Thread GitBox
arky commented on pull request #421: URL: https://github.com/apache/tika/pull/421#issuecomment-810887814 @kkrugler I'll be happy to contribute test cases for Myanmar. Can you please tell me more about how to do this? Just adding a 'lang_code.test' file with 100 lines of Myanmar text i