[jira] [Commented] (TIKA-638) Language recognition - Failed trying to load language profile for language lt . Error: java.lang.IllegalArgumentException: Unable to add an ngram of incorrect length: 5 !

2011-08-16 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086068#comment-13086068 ] Joseph Vychtrle commented on TIKA-638: -- Simply put, please ignore everything I said, th

[jira] [Commented] (TIKA-638) Language recognition - Failed trying to load language profile for language lt . Error: java.lang.IllegalArgumentException: Unable to add an ngram of incorrect length: 5 !

2011-08-16 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086063#comment-13086063 ] Joseph Vychtrle commented on TIKA-638: -- Sorry I didn't recall what was going on back th

[jira] [Commented] (TIKA-690) WordExtractor doesn't extract text from HWPFDocument

2011-08-14 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084878#comment-13084878 ] Joseph Vychtrle commented on TIKA-690: -- Thank you Nick, I didn't know that the "Closing

[jira] [Closed] (TIKA-690) WordExtractor doesn't extract text from HWPFDocument

2011-08-14 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Vychtrle closed TIKA-690. Resolution: Not A Problem WordExtractor requires an HWPF document with paragraphs > WordExtractor does

[jira] [Commented] (TIKA-690) WordExtractor doesn't extract text from HWPFDocument

2011-08-14 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084869#comment-13084869 ] Joseph Vychtrle commented on TIKA-690: -- I was using tika snapshot so that poi 3.8-beta3

[jira] [Created] (TIKA-690) WordExtractor doesn't extract text from HWPFDocument

2011-08-14 Thread Joseph Vychtrle (JIRA)
WordExtractor doesn't extract text from HWPFDocument Key: TIKA-690 URL: https://issues.apache.org/jira/browse/TIKA-690 Project: Tika Issue Type: Bug Components: parser Affect

[jira] [Closed] (TIKA-689) MimeTypes detector detects text/plain content type of a PPT file

2011-08-14 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Vychtrle closed TIKA-689. Resolution: Not A Problem > MimeTypes detector detects text/plain content type of a PPT file > --

[jira] [Commented] (TIKA-689) MimeTypes detector detects text/plain content type of a PPT file

2011-08-14 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084823#comment-13084823 ] Joseph Vychtrle commented on TIKA-689: -- You're right Nick, although I got it working, I

[jira] [Commented] (TIKA-689) MimeTypes detector detects text/plain content type of a PPT file

2011-08-14 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084808#comment-13084808 ] Joseph Vychtrle commented on TIKA-689: -- I'm debugging that so I can see that the file i

[jira] [Created] (TIKA-689) MimeTypes detector detects text/plain content type of a PPT file

2011-08-14 Thread Joseph Vychtrle (JIRA)
MimeTypes detector detects text/plain content type of a PPT file Key: TIKA-689 URL: https://issues.apache.org/jira/browse/TIKA-689 Project: Tika Issue Type: Bug Compo

[jira] [Commented] (TIKA-638) Language recognition - Failed trying to load language profile for language lt . Error: java.lang.IllegalArgumentException: Unable to add an ngram of incorrect length: 5 !

2011-08-03 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078702#comment-13078702 ] Joseph Vychtrle commented on TIKA-638: -- I didn't do anything else than using LanguageId

[jira] [Commented] (TIKA-546) Add ability to create language profiles to tika-app

2011-06-06 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045234#comment-13045234 ] Joseph Vychtrle commented on TIKA-546: -- Hey Chris, have you managed to think this thr

[jira] [Commented] (TIKA-546) Add ability to create language profiles to tika-app

2011-06-05 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044476#comment-13044476 ] Joseph Vychtrle commented on TIKA-546: -- How come that NGramProfile.java is not in Tika'

[jira] [Commented] (TIKA-546) Add ability to create language profiles to tika-app

2011-06-05 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044464#comment-13044464 ] Joseph Vychtrle commented on TIKA-546: -- Guys, is anybody going to commit the patch? Or

[jira] [Created] (TIKA-638) Language recognition - Failed trying to load language profile for language lt . Error: java.lang.IllegalArgumentException: Unable to add an ngram of incorrect length: 5 !=

2011-04-10 Thread Joseph Vychtrle (JIRA)
Language recognition - Failed trying to load language profile for language lt . Error: java.lang.IllegalArgumentException: Unable to add an ngram of incorrect length: 5 != 3 ---

[jira] [Updated] (TIKA-630) Dealing with PDF documents from scanning programs

2011-03-31 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Vychtrle updated TIKA-630: - Summary: Dealing with PDF documents from scanning programs (was: Dealing with PDF documents produc

[jira] [Created] (TIKA-630) Dealing with PDF documents produced from scanning programs

2011-03-31 Thread Joseph Vychtrle (JIRA)
Dealing with PDF documents produced from scanning programs -- Key: TIKA-630 URL: https://issues.apache.org/jira/browse/TIKA-630 Project: Tika Issue Type: Improvement Component

[jira] Commented: (TIKA-607) ParseUtils.getStringContent( ) of a text file - parser is null

2011-03-09 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004713#comment-13004713 ] Joseph Vychtrle commented on TIKA-607: -- Thank you Jukka, I'd be already working with th

[jira] Updated: (TIKA-607) ParseUtils.getStringContent( ) of a text file - parser is null

2011-02-26 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Vychtrle updated TIKA-607: - Description: Hey, I'm trying to get content of a text file (mysql config file). {code} publ

[jira] Updated: (TIKA-607) ParseUtils.getStringContent( ) of a text file - parser is null

2011-02-26 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Vychtrle updated TIKA-607: - Description: Hey, I'm trying to get content of a text file (mysql config file). {code} publ

[jira] Updated: (TIKA-607) ParseUtils.getStringContent( ) of a text file - parser is null

2011-02-26 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Vychtrle updated TIKA-607: - Description: Hey, I'm trying to get content of a text file (mysql config file). {code} publ

[jira] Updated: (TIKA-607) ParseUtils.getStringContent( ) of a text file - parser is null

2011-02-26 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Vychtrle updated TIKA-607: - Description: Hey, I'm trying to get content of a text file (mysql config file). {code} publ

[jira] Updated: (TIKA-607) ParseUtils.getStringContent( ) of a text file - parser is null

2011-02-26 Thread Joseph Vychtrle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Vychtrle updated TIKA-607: - Description: Hey, I'm trying to get content of a text file (mysql config file). {code} publ

[jira] Created: (TIKA-607) BufferedInputStream.getInIfOpen() - null inputStream

2011-02-25 Thread Joseph Vychtrle (JIRA)
BufferedInputStream.getInIfOpen() - null inputStream - Key: TIKA-607 URL: https://issues.apache.org/jira/browse/TIKA-607 Project: Tika Issue Type: Bug Components: parser Affe