[jira] [Commented] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Maxim Valyanskiy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234198#comment-13234198 ] Maxim Valyanskiy commented on TIKA-877: --- Hm, I found this problem in my tika-server ye

[jira] [Assigned] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Maxim Valyanskiy (Assigned) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Valyanskiy reassigned TIKA-877: - Assignee: Maxim Valyanskiy > Embedded document not extracted (regression) > ---

[jira] [Commented] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Daniel Bonniot de Ruisselet (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234209#comment-13234209 ] Daniel Bonniot de Ruisselet commented on TIKA-877: -- Maxim, sounds good, but

[jira] [Issue Comment Edited] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Daniel Bonniot de Ruisselet (Issue Comment Edited) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232846#comment-13232846 ] Daniel Bonniot de Ruisselet edited comment on TIKA-877 at 3/21/12 9:14 AM: ---

[jira] [Commented] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Maxim Valyanskiy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234228#comment-13234228 ] Maxim Valyanskiy commented on TIKA-877: --- I'm not sure about 'file5', but zero sized 'MB

[jira] [Commented] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Daniel Bonniot de Ruisselet (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234240#comment-13234240 ] Daniel Bonniot de Ruisselet commented on TIKA-877: -- Then it seems like that

[jira] [Commented] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Maxim Valyanskiy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234276#comment-13234276 ] Maxim Valyanskiy commented on TIKA-877: --- It became the same problem after commit that

[jira] [Commented] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Maxim Valyanskiy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234280#comment-13234280 ] Maxim Valyanskiy commented on TIKA-877: --- {noformat} [maxcom@pc-elrond t]$ java -jar .

[jira] [Commented] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Maxim Valyanskiy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234282#comment-13234282 ] Maxim Valyanskiy commented on TIKA-877: --- Hm, no empty files, but file5 size is differe

[jira] [Commented] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Maxim Valyanskiy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234291#comment-13234291 ] Maxim Valyanskiy commented on TIKA-877: --- I think it is not a real problem, because "fi

[jira] [Resolved] (TIKA-877) Embedded document not extracted (regression)

2012-03-21 Thread Maxim Valyanskiy (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Valyanskiy resolved TIKA-877. --- Resolution: Fixed Fix Version/s: (was: 1.1) 1.2 > Embedded do

[jira] [Commented] (TIKA-873) Tika --extract fails for DOC

2012-03-21 Thread Maxim Valyanskiy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234294#comment-13234294 ] Maxim Valyanskiy commented on TIKA-873: --- Current trunk version extracts following file

[jira] [Commented] (TIKA-873) Tika --extract fails for DOC

2012-03-21 Thread Maxim Valyanskiy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234310#comment-13234310 ] Maxim Valyanskiy commented on TIKA-873: --- hm, 1.0 extracts something that is not valid.

[jira] [Resolved] (TIKA-873) Tika --extract fails for DOC

2012-03-21 Thread Maxim Valyanskiy (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Valyanskiy resolved TIKA-873. --- Resolution: Fixed > Tika --extract fails for DOC > > >

[jira] [Created] (TIKA-879) Detection problem: message/rfc822 file is detected as text/plain.

2012-03-21 Thread Kostya Gribov (Created) (JIRA)
Detection problem: message/rfc822 file is detected as text/plain. - Key: TIKA-879 URL: https://issues.apache.org/jira/browse/TIKA-879 Project: Tika Issue Type: Bug Com

[jira] [Commented] (TIKA-873) Tika --extract fails for DOC

2012-03-21 Thread Albert L. (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234350#comment-13234350 ] Albert L. commented on TIKA-873: Thanks, Maxim. > Tika --extract fails for

Re: Pluggable language detection

2012-03-21 Thread Ken Krugler
On Mar 21, 2012, at 8:51am, Julien Nioche wrote: > Hi guys, > > Just wondering about the best way to make the language detection pluggable > instead of having it hard-wired as it is now. We know that the resources > that are currently in Tika are both slow and inaccurate [1] and there are > other

Re: Pluggable language detection

2012-03-21 Thread Michael McCandless
On Wed, Mar 21, 2012 at 12:55 PM, Ken Krugler wrote: > > On Mar 21, 2012, at 8:51am, Julien Nioche wrote: > >> Hi guys, >> >> Just wondering about the best way to make the language detection pluggable >> instead of having it hard-wired as it is now. We know that the resources >> that are currently

Re: Pluggable language detection

2012-03-21 Thread Chris A Mattmann
Hey Juls, I'd be super +1 to make it pluggable and willing to help. Cheers, Chris On Mar 21, 2012, at 4:51 PM, Julien Nioche wrote: > Hi guys, > > Just wondering about the best way to make the language detection pluggable > instead of having it hard-wired as it is now. We know that the resource

[jira] [Created] (TIKA-880) while integrating microsoft parser it is giving error

2012-03-21 Thread Somenath Mukhopadhyay (Created) (JIRA)
while integrating microsoft parser it is giving error - Key: TIKA-880 URL: https://issues.apache.org/jira/browse/TIKA-880 Project: Tika Issue Type: Wish Components: parser Aff