[jira] [Commented] (TIKA-676) Boilerpipe fails

2011-12-22 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174770#comment-13174770 ] Markus Jelsma commented on TIKA-676: The latest artifact is still not published on centr

[jira] [Commented] (TIKA-826) TikaException / OfficeXmlFileException with .xlsb files

2011-12-22 Thread John Mastarone (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174803#comment-13174803 ] John Mastarone commented on TIKA-826: After reading a little more on this, I see that P

[jira] [Issue Comment Edited] (TIKA-826) TikaException / OfficeXmlFileException with .xlsb files

2011-12-22 Thread John Mastarone (Issue Comment Edited) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174803#comment-13174803 ] John Mastarone edited comment on TIKA-826 at 12/22/11 1:35 PM:

[jira] [Commented] (TIKA-815) Tika parsers should handle failures more gracefully

2011-12-22 Thread Jerome Lacoste (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174827#comment-13174827 ] Jerome Lacoste commented on TIKA-815: Agreed. Yet improving the default parsers might s

Parser stability and ForkParser

2011-12-22 Thread Jerome Lacoste
Hei, I opened a couple of issues to note some parser instability: https://issues.apache.org/jira/browse/TIKA-815 https://issues.apache.org/bugzilla/show_bug.cgi?id=52372 https://issues.apache.org/bugzilla/show_bug.cgi?id=52373 https://issues.apache.org/jira/browse/COMPRESS-169 TIKA-815 is the ov

Re: Parser stability and ForkParser

2011-12-22 Thread Jerome Lacoste
On Thu, Dec 22, 2011 at 5:18 PM, Jerome Lacoste wrote: > Hei, > > I opened a couple of issues to note some parser instability: > > https://issues.apache.org/jira/browse/TIKA-815 > https://issues.apache.org/bugzilla/show_bug.cgi?id=52372 > https://issues.apache.org/bugzilla/show_bug.cgi?id=52373 >

[jira] [Commented] (TIKA-826) TikaException / OfficeXmlFileException with .xlsb files

2011-12-22 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175240#comment-13175240 ] Nick Burch commented on TIKA-826: POI doesn't support .xlsb files, and nor is it likely to

[jira] [Commented] (TIKA-815) Tika parsers should handle failures more gracefully

2011-12-22 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175243#comment-13175243 ] Nick Burch commented on TIKA-815: For people with strong stability requirements, we provide

Re: Parser stability and ForkParser

2011-12-22 Thread Nick Burch
On 22/12/11 16:18, Jerome Lacoste wrote: Now a question that pertains more to the user list. In TIKA-815, Nick pointed out that one could use ForkParser to improve stability. I didn't manage to get it to work. When I use the command line tika app, e.g. with java -jar /tmp/tika-app-1.0.jar -v -t -

Re: Pushing parsers upstream

2011-12-22 Thread Nick Burch
On 16/12/11 15:12, Jukka Zitting wrote: As mentioned by Antoni, in the end the metadata keys are just strings, so with a little coordination we don't need to delay the introduction of new keys over multiple releases. Hmm, they're not quite just strings - with the new Property stuff they can al