Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2013-03-05 Thread Oleg Tikhonov
Tika's CHM support has its limitations. Can you provide such file(s) for further investigation? BR, Oleg On Wed, Mar 6, 2013 at 1:10 AM, Tejas Patil (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&foc

[jira] [Commented] (TIKA-245) Support of CHM Format

2013-03-05 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594074#comment-13594074 ] Tejas Patil commented on TIKA-245: I am working on NUTCH-1454 and I am observing that tika

Re: how to add more metadata to tika extraction?

2013-03-05 Thread Nick Burch
On Wed, 27 Feb 2013, eShard wrote:
> I manually ran the tika-app --gui and I dropped the RSS feed into it. Here's the metadata output:
> Content-Length: 615913
> Content-Type: application/rss+xml
> dc:description: This is an IBM C3 Public Files feed generated by a Java application.
> dc:title: IBM -