[jira] [Created] (TIKA-1573) Not possible to restrict default mime types

2015-03-11 Thread Pavel Micka (JIRA)
Pavel Micka created TIKA-1573: Summary: Not possible to restrict default mime types Key: TIKA-1573 URL: https://issues.apache.org/jira/browse/TIKA-1573 Project: Tika Issue Type: Improvement

[jira] [Updated] (TIKA-1573) Not possible to restrict default mime types

2015-03-11 Thread Pavel Micka (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Micka updated TIKA-1573: Description: I am facing the following problem. I am using the MagicNumber detector, but the detection is sl

[jira] [Commented] (TIKA-1286) Adding MS Visio VSDX to mime-types detection

2015-03-11 Thread Fabian Lange (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356822#comment-14356822 ] Fabian Lange commented on TIKA-1286: For us it would be very helpful to know that the f

[jira] [Commented] (TIKA-1573) Not possible to restrict default mime types

2015-03-11 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356856#comment-14356856 ] Nick Burch commented on TIKA-1573: If you only want a handful of types, why not just repl

[jira] [Commented] (TIKA-1286) Adding MS Visio VSDX to mime-types detection

2015-03-11 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356859#comment-14356859 ] Nick Burch commented on TIKA-1286: Any chance you could create very small sample files of

[jira] [Commented] (TIKA-1573) Not possible to restrict default mime types

2015-03-11 Thread Pavel Micka (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356864#comment-14356864 ] Pavel Micka commented on TIKA-1573: Hi, because my restriction is all "binary mimetypes"

[jira] [Commented] (TIKA-1573) Not possible to restrict default mime types

2015-03-11 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356924#comment-14356924 ] Nick Burch commented on TIKA-1573: Detecting text types should be quick, I'd expect quick

[jira] [Commented] (TIKA-1573) Not possible to restrict default mime types

2015-03-11 Thread Pavel Micka (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356954#comment-14356954 ] Pavel Micka commented on TIKA-1573: No, I don't have any profiling results. In my case I

[jira] [Commented] (TIKA-1540) New Tika plugin for image based feature extraction using computer vision techniques

2015-03-11 Thread Dimuthu Upeksha (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357046#comment-14357046 ] Dimuthu Upeksha commented on TIKA-1540: Is this still available for GSoC 2015? > Ne

Re: Parser test resources

2015-03-11 Thread Nick Burch
On Tue, 10 Mar 2015, Tyler Palsulich wrote: Or, do enough parsers have overlapping test resource dependencies where it makes sense to have them _all_ under one directory? I believe that most of the test files get used for both detection and parsing unit tests. It would be nice to easily know

[jira] [Commented] (TIKA-1286) Adding MS Visio VSDX to mime-types detection

2015-03-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357158#comment-14357158 ] Hudson commented on TIKA-1286: SUCCESS: Integrated in tika-trunk-jdk1.7 #541 (See [https://b

[jira] [Commented] (TIKA-1286) Adding MS Visio VSDX to mime-types detection

2015-03-11 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357169#comment-14357169 ] Nick Burch commented on TIKA-1286: Thanks for all this! Note that the types given here d

[jira] [Commented] (TIKA-1286) Adding MS Visio VSDX to mime-types detection

2015-03-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357265#comment-14357265 ] Hudson commented on TIKA-1286: SUCCESS: Integrated in tika-trunk-jdk1.7 #542 (See [https://b

[jira] [Created] (TIKA-1574) Frames in header/footer in doc files aren't extracted

2015-03-11 Thread Konstantin Gribov (JIRA)
Konstantin Gribov created TIKA-1574: Summary: Frames in header/footer in doc files aren't extracted Key: TIKA-1574 URL: https://issues.apache.org/jira/browse/TIKA-1574 Project: Tika Issue

[jira] [Commented] (TIKA-1573) Not possible to restrict default mime types

2015-03-11 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357408#comment-14357408 ] Nick Burch commented on TIKA-1573: My hunch is that your profiling will show almost no di

[jira] [Commented] (TIKA-1572) Utility script for pushing 3rd Party UCAR Dependencies to Maven Central

2015-03-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357897#comment-14357897 ] Chris A. Mattmann commented on TIKA-1572: how about in https://svn.apache.org/repo

[jira] [Commented] (TIKA-1572) Utility script for pushing 3rd Party UCAR Dependencies to Maven Central

2015-03-11 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357930#comment-14357930 ] Lewis John McGibbney commented on TIKA-1572: Looks much better. I didn't see th