[jira] [Commented] (TIKA-1813) Figure out file types for several unknown OLE files in Common Crawl

2015-12-17 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062115#comment-15062115 ] Tim Allison commented on TIKA-1813: It's a small world after all (TIKA-1814)...the only three files in

RE: looking to contribute

2015-12-17 Thread Allison, Timothy B.
Speaking of the docs/examples, TIKA-1329 is still open because I haven't gotten around to documenting it. Yes, if you'd like a report of exceptions, let me know. IIRC, it would be great if we could improve on XML detection (we're currently over-detecting), and there's plenty of work to do on

Re: looking to contribute

2015-12-17 Thread Joey Hong
Thanks for the advice! I’ll start with some documentation and tests and move to harder tasks from there. Regarding the JIRA instance for TIKA-1329, would the documentation for the RecursiveParserWrapper go with the RecursiveMetadata page on the wiki? Thanks, Joey

Re: looking to contribute

2015-12-17 Thread Mattmann, Chris A (3980)
What Tim and Nick said. :) Joey is at Caltech and interested in working with me, so I said jump on the Tika lists and let’s see if there is something we can pin down.