Oh, he is?! Did I mention I have a grad-student-semester of projects for corpus curation? :)
-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Thursday, December 17, 2015 12:55 PM
To: dev@tika.apache.org
Subject: Re: looking to contribute

What Tim and Nick said. :) Joey is at Caltech and interested in
working with me, so I said jump on the Tika lists and let’s see
if there is something we can pin down.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: "Allison, Timothy B." <talli...@mitre.org>
Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
Date: Thursday, December 17, 2015 at 5:32 AM
To: "dev@tika.apache.org" <dev@tika.apache.org>
Subject: RE: looking to contribute

>Speaking of the docs/examples, TIKA-1329 is still open because I
>haven't gotten around to documenting it.
>
>Y, if you'd like a report of exceptions, let me know. IIRC, it would
>be great if we could improve on XML detection (we're currently over
>detecting), and there's plenty of work to do on html parsing TIKA-1599.
>
>I also have probably a full grad student semester worth of curation
>project ideas on the test corpus. Not glamorous, but very useful for
>the community.
>
>Then there's the eval code itself...that still needs to make it into
>shape to be added.
>
>I agree with Nick though, start small on documentation/examples.
>
>Cheers,
>
>     Tim
>
>-----Original Message-----
>From: Nick Burch [mailto:apa...@gagravarr.org]
>Sent: Wednesday, December 16, 2015 4:23 PM
>To: dev@tika.apache.org
>Subject: Re: looking to contribute
>
>On Wed, 16 Dec 2015, Joey Hong wrote:
>> My name is Joey. I am a college freshman with programming experience
>>looking to get into the world of open-source. I was hoping to
>>contribute to the Tika project, and was wondering if there were any
>>tasks that a beginner like me could tackle. I am willing to do
>>anything, whether it be fixing a minor bug, or adding test suites or
>>documentation.
>
>On the docs / examples side, we have a few examples on the website, but
>probably not enough! One thing might be to look through those, identify
>gaps with your fresh eyes, and work on those. We also have instructions
>for some more complicated integrations on the wiki, maybe try some of
>those and feed back on which ones aren't clear enough?
>
>If you want to try more coding, Tim quite often runs Tika against some
>large filesets, and has a nifty tool to report on what breaks. He can
>hopefully point you at the most recent report! Maybe have a look
>through that, identify a few common failures from unidentified or
>common exceptions, and try to fix one or two of those?
>
>Nick