Thanks for the advice! I’ll start with some documentation and tests and move to 
harder tasks from there.

Regarding the JIRA issue TIKA-1329, would the documentation for the 
RecursiveParserWrapper go on the RecursiveMetadata page on the wiki?

Thanks,
Joey

> On Dec 17, 2015, at 5:32 AM, Allison, Timothy B. <talli...@mitre.org> wrote:
> 
> Speaking of the docs/examples, TIKA-1329 is still open because I haven't 
> gotten around to documenting it.
> 
> Y, if you'd like a report of exceptions, let me know.  IIRC, it would be 
> great if we could improve on XML detection (we're currently over-detecting), 
> and there's plenty of work to do on HTML parsing (TIKA-1599).
> 
> I also have probably a full grad student semester's worth of curation project 
> ideas on the test corpus.  Not glamorous, but very useful for the community.
> 
> Then there's the eval code itself...that still needs to be put into shape 
> before it can be added.
> 
> I agree with Nick though, start small on documentation/examples.
> 
> Cheers,
> 
>               Tim
> 
> -----Original Message-----
> From: Nick Burch [mailto:apa...@gagravarr.org] 
> Sent: Wednesday, December 16, 2015 4:23 PM
> To: dev@tika.apache.org
> Subject: Re: looking to contribute
> 
> On Wed, 16 Dec 2015, Joey Hong wrote:
>> My name is Joey. I am a college freshman with programming experience 
>> looking to get into the world of open-source. I was hoping to 
>> contribute to the Tika project, and was wondering if there were any 
>> tasks that a beginner like me could tackle. I am willing to do 
>> anything, whether it be fixing a minor bug, or adding test suites or 
>> documentation.
> 
> On the docs / examples side, we have a few examples on the website, but 
> probably not enough! One thing might be to look through those, identify gaps 
> with your fresh eyes, and work on those. We also have instructions for some 
> more complicated integrations on the wiki, maybe try some of those and feed 
> back on which ones aren't clear enough?
> 
> If you want to try more coding, Tim quite often runs Tika against some large 
> filesets, and has a nifty tool to report on what breaks. He can hopefully 
> point you at the most recent report! Maybe have a look through that, identify 
> a few common failures from unidentified or common exceptions, and try to fix 
> one or two of those?
> 
> Nick
