Hi Remzi - thanks! You may want to consider this as a Tika or Any23 project, since Nutch delegates its parsing to Tika (and Any23 uses Tika [and vice versa] to handle microformats).

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Remzi Düzağaç <remz...@gmail.com>
Reply-To: "dev@nutch.apache.org" <dev@nutch.apache.org>
Date: Friday, March 27, 2015 at 5:07 AM
To: "dev@nutch.apache.org" <dev@nutch.apache.org>
Subject: GSOC RDF Microformats Support

>Hi Guys,
>
>
>I have sent a proposal to gsoc. I would like to add rdf microformat
>support to nutch. I kindly ask for your support. Is there anyone
>volunteer to be my mentor on this topic?
>
>
>Thank you very much
>