+1. This was one of the key facets of Nutch that we wanted to port to Tika: the ability to define a preference order for plugins by MIME type. Nutch does it a bit differently, and I think we could improve on that approach here in Tika.
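To make the idea concrete, here is a minimal sketch of how a per-type preference order could combine with the type-inheritance fallback described in the quoted message. Everything in it is hypothetical: the class ParserResolver, its methods, and the parser names are invented for illustration and are not the Tika or Nutch API. It only assumes that each MIME type has at most one parent type and an ordered list of registered parsers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not part of Tika: picks a parser name for a MIME type.
class ParserResolver {
    // Child MIME type -> parent MIME type, e.g. text/csv -> text/plain.
    private final Map<String, String> parentOf = new HashMap<>();
    // MIME type -> parser names in preference order (first is most preferred).
    private final Map<String, List<String>> parsersFor = new HashMap<>();

    // Declare that 'type' is also a 'parent' (type inheritance).
    void addSupertype(String type, String parent) {
        parentOf.put(type, parent);
    }

    // Register a parser for a type; registration order is the preference order.
    void addParser(String type, String parserName) {
        parsersFor.computeIfAbsent(type, k -> new ArrayList<>()).add(parserName);
    }

    // Walk up the type hierarchy until some type has a registered parser.
    Optional<String> resolve(String type) {
        Set<String> seen = new HashSet<>(); // guards against cyclic hierarchies
        String current = type;
        while (current != null && seen.add(current)) {
            List<String> candidates = parsersFor.get(current);
            if (candidates != null && !candidates.isEmpty()) {
                return Optional.of(candidates.get(0)); // most preferred parser
            }
            current = parentOf.get(current); // null once the root is passed
        }
        return Optional.empty(); // no parser anywhere in the hierarchy
    }
}
```

With text/csv declared as a subtype of text/plain and only a plain text parser registered, resolve("text/csv") falls back to the plain text parser; once a CSV-specific parser is registered for text/csv, it takes precedence.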

Cheers,
  Chris

On 10/19/07 1:07 AM, "Jukka Zitting" <[EMAIL PROTECTED]> wrote:

> Hi,
>
> On 10/19/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote:
>> In the case of CSV, for example, we might want to say "this is CSV,
>> which is also plain text", so that if Tika has a CSV specific parser
>> it uses it, and if not it uses a plain text parser.
>
> The MIME framework in Tika already supports the concept of type
> inheritance, and we could fairly easily make AutoDetectParser (or
> anything similar) walk up the type hierarchy until it finds a parser
> that can handle the content.
>
> BR,
>
> Jukka Zitting

______________________________________________
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cognizant Development Engineer
Early Detection Research Network Project
_________________________________________________
Jet Propulsion Laboratory
Pasadena, CA
Office: 171-266B
Mailstop: 171-246
_______________________________________________________

Disclaimer: The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.
