+1. This was one of the key facets of Nutch that we wanted to port to Tika:
the ability to have a preference order for plugins according to mime types.
Nutch does it a bit differently, and I think we could improve on the way
it's done here in Tika.

Cheers,
  Chris



On 10/19/07 1:07 AM, "Jukka Zitting" <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> On 10/19/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote:
>> In the case of CSV, for example, we might want to say "this is CSV,
>> which is also plain text", so that if Tika has a CSV specific parser
>> it uses it, and if not it uses a plain text parser.
> 
> The MIME framework in Tika already supports the concept of type
> inheritance, and we could fairly easily make AutoDetectParser (or
> anything similar) walk up the type hierarchy until it finds a parser
> that can handle the content.
> 
> BR,
> 
> Jukka Zitting

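To make the fallback idea concrete, here is a minimal sketch of walking up a type hierarchy until some parser matches. It is not Tika's actual API: the class ParserFallbackSketch, its two maps, and the parser names are all hypothetical, and parsers are represented as plain strings to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not Tika code: resolves a parser by walking
// from a MIME type up through its declared parent types.
public class ParserFallbackSketch {
    // child type -> parent type, e.g. "text/csv" -> "text/plain" (assumed layout)
    private final Map<String, String> parentOf = new HashMap<>();
    // type -> parser name (a real registry would hold parser instances)
    private final Map<String, String> parserFor = new HashMap<>();

    public void addSubtype(String type, String parent) {
        parentOf.put(type, parent);
    }

    public void registerParser(String type, String parserName) {
        parserFor.put(type, parserName);
    }

    // Returns the parser for the most specific type in the chain that has one.
    public Optional<String> resolve(String type) {
        String current = type;
        int hops = 0;
        // The hop limit guards against an accidental cycle in the hierarchy.
        while (current != null && hops++ <= parentOf.size()) {
            String parser = parserFor.get(current);
            if (parser != null) {
                return Optional.of(parser);
            }
            current = parentOf.get(current);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ParserFallbackSketch registry = new ParserFallbackSketch();
        registry.addSubtype("text/csv", "text/plain");
        registry.registerParser("text/plain", "PlainTextParser");
        // No CSV-specific parser is registered, so the plain text one is used.
        System.out.println(registry.resolve("text/csv").orElse("none"));
    }
}
```

With a CSV-specific parser registered, resolve("text/csv") would return it first; the plain text parser is only reached when nothing more specific exists.
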
______________________________________________
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cognizant Development Engineer
Early Detection Research Network Project

_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                     Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.

