Hi,

On Tue, Sep 29, 2009 at 12:08 AM, Ken Krugler
<[email protected]> wrote:
> Just for grins, I set up for types with names ending in +xml to
> automatically get application/xml as the parent mimetype.
>
> But when I used TikaCLI to process a test.xspf file, no content was
> generated.
>
> The issue is that CompositeParser.getParser() doesn't use supertypes when
> falling back - if it can't get a parser for the exact mimetype, then it goes
> straight to the fallback parser.
>
> It seems like it should try to use the mimetype hierarchy. If so, I can file
> an issue and a patch.

Correct, that would be great.

Note that the MimeType.getSuperType() method already does some of
this, and we have related supertype settings stored in the
tika-mimetypes.xml configuration. The type registry could also be told
about the +xml convention and related implicit supertype settings like
the ones encoded in the MediaType.isSpecializationOf() method.
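
To make the idea concrete, here is a rough sketch of a lookup that walks
the supertype chain before giving up. It is not the actual
CompositeParser code: the class name, the string-valued parser registry,
the "application/x-example-feed" type, and the superTypeOf() helper are
all made up for illustration, and the only implicit rule it knows is the
+xml convention.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Tika's CompositeParser. Parsers are
// represented by plain names to keep the example self-contained.
class SupertypeFallbackSketch {

    // Hypothetical registry: media type -> parser name.
    static final Map<String, String> PARSERS = new HashMap<>();

    // Hypothetical explicit supertype settings, standing in for the
    // kind of entries kept in tika-mimetypes.xml.
    static final Map<String, String> EXPLICIT_SUPERTYPES = new HashMap<>();

    static final String FALLBACK = "FallbackParser";

    static {
        PARSERS.put("application/xml", "XMLParser");
        EXPLICIT_SUPERTYPES.put("application/x-example-feed", "application/xml");
    }

    // Returns the supertype of the given type, or null if none is known.
    static String superTypeOf(String type) {
        String explicit = EXPLICIT_SUPERTYPES.get(type);
        if (explicit != null) {
            return explicit;
        }
        // Implicit rule: anything ending in +xml specializes application/xml.
        if (type.endsWith("+xml")) {
            return "application/xml";
        }
        return null;
    }

    // Tries the exact type first, then each supertype in turn, and only
    // then falls back to the default parser.
    static String findParser(String type) {
        for (String t = type; t != null; t = superTypeOf(t)) {
            String parser = PARSERS.get(t);
            if (parser != null) {
                return parser;
            }
        }
        return FALLBACK;
    }
}
```

With this sketch, findParser("application/xspf+xml") resolves to the XML
parser through the +xml rule, while a type with no known supertype, such
as "image/png", still ends up with the fallback parser.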

(Note that we currently have both MimeType and MediaType classes for
similar purposes. This is due to an ongoing redesign of the mime type
registry. For now it's probably best to work on the MimeType class
until the redesign is more complete.)

BR,

Jukka Zitting
