Better handling of content type metadata
----------------------------------------

                 Key: TIKA-759
                 URL: https://issues.apache.org/jira/browse/TIKA-759
             Project: Tika
          Issue Type: Improvement
          Components: metadata, mime
            Reporter: Jukka Zitting
            Assignee: Jukka Zitting
            Priority: Minor


Currently we use the "Content-Type" metadata key for storing (and looking up)
the media type of a document. This is simple and works well, especially with
HTTP, but it does not align well with XMP or other metadata standards such as
Dublin Core. As an improvement, I propose the following:

* Switch to "dc:format" as the standard metadata key for the content type
* Keep the existing "Content-Type" key for backwards compatibility with 
existing clients
* Make the Metadata class aware of such aliases
* Add getFormat() and setFormat() utility methods to Metadata to simplify 
client code and to make the exact metadata key more of an implementation detail 
in Tika
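
To illustrate how the alias handling proposed above might behave, here is a
minimal sketch. It is not Tika's actual Metadata class: the class name
AliasAwareMetadata, the alias table, and the single-valued storage are all
assumptions made for this example (the real Metadata class can hold multiple
values per key). Only the two key names and the getFormat()/setFormat()
method names come from the proposal itself.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for a Metadata class that understands key aliases.
 * Single-valued for brevity; this is an illustrative sketch only.
 */
class AliasAwareMetadata {

    /** Proposed standard key for the media type. */
    static final String FORMAT = "dc:format";

    /** Existing key, kept so that current clients continue to work. */
    static final String CONTENT_TYPE = "Content-Type";

    /** Maps each legacy key to the canonical key it is an alias for. */
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put(CONTENT_TYPE, FORMAT);
    }

    /** Values are always stored under the canonical key. */
    private final Map<String, String> values = new HashMap<>();

    /** Resolves an alias to its canonical key; other keys pass through. */
    private static String canonical(String key) {
        return ALIASES.getOrDefault(key, key);
    }

    void set(String key, String value) {
        values.put(canonical(key), value);
    }

    String get(String key) {
        return values.get(canonical(key));
    }

    /** Convenience accessor so clients need not know the exact key. */
    String getFormat() {
        return get(FORMAT);
    }

    void setFormat(String mediaType) {
        set(FORMAT, mediaType);
    }
}
```

With this design, a client that still writes "Content-Type" and a client that
calls getFormat() see the same value, because both keys resolve to a single
stored entry. Whether the real implementation should resolve aliases on write,
on read, or both is left open here.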

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
