[ https://issues.apache.org/jira/browse/TIKA-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan McKinley updated TIKA-1012:
--------------------------------

    Attachment: TIKA-1012-MimeMeta.patch

This updates the patch to use tika namespace for custom attributes:

{code:xml}
  <mime-type type="image/x-ms-bmp">
    <alias type="image/bmp"/>
    <acronym>BMP</acronym>
    <tika:description>Windows bitmap</tika:description>
    <tika:link>http://en.wikipedia.org/wiki/BMP_file_format</tika:link>
    <tika:uti>com.microsoft.bmp</tika:uti>
    <magic priority="50">
    ...
{code}

I think we should replace the use of <_comment> with <tika:description> since the value ends up in a 'description' field, not a 'comment' field.

ryan

> Add additional fields to MimeType reader
> ----------------------------------------
>
>                 Key: TIKA-1012
>                 URL: https://issues.apache.org/jira/browse/TIKA-1012
>             Project: Tika
>          Issue Type: New Feature
>          Components: mime
>            Reporter: Ryan McKinley
>            Priority: Minor
>         Attachments: TIKA-1012-MimeMeta.patch, TIKA-1012-MimeMeta.patch
>
>
> Currently the MimeType class exposes a description (_comment). It would be nice to also expose:
> * Acronym (this is already in tika-mimetypes.xml, see <acronym>BMP</acronym>)
> * Links, add helper docs for some formats
> * UTI, http://en.wikipedia.org/wiki/Uniform_Type_Identifier
> A sample entry would look like this:
> {code:xml}
>   <mime-type type="image/x-ms-bmp">
>     <alias type="image/bmp"/>
>     <acronym>BMP</acronym>
>     <_comment>Windows bitmap</_comment>
>     <_link>http://en.wikipedia.org/wiki/BMP_file_format</_link>
>     <_uti>com.microsoft.bmp</_uti>
>     <magic priority="50">
>     ...
> {code}
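
For anyone who wants to see how a reader would pick up the namespaced elements from the updated entry, here is a minimal sketch in Python. It is not the code in the patch (Tika's reader is Java). The namespace URI, the `<mime-info>` wrapper, and the `read_mime_meta` helper are all assumptions made for illustration, since the message does not say which URI the `tika` prefix is bound to. The `<magic>` element is left out because it is elided in the sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the message does not say which URI the patch
# binds to the "tika" prefix, so this value is a placeholder.
TIKA_NS = "urn:example:tika-mime-meta"

# Sample entry from the message, wrapped in an assumed <mime-info> root
# that declares the prefix.
SAMPLE = f"""\
<mime-info xmlns:tika="{TIKA_NS}">
  <mime-type type="image/x-ms-bmp">
    <alias type="image/bmp"/>
    <acronym>BMP</acronym>
    <tika:description>Windows bitmap</tika:description>
    <tika:link>http://en.wikipedia.org/wiki/BMP_file_format</tika:link>
    <tika:uti>com.microsoft.bmp</tika:uti>
  </mime-type>
</mime-info>
"""


def read_mime_meta(xml_text):
    """Return {mime type name: metadata dict} for each <mime-type> element."""
    root = ET.fromstring(xml_text)
    ns = {"tika": TIKA_NS}  # prefix map used to resolve "tika:..." paths
    result = {}
    for mime_type in root.findall("mime-type"):

        def text(path):
            # Text of the first matching child, or None if it is absent or empty.
            node = mime_type.find(path, ns)
            return node.text.strip() if node is not None and node.text else None

        result[mime_type.get("type")] = {
            "aliases": [a.get("type") for a in mime_type.findall("alias")],
            "acronym": text("acronym"),
            "description": text("tika:description"),
            # An entry may carry several links, so collect all of them.
            "links": [
                link.text.strip()
                for link in mime_type.findall("tika:link", ns)
                if link.text
            ],
            "uti": text("tika:uti"),
        }
    return result


if __name__ == "__main__":
    print(read_mime_meta(SAMPLE)["image/x-ms-bmp"])
```

Un-prefixed elements such as `<acronym>` are looked up without a namespace, while the custom ones go through the prefix map. That separation is the point of moving the custom fields into their own namespace.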