[ 
https://issues.apache.org/jira/browse/TIKA-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364064#comment-14364064
 ] 

Nick Burch commented on TIKA-1114:
----------------------------------

The entries in the file(1) sgml magic file all seem to be for specific 
SGML-based formats, such as SVG, XML sitemap, OSM, GnuCash etc.

Is there really such a thing as a generic SGML file, though? Aren't most/all(?) 
SGML-based files actually instances of a specific SGML application, i.e. a 
subtype built on the SGML structure?

> sgml mime type is not detected when passed in as byte stream
> ------------------------------------------------------------
>
>                 Key: TIKA-1114
>                 URL: https://issues.apache.org/jira/browse/TIKA-1114
>             Project: Tika
>          Issue Type: Bug
>          Components: mime
>            Reporter: Vikas Garg
>
> When passing SGML files as a TikaInputStream (created from a byte[]) to 
> Detector.detect(), it returns text/plain as the media type and not 
> application/sgml or text/sgml. But when I provide the file name in the 
> metadata, it gives me the correct MIME type, i.e., text/sgml.
> Is it because Tika is missing a designated parser for SGML files, or am I 
> missing something? I am on Tika-1.3.
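
To illustrate the behaviour described in the report, here is a minimal 
stdlib-only sketch of why a byte stream on its own might come back as 
text/plain while a file name hint gives text/sgml. This is not Tika's code: 
the class SgmlDetectionSketch, its methods, and the extension table are all 
hypothetical, and it assumes there is no content (magic) rule for generic 
SGML, so only the name can identify it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Toy illustration only; hypothetical names, not Tika's detector. */
final class SgmlDetectionSketch {

    /** Content-only guess: with no SGML magic rule, readable bytes look like plain text. */
    static String detectByContent(byte[] data) {
        for (byte b : data) {
            // Control bytes other than tab, LF and CR suggest binary data.
            if (b >= 0 && b < 0x20 && b != '\t' && b != '\n' && b != '\r') {
                return "application/octet-stream";
            }
        }
        return "text/plain";
    }

    /** Name-based guess: assumed glob table mapping *.sgml and *.sgm to text/sgml. */
    static String detectByName(String resourceName) {
        if (resourceName == null) {
            return null; // no name hint supplied
        }
        String lower = resourceName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".sgml") || lower.endsWith(".sgm")) {
            return "text/sgml";
        }
        return null; // extension not in the assumed table
    }

    /** Simplified combination: use the name hint when it matches, else fall back to content. */
    static String detect(byte[] data, String resourceName) {
        String byName = detectByName(resourceName);
        return byName != null ? byName : detectByContent(data);
    }

    public static void main(String[] args) {
        byte[] sgml = "<!DOCTYPE book SYSTEM \"book.dtd\">\n<book></book>\n"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(sgml, null));          // prints text/plain
        System.out.println(detect(sgml, "manual.sgml")); // prints text/sgml
    }
}
```

In this sketch the only thing that distinguishes the two calls is the name 
hint, which matches what the reporter sees when the file name is added to 
the metadata. How a real detector weighs name against content is more 
involved than the simple "name wins" rule assumed here.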



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
