kalyan created TIKA-2494:
----------------------------

             Summary: Apache Tika Detect issue
                 Key: TIKA-2494
                 URL: https://issues.apache.org/jira/browse/TIKA-2494
             Project: Tika
          Issue Type: Bug
          Components: tika-batch
            Reporter: kalyan


My XML file contains a <body> tag, so Tika's detect() method reports the 
content type as "text/html" instead of XML. *Note:* The file name has no 
extension.

*Example:* The XML file looks like this:

<?xml version="1.0"?>
<body>
 <a>
  <b></b>
  <c></c>
 </a>
</body>

Is there another method or approach that would detect this file as XML 
instead of HTML?
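
One approach that might work independently of Tika is to check whether the 
content is well-formed XML with the JDK's own SAX parser before calling 
detect(). The following is only a minimal sketch under that assumption; the 
class and method names (XmlSniffer, looksLikeXml) are hypothetical and are not 
part of Tika, and the whole file is assumed to fit in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Tika: uses only the JDK's XML parser.
final class XmlSniffer {

    // Returns true only if the bytes parse as a well-formed XML document.
    static boolean looksLikeXml(byte[] content) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Reject DOCTYPE declarations so untrusted input cannot load external entities.
            // Side effect: documents that start with a DOCTYPE (e.g. XHTML) are reported as non-XML.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // DefaultHandler ignores all content; only well-formedness errors matter here.
            factory.newSAXParser().parse(new ByteArrayInputStream(content), new DefaultHandler());
            return true;
        } catch (SAXException | ParserConfigurationException | IOException e) {
            // Any parse failure means the content is not well-formed XML.
            return false;
        }
    }
}
```

If looksLikeXml returns true, the caller could treat the file as XML and fall 
back to detect() otherwise. The check is strict: anything that is not 
well-formed, such as a stray closing tag after the root element, is rejected.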

Thank you in advance



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
