Mime type detection fails with upper case file extensions such as "PDF".
------------------------------------------------------------------------

                 Key: TIKA-56
                 URL: https://issues.apache.org/jira/browse/TIKA-56
             Project: Tika
          Issue Type: Bug
          Components: general
    Affects Versions: 0.1-incubator
            Reporter: Keith R. Bennett
            Priority: Critical
             Fix For: 0.1-incubator


MIME type detection seems to work only when the file extension is lowercase.
Both the PDF and DOC extensions failed when given in uppercase.

To test this, add the following method to TestParsers:

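    // Checks each extension in lowercase and uppercase (and mixed case for html).
    // Assumes tc is the configuration field already set up in TestParsers, and
    // that java.net.URL and java.net.MalformedURLException are imported.
    public void testGetParsers() throws TikaException, MalformedURLException {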
        assertNotNull(ParseUtils.getParser(new URL("file:x.pdf"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.PDF"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.doc"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.DOC"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.txt"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.TXT"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.html"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.HTML"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.HtMl"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.htm"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.HTM"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.ppt"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.PPT"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.xls"), tc));
        assertNotNull(ParseUtils.getParser(new URL("file:x.XLS"), tc));
        // more?
    }
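
One possible direction for a fix is to normalize the extension to lowercase before looking it up. The following is only a minimal sketch of that idea, not Tika's actual code: the class name ExtensionLookup, the method mimeTypeFor, and the small extension table are all hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ExtensionLookup {
    // Hypothetical extension-to-MIME-type table; not Tika's actual registry.
    private static final Map<String, String> TYPES = new HashMap<>();
    static {
        TYPES.put("pdf", "application/pdf");
        TYPES.put("doc", "application/msword");
        TYPES.put("txt", "text/plain");
        TYPES.put("html", "text/html");
        TYPES.put("htm", "text/html");
    }

    /** Returns the MIME type for a file name, or null if the extension is unknown. */
    public static String mimeTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension present
        }
        // Locale.ROOT avoids locale-specific case rules (e.g. the Turkish dotless i).
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TYPES.get(ext);
    }

    public static void main(String[] args) {
        System.out.println(mimeTypeFor("x.PDF"));
        System.out.println(mimeTypeFor("x.HtMl"));
    }
}
```

With this kind of normalization, "x.pdf", "x.PDF", and "x.HtMl" all resolve through the same lowercase key, which is the behavior the test above expects.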


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
