The MIME types for a given extension are defined here [1] which we took
from httpd's view of the world.

So while it would be trivial to change them to be the same as the RI,
I'm inclined to:
 - leave rtf as text/rtf
 - add java to our list as text/plain
 - leave doc as application/msword
then figure out how to snoop the stream for other types.

[1] http://svn.apache.org/viewvc/harmony/enhanced/classlib/trunk/depends/files/content-types.properties?revision=494047&view=markup

Thoughts?

Tim

Vasily Zakharov (JIRA) wrote:
> [classlib][luni] URLConnection.getContentType() works with files incorrectly
> ----------------------------------------------------------------------------
>
>                 Key: HARMONY-4699
>                 URL: https://issues.apache.org/jira/browse/HARMONY-4699
>             Project: Harmony
>          Issue Type: Bug
>          Components: Classlib
>            Reporter: Vasily Zakharov
>
>
> In the Harmony implementation, java.net.URLConnection.getContentType() works
> incorrectly when it addresses a file URL:
>
> 1. For files with .rtf extension, RI returns "application/rtf", while Harmony
> returns "text/rtf".
>
> 2. For files with .java extension, RI returns "text/plain", while Harmony
> returns "content/unknown".
>
> 3. For files with .doc extension, RI returns "content/unknown", while Harmony
> returns "application/msword". The same is true for other known extensions.
>
> 4. For files with an unrecognized extension and with HTML content, RI returns
> "text/html", while Harmony returns "content/unknown".
>
> Items 1 and 2 look like minor issues that would better be fixed for
> compatibility with RI.
>
> Item 3 looks like a non-bug difference, as Harmony behaves clearly better
> than RI in these cases.
>
> Item 4 looks like a serious bug, as RI clearly looks into file content for
> the file type, and Harmony does not. Looks like
> org.apache.harmony.luni.internal.net.www.protocol.file.FileURLConnection.getContentType()
> needs to be fixed to use guessContentTypeFromStream() in addition to
> guessContentTypeFromName().
>
> The attached archive contains the reproducer with some test files it uses.
> Here's the reproducer code:
>
> public class Test {
>     static void printContentType(String fileName) throws java.io.IOException {
>         System.out.println(fileName + ": " +
>                 new java.net.URL("file:" + fileName).openConnection().getContentType());
>     }
>     public static void main(String argv[]) {
>         try {
>             printContentType("test.rtf");
>             printContentType("Test.java");
>             printContentType("test.doc");
>             printContentType("test.htx");
>         } catch (Exception e) {
>             e.printStackTrace(System.out);
>         }
>     }
> }
>
> Output on RI:
>
> test.rtf: application/rtf
> Test.java: text/plain
> test.doc: content/unknown
> test.htx: text/html
>
> Output on Harmony:
>
> test.rtf: text/rtf
> Test.java: content/unknown
> test.doc: application/msword
> test.htx: content/unknown
>
> This issue is a blocker for HARMONY-4696, as on RI
> JEditorPane.getContentType() should be based on
> URLConnection.getContentType() that now works incorrectly.
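
For reference, the fallback described in item 4 (try the file name first, then snoop the stream) might look roughly like the sketch below. It uses only the public static helpers on java.net.URLConnection. The class name ContentTypeSketch and the helper guessContentType are made up for illustration and are not the actual FileURLConnection code; the "content/unknown" default is taken from the outputs quoted above.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Harmony's FileURLConnection implementation.
class ContentTypeSketch {

    // Try the extension-based lookup first, then fall back to
    // sniffing the leading bytes of the stream.
    static String guessContentType(String fileName, InputStream in) throws IOException {
        String type = URLConnection.guessContentTypeFromName(fileName);
        if (type == null) {
            // guessContentTypeFromStream() needs mark/reset support,
            // so wrap the stream if it does not provide it.
            InputStream sniffable = in.markSupported() ? in : new BufferedInputStream(in);
            type = URLConnection.guessContentTypeFromStream(sniffable);
        }
        // Assumed default, matching the "content/unknown" seen in the reproducer output.
        return type != null ? type : "content/unknown";
    }

    public static void main(String[] args) throws IOException {
        // Unrecognized extension, HTML content (item 4 in the report).
        byte[] html = "<html><body>hi</body></html>".getBytes(StandardCharsets.US_ASCII);
        System.out.println("test.htx: "
                + guessContentType("test.htx", new ByteArrayInputStream(html)));
    }
}
```

With this ordering the extension table still wins for known types such as .rtf and .doc, so the choices listed at the top of this mail would be unaffected; only names the table does not know would reach the stream check.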
