The MIME type for each extension is defined in [1], which we took
from httpd's view of the world.  So while it would be trivial to change
them to match the RI, I'm inclined to:
 - leave rtf as text/rtf
 - add java to our list as text/plain
 - leave doc as application/msword
then figure out how to snoop the stream for other types.
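
For the stream-snooping part, here is a rough sketch of the fallback I have
in mind, not actual Harmony code. The class and helper names
(ContentTypeSketch, guessContentType) are made up for illustration; the two
URLConnection.guessContentTypeFrom*() calls are the standard public statics.
It assumes the name-based guess returns null for an unknown extension and
that the stream can be wrapped to support mark/reset.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;

public class ContentTypeSketch {
    // Hypothetical helper: try the extension table first, then peek at the content.
    static String guessContentType(String fileName, InputStream in) throws IOException {
        String type = URLConnection.guessContentTypeFromName(fileName);
        if (type == null) {
            // guessContentTypeFromStream() needs mark/reset support to peek safely.
            InputStream marked = in.markSupported() ? in : new BufferedInputStream(in);
            type = URLConnection.guessContentTypeFromStream(marked);
        }
        // Same fallback string that the issue report shows for unrecognized files.
        return type != null ? type : "content/unknown";
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file with HTML content and an unrecognized extension.
        byte[] html = "<html><body>hi</body></html>".getBytes("US-ASCII");
        System.out.println(guessContentType("test.htx", new ByteArrayInputStream(html)));
    }
}
```

Checking the name first keeps the extension table authoritative for the types
we already know (rtf, doc), so the stream is only consulted when the
extension tells us nothing.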

[1]
http://svn.apache.org/viewvc/harmony/enhanced/classlib/trunk/depends/files/content-types.properties?revision=494047&view=markup

Thoughts?
Tim


Vasily Zakharov (JIRA) wrote:
> [classlib][luni] URLConnection.getContentType() works with files incorrectly
> ----------------------------------------------------------------------------
> 
>                  Key: HARMONY-4699
>                  URL: https://issues.apache.org/jira/browse/HARMONY-4699
>              Project: Harmony
>           Issue Type: Bug
>           Components: Classlib
>             Reporter: Vasily Zakharov
> 
> 
> In the Harmony implementation, java.net.URLConnection.getContentType() works
> incorrectly when it addresses a file URL:
> 
> 1. For files with .rtf extension, RI returns "application/rtf", while Harmony 
> returns "text/rtf".
> 
> 2. For files with .java extension, RI returns "text/plain", while Harmony 
> returns "content/unknown".
> 
> 3. For files with .doc extension, RI returns "content/unknown", while Harmony 
> returns "application/msword". The same is true for other known extensions.
> 
> 4. For files with unrecognized extension and with HTML content, RI returns 
> "text/html", while Harmony returns "content/unknown".
> 
> Items 1 and 2 look like minor issues that would be better fixed for
> compatibility with RI.
> 
> Item 3 looks like a non-bug difference, as Harmony behaves clearly better 
> than RI in these cases.
> 
> Item 4 looks like a serious bug, as RI clearly looks into file content for 
> the file type, and Harmony does not. Looks like 
> org.apache.harmony.luni.internal.net.www.protocol.file.FileURLConnection.getContentType()
>  needs to be fixed to use guessContentTypeFromStream() in addition to 
> guessContentTypeFromName().
> 
> The attached archive contains the reproducer with some test files it uses. 
> Here's the reproducer code:
> 
> public class Test {
>     static void printContentType(String fileName) throws java.io.IOException {
>         java.net.URL url = new java.net.URL("file:" + fileName);
>         System.out.println(fileName + ": " + url.openConnection().getContentType());
>     }
>     public static void main(String argv[]) {
>         try {
>             printContentType("test.rtf");
>             printContentType("Test.java");
>             printContentType("test.doc");
>             printContentType("test.htx");
>         } catch (Exception e) {
>             e.printStackTrace(System.out);
>         }
>     }
> } 
> 
> Output on RI:
> 
> test.rtf: application/rtf
> Test.java: text/plain
> test.doc: content/unknown
> test.htx: text/html
> 
> Output on Harmony:
> 
> test.rtf: text/rtf
> Test.java: content/unknown
> test.doc: application/msword
> test.htx: content/unknown
> 
> This issue is a blocker for HARMONY-4696: as on RI,
> JEditorPane.getContentType() should be based on
> URLConnection.getContentType(), which currently works incorrectly.
> 
> 
