Thanks, I had initially overlooked the additional restriction on the
first character.
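
For reference, here is a rough sketch of that restriction as I read the
spec sentence quoted further down in this thread. It is only an
illustration, not Harmony's or the reference implementation's code. The
class and method names are made up, and the set of punctuation allowed
after the first character ('-', '.', ':', '_') is my assumption based on
the 1.4.2 Charset javadoc.

```java
// Hypothetical helper for illustration only; not part of java.nio.charset.
final class CharsetNameRule {

    /** True if c is an ASCII letter or an ASCII digit. */
    static boolean isAsciiLetterOrDigit(char c) {
        return (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9');
    }

    /** Assumption: punctuation permitted after the first character. */
    static boolean isAllowedPunctuation(char c) {
        return c == '-' || c == '.' || c == ':' || c == '_';
    }

    /** Applies the "must begin with a letter or digit" wording literally. */
    static boolean isLegalPerSpec(String name) {
        if (name == null || name.length() == 0) {
            return false; // the empty string is not a legal charset name
        }
        if (!isAsciiLetterOrDigit(name.charAt(0))) {
            return false; // the first-character restriction
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!isAsciiLetterOrDigit(c) && !isAllowedPunctuation(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] names = { "UTF-8", "-UTF-8", "_US-ASCII", "" };
        for (String n : names) {
            System.out.println("\"" + n + "\" legal per spec wording: "
                    + isLegalPerSpec(n));
        }
    }
}
```

Under this reading "-UTF-8" and "_US-ASCII" are rejected only because of
their first character; every character in them is otherwise permitted.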

This strikes me as one of those cases where the reference impl. wins
over the specification.  I think Svetlana's test was written to the
spec.  If we discover an app that relies upon isSupported  throwing an
IllegalCharsetNameException instead of returning false then (besides
wondering where this app has ever run) we can revisit.
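
In the meantime, code that needs to behave the same whichever way an
implementation goes could wrap the call. A minimal sketch, with a
hypothetical class and method name, that treats an illegal name as simply
"not supported":

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical wrapper for illustration only.
final class LenientCharsetCheck {

    /**
     * Returns false both when the runtime reports the name as unsupported
     * and when it rejects the name with IllegalCharsetNameException.
     */
    static boolean isSupportedLenient(String charsetName) {
        try {
            return Charset.isSupported(charsetName);
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = { "UTF-8", "-UTF-8", "_US-ASCII" };
        for (String n : names) {
            System.out.println(n + " -> " + isSupportedLenient(n));
        }
    }
}
```

A caller written this way does not care whether the runtime follows the
letter of the spec or the reference implementation's behaviour.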

I vote we resolve this part of the bug as "won't fix".

Regards,
Tim

karan malhi wrote:
> Here is text from the j2se1.4.2 spec
> A charset name must begin with either a letter or a digit. The empty
> string is not a legal charset name. Charset names are not
> case-sensitive; that is, case is always ignored when comparing charset
> names. Charset names generally follow the conventions documented in /RFC
> 2278: IANA Charset Registration Procedures/
> <http://ietf.org/rfc/rfc2278.txt>.
> According to RFC - 2278
> 
>   Finally, charsets being registered for use with the "text" media type
>   MUST have a primary name that conforms to the more restrictive syntax
>   of the charset field in MIME encoded-words [RFC-2047, RFC-2184] and
>   MIME extended parameter values [RFC-2184]. A combined ABNF definition
>   for such names is as follows:
> 
>   mime-charset = 1*<Any CHAR except SPACE, CTLs, and cspecials>
> 
>   cspecials    = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
>                  <"> / "/" / "[" / "]" / "?" / "." / "=" / "*"
> 
>   CHAR         =  <any ASCII character>        ; (  0-177,  0.-127.)
>   SPACE        =  <ASCII SP, space>            ; (     40,      32.)
>   CTL          =  <any ASCII control           ; (  0- 37,  0.- 31.)
>                    character and DEL>          ; (    177,     127.)
> 
> If I have interpreted the above correctly, then it basically means that
> the name can start with any ASCII character except SPACE (octal 40), the
> control characters (0-37, 177), and the cspecials listed above. A "-" is
> 055 and a "_" is 137, neither of which falls in that excluded set.
> So, as far as the RFC is concerned, a charset named "-UTF-8" or "_UTF-8"
> is not an illegal name.
> 
> So it looks like the Java spec further tightens the set of charset names
> accepted by Java, in that a name can only start with a letter or a
> digit. How do we interpret *must*?
> 
> 
> 
> So
> 
> Richard Liang wrote:
> 
>> Hello Tim,
>>
>> I'm wondering why I did not just copy the first sentence. :-)
>>
>> "A charset name **must** begin with either a letter or a digit."  Does
>> this mean that a charset name which begins with neither a letter nor a
>> digit should be regarded as an illegal charset name?
>>
>>
>> Richard Liang
>> China Software Development Lab, IBM
>>
>>
>>
>> Tim Ellison wrote:
>>
>>> Richard Liang wrote:
>>>  
>>>
>>>> Hello Tim,
>>>>
>>>> I think this is caused by different understanding of the java spec:
>>>>
>>>> A charset name **must** begin with either a letter or a digit. The
>>>> empty string is not a legal charset name....
>>>>
>>>> What do you think the implication of "must" is here? :-)
>>>>     
>>>
>>>
>>> But the name isn't empty, it is "-UTF-8" ?  I must be missing
>>> something...
>>>
>>> Regards,
>>> Tim
>>>
>>>
>>>  
>>>
>>>> Tim Ellison (JIRA) wrote:
>>>>   
>>>>>     [
>>>>> http://issues.apache.org/jira/browse/HARMONY-68?page=comments#action_12366784
>>>>>
>>>>> ]
>>>>> Tim Ellison commented on HARMONY-68:
>>>>> ------------------------------------
>>>>>
>>>>> The test looks invalid to me.  You should only expect a
>>>>> java.nio.charset.IllegalCharsetNameException if the name itself
>>>>> contains disallowed characters, and both underscore and dash are
>>>>> permitted.
>>>>>
>>>>> The code Charset.isSupported("-UTF-8")
>>>>>
>>>>> should return false, not throw an exception.
>>>>>
>>>>>  
>>>>>     
>>>>>> java.nio.charset.Charset.isSupported(String charsetName) does not
>>>>>> throw IllegalCharsetNameException for spoiled standard charset name
>>>>>> -------------------------------------------------------------------------------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>>
>>>>>>          Key: HARMONY-68
>>>>>>          URL: http://issues.apache.org/jira/browse/HARMONY-68
>>>>>>      Project: Harmony
>>>>>>         Type: Bug
>>>>>>   Components: Classlib
>>>>>>     Reporter: Svetlana Samoilenko
>>>>>>  Attachments: charset_patch.txt
>>>>>>
>>>>>> According to the j2se 1.4.2 specification for
>>>>>> Charset.isSupported(String charsetName), the method must throw
>>>>>> IllegalCharsetNameException "if the given charset name is illegal".
>>>>>> A legal charset name "must begin with either a letter or a digit".
>>>>>> The test listed below shows that no exception is thrown if a "-" or
>>>>>> "_" symbol is inserted before a standard charset name, for example
>>>>>> "-UTF-8" or "_US-ASCII".
>>>>>> Moreover, the method returns "true" in this case.
>>>>>> BEA also does not throw the exception, but returns "false".
>>>>>> Code to reproduce:
>>>>>>
>>>>>> import java.nio.charset.*;
>>>>>>
>>>>>> public class test2 {
>>>>>>     public static void main (String[] args) {
>>>>>>         // string starts neither a letter nor a digit
>>>>>>         boolean sup=false;
>>>>>>         try{
>>>>>>              sup=Charset.isSupported("-UTF-8");
>>>>>>              System.out.println("***BAD. should be exception; sup="+sup);
>>>>>>              sup=Charset.isSupported("_US-ASCII");
>>>>>>              System.out.println("***BAD. should be exception; sup="+sup);
>>>>>>         } catch (IllegalCharsetNameException e) {
>>>>>>             System.out.println("***OK. Expected IllegalCharsetNameException " + e);
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> Steps to Reproduce:
>>>>>> 1. Build Harmony (check-out on 2006-01-30) j2se subset as
>>>>>>    described in README.txt.
>>>>>> 2. Compile test2.java using BEA 1.4 javac
>>>>>>        javac -d . test2.java
>>>>>> 3. Run java using compatible VM (J9)
>>>>>>        java -showversion test2
>>>>>>
>>>>>> Output:
>>>>>>
>>>>>> C:\tmp>C:\jrockit-j2sdk1.4.2_04\bin\java.exe -showversion test2
>>>>>> java version "1.4.2_04"
>>>>>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_04-b05)
>>>>>> BEA WebLogic JRockit(TM) 1.4.2_04 JVM (build ari-31788-20040616-1132-win-ia32, Native Threads, GC strategy: parallel)
>>>>>> ***BAD. should be exception; sup=false
>>>>>> ***BAD. should be exception; sup=false
>>>>>>
>>>>>> C:\tmp>C:\harmony\trunk\deploy\jre\bin\java -showversion test2
>>>>>> (c) Copyright 1991, 2005 The Apache Software Foundation or its licensors, as applicable.
>>>>>> ***BAD. should be exception; sup=true
>>>>>> ***BAD. should be exception; sup=true
>>>>>>
>>>>>> Suggested junit test case:
>>>>>> ------------------------ CharsetTest.java ------------------------
>>>>>> import java.nio.charset.*;
>>>>>> import junit.framework.*;
>>>>>>
>>>>>> public class CharsetTest extends TestCase {
>>>>>>     public static void main(String[] args) {
>>>>>>         junit.textui.TestRunner.run(CharsetTest.class);
>>>>>>     }
>>>>>>
>>>>>>     public void test_isSupported() {
>>>>>>         boolean sup=false;
>>>>>>         // string starts neither a letter nor a digit
>>>>>>         try{
>>>>>>             sup=Charset.isSupported("-UTF-8");
>>>>>>             fail("***BAD. should be exception IllegalCharsetNameException");
>>>>>>         } catch (IllegalCharsetNameException e) {  //expected
>>>>>>         }
>>>>>>         // string starts neither a letter nor a digit
>>>>>>         try{
>>>>>>              sup=Charset.isSupported("_US-ASCII");
>>>>>>              fail("***BAD. should be exception IllegalCharsetNameException");
>>>>>>         } catch (IllegalCharsetNameException e) {  //expected
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>             
>>>>>
>>>>>         
>>>>
>>>
>>>   
>>
>>
> 

-- 

Tim Ellison ([EMAIL PROTECTED])
IBM Java technology centre, UK.
