Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-19 Thread Michael McMahon

Seems fine to me Xuelei.

- Michael

On 19/08/13 06:56, Xuelei Fan wrote:

If no objections, I will push the change by COB Monday.

Thanks,
Xuelei

On 8/13/2013 4:29 PM, Xuelei Fan wrote:

Can I get an additional code review from networking team?

Thanks,
Xuelei

On 8/12/2013 2:07 PM, Weijun Wang wrote:

new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/




Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-18 Thread Xuelei Fan
If no objections, I will push the change by COB Monday.

Thanks,
Xuelei

On 8/13/2013 4:29 PM, Xuelei Fan wrote:
 Can I get an additional code review from networking team?
 
 Thanks,
 Xuelei
 
 On 8/12/2013 2:07 PM, Weijun Wang wrote:
 new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
 



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-15 Thread Mike Duigou
I've been confused through this discussion as to why a trailing dot would be 
regarded as illegal.

Historically a trailing dot has been frequently (though not universally) used 
to denote a fully qualified domain name.

https://en.wikipedia.org/wiki/Fully_qualified_domain_name

Is this use now illegal/unsupported/invalid? Does having a trailing dot 
conflict with other parts of the IDN specification?

Mike

On Aug 13 2013, at 01:29 , Xuelei Fan wrote:

 Can I get an additional code review from networking team?
 
 Thanks,
 Xuelei
 
 On 8/12/2013 2:07 PM, Weijun Wang wrote:
 new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
 



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-15 Thread Xuelei Fan
On Aug 16, 2013, at 1:08, Mike Duigou mike.dui...@oracle.com wrote:

 I've been confused through this discussion as to why a trailing dot would be 
 regarded as illegal.
 
The discussion is too long to find the final decision easily.  An IDN with a
trailing dot should be regarded as a legal IDN, and this update fixes that.
For example, "." and "example.com." are legal IDNs, and IDN.toASCII()
should return the legal name accordingly.

However, per the specification of the Server Name Indication TLS extension, a
hostname should not end with a trailing dot.  So in SNIHostName, we will check
the return value of IDN.toASCII() to filter out hostnames with trailing dots.

This fix makes IDN handle the trailing dot and empty labels correctly.  The
existing SNIHostName code will work as expected once IDN handles the trailing
dot properly.
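
The two-step check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: SniNameCheck and both of its methods are hypothetical names, not JDK classes, and the real SNIHostName code may be organized differently. The only library call used is java.net.IDN.toASCII(String).

```java
import java.net.IDN;

final class SniNameCheck {
    // Hypothetical helper: applies the SNI-side rule to a name that has
    // already been converted to its ASCII form.
    static boolean isAcceptableAscii(String ascii) {
        return !ascii.isEmpty() && !ascii.endsWith(".");
    }

    // Hypothetical helper: convert first, then filter. IDN.toASCII() is
    // documented to throw IllegalArgumentException for input that does
    // not conform to RFC 3490, which is treated here as "not acceptable".
    static boolean isAcceptableHostName(String hostname) {
        try {
            return isAcceptableAscii(IDN.toASCII(hostname));
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAcceptableHostName("example.com"));
        System.out.println(isAcceptableAscii("example.com."));
    }
}
```

The point of the split is that the trailing-dot rule lives on the SNI side only, while IDN itself stays free to accept such names.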

Thanks,
Xuelei

 Historically a trailing dot has been frequently (though not universally) used 
 to denote a fully qualified domain name.
 
 https://en.wikipedia.org/wiki/Fully_qualified_domain_name
 
 Is this use now illegal/unsupported/invalid? Does having a trailing dot 
 conflict with other parts of the IDN specification?
 
 Mike
 
 On Aug 13 2013, at 01:29 , Xuelei Fan wrote:
 
 Can I get an additional code review from networking team?
 
 Thanks,
 Xuelei
 
 On 8/12/2013 2:07 PM, Weijun Wang wrote:
 new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
 


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-15 Thread Matthew Hall
On Thu, Aug 15, 2013 at 10:08:35AM -0700, Mike Duigou wrote:
 I've been confused through this discussion as to why a trailing dot would be 
 regarded as illegal.
 
 Historically a trailing dot has been frequently (though not universally) used 
 to denote a fully qualified domain name.
 
 https://en.wikipedia.org/wiki/Fully_qualified_domain_name
 
 Is this use now illegal/unsupported/invalid? Does having a trailing dot 
 conflict with other parts of the IDN specification?
 
 Mike

This is why some of us were protesting the code which disallowed the trailing 
'.', and eventually the code was changed to allow it to be present.

Matthew.


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-11 Thread Xuelei Fan
new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.05/

Added a new test for illegal hostnames in SNIHostName.

Xuelei

On 8/10/2013 10:49 AM, Xuelei Fan wrote:
 Hi Michael,
 
 It is pretty hard to solve the issue cleanly in SNIHostName.  Here is my
 attempt to explain why we should fix the issue in IDN.
 
 In SNIHostName, the following hostnames are not accepted as valid:
 1. an empty hostname
 2. a hostname that ends with a trailing dot
 3. a hostname that does not comply with RFC 3490.
 
 The process in SNIHostName looks like:
 1. call IDN.toASCII() to convert a string hostname
 2. check that the return value of #1 is a valid hostname (non-empty, no
 trailing dot).
 
 At present, IDN cannot handle the following inputs properly.
 1. returns "com" for "com."
 The trailing dot is swallowed.
 
 2. throws StringIndexOutOfBoundsException for "."
 If "." is a valid IDN that complies with RFC 3490, IDN.toASCII() should
 be able to handle it; otherwise, IDN.toASCII() should throw IAE as the
 specification suggests. However, IDN.toASCII(".") throws
 StringIndexOutOfBoundsException, which does not comply with the
 specification.
 
 3. throws StringIndexOutOfBoundsException for "example...net"
 Same as #2.
 
 We can address #1 and #2 in SNIHostName, but the check would be duplicated,
 as IDN also needs to address the issue. And SNIHostName would have to know
 the IDN separators (".", "\u3002", etc.) in order to check for the
 dot character. That is not good encapsulation, and it involves SNIHostName
 too much in the details of domain names, I think.
 
 It is a little bit hard to address #3 in SNIHostName.
 
 All of the above issues can be easily addressed in IDN.  And once IDN
 addresses these issues, the current SNIHostName is able to handle
 invalid hostnames (empty, trailing dot, etc.) correctly.  We won't need to
 touch SNIHostName any more.
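
To illustrate the encapsulation point about separators: RFC 3490 section 3.1 lists four code points that must be recognized as dots, so a caller that checks for a trailing dot on the raw input would have to know all of them. A minimal sketch, assuming a hypothetical class IdnSeparators (not a JDK class):

```java
final class IdnSeparators {
    // The four code points RFC 3490 section 3.1 requires to be treated as dots:
    // U+002E full stop, U+3002 ideographic full stop,
    // U+FF0E fullwidth full stop, U+FF61 halfwidth ideographic full stop.
    static boolean isLabelSeparator(char c) {
        return c == '.' || c == '\u3002' || c == '\uFF0E' || c == '\uFF61';
    }

    // Hypothetical helper: true if the raw (unconverted) name ends with
    // any of the separators above.
    static boolean endsWithSeparator(String name) {
        return !name.isEmpty() && isLabelSeparator(name.charAt(name.length() - 1));
    }

    public static void main(String[] args) {
        System.out.println(endsWithSeparator("com."));
        System.out.println(endsWithSeparator("com\u3002"));
        System.out.println(endsWithSeparator("com"));
    }
}
```

Checking the ASCII result of the conversion instead would leave only the plain "." to test for, provided the conversion normalizes the other separators.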
 
 Please consider it.
 
 The latest webrev is at:
 http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/
 
 Thanks,
 Xuelei
 
 On 8/10/2013 9:13 AM, Xuelei Fan wrote:
 Hi Michael,

 I plan to address this issue in SNIHostName.  I have filled another two
 the potential bugs in IDN.

 Thank you, and other people, for the feedback.

 Thanks,
 Xuelei

 On 8/9/2013 11:25 PM, Xuelei Fan wrote:
 On 8/9/2013 7:31 PM, Michael McMahon wrote:
 I don't see how this fixes the original problem as the SNIHostName spec
 still doesn't like hostnames with a trailing '.'

 The bug description did not reflect the IDN specification correctly.  If
 com. is a valid IDN, SNIHostName should accept it an an valid
 hostname.  The host name in SNIHostName is nothing more or less than an
 standard IDN.

 I added a comment in the bug: com. and . are valid IDN according the
 IDN and domain name specifications.  I will contact the bug reporter
 about this point.

 Xuelei

 I'd prefer to check first where that requirement is coming from, if it is
 actually necessary, and if not consider removing it from SNIHostName.
 If it is necessary, then the check should be implemented in SNIHostName.

 Michael

 On 09/08/13 05:28, Xuelei Fan wrote:
 Thanks for your feedback and suggestions.

 Here is the new webrev:
 http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/

 . is regarded as valid IDN in this update.

 Thanks,
 Xuelei

 On 8/9/2013 10:50 AM, Xuelei Fan wrote:
 On 8/9/2013 10:14 AM, Weijun Wang wrote:

 On 8/9/13 9:37 AM, Xuelei Fan wrote:
 On 8/9/2013 9:22 AM, Weijun Wang wrote:
 I tried nslookup. Those with .. inside are illegal,

 $ nslookup com..
 nslookup: 'com..' is not a legal name (empty label)

 but

 $ nslookup .
 Server:192.168.10.1
 Address:192.168.10.1#53

 Non-authoritative answer:
 *** Can't find .: No answer

 Thanks for the testing.  The behaviors are the same as this fix now.
 No exactly. It seems nslookup still regards . legal but just cannot
 find an IP for it.

 I'm not sure whether a root domain name can be stand alone.  Root label
 is not considered as a label in IDN.  I think it is safe to regard that
 . is not a valid IDN as it contains no label.  Anyway, it is a corner
 case.

 There are many online IDN conversion web services, some of them can
 convert ., some of the cannot.  In the present implementation, we
 cannot recognize ., and IDN.toASCII(.) throws
 StringIndexOutOfBoundsException.  With this fix, I was wondering IAE is
 a better exception for IDN.toASCII(.).

 Learn something new today to use nslookup.

 Also, since this bug was originally about SNIHostName, do you need to
 add some extra restriction there to reject oracle.com. things?

 No, we cannot restrict the format of IDN in SNIHostName more than in
 IDN. However, we may need to rethink about the comparing of two
 IDN, for
 example, example.com. should equal to example.com.  I want to
 consider it in another bug.
 Not sure. Does the spec say IDN and SNIHostName are equivalent sets?
 And
 it's not one is another's subset?

 Per TLS specification, host name in SNI is an IDN.  The spec of
 SNIHostname 

Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-11 Thread Weijun Wang

   if (q < input.length()) {   // Ah, a dot!
       out.append('.');
   }
   p = q + 1;


Using

  if (q != input.length())

should be even better. The searchDots method clearly specifies "or 
if there is no dots, return the length of input string."
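
A self-contained sketch of the loop shape being discussed, for illustration only. It is not the JDK source: DotLoopSketch and rebuild are hypothetical names, searchDots here handles only the ASCII dot, and the per-label conversion is replaced by copying the label unchanged.

```java
final class DotLoopSketch {
    // Returns the index of the next '.' at or after start, or the length
    // of the string if there is none (the contract quoted above).
    static int searchDots(String s, int start) {
        int i = s.indexOf('.', start);
        return i < 0 ? s.length() : i;
    }

    // Walks the labels and re-emits them; a dot is appended exactly when
    // searchDots stopped on a dot rather than at the end of the input.
    static String rebuild(String input) {
        StringBuilder out = new StringBuilder();
        int p = 0;
        while (p < input.length()) {
            int q = searchDots(input, p);
            out.append(input, p, q);      // stand-in for the per-label conversion
            if (q != input.length()) {    // stopped on a dot
                out.append('.');
            }
            p = q + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rebuild("www.example.com"));
        System.out.println(rebuild("com."));
    }
}
```

With this condition a trailing dot in the input is preserved in the output, since the last call to searchDots stops on that dot and not at the end of the string.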


--Max


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-11 Thread Xuelei Fan
new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/

 Lines 280 and 333: How about we call them steps 8a and 8b?

Step 8 is referring to the steps in RFC 3490.  Let's use step 8.

Thanks,
Xuelei

On 8/12/2013 11:11 AM, Weijun Wang wrote:
 I think the fix is adequate and necessary.
 
 One problem: lines 367-373 add a new IAE to ToUnicode, but that operation
 should never fail.
 
 And some small comments on styles etc.
 
 On 8/12/13 9:09 AM, Xuelei Fan wrote:
 new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.05/
 
 Lines 123 and 185:
 
  184 p = q + 1;
  185 if (p < input.length() || q == (input.length() - 1)) {
  186    // has more labels, or keep the trailing dot as at present
  187    out.append('.');
  188 }
 
 I prefer
 
   if (q < input.length()) {   // Ah, a dot!
       out.append('.');
   }
   p = q + 1;
 
 Lines 282, 335, 270: Insert a blank after if.
 
 Lines 284 and 372: nslookup uses "empty label", which I like better.
 
 Lines 453 and 460: Personally I don't like the parentheses around the whole
 return value, but it's your choice.
 
 Lines 280 and 333: How about we call them steps 8a and 8b?
 

 Added a new test to test illegal hostname in SNIHostName.
 
 Excellent. Otherwise I will be wondering why the fix in IDN could solve
 the problem as described in the bug description.
 
 Thanks
 Max
 

 Xuelei


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-09 Thread Dmitry Samersoff
Xuelei,

 119 p = q + 1;
 120 if (p < input.length() || q == (input.length() - 1)) {

Could be simplified to:

q <= input.length()-1
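
Since p = q + 1, the test "p < len || q == len - 1" covers q < len - 1 together with q == len - 1, which is q <= len - 1. A small brute-force check of that equivalence, as a sketch (DotConditionCheck is a hypothetical name used only here):

```java
final class DotConditionCheck {
    // Condition as written in the webrev excerpt, with p = q + 1.
    static boolean original(int q, int len) {
        int p = q + 1;
        return p < len || q == len - 1;
    }

    // Proposed simplification.
    static boolean simplified(int q, int len) {
        return q <= len - 1;
    }

    // Compares both forms for every q in [0, len] and every len in [0, maxLen].
    static boolean agreeUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int q = 0; q <= len; q++) {
                if (original(q, len) != simplified(q, len)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(64));
    }
}
```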

-Dmitry

On 2013-08-09 04:41, Xuelei Fan wrote:
 Ping.
 
 Thanks,
 Xuelei
 
 On 8/7/2013 11:17 PM, Xuelei Fan wrote:
 Please review the new update:

 http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

 With this update, "com." is valid (returns "com."); "." and
 "example..com" are invalid.  An IAE will be thrown for an invalid IDN.

 Thanks,
 Xuelei

 On 8/7/2013 10:18 PM, Michael McMahon wrote:
 On 07/08/13 15:13, Xuelei Fan wrote:
 On 8/7/2013 10:05 PM, Michael McMahon wrote:
 Resolvers seem to accept queries using trailing dots.

 eg nslookup www.oracle.com.

 or InetAddress.getByName("www.oracle.com.");

 The part of RFC3490 quoted below seems to me to be saying
 that the empty label implied by the trailing dot is not regarded
 as a label so that you don't end up calling toAscii() or toUnicode()
 with an empty string. I don't think it's saying the trailing dot can't
 be there.

 It makes sense.

 What's your preference to return for IDN.toASCII(www.oracle.com.),
 www.oracle.com. or www.oracle.com? The current returned value is
 www.oracle.com.  I would like to preserve the behavior in this update.

 My opinion is to keep it as at present ie. www.oracle.com.

 Michael

 I think we will be on the same page soon.

 Thanks,
 Xuelei

 Michael

 On 07/08/13 13:44, Xuelei Fan wrote:
 On 8/7/2013 12:06 AM, Matthew Hall wrote:
 Trailing dots are allowed in plain DNS (thus almost surely in IDN),
 and the single dot represents the root zone. So you have to be
 careful making this sort of change to check the DNS RFCs first.
 That's the first question we need to answer: does IDN allow trailing
 dots ("com."), a zero-length root label ("."), and zero-length labels ("",
 for example "example..com")?

 Per the specification of IDN.toASCII():
 ===
 ToASCII operation can fail. ToASCII fails if any step of it fails. If
 ToASCII operation fails, an IllegalArgumentException will be thrown. In
 this case, the input string should not be used in an internationalized
 domain name.

 A label is an individual part of a domain name. The original ToASCII
 operation, as defined in RFC 3490, only operates on a single label.
 This
 method can handle both label and entire domain name, by assuming that
 labels in a domain name are always separated by dots. ...

 Throws IllegalArgumentException - if the input string doesn't
 conform to
 RFC 3490 specification

 Per the specification of RFC 3490:
 ==
 [section 2]
 A label is an individual part of a domain name.  Labels are usually
 shown separated by dots; for example, the domain name
 "www.example.com" is composed of three labels: "www", "example", and
 "com".  (The zero-length root label described in [STD13], which can
 be explicit as in "www.example.com." or implicit as in
 "www.example.com", is not considered a label in this specification.)

 An internationalized label is a label to which the ToASCII
operation (see section 4) can be applied without failing (with the
UseSTD3ASCIIRules flag unset).  ...
Although most Unicode characters can appear in
internationalized labels, ToASCII will fail for some input strings,
and such strings are not valid internationalized labels.

 An internationalized domain name (IDN) is a domain name in which
every label is an internationalized label.

 [Section 4.1]
 ToASCII consists of the following steps:

...

8. Verify that the number of code points is in the range 1 to 63
 inclusive.
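
A rough sketch of the step 8 length check on its own, for illustration. It assumes hypothetical names (LabelLengthSketch and its methods), applies the check directly to the given string and skips every other ToASCII step, and treats every ASCII dot as a separator, so it deliberately does not settle how a trailing dot should be handled.

```java
final class LabelLengthSketch {
    // Step 8 only: the label must contain 1 to 63 code points inclusive.
    static boolean labelLengthOk(String label) {
        int n = label.codePointCount(0, label.length());
        return n >= 1 && n <= 63;
    }

    // Splits on '.' and keeps empty pieces (limit -1), so that
    // "example..com" yields an empty middle label.
    static boolean everyLabelOk(String name) {
        for (String label : name.split("\\.", -1)) {
            if (!labelLengthOk(label)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(everyLabelOk("www.example.com"));
        System.out.println(everyLabelOk("example..com"));
    }
}
```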


 Here are the questions:
 1. Is "example..com" a valid IDN?
  As dots are used as label separators, there are three labels:
 "example", "", and "com".  Per RFC 3490, "" is not a valid label. Hence,
 "example..com" is not a valid IDN.

  We need to address the issue in IDN.

 2. Is "xyz." a valid IDN?
  It's a gray area, I think. We can treat the trailing "." as the root
 label, or as a label separator.
  If the trailing "." is treated as a label separator, "xyz." is invalid
 per RFC 3490.
  If the trailing "." is treated as the root label, what's the expected
 return value of IDN.toASCII("xyz.")?  I think the return value can be
 either "xyz." or "xyz".  The current implementation returns "xyz".

  We may not need to update the implementation if the trailing "." is
 treated as the root label.

 3. Is "." a valid IDN?
  It's a gray area again, I think.
  As above, if the trailing "." is treated as the root label, I think the
 return value can be either "." or "".  The current implementation throws
 a StringIndexOutOfBoundsException.

  However, what does an empty domain name ("") really mean?  I would prefer
 to return "." for "." instead.

  We need to address the issue in IDN.


 Here is the proposed solution: IDN.toASCII() returns
 1. "." for ".";
 2. "xyz" for "xyz.";
 3. IAE for "example..com".

 Does it make sense?

 Thanks,
 Xuelei


 On 8/7/2013 

Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-09 Thread Xuelei Fan
On Aug 9, 2013, at 14:08, Dmitry Samersoff dmitry.samers...@oracle.com wrote:

 Xuelei,
 
 119 p = q + 1;
 120 if (p < input.length() || q == (input.length() - 1)) {
 
 Could be simplified to:
 
 q <= input.length()-1
 
It's cool!

Xuelei

 -Dmitry
 

Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-09 Thread Michael McMahon

I don't see how this fixes the original problem as the SNIHostName spec
still doesn't like hostnames with a trailing '.'

I'd prefer to check first where that requirement is coming from, if it is
actually necessary, and if not consider removing it from SNIHostName.
If it is necessary, then the check should be implemented in SNIHostName.

Michael

On 09/08/13 05:28, Xuelei Fan wrote:

Thanks for your feedback and suggestions.

Here is the new webrev:
http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/

"." is regarded as a valid IDN in this update.

Thanks,
Xuelei

On 8/9/2013 10:50 AM, Xuelei Fan wrote:

On 8/9/2013 10:14 AM, Weijun Wang wrote:


On 8/9/13 9:37 AM, Xuelei Fan wrote:

On 8/9/2013 9:22 AM, Weijun Wang wrote:

I tried nslookup. Those with ".." inside are illegal,

$ nslookup com..
nslookup: 'com..' is not a legal name (empty label)

but

$ nslookup .
Server:192.168.10.1
Address:192.168.10.1#53

Non-authoritative answer:
*** Can't find .: No answer


Thanks for the testing.  The behaviors are the same as this fix now.

Not exactly. It seems nslookup still regards "." as legal but just cannot
find an IP for it.


I'm not sure whether a root domain name can stand alone.  The root label
is not considered a label in IDN.  I think it is safe to regard
"." as not a valid IDN, as it contains no label.  Anyway, it is a corner
case.

There are many online IDN conversion web services; some of them can
convert ".", some of them cannot.  In the present implementation, we
cannot recognize ".", and IDN.toASCII(".") throws
StringIndexOutOfBoundsException.  With this fix, I was wondering whether
IAE is a better exception for IDN.toASCII(".").


Learned something new today: how to use nslookup.


Also, since this bug was originally about SNIHostName, do you need to
add some extra restriction there to reject "oracle.com." things?


No, we cannot restrict the format of IDN in SNIHostName more than in
IDN. However, we may need to rethink the comparison of two IDNs; for
example, "example.com." should equal "example.com".  I want to
consider it in another bug.

Not sure. Does the spec say IDN and SNIHostName are equivalent sets? And
isn't one a subset of the other?


Per the TLS specification, the host name in SNI is an IDN.  The spec of
SNIHostName says, "hostname is not a valid Internationalized Domain Name
(IDN) compliant with the RFC 3490 specification." The spec in
SNIHostName has the same meaning as IDN.  I don't want to add additional
restrictions beyond the specification of an IDN.

Xuelei


Can I push the changeset?

I think it's better to ask someone in the networking team to make the
suggestion. From what I read from Michael in this thread, he does not seem
to totally agree with your code changes (at least not the 00 version).

Thanks
Max


Thanks,
Xuelei


Thanks
Max

On 8/9/13 8:41 AM, Xuelei Fan wrote:

Ping.

Thanks,
Xuelei

On 8/7/2013 11:17 PM, Xuelei Fan wrote:

Please review the new update:

http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

With this update, "com." is valid (returns "com."); "." and
"example..com" are invalid.  An IAE will be thrown for an invalid IDN.

Thanks,
Xuelei





Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-09 Thread Xuelei Fan
On 8/9/2013 7:31 PM, Michael McMahon wrote:
 I don't see how this fixes the original problem as the SNIHostName spec
 still doesn't like hostnames with a trailing '.'
 
The bug description did not reflect the IDN specification correctly.  If
"com." is a valid IDN, SNIHostName should accept it as a valid
hostname.  The host name in SNIHostName is nothing more or less than a
standard IDN.

I added a comment in the bug: "com." and "." are valid IDNs according to the
IDN and domain name specifications.  I will contact the bug reporter
about this point.

Xuelei

 I'd prefer to check first where that requirement is coming from, if it is
 actually necessary, and if not consider removing it from SNIHostName.
 If it is necessary, then the check should be implemented in SNIHostName.
 
 Michael
 
 



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-09 Thread Xuelei Fan
Hi Michael,

I plan to address this issue in SNIHostName.  I have filed another two
potential bugs in IDN.

Thank you, and other people, for the feedback.

Thanks,
Xuelei

On 8/9/2013 11:25 PM, Xuelei Fan wrote:
 On 8/9/2013 7:31 PM, Michael McMahon wrote:
 I don't see how this fixes the original problem as the SNIHostName spec
 still doesn't like hostnames with a trailing '.'

 The bug description did not reflect the IDN specification correctly.  If
 com. is a valid IDN, SNIHostName should accept it an an valid
 hostname.  The host name in SNIHostName is nothing more or less than an
 standard IDN.
 
 I added a comment in the bug: com. and . are valid IDN according the
 IDN and domain name specifications.  I will contact the bug reporter
 about this point.
 
 Xuelei
 
 I'd prefer to check first where that requirement is coming from, if it is
 actually necessary, and if not consider removing it from SNIHostName.
 If it is necessary, then the check should be implemented in SNIHostName.

 Michael

 On 09/08/13 05:28, Xuelei Fan wrote:
 Thanks for your feedback and suggestions.

 Here is the new webrev:
 http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/

 . is regarded as valid IDN in this update.

 Thanks,
 Xuelei

 On 8/9/2013 10:50 AM, Xuelei Fan wrote:
 On 8/9/2013 10:14 AM, Weijun Wang wrote:

 On 8/9/13 9:37 AM, Xuelei Fan wrote:
 On 8/9/2013 9:22 AM, Weijun Wang wrote:
 I tried nslookup. Those with .. inside are illegal,

 $ nslookup com..
 nslookup: 'com..' is not a legal name (empty label)

 but

 $ nslookup .
 Server:192.168.10.1
 Address:192.168.10.1#53

 Non-authoritative answer:
 *** Can't find .: No answer

 Thanks for the testing.  The behaviors are the same as this fix now.
 No exactly. It seems nslookup still regards . legal but just cannot
 find an IP for it.

 I'm not sure whether a root domain name can be stand alone.  Root label
 is not considered as a label in IDN.  I think it is safe to regard that
 . is not a valid IDN as it contains no label.  Anyway, it is a corner
 case.

 There are many online IDN conversion web services, some of them can
 convert ., some of the cannot.  In the present implementation, we
 cannot recognize ., and IDN.toASCII(.) throws
 StringIndexOutOfBoundsException.  With this fix, I was wondering IAE is
 a better exception for IDN.toASCII(.).

 Learn something new today to use nslookup.

 Also, since this bug was originally about SNIHostName, do you need to
 add some extra restriction there to reject oracle.com. things?

 No, we cannot restrict the format of IDN in SNIHostName more than in
 IDN. However, we may need to rethink about the comparing of two
 IDN, for
 example, example.com. should equal to example.com.  I want to
 consider it in another bug.
 Not sure. Does the spec say IDN and SNIHostName are equivalent sets?
 And
 it's not one is another's subset?

 Per TLS specification, host name in SNI is an IDN.  The spec of
 SNIHostname says, hostname is not a valid Internationalized Domain Name
 (IDN) compliant with the RFC 3490 specification. The spec in
 SNIHostName has the same means as IDN.  I won't want to add additional
 restrict beyond the specification of an IDN.

 Xuelei

 Can I push the changeset?
 I think it's better to ask someone in the networking team to make the
 suggestion. From what I read Michael in this thread, he does not seem
 totally agreed with your code changes (at least not the 00 version).

 Thanks
 Max

 Thanks,
 Xuelei

 Thanks
 Max

 On 8/9/13 8:41 AM, Xuelei Fan wrote:
 Ping.

 Thanks,
 Xuelei

 On 8/7/2013 11:17 PM, Xuelei Fan wrote:
 Please review the new update:

 http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

 With this update, com. is valid (return com.); . and
 example..com are invalid.  And IAE will be thrown for invalid
 IDN.

 Thanks,
 Xuelei


 



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-09 Thread Xuelei Fan
Hi Michael,

It is pretty hard to solve the issue cleanly in SNIHostName.  Here is my
attempt to explain why we should fix the issue in IDN.

In SNIHostName, the following hostnames are not accepted as valid:
1. an empty hostname
2. a hostname that ends with a trailing dot
3. a hostname that does not comply with RFC 3490.

The process in SNIHostName looks like:
1. call IDN.toASCII() to convert a string hostname
2. check that the return value of #1 is a valid hostname (non-empty,
no trailing dot).
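
The two steps above can be sketched roughly as follows. This is only an illustration, not the actual SNIHostName code: the class and method names (SniHostCheckSketch, isAcceptable) are made up for this example, and runtime exceptions are caught broadly because, as discussed in this thread, IDN.toASCII() may throw StringIndexOutOfBoundsException instead of IAE for some inputs.

```java
import java.net.IDN;

// Hypothetical helper, not part of the JDK: mirrors the two steps listed above.
final class SniHostCheckSketch {

    static boolean isAcceptable(String hostname) {
        if (hostname == null || hostname.isEmpty()) {
            return false;                      // empty hostname
        }
        String ascii;
        try {
            // Step 1: convert the hostname with IDN.toASCII().
            ascii = IDN.toASCII(hostname, IDN.USE_STD3_ASCII_RULES);
        } catch (RuntimeException e) {
            // IAE is the documented failure; any other runtime exception is
            // treated the same way here (an assumption of this sketch).
            return false;
        }
        // Step 2: the converted value must be non-empty and must not end with a dot.
        return !ascii.isEmpty() && !ascii.endsWith(".");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"www.example.com", "", ".", "example..com"}) {
            System.out.println("\"" + s + "\" acceptable: " + isAcceptable(s));
        }
    }
}
```

Whether a name such as "com." passes this check depends on whether IDN.toASCII() keeps or swallows the trailing dot, which is exactly the point under discussion.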

At present, IDN cannot handle the following inputs properly.
1. returns "com" for "com."
   the trailing dot is swallowed.

2. throws StringIndexOutOfBoundsException for "."
If "." is a valid IDN that complies with RFC 3490, IDN.toASCII() should
be able to handle it; otherwise, IDN.toASCII() should throw IAE as the
specification suggests. However, IDN.toASCII(".") throws
StringIndexOutOfBoundsException, which does not comply with the
specification.

3. throws StringIndexOutOfBoundsException for "example...net"
   Same as #2.

We can address #1 and #2 in SNIHostName, but the checking would be
duplicated, as IDN also needs to address the issue. And SNIHostName would
have to know what the IDN separators are (".", "\u3002", etc.) in order to
check for the dot character. That is not good encapsulation, and it involves
SNIHostName too much in the details of domain names, I think.
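
To illustrate the encapsulation concern: a trailing-dot check outside IDN would need its own copy of the separator list. The sketch below assumes the four dot code points that RFC 3490 (section 3.1) requires to be recognized; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of the JDK.
final class LabelSeparatorSketch {

    // The four code points RFC 3490 (section 3.1) treats as dots:
    // U+002E, U+3002, U+FF0E and U+FF61.
    static boolean isLabelSeparator(char c) {
        return c == '.' || c == '\u3002' || c == '\uFF0E' || c == '\uFF61';
    }

    // True if the name ends with any of the separators above.
    static boolean endsWithLabelSeparator(String name) {
        return !name.isEmpty() && isLabelSeparator(name.charAt(name.length() - 1));
    }
}
```

Checking only for '.' would miss a name that ends with an ideographic full stop, which is why this knowledge arguably belongs in IDN.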

It is a little bit hard to address #3 in SNIHostName.

All of the above issues can be easily addressed in IDN.  And once IDN
addresses these issues, the current SNIHostName is able to handle
invalid hostnames (empty, trailing dot, etc.) correctly.  We won't need to
touch SNIHostName any more.

Please consider it.

The latest webrev is at:
http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/

Thanks,
Xuelei

On 8/10/2013 9:13 AM, Xuelei Fan wrote:
 Hi Michael,
 
 I plan to address this issue in SNIHostName.  I have filled another two
 the potential bugs in IDN.
 
 Thank you, and other people, for the feedback.
 
 Thanks,
 Xuelei
 
 On 8/9/2013 11:25 PM, Xuelei Fan wrote:
 On 8/9/2013 7:31 PM, Michael McMahon wrote:
 I don't see how this fixes the original problem as the SNIHostName spec
 still doesn't like hostnames with a trailing '.'

 The bug description did not reflect the IDN specification correctly.  If
 com. is a valid IDN, SNIHostName should accept it an an valid
 hostname.  The host name in SNIHostName is nothing more or less than an
 standard IDN.

 I added a comment in the bug: com. and . are valid IDN according the
 IDN and domain name specifications.  I will contact the bug reporter
 about this point.

 Xuelei

 I'd prefer to check first where that requirement is coming from, if it is
 actually necessary, and if not consider removing it from SNIHostName.
 If it is necessary, then the check should be implemented in SNIHostName.

 Michael

 On 09/08/13 05:28, Xuelei Fan wrote:
 Thanks for your feedback and suggestions.

 Here is the new webrev:
 http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/

 . is regarded as valid IDN in this update.

 Thanks,
 Xuelei

 On 8/9/2013 10:50 AM, Xuelei Fan wrote:
 On 8/9/2013 10:14 AM, Weijun Wang wrote:

 On 8/9/13 9:37 AM, Xuelei Fan wrote:
 On 8/9/2013 9:22 AM, Weijun Wang wrote:
 I tried nslookup. Those with .. inside are illegal,

 $ nslookup com..
 nslookup: 'com..' is not a legal name (empty label)

 but

 $ nslookup .
 Server:192.168.10.1
 Address:192.168.10.1#53

 Non-authoritative answer:
 *** Can't find .: No answer

 Thanks for the testing.  The behaviors are the same as this fix now.
 No exactly. It seems nslookup still regards . legal but just cannot
 find an IP for it.

 I'm not sure whether a root domain name can be stand alone.  Root label
 is not considered as a label in IDN.  I think it is safe to regard that
 . is not a valid IDN as it contains no label.  Anyway, it is a corner
 case.

 There are many online IDN conversion web services, some of them can
 convert ., some of the cannot.  In the present implementation, we
 cannot recognize ., and IDN.toASCII(.) throws
 StringIndexOutOfBoundsException.  With this fix, I was wondering IAE is
 a better exception for IDN.toASCII(.).

 Learn something new today to use nslookup.

 Also, since this bug was originally about SNIHostName, do you need to
 add some extra restriction there to reject oracle.com. things?

 No, we cannot restrict the format of IDN in SNIHostName more than in
 IDN. However, we may need to rethink about the comparing of two
 IDN, for
 example, example.com. should equal to example.com.  I want to
 consider it in another bug.
 Not sure. Does the spec say IDN and SNIHostName are equivalent sets?
 And
 it's not one is another's subset?

 Per TLS specification, host name in SNI is an IDN.  The spec of
 SNIHostname says, hostname is not a valid Internationalized Domain Name
 (IDN) compliant with the RFC 3490 specification. The spec in
 SNIHostName has the same means as IDN.  I won't want to add additional
 restrict beyond the 

Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-08 Thread Xuelei Fan
Ping.

Thanks,
Xuelei

On 8/7/2013 11:17 PM, Xuelei Fan wrote:
 Please review the new update:
 
 http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
 
 With this update, com. is valid (return com.); . and
 example..com are invalid.  And IAE will be thrown for invalid IDN.
 
 Thanks,
 Xuelei
 
 On 8/7/2013 10:18 PM, Michael McMahon wrote:
 On 07/08/13 15:13, Xuelei Fan wrote:
 On 8/7/2013 10:05 PM, Michael McMahon wrote:
 Resolvers seem to accept queries using trailing dots.

 eg nslookup www.oracle.com.

 or InetAddress.getByName(www.oracle.com.);

 The part of RFC3490 quoted below seems to me to be saying
 that the empty label implied by the trailing dot is not regarded
 as a label so that you don't end up calling toAscii() or toUnicode()
 with an empty string. I don't think it's saying the trailing dot can't
 be there.

 It makes sense.

 What's your preference to return for IDN.toASCII(www.oracle.com.),
 www.oracle.com. or www.oracle.com? The current returned value is
 www.oracle.com.  I would like to reserve the behavior in this update.

 My opinion is to keep it as at present ie. www.oracle.com.

 Michael

 I think we are on same page soon.

 Thanks,
 Xuelei

 Michael

 On 07/08/13 13:44, Xuelei Fan wrote:
 On 8/7/2013 12:06 AM, Matthew Hall wrote:
 Trailing dots are allowed in plain DNS (thus almost surely in IDN),
 and the single dot represents the root zone. So you have to be
 careful making this sort of change to check the DNS RFCs first.
 That's the first question we need to answer, whether IDN allow tailling
 dots (com.), zero-length root label (.), and zero-length label (,
 for example example..com)?

 Per the specification of IDN.toASCII():
 ===
 ToASCII operation can fail. ToASCII fails if any step of it fails. If
 ToASCII operation fails, an IllegalArgumentException will be thrown. In
 this case, the input string should not be used in an internationalized
 domain name.

 A label is an individual part of a domain name. The original ToASCII
 operation, as defined in RFC 3490, only operates on a single label.
 This
 method can handle both label and entire domain name, by assuming that
 labels in a domain name are always separated by dots. ...

 Throws IllegalArgumentException - if the input string doesn't
 conform to
 RFC 3490 specification

 Per the specification of RFC 3490:
 ==
 [section 2]
 A label is an individual part of a domain name.  Labels are usually
shown separated by dots; for example, the domain name
www.example.com is composed of three labels: www, example, and
com.  (The zero-length root label described in [STD13], which can
be explicit as in www.example.com. or implicit as in
www.example.com, is not considered a label in this
 specification.)

 An internationalized label is a label to which the ToASCII
operation (see section 4) can be applied without failing (with the
UseSTD3ASCIIRules flag unset).  ...
Although most Unicode characters can appear in
internationalized labels, ToASCII will fail for some input strings,
and such strings are not valid internationalized labels.

 An internationalized domain name (IDN) is a domain name in which
every label is an internationalized label.

 [Section 4.1]
 ToASCII consists of the following steps:

...

8. Verify that the number of code points is in the range 1 to 63
 inclusive.


 Here are the questions:
 1. whether example..com is an valid IDN?
  As dot is used as label separators, there are three labels,
 example, , com.  Per RFC 3490,  is not a valid label. Hence,
 example..com is not a valid IDN.

  We need to address the issue in IDN.

 2. whether xyz. is an valid IDN?
  It's an gray area, I think. We can treat the trailing . as root
 label, or a label separator.
  If the trailing . is treated as label separator, xyz. is
 invalid
 per RFC 3490.
  if the trailing . is treated as root label, what's the expected
 return value of IDN.toASCII(xyz.)?  I think the return value can be
 either xyz. or xyz.  The current implementation returns xyz.

  We may need not to update the implementation if tailing . is
 treated as root label.

 3. whether . is an valid IDN?
  It's an gray area again, I think.
  As above, if the trailing . is treated as root label, I think
 the
 return value can be either . or .  The current implementation
 throws
 a StringIndexOutOfBoundsException.

  However, what empty domain name () really means?  I would
 prefer to
 return . for . instead.

  We need to address the issue in IDN.


 Here comes the solution, the IDN.toASCII() returns:
 1. . for .;
 2. xyz for xyz.;
 3. IAE for example..com.

 Does it make sense?

 Thanks,
 Xuelei


 On 8/7/2013 1:35 AM, Michael McMahon wrote:
 I don't really understand the reason for the restriction in
 SNIHostName
 But, I guess that is where it should be enforced if it is required.

 Michael.

 On 06/08/13 17:43, 

Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-08 Thread Weijun Wang

I tried nslookup. Names with ".." inside are illegal,

$ nslookup com..
nslookup: 'com..' is not a legal name (empty label)

but

$ nslookup .
Server: 192.168.10.1
Address:192.168.10.1#53

Non-authoritative answer:
*** Can't find .: No answer

Also, since this bug was originally about SNIHostName, do you need to 
add some extra restriction there to reject names like "oracle.com."?


Thanks
Max

On 8/9/13 8:41 AM, Xuelei Fan wrote:

Ping.

Thanks,
Xuelei

On 8/7/2013 11:17 PM, Xuelei Fan wrote:

Please review the new update:

http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

With this update, com. is valid (return com.); . and
example..com are invalid.  And IAE will be thrown for invalid IDN.

Thanks,
Xuelei



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-08 Thread Xuelei Fan
On 8/9/2013 9:22 AM, Weijun Wang wrote:
 I tried nslookup. Those with .. inside are illegal,
 
 $ nslookup com..
 nslookup: 'com..' is not a legal name (empty label)
 
 but
 
 $ nslookup .
 Server:192.168.10.1
 Address:192.168.10.1#53
 
 Non-authoritative answer:
 *** Can't find .: No answer
 
Thanks for the testing.  The behavior of this fix is now the same.

Learned something new today: how to use nslookup.

 Also, since this bug was originally about SNIHostName, do you need to
 add some extra restriction there to reject oracle.com. things?
 
No, we cannot restrict the format of IDN in SNIHostName more than in
IDN. However, we may need to rethink how two IDNs are compared; for
example, "example.com." should be equal to "example.com".  I want to
consider that in another bug.
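
One possible shape for such a comparison, purely as a sketch: it assumes both names are already in ASCII form (for example, the output of IDN.toASCII()), drops a single explicit root dot, and compares case-insensitively. The class and method names are hypothetical, and nothing in this thread says the eventual fix works this way.

```java
import java.util.Locale;

// Hypothetical helper, not part of the JDK.
final class HostCompareSketch {

    // Drop one trailing "." (the explicit root label) and lower-case the rest.
    // A lone "." is left untouched so the root name stays distinct from "".
    static String normalize(String asciiName) {
        String s = asciiName;
        if (s.length() > 1 && s.endsWith(".")) {
            s = s.substring(0, s.length() - 1);
        }
        return s.toLowerCase(Locale.ROOT);
    }

    static boolean sameHost(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```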

Can I push the changeset?

Thanks,
Xuelei

 Thanks
 Max
 
 On 8/9/13 8:41 AM, Xuelei Fan wrote:
 Ping.

 Thanks,
 Xuelei

 On 8/7/2013 11:17 PM, Xuelei Fan wrote:
 Please review the new update:

 http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

 With this update, com. is valid (return com.); . and
 example..com are invalid.  And IAE will be thrown for invalid IDN.

 Thanks,
 Xuelei




Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-08 Thread Weijun Wang



On 8/9/13 9:37 AM, Xuelei Fan wrote:

On 8/9/2013 9:22 AM, Weijun Wang wrote:

I tried nslookup. Those with .. inside are illegal,

$ nslookup com..
nslookup: 'com..' is not a legal name (empty label)

but

$ nslookup .
Server:192.168.10.1
Address:192.168.10.1#53

Non-authoritative answer:
*** Can't find .: No answer


Thanks for the testing.  The behaviors are the same as this fix now.


Not exactly. It seems nslookup still regards "." as legal but just cannot 
find an IP for it.




Learn something new today to use nslookup.


Also, since this bug was originally about SNIHostName, do you need to
add some extra restriction there to reject oracle.com. things?


No, we cannot restrict the format of IDN in SNIHostName more than in
IDN. However, we may need to rethink about the comparing of two IDN, for
example, example.com. should equal to example.com.  I want to
consider it in another bug.


Not sure. Does the spec say IDN and SNIHostName are equivalent sets? Or 
is one a subset of the other?




Can I push the changeset?


I think it's better to ask someone in the networking team to make the 
suggestion. From what I read of Michael's replies in this thread, he does 
not seem to fully agree with your code changes (at least not the 00 version).


Thanks
Max



Thanks,
Xuelei


Thanks
Max

On 8/9/13 8:41 AM, Xuelei Fan wrote:

Ping.

Thanks,
Xuelei

On 8/7/2013 11:17 PM, Xuelei Fan wrote:

Please review the new update:

http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

With this update, com. is valid (return com.); . and
example..com are invalid.  And IAE will be thrown for invalid IDN.

Thanks,
Xuelei





Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-08 Thread Xuelei Fan
On 8/9/2013 10:14 AM, Weijun Wang wrote:
 
 
 On 8/9/13 9:37 AM, Xuelei Fan wrote:
 On 8/9/2013 9:22 AM, Weijun Wang wrote:
 I tried nslookup. Those with .. inside are illegal,

 $ nslookup com..
 nslookup: 'com..' is not a legal name (empty label)

 but

 $ nslookup .
 Server:192.168.10.1
 Address:192.168.10.1#53

 Non-authoritative answer:
 *** Can't find .: No answer

 Thanks for the testing.  The behaviors are the same as this fix now.
 
 No exactly. It seems nslookup still regards . legal but just cannot
 find an IP for it.
 
I'm not sure whether a root domain name can stand alone.  The root label
is not considered a label in IDN.  I think it is safe to regard
"." as not a valid IDN because it contains no label.  Anyway, it is a corner
case.

There are many online IDN conversion web services; some of them can
convert ".", some of them cannot.  In the present implementation, we
cannot recognize ".", and IDN.toASCII(".") throws
StringIndexOutOfBoundsException.  With this fix, I was wondering whether
IAE is a better exception for IDN.toASCII(".").
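
For callers that cannot wait for the fix, the same idea can be approximated outside IDN. The wrapper below is a hypothetical sketch, not the proposed change: it turns an index exception escaping from IDN.toASCII() into the IAE that the IDN specification documents.

```java
import java.net.IDN;

// Hypothetical wrapper, not part of the JDK and not the webrev change.
final class StrictToAsciiSketch {

    static String toAsciiOrIae(String name) {
        try {
            return IDN.toASCII(name, IDN.USE_STD3_ASCII_RULES);
        } catch (IndexOutOfBoundsException e) {
            // StringIndexOutOfBoundsException is a subclass; report it as the
            // documented failure type instead.
            throw new IllegalArgumentException("Not a valid IDN: " + name, e);
        }
    }
}
```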


 Learn something new today to use nslookup.

 Also, since this bug was originally about SNIHostName, do you need to
 add some extra restriction there to reject oracle.com. things?

 No, we cannot restrict the format of IDN in SNIHostName more than in
 IDN. However, we may need to rethink about the comparing of two IDN, for
 example, example.com. should equal to example.com.  I want to
 consider it in another bug.
 
 Not sure. Does the spec say IDN and SNIHostName are equivalent sets? And
 it's not one is another's subset?
 
Per the TLS specification, the host name in SNI is an IDN.  The spec of
SNIHostName rejects a hostname that "is not a valid Internationalized Domain
Name (IDN) compliant with the RFC 3490 specification". The spec of
SNIHostName means the same thing as IDN.  I don't want to add
restrictions beyond the specification of an IDN.

Xuelei


 Can I push the changeset?
 
 I think it's better to ask someone in the networking team to make the
 suggestion. From what I read Michael in this thread, he does not seem
 totally agreed with your code changes (at least not the 00 version).
 
 Thanks
 Max
 

 Thanks,
 Xuelei

 Thanks
 Max

 On 8/9/13 8:41 AM, Xuelei Fan wrote:
 Ping.

 Thanks,
 Xuelei

 On 8/7/2013 11:17 PM, Xuelei Fan wrote:
 Please review the new update:

 http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

 With this update, com. is valid (return com.); . and
 example..com are invalid.  And IAE will be thrown for invalid IDN.

 Thanks,
 Xuelei





Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-08 Thread Matthew Hall
But DNS considers "." the valid root zone...
-- 
Sent from my mobile device.

Xuelei Fan xuelei@oracle.com wrote:
On 8/9/2013 10:14 AM, Weijun Wang wrote:
 
 
 On 8/9/13 9:37 AM, Xuelei Fan wrote:
 On 8/9/2013 9:22 AM, Weijun Wang wrote:
 I tried nslookup. Those with .. inside are illegal,

 $ nslookup com..
 nslookup: 'com..' is not a legal name (empty label)

 but

 $ nslookup .
 Server:192.168.10.1
 Address:192.168.10.1#53

 Non-authoritative answer:
 *** Can't find .: No answer

 Thanks for the testing.  The behaviors are the same as this fix now.
 
 No exactly. It seems nslookup still regards . legal but just cannot
 find an IP for it.
 
I'm not sure whether a root domain name can be stand alone.  Root label
is not considered as a label in IDN.  I think it is safe to regard that
. is not a valid IDN as it contains no label.  Anyway, it is a corner
case.

There are many online IDN conversion web services, some of them can
convert ., some of the cannot.  In the present implementation, we
cannot recognize ., and IDN.toASCII(.) throws
StringIndexOutOfBoundsException.  With this fix, I was wondering IAE is
a better exception for IDN.toASCII(.).


 Learn something new today to use nslookup.

 Also, since this bug was originally about SNIHostName, do you need
to
 add some extra restriction there to reject oracle.com. things?

 No, we cannot restrict the format of IDN in SNIHostName more than in
 IDN. However, we may need to rethink about the comparing of two IDN,
for
 example, example.com. should equal to example.com.  I want to
 consider it in another bug.
 
 Not sure. Does the spec say IDN and SNIHostName are equivalent sets?
And
 it's not one is another's subset?
 
Per TLS specification, host name in SNI is an IDN.  The spec of
SNIHostname says, hostname is not a valid Internationalized Domain
Name
(IDN) compliant with the RFC 3490 specification. The spec in
SNIHostName has the same means as IDN.  I won't want to add additional
restrict beyond the specification of an IDN.

Xuelei


 Can I push the changeset?
 
 I think it's better to ask someone in the networking team to make the
 suggestion. From what I read Michael in this thread, he does not seem
 totally agreed with your code changes (at least not the 00 version).
 
 Thanks
 Max
 

 Thanks,
 Xuelei

 Thanks
 Max

 On 8/9/13 8:41 AM, Xuelei Fan wrote:
 Ping.

 Thanks,
 Xuelei

 On 8/7/2013 11:17 PM, Xuelei Fan wrote:
 Please review the new update:

 http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

 With this update, com. is valid (return com.); . and
 example..com are invalid.  And IAE will be thrown for invalid
IDN.

 Thanks,
 Xuelei





Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-08 Thread Xuelei Fan
On 8/9/2013 11:24 AM, Matthew Hall wrote:
 But, DNS considers . as the valid root zone...
 
Good! It looks like IDN.toASCII(".") should return ".", so that a
general domain name can always go through the IDN.toASCII() conversion
without a runtime exception being thrown.

Xuelei


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-08 Thread Xuelei Fan
Thanks for your feedback and suggestions.

Here is the new webrev:
   http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/

"." is regarded as a valid IDN in this update.

Thanks,
Xuelei

On 8/9/2013 10:50 AM, Xuelei Fan wrote:
 On 8/9/2013 10:14 AM, Weijun Wang wrote:


 On 8/9/13 9:37 AM, Xuelei Fan wrote:
 On 8/9/2013 9:22 AM, Weijun Wang wrote:
 I tried nslookup. Those with .. inside are illegal,

 $ nslookup com..
 nslookup: 'com..' is not a legal name (empty label)

 but

 $ nslookup .
 Server:192.168.10.1
 Address:192.168.10.1#53

 Non-authoritative answer:
 *** Can't find .: No answer

 Thanks for the testing.  The behaviors are the same as this fix now.

 No exactly. It seems nslookup still regards . legal but just cannot
 find an IP for it.

 I'm not sure whether a root domain name can be stand alone.  Root label
 is not considered as a label in IDN.  I think it is safe to regard that
 . is not a valid IDN as it contains no label.  Anyway, it is a corner
 case.
 
 There are many online IDN conversion web services, some of them can
 convert ., some of the cannot.  In the present implementation, we
 cannot recognize ., and IDN.toASCII(.) throws
 StringIndexOutOfBoundsException.  With this fix, I was wondering IAE is
 a better exception for IDN.toASCII(.).
 

 Learn something new today to use nslookup.

 Also, since this bug was originally about SNIHostName, do you need to
 add some extra restriction there to reject oracle.com. things?

 No, we cannot restrict the format of IDN in SNIHostName more than in
 IDN. However, we may need to rethink about the comparing of two IDN, for
 example, example.com. should equal to example.com.  I want to
 consider it in another bug.

 Not sure. Does the spec say IDN and SNIHostName are equivalent sets? And
 it's not one is another's subset?

 Per TLS specification, host name in SNI is an IDN.  The spec of
 SNIHostname says, hostname is not a valid Internationalized Domain Name
 (IDN) compliant with the RFC 3490 specification. The spec in
 SNIHostName has the same means as IDN.  I won't want to add additional
 restrict beyond the specification of an IDN.
 
 Xuelei
 

 Can I push the changeset?

 I think it's better to ask someone in the networking team to make the
 suggestion. From what I read Michael in this thread, he does not seem
 totally agreed with your code changes (at least not the 00 version).

 Thanks
 Max


 Thanks,
 Xuelei

 Thanks
 Max

 On 8/9/13 8:41 AM, Xuelei Fan wrote:
 Ping.

 Thanks,
 Xuelei

 On 8/7/2013 11:17 PM, Xuelei Fan wrote:
 Please review the new update:

 http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

 With this update, com. is valid (return com.); . and
 example..com are invalid.  And IAE will be thrown for invalid IDN.

 Thanks,
 Xuelei


 



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-07 Thread Xuelei Fan
On 8/7/2013 12:06 AM, Matthew Hall wrote:
 Trailing dots are allowed in plain DNS (thus almost surely in IDN),
 and the single dot represents the root zone. So you have to be
 careful making this sort of change to check the DNS RFCs first.

That's the first question we need to answer: does IDN allow trailing
dots ("com."), a zero-length root label ("."), and zero-length labels ("",
for example in "example..com")?

Per the specification of IDN.toASCII():
===
ToASCII operation can fail. ToASCII fails if any step of it fails. If
ToASCII operation fails, an IllegalArgumentException will be thrown. In
this case, the input string should not be used in an internationalized
domain name.

A label is an individual part of a domain name. The original ToASCII
operation, as defined in RFC 3490, only operates on a single label. This
method can handle both label and entire domain name, by assuming that
labels in a domain name are always separated by dots. ...

Throws IllegalArgumentException - if the input string doesn't conform to
RFC 3490 specification

Per the specification of RFC 3490:
==
[section 2]
A label is an individual part of a domain name.  Labels are usually
 shown separated by dots; for example, the domain name
 "www.example.com" is composed of three labels: "www", "example", and
 "com".  (The zero-length root label described in [STD13], which can
 be explicit as in "www.example.com." or implicit as in
 "www.example.com", is not considered a label in this specification.)

An internationalized label is a label to which the ToASCII
 operation (see section 4) can be applied without failing (with the
 UseSTD3ASCIIRules flag unset).  ...
 Although most Unicode characters can appear in
 internationalized labels, ToASCII will fail for some input strings,
 and such strings are not valid internationalized labels.

An internationalized domain name (IDN) is a domain name in which
 every label is an internationalized label.

[Section 4.1]
ToASCII consists of the following steps:

 ...

 8. Verify that the number of code points is in the range 1 to 63
  inclusive.


Here are the questions:
1. Is "example..com" a valid IDN?
   As the dot is used as the label separator, there are three labels:
"example", "", and "com".  Per RFC 3490, "" is not a valid label. Hence,
"example..com" is not a valid IDN.

   We need to address the issue in IDN.

2. Is "xyz." a valid IDN?
   It's a gray area, I think. We can treat the trailing "." as the root
label, or as a label separator.
   If the trailing "." is treated as a label separator, "xyz." is invalid
per RFC 3490.
   If the trailing "." is treated as the root label, what's the expected
return value of IDN.toASCII("xyz.")?  I think the return value can be
either "xyz." or "xyz".  The current implementation returns "xyz".

   We may not need to update the implementation if the trailing "." is
treated as the root label.

3. Is "." a valid IDN?
   It's a gray area again, I think.
   As above, if the trailing "." is treated as the root label, I think the
return value can be either "." or "".  The current implementation throws
a StringIndexOutOfBoundsException.

   However, what does an empty domain name ("") really mean?  I would
prefer to return "." for "." instead.

   We need to address the issue in IDN.


Here comes the solution; IDN.toASCII() returns:
1. "." for ".";
2. "xyz" for "xyz.";
3. IAE for "example..com".
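
The label rule behind these three cases (RFC 3490 section 4.1, step 8: 1 to 63 code points per label) can be sketched as a standalone check. This is an illustration under stated assumptions, not the webrev code: it only splits on the ASCII dot, expects input that is already ASCII, treats a lone "." as the root, and rejects the empty string. The class and method names are hypothetical.

```java
// Hypothetical helper, not part of the JDK.
final class LabelCheckSketch {

    static final int MAX_LABEL_LENGTH = 63;

    static boolean hasValidLabelLengths(String name) {
        if (name.isEmpty()) {
            return false;               // assumption: "" is not a domain name
        }
        if (name.equals(".")) {
            return true;                // assumption: the root alone is accepted
        }
        // One trailing dot is read as the explicit root label and dropped.
        String body = name.endsWith(".")
                ? name.substring(0, name.length() - 1) : name;
        // Limit -1 keeps empty strings, so "example..com" yields an empty label.
        for (String label : body.split("\\.", -1)) {
            int codePoints = label.codePointCount(0, label.length());
            if (codePoints < 1 || codePoints > MAX_LABEL_LENGTH) {
                return false;
            }
        }
        return true;
    }
}
```

With this reading, "xyz." and "." pass while "example..com" fails, which matches the outcome proposed above regardless of whether the trailing dot is kept in the return value.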

Does it make sense?

Thanks,
Xuelei


On 8/7/2013 1:35 AM, Michael McMahon wrote:
 I don't really understand the reason for the restriction in SNIHostName
 But, I guess that is where it should be enforced if it is required.
 
 Michael.
 
 On 06/08/13 17:43, Dmitry Samersoff wrote:
 Xuelei,

 . (dot) is perfectly valid domain name and it means root domain so com.
 is valid domain name as well.

 It thinks to me that in context of methods your change we should ignore
 trailing dots, rather than throw exception.

 -Dmitry



 On 2013-08-06 15:44, Xuelei Fan wrote:
 Hi,

 Please review the bug fix to strict the illegal input checking in IDN.

 webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/

 Here is two test cases, which are expected to get IAE.

 Case 1:
 String host = IDN.toASCII(., IDN.USE_STD3_ASCII_RULES);
 Exception in thread main java.lang.StringIndexOutOfBoundsException:
 String index out of range: 0
  at java.lang.StringBuffer.charAt(StringBuffer.java:204)
  at java.net.IDN.toASCIIInternal(IDN.java:279)
  at java.net.IDN.toASCII(IDN.java:118)

 Case 2:
 String host = IDN.toASCII(com., IDN.USE_STD3_ASCII_RULES);

 Thanks,
 Xuelei


 



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-07 Thread Dmitry Samersoff
Xuelei,

The root label is an empty label [1] and the dot is a label separator, so
in printed form a domain name is dot-terminated.

Please see also below inline.

[1]
RFC rfc1034.txt:

Internally, programs that manipulate domain names should represent them
as sequences of labels, where each label is a length octet followed by
an octet string.  Because all domain names end at the root, *which has a
null string for a label*, these internal representations can use a
length byte of zero to terminate a domain name.
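
As a rough illustration of the representation the RFC describes, the sketch below encodes an ASCII name as length-prefixed labels ending with the zero-length root label. The class and method names are hypothetical, and the 63-octet limit per label is taken from the DNS specifications rather than from the paragraph quoted above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK.
final class WireFormatSketch {

    static byte[] encode(String asciiName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // An explicit trailing dot only names the root, so drop it first.
        String body = asciiName.endsWith(".")
                ? asciiName.substring(0, asciiName.length() - 1) : asciiName;
        if (!body.isEmpty()) {
            for (String label : body.split("\\.", -1)) {
                byte[] octets = label.getBytes(StandardCharsets.US_ASCII);
                if (octets.length < 1 || octets.length > 63) {
                    throw new IllegalArgumentException(
                            "Bad label length in: " + asciiName);
                }
                out.write(octets.length);            // length octet
                out.write(octets, 0, octets.length); // label octets
            }
        }
        out.write(0);  // the root label: a length byte of zero ends every name
        return out.toByteArray();
    }
}
```

In this form "www.example.com." and "www.example.com" produce the same bytes, "." is the single zero byte, and "example..com" has no encoding because an empty label would look like the terminator.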


On 2013-08-07 16:44, Xuelei Fan wrote:
 On 8/7/2013 12:06 AM, Matthew Hall wrote:
 Trailing dots are allowed in plain DNS (thus almost surely in IDN),
 and the single dot represents the root zone. So you have to be
 careful making this sort of change to check the DNS RFCs first.
 
 That's the first question we need to answer, whether IDN allow tailling
 dots (com.), zero-length root label (.), and zero-length label (,
 for example example..com)?
 
 Per the specification of IDN.toASCII():
 ===
 ToASCII operation can fail. ToASCII fails if any step of it fails. If
 ToASCII operation fails, an IllegalArgumentException will be thrown. In
 this case, the input string should not be used in an internationalized
 domain name.
 
 A label is an individual part of a domain name. The original ToASCII
 operation, as defined in RFC 3490, only operates on a single label. This
 method can handle both label and entire domain name, by assuming that
 labels in a domain name are always separated by dots. ...
 
 Throws IllegalArgumentException - if the input string doesn't conform to
 RFC 3490 specification
 
 Per the specification of RFC 3490:
 ==
 [section 2]
 A label is an individual part of a domain name.  Labels are usually
  shown separated by dots; for example, the domain name
  "www.example.com" is composed of three labels: "www", "example", and
  "com".  (The zero-length root label described in [STD13], which can
  be explicit as in "www.example.com." or implicit as in
  "www.example.com", is not considered a label in this specification.)
 
 An internationalized label is a label to which the ToASCII
  operation (see section 4) can be applied without failing (with the
  UseSTD3ASCIIRules flag unset).  ...
  Although most Unicode characters can appear in
  internationalized labels, ToASCII will fail for some input strings,
  and such strings are not valid internationalized labels.
 
 An internationalized domain name (IDN) is a domain name in which
  every label is an internationalized label.
 
 [Section 4.1]
 ToASCII consists of the following steps:
 
  ...
 
  8. Verify that the number of code points is in the range 1 to 63
   inclusive.
 
 
 Here are the questions:
 1. Is "example..com" a valid IDN?
    As the dot is used as the label separator, there are three labels:
    "example", "", and "com".  Per RFC 3490, "" is not a valid label. Hence,
    "example..com" is not a valid IDN.
 
    We need to address the issue in IDN.

The root label can't appear in the middle of a domain name, so "example..com"
is an invalid domain name and an appropriate exception has to be thrown.

 
 2. Is "xyz." a valid IDN?
    It's a gray area, I think. We can treat the trailing "." as the root
    label, or as a label separator.
    If the trailing "." is treated as a label separator, "xyz." is invalid
    per RFC 3490.
    If the trailing "." is treated as the root label, what's the expected
    return value of IDN.toASCII("xyz.")?  I think the return value can be
    either "xyz." or "xyz".  The current implementation returns "xyz.".
 
    We may not need to update the implementation if the trailing "." is
    treated as the root label.

An empty label at the end of a domain name is valid per RFC 1034 and means
the root label. So we should process this name and return all non-empty
labels.

 3. Is "." a valid IDN?
    It's a gray area again, I think.
    As above, if the trailing "." is treated as the root label, I think the
    return value can be either "." or "".  The current implementation throws
    a StringIndexOutOfBoundsException.
 
    However, what does an empty domain name ("") really mean?  I would
    prefer to return "." for "." instead.
 
    We need to address the issue in IDN.

As the dot is a label separator and the root (empty) label can't appear in the
middle of a domain name, "." (dot) is not a valid name and this case is
similar to case (1): we should throw an appropriate exception.

-Dmitry

 
 Here comes the solution; IDN.toASCII() returns:
 1. "." for ".";
 2. "xyz" for "xyz.";
 3. IAE for "example..com".
 
 Does it make sense?
 
 Thanks,
 Xuelei
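
The three inline answers above can be read as one small validation rule. The sketch below follows that reading only (empty label in the middle rejected, trailing empty label skipped as the root, lone dot rejected); the class LabelSplitSketch and the method labels are hypothetical names, and this is not the code in the webrev.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the three cases discussed in this message.
final class LabelSplitSketch {

    /** Returns the non-empty labels of name, or throws IllegalArgumentException. */
    static List<String> labels(String name) {
        if (name.equals(".")) {
            // case 3 as argued in this message: a lone dot is rejected
            throw new IllegalArgumentException("lone dot is not accepted");
        }
        // limit -1 keeps trailing empty strings, so the root label stays visible
        String[] parts = name.split("\\.", -1);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < parts.length; i++) {
            boolean last = (i == parts.length - 1);
            if (parts[i].isEmpty()) {
                if (last && i > 0) {
                    continue; // case 2: trailing empty label is the root, skip it
                }
                // case 1: empty label anywhere else, e.g. "example..com"
                throw new IllegalArgumentException("empty label in: " + name);
            }
            result.add(parts[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(labels("xyz."));
        System.out.println(labels("www.example.com"));
    }
}
```

Whether "." should be rejected or returned unchanged is exactly the point still under discussion in this thread, so the first check is the part most likely to change.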
 
 

Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-07 Thread Michael McMahon

Resolvers seem to accept queries using trailing dots.

e.g. nslookup www.oracle.com.

or InetAddress.getByName("www.oracle.com.");

The part of RFC 3490 quoted below seems to me to be saying
that the empty label implied by the trailing dot is not regarded
as a label, so that you don't end up calling toASCII() or toUnicode()
with an empty string. I don't think it's saying the trailing dot can't
be there.


Michael
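
One way to picture the reading above: strip the trailing dot, convert each remaining label, then put the dot back, so the per-label step never sees the empty root label. This is a sketch under that assumption only. TrailingDotSketch, convert and the perLabel parameter are hypothetical names, and perLabel stands in for the real per-label ToASCII step.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical illustration; perLabel is a stand-in for the per-label ToASCII step.
final class TrailingDotSketch {

    /** Applies perLabel to every label and keeps a trailing dot if one was present. */
    static String convert(String name, UnaryOperator<String> perLabel) {
        // A lone "." is left alone here; its two empty labels reach perLabel as-is.
        boolean rooted = name.length() > 1 && name.endsWith(".");
        String body = rooted ? name.substring(0, name.length() - 1) : name;
        StringBuilder out = new StringBuilder();
        String[] parts = body.split("\\.", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                out.append('.');
            }
            // An empty label in the middle ("example..com") still reaches perLabel,
            // which is where a real ToASCII would fail its length check.
            out.append(perLabel.apply(parts[i]));
        }
        if (rooted) {
            out.append('.'); // the implied root label is kept in the printed form
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("WWW.Oracle.COM.", s -> s.toLowerCase(Locale.ROOT)));
    }
}
```

With this shape the trailing dot survives the round trip, which matches what resolvers accept, while an empty string is only ever passed to the per-label step for names that really contain an empty label.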

On 07/08/13 13:44, Xuelei Fan wrote:

On 8/7/2013 12:06 AM, Matthew Hall wrote:

Trailing dots are allowed in plain DNS (thus almost surely in IDN),
and the single dot represents the root zone. So you have to be
careful making this sort of change to check the DNS RFCs first.

That's the first question we need to answer: does IDN allow trailing
dots ("com."), a zero-length root label ("."), and zero-length labels ("",
for example "example..com")?

Per the specification of IDN.toASCII():
===
ToASCII operation can fail. ToASCII fails if any step of it fails. If
ToASCII operation fails, an IllegalArgumentException will be thrown. In
this case, the input string should not be used in an internationalized
domain name.

A label is an individual part of a domain name. The original ToASCII
operation, as defined in RFC 3490, only operates on a single label. This
method can handle both label and entire domain name, by assuming that
labels in a domain name are always separated by dots. ...

Throws IllegalArgumentException - if the input string doesn't conform to
RFC 3490 specification

Per the specification of RFC 3490:
==
[section 2]
A label is an individual part of a domain name.  Labels are usually
  shown separated by dots; for example, the domain name
  "www.example.com" is composed of three labels: "www", "example", and
  "com".  (The zero-length root label described in [STD13], which can
  be explicit as in "www.example.com." or implicit as in
  "www.example.com", is not considered a label in this specification.)

An internationalized label is a label to which the ToASCII
  operation (see section 4) can be applied without failing (with the
  UseSTD3ASCIIRules flag unset).  ...
  Although most Unicode characters can appear in
  internationalized labels, ToASCII will fail for some input strings,
  and such strings are not valid internationalized labels.

An internationalized domain name (IDN) is a domain name in which
  every label is an internationalized label.

[Section 4.1]
ToASCII consists of the following steps:

  ...

  8. Verify that the number of code points is in the range 1 to 63
   inclusive.


Here are the questions:
1. Is "example..com" a valid IDN?
As the dot is used as the label separator, there are three labels:
"example", "", and "com".  Per RFC 3490, "" is not a valid label. Hence,
"example..com" is not a valid IDN.

We need to address the issue in IDN.

2. Is "xyz." a valid IDN?
It's a gray area, I think. We can treat the trailing "." as the root
label, or as a label separator.
If the trailing "." is treated as a label separator, "xyz." is invalid
per RFC 3490.
If the trailing "." is treated as the root label, what's the expected
return value of IDN.toASCII("xyz.")?  I think the return value can be
either "xyz." or "xyz".  The current implementation returns "xyz.".

We may not need to update the implementation if the trailing "." is
treated as the root label.

3. Is "." a valid IDN?
It's a gray area again, I think.
As above, if the trailing "." is treated as the root label, I think the
return value can be either "." or "".  The current implementation throws
a StringIndexOutOfBoundsException.

However, what does an empty domain name ("") really mean?  I would
prefer to return "." for "." instead.

We need to address the issue in IDN.


Here comes the solution; IDN.toASCII() returns:
1. "." for ".";
2. "xyz" for "xyz.";
3. IAE for "example..com".

Does it make sense?

Thanks,
Xuelei



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-07 Thread Xuelei Fan
On 8/7/2013 10:05 PM, Michael McMahon wrote:
 Resolvers seem to accept queries using trailing dots.
 
 e.g. nslookup www.oracle.com.
 
 or InetAddress.getByName("www.oracle.com.");
 
 The part of RFC 3490 quoted below seems to me to be saying
 that the empty label implied by the trailing dot is not regarded
 as a label, so that you don't end up calling toASCII() or toUnicode()
 with an empty string. I don't think it's saying the trailing dot can't
 be there.
 
It makes sense.

What would you prefer IDN.toASCII("www.oracle.com.") to return,
"www.oracle.com." or "www.oracle.com"? The current return value is
"www.oracle.com.".  I would like to preserve that behavior in this update.

I think we will be on the same page soon.

Thanks,
Xuelei

 Michael
 

Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-07 Thread Michael McMahon

On 07/08/13 15:13, Xuelei Fan wrote:

 On 8/7/2013 10:05 PM, Michael McMahon wrote:

 Resolvers seem to accept queries using trailing dots.

 e.g. nslookup www.oracle.com.

 or InetAddress.getByName("www.oracle.com.");

 The part of RFC 3490 quoted below seems to me to be saying
 that the empty label implied by the trailing dot is not regarded
 as a label, so that you don't end up calling toASCII() or toUnicode()
 with an empty string. I don't think it's saying the trailing dot can't
 be there.

 It makes sense.

 What would you prefer IDN.toASCII("www.oracle.com.") to return,
 "www.oracle.com." or "www.oracle.com"? The current return value is
 "www.oracle.com.".  I would like to preserve that behavior in this update.

My opinion is to keep it as at present, i.e. "www.oracle.com."

Michael

 I think we will be on the same page soon.

 Thanks,
 Xuelei

 Michael


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-07 Thread Xuelei Fan
Please review the new update:

http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/

With this update, "com." is valid (returns "com."); "." and
"example..com" are invalid, and an IAE will be thrown for an invalid IDN.

Thanks,
Xuelei
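
For anyone who wants to check these three inputs against a particular JDK build, a small probe along the following lines may help. It relies only on the public IDN.toASCII(String, int) method and the IDN.USE_STD3_ASCII_RULES flag; the class name IdnProbe and the method probe are hypothetical. No expected output is listed, because the result depends on which JDK and which revision of the fix is on the class path.

```java
import java.net.IDN;

// Hypothetical probe; it reports what the running JDK does and asserts nothing.
final class IdnProbe {

    /** Describes the outcome of IDN.toASCII for one input. */
    static String probe(String input) {
        try {
            return "returned \"" + IDN.toASCII(input, IDN.USE_STD3_ASCII_RULES) + "\"";
        } catch (IllegalArgumentException e) {
            // the documented failure mode of IDN.toASCII
            return "IllegalArgumentException";
        } catch (RuntimeException e) {
            // e.g. the StringIndexOutOfBoundsException from the original report
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"com.", ".", "example..com"}) {
            System.out.println("\"" + s + "\" -> " + probe(s));
        }
    }
}
```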

On 8/7/2013 10:18 PM, Michael McMahon wrote:
 On 07/08/13 15:13, Xuelei Fan wrote:
 On 8/7/2013 10:05 PM, Michael McMahon wrote:
 Resolvers seem to accept queries using trailing dots.

 e.g. nslookup www.oracle.com.

 or InetAddress.getByName("www.oracle.com.");

 The part of RFC 3490 quoted below seems to me to be saying
 that the empty label implied by the trailing dot is not regarded
 as a label, so that you don't end up calling toASCII() or toUnicode()
 with an empty string. I don't think it's saying the trailing dot can't
 be there.

 It makes sense.

 What would you prefer IDN.toASCII("www.oracle.com.") to return,
 "www.oracle.com." or "www.oracle.com"? The current return value is
 "www.oracle.com.".  I would like to preserve that behavior in this update.
 
 My opinion is to keep it as at present, i.e. "www.oracle.com."
 
 Michael
 
 I think we will be on the same page soon.

 Thanks,
 Xuelei

 Michael


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-06 Thread Weijun Wang
I am not sure if IDN.java is the correct place to change. At least I've 
seen trailing dots in DNS entries. So maybe it's not so illegal.


--Max

On 8/6/13 7:44 PM, Xuelei Fan wrote:

Hi,

Please review the bug fix to tighten the illegal-input checking in IDN.

webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/

Here are two test cases, which are expected to get IAE.

Case 1:
String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
String index out of range: 0
 at java.lang.StringBuffer.charAt(StringBuffer.java:204)
 at java.net.IDN.toASCIIInternal(IDN.java:279)
 at java.net.IDN.toASCII(IDN.java:118)

Case 2:
String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);

Thanks,
Xuelei



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-06 Thread Xuelei Fan
On Aug 6, 2013, at 23:08, Weijun Wang weijun.w...@oracle.com wrote:

 I am not sure if IDN.java is the correct place to change. At least I've seen 
 trailing dots in DNS entries. So maybe it's not so illegal.
 
Per RFC 1034, a domain name cannot end with a dot.  I will check other related
specifications.  In what case did you see trailing dots?

Thanks,
Xuelei

 --Max
 
 On 8/6/13 7:44 PM, Xuelei Fan wrote:
 Hi,
 
 Please review the bug fix to tighten the illegal-input checking in IDN.
 
 webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
 
 Here are two test cases, which are expected to get IAE.
 
 Case 1:
 String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
 Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
 String index out of range: 0
 at java.lang.StringBuffer.charAt(StringBuffer.java:204)
 at java.net.IDN.toASCIIInternal(IDN.java:279)
 at java.net.IDN.toASCII(IDN.java:118)
 
 Case 2:
 String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
 
 Thanks,
 Xuelei
 


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-06 Thread Matthew Hall
Trailing dots are allowed in plain DNS (thus almost surely in IDN), and the 
single dot represents the root zone. So you have to be careful making this sort 
of change to check the DNS RFCs first.

Matthew.
-- 
Sent from my mobile device.

Weijun Wang weijun.w...@oracle.com wrote:
I am not sure if IDN.java is the correct place to change. At least I've
seen trailing dots in DNS entries. So maybe it's not so illegal.

--Max

On 8/6/13 7:44 PM, Xuelei Fan wrote:
 Hi,

 Please review the bug fix to tighten the illegal-input checking in IDN.

 webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/

 Here are two test cases, which are expected to get IAE.

 Case 1:
 String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
 Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
 String index out of range: 0
  at java.lang.StringBuffer.charAt(StringBuffer.java:204)
  at java.net.IDN.toASCIIInternal(IDN.java:279)
  at java.net.IDN.toASCII(IDN.java:118)

 Case 2:
 String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);

 Thanks,
 Xuelei




Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-06 Thread Matthew Hall
Take a look here for more clarity:

http://en.wikipedia.org/wiki/Fully_qualified_domain_name
-- 
Sent from my mobile device.

Matthew Hall mh...@mhcomputing.net wrote:
Trailing dots are allowed in plain DNS (thus almost surely in IDN), and
the single dot represents the root zone. So you have to be careful
making this sort of change to check the DNS RFCs first.

Matthew.



Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-06 Thread Dmitry Samersoff
Xuelei,

"." (dot) is a perfectly valid domain name and it means the root domain, so
"com." is a valid domain name as well.

It seems to me that, in the context of the methods you change, we should
ignore trailing dots rather than throw an exception.

-Dmitry



On 2013-08-06 15:44, Xuelei Fan wrote:
 Hi,
 
 Please review the bug fix to tighten the illegal-input checking in IDN.
 
 webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
 
 Here are two test cases, which are expected to get IAE.
 
 Case 1:
 String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
 Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
 String index out of range: 0
 at java.lang.StringBuffer.charAt(StringBuffer.java:204)
 at java.net.IDN.toASCIIInternal(IDN.java:279)
 at java.net.IDN.toASCII(IDN.java:118)
 
 Case 2:
 String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
 
 Thanks,
 Xuelei
 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.


Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

2013-08-06 Thread Michael McMahon

I don't really understand the reason for the restriction in SNIHostName.
But I guess that is where it should be enforced, if it is required.

Michael.

On 06/08/13 17:43, Dmitry Samersoff wrote:

Xuelei,

"." (dot) is a perfectly valid domain name and it means the root domain, so
"com." is a valid domain name as well.

It seems to me that, in the context of the methods you change, we should
ignore trailing dots rather than throw an exception.

-Dmitry



On 2013-08-06 15:44, Xuelei Fan wrote:

Hi,

Please review the bug fix to tighten the illegal-input checking in IDN.

webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/

Here are two test cases, which are expected to get IAE.

Case 1:
String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
String index out of range: 0
 at java.lang.StringBuffer.charAt(StringBuffer.java:204)
 at java.net.IDN.toASCIIInternal(IDN.java:279)
 at java.net.IDN.toASCII(IDN.java:118)

Case 2:
String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);

Thanks,
Xuelei