Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Seems fine to me Xuelei.

- Michael

On 19/08/13 06:56, Xuelei Fan wrote:
> If no objections, I will push the change by COB Monday.
>
> Thanks,
> Xuelei
>
> On 8/13/2013 4:29 PM, Xuelei Fan wrote:
>> Can I get an additional code review from networking team?
>>
>> Thanks,
>> Xuelei
>>
>> On 8/12/2013 2:07 PM, Weijun Wang wrote:
>>> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
If no objections, I will push the change by COB Monday.

Thanks,
Xuelei

On 8/13/2013 4:29 PM, Xuelei Fan wrote:
> Can I get an additional code review from networking team?
>
> Thanks,
> Xuelei
>
> On 8/12/2013 2:07 PM, Weijun Wang wrote:
>> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On Thu, Aug 15, 2013 at 10:08:35AM -0700, Mike Duigou wrote:
> I've been confused through this discussion as to why a trailing dot would be
> regarded as illegal.
>
> Historically a trailing dot has been frequently (though not universally) used
> to denote a fully qualified domain name.
>
> https://en.wikipedia.org/wiki/Fully_qualified_domain_name
>
> Is this use now illegal/unsupported/invalid? Does having a trailing dot
> conflict with other parts of the IDN specification?
>
> Mike

This is why some of us were protesting the code which disallowed the
trailing '.', and eventually the code was changed to allow it to be present.

Matthew.
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On Aug 16, 2013, at 1:08, Mike Duigou wrote:
> I've been confused through this discussion as to why a trailing dot would be
> regarded as illegal.
>

The discussion is too long to find the final decision easily.

An IDN with a trailing dot should be regarded as a legal IDN. This update is
trying to fix this. For example, "." and "example.com." are legal IDNs, and
IDN.toASCII() should return the legal name accordingly.

However, per the specification of the Server Name Indication TLS extension,
a hostname should not end with a trailing dot. So in SNIHostName, we will
check the return value of IDN.toASCII() to filter out hostnames with
trailing dots.

This fix is trying to have IDN work correctly with a trailing dot and with
empty labels. The previous code of SNIHostName will work as expected if IDN
can handle the trailing dot properly.

Thanks,
Xuelei

> Historically a trailing dot has been frequently (though not universally) used
> to denote a fully qualified domain name.
>
> https://en.wikipedia.org/wiki/Fully_qualified_domain_name
>
> Is this use now illegal/unsupported/invalid? Does having a trailing dot
> conflict with other parts of the IDN specification?
>
> Mike
>
> On Aug 13 2013, at 01:29 , Xuelei Fan wrote:
>
>> Can I get an additional code review from networking team?
>>
>> Thanks,
>> Xuelei
>>
>> On 8/12/2013 2:07 PM, Weijun Wang wrote:
>>> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
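The division of labour described in this message (IDN converts the name, the caller then filters out what SNI does not allow) can be sketched roughly as follows. This is only an illustration, not the actual SNIHostName code: the class and method names are hypothetical, and it assumes that the converted name uses the ASCII full stop as its separator and that IDN.toASCII() reports an illegal name with IllegalArgumentException.

```java
import java.net.IDN;

/** Hypothetical helper for illustration; not the real SNIHostName implementation. */
final class SniNameCheckSketch {

    /** Accepts an already converted ASCII name only if it is non-empty and has no trailing dot. */
    static boolean acceptableForSni(String asciiName) {
        return !asciiName.isEmpty() && !asciiName.endsWith(".");
    }

    /** Converts with IDN.toASCII() first, then applies the SNI-specific check. */
    static boolean acceptable(String hostname) {
        try {
            return acceptableForSni(IDN.toASCII(hostname));
        } catch (IllegalArgumentException e) {
            // Assumption: an illegal IDN is reported this way and is simply rejected here.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"example.com", "example.com.", "."}) {
            System.out.println("\"" + name + "\" -> " + acceptable(name));
        }
    }
}
```

With this split the caller never has to inspect the separators of the original input; it only looks at the string that IDN.toASCII() returns.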
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
I've been confused through this discussion as to why a trailing dot would be
regarded as illegal.

Historically a trailing dot has been frequently (though not universally) used
to denote a fully qualified domain name.

https://en.wikipedia.org/wiki/Fully_qualified_domain_name

Is this use now illegal/unsupported/invalid? Does having a trailing dot
conflict with other parts of the IDN specification?

Mike

On Aug 13 2013, at 01:29 , Xuelei Fan wrote:

> Can I get an additional code review from networking team?
>
> Thanks,
> Xuelei
>
> On 8/12/2013 2:07 PM, Weijun Wang wrote:
>> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Can I get an additional code review from networking team?

Thanks,
Xuelei

On 8/12/2013 2:07 PM, Weijun Wang wrote:
> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 8/12/13 1:45 PM, Xuelei Fan wrote:
> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/
>
>> Lines 280 and 333: How about we call them steps 8a and 8b?
> Step 8 is referring to the steps in RFC 3490. Let's use "step 8".

You break the 1 <= len <= 63 check into two parts, that's why I say 8a
and 8b.

--Max

> Thanks,
> Xuelei
>
> On 8/12/2013 11:11 AM, Weijun Wang wrote:
>> I think the fix is adequate and necessary.
>>
>> One problem: lines 367-373 add a new IAE to ToUnicode, but the method
>> should never fail.
>>
>> And some small comments on style etc.
>>
>> On 8/12/13 9:09 AM, Xuelei Fan wrote:
>>> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.05/
>>
>> Lines 123 and 185:
>>
>>  184             p = q + 1;
>>  185             if (p < input.length() || q == (input.length() - 1)) {
>>  186                 // has more labels, or keep the trailing dot as at present
>>  187                 out.append('.');
>>  188             }
>>
>> I prefer
>>
>>     if (q < input.length()) { // Ah, a dot!
>>         out.append('.');
>>     }
>>     p = q + 1;
>>
>> Lines 282, 335, 270: Insert a blank after "if".
>>
>> Lines 284 and 372: nslookup uses "empty label", which I like better.
>>
>> Lines 453 and 460: Personally I don't like the parentheses around the whole
>> return value, but you have your choice.
>>
>> Lines 280 and 333: How about we call them steps 8a and 8b?
>>
>>> Added a new test to test illegal hostname in SNIHostName.
>>
>> Excellent. Otherwise I will be wondering why the fix in IDN could solve
>> the problem as described in the bug description.
>>
>> Thanks
>> Max
>>
>>> Xuelei
>>>
>>> On 8/10/2013 10:49 AM, Xuelei Fan wrote:
>>>> Hi Michael,
>>>>
>>>> It is pretty hard to get the issue solved in SNIHostName in good
>>>> shape. Here is my attempt to explain why we should fix the issue in IDN.
>>>>
>>>> In SNIHostName, the following hostnames are not accepted as valid
>>>> hostnames:
>>>> 1. empty hostname
>>>> 2. hostname that ends with a trailing dot
>>>> 3. hostname that does not comply with RFC 3490.
>>>>
>>>> The process in SNIHostName looks like:
>>>> 1. call IDN.toASCII() to convert a string hostname
>>>> 2. check that the return value of #1 is a valid hostname (non-empty,
>>>>    no trailing dot).
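Max's remark about "8a" and "8b" refers to step 8 of ToASCII in RFC 3490, which requires the converted label to have between 1 and 63 code points. Splitting that range check into two tests might look like the sketch below. The class name, constant name, and exception messages are made up for illustration and are not taken from the webrev.

```java
/** Hypothetical illustration of the two-part length check; not the webrev code. */
final class Step8Sketch {

    /** Upper bound on label length from RFC 3490, ToASCII step 8. */
    static final int MAX_LABEL_LENGTH = 63;

    /** Throws IllegalArgumentException unless 1 <= asciiLabel.length() <= 63. */
    static void checkLabelLength(String asciiLabel) {
        // First half ("8a" in the thread): an empty label is not legal.
        if (asciiLabel.length() == 0) {
            throw new IllegalArgumentException("empty label");
        }
        // Second half ("8b" in the thread): the label must not exceed 63 characters.
        if (asciiLabel.length() > MAX_LABEL_LENGTH) {
            throw new IllegalArgumentException("label too long");
        }
    }
}
```

Keeping the two halves separate lets each failure carry its own message, which fits the reviewer's preference for the "empty label" wording that nslookup uses.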
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.06/

> Lines 280 and 333: How about we call them steps 8a and 8b?
Step 8 is referring to the steps in RFC 3490. Let's use "step 8".

Thanks,
Xuelei

On 8/12/2013 11:11 AM, Weijun Wang wrote:
> I think the fix is adequate and necessary.
>
> One problem: lines 367-373 add a new IAE to ToUnicode, but the method
> should never fail.
>
> And some small comments on style etc.
>
> On 8/12/13 9:09 AM, Xuelei Fan wrote:
>> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.05/
>
> Lines 123 and 185:
>
>  184             p = q + 1;
>  185             if (p < input.length() || q == (input.length() - 1)) {
>  186                 // has more labels, or keep the trailing dot as at present
>  187                 out.append('.');
>  188             }
>
> I prefer
>
>     if (q < input.length()) { // Ah, a dot!
>         out.append('.');
>     }
>     p = q + 1;
>
> Lines 282, 335, 270: Insert a blank after "if".
>
> Lines 284 and 372: nslookup uses "empty label", which I like better.
>
> Lines 453 and 460: Personally I don't like the parentheses around the whole
> return value, but you have your choice.
>
> Lines 280 and 333: How about we call them steps 8a and 8b?
>
>> Added a new test to test illegal hostname in SNIHostName.
>
> Excellent. Otherwise I will be wondering why the fix in IDN could solve
> the problem as described in the bug description.
>
> Thanks
> Max
>
>> Xuelei
>>
>> On 8/10/2013 10:49 AM, Xuelei Fan wrote:
>>> Hi Michael,
>>>
>>> It is pretty hard to get the issue solved in SNIHostName in good
>>> shape. Here is my attempt to explain why we should fix the issue in IDN.
>>>
>>> In SNIHostName, the following hostnames are not accepted as valid
>>> hostnames:
>>> 1. empty hostname
>>> 2. hostname that ends with a trailing dot
>>> 3. hostname that does not comply with RFC 3490.
>>>
>>> The process in SNIHostName looks like:
>>> 1. call IDN.toASCII() to convert a string hostname
>>> 2. check that the return value of #1 is a valid hostname (non-empty,
>>>    no trailing dot).
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
    if (q < input.length()) { // Ah, a dot!
        out.append('.');
    }
    p = q + 1;

Using

    if (q != input.length())

should be even better. The searchDots method clearly specifies that "or if
there is no dots, return the length of input string".

--Max
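To make the suggestion concrete, here is a minimal, self-contained sketch of the label loop. It assumes a simplified searchDots that only looks for the ASCII dot, and it copies each label unchanged where the real code would convert it. LabelLoopSketch and rebuild are hypothetical names.

```java
/** Simplified illustration of the label loop; not the actual IDN code. */
final class LabelLoopSketch {

    /** Returns the index of the next '.' at or after start, or input.length() if there is none. */
    static int searchDots(String input, int start) {
        int i = input.indexOf('.', start);
        return (i < 0) ? input.length() : i;
    }

    /** Rebuilds the name label by label, keeping every dot that was present in the input. */
    static String rebuild(String input) {
        StringBuilder out = new StringBuilder();
        int p = 0;
        while (p < input.length()) {
            int q = searchDots(input, p);
            out.append(input, p, q);       // stand-in for the per-label conversion
            if (q != input.length()) {     // the search stopped on a dot, not at the end
                out.append('.');
            }
            p = q + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rebuild("com."));         // trailing dot is kept
        System.out.println(rebuild("example.com"));  // no dot is added
    }
}
```

Because searchDots never returns a value larger than input.length(), the tests `q < input.length()` and `q != input.length()` select exactly the same cases. The second form just states the contract more directly.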
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
I think the fix is adequate and necessary.

One problem: lines 367-373 add a new IAE to ToUnicode, but the method
should never fail.

And some small comments on style etc.

On 8/12/13 9:09 AM, Xuelei Fan wrote:
> new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.05/

Lines 123 and 185:

 184             p = q + 1;
 185             if (p < input.length() || q == (input.length() - 1)) {
 186                 // has more labels, or keep the trailing dot as at present
 187                 out.append('.');
 188             }

I prefer

    if (q < input.length()) { // Ah, a dot!
        out.append('.');
    }
    p = q + 1;

Lines 282, 335, 270: Insert a blank after "if".

Lines 284 and 372: nslookup uses "empty label", which I like better.

Lines 453 and 460: Personally I don't like the parentheses around the whole
return value, but you have your choice.

Lines 280 and 333: How about we call them steps 8a and 8b?

> Added a new test to test illegal hostname in SNIHostName.

Excellent. Otherwise I will be wondering why the fix in IDN could solve
the problem as described in the bug description.

Thanks
Max

> Xuelei
>
> On 8/10/2013 10:49 AM, Xuelei Fan wrote:
>> Hi Michael,
>>
>> It is pretty hard to get the issue solved in SNIHostName in good
>> shape. Here is my attempt to explain why we should fix the issue in IDN.
>>
>> In SNIHostName, the following hostnames are not accepted as valid
>> hostnames:
>> 1. empty hostname
>> 2. hostname that ends with a trailing dot
>> 3. hostname that does not comply with RFC 3490.
>>
>> The process in SNIHostName looks like:
>> 1. call IDN.toASCII() to convert a string hostname
>> 2. check that the return value of #1 is a valid hostname (non-empty,
>>    no trailing dot).
>>
>> At present, IDN cannot handle the following IDNs properly.
>> 1. returns "com" for "com."
>>    The trailing dot is swallowed.
>>
>> 2. throws StringIndexOutOfBoundsException for "."
>>    If "." is a valid IDN that complies with RFC 3490, IDN.toASCII() should
>>    be able to handle it; otherwise, IDN.toASCII() should throw IAE as the
>>    specification suggests.
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.05/

Added a new test to test illegal hostnames in SNIHostName.

Xuelei

On 8/10/2013 10:49 AM, Xuelei Fan wrote:
> Hi Michael,
>
> It is pretty hard to get the issue solved in SNIHostName in good
> shape. Here is my attempt to explain why we should fix the issue in IDN.
>
> In SNIHostName, the following hostnames are not accepted as valid hostnames:
> 1. empty hostname
> 2. hostname that ends with a trailing dot
> 3. hostname that does not comply with RFC 3490.
>
> The process in SNIHostName looks like:
> 1. call IDN.toASCII() to convert a string hostname
> 2. check that the return value of #1 is a valid hostname (non-empty,
>    no trailing dot).
>
> At present, IDN cannot handle the following IDNs properly.
> 1. returns "com" for "com."
>    The trailing dot is swallowed.
>
> 2. throws StringIndexOutOfBoundsException for "."
>    If "." is a valid IDN that complies with RFC 3490, IDN.toASCII() should
>    be able to handle it; otherwise, IDN.toASCII() should throw IAE as the
>    specification suggests. However, IDN.toASCII(".") throws
>    StringIndexOutOfBoundsException; this behavior does not comply with the
>    specification.
>
> 3. throws StringIndexOutOfBoundsException for "example...net"
>    As #2.
>
> We can address #1 and #2 in SNIHostName, but the checking is redundant,
> as IDN also needs to address the issue. And SNIHostName has to know
> what the separators (".", "\u3002", etc.) of IDN are in order to check the
> dot character. It is not good encapsulation, and it gets too involved in
> the details of domain names, I think.
>
> It is a little bit hard to address #3 in SNIHostName.
>
> All of the above issues can be easily addressed in IDN. And once IDN
> has addressed these issues, the current SNIHostName is able to handle
> invalid hostnames (empty, trailing dot, etc.) correctly. We won't need to
> touch SNIHostName any more.
>
> Please consider it.
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Hi Michael,

It is pretty hard to get the issue solved in SNIHostName in good shape.
Here is my attempt to explain why we should fix the issue in IDN.

In SNIHostName, the following hostnames are not accepted as valid hostnames:
1. empty hostname
2. hostname that ends with a trailing dot
3. hostname that does not comply with RFC 3490.

The process in SNIHostName looks like:
1. call IDN.toASCII() to convert a string hostname
2. check that the return value of #1 is a valid hostname (non-empty,
   no trailing dot).

At present, IDN cannot handle the following IDNs properly.
1. returns "com" for "com."
   The trailing dot is swallowed.

2. throws StringIndexOutOfBoundsException for "."
   If "." is a valid IDN that complies with RFC 3490, IDN.toASCII() should
   be able to handle it; otherwise, IDN.toASCII() should throw IAE as the
   specification suggests. However, IDN.toASCII(".") throws
   StringIndexOutOfBoundsException; this behavior does not comply with the
   specification.

3. throws StringIndexOutOfBoundsException for "example...net"
   As #2.

We can address #1 and #2 in SNIHostName, but the checking is redundant,
as IDN also needs to address the issue. And SNIHostName has to know
what the separators (".", "\u3002", etc.) of IDN are in order to check the
dot character. It is not good encapsulation, and it gets too involved in
the details of domain names, I think.

It is a little bit hard to address #3 in SNIHostName.

All of the above issues can be easily addressed in IDN. And once IDN
has addressed these issues, the current SNIHostName is able to handle
invalid hostnames (empty, trailing dot, etc.) correctly. We won't need to
touch SNIHostName any more.

Please consider it.

The latest webrev is at:
http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/

Thanks,
Xuelei

On 8/10/2013 9:13 AM, Xuelei Fan wrote:
> Hi Michael,
>
> I plan to address this issue in SNIHostName. I have filed two other
> potential bugs in IDN.
>
> Thank you, and other people, for the feedback.
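The encapsulation argument in this message is easier to see with the separators spelled out. RFC 3490 (section 3.1) lists four characters that must be recognized as dots: U+002E, U+3002, U+FF0E and U+FF61. A trailing-dot check done outside IDN, on the unconverted input, would need something like the sketch below. The class and method names are hypothetical.

```java
/** Hypothetical illustration of a separator-aware check on unconverted input. */
final class IdnDotsSketch {

    /** True for the four characters RFC 3490 section 3.1 treats as label separators. */
    static boolean isLabelSeparator(char c) {
        return c == '.'          // U+002E full stop
            || c == '\u3002'     // U+3002 ideographic full stop
            || c == '\uFF0E'     // U+FF0E fullwidth full stop
            || c == '\uFF61';    // U+FF61 halfwidth ideographic full stop
    }

    /** True if the raw (unconverted) name ends with any of the separators. */
    static boolean endsWithSeparator(String name) {
        return !name.isEmpty() && isLabelSeparator(name.charAt(name.length() - 1));
    }
}
```

If IDN keeps the trailing dot in its output, a caller such as SNIHostName can avoid this knowledge entirely and just inspect the converted string.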
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Hi Michael,

I plan to address this issue in SNIHostName. I have filed two other
potential bugs in IDN.

Thank you, and other people, for the feedback.

Thanks,
Xuelei

On 8/9/2013 11:25 PM, Xuelei Fan wrote:
> On 8/9/2013 7:31 PM, Michael McMahon wrote:
>> I don't see how this fixes the original problem as the SNIHostName spec
>> still doesn't like hostnames with a trailing '.'
>>
> The bug description did not reflect the IDN specification correctly. If
> "com." is a valid IDN, SNIHostName should accept it as a valid
> hostname. The host name in SNIHostName is nothing more or less than a
> standard IDN.
>
> I added a comment in the bug: "com." and "." are valid IDNs according to the
> IDN and domain name specifications. I will contact the bug reporter
> about this point.
>
> Xuelei
>
>> I'd prefer to check first where that requirement is coming from, if it is
>> actually necessary, and if not consider removing it from SNIHostName.
>> If it is necessary, then the check should be implemented in SNIHostName.
>>
>> Michael
>>
>> On 09/08/13 05:28, Xuelei Fan wrote:
>>> Thanks for your feedback and suggestions.
>>>
>>> Here is the new webrev:
>>> http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/
>>>
>>> "." is regarded as valid IDN in this update.
>>>
>>> Thanks,
>>> Xuelei
>>>
>>> On 8/9/2013 10:50 AM, Xuelei Fan wrote:
>>>> On 8/9/2013 10:14 AM, Weijun Wang wrote:
>>>>>
>>>>> On 8/9/13 9:37 AM, Xuelei Fan wrote:
>>>>>> On 8/9/2013 9:22 AM, Weijun Wang wrote:
>>>>>>> I tried nslookup. Those with ".." inside are illegal,
>>>>>>>
>>>>>>> $ nslookup com..
>>>>>>> nslookup: 'com..' is not a legal name (empty label)
>>>>>>>
>>>>>>> but
>>>>>>>
>>>>>>> $ nslookup .
>>>>>>> Server:         192.168.10.1
>>>>>>> Address:        192.168.10.1#53
>>>>>>>
>>>>>>> Non-authoritative answer:
>>>>>>> *** Can't find .: No answer
>>>>>>>
>>>>>> Thanks for the testing. The behaviors are the same as this fix now.
>>>>> Not exactly. It seems nslookup still regards "." as legal but just cannot
>>>>> find an IP for it.
>>>>>
>>>> I'm not sure whether a root domain name can stand alone. The root label
>>>> is not considered a label in IDN. I think it is safe to regard "." as
>>>> not a valid IDN, as it contains no label. Anyway, it is a corner
>>>> case.
>>>>
>>>> There are many online IDN conversion web services; some of them can
>>>> convert ".", some of them cannot. In the present implementation, we
>>>> cannot recognize ".", and IDN.toASCII(".") throws
>>>> StringIndexOutOfBoundsException. With this fix, I was wondering whether
>>>> IAE is a better exception for IDN.toASCII(".").
>>>>
>>>>>> Learned something new today: how to use nslookup.
>>>>>>
>>>>>>> Also, since this bug was originally about SNIHostName, do you need to
>>>>>>> add some extra restriction there to reject "oracle.com." things?
>>>>>>>
>>>>>> No, we cannot restrict the format of IDN in SNIHostName more than in
>>>>>> IDN. However, we may need to rethink the comparison of two IDNs; for
>>>>>> example, "example.com." should equal "example.com". I want to
>>>>>> consider it in another bug.
>>>>> Not sure. Does the spec say IDN and SNIHostName are equivalent sets? And
>>>>> it's not one is another's subset?
>>>>>
>>>> Per the TLS specification, the host name in SNI is an IDN. The spec of
>>>> SNIHostName says, "hostname is not a valid Internationalized Domain Name
>>>> (IDN) compliant with the RFC 3490 specification". The spec in
>>>> SNIHostName has the same meaning as IDN. I don't want to add additional
>>>> restrictions beyond the specification of an IDN.
>>>>
>>>> Xuelei
>>>>
>>>>>> Can I push the changeset?
>>>>> I think it's better to ask someone in the networking team to make the
>>>>> suggestion. From what I read from Michael in this thread, he does not
>>>>> seem to totally agree with your code changes (at least not the 00 version).
>>>>>
>>>>> Thanks
>>>>> Max
>>>>>
>>>>>> Thanks,
>>>>>> Xuelei
>>>>>>
>>>>>>> Thanks
>>>>>>> Max
>>>>>>>
>>>>>>> On 8/9/13 8:41 AM, Xuelei Fan wrote:
>>>>>>>> Ping.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Xuelei
>>>>>>>>
>>>>>>>> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>>>>>>>>> Please review the new update:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>>>>>>>>
>>>>>>>>> With this update, "com." is valid (return "com."); "." and
>>>>>>>>> "example..com" are invalid. And IAE will be thrown for invalid
>>>>>>>>> IDN.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Xuelei
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 8/9/2013 7:31 PM, Michael McMahon wrote: > I don't see how this fixes the original problem as the SNIHostName spec > still doesn't like hostnames with a trailing '.' > The bug description did not reflect the IDN specification correctly. If "com." is a valid IDN, SNIHostName should accept it an an valid hostname. The host name in SNIHostName is nothing more or less than an standard IDN. I added a comment in the bug: "com." and "." are valid IDN according the IDN and domain name specifications. I will contact the bug reporter about this point. Xuelei > I'd prefer to check first where that requirement is coming from, if it is > actually necessary, and if not consider removing it from SNIHostName. > If it is necessary, then the check should be implemented in SNIHostName. > > Michael > > On 09/08/13 05:28, Xuelei Fan wrote: >> Thanks for your feedback and suggestions. >> >> Here is the new webrev: >> http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/ >> >> "." is regarded as valid IDN in this update. >> >> Thanks, >> Xuelei >> >> On 8/9/2013 10:50 AM, Xuelei Fan wrote: >>> On 8/9/2013 10:14 AM, Weijun Wang wrote: On 8/9/13 9:37 AM, Xuelei Fan wrote: > On 8/9/2013 9:22 AM, Weijun Wang wrote: >> I tried nslookup. Those with ".." inside are illegal, >> >> $ nslookup com.. >> nslookup: 'com..' is not a legal name (empty label) >> >> but >> >> $ nslookup . >> Server:192.168.10.1 >> Address:192.168.10.1#53 >> >> Non-authoritative answer: >> *** Can't find .: No answer >> > Thanks for the testing. The behaviors are the same as this fix now. No exactly. It seems nslookup still regards "." legal but just cannot find an IP for it. >>> I'm not sure whether a root domain name can be stand alone. Root label >>> is not considered as a label in IDN. I think it is safe to regard that >>> "." is not a valid IDN as it contains no label. Anyway, it is a corner >>> case. >>> >>> There are many online IDN conversion web services, some of them can >>> convert ".", some of the cannot. 
In the present implementation, we >>> cannot recognize ".", and IDN.toASCII(".") throws >>> StringIndexOutOfBoundsException. With this fix, I was wondering IAE is >>> a better exception for IDN.toASCII("."). >>> > Learn something new today to use nslookup. > >> Also, since this bug was originally about SNIHostName, do you need to >> add some extra restriction there to reject "oracle.com." things? >> > No, we cannot restrict the format of IDN in SNIHostName more than in > IDN. However, we may need to rethink about the comparing of two > IDN, for > example, "example.com." should equal to "example.com". I want to > consider it in another bug. Not sure. Does the spec say IDN and SNIHostName are equivalent sets? And it's not one is another's subset? >>> Per TLS specification, host name in SNI is an IDN. The spec of >>> SNIHostname says, "hostname is not a valid Internationalized Domain Name >>> (IDN) compliant with the RFC 3490 specification". The spec in >>> SNIHostName has the same means as IDN. I won't want to add additional >>> restrict beyond the specification of an IDN. >>> >>> Xuelei >>> > Can I push the changeset? I think it's better to ask someone in the networking team to make the suggestion. From what I read Michael in this thread, he does not seem totally agreed with your code changes (at least not the 00 version). Thanks Max > Thanks, > Xuelei > >> Thanks >> Max >> >> On 8/9/13 8:41 AM, Xuelei Fan wrote: >>> Ping. >>> >>> Thanks, >>> Xuelei >>> >>> On 8/7/2013 11:17 PM, Xuelei Fan wrote: Please review the new update: http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/ With this update, "com." is valid (return "com."); "." and "example..com" are invalid. And IAE will be thrown for invalid IDN. Thanks, Xuelei >
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
I don't see how this fixes the original problem as the SNIHostName spec still doesn't like hostnames with a trailing '.' I'd prefer to check first where that requirement is coming from, if it is actually necessary, and if not consider removing it from SNIHostName. If it is necessary, then the check should be implemented in SNIHostName. Michael On 09/08/13 05:28, Xuelei Fan wrote: Thanks for your feedback and suggestions. Here is the new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/ "." is regarded as valid IDN in this update. Thanks, Xuelei On 8/9/2013 10:50 AM, Xuelei Fan wrote: On 8/9/2013 10:14 AM, Weijun Wang wrote: On 8/9/13 9:37 AM, Xuelei Fan wrote: On 8/9/2013 9:22 AM, Weijun Wang wrote: I tried nslookup. Those with ".." inside are illegal, $ nslookup com.. nslookup: 'com..' is not a legal name (empty label) but $ nslookup . Server:192.168.10.1 Address:192.168.10.1#53 Non-authoritative answer: *** Can't find .: No answer Thanks for the testing. The behaviors are the same as this fix now. No exactly. It seems nslookup still regards "." legal but just cannot find an IP for it. I'm not sure whether a root domain name can be stand alone. Root label is not considered as a label in IDN. I think it is safe to regard that "." is not a valid IDN as it contains no label. Anyway, it is a corner case. There are many online IDN conversion web services, some of them can convert ".", some of the cannot. In the present implementation, we cannot recognize ".", and IDN.toASCII(".") throws StringIndexOutOfBoundsException. With this fix, I was wondering IAE is a better exception for IDN.toASCII("."). Learn something new today to use nslookup. Also, since this bug was originally about SNIHostName, do you need to add some extra restriction there to reject "oracle.com." things? No, we cannot restrict the format of IDN in SNIHostName more than in IDN. However, we may need to rethink about the comparing of two IDN, for example, "example.com." 
should equal to "example.com". I want to consider it in another bug. Not sure. Does the spec say IDN and SNIHostName are equivalent sets? And it's not one is another's subset? Per TLS specification, host name in SNI is an IDN. The spec of SNIHostname says, "hostname is not a valid Internationalized Domain Name (IDN) compliant with the RFC 3490 specification". The spec in SNIHostName has the same means as IDN. I won't want to add additional restrict beyond the specification of an IDN. Xuelei Can I push the changeset? I think it's better to ask someone in the networking team to make the suggestion. From what I read Michael in this thread, he does not seem totally agreed with your code changes (at least not the 00 version). Thanks Max Thanks, Xuelei Thanks Max On 8/9/13 8:41 AM, Xuelei Fan wrote: Ping. Thanks, Xuelei On 8/7/2013 11:17 PM, Xuelei Fan wrote: Please review the new update: http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/ With this update, "com." is valid (return "com."); "." and "example..com" are invalid. And IAE will be thrown for invalid IDN. Thanks, Xuelei
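The option Michael raises (doing the trailing-dot check in SNIHostName rather than in IDN) could look roughly like the sketch below. It assumes a JDK where IDN.toASCII() keeps the trailing dot in its return value, which is what this review is working towards. `SniNameCheck` and `requireNoTrailingDot` are made-up names for illustration; this is not the actual SNIHostName code.

```java
import java.net.IDN;

// Hypothetical helper, not the real javax.net.ssl.SNIHostName implementation.
final class SniNameCheck {

    // Converts the name with IDN.toASCII() first, then applies the
    // SNI-specific restriction (no trailing dot) on top of the IDN rules.
    static String requireNoTrailingDot(String hostname) {
        // IDN.toASCII() may itself throw IllegalArgumentException.
        String ascii = IDN.toASCII(hostname);
        if (ascii.isEmpty() || ascii.endsWith(".")) {
            throw new IllegalArgumentException(
                "SNI host name must not be empty or end with a dot: " + hostname);
        }
        return ascii;
    }

    public static void main(String[] args) {
        System.out.println(requireNoTrailingDot("example.com"));
        try {
            requireNoTrailingDot("example.com.");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping the restriction in a separate step like this leaves IDN free to accept every name the IDN and DNS specifications allow.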
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On Aug 9, 2013, at 14:08, Dmitry Samersoff wrote: > Xuelei, > > 119 p = q + 1; > 120 if (p < input.length() || q == (input.length() - 1)) { > > Could be simplified to: > > q <= input.length()-1 > It's cool! Xuelei > -Dmitry > > On 2013-08-09 04:41, Xuelei Fan wrote: >> Ping. >> >> Thanks, >> Xuelei >> >> On 8/7/2013 11:17 PM, Xuelei Fan wrote: >>> Please review the new update: >>> >>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/ >>> >>> With this update, "com." is valid (return "com."); "." and >>> "example..com" are invalid. And IAE will be thrown for invalid IDN. >>> >>> Thanks, >>> Xuelei >>> >>> On 8/7/2013 10:18 PM, Michael McMahon wrote: On 07/08/13 15:13, Xuelei Fan wrote: > On 8/7/2013 10:05 PM, Michael McMahon wrote: >> Resolvers seem to accept queries using trailing dots. >> >> eg nslookup www.oracle.com. >> >> or InetAddress.getByName("www.oracle.com."); >> >> The part of RFC3490 quoted below seems to me to be saying >> that the empty label implied by the trailing dot is not regarded >> as a label so that you don't end up calling toAscii() or toUnicode() >> with an empty string. I don't think it's saying the trailing dot can't >> be there. > It makes sense. > > What's your preference to return for IDN.toASCII("www.oracle.com."), > "www.oracle.com." or "www.oracle.com"? The current returned value is > "www.oracle.com". I would like to reserve the behavior in this update. My opinion is to keep it as at present ie. "www.oracle.com." Michael > I think we are on same page soon. > > Thanks, > Xuelei > >> Michael >> >> On 07/08/13 13:44, Xuelei Fan wrote: >>> On 8/7/2013 12:06 AM, Matthew Hall wrote: Trailing dots are allowed in plain DNS (thus almost surely in IDN), and the single dot represents the root zone. So you have to be careful making this sort of change to check the DNS RFCs first. 
>>> That's the first question we need to answer, whether IDN allow tailling >>> dots ("com."), zero-length root label ("."), and zero-length label ("", >>> for example ""example..com")? >>> >>> Per the specification of IDN.toASCII(): >>> === >>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If >>> ToASCII operation fails, an IllegalArgumentException will be thrown. In >>> this case, the input string should not be used in an internationalized >>> domain name. >>> >>> A label is an individual part of a domain name. The original ToASCII >>> operation, as defined in RFC 3490, only operates on a single label. >>> This >>> method can handle both label and entire domain name, by assuming that >>> labels in a domain name are always separated by dots. ... >>> >>> Throws IllegalArgumentException - if the input string doesn't >>> conform to >>> RFC 3490 specification" >>> >>> Per the specification of RFC 3490: >>> == >>> [section 2] >>> "A label is an individual part of a domain name. Labels are usually >>> shown separated by dots; for example, the domain name >>> "www.example.com" is composed of three labels: "www", "example", and >>> "com". (The zero-length root label described in [STD13], which can >>> be explicit as in "www.example.com." or implicit as in >>> "www.example.com", is not considered a label in this >>> specification.)" >>> >>> "An "internationalized label" is a label to which the ToASCII >>> operation (see section 4) can be applied without failing (with the >>> UseSTD3ASCIIRules flag unset). ... >>> Although most Unicode characters can appear in >>> internationalized labels, ToASCII will fail for some input strings, >>> and such strings are not valid internationalized labels." >>> >>> "An "internationalized domain name" (IDN) is a domain name in which >>> every label is an internationalized label." >>> >>> [Section 4.1] >>> "ToASCII consists of the following steps: >>> >>> ... >>> >>> 8. 
Verify that the number of code points is in the range 1 to 63 >>>inclusive." >>> >>> >>> Here are the questions: >>> 1. whether "example..com" is an valid IDN? >>> As dot is used as label separators, there are three labels, >>> "example", "", "com". Per RFC 3490, "" is not a valid label. Hence, >>> "example..com" is not a valid IDN. >>> >>> We need to address the issue in IDN. >>> >>> 2. whether "xyz." is an valid IDN? >>> It's an gray area, I think. We can treat the trailing "." as root >>> label, or a label separator. >>> If the trailing "." is treated as label separator, "xyz." is >>> invalid >>>
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Xuelei, 119 p = q + 1; 120 if (p < input.length() || q == (input.length() - 1)) { Could be simplified to: q <= input.length()-1 -Dmitry On 2013-08-09 04:41, Xuelei Fan wrote: > Ping. > > Thanks, > Xuelei > > On 8/7/2013 11:17 PM, Xuelei Fan wrote: >> Please review the new update: >> >> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/ >> >> With this update, "com." is valid (return "com."); "." and >> "example..com" are invalid. And IAE will be thrown for invalid IDN. >> >> Thanks, >> Xuelei >> >> On 8/7/2013 10:18 PM, Michael McMahon wrote: >>> On 07/08/13 15:13, Xuelei Fan wrote: On 8/7/2013 10:05 PM, Michael McMahon wrote: > Resolvers seem to accept queries using trailing dots. > > eg nslookup www.oracle.com. > > or InetAddress.getByName("www.oracle.com."); > > The part of RFC3490 quoted below seems to me to be saying > that the empty label implied by the trailing dot is not regarded > as a label so that you don't end up calling toAscii() or toUnicode() > with an empty string. I don't think it's saying the trailing dot can't > be there. > It makes sense. What's your preference to return for IDN.toASCII("www.oracle.com."), "www.oracle.com." or "www.oracle.com"? The current returned value is "www.oracle.com". I would like to reserve the behavior in this update. >>> >>> My opinion is to keep it as at present ie. "www.oracle.com." >>> >>> Michael >>> I think we are on same page soon. Thanks, Xuelei > Michael > > On 07/08/13 13:44, Xuelei Fan wrote: >> On 8/7/2013 12:06 AM, Matthew Hall wrote: >>> Trailing dots are allowed in plain DNS (thus almost surely in IDN), >>> and the single dot represents the root zone. So you have to be >>> careful making this sort of change to check the DNS RFCs first. >> That's the first question we need to answer, whether IDN allow tailling >> dots ("com."), zero-length root label ("."), and zero-length label ("", >> for example ""example..com")? >> >> Per the specification of IDN.toASCII(): >> === >> "ToASCII operation can fail. 
ToASCII fails if any step of it fails. If >> ToASCII operation fails, an IllegalArgumentException will be thrown. In >> this case, the input string should not be used in an internationalized >> domain name. >> >> A label is an individual part of a domain name. The original ToASCII >> operation, as defined in RFC 3490, only operates on a single label. >> This >> method can handle both label and entire domain name, by assuming that >> labels in a domain name are always separated by dots. ... >> >> Throws IllegalArgumentException - if the input string doesn't >> conform to >> RFC 3490 specification" >> >> Per the specification of RFC 3490: >> == >> [section 2] >> "A label is an individual part of a domain name. Labels are usually >>shown separated by dots; for example, the domain name >>"www.example.com" is composed of three labels: "www", "example", and >>"com". (The zero-length root label described in [STD13], which can >>be explicit as in "www.example.com." or implicit as in >>"www.example.com", is not considered a label in this >> specification.)" >> >> "An "internationalized label" is a label to which the ToASCII >>operation (see section 4) can be applied without failing (with the >>UseSTD3ASCIIRules flag unset). ... >>Although most Unicode characters can appear in >>internationalized labels, ToASCII will fail for some input strings, >>and such strings are not valid internationalized labels." >> >> "An "internationalized domain name" (IDN) is a domain name in which >>every label is an internationalized label." >> >> [Section 4.1] >> "ToASCII consists of the following steps: >> >>... >> >>8. Verify that the number of code points is in the range 1 to 63 >> inclusive." >> >> >> Here are the questions: >> 1. whether "example..com" is an valid IDN? >> As dot is used as label separators, there are three labels, >> "example", "", "com". Per RFC 3490, "" is not a valid label. Hence, >> "example..com" is not a valid IDN. >> >> We need to address the issue in IDN. >> >> 2. 
whether "xyz." is an valid IDN? >> It's an gray area, I think. We can treat the trailing "." as root >> label, or a label separator. >> If the trailing "." is treated as label separator, "xyz." is >> invalid >> per RFC 3490. >> if the trailing "." is treated as root label, what's the expected >> return value of IDN.toASCII("xyz.")? I think the return value can be >> either "xyz." or "xyz". The current
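Dmitry's simplification at the top of this message can be checked mechanically. The sketch below assumes only what the quoted lines 119-120 show, namely that p is always q + 1 at that point; the class and method names are invented for the check.

```java
// Brute-force check that the two conditions agree whenever p = q + 1.
final class ConditionEquivalence {

    // Condition as quoted from the webrev (line 120).
    static boolean original(int q, int length) {
        int p = q + 1;
        return p < length || q == (length - 1);
    }

    // Dmitry's proposed replacement.
    static boolean simplified(int q, int length) {
        return q <= length - 1;
    }

    // True if both forms agree for every length in [0, maxLength]
    // and every q in [0, length].
    static boolean agreeUpTo(int maxLength) {
        for (int length = 0; length <= maxLength; length++) {
            for (int q = 0; q <= length; q++) {
                if (original(q, length) != simplified(q, length)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(64));
    }
}
```

The reasoning is short: p < length is the same as q < length - 1, and adding the case q == length - 1 gives q <= length - 1.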
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Thanks for your feedback and suggestions. Here is the new webrev: http://cr.openjdk.java.net/~xuelei/8020842/webrev.02/ "." is regarded as valid IDN in this update. Thanks, Xuelei On 8/9/2013 10:50 AM, Xuelei Fan wrote: > On 8/9/2013 10:14 AM, Weijun Wang wrote: >> >> >> On 8/9/13 9:37 AM, Xuelei Fan wrote: >>> On 8/9/2013 9:22 AM, Weijun Wang wrote: I tried nslookup. Those with ".." inside are illegal, $ nslookup com.. nslookup: 'com..' is not a legal name (empty label) but $ nslookup . Server:192.168.10.1 Address:192.168.10.1#53 Non-authoritative answer: *** Can't find .: No answer >>> Thanks for the testing. The behaviors are the same as this fix now. >> >> No exactly. It seems nslookup still regards "." legal but just cannot >> find an IP for it. >> > I'm not sure whether a root domain name can be stand alone. Root label > is not considered as a label in IDN. I think it is safe to regard that > "." is not a valid IDN as it contains no label. Anyway, it is a corner > case. > > There are many online IDN conversion web services, some of them can > convert ".", some of the cannot. In the present implementation, we > cannot recognize ".", and IDN.toASCII(".") throws > StringIndexOutOfBoundsException. With this fix, I was wondering IAE is > a better exception for IDN.toASCII("."). > >>> >>> Learn something new today to use nslookup. >>> Also, since this bug was originally about SNIHostName, do you need to add some extra restriction there to reject "oracle.com." things? >>> No, we cannot restrict the format of IDN in SNIHostName more than in >>> IDN. However, we may need to rethink about the comparing of two IDN, for >>> example, "example.com." should equal to "example.com". I want to >>> consider it in another bug. >> >> Not sure. Does the spec say IDN and SNIHostName are equivalent sets? And >> it's not one is another's subset? >> > Per TLS specification, host name in SNI is an IDN. 
The spec of > SNIHostname says, "hostname is not a valid Internationalized Domain Name > (IDN) compliant with the RFC 3490 specification". The spec in > SNIHostName has the same means as IDN. I won't want to add additional > restrict beyond the specification of an IDN. > > Xuelei > >>> >>> Can I push the changeset? >> >> I think it's better to ask someone in the networking team to make the >> suggestion. From what I read Michael in this thread, he does not seem >> totally agreed with your code changes (at least not the 00 version). >> >> Thanks >> Max >> >>> >>> Thanks, >>> Xuelei >>> Thanks Max On 8/9/13 8:41 AM, Xuelei Fan wrote: > Ping. > > Thanks, > Xuelei > > On 8/7/2013 11:17 PM, Xuelei Fan wrote: >> Please review the new update: >> >> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/ >> >> With this update, "com." is valid (return "com."); "." and >> "example..com" are invalid. And IAE will be thrown for invalid IDN. >> >> Thanks, >> Xuelei >> >>> >
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 8/9/2013 11:24 AM, Matthew Hall wrote:
> But, DNS considers "." as the valid root zone...
>
Good! It looks like IDN.toASCII(".") should return ".", so that a general domain name can always go through the IDN.toASCII() conversion without a runtime exception being thrown.

Xuelei
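To see what a given JDK actually does with the names discussed in this thread, a small probe like the one below may help. `ToAsciiProbe` is an invented name; the only library call used is java.net.IDN.toASCII(String). No expected output is shown because the results depend on whether the JDK includes this fix.

```java
import java.net.IDN;

final class ToAsciiProbe {

    // Returns the converted name, or a short marker naming the exception thrown.
    static String probe(String name) {
        try {
            return IDN.toASCII(name);
        } catch (RuntimeException e) {
            return "<" + e.getClass().getSimpleName() + ">";
        }
    }

    public static void main(String[] args) {
        String[] names = {".", "com.", "example.com.", "example..com"};
        for (String name : names) {
            System.out.println("\"" + name + "\" -> " + probe(name));
        }
    }
}
```

Catching RuntimeException rather than only IllegalArgumentException is deliberate, so that the StringIndexOutOfBoundsException reported for "." on older builds shows up in the output instead of ending the run.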
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
But, DNS considers "." as the valid root zone... -- Sent from my mobile device. Xuelei Fan wrote: >On 8/9/2013 10:14 AM, Weijun Wang wrote: >> >> >> On 8/9/13 9:37 AM, Xuelei Fan wrote: >>> On 8/9/2013 9:22 AM, Weijun Wang wrote: I tried nslookup. Those with ".." inside are illegal, $ nslookup com.. nslookup: 'com..' is not a legal name (empty label) but $ nslookup . Server:192.168.10.1 Address:192.168.10.1#53 Non-authoritative answer: *** Can't find .: No answer >>> Thanks for the testing. The behaviors are the same as this fix now. >> >> No exactly. It seems nslookup still regards "." legal but just cannot >> find an IP for it. >> >I'm not sure whether a root domain name can be stand alone. Root label >is not considered as a label in IDN. I think it is safe to regard that >"." is not a valid IDN as it contains no label. Anyway, it is a corner >case. > >There are many online IDN conversion web services, some of them can >convert ".", some of the cannot. In the present implementation, we >cannot recognize ".", and IDN.toASCII(".") throws >StringIndexOutOfBoundsException. With this fix, I was wondering IAE is >a better exception for IDN.toASCII("."). > >>> >>> Learn something new today to use nslookup. >>> Also, since this bug was originally about SNIHostName, do you need >to add some extra restriction there to reject "oracle.com." things? >>> No, we cannot restrict the format of IDN in SNIHostName more than in >>> IDN. However, we may need to rethink about the comparing of two IDN, >for >>> example, "example.com." should equal to "example.com". I want to >>> consider it in another bug. >> >> Not sure. Does the spec say IDN and SNIHostName are equivalent sets? >And >> it's not one is another's subset? >> >Per TLS specification, host name in SNI is an IDN. The spec of >SNIHostname says, "hostname is not a valid Internationalized Domain >Name >(IDN) compliant with the RFC 3490 specification". The spec in >SNIHostName has the same means as IDN. 
I won't want to add additional >restrict beyond the specification of an IDN. > >Xuelei > >>> >>> Can I push the changeset? >> >> I think it's better to ask someone in the networking team to make the >> suggestion. From what I read Michael in this thread, he does not seem >> totally agreed with your code changes (at least not the 00 version). >> >> Thanks >> Max >> >>> >>> Thanks, >>> Xuelei >>> Thanks Max On 8/9/13 8:41 AM, Xuelei Fan wrote: > Ping. > > Thanks, > Xuelei > > On 8/7/2013 11:17 PM, Xuelei Fan wrote: >> Please review the new update: >> >> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/ >> >> With this update, "com." is valid (return "com."); "." and >> "example..com" are invalid. And IAE will be thrown for invalid >IDN. >> >> Thanks, >> Xuelei >> >>>
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 8/9/2013 10:14 AM, Weijun Wang wrote:
>
>
> On 8/9/13 9:37 AM, Xuelei Fan wrote:
>> On 8/9/2013 9:22 AM, Weijun Wang wrote:
>>> I tried nslookup. Those with ".." inside are illegal,
>>>
>>> $ nslookup com..
>>> nslookup: 'com..' is not a legal name (empty label)
>>>
>>> but
>>>
>>> $ nslookup .
>>> Server:192.168.10.1
>>> Address:192.168.10.1#53
>>>
>>> Non-authoritative answer:
>>> *** Can't find .: No answer
>>>
>> Thanks for the testing. The behaviors are the same as this fix now.
>
> No exactly. It seems nslookup still regards "." legal but just cannot
> find an IP for it.
>
I'm not sure whether a root domain name can stand alone. The root label is not considered a label in IDN. I think it is safe to regard "." as not a valid IDN, as it contains no label. Anyway, it is a corner case.

There are many online IDN conversion web services; some of them can convert ".", some of them cannot. In the present implementation, we cannot recognize ".", and IDN.toASCII(".") throws StringIndexOutOfBoundsException. With this fix, I was wondering whether IAE would be a better exception for IDN.toASCII(".").

>>
>> Learn something new today to use nslookup.
>>
>>> Also, since this bug was originally about SNIHostName, do you need to
>>> add some extra restriction there to reject "oracle.com." things?
>>>
>> No, we cannot restrict the format of IDN in SNIHostName more than in
>> IDN. However, we may need to rethink about the comparing of two IDN, for
>> example, "example.com." should equal to "example.com". I want to
>> consider it in another bug.
>
> Not sure. Does the spec say IDN and SNIHostName are equivalent sets? And
> it's not one is another's subset?
>
Per the TLS specification, the host name in SNI is an IDN. The spec of SNIHostName says an IAE is thrown if the "hostname is not a valid Internationalized Domain Name (IDN) compliant with the RFC 3490 specification". The spec of SNIHostName means the same thing by IDN. I don't want to add restrictions beyond the specification of an IDN.

Xuelei

>>
>> Can I push the changeset?
>
> I think it's better to ask someone in the networking team to make the
> suggestion. From what I read Michael in this thread, he does not seem
> totally agreed with your code changes (at least not the 00 version).
>
> Thanks
> Max
>
>>
>> Thanks,
>> Xuelei
>>
>>> Thanks
>>> Max
>>>
>>> On 8/9/13 8:41 AM, Xuelei Fan wrote:
>>>> Ping.
>>>>
>>>> Thanks,
>>>> Xuelei
>>>>
>>>> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>>>>> Please review the new update:
>>>>>
>>>>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>>>>
>>>>> With this update, "com." is valid (return "com."); "." and
>>>>> "example..com" are invalid. And IAE will be thrown for invalid IDN.
>>>>>
>>>>> Thanks,
>>>>> Xuelei
>>>>>
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 8/9/13 9:37 AM, Xuelei Fan wrote:
> On 8/9/2013 9:22 AM, Weijun Wang wrote:
>> I tried nslookup. Those with ".." inside are illegal,
>>
>> $ nslookup com..
>> nslookup: 'com..' is not a legal name (empty label)
>>
>> but
>>
>> $ nslookup .
>> Server:192.168.10.1
>> Address:192.168.10.1#53
>>
>> Non-authoritative answer:
>> *** Can't find .: No answer
>>
> Thanks for the testing. The behaviors are the same as this fix now.

Not exactly. It seems nslookup still regards "." as legal but just cannot find an IP for it.

>
> Learn something new today to use nslookup.
>
>> Also, since this bug was originally about SNIHostName, do you need to
>> add some extra restriction there to reject "oracle.com." things?
>>
> No, we cannot restrict the format of IDN in SNIHostName more than in
> IDN. However, we may need to rethink about the comparing of two IDN, for
> example, "example.com." should equal to "example.com". I want to
> consider it in another bug.

Not sure. Does the spec say IDN and SNIHostName are equivalent sets, rather than one being a subset of the other?

>
> Can I push the changeset?

I think it's better to ask someone in the networking team to make the suggestion. From what I read of Michael's comments in this thread, he does not seem to totally agree with your code changes (at least not the 00 version).

Thanks
Max

>
> Thanks,
> Xuelei
>
>> Thanks
>> Max
>>
>> On 8/9/13 8:41 AM, Xuelei Fan wrote:
>>> Ping.
>>>
>>> Thanks,
>>> Xuelei
>>>
>>> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>>>> Please review the new update:
>>>>
>>>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>>>
>>>> With this update, "com." is valid (return "com."); "." and
>>>> "example..com" are invalid. And IAE will be thrown for invalid IDN.
>>>>
>>>> Thanks,
>>>> Xuelei
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 8/9/2013 9:22 AM, Weijun Wang wrote:
> I tried nslookup. Those with ".." inside are illegal,
>
> $ nslookup com..
> nslookup: 'com..' is not a legal name (empty label)
>
> but
>
> $ nslookup .
> Server:192.168.10.1
> Address:192.168.10.1#53
>
> Non-authoritative answer:
> *** Can't find .: No answer
>
Thanks for the testing. The behavior of this fix is now the same as nslookup's.

I learned something new today: how to use nslookup.

> Also, since this bug was originally about SNIHostName, do you need to
> add some extra restriction there to reject "oracle.com." things?
>
No, we cannot restrict the format of an IDN in SNIHostName more than IDN itself does. However, we may need to rethink how two IDNs are compared; for example, "example.com." should equal "example.com". I want to consider that in another bug.

Can I push the changeset?

Thanks,
Xuelei

> Thanks
> Max
>
> On 8/9/13 8:41 AM, Xuelei Fan wrote:
>> Ping.
>>
>> Thanks,
>> Xuelei
>>
>> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>>> Please review the new update:
>>>
>>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>>
>>> With this update, "com." is valid (return "com."); "." and
>>> "example..com" are invalid. And IAE will be thrown for invalid IDN.
>>>
>>> Thanks,
>>> Xuelei
>>>
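On the comparison point in this message ("example.com." versus "example.com"), one possible shape for such a comparison is sketched below. It is only an illustration of the idea deferred to another bug: `HostNameCompare`, `stripRootDot` and `sameHost` are invented names, and the inputs are assumed to be ASCII names that have already been through IDN.toASCII().

```java
import java.util.Locale;

final class HostNameCompare {

    // Drops a single trailing dot (the explicit root label), if present.
    // The bare root name "." is left alone.
    static String stripRootDot(String name) {
        return (name.length() > 1 && name.endsWith("."))
                ? name.substring(0, name.length() - 1)
                : name;
    }

    // Compares two ASCII names, ignoring case and an optional trailing dot.
    static boolean sameHost(String a, String b) {
        return stripRootDot(a).toLowerCase(Locale.ROOT)
                .equals(stripRootDot(b).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(sameHost("example.com.", "example.com"));
        System.out.println(sameHost("example.com", "example.org"));
    }
}
```

Whether SNIHostName.equals() should behave this way is exactly the open question; the sketch only shows that the normalization itself is small.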
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
I tried nslookup. Those with ".." inside are illegal,

$ nslookup com..
nslookup: 'com..' is not a legal name (empty label)

but

$ nslookup .
Server: 192.168.10.1
Address:192.168.10.1#53

Non-authoritative answer:
*** Can't find .: No answer

Also, since this bug was originally about SNIHostName, do you need to add some extra restriction there to reject "oracle.com." things?

Thanks
Max

On 8/9/13 8:41 AM, Xuelei Fan wrote:
> Ping.
>
> Thanks,
> Xuelei
>
> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>> Please review the new update:
>>
>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>
>> With this update, "com." is valid (return "com."); "." and
>> "example..com" are invalid. And IAE will be thrown for invalid IDN.
>>
>> Thanks,
>> Xuelei
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Ping. Thanks, Xuelei On 8/7/2013 11:17 PM, Xuelei Fan wrote: > Please review the new update: > > http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/ > > With this update, "com." is valid (return "com."); "." and > "example..com" are invalid. And IAE will be thrown for invalid IDN. > > Thanks, > Xuelei > > On 8/7/2013 10:18 PM, Michael McMahon wrote: >> On 07/08/13 15:13, Xuelei Fan wrote: >>> On 8/7/2013 10:05 PM, Michael McMahon wrote: Resolvers seem to accept queries using trailing dots. eg nslookup www.oracle.com. or InetAddress.getByName("www.oracle.com."); The part of RFC3490 quoted below seems to me to be saying that the empty label implied by the trailing dot is not regarded as a label so that you don't end up calling toAscii() or toUnicode() with an empty string. I don't think it's saying the trailing dot can't be there. >>> It makes sense. >>> >>> What's your preference to return for IDN.toASCII("www.oracle.com."), >>> "www.oracle.com." or "www.oracle.com"? The current returned value is >>> "www.oracle.com". I would like to reserve the behavior in this update. >> >> My opinion is to keep it as at present ie. "www.oracle.com." >> >> Michael >> >>> I think we are on same page soon. >>> >>> Thanks, >>> Xuelei >>> Michael On 07/08/13 13:44, Xuelei Fan wrote: > On 8/7/2013 12:06 AM, Matthew Hall wrote: >> Trailing dots are allowed in plain DNS (thus almost surely in IDN), >> and the single dot represents the root zone. So you have to be >> careful making this sort of change to check the DNS RFCs first. > That's the first question we need to answer, whether IDN allow tailling > dots ("com."), zero-length root label ("."), and zero-length label ("", > for example ""example..com")? > > Per the specification of IDN.toASCII(): > === > "ToASCII operation can fail. ToASCII fails if any step of it fails. If > ToASCII operation fails, an IllegalArgumentException will be thrown. In > this case, the input string should not be used in an internationalized > domain name. 
> > A label is an individual part of a domain name. The original ToASCII > operation, as defined in RFC 3490, only operates on a single label. > This > method can handle both label and entire domain name, by assuming that > labels in a domain name are always separated by dots. ... > > Throws IllegalArgumentException - if the input string doesn't > conform to > RFC 3490 specification" > > Per the specification of RFC 3490: > == > [section 2] > "A label is an individual part of a domain name. Labels are usually >shown separated by dots; for example, the domain name >"www.example.com" is composed of three labels: "www", "example", and >"com". (The zero-length root label described in [STD13], which can >be explicit as in "www.example.com." or implicit as in >"www.example.com", is not considered a label in this > specification.)" > > "An "internationalized label" is a label to which the ToASCII >operation (see section 4) can be applied without failing (with the >UseSTD3ASCIIRules flag unset). ... >Although most Unicode characters can appear in >internationalized labels, ToASCII will fail for some input strings, >and such strings are not valid internationalized labels." > > "An "internationalized domain name" (IDN) is a domain name in which >every label is an internationalized label." > > [Section 4.1] > "ToASCII consists of the following steps: > >... > >8. Verify that the number of code points is in the range 1 to 63 > inclusive." > > > Here are the questions: > 1. whether "example..com" is an valid IDN? > As dot is used as label separators, there are three labels, > "example", "", "com". Per RFC 3490, "" is not a valid label. Hence, > "example..com" is not a valid IDN. > > We need to address the issue in IDN. > > 2. whether "xyz." is an valid IDN? > It's an gray area, I think. We can treat the trailing "." as root > label, or a label separator. > If the trailing "." is treated as label separator, "xyz." is > invalid > per RFC 3490. > if the trailing "." 
is treated as root label, what's the expected > return value of IDN.toASCII("xyz.")? I think the return value can be > either "xyz." or "xyz". The current implementation returns "xyz". > > We may need not to update the implementation if tailing "." is > treated as root label. > > 3. whether "." is an valid IDN? > It's an gray area again, I think. > As above, if the trailing "." is treated as root label, I think > the > return val
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Please review the new update: http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/ With this update, "com." is valid (return "com."); "." and "example..com" are invalid. And IAE will be thrown for invalid IDN. Thanks, Xuelei On 8/7/2013 10:18 PM, Michael McMahon wrote: > On 07/08/13 15:13, Xuelei Fan wrote: >> On 8/7/2013 10:05 PM, Michael McMahon wrote: >>> Resolvers seem to accept queries using trailing dots. >>> >>> eg nslookup www.oracle.com. >>> >>> or InetAddress.getByName("www.oracle.com."); >>> >>> The part of RFC3490 quoted below seems to me to be saying >>> that the empty label implied by the trailing dot is not regarded >>> as a label so that you don't end up calling toAscii() or toUnicode() >>> with an empty string. I don't think it's saying the trailing dot can't >>> be there. >>> >> It makes sense. >> >> What's your preference to return for IDN.toASCII("www.oracle.com."), >> "www.oracle.com." or "www.oracle.com"? The current returned value is >> "www.oracle.com". I would like to reserve the behavior in this update. > > My opinion is to keep it as at present ie. "www.oracle.com." > > Michael > >> I think we are on same page soon. >> >> Thanks, >> Xuelei >> >>> Michael >>> >>> On 07/08/13 13:44, Xuelei Fan wrote: On 8/7/2013 12:06 AM, Matthew Hall wrote: > Trailing dots are allowed in plain DNS (thus almost surely in IDN), > and the single dot represents the root zone. So you have to be > careful making this sort of change to check the DNS RFCs first. That's the first question we need to answer, whether IDN allow tailling dots ("com."), zero-length root label ("."), and zero-length label ("", for example ""example..com")? Per the specification of IDN.toASCII(): === "ToASCII operation can fail. ToASCII fails if any step of it fails. If ToASCII operation fails, an IllegalArgumentException will be thrown. In this case, the input string should not be used in an internationalized domain name. A label is an individual part of a domain name. 
The original ToASCII operation, as defined in RFC 3490, only operates on a single label. This method can handle both label and entire domain name, by assuming that labels in a domain name are always separated by dots. ... Throws IllegalArgumentException - if the input string doesn't conform to RFC 3490 specification" Per the specification of RFC 3490: == [section 2] "A label is an individual part of a domain name. Labels are usually shown separated by dots; for example, the domain name "www.example.com" is composed of three labels: "www", "example", and "com". (The zero-length root label described in [STD13], which can be explicit as in "www.example.com." or implicit as in "www.example.com", is not considered a label in this specification.)" "An "internationalized label" is a label to which the ToASCII operation (see section 4) can be applied without failing (with the UseSTD3ASCIIRules flag unset). ... Although most Unicode characters can appear in internationalized labels, ToASCII will fail for some input strings, and such strings are not valid internationalized labels." "An "internationalized domain name" (IDN) is a domain name in which every label is an internationalized label." [Section 4.1] "ToASCII consists of the following steps: ... 8. Verify that the number of code points is in the range 1 to 63 inclusive." Here are the questions: 1. whether "example..com" is an valid IDN? As dot is used as label separators, there are three labels, "example", "", "com". Per RFC 3490, "" is not a valid label. Hence, "example..com" is not a valid IDN. We need to address the issue in IDN. 2. whether "xyz." is an valid IDN? It's an gray area, I think. We can treat the trailing "." as root label, or a label separator. If the trailing "." is treated as label separator, "xyz." is invalid per RFC 3490. if the trailing "." is treated as root label, what's the expected return value of IDN.toASCII("xyz.")? I think the return value can be either "xyz." or "xyz". 
The current implementation returns "xyz". We may need not to update the implementation if tailing "." is treated as root label. 3. whether "." is an valid IDN? It's an gray area again, I think. As above, if the trailing "." is treated as root label, I think the return value can be either "." or "". The current implementation throws a StringIndexOutOfBoundsException. However, what empty domain name ("") really means? I would prefer
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 07/08/13 15:13, Xuelei Fan wrote: On 8/7/2013 10:05 PM, Michael McMahon wrote: Resolvers seem to accept queries using trailing dots. eg nslookup www.oracle.com. or InetAddress.getByName("www.oracle.com."); The part of RFC3490 quoted below seems to me to be saying that the empty label implied by the trailing dot is not regarded as a label so that you don't end up calling toAscii() or toUnicode() with an empty string. I don't think it's saying the trailing dot can't be there. It makes sense. What's your preference to return for IDN.toASCII("www.oracle.com."), "www.oracle.com." or "www.oracle.com"? The current returned value is "www.oracle.com". I would like to reserve the behavior in this update. My opinion is to keep it as at present ie. "www.oracle.com." Michael I think we are on same page soon. Thanks, Xuelei Michael On 07/08/13 13:44, Xuelei Fan wrote: On 8/7/2013 12:06 AM, Matthew Hall wrote: Trailing dots are allowed in plain DNS (thus almost surely in IDN), and the single dot represents the root zone. So you have to be careful making this sort of change to check the DNS RFCs first. That's the first question we need to answer, whether IDN allow tailling dots ("com."), zero-length root label ("."), and zero-length label ("", for example ""example..com")? Per the specification of IDN.toASCII(): === "ToASCII operation can fail. ToASCII fails if any step of it fails. If ToASCII operation fails, an IllegalArgumentException will be thrown. In this case, the input string should not be used in an internationalized domain name. A label is an individual part of a domain name. The original ToASCII operation, as defined in RFC 3490, only operates on a single label. This method can handle both label and entire domain name, by assuming that labels in a domain name are always separated by dots. ... 
Throws IllegalArgumentException - if the input string doesn't conform to RFC 3490 specification" Per the specification of RFC 3490: == [section 2] "A label is an individual part of a domain name. Labels are usually shown separated by dots; for example, the domain name "www.example.com" is composed of three labels: "www", "example", and "com". (The zero-length root label described in [STD13], which can be explicit as in "www.example.com." or implicit as in "www.example.com", is not considered a label in this specification.)" "An "internationalized label" is a label to which the ToASCII operation (see section 4) can be applied without failing (with the UseSTD3ASCIIRules flag unset). ... Although most Unicode characters can appear in internationalized labels, ToASCII will fail for some input strings, and such strings are not valid internationalized labels." "An "internationalized domain name" (IDN) is a domain name in which every label is an internationalized label." [Section 4.1] "ToASCII consists of the following steps: ... 8. Verify that the number of code points is in the range 1 to 63 inclusive." Here are the questions: 1. whether "example..com" is an valid IDN? As dot is used as label separators, there are three labels, "example", "", "com". Per RFC 3490, "" is not a valid label. Hence, "example..com" is not a valid IDN. We need to address the issue in IDN. 2. whether "xyz." is an valid IDN? It's an gray area, I think. We can treat the trailing "." as root label, or a label separator. If the trailing "." is treated as label separator, "xyz." is invalid per RFC 3490. if the trailing "." is treated as root label, what's the expected return value of IDN.toASCII("xyz.")? I think the return value can be either "xyz." or "xyz". The current implementation returns "xyz". We may need not to update the implementation if tailing "." is treated as root label. 3. whether "." is an valid IDN? It's an gray area again, I think. As above, if the trailing "." 
is treated as root label, I think the return value can be either "." or "". The current implementation throws a StringIndexOutOfBoundsException. However, what empty domain name ("") really means? I would prefer to return "." for "." instead. We need to address the issue in IDN. Here comes the solution, the IDN.toASCII() returns: 1. "." for "."; 2. "xyz" for "xyz."; 3. IAE for "example..com". Does it make sense? Thanks, Xuelei On 8/7/2013 1:35 AM, Michael McMahon wrote: I don't really understand the reason for the restriction in SNIHostName But, I guess that is where it should be enforced if it is required. Michael. On 06/08/13 17:43, Dmitry Samersoff wrote: Xuelei, . (dot) is perfectly valid domain name and it means root domain so com. is valid domain name as well. It thinks to me that in context of methods your change we should ignore trailing dots, rather than throw exception. -Dmitry On 2013-08-06 15:44, Xuelei Fan wrote: Hi, Please review the bug fix
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 8/7/2013 10:05 PM, Michael McMahon wrote: > Resolvers seem to accept queries using trailing dots. > > eg nslookup www.oracle.com. > > or InetAddress.getByName("www.oracle.com."); > > The part of RFC3490 quoted below seems to me to be saying > that the empty label implied by the trailing dot is not regarded > as a label so that you don't end up calling toAscii() or toUnicode() > with an empty string. I don't think it's saying the trailing dot can't > be there. > It makes sense. What's your preference to return for IDN.toASCII("www.oracle.com."), "www.oracle.com." or "www.oracle.com"? The current returned value is "www.oracle.com". I would like to reserve the behavior in this update. I think we are on same page soon. Thanks, Xuelei > Michael > > On 07/08/13 13:44, Xuelei Fan wrote: >> On 8/7/2013 12:06 AM, Matthew Hall wrote: >>> Trailing dots are allowed in plain DNS (thus almost surely in IDN), >>> and the single dot represents the root zone. So you have to be >>> careful making this sort of change to check the DNS RFCs first. >> That's the first question we need to answer, whether IDN allow tailling >> dots ("com."), zero-length root label ("."), and zero-length label ("", >> for example ""example..com")? >> >> Per the specification of IDN.toASCII(): >> === >> "ToASCII operation can fail. ToASCII fails if any step of it fails. If >> ToASCII operation fails, an IllegalArgumentException will be thrown. In >> this case, the input string should not be used in an internationalized >> domain name. >> >> A label is an individual part of a domain name. The original ToASCII >> operation, as defined in RFC 3490, only operates on a single label. This >> method can handle both label and entire domain name, by assuming that >> labels in a domain name are always separated by dots. ... 
>> >> Throws IllegalArgumentException - if the input string doesn't conform to >> RFC 3490 specification" >> >> Per the specification of RFC 3490: >> == >> [section 2] >> "A label is an individual part of a domain name. Labels are usually >> shown separated by dots; for example, the domain name >> "www.example.com" is composed of three labels: "www", "example", and >> "com". (The zero-length root label described in [STD13], which can >> be explicit as in "www.example.com." or implicit as in >> "www.example.com", is not considered a label in this specification.)" >> >> "An "internationalized label" is a label to which the ToASCII >> operation (see section 4) can be applied without failing (with the >> UseSTD3ASCIIRules flag unset). ... >> Although most Unicode characters can appear in >> internationalized labels, ToASCII will fail for some input strings, >> and such strings are not valid internationalized labels." >> >> "An "internationalized domain name" (IDN) is a domain name in which >> every label is an internationalized label." >> >> [Section 4.1] >> "ToASCII consists of the following steps: >> >> ... >> >> 8. Verify that the number of code points is in the range 1 to 63 >>inclusive." >> >> >> Here are the questions: >> 1. whether "example..com" is an valid IDN? >> As dot is used as label separators, there are three labels, >> "example", "", "com". Per RFC 3490, "" is not a valid label. Hence, >> "example..com" is not a valid IDN. >> >> We need to address the issue in IDN. >> >> 2. whether "xyz." is an valid IDN? >> It's an gray area, I think. We can treat the trailing "." as root >> label, or a label separator. >> If the trailing "." is treated as label separator, "xyz." is invalid >> per RFC 3490. >> if the trailing "." is treated as root label, what's the expected >> return value of IDN.toASCII("xyz.")? I think the return value can be >> either "xyz." or "xyz". The current implementation returns "xyz". 
>> >> We may need not to update the implementation if tailing "." is >> treated as root label. >> >> 3. whether "." is an valid IDN? >> It's an gray area again, I think. >> As above, if the trailing "." is treated as root label, I think the >> return value can be either "." or "". The current implementation throws >> a StringIndexOutOfBoundsException. >> >> However, what empty domain name ("") really means? I would prefer to >> return "." for "." instead. >> >> We need to address the issue in IDN. >> >> >> Here comes the solution, the IDN.toASCII() returns: >> 1. "." for "."; >> 2. "xyz" for "xyz."; >> 3. IAE for "example..com". >> >> Does it make sense? >> >> Thanks, >> Xuelei >> >> >> On 8/7/2013 1:35 AM, Michael McMahon wrote: >>> I don't really understand the reason for the restriction in SNIHostName >>> But, I guess that is where it should be enforced if it is required. >>> >>> Michael. >>> >>> On 06/08/13 17:43, Dmitry Samersoff wrote: Xuelei, . (dot) is perfectly valid domain name and it means root domain so com. is valid domain name as well.
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Resolvers seem to accept queries using trailing dots. eg nslookup www.oracle.com. or InetAddress.getByName("www.oracle.com."); The part of RFC3490 quoted below seems to me to be saying that the empty label implied by the trailing dot is not regarded as a label so that you don't end up calling toAscii() or toUnicode() with an empty string. I don't think it's saying the trailing dot can't be there. Michael On 07/08/13 13:44, Xuelei Fan wrote: On 8/7/2013 12:06 AM, Matthew Hall wrote: Trailing dots are allowed in plain DNS (thus almost surely in IDN), and the single dot represents the root zone. So you have to be careful making this sort of change to check the DNS RFCs first. That's the first question we need to answer, whether IDN allow tailling dots ("com."), zero-length root label ("."), and zero-length label ("", for example ""example..com")? Per the specification of IDN.toASCII(): === "ToASCII operation can fail. ToASCII fails if any step of it fails. If ToASCII operation fails, an IllegalArgumentException will be thrown. In this case, the input string should not be used in an internationalized domain name. A label is an individual part of a domain name. The original ToASCII operation, as defined in RFC 3490, only operates on a single label. This method can handle both label and entire domain name, by assuming that labels in a domain name are always separated by dots. ... Throws IllegalArgumentException - if the input string doesn't conform to RFC 3490 specification" Per the specification of RFC 3490: == [section 2] "A label is an individual part of a domain name. Labels are usually shown separated by dots; for example, the domain name "www.example.com" is composed of three labels: "www", "example", and "com". (The zero-length root label described in [STD13], which can be explicit as in "www.example.com." 
or implicit as in "www.example.com", is not considered a label in this specification.)" "An "internationalized label" is a label to which the ToASCII operation (see section 4) can be applied without failing (with the UseSTD3ASCIIRules flag unset). ... Although most Unicode characters can appear in internationalized labels, ToASCII will fail for some input strings, and such strings are not valid internationalized labels." "An "internationalized domain name" (IDN) is a domain name in which every label is an internationalized label." [Section 4.1] "ToASCII consists of the following steps: ... 8. Verify that the number of code points is in the range 1 to 63 inclusive." Here are the questions: 1. whether "example..com" is an valid IDN? As dot is used as label separators, there are three labels, "example", "", "com". Per RFC 3490, "" is not a valid label. Hence, "example..com" is not a valid IDN. We need to address the issue in IDN. 2. whether "xyz." is an valid IDN? It's an gray area, I think. We can treat the trailing "." as root label, or a label separator. If the trailing "." is treated as label separator, "xyz." is invalid per RFC 3490. if the trailing "." is treated as root label, what's the expected return value of IDN.toASCII("xyz.")? I think the return value can be either "xyz." or "xyz". The current implementation returns "xyz". We may need not to update the implementation if tailing "." is treated as root label. 3. whether "." is an valid IDN? It's an gray area again, I think. As above, if the trailing "." is treated as root label, I think the return value can be either "." or "". The current implementation throws a StringIndexOutOfBoundsException. However, what empty domain name ("") really means? I would prefer to return "." for "." instead. We need to address the issue in IDN. Here comes the solution, the IDN.toASCII() returns: 1. "." for "."; 2. "xyz" for "xyz."; 3. IAE for "example..com". Does it make sense? 
Thanks, Xuelei On 8/7/2013 1:35 AM, Michael McMahon wrote: I don't really understand the reason for the restriction in SNIHostName But, I guess that is where it should be enforced if it is required. Michael. On 06/08/13 17:43, Dmitry Samersoff wrote: Xuelei, . (dot) is perfectly valid domain name and it means root domain so com. is valid domain name as well. It thinks to me that in context of methods your change we should ignore trailing dots, rather than throw exception. -Dmitry On 2013-08-06 15:44, Xuelei Fan wrote: Hi, Please review the bug fix to strict the illegal input checking in IDN. webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/ Here is two test cases, which are expected to get IAE. Case 1: String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES); Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0 at java.lang.StringBuffer.charAt(StringBuffer.java:204) at java.net.IDN.toASCIIInternal(IDN.java:279)
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Xuelei,

The root label is an empty label [1] and the dot is a label separator,
so in printed form domain names are dot-terminated.

Please see also my inline comments below.

[1] RFC 1034 (rfc1034.txt):

Internally, programs that manipulate domain names should represent them
as sequences of labels, where each label is a length octet followed by
an octet string. Because all domain names end at the root, *which has a
null string for a label*, these internal representations can use a
length byte of zero to terminate a domain name.

On 2013-08-07 16:44, Xuelei Fan wrote:
> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>> and the single dot represents the root zone. So you have to be
>> careful making this sort of change to check the DNS RFCs first.
>
> That's the first question we need to answer, whether IDN allow tailling
> dots ("com."), zero-length root label ("."), and zero-length label ("",
> for example ""example..com")?
>
> Per the specification of IDN.toASCII():
> ===
> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
> ToASCII operation fails, an IllegalArgumentException will be thrown. In
> this case, the input string should not be used in an internationalized
> domain name.
>
> A label is an individual part of a domain name. The original ToASCII
> operation, as defined in RFC 3490, only operates on a single label. This
> method can handle both label and entire domain name, by assuming that
> labels in a domain name are always separated by dots. ...
>
> Throws IllegalArgumentException - if the input string doesn't conform to
> RFC 3490 specification"
>
> Per the specification of RFC 3490:
> ==
> [section 2]
> "A label is an individual part of a domain name. Labels are usually
> shown separated by dots; for example, the domain name
> "www.example.com" is composed of three labels: "www", "example", and
> "com". (The zero-length root label described in [STD13], which can
> be explicit as in "www.example.com." or implicit as in
> "www.example.com", is not considered a label in this specification.)"
>
> "An "internationalized label" is a label to which the ToASCII
> operation (see section 4) can be applied without failing (with the
> UseSTD3ASCIIRules flag unset). ...
> Although most Unicode characters can appear in
> internationalized labels, ToASCII will fail for some input strings,
> and such strings are not valid internationalized labels."
>
> "An "internationalized domain name" (IDN) is a domain name in which
> every label is an internationalized label."
>
> [Section 4.1]
> "ToASCII consists of the following steps:
>
> ...
>
> 8. Verify that the number of code points is in the range 1 to 63
> inclusive."
>
>
> Here are the questions:
> 1. whether "example..com" is an valid IDN?
>    As dot is used as label separators, there are three labels,
> "example", "", "com". Per RFC 3490, "" is not a valid label. Hence,
> "example..com" is not a valid IDN.
>
>    We need to address the issue in IDN.

The root label can't appear in the middle of a domain name, so
example..com is an invalid domain name and an appropriate exception has
to be thrown.

>
> 2. whether "xyz." is an valid IDN?
>    It's an gray area, I think. We can treat the trailing "." as root
> label, or a label separator.
>    If the trailing "." is treated as label separator, "xyz." is invalid
> per RFC 3490.
>    if the trailing "." is treated as root label, what's the expected
> return value of IDN.toASCII("xyz.")? I think the return value can be
> either "xyz." or "xyz". The current implementation returns "xyz".
>
>    We may need not to update the implementation if tailing "." is
> treated as root label.

An empty label at the end of a domain name is valid per RFC 1034 and
means the root label. So we should process this name and return all
non-empty labels.

> 3. whether "." is an valid IDN?
>    It's an gray area again, I think.
>    As above, if the trailing "." is treated as root label, I think the
> return value can be either "." or "". The current implementation throws
> a StringIndexOutOfBoundsException.
>
>    However, what empty domain name ("") really means? I would prefer to
> return "." for "." instead.
>
>    We need to address the issue in IDN.

As the dot is a label separator and the root (empty) label can't appear
in the middle of a domain name, . (dot) is not a valid name. This case is
similar to case (1): we should throw an appropriate exception.

-Dmitry

>
> Here comes the solution, the IDN.toASCII() returns:
> 1. "." for ".";
> 2. "xyz" for "xyz.";
> 3. IAE for "example..com".
>
> Does it make sense?
>
> Thanks,
> Xuelei
>
>
> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>> I don't really understand the reason for the restriction in SNIHostName
>> But, I guess that is where it should be enforced if it is required.
>>
>> Michael.
>>
>> On 06/08/13 17:43, Dmitry Samersoff wr
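To make the RFC 1034 passage quoted at [1] concrete, here is a minimal sketch of the length-octet representation it describes. The class and method names (WireNameSketch, encode) are hypothetical and are not part of the JDK or of the webrev under review. The sketch assumes ASCII-only input and treats a single trailing dot as the explicit root label.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from java.net.IDN or the webrev.
class WireNameSketch {
    /** Encodes a printed domain name as length-prefixed labels ending in a zero byte. */
    static byte[] encode(String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // A single trailing dot is the explicit root label; drop it before splitting.
        String body = name.endsWith(".") ? name.substring(0, name.length() - 1) : name;
        if (!body.isEmpty()) {
            // limit -1 keeps empty strings, so "example..com" yields an empty label
            for (String label : body.split("\\.", -1)) {
                byte[] bytes = label.getBytes(StandardCharsets.US_ASCII);
                if (bytes.length == 0 || bytes.length > 63) {
                    throw new IllegalArgumentException("invalid label length in \"" + name + "\"");
                }
                out.write(bytes.length);           // length octet
                out.write(bytes, 0, bytes.length); // label octets
            }
        }
        out.write(0); // the root label: a length byte of zero terminates the name
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encode("com.").length); // 5: one length octet, "com", terminator
        System.out.println(encode("com").length);  // 5: the root label is implicit here
        System.out.println(encode(".").length);    // 1: only the terminator
    }
}
```

In this representation "com." and "com" encode to the same bytes, and an empty label can only occur as the terminator, which is why "example..com" is rejected by the sketch.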
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On 8/7/2013 12:06 AM, Matthew Hall wrote:
> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
> and the single dot represents the root zone. So you have to be
> careful making this sort of change to check the DNS RFCs first.

That's the first question we need to answer: does IDN allow trailing
dots ("com."), the zero-length root label ("."), and zero-length labels
("", as in "example..com")?

Per the specification of IDN.toASCII():
===
"ToASCII operation can fail. ToASCII fails if any step of it fails. If
ToASCII operation fails, an IllegalArgumentException will be thrown. In
this case, the input string should not be used in an internationalized
domain name.

A label is an individual part of a domain name. The original ToASCII
operation, as defined in RFC 3490, only operates on a single label. This
method can handle both label and entire domain name, by assuming that
labels in a domain name are always separated by dots. ...

Throws IllegalArgumentException - if the input string doesn't conform to
RFC 3490 specification"

Per the specification of RFC 3490:
==
[Section 2]
"A label is an individual part of a domain name. Labels are usually
shown separated by dots; for example, the domain name
"www.example.com" is composed of three labels: "www", "example", and
"com". (The zero-length root label described in [STD13], which can
be explicit as in "www.example.com." or implicit as in
"www.example.com", is not considered a label in this specification.)"

"An "internationalized label" is a label to which the ToASCII
operation (see section 4) can be applied without failing (with the
UseSTD3ASCIIRules flag unset). ...
Although most Unicode characters can appear in
internationalized labels, ToASCII will fail for some input strings,
and such strings are not valid internationalized labels."

"An "internationalized domain name" (IDN) is a domain name in which
every label is an internationalized label."

[Section 4.1]
"ToASCII consists of the following steps:

   ...

   8. Verify that the number of code points is in the range 1 to 63
      inclusive."


Here are the questions:

1. Is "example..com" a valid IDN?
   Since the dot is used as the label separator, there are three labels:
   "example", "", and "com". Per RFC 3490, "" is not a valid label.
   Hence "example..com" is not a valid IDN.

   We need to address this issue in IDN.

2. Is "xyz." a valid IDN?
   It's a gray area, I think. We can treat the trailing "." either as
   the root label or as a label separator.
   If the trailing "." is treated as a label separator, "xyz." is
   invalid per RFC 3490.
   If the trailing "." is treated as the root label, what is the
   expected return value of IDN.toASCII("xyz.")? I think the return
   value can be either "xyz." or "xyz". The current implementation
   returns "xyz".

   We may not need to update the implementation if the trailing "." is
   treated as the root label.

3. Is "." a valid IDN?
   It's a gray area again, I think.
   As above, if the trailing "." is treated as the root label, I think
   the return value can be either "." or "". The current implementation
   throws a StringIndexOutOfBoundsException.

   However, what would an empty domain name ("") really mean? I would
   prefer to return "." for "." instead.

   We need to address this issue in IDN.


Here comes the solution. IDN.toASCII() returns:
1. "." for ".";
2. "xyz" for "xyz.";
3. IAE for "example..com".

Does it make sense?

Thanks,
Xuelei


On 8/7/2013 1:35 AM, Michael McMahon wrote:
> I don't really understand the reason for the restriction in SNIHostName
> But, I guess that is where it should be enforced if it is required.
>
> Michael.
>
> On 06/08/13 17:43, Dmitry Samersoff wrote:
>> Xuelei,
>>
>> . (dot) is perfectly valid domain name and it means root domain so com.
>> is valid domain name as well.
>>
>> It thinks to me that in context of methods your change we should ignore
>> trailing dots, rather than throw exception.
>>
>> -Dmitry
>>
>>
>>
>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>> Hi,
>>>
>>> Please review the bug fix to strict the illegal input checking in IDN.
>>>
>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>
>>> Here is two test cases, which are expected to get IAE.
>>>
>>> Case 1:
>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
>>> String index out of range: 0
>>>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>     at java.net.IDN.toASCII(IDN.java:118)
>>>
>>> Case 2:
>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>
>>> Thanks,
>>> Xuelei
>>>
>>
>
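The three outcomes proposed in this message can be modeled with a small sketch. TrailingDotSketch and normalize are hypothetical names, and this is not the code in the webrev or in java.net.IDN. As a simplification it checks label length on the raw input characters rather than on the ToASCII output that step 8 of RFC 3490 refers to.

```java
// Hypothetical model of the proposal in this message; not the JDK implementation.
class TrailingDotSketch {
    /**
     * "." stays ".", a single trailing dot is dropped,
     * and any other empty label is rejected with IllegalArgumentException.
     */
    static String normalize(String name) {
        if (name.equals(".")) {
            return "."; // the root domain on its own
        }
        // A single trailing dot is read as the explicit root label and dropped.
        String body = name.endsWith(".") ? name.substring(0, name.length() - 1) : name;
        // limit -1 keeps empty strings, so "a..b" and "a.." are still caught
        for (String label : body.split("\\.", -1)) {
            // Simplification: counts chars of the input, not code points of the ToASCII result.
            if (label.isEmpty() || label.length() > 63) {
                throw new IllegalArgumentException("invalid label length in \"" + name + "\"");
            }
        }
        return body;
    }

    public static void main(String[] args) {
        for (String s : new String[] {".", "xyz.", "example..com"}) {
            try {
                System.out.println(s + " -> " + normalize(s));
            } catch (IllegalArgumentException e) {
                System.out.println(s + " -> IAE");
            }
        }
    }
}
```

Keeping the trailing dot instead (returning "xyz." for "xyz.") would only change the return statement to hand back the original name once every label has passed the check.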
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
I don't really understand the reason for the restriction in SNIHostName.
But I guess that is where it should be enforced, if it is required.

Michael.

On 06/08/13 17:43, Dmitry Samersoff wrote:
> Xuelei,
>
> . (dot) is perfectly valid domain name and it means root domain so com.
> is valid domain name as well.
>
> It thinks to me that in context of methods your change we should ignore
> trailing dots, rather than throw exception.
>
> -Dmitry
>
> On 2013-08-06 15:44, Xuelei Fan wrote:
>> Hi,
>>
>> Please review the bug fix to strict the illegal input checking in IDN.
>>
>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>
>> Here is two test cases, which are expected to get IAE.
>>
>> Case 1:
>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
>> String index out of range: 0
>>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>>     at java.net.IDN.toASCII(IDN.java:118)
>>
>> Case 2:
>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>
>> Thanks,
>> Xuelei
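One way to picture enforcing the restriction at the caller rather than in IDN is sketched below. SniNameCheck and rejectTrailingDot are hypothetical names for illustration; javax.net.ssl.SNIHostName performs its own validation and this is not its code.

```java
import java.net.IDN;

// Hypothetical caller-side check; not the SNIHostName implementation.
class SniNameCheck {
    /** Rejects an ASCII host name that ends with a dot; returns it unchanged otherwise. */
    static String rejectTrailingDot(String asciiName) {
        if (asciiName.endsWith(".")) {
            throw new IllegalArgumentException(
                    "server name must not end with a dot: " + asciiName);
        }
        return asciiName;
    }

    public static void main(String[] args) {
        // IDN stays permissive; the caller applies its own, stricter rule afterwards.
        String ascii = IDN.toASCII("www.example.com", IDN.USE_STD3_ASCII_RULES);
        System.out.println(rejectTrailingDot(ascii));
    }
}
```

With this split, IDN.toASCII() can accept whatever the DNS and IDN specifications allow, and only the code that has an extra constraint needs to know about it.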
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Xuelei,

. (dot) is a perfectly valid domain name and it means the root domain,
so com. is a valid domain name as well.

It seems to me that, in the context of the methods you change, we should
ignore trailing dots rather than throw an exception.

-Dmitry

On 2013-08-06 15:44, Xuelei Fan wrote:
> Hi,
>
> Please review the bug fix to strict the illegal input checking in IDN.
>
> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>
> Here is two test cases, which are expected to get IAE.
>
> Case 1:
> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
> Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
> String index out of range: 0
>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>     at java.net.IDN.toASCII(IDN.java:118)
>
> Case 2:
> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>
> Thanks,
> Xuelei
>

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.
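The two test cases quoted above can be re-run on any JDK with a small probe like the one below. IdnProbe is a hypothetical name; the only JDK calls used are IDN.toASCII(String, int) and the IDN.USE_STD3_ASCII_RULES flag from the original report. What it prints for "." and "com." depends on the JDK in use, so no output is claimed for those inputs.

```java
import java.net.IDN;

// Hypothetical probe for reproducing the cases from the original report.
class IdnProbe {
    /** Describes what IDN.toASCII does with the input on the running JDK. */
    static String probe(String input) {
        try {
            return "returned \"" + IDN.toASCII(input, IDN.USE_STD3_ASCII_RULES) + "\"";
        } catch (RuntimeException e) {
            // IllegalArgumentException is the documented failure mode; the report
            // above shows a StringIndexOutOfBoundsException for "." instead.
            return "threw " + e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {".", "com.", "www.example.com"}) {
            System.out.println("\"" + s + "\" -> " + probe(s));
        }
    }
}
```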
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Take a look here for more clarity:

http://en.wikipedia.org/wiki/Fully_qualified_domain_name

--
Sent from my mobile device.

Matthew Hall wrote:
> Trailing dots are allowed in plain DNS (thus almost surely in IDN), and
> the single dot represents the root zone. So you have to be careful
> making this sort of change to check the DNS RFCs first.
>
> Matthew.
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Trailing dots are allowed in plain DNS (thus almost surely in IDN), and the single dot represents the root zone. So when making this sort of change, be careful to check the DNS RFCs first.

Matthew.

--
Sent from my mobile device.

Weijun Wang wrote:
> I am not sure if IDN.java is the correct place to change. At least I've
> seen trailing dots in DNS entries. So maybe it's not so illegal.
>
> --Max
>
> On 8/6/13 7:44 PM, Xuelei Fan wrote:
>> Hi,
>>
>> Please review the bug fix to tighten the illegal input checking in IDN.
>>
>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>
>> Here are two test cases, which are expected to throw IAE.
>>
>> Case 1:
>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
>>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>>     at java.net.IDN.toASCII(IDN.java:118)
>>
>> Case 2:
>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>
>> Thanks,
>> Xuelei
>>
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
On Aug 6, 2013, at 23:08, Weijun Wang wrote:
> I am not sure if IDN.java is the correct place to change. At least I've seen
> trailing dots in DNS entries. So maybe it's not so illegal.
>

Per RFC 1034, a domain name cannot end with a dot. I will check other related specifications. What was the case where you saw trailing dots?

Thanks,
Xuelei

> --Max
>
> On 8/6/13 7:44 PM, Xuelei Fan wrote:
>> Hi,
>>
>> Please review the bug fix to tighten the illegal input checking in IDN.
>>
>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>
>> Here are two test cases, which are expected to throw IAE.
>>
>> Case 1:
>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
>>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>>     at java.net.IDN.toASCII(IDN.java:118)
>>
>> Case 2:
>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>
>> Thanks,
>> Xuelei
>>
Re: Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
I am not sure if IDN.java is the correct place to change. At least I've seen trailing dots in DNS entries. So maybe it's not so illegal.

--Max

On 8/6/13 7:44 PM, Xuelei Fan wrote:
> Hi,
>
> Please review the bug fix to tighten the illegal input checking in IDN.
>
> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>
> Here are two test cases, which are expected to throw IAE.
>
> Case 1:
> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>     at java.net.IDN.toASCII(IDN.java:118)
>
> Case 2:
> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>
> Thanks,
> Xuelei
Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Hi,

Please review the bug fix to tighten the illegal input checking in IDN.

webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/

Here are two test cases, which are expected to throw IAE.

Case 1:
String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
    at java.lang.StringBuffer.charAt(StringBuffer.java:204)
    at java.net.IDN.toASCIIInternal(IDN.java:279)
    at java.net.IDN.toASCII(IDN.java:118)

Case 2:
String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);

Thanks,
Xuelei
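The two cases above can be run side by side with a small harness. This is only a sketch: the class name `IdnTrailingDotRepro` and the helper `describe` are made up for illustration, and because the outcome depends on the JDK build in use, no expected output is shown.

```java
import java.net.IDN;

// Hypothetical harness; class and method names are not from the webrev.
final class IdnTrailingDotRepro {

    // Runs IDN.toASCII on the input and reports either the result or the
    // exception type, so one failing case does not hide the other.
    static String describe(String input) {
        try {
            String ascii = IDN.toASCII(input, IDN.USE_STD3_ASCII_RULES);
            return "returned \"" + ascii + "\"";
        } catch (RuntimeException e) {
            // Covers IllegalArgumentException as well as the
            // StringIndexOutOfBoundsException from the stack trace.
            return "threw " + e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println("Case 1 (\".\"):    " + describe("."));
        System.out.println("Case 2 (\"com.\"): " + describe("com."));
    }
}
```

Catching RuntimeException rather than a specific type is deliberate here, since the question under review is exactly which exception, if any, these inputs should produce.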