Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)
Stephen,

(Personal hat on)

I've followed elements of this exchange. I must confess that when I read through the draft previously, I didn't really pay attention to the nih: parts.

I can see that there are distinct use-cases here, and I think you have reasonable grounds for not wanting to combine them.

What I can't see is why the speakable form (nih:) needs to be a URI scheme - what are the envisaged contexts of use where the information provided by an nih: URI actually needs to be a URI, as opposed to just, say, a simple string? In all the uses I can think of for ni:, I don't see a corresponding use for a speakable form.

I think you said somewhere in this exchange that nih: is intended to be used to confirm some information that you already have. As such, I'm not seeing how it can be said to identify a resource.

#g
--

On 12/06/2012 15:09, Stephen Farrell wrote:

Martin,

I honestly don't think this exchange is going anywhere new so I've not provided blow-by-blow answers below.

We (the authors) think ni and nih are better kept separate because the use-cases and requirements differ. And that's what the limited amount of running code does.

I think I've explained that, and I have no intention whatsoever of getting into the "write up the use-cases and requirements" game that too often happens in the IETF when it's neither needed nor productive. (Regardless of whether the game is played with I-Ds or email.) Sometimes that is useful, but not here IMO.

You disagree, which is fine. You think ni and nih should be merged because they don't differ sufficiently (I think), and (I guess) because you think that new URI schemes are more expensive than what you see as the differences between these two proposed schemes warrant.

I suggest we see if anyone else chimes in on this aspect (so far nobody has, despite a quite active IETF LC :-) and if not leave it to the sponsoring AD to figure out what, if anything, needs doing at the end of IETF LC.

Cheers,
S.
Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)
On 15/06/2012 20:18, Stephen Farrell wrote:

I think you said somewhere in this exchange, nih: is intended to be used to confirm some information that you already have. As such, I'm not seeing how it can be said to identify a resource.

I'm checking with others since I'm not entirely familiar with all that they want from nih (so you might be right, or maybe not ;-) As far as I do know now, the idea is that nih could be useful when values are to be read over the phone. I agree the listener wouldn't need the scheme, but I think the argument is that it'd help the speaker (or speaker's UA) to highlight the right thing. I'd agree if you argued that that's not the most compelling reason, but maybe there's more.

In terms of identifying a resource, I don't get the comment, since the information is similar to that in an ni, so it does identify whatever it is that matches that hash.

I meant to say that I don't see how it is being *used* to identify a resource - as you point out, it clearly *can* so identify, as much as ni: can.

What I'm trying to get to is why one might need a URI, rather than just a string, in situations where nih: might be used. One of the useful features of URIs is that they provide a uniform way of referencing things that may be accessed or baptized by a variety of different mechanisms or structures, decoupling the identification mechanism from the details of what is done with the thing identified. The particular use of confirming information one already has doesn't seem to fit this model of separating the use of some value from the form of identification and/or access mechanism.

#g
--

On 12/06/2012 15:09, Stephen Farrell wrote:

Martin,

I honestly don't think this exchange is going anywhere new so I've not provided blow-by-blow answers below.

We (the authors) think ni and nih are better kept separate because the use-cases and requirements differ. And that's what the limited amount of running code does.
Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)
Hi Graham,

On 06/15/2012 11:48 AM, Graham Klyne wrote:

Stephen,

(Personal hat on)

I've followed elements of this exchange. I must confess that when I read through the draft previously, I didn't really pay attention to the nih: parts.

I guess part of my reaction to Martin's comments was based on an assumption that such a comment (make your two registrations into one) was something one could expect to have gotten from the uri-review list. But anyway...

I can see that there are distinct use-cases here, and I think you have reasonable grounds for not wanting to combine them.

Right.

What I can't see is why the speakable form (nih:) needs to be a URI scheme - what are the envisaged contexts of use where the information provided by an nih: URI actually needs to be a URI, as opposed to just, say, a simple string? In all the uses I can think of for ni:, I don't see a corresponding use for a speakable form.

I agree that there's little overlap for sure.

I think you said somewhere in this exchange, nih: is intended to be used to confirm some information that you already have. As such, I'm not seeing how it can be said to identify a resource.

I'm checking with others since I'm not entirely familiar with all that they want from nih (so you might be right, or maybe not ;-) As far as I do know now, the idea is that nih could be useful when values are to be read over the phone. I agree the listener wouldn't need the scheme, but I think the argument is that it'd help the speaker (or speaker's UA) to highlight the right thing. I'd agree if you argued that that's not the most compelling reason, but maybe there's more.

In terms of identifying a resource, I don't get the comment, since the information is similar to that in an ni, so it does identify whatever it is that matches that hash.

Cheers,
S.

#g
--

On 12/06/2012 15:09, Stephen Farrell wrote:

Martin,

I honestly don't think this exchange is going anywhere new so I've not provided blow-by-blow answers below.
Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)
Hello Stephen,

This mail responds to your points on the main technical issue that I have identified.

On 2012/06/05 20:11, Stephen Farrell wrote:

On 06/05/2012 10:42 AM, Martin J. Dürst wrote:

Hello everybody,

[For replies, please trim the cc list, thanks!]

Done, removed apps-disc...@ietf.org for the moment.

Major design issue: The draft defines two schemes, which differ only slightly, and mostly just gratuitously (see also editorial issues). These are the ni: and the nih: scheme. As far as I understand, they differ as follows:

                                      ni:          nih:
  authority:                          optional     disallowed
  ascii-compatible encoding:          base64url    base16
  check digit:                        disallowed   optional
  query part:                         optional     disallowed
  decimal presentation of algorithm:  disallowed   possible

The usability of URIs is strongly influenced by the number of different schemes, with the smaller a number, the better. As a somewhat made-up example, if the original URIs had been separated into httph: for HTML pages and httpi: for images, or any other arbitrary subdivision that one can envision, that would have hurt the growth and extensibility of the Web. Creating new URI schemes is occasionally necessary, and the ideas that led to this draft definitely seem to warrant a new scheme (*), but there's no reason for two schemes.

[(*) I know people who would claim that the .well-known http/https thing is completely sufficient, no new scheme needed at all.]

More specifically, if the original URIs had been separated into httpm: (for machines) and httph: (for humans), the Web for sure wouldn't have grown at the speed it did (and does) grow. In practice, there are huge differences in human 'speakability' for URIs (and IRIs, for that matter); compare e.g. http://google.com with http://www.google.co.jp/#sclient=psy-abhl=ensite=source=hpq=hashoq=hashaq=faqi=g4aql= (which I have significantly shortened to hopefully eliminate potential privacy issues), or compare the average mailto: URI with the average data: URI.
However, what's important is that there never has been a strong dividing line between machine-only and human-only URIs or schemes; the division has always been very gradual. Short and mainly human-oriented URIs have of course been handled by machines, and on the other hand, very long URIs have been spoken when really necessary. Speakability has been maintained to some extent by scheme designers, and to some extent by survival of the fittest (URIs that weren't very speakable (or spellable/memorizable/guessable/...), and their Web sites, might just die out slowly).

It should also be noted that the resistance against multiple URI schemes may have been low because there are so many different ways to express hashes in the draft anyway, and one more (the nih: section is the last one before the examples section) didn't seem like much of a deal anymore. But when it comes to URIs, one less is a lot better than one more.

In the above ni:/nih: distinction, nih: seems to have been added as an afterthought after realizing that reading an ni: URI aloud over the phone may be somewhat suboptimal because there is a need for repeated "upper case - lower case" (surely very quickly shortened to "upper - lower" and then to "up - low" or something similar). It is not a bad idea to try to make sure that IETF technology, and URIs in particular, are accessible to people with certain kinds of dyslexia. (There are indeed people who have tremendous difficulties with distinguishing upper- and lower-case letters, and this may or may not be connected with other aspects of dyslexia.) It is however totally unclear to this reviewer why this has to lead to two different URI schemes with other gratuitous differences.
Finding a solution is rather easy (of course, other solutions may also be possible): Merge the schemes, so that authority, check digit, and query part are all optional (an authority part and/or a query part may very well be very useful in human communication, and a check digit won't hurt when transmitted electronically) and the decimal presentation of the algorithm is always allowed, and use base32 (http://tools.ietf.org/html/rfc4648) as the encoding. This leads to a 16.6% less efficient encoding of the value part of the ni: URI, but given that other URI-related encodings, e.g. the %-encoding resulting when converting an IRI to a URI, are much less efficient, and that URI infrastructure these days can handle URIs with more than 1000 bytes, this should not be a serious problem. Also, there's a separate binary format (section 6) that is more compact already.

I strongly disagree with merging ni and nih. Though that clearly could be done, it would be an error.

There was no such comment on the uri-review list and the designated expert was happy. That review was IMO the time for such comments and second-guessing the designated expert at this stage seems contrary to the registration requirements. So process-wise I think your main comment is late.
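To put rough numbers on the encoding trade-off discussed above, here is a minimal Python sketch that encodes one SHA-256 digest in base64url, base32, and base16 and compares the resulting lengths. The function name `encoded_forms` and the sample input are assumptions made for illustration; the output is only the encoded hash value, not the complete ni: or nih: syntax defined in the draft.

```python
import base64
import hashlib


def encoded_forms(data: bytes) -> dict[str, str]:
    """Return the SHA-256 digest of `data` in three ASCII encodings (padding removed)."""
    digest = hashlib.sha256(data).digest()  # 32 bytes = 256 bits
    return {
        # 6 bits per character, mixed case: compact but awkward to read aloud
        "base64url": base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii"),
        # 5 bits per character, single case: the encoding suggested for a merged scheme
        "base32": base64.b32encode(digest).rstrip(b"=").decode("ascii"),
        # 4 bits per character, single case
        "base16": base64.b16encode(digest).decode("ascii").lower(),
    }


if __name__ == "__main__":
    forms = encoded_forms(b"Hello World!")  # arbitrary sample input
    for name, text in forms.items():
        print(f"{name:9} {len(text):2} chars  {text}")

    # Bits carried per character: base32 carries 5 where base64url carries 6.
    loss = 1 - 5 / 6
    print(f"base32 carries {loss:.1%} fewer bits per character than base64url")
```

For a 32-byte digest this yields 43, 52, and 64 characters respectively. The figure of roughly one sixth quoted in the thread corresponds to the drop from 6 to 5 bits per character; measured as string length, the base32 form is about a fifth longer than the base64url form.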
Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)
Hi Martin,

On 06/12/2012 10:13 AM, Martin J. Dürst wrote:

Hello Stephen,

This mail responds to your points on the main technical issue that I have identified.

On 2012/06/05 20:11, Stephen Farrell wrote:

On 06/05/2012 10:42 AM, Martin J. Dürst wrote:

Hello everybody,

[For replies, please trim the cc list, thanks!]

Done, removed apps-disc...@ietf.org for the moment.

Major design issue: The draft defines two schemes, which differ only slightly, and mostly just gratuitously (see also editorial issues). These are the ni: and the nih: scheme. As far as I understand, they differ as follows:

                                      ni:          nih:
  authority:                          optional     disallowed
  ascii-compatible encoding:          base64url    base16
  check digit:                        disallowed   optional
  query part:                         optional     disallowed
  decimal presentation of algorithm:  disallowed   possible

I'll note in passing that the two schemes differ in all those respects. You may disagree with our design, but basically you're showing that the two differ in pretty much all possible ways other than that both include a hash value.

The usability of URIs is strongly influenced by the number of different schemes, with the smaller a number, the better. As a somewhat made-up example, if the original URIs had been separated into httph: for HTML pages and httpi: for images, or any other arbitrary subdivision that one can envision, that would have hurt the growth and extensibility of the Web. Creating new URI schemes is occasionally necessary, and the ideas that led to this draft definitely seem to warrant a new scheme (*), but there's no reason for two schemes.

[(*) I know people who would claim that the .well-known http/https thing is completely sufficient, no new scheme needed at all.]

More specifically, if the original URIs had been separated into httpm: (for machines) and httph: (for humans), the Web for sure wouldn't have grown at the speed it did (and does) grow. In practice, there are huge differences in human 'speakability' for URIs (and IRIs, for that matter); compare e.g.
http://google.com with http://www.google.co.jp/#sclient=psy-abhl=ensite=source=hpq=hashoq=hashaq=faqi=g4aql= (which I have significantly shortened to hopefully eliminate potential privacy issues), or compare the average mailto: URI with the average data: URI.

However, what's important is that there never has been a strong dividing line between machine-only and human-only URIs or schemes; the division has always been very gradual. Short and mainly human-oriented URIs have of course been handled by machines, and on the other hand, very long URIs have been spoken when really necessary. Speakability has been maintained to some extent by scheme designers, and to some extent by survival of the fittest (URIs that weren't very speakable (or spellable/memorizable/guessable/...), and their Web sites, might just die out slowly).

It should also be noted that the resistance against multiple URI schemes may have been low because there are so many different ways to express hashes in the draft anyway, and one more (the nih: section is the last one before the examples section) didn't seem like much of a deal anymore. But when it comes to URIs, one less is a lot better than one more.

In the above ni:/nih: distinction, nih: seems to have been added as an afterthought after realizing that reading an ni: URI aloud over the phone may be somewhat suboptimal because there is a need for repeated "upper case - lower case" (surely very quickly shortened to "upper - lower" and then to "up - low" or something similar). It is not a bad idea to try to make sure that IETF technology, and URIs in particular, are accessible to people with certain kinds of dyslexia. (There are indeed people who have tremendous difficulties with distinguishing upper- and lower-case letters, and this may or may not be connected with other aspects of dyslexia.) It is however totally unclear to this reviewer why this has to lead to two different URI schemes with other gratuitous differences.

Finding a solution is rather easy (of course, other solutions may also be possible): Merge the schemes, so that authority, check digit, and query part are all optional (an authority part and/or a query part may very well be very useful in human communication, and a check digit won't hurt when transmitted electronically) and the decimal presentation of the algorithm is always allowed, and use base32 (http://tools.ietf.org/html/rfc4648) as the encoding. This leads to a 16.6% less efficient encoding of the value part of the ni: URI, but given that other URI-related encodings, e.g. the %-encoding resulting when converting an IRI to a URI, are much less efficient, and that URI infrastructure these days can handle URIs with more than 1000 bytes, this should
Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)
Martin,

I honestly don't think this exchange is going anywhere new so I've not provided blow-by-blow answers below.

We (the authors) think ni and nih are better kept separate because the use-cases and requirements differ. And that's what the limited amount of running code does.

I think I've explained that, and I have no intention whatsoever of getting into the "write up the use-cases and requirements" game that too often happens in the IETF when it's neither needed nor productive. (Regardless of whether the game is played with I-Ds or email.) Sometimes that is useful, but not here IMO.

You disagree, which is fine. You think ni and nih should be merged because they don't differ sufficiently (I think), and (I guess) because you think that new URI schemes are more expensive than what you see as the differences between these two proposed schemes warrant.

I suggest we see if anyone else chimes in on this aspect (so far nobody has, despite a quite active IETF LC :-) and if not leave it to the sponsoring AD to figure out what, if anything, needs doing at the end of IETF LC.

Cheers,
S.

PS: Many thanks for all the other good comments. Though we disagree on this one, you've helped make the draft better for sure.

On 06/12/2012 02:04 PM, Martin J. Dürst wrote:

Hello Stephen,

On 2012/06/12 18:59, Stephen Farrell wrote:

Hi Martin,

On 06/12/2012 10:13 AM, Martin J. Dürst wrote:

Hello Stephen,

This mail responds to your points on the main technical issue that I have identified.

On 2012/06/05 20:11, Stephen Farrell wrote:

On 06/05/2012 10:42 AM, Martin J. Dürst wrote:

Hello everybody,

Major design issue: The draft defines two schemes, which differ only slightly, and mostly just gratuitously (see also editorial issues). These are the ni: and the nih: scheme.
registries and designated experts (was: Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes))
[change of subject] On 6/12/12 3:13 AM, Martin J. Dürst wrote: On 2012/06/05 20:11, Stephen Farrell wrote: <snip/> I strongly disagree with merging ni & nih. Though that clearly could be done, it would be an error. There was no such comment on the uri-review list and the designated expert was happy. That review was IMO the time for such comments and second-guessing the designated expert at this stage seems contrary to the registration requirements. So process-wise I think your main comment is late. First, if IETF Last Call is too late to make serious technical comments on drafts, then I think we have to rename it to IETF Too-Late Call. Second, designated experts are there to check for minimum requirements for a registration, and to give advice as they see fit (and have time). I'm myself a designated expert on Character Sets, and I have definitely in the past approved, and would again in the future approve, registrations for stuff on which I would complain strongly if the question was "is this a good technical solution". Graham Klyne, the designated expert for URI scheme registrations, has confirmed offline that he does not see his role as expert reviewer as judging the technical merit of a URI scheme proposal. By my reading, the happiana discussions [1] over the 12+ months have led most participants to the conclusion that registration does not imply standardization, and that it's not the role of the designated expert to act as a gatekeeper with respect to the technical merits of the technologies that trigger registration requests. It might be good to have a wider discussion about the purpose of registries and the role of designated experts, but IMHO it's not correct to conclude that a technology is acceptable just because the designated expert didn't object to the registrations related to that technology. Peter [1] https://www.ietf.org/mailman/listinfo/happiana
Re: registries and designated experts (was: Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes))
Martin said: Second, designated experts are there to check for minimum requirements for a registration, and to give advice as they see fit (and have time). I'm myself a designated expert on Character Sets, and I have definitely in the past approved, and would again in the future approve, registrations for stuff on which I would complain strongly if the question was "is this a good technical solution".

There's no one correct statement about what designated experts do, or are supposed to do. Some specs that create designated experts give more or fewer instructions to the expert than others do. Some give none at all. The designated expert should be doing what the specification that defined the registry said the expert should do, to the best of the expert's interpretation. That might involve more gatekeeping in some cases, and less in others, and there's always an appeal path available. It's my opinion that more specifications that create Expert Review or Specification Required registries should be specifying what they expect from the designated experts.

Barry
Re: registries and designated experts (was: Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes))
Hi Peter, At 07:19 12-06-2012, Peter Saint-Andre wrote: By my reading, the happiana discussions [1] over the 12+ months have led most participants to the conclusion that registration does not imply standardization, and that it's not the role of the designated expert to act as a gatekeeper with respect to the technical merits of the technologies that trigger registration requests. It might be good to have a wider discussion about the purpose of registries and the role of designated experts, but IMHO it's not correct to conclude that a technology is acceptable just because the designated expert didn't object to the registrations related to that technology.

I'll +1 the above. In a recent review the path followed by the draft is Standards Action whereas the assignment policy is Expert Review. Explaining to the authors that they should not use the assigned value isn't a worthwhile effort given that they have already been through the gate to get the value. The Designated Expert did his job; that is to see that the requirements were met instead of acting as gatekeeper. If you reject assignment requests people will find it simpler not to register the values. If you accept the request people might consider that the specification is fine.

The reasons provided for managing a namespace are:

 - prevent the hoarding of or unnecessary wasting of values
 - provide a sanity check that the request actually makes sense
 - interoperability issues

The above is at odds with standardization. The last reason does not apply for Expert review.

Regards, -sm
Re: registries and designated experts (was: Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes))
* Peter Saint-Andre wrote: By my reading, the happiana discussions [1] over the 12+ months have led most participants to the conclusion that registration does not imply standardization, and that it's not the role of the designated expert to act as a gatekeeper with respect to the technical merits of the technologies that trigger registration requests.

My impression is the same (though I do not agree with "led").

It might be good to have a wider discussion about the purpose of registries and the role of designated experts, but IMHO it's not correct to conclude that a technology is acceptable just because the designated expert didn't object to the registrations related to that technology.

I would say that decisions by designated experts have no implications outside the confines of their role and their discretion. We could give experts the power to decide over "is acceptable", but when we do not, then they do not have that power, and nobody should argue otherwise.

--
Björn Höhrmann · mailto:bjo...@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Re: APPSDIR review of draft-farrell-decade-ni-07 (minor editorial)
Hello Stephen, As Barry has suggested, I'm removing apps-disc...@ietf.org to reduced cross-posting. I'm going to split my response into several installments, hopefully separating the less contentious issues from the more contentious ones, and starting with the former ones. Also, I want to apologize for the delays here, related to day job duties. On 2012/06/05 20:11, Stephen Farrell wrote: Hi Martin, First, thanks for the speedy and thorough review. Some good points made, some with which I strongly disagree, but thrashing stuff like that out is part of the point of the reviews. On 06/05/2012 10:42 AM, Martin J. Dürst wrote: Minor editorial issues: Introduction: It would be good to have a general reference to hashing (for security purposes) for people not utterly familiar with the subject. Disagree. If I added that I could reasonably be asked to introduce URIs for the security folks. Serious and pointless ratholes would ensue. Okay. Intro: After reading the whole document, the structure of the Intro seems to make some sense, but it didn't on first reading (where it's actually more important). The main problem I was able to identify was that after a general outlook in paragraph 1, the Intro drops into a list of examples without saying what they are good for. I suggest to, after the sentence This document specifies standard ways to do that to aid interoperability., add a sentence along the lines: The next few paragraphs give usage examples for the various ways to include a hash in a name or identifier as they are defined later in this document.. It may also make sense to further streamline the following paragraphs, so that it is clearer which pieces of text refer each to one of the standard ways. There are two instances of the term binary presentation. Looking around, it seems that they are supposed to mean the same as binary format. Please replace all instances of binary presentation with binary format to avoid misunderstandings and useless seach time. 
Section 3: A Named Information (ni) URI consists of the following components:: It would be good to know exactly where the list ended. One way to do this would be to say consists of the following nine components. Those look reasonable. Will see if the changes work out. The change that I have seen in your pre-draft for the above three items look good. Section 3: Note that while the ni names with and without an authority differ syntactically, both names refer to the same object if the digest algorithm and value are the same.: What about cases with different authority? The text seems to apply by transitivity, but this may be easy to miss for an implementer. I suggest changing to: Note that while ni names with and without an authority, and ni names with different authorities, differ syntactically, they all refer to the same object if the digest algorithm and value are the same.. Sure. Okay. Section 3: Consequently no special escaping mechanism is required for the query parameter portion of ni URIs.: Does this mean no escaping mechanism at all? Or nothing besides %-encoding? Or something else? Please clarify. I wish I knew:-) What do you think is right here? (Honestly, input on this would be appreciated.) (In your internal draft, you removed the paragraph just before this one, and now the Consequently doesn't make any sense anymore.) One possibility here is to remove anything about encoding, and have people check in RFC 3986, but I guess you wanted to help the reader here, so I'd propose something along the following lines (replacing the Consequently paragraph, and also in some sense the paragraph before that you already removed): Escaping of characters follows the rules in RFC 3986. This means that %-encoding is used to distinguish between reserved and unreserved functions of the same character in the same URI component. 
As an example, an ampersand ('&') is used in the query part to separate attribute-value pairs; an ampersand in a value therefore has to be escaped as '%26'. Note that the set of reserved characters differs for each component; as an example, a slash ('/') does not have any reserved function in a query part and therefore does not have to be escaped. However, it can still appear escaped as '%2f' or '%2F', and implementations have to be able to understand such escaped forms. Also note that any characters outside those allowed in the respective URI component have to be escaped. Figure 3: the = characters of the various rules should be aligned as much as possible to make it easier to scan the productions (see http://tools.ietf.org/html/rfc3986#appendix-A for an example). Ack. Great. Section 3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" ; directly from RFC 3986, section 2.3 ; authority and pct-encoded are also from RFC 3986 Please don't copy productions. Please don't copy half (or one-third, actually) of the productions you use, and
Re: APPSDIR review of draft-farrell-decade-ni-07 (minor editorial)
Hi Martin, On 06/11/2012 12:04 PM, Martin J. Dürst wrote: Hello Stephen, As Barry has suggested, I'm removing apps-disc...@ietf.org to reduced cross-posting. I'm going to split my response into several installments, hopefully separating the less contentious issues from the more contentious ones, and starting with the former ones. Also, I want to apologize for the delays here, related to day job duties. No probs. On 2012/06/05 20:11, Stephen Farrell wrote: Hi Martin, First, thanks for the speedy and thorough review. Some good points made, some with which I strongly disagree, but thrashing stuff like that out is part of the point of the reviews. On 06/05/2012 10:42 AM, Martin J. Dürst wrote: Minor editorial issues: Introduction: It would be good to have a general reference to hashing (for security purposes) for people not utterly familiar with the subject. Disagree. If I added that I could reasonably be asked to introduce URIs for the security folks. Serious and pointless ratholes would ensue. Okay. Intro: After reading the whole document, the structure of the Intro seems to make some sense, but it didn't on first reading (where it's actually more important). The main problem I was able to identify was that after a general outlook in paragraph 1, the Intro drops into a list of examples without saying what they are good for. I suggest to, after the sentence This document specifies standard ways to do that to aid interoperability., add a sentence along the lines: The next few paragraphs give usage examples for the various ways to include a hash in a name or identifier as they are defined later in this document.. It may also make sense to further streamline the following paragraphs, so that it is clearer which pieces of text refer each to one of the standard ways. There are two instances of the term binary presentation. Looking around, it seems that they are supposed to mean the same as binary format. 
Please replace all instances of binary presentation with binary format to avoid misunderstandings and useless seach time. Section 3: A Named Information (ni) URI consists of the following components:: It would be good to know exactly where the list ended. One way to do this would be to say consists of the following nine components. Those look reasonable. Will see if the changes work out. The change that I have seen in your pre-draft for the above three items look good. Section 3: Note that while the ni names with and without an authority differ syntactically, both names refer to the same object if the digest algorithm and value are the same.: What about cases with different authority? The text seems to apply by transitivity, but this may be easy to miss for an implementer. I suggest changing to: Note that while ni names with and without an authority, and ni names with different authorities, differ syntactically, they all refer to the same object if the digest algorithm and value are the same.. Sure. Okay. Section 3: Consequently no special escaping mechanism is required for the query parameter portion of ni URIs.: Does this mean no escaping mechanism at all? Or nothing besides %-encoding? Or something else? Please clarify. I wish I knew:-) What do you think is right here? (Honestly, input on this would be appreciated.) (In your internal draft, you removed the paragraph just before this one, and now the Consequently doesn't make any sense anymore.) One possibility here is to remove anything about encoding, and have people check in RFC 3986, but I guess you wanted to help the reader here, so I'd propose something along the following lines (replacing the Consequently paragraph, and also in some sense the paragraph before that you already removed): Escaping of characters follows the rules in RFC 3986. This means that %-encoding is used to distinguish between reserved and unreserved functions of the same character in the same URI component. 
As an example, an ampersand ('&') is used in the query part to separate attribute-value pairs; an ampersand in a value therefore has to be escaped as '%26'. Note that the set of reserved characters differs for each component; as an example, a slash ('/') does not have any reserved function in a query part and therefore does not have to be escaped. However, it can still appear escaped as '%2f' or '%2F', and implementations have to be able to understand such escaped forms. Also note that any characters outside those allowed in the respective URI component have to be escaped. That looks good to me, ta. I've put it in. Figure 3: the = characters of the various rules should be aligned as much as possible to make it easier to scan the productions (see http://tools.ietf.org/html/rfc3986#appendix-A for an example). Ack. Great. Section 3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" ; directly from RFC 3986, section 2.3
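[Editorial illustration, not part of the original exchange.] The escaping rule proposed above can be sketched roughly as follows, using Python's standard urllib.parse. The helper names escape_query_value and build_query and the sample parameter names are made up for this sketch; neither they nor the choice of which characters to leave unescaped come from the draft.

```python
from urllib.parse import quote, unquote


def escape_query_value(value: str) -> str:
    """Percent-encode one query name or value (hypothetical helper).

    safe="/" leaves the slash literal, since it has no reserved function
    in a query part; '&' and '=' are not in the safe set, so they are
    emitted as %26 and %3D.
    """
    return quote(value, safe="/")


def build_query(params: dict) -> str:
    """Join attribute-value pairs with '&' after escaping each part."""
    return "&".join(
        f"{escape_query_value(k)}={escape_query_value(v)}"
        for k, v in params.items()
    )


if __name__ == "__main__":
    # An ampersand inside a value is escaped; a slash is left alone.
    print(build_query({"ct": "text/plain", "note": "x&y"}))
    # A receiver has to accept both spellings of an escaped slash.
    print(unquote("a%2fb"), unquote("a%2Fb"))
```

On the receiving side, unquote treats '%2f' and '%2F' alike, which matches the requirement that implementations understand both escaped forms.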
APPSDIR review of draft-farrell-decade-ni-07
Hello everybody,

[For replies, please trim the cc list, thanks!]

I have been selected as the Applications Area Directorate reviewer for this draft (for background on appsdir, please see http://trac.tools.ietf.org/area/app/trac/wiki/ApplicationsAreaDirectorate ). Please resolve these comments along with any other Last Call comments you may receive. Please wait for direction from your document shepherd or AD before posting a new version of the draft.

Document: draft-farrell-decade-ni-07
Title: Naming Things with Hashes
Reviewer: Martin Dürst
Review Date: 2012-06-03, 2012 (written up 2012-06-04/05)
IETF Last Call Date: started 2012-06-04, ends 2012-07-02

Summary: This draft addresses a real generic need, but the current form of the draft is the result of adding more and more special cases without a clear overall view and a firm hand to separate the wheat from the chaff. This shows both in the technical issues as well as in many of the editorial issues below. This draft is not ready for publication without some serious additional work, but that work is mostly straightforward and should be easy to complete quickly.

Major design issue: The draft defines two schemes, which differ only slightly, and mostly just gratuitously (see also editorial issues). These are the ni: and the nih: scheme. As far as I understand, they differ as follows:

                                     ni:          nih:
authority:                           optional     disallowed
ascii-compatible encoding:           base64url    base16
check digit:                         disallowed   optional
query part:                          optional     disallowed
decimal presentation of algorithm:   disallowed   possible

The usability of URIs is strongly influenced by the number of different schemes, with the smaller a number, the better. As a somewhat made-up example, if the original URIs had been separated into httph: for HTML pages and httpi: for images, or any other arbitrary subdivision that one can envision, that would have hurt the growth and extensibility of the Web.
Creating new URI schemes is occasionally necessary, and the ideas that led to this draft definitely seem to warrant a new scheme (*), but there's no reason for two schemes. [(*) I know people who would claim that the .well-formed http/https thing is completely sufficient, no new scheme needed at all.] More specifically, if the original URIs had been separated into httpm: (for machines) and httph: (for humans), the Web for sure wouldn't have grown at the speed it did (and does) grow. In practice, there are huge differences in human 'speakability' for URIs (and IRIs, for that matter); compare e.g. http://google.com with http://www.google.co.jp/#sclient=psy-abhl=ensite=source=hpq=hashoq=hashaq=faqi=g4aql= (which I have significantly shortened to hopefully eliminate potential privacy issues), or compare the average mailto: URI with the average data: URI. However, what's important is that there never has been a strong dividing line between machine-only and human-only URIs or schemes, the division has always been very gradual. Short and mainly human-oriented URIs have of course been handled by machines, and on the other hand, very long URIs have been spoken when really necessary. Speakability has been maintained to some extent by scheme designers, and to some extent by survival of the fittest (URIs that weren't very speakable (or spellable/memorizable/guessable/...), and their Web sites, might just die out slowly). It should also be noted that the resistance against multiple URI schemes may have been low because there are so many different ways to express hashes in the draft anyway, and one more (the nih: section is the last one before the examples section) didn't seem like much of a deal anymore. But when it comes to URIs, one less is a lot better than one more.
In the above ni:/nih: distinction, nih: seems to have been added as an afterthought after realizing that reading an ni: URI aloud over the phone may be somewhat suboptimal because there is a need for repeated upper case - lower case (sure very quickly shortened to upper - lower and then to up - low or something similar). It is not a bad idea to try to make sure that IETF technology, and URIs in particular, are accessible to people with certain kinds of dyslexia. (There are indeed people who have tremendous difficulties with distinguishing upper- and lower-case letters, and this may or may not be connected with other aspects of dyslexia.) It is however totally unclear to this reviewer why this has to lead to two different URI schemes with other gratuitous differences. Finding a solution is rather easy (of course, other solutions may also be possible): Merge the schemes, so that authority, check digit, and query part are all optional (an authority part and/or a query part may very well be very useful in human
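[Editorial illustration, not part of the original exchange.] To make the upper-case/lower-case point concrete, here is a minimal sketch that renders one SHA-256 digest in the two encodings the comparison above attributes to ni: (base64url) and nih: (base16). The function name digest_forms is made up, stripping the '=' padding is an assumption, and the sketch deliberately does not assemble full ni: or nih: URIs or compute a check digit, since those details belong to the draft.

```python
import base64
import hashlib


def digest_forms(data: bytes) -> tuple:
    """Return (base64url, base16) renderings of the SHA-256 digest of data.

    Hypothetical helper for illustration only.
    """
    digest = hashlib.sha256(data).digest()
    # base64url mixes upper case, lower case, digits, '-' and '_'.
    # Assumption: trailing '=' padding is dropped.
    b64url = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    # base16 uses only the digits 0-9 and the letters a-f, in one case.
    b16 = digest.hex()
    return b64url, b16


if __name__ == "__main__":
    b64url, b16 = digest_forms(b"Hello World!")
    print(len(b64url), b64url)
    print(len(b16), b16)
```

The base64url form is shorter but case-sensitive, which is what makes it awkward to read aloud; the base16 form is longer but needs no case distinction.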
Re: APPSDIR review of draft-farrell-decade-ni-07
Hi Martin, First, thanks for the speedy and thorough review. Some good points made, some with which I strongly disagree, but thrashing stuff like that out is part of the point of the reviews. On 06/05/2012 10:42 AM, Martin J. Dürst wrote: Hello everybody, [For replies, please trim the cc list, thanks!] Not sure what trimming you mean. I've reduced it to ietf-discuss and apps-discuss and authors. I have been selected as the Applications Area Directorate reviewer for this draft (for background on appsdir, please see http://trac.tools.ietf.org/area/app/trac/wiki/ApplicationsAreaDirectorate ). Please resolve these comments along with any other Last Call comments you may receive. Please wait for direction from your document shepherd or AD before posting a new version of the draft. Document: draft-farrell-decade-ni-07 Title: Naming Things with Hashes Reviewer: Martin Dürst Review Date: 2012-06-03, 2012 (written up 2012-06-04/05) IETF Last Call Date: started 2012-06-04, ends 2012-07-02 Summary: This draft addresses a real generic need, but the current form of the draft is the result of adding more and more special cases without a clear overall view and a firm hand to separate the wheat from the chaff. This shows both in the technical issues as well as in many of the editorial issues below. This draft is not ready for publication without some serious additional work, but that work is mostly straightforward and should be easy to complete quickly. Wheat? Chaff? Firm hand? Hmm. Are those useful phrases here? I disagree about 'em in any case, and you're wrong as to how we got here and IMO quite wrong in your main technical comment. See below for that. Major design issue: The draft defines two schemes, which differ only slightly, and mostly just gratuitously (see also editorial issues). These are the ni: and the nih: scheme. 
As far as I understand, they differ as follows:

                                     ni:          nih:
authority:                           optional     disallowed
ascii-compatible encoding:           base64url    base16
check digit:                         disallowed   optional
query part:                          optional     disallowed
decimal presentation of algorithm:   disallowed   possible

The usability of URIs is strongly influenced by the number of different schemes, with the smaller a number, the better. As a somewhat made-up example, if the original URIs had been separated into httph: for HTML pages and httpi: for images, or any other arbitrary subdivision that one can envision, that would have hurt the growth and extensibility of the Web. Creating new URI schemes is occasionally necessary, and the ideas that led to this draft definitely seem to warrant a new scheme (*), but there's no reason for two schemes. [(*) I know people who would claim that the .well-formed http/https thing is completely sufficient, no new scheme needed at all.] More specifically, if the original URIs had been separated into httpm: (for machines) and httph: (for humans), the Web for sure wouldn't have grown at the speed it did (and does) grow. In practice, there are huge differences in human 'speakability' for URIs (and IRIs, for that matter); compare e.g. http://google.com with http://www.google.co.jp/#sclient=psy-abhl=ensite=source=hpq=hashoq=hashaq=faqi=g4aql= (which I have significantly shortened to hopefully eliminate potential privacy issues), or compare the average mailto: URI with the average data: URI. However, what's important is that there never has been a strong dividing line between machine-only and human-only URIs or schemes, the division has always been very gradual. Short and mainly human-oriented URIs have of course been handled by machines, and on the other hand, very long URIs have been spoken when really necessary.
Speakability has been maintained to some extent by scheme designers, and to some extent by survival of the fittest (URIs that weren't very speakable (or spellable/memorizable/guessable/...), and their Web sites, might just die out slowly). It should also be noted that the resistance against multiple URI schemes may have been low because there are so many different ways to express hashes in the draft anyway, and one more (the nih: section is the last one before the examples section) didn't seem like much of a deal anymore. But when it comes to URIs, one less is a lot better than one more. In the above ni:/nih: distinction, nih: seems to have been added as an afterthought after realizing that reading an ni: URI aloud over the phone may be somewhat suboptimal because there is a need for repeated upper case - lower case (sure very quickly shortened to upper - lower and then to up - low or something similar). It is not a bad idea to try to make sure that IETF technology, and URIs in particular, are accessible to people with certain kinds of