"Adam M. Costello" <[EMAIL PROTECTED]> writes: > Simon Josefsson <[EMAIL PROTECTED]> wrote: > >> Authentication identity "admin", authorization identity U+4711, >> password X. For the argument, let's say U+4711 decomposes into U+1234 >> in Unicode 3.2 but is later changed to U+4321. >> >> The SASL library, acting as a proxy in front of the application >> software, implements the current libstringprep correctly. It checks >> that admin's password is X and that he is authorized to log in as >> U+1234 (which is the result after stringprep of U+4711, which was sent >> because the client hadn't been updated to use stringprep, which should >> cause no problem) and says OK to the application. >> >> Now, in 1a the application is using updated tables from a more recent >> stringprep that incorporates the fixed decomposition mapping, causing >> it to admit the user to an account U+4321. This is bad. >> >> In 2a, the application sees that the characters are deprecated due >> to its decomposition mapping changed, and rejects the user. This is >> good. > > It looks like the security hole in 1a stems from the existence of two > Unicode strings X and Y such that now Stringprep(X) != Stringprep(Y) > (so that two distinct accounts for X and Y can be created), but later > (after the update of the decomposition mappings) Stringprep(X) == > Stringprep(Y), so the two accounts will get confused. > > But I think the same phenomenon can happen with 2a. There are CNS > 11643 strings A and B such that now Stringprep(CNS11643toUnicode(A)) > != Stringprep(CNS11643toUnicode(B)) (so that two distinct accounts for > A and B can be created), but later (after the deprecation and addition > of Unicode characters, and the subsequent update of CNS11643toUnicode > to use the new Unicode characters instead of the deprecated ones) > Stringprep(CNS11643toUnicode(A)) == Stringprep(CNS11643toUnicode(B)), > so again the two accounts get confused. 
No deprecated characters are > going to be seen and rejected, because no CNS 11643 characters are > deprecated, and the deprecated Unicode characters do not appear in the > new CNS11643toUnicode table.
Yes, if I understand it correctly, I believe this was my motivation
for proposing a security consideration along these lines: transcoding
to and from the Unicode charset is a critical part of secure IDN, and
the current IDN specification set doesn't address that problem. IMHO
that problem is bigger than the issues discussed here, which only
concern a few and, more importantly, well-known characters. To my
knowledge, nobody has studied how many or which characters are in
conflict in the various transcoding tables in use.

But in case 1a, it seems the problem can happen even when only Unicode
is used. So the kind of problem caused by transcoding exists even when
no transcoding is involved.

> An approach that would really avoid this pitfall would be to deprecate
> these characters not only in Unicode, but also in CNS 11643 and any
> other character sets that contain them, and create new characters
> in all these character sets, and leave all the mappings of the old
> deprecated characters unchanged in both the Unicode database and the
> WhateverToUnicode tables.

That would solve it. However, it seems the IDN WG has decided to care
only about Unicode, so the consequence of that decision is that this
solution is out of scope as far as IDN is concerned. Anyone
implementing this in the real world will have to solve the problem
herself though, perhaps by advocating the solution you proposed.
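
To make the scenario discussed above concrete, here is a minimal sketch
in Python. Everything in it is an assumption for illustration: the
tables OLD_TABLE and NEW_TABLE, the DEPRECATED set, and the functions
prep and prep_strict are hypothetical stand-ins, not libstringprep or
real Unicode data. The code points are the made-up ones from the
example in this thread.

```python
# Toy model of the account confusion discussed in this thread.
# All tables are hypothetical; the code points are the made-up ones
# from the example (U+4711, U+1234, U+4321), not real Unicode data.

OLD_TABLE = {"\u4711": "\u1234"}  # decomposition as of "Unicode 3.2"
NEW_TABLE = {"\u4711": "\u4321"}  # decomposition after the later fix
DEPRECATED = {"\u4711"}           # 2a: characters whose mapping changed


def prep(name, table):
    """Stand-in for stringprep: map each character through the table."""
    return "".join(table.get(ch, ch) for ch in name)


def prep_strict(name, table, deprecated=DEPRECATED):
    """2a-style variant: refuse input containing deprecated characters."""
    if any(ch in deprecated for ch in name):
        raise ValueError("deprecated character in identity")
    return prep(name, table)


sent = "\u4711"  # authorization identity sent by the un-updated client

# 1a: proxy and application use different table versions and disagree.
proxy_account = prep(sent, OLD_TABLE)  # proxy authorizes U+1234
app_account = prep(sent, NEW_TABLE)    # application admits to U+4321
mismatch = proxy_account != app_account

# Adam's formulation: X and Y are distinct now but equal after the update.
x, y = "\u4711", "\u4321"
distinct_before = prep(x, OLD_TABLE) != prep(y, OLD_TABLE)
confused_after = prep(x, NEW_TABLE) == prep(y, NEW_TABLE)

# 2a: the application rejects the identity instead of remapping it.
try:
    prep_strict(sent, NEW_TABLE)
    rejected = False
except ValueError:
    rejected = True
```

Note that the prep_strict check only helps when the deprecated Unicode
character actually reaches it. In the transcoding case, the updated
WhateverToUnicode table would emit the new character directly, so a
check like this would never fire.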
