Hi Norman,
Norman Rasmussen wrote:
I thought that was the whole point of stringprep? (make similar chars
the same), but it seems not: RFC, section 9.1: "stringprep does
nothing to map similar-looking characters together nor to prohibit
some characters because they look like others"
Well, if stringprep made similar characters the same, we might not have
some of the homograph attacks that we face with IDNs (though they
exist without IDNs as well).
The point of stringprep seems to be (RFC, section 1): "these
[stringprep] profiles will allow users to enter internationalized text
strings in applications and have the highest chance of getting the
content of the strings correct. In this case, "correct" means that if
two different people enter what they think is the same string into two
different input mechanisms, the strings should match on a
character-by-character basis. [...] In addition to helping string
matching, profiles of stringprep can also exclude characters that
should not normally appear in text that is used in the protocol."
Having only ever lived and worked in an English-centric world, I have
trouble understanding the issues that stringprep addresses, but I'm
trying!
Well, I think one of my examples (that's why I started with that one)
demonstrates this very well.
Look at "℉" and "°F". Suppose someone reads in a magazine the JabberID
of a fictional new bot at jabber.org that reports the temperatures at
different places in the world. This JID is [EMAIL PROTECTED]. How can he
tell whether he has to enter [EMAIL PROTECTED] or [EMAIL PROTECTED] in his
Jabber client? There is no real difference between these two addresses
(besides that they might look different in this mail, as your client may
use different fonts for the two). Stringprep ensures that you can enter
either one and get the same bot.
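
To make this concrete, here is a small Python sketch. It is not a full
stringprep profile: simplified_prep is a hypothetical helper that only
applies Unicode NFKC normalization and lowercasing, which is roughly the
mapping part of what a profile does. It shows that the single character
"℉" and the two characters "°F" are different strings before preparation
and identical afterwards.

```python
import unicodedata


def simplified_prep(text: str) -> str:
    """Hypothetical helper: NFKC normalization plus lowercasing.

    Only a rough approximation of the mapping step of a stringprep
    profile; it does not check for prohibited characters or bidi rules.
    """
    return unicodedata.normalize("NFKC", text).lower()


single_char = "\u2109"   # "℉", DEGREE FAHRENHEIT as one code point
two_chars = "\u00b0F"    # "°" (DEGREE SIGN) followed by "F"

# The raw strings differ character by character...
print(single_char == two_chars)                                    # False
# ...but they match once both have been prepared.
print(simplified_prep(single_char) == simplified_prep(two_chars))  # True
```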
I have this great concern that a lot of XMPP developers might think
the same thoughts that I had the first time I looked at
stringprep, mainly: "It's too hard, and I don't understand it, and
my application works at the moment, so why should I care?" Then two
months after the release of your code, you get some
Spanish/Polish/Russian guy using non-English characters telling you
that 'your program doesn't work', because you were too lazy to figure
out stringprep.
Yes, I understand this. But there are great libraries already available
for all types of programming languages. You don't have to understand the
description of stringprep. You can just use an existing library to
prepare your strings and you are done.
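
As an illustration of how little code that can be, Python's standard
library ships a Nameprep implementation in encodings.idna. Note the
assumption here: Nameprep is the stringprep profile for domain names, and
I am only using it as a stand-in. For the node and resource parts of a JID
you would want a library that implements the Nodeprep and Resourceprep
profiles, which the Python standard library does not provide.

```python
from encodings.idna import nameprep

# Both spellings go through the library's stringprep profile (Nameprep).
prepared_single = nameprep("\u2109")   # "℉" as one code point
prepared_pair = nameprep("\u00b0F")    # "°" followed by "F"

# After preparation the two inputs compare equal.
print(prepared_single == prepared_pair)  # True
```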
See you
Matthias