Hi Tim,

> Just a few questions.
>
> 1.
> Why don't you use 'opt.locale' to check if the local encoding is UTF-8 ?

I thought that was usable only if ENABLE_IRI was defined.

> 2.
> I don't understand how you distinguish between illegal and legal UTF-8
> sequences. I guess only legal sequences should be unescaped.
> Or to make it easy: if the string is valid UTF-8, do not escape.
> If it is not valid UTF-8, escape it.
> You could:
> Add unistr/u8-check to bootstrap.conf (./bootstrap thereafter),
> include #include "unistr.h" and use
> if (u8_check (s, strlen(s)) == 0) to test for validity.

Yes, I expected you to say something like this.

My reason: I consider this escaping a very doubtful activity.
In my eyes the correct code is not: always escape except when UTF-8,
but rather: never escape except perhaps when someone asks for it.
So the precise check for UTF-8 is in my eyes just bloat.

Moreover: what to do if the name is not valid UTF-8?
The current escaping produces something that is not valid UTF-8.
So doing the current escaping is certainly a mistake,
not better than using the name as-is.
Invent a new type of escaping?

So, for the time being, my previous patch avoided the old mistake,
without introducing new mistakes :-).

Andries
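
For reference, the "precise check" discussed above could look roughly like the sketch below. It is a hand-written stand-in for the u8_check test suggested in the quoted text, not the gnulib implementation and not code from any patch in this thread. The name is_valid_utf8 is hypothetical. It assumes "valid" means well-formed per RFC 3629: no overlong forms, no surrogates, nothing above U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of wget or gnulib): returns true if the
   n bytes at s are well-formed UTF-8. */
static bool is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        unsigned char c = s[i];
        size_t len;           /* total bytes in this sequence */
        unsigned long cp;     /* decoded code point */
        unsigned long min;    /* smallest code point allowed for this length */

        if (c < 0x80) {       /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            len = 2; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            len = 3; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            len = 4; cp = c & 0x07; min = 0x10000;
        } else {
            return false;     /* stray continuation byte or invalid lead byte */
        }

        if (n - i < len)
            return false;     /* sequence truncated at end of input */

        for (size_t k = 1; k < len; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false; /* expected a continuation byte 10xxxxxx */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (cp < min)
            return false;     /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;     /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return false;     /* beyond the Unicode range */

        i += len;
    }
    return true;
}
```

A caller would pass the file name bytes and their length, e.g. is_valid_utf8((const unsigned char *) s, strlen(s)) with <string.h> included. A lone Latin-1 byte such as 0xE9 is rejected, while the two-byte sequence 0xC3 0xA9 is accepted.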
