Since I know this is going to come up, I figure I should pre-empt it and be done with it. (Though I should've put this in the string document. Ah, well. Hopefully timing isn't everything, or I am *so* in trouble...)

Why aren't we converting to Unicode on the edge? Since, after all, any Sane Language will do all its string handling in Unicode, right? Why leave things the way they are until late?

Simple. Efficiency.

It's no less efficient to defer conversion of string data to Unicode (or, heck, from a harder-to-use encoding such as UTF-8 to an easier-to-use one such as UTF-32) until it's demanded than it is to do it at the edge. But... we get the bonus of *not* spending the time on the encoding shifts and charset shifts if we don't need to. Which is what happens for folks who, for example, never *do* anything that'd mandate the shift. And if they do, well, we do the shift once, then switch the string's vtable pointers over to the new encoding, and never have to do so again.
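
For the curious, here's a rough sketch of the do-it-once-then-swap-the-vtable idea. It's Python rather than Parrot's C, and every name in it (`Latin1Ops`, `Utf32Ops`, `LazyString`, `to_utf32`) is made up for illustration. None of this is Parrot's actual string API; treat it as a toy model under those assumptions.

```python
class Latin1Ops:
    """Hypothetical operation table for one-byte-per-character Latin-1 data."""
    name = "latin-1"

    @staticmethod
    def length(buf):
        return len(buf)

    @staticmethod
    def codepoint_at(buf, i):
        # Latin-1 byte values are the code points 0..255 directly.
        return buf[i]


class Utf32Ops:
    """Hypothetical operation table for four-bytes-per-character UTF-32 data."""
    name = "utf-32"

    @staticmethod
    def length(buf):
        return len(buf) // 4

    @staticmethod
    def codepoint_at(buf, i):
        return int.from_bytes(buf[4 * i:4 * i + 4], "little")


class LazyString:
    """A string that keeps whatever encoding it arrived in until told otherwise."""

    def __init__(self, buf, ops):
        self.buf = buf          # raw bytes, untouched at the edge
        self.ops = ops          # stands in for the string's vtable pointer
        self.conversions = 0    # counts how often we actually paid for a shift

    def length(self):
        return self.ops.length(self.buf)

    def codepoint_at(self, i):
        return self.ops.codepoint_at(self.buf, i)

    def to_utf32(self):
        """Convert on demand; a second call is a no-op."""
        if self.ops is Utf32Ops:
            return
        count = self.ops.length(self.buf)
        points = [self.ops.codepoint_at(self.buf, i) for i in range(count)]
        self.buf = b"".join(p.to_bytes(4, "little") for p in points)
        self.ops = Utf32Ops     # swap the "vtable"; later calls dispatch here
        self.conversions += 1


s = LazyString("café".encode("latin-1"), Latin1Ops)
s.length()      # served by Latin1Ops, no conversion has happened
s.to_utf32()    # first demand: convert and swap the operation table
s.to_utf32()    # already UTF-32, nothing to do
```

Code that only ever asks for `length()` or `codepoint_at()` on the Latin-1 string never triggers a conversion at all, which is the whole point.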

And while that may not be an overwhelming win, nor convincing to everyone, it also means that folks who want to stick with a single, non-Unicode setup (US-ASCII or Latin-1 folks who don't want to shift) can do so without incurring a penalty in time, space, or e-mail complaining. (And, bluntly, at this point I consider features that let people not grumble a big win.)

Is it a bit more work for us? Well, a little, but no more so than using vtables for PMCs to do stuff, and that's all worked out quite nicely, honestly.

I do realize that the Big ICU Patch tossed a lot of the infrastructure for this, which broke parrot for folks who can't/won't do ICU. (And there are a number of folks shut out of development because they can't get ICU going.) That'll be put back over the next week or so, with ICU factored out into an optional build feature.
--
Dan


--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
