Planning to migrate SDWF to Unicode

Stewart Gordon Sat, 21 Jan 2012 17:00:57 -0800

Those who've been following SDWF will by now have realised that it abuses char[] for ANSI strings, whereas D strings are meant to be in Unicode. It's high time I did something about this.

When I started on it, I was still using Windows 98, which has very limited Unicode support. But that was years ago now. And it must be coming on 7 years now since MS discontinued support for it. So I might as well drop Windows 9x support, just like 16-bit support was dropped with the creation of D (which was only 4 years after Windows 95, after all).

As such, I plan to change SDWF to work in Unicode. Probably using UTF-16 internally, but possibly giving the programmer the choice between UTF-8 and UTF-16.

But this raises the question of what to do with the existing char-based API. Possibilities I've thought of:

(a) Just get rid of it. Programmers upgrading to the new SDWF version will be forced to change instances of char to wchar; what more there is to do depends on what else the program does with character/string data.

(b) Keep functions that take a char or char[] parameter, make them convert from ANSI to UTF-16, but deprecate them. Thinking about it now, there are problems:

- In order to have versions of each function that return an ANSI string and that return a Unicode string, I would need to name them differently, which could get ugly.

- When returning ANSI, what would happen to characters outside the code page?

- Mixing ANSI and Unicode could also have adverse effects on the interpretation of string literals.

So maybe this isn't a good plan at all.

(c) Use versioning to give the programmer the choice of an ANSI API or a UTF-16 API, rather like the WindowsAPI bindings themselves.

(d) Change char functions to use UTF-8. This would break any code that relies on the characters being ANSI, or even that manipulates text on a one character, one byte basis. As with (c), versioning could be used to give a choice between UTF-8 and UTF-16.
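
To make the versioning idea in (c) and (d) concrete, here is a rough sketch in D2 syntax. The version identifiers SDWF_ANSI and SDWF_UTF8, the aliases SdwfChar and SdwfString, and the function codeUnits are all made up for illustration; nothing here is existing SDWF API.

```d
// Hypothetical names throughout: SDWF_ANSI, SDWF_UTF8, SdwfChar, SdwfString
// and codeUnits are illustrative only and do not exist in SDWF.
version (SDWF_ANSI)
    alias SdwfChar = char;    // bytes in the system ANSI code page (legacy)
else version (SDWF_UTF8)
    alias SdwfChar = char;    // UTF-8 code units
else
    alias SdwfChar = wchar;   // UTF-16 code units (default in this sketch)

alias SdwfString = immutable(SdwfChar)[];

// A library function is declared once, against the alias.
size_t codeUnits(SdwfString text)
{
    return text.length;
}

void main()
{
    // An unsuffixed literal converts implicitly to string or wstring,
    // so this line compiles under every version. Under SDWF_ANSI a
    // non-ASCII literal would still hold UTF-8 bytes, not ANSI ones.
    SdwfString title = "Hello";
    assert(codeUnits(title) == 5);
}
```

The program would pick an encoding at compile time, for example with dmd -version=SDWF_UTF8. The drawback is that a library built one way cannot be linked into a program that expects the other.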

If path (b) or (c) is taken, the ANSI API could later be removed. Once this is done, or if path (a) is taken, we could add UTF-8 support, thereby ending up at (d).
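
Two of the hazards mentioned above can be shown in a few lines of D2. This is only a sketch: toLatin1Lossy is a hypothetical helper, and ISO 8859-1 stands in for a real ANSI code page simply because its 256 values coincide with U+0000 to U+00FF. A real implementation on Windows would go through the system's code page conversion instead.

```d
import std.utf : count, toUTF16;

// Hypothetical helper: narrow UTF-16 text to ISO 8859-1, used here only as a
// stand-in for an ANSI code page. Anything above U+00FF becomes '?', so the
// conversion is lossy.
ubyte[] toLatin1Lossy(const(wchar)[] text)
{
    ubyte[] result;
    foreach (dchar c; text)   // iterating as dchar decodes surrogate pairs
        result ~= c <= 0xFF ? cast(ubyte) c : cast(ubyte) '?';
    return result;
}

void main()
{
    // Hazard 1: in UTF-8, one character is not always one byte.
    string utf8 = "na\u00EFve";      // U+00EF takes two bytes in UTF-8
    assert(utf8.length == 6);        // .length counts code units
    assert(count(utf8) == 5);        // std.utf.count counts code points

    wstring utf16 = toUTF16(utf8);
    assert(utf16.length == 5);       // each character here is one UTF-16 code unit

    // Hazard 2: characters outside the code page cannot survive narrowing.
    // U+20AC (euro sign) is not in ISO 8859-1 and comes back as '?'.
    ubyte[] expected = [0x61, 0xE9, 0x3F];
    assert(toLatin1Lossy("a\u00E9\u20AC"w) == expected);
}
```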

It's early days yet, but the thread I started a few hours ago ("D1, D2 and the future of libraries") could still lead to my migrating SDWF to D2. If it does, I'll likely combine the migration to Unicode with this.


Thoughts?

Stewart.
