Re: [fltk.development] RFC: Pure UTF-8 or Hybrid CP1252 ?

Albrecht Schlosser Mon, 22 Nov 2010 08:05:11 -0800

On 20.11.2010, at 01:34, Michael Sweet wrote:
> On Nov 19, 2010, at 3:45 PM, Duncan Gibson wrote:
>> ...
>> Before we release FLTK-1.3.0 and commit to keeping the same character
>> set support until at least the next major release after that, we need
>> to decide on whether we support one of [at least] three options:
>>
>> 1. We decide that FLTK-1.3.0 will be the first release that will
>> support pure UTF-8 only, and that CP1252 data in files, etc. will
>> be converted to pure UTF-8 during input, with or without warning.
>
> I think we actually want option 1A, namely that we don't do any stinkin'
> automatic conversions and that everything is UTF-8 only.


+1

> In theory we could support Bill's original "ISO-8859-1 + UTF-8" hybrid
> mode, however that adds a lot of complexity and may have issues if we
> expand it to include CP-1252 (which is a superset of ISO-8859-1).
> Moreover, it puts the onus on us to correctly guess the encoding and
> convert every time we draw, since no system API supports the hybrid mode.

Agreed. In fact, I don't think that we can "guess right" at all if we
allow mixing ISO-8859-1 and/or CP-1252 with UTF-8, since almost every
byte that can occur in UTF-8 text is also a valid character in CP-1252
(somewhat fewer in ISO-8859-1).
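
To make the ambiguity concrete, here is a minimal sketch in plain C++.
The helper names are hypothetical and none of this is FLTK API. It
assumes the published CP-1252 table, which leaves the five positions
0x81, 0x8D, 0x8F, 0x90 and 0x9D unassigned:

```cpp
#include <string>

// Hypothetical helper (not FLTK API): true if byte b is assigned a
// character in the published CP-1252 table.
bool cp1252_defines(unsigned char b) {
    return !(b == 0x81 || b == 0x8D || b == 0x8F || b == 0x90 || b == 0x9D);
}

// True if every byte of s is an assigned CP-1252 character, i.e. the
// string cannot be ruled out as CP-1252 by looking at the bytes alone.
bool plausible_cp1252(const std::string& s) {
    for (unsigned char b : s) {
        if (!cp1252_defines(b)) return false;
    }
    return true;
}

// Counts the byte values that may appear in well-formed UTF-8
// (0x00-0xBF and 0xC2-0xF4) but are unassigned in CP-1252.
int utf8_bytes_undefined_in_cp1252() {
    int count = 0;
    for (int b = 0; b <= 0xFF; ++b) {
        const bool in_utf8 = (b <= 0xBF) || (b >= 0xC2 && b <= 0xF4);
        if (in_utf8 && !cp1252_defines(static_cast<unsigned char>(b))) ++count;
    }
    return count;
}
```

For example, the UTF-8 encoding of "é" is the byte pair 0xC3 0xA9, and
plausible_cp1252("\xC3\xA9") returns true, because the same two bytes
are the CP-1252 characters "Ã" and "©". Only text that happens to
contain one of the five unassigned bytes could be excluded this way.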

The only important point I can see is that the user's application should
not crash (not even with a controlled assert()) if s/he uses standard
FLTK functions. However, I don't know whether we can achieve this...

I'm thinking of test/editor: can/should it check the text buffer after
reading a file? Sometimes you don't know a file's encoding before
you open it, and what about the file chooser's preview? Could it
crash a user's program if s/he looks at a file in CP-1252 encoding?
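
As a rough illustration of what such a check after reading a file could
look like, here is a self-contained sketch in plain C++. The function
name is hypothetical and it does not use any FLTK API; it only tests
whether a buffer is well-formed UTF-8 (no stray or missing continuation
bytes, no overlong forms, no surrogates, nothing above U+10FFFF):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not FLTK API): true if s is well-formed UTF-8.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) { ++i; continue; }          // plain ASCII byte

        std::size_t len;        // total length of this sequence in bytes
        unsigned long cp;       // code point being decoded
        unsigned long min_cp;   // smallest code point allowed for this length
        if (b0 >= 0xC2 && b0 <= 0xDF)      { len = 2; cp = b0 & 0x1F; min_cp = 0x80; }
        else if (b0 >= 0xE0 && b0 <= 0xEF) { len = 3; cp = b0 & 0x0F; min_cp = 0x800; }
        else if (b0 >= 0xF0 && b0 <= 0xF4) { len = 4; cp = b0 & 0x07; min_cp = 0x10000; }
        else return false;      // stray continuation byte or invalid lead byte

        if (len > n - i) return false;             // sequence is truncated
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        i += len;
    }
    return true;
}
```

With this, is_valid_utf8("caf\xC3\xA9") is true, while the CP-1252
spelling "caf\xE9" is rejected. A failed check only says the data is not
UTF-8; it does not say which legacy encoding it is, so what the caller
should do next (refuse, warn, show replacement characters) remains an
open design question.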

How much can we do about this?

Just a few thoughts...

Albrecht
