Re: [fpc-devel] String handling in trunk (was utf8 in 2.6.0)

Mark Morgan Lloyd Mon, 07 Jan 2013 09:05:18 -0800

Tomas Hajny wrote:

On Mon, January 7, 2013 13:28, Ewald wrote:

Once upon a time, on 01/07/2013 12:39 PM to be precise, Michael Schnell
said:

On 01/05/2013 12:28 PM, Jonas Maebe wrote:

Using whatever #xx#xx or #xx#xx#xx sequence represents the UTF-8
encoding of that character.

Sorry, I can't follow. Does #xx not just define a numerical
representation of an 8 bit entity ?


The interpretation in any code might be done later by any code that
digests the string.

Am I wrong ?

I *think* Jonas is trying to say that if you want the character `Ǿ` in a
string you would either type
- 'Ǿ' or
- #$C7#$BE if you want to keep the source free of encoding specific
characters

 .
 .

...or
- #$01FE and then the whole string becomes a Unicode string which is
either kept that way (if it is assigned to a UnicodeString constant), or
it is converted to some 8-bit encoding at compile time (if it is assigned
to an 8-bit constant/variable like ansistring)

(also just my understanding of what Jonas wrote)
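
For anyone who wants to check the byte values quoted above, here is a minimal sketch. It is Python rather than Pascal, and it says nothing about what the compiler does with such literals; it only confirms that the code point U+01FE ('Ǿ') has the UTF-8 encoding C7 BE and fits in a single 16-bit unit 01FE. The variable names are just for illustration.

```python
# Illustration only: Python, not Free Pascal. It checks the numeric values
# quoted in the thread, not the compiler's handling of #$.. literals.
ch = "\u01FE"  # LATIN CAPITAL LETTER O WITH STROKE AND ACUTE

utf8_bytes = ch.encode("utf-8")       # the two bytes written as #$C7#$BE
utf16_units = ch.encode("utf-16-be")  # the single 16-bit unit written as #$01FE

print(utf8_bytes.hex())   # c7be
print(utf16_units.hex())  # 01fe
```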

That's how I read it as well. In which case, is #$A3 16-bit Unicode (representing the UK £ sterling sign) or malformed UTF-8 (which should be #$C2#$A3)?
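
To make the ambiguity concrete, a small sketch (again Python, purely as a byte-level check, with a hypothetical helper name `is_valid_utf8`): the code point U+00A3 is the pound sign, its UTF-8 form is C2 A3, and a lone A3 byte is not well-formed UTF-8 because A3 is a continuation byte with no lead byte before it. How the compiler chooses between these readings is exactly the open question and is not something this sketch answers.

```python
# Illustration only: Python, not Free Pascal.
pound = "\u00A3"  # POUND SIGN

as_utf8 = pound.encode("utf-8")      # two bytes: C2 A3
as_latin1 = pound.encode("latin-1")  # one byte: A3 (ISO 8859-1)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as well-formed UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


lone_a3_valid = is_valid_utf8(b"\xa3")   # continuation byte without a lead byte
pair_valid = is_valid_utf8(b"\xc2\xa3")  # proper two-byte sequence

print(as_utf8.hex(), as_latin1.hex())  # c2a3 a3
print(lone_a3_valid, pair_valid)       # False True
```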


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
