Re: UTF-16 endianess

2016-01-30 Thread Marek Janukowicz via Digitalmars-d-learn
On Fri, 29 Jan 2016 18:58:17 -0500, Steven Schveighoffer wrote:
>>> Note the version identifiers BigEndian and LittleEndian can be used to
>>> compile the correct code.
>>
>> This solution is of no use to me as I don't want to change the endianness in
>> general.
>
> What I mean is that you can
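The version identifiers mentioned in the quote select code at compile time according to the target's byte order. The following is only a minimal sketch of that idea; `platformOrder` and `memoryBytes` are hypothetical names invented for illustration and do not come from the thread.

```d
import std.stdio;

// Pick a label at compile time from the predefined version identifiers.
version (LittleEndian)
    enum platformOrder = "little-endian";
else version (BigEndian)
    enum platformOrder = "big-endian";
else
    static assert(0, "Unknown byte order");

/// Hypothetical helper: the two bytes of a UTF-16 code unit in the order
/// they occupy memory on the platform being compiled for.
ubyte[2] memoryBytes(wchar unit)
{
    ubyte[2] result;
    version (LittleEndian)
    {
        result[0] = cast(ubyte)(unit & 0xFF); // low byte first
        result[1] = cast(ubyte)(unit >> 8);
    }
    else
    {
        result[0] = cast(ubyte)(unit >> 8);   // high byte first
        result[1] = cast(ubyte)(unit & 0xFF);
    }
    return result;
}

void main()
{
    auto bytes = memoryBytes('\u0142'); // the 'ł' code unit from the question
    writefln("%s platform: %02x %02x", platformOrder, bytes[0], bytes[1]);
}
```

Only one branch of each `version` block is compiled, so the choice costs nothing at run time.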

Re: UTF-16 endianess

2016-01-29 Thread Steven Schveighoffer via Digitalmars-d-learn
On 1/29/16 6:03 PM, Marek Janukowicz wrote:
> On Fri, 29 Jan 2016 17:43:26 -0500, Steven Schveighoffer wrote:
>>> Is there anything I should know about UTF endianness?
>>
>> It's not any different from other endianness.
>>
>> In other words, a UTF16 code unit is expected to be in the endianness of
>> the platform
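Because a `wchar` holds a code unit in the platform's own byte order, bytes that arrive in a fixed order (for example big-endian, as in the `[0x01, 0x42]` array from the original question) have to be assembled into code units explicitly. Below is a rough sketch under that assumption; `fromUtf16BE` is a hypothetical helper, not something proposed in the thread. Phobos also provides `std.bitmanip.bigEndianToNative` for this kind of conversion.

```d
import std.stdio;

/// Hypothetical helper: builds UTF-16 code units from big-endian byte pairs.
/// Shifting and or-ing operates on values, so it works on any platform.
wstring fromUtf16BE(const(ubyte)[] data)
{
    assert(data.length % 2 == 0, "UTF-16 data must have an even byte count");
    wchar[] units;
    for (size_t i = 0; i < data.length; i += 2)
    {
        // The first byte is the high half, the second byte the low half.
        units ~= cast(wchar)((data[i] << 8) | data[i + 1]);
    }
    return units.idup;
}

void main()
{
    ubyte[] bigEndianBytes = [0x01, 0x42];
    wstring s = fromUtf16BE(bigEndianBytes);
    assert(s == "\u0142"w); // U+0142 is 'ł'
    writeln(s);
}
```

Casting the byte array directly to `wchar[]` would instead reinterpret the bytes in native order, which gives the swapped value on a little-endian machine.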

UTF-16 endianess

2016-01-29 Thread Marek Janukowicz via Digitalmars-d-learn
I have trouble understanding how endianness works for UTF-16.

For example, the UTF-16 code for the 'ł' character is 0x0142. But this program shows otherwise:

import std.stdio;

public void main () {
  ubyte[] properOrder = [0x01, 0x42];
  ubyte[] reverseOrder = [0x42, 0x01];
  writefln(

Re: UTF-16 endianess

2016-01-29 Thread Johannes Pfau via Digitalmars-d-learn
On Fri, 29 Jan 2016 18:58:17 -0500, Steven Schveighoffer wrote:
> On 1/29/16 6:03 PM, Marek Janukowicz wrote:
> > On Fri, 29 Jan 2016 17:43:26 -0500, Steven Schveighoffer wrote:
> >>> Is there anything I should know about UTF endianness?
> >>
> >> It's not any different

Re: UTF-16 endianess

2016-01-29 Thread Adam D. Ruppe via Digitalmars-d-learn
On Friday, 29 January 2016 at 22:36:37 UTC, Marek Janukowicz wrote:
> I have trouble understanding how endianness works for UTF-16.

UTF-16 (as well as UTF-32) comes in both little-endian and big-endian variants. A byte-order mark in the file can help you detect which one is in use. See this
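As an illustration of the byte-order mark check, here is a small sketch that looks at the first two bytes of a buffer. The names `Utf16Order` and `detectUtf16Bom` are made up for this example. It assumes the data is UTF-16 if it has a mark at all; it does not try to tell a UTF-16 mark apart from the UTF-32 little-endian mark, which begins with the same two bytes.

```d
import std.stdio;

enum Utf16Order { unknown, littleEndian, bigEndian }

/// Hypothetical helper: reports the byte order implied by a leading
/// UTF-16 byte-order mark (U+FEFF), or `unknown` if none is present.
Utf16Order detectUtf16Bom(const(ubyte)[] data)
{
    if (data.length >= 2)
    {
        // U+FEFF stored low byte first: FF FE
        if (data[0] == 0xFF && data[1] == 0xFE)
            return Utf16Order.littleEndian;
        // U+FEFF stored high byte first: FE FF
        if (data[0] == 0xFE && data[1] == 0xFF)
            return Utf16Order.bigEndian;
    }
    return Utf16Order.unknown;
}

void main()
{
    ubyte[] le = [0xFF, 0xFE, 0x42, 0x01]; // mark, then 'ł' little-endian
    ubyte[] be = [0xFE, 0xFF, 0x01, 0x42]; // mark, then 'ł' big-endian
    writeln(detectUtf16Bom(le));
    writeln(detectUtf16Bom(be));
}
```

Data without a mark yields `unknown`, in which case the byte order has to come from elsewhere, such as the file format or protocol in use.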

Re: UTF-16 endianess

2016-01-29 Thread Steven Schveighoffer via Digitalmars-d-learn
On 1/29/16 5:36 PM, Marek Janukowicz wrote:
> I have trouble understanding how endianness works for UTF-16.
>
> For example, the UTF-16 code for the 'ł' character is 0x0142. But this program shows otherwise:
>
> import std.stdio;
>
> public void main () {
>   ubyte[] properOrder = [0x01, 0x42];
>   ubyte[]

Re: UTF-16 endianess

2016-01-29 Thread Marek Janukowicz via Digitalmars-d-learn
On Fri, 29 Jan 2016 17:43:26 -0500, Steven Schveighoffer wrote:
>> Is there anything I should know about UTF endianness?
>
> It's not any different from other endianness.
>
> In other words, a UTF16 code unit is expected to be in the endianness of
> the platform you are running on.
>
> If you are