Re: [Pharo-users] OSProcess command with german umlaut does not work

Sven Van Caekenberghe Mon, 06 Jun 2016 11:36:06 -0700

> On 06 Jun 2016, at 17:22, Sabine Manaa <manaa.sab...@gmail.com> wrote:
> 
> why ByteArray?


http://www.unicode.org/faq/utf_bom.html

A Unicode transformation format (UTF) is an algorithmic mapping from every 
Unicode code point (except surrogate code points) to a unique byte sequence.

https://en.wikipedia.org/wiki/UTF-8

UTF-8 encodes each of the 1,112,064 valid code points in the Unicode code space 
(1,114,112 code points minus 2,048 surrogate code points) using one to four 
8-bit bytes (a group of 8 bits is known as an octet in the Unicode Standard).
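
To make the "one to four bytes" point concrete, here is a minimal sketch in Python (not Pharo, chosen only to keep the example self-contained). The sample characters are my own picks, not taken from the original question.

```python
# Encode a few sample characters and record how many bytes UTF-8 needs for each.
samples = ["a", "ü", "€", "😀"]
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

# The umlaut is a single character but two bytes on the wire.
umlaut_bytes = list("ü".encode("utf-8"))

print(lengths)
print(umlaut_bytes)
```

The result of encoding is a sequence of bytes, not a string, which is why a ByteArray is the natural type for it.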

In Pharo:

https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html

Of course, given a ByteArray, whose values are all between 0 and 255 by 
definition, you can convert it to a ByteString. But that String is no longer a 
correct (Pharo) String. It is like converting a PNG or JPEG to a String: you 
can do it, but it is just wrong.
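
The effect can be sketched as follows, again in Python as a stand-in for Pharo. The sample word and the variable names are assumptions made for illustration. Each encoded byte is turned into one character, which is roughly what converting a ByteArray to a ByteString does.

```python
original = "Grüße"

# Proper encoding: the result is bytes, the analogue of a ByteArray.
encoded = original.encode("utf-8")

# Forcing the bytes into a string: one character per byte.
forced = "".join(chr(b) for b in encoded)

# The forced string only makes sense again once it is turned back into
# bytes and decoded as UTF-8.
recovered = bytes(ord(c) for c in forced).decode("utf-8")

print(len(original), len(encoded), len(forced))
print(forced == original, recovered == original)
```

The forced string has one character per byte, so it is longer than the original and no longer contains the umlauts as characters. It is only a container for bytes.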

When talking to the outside world, be it over a network connection or via 
primitive calls, anything but a pure ASCII string needs an encoding. This has 
to be agreed upon by both parties. If the receiving party wants UTF-8 forced 
into a (kind of) String, that is (still) possible.

Your initial solution seems to indicate that this is expected. This (ugly) 
conversion should be done at as low a level as possible, IMHO.
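
A rough sketch of what "as low a level as possible" could look like, in Python rather than Pharo. Everything here is hypothetical: `primitive_echo` merely stands in for a primitive that accepts only byte-valued characters, and the helper names are invented for illustration. It is not how OSProcess is implemented.

```python
def primitive_echo(byte_string: str) -> str:
    """Hypothetical primitive: accepts only characters with values 0..255."""
    if any(ord(c) > 255 for c in byte_string):
        raise ValueError("primitive accepts only byte-valued characters")
    return byte_string


def to_byte_string(text: str) -> str:
    """Encode to UTF-8, then map each byte to the character with that value."""
    return text.encode("utf-8").decode("latin-1")


def from_byte_string(byte_string: str) -> str:
    """Inverse of to_byte_string: recover the bytes, then decode as UTF-8."""
    return byte_string.encode("latin-1").decode("utf-8")


def echo(text: str) -> str:
    """Boundary wrapper: callers only ever see proper strings."""
    return from_byte_string(primitive_echo(to_byte_string(text)))


result = echo("mkdir Übung €")
print(result)
```

With the conversion confined to the wrapper, the rest of the program never handles the byte-stuffed string directly.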

Sven
