Re: Streaming library

Andrei Alexandrescu Thu, 14 Oct 2010 21:26:01 -0700

On 10/14/10 21:22 CDT, Rainer Deyke wrote:

On 10/14/2010 15:49, Andrei Alexandrescu wrote:

Good point. Perhaps indeed it's best to only deal with bytes and
characters at transport level.


Make that just bytes.

Character data must be encoded into bytes before it is written and
decoded before it is read.  The low-level OS functions only deal with
bytes, not characters.

I'm not so sure about that. For example, some code in std.stdio is dedicated to supporting fwide():


http://www.opengroup.org/onlinepubs/000095399/functions/fwide.html

As far as I understand, a wide stream is essentially a UCS-2 (or UTF-16? Not sure) stream that is impossible to abstract away as a stream of bytes.
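A minimal sketch in C of the orientation mechanism, assuming a C library that follows the POSIX description of fwide() linked above. The function name orientation_is_sticky is made up for illustration. It only shows that a stream starts without an orientation and keeps the first one it is given; it says nothing about which encoding ends up in the file.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if a fresh stream behaves as POSIX
   describes for fwide(), 0 if it does not, -1 if no temp file could
   be opened. */
int orientation_is_sticky(void) {
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    int before = fwide(f, 0);   /* mode 0 only queries: 0 means no orientation yet */
    int wide   = fwide(f, 1);   /* mode > 0 asks for wide orientation */
    int after  = fwide(f, -1);  /* asking for byte orientation now has no effect */

    fclose(f);
    return before == 0 && wide > 0 && after > 0;
}
```

Once the stream is wide-oriented, the request for byte orientation is ignored, which is the sense in which such a stream cannot be treated as a plain stream of bytes afterwards.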


I see Windows' commitment to fwide is... odd:

http://msdn.microsoft.com/en-us/library/aa985619%28VS.80%29.aspx

The ultimate question is whether we want to support that (as well as other dedicated text streams) or not.

Text encoding is a complicated process - consider different Unicode
encodings, different non-Unicode encodings, byte order markers, and
Windows versus Unix line endings.  Furthermore, it is often useful to
wedge an additional translation layer between the low-level (binary)
stream and the high-level text encoding layer, such as an encryption or
compression layer.

Writing characters directly to streams made sense in the pre-Unicode
world where there was a one-to-one correspondence between characters and
bytes.  In a modern world, text encoding is an important service that
deserves its own standalone module.

I'd say quite the opposite. Since encodings are now embedded all the way down at the low level (per fwide above), we can't pretend it's all bytes down there and leave characters to upper layers. There _are_ transports that deal with characters directly.


So the $1M question is, do we support text transports or not?

- fwide streams

- files for which isatty() returns true (http://www.opengroup.org/onlinepubs/009695399/functions/isatty.html)


- email protocol and probably other Internet protocols

- others?

If we don't support text at the transport level, things can still be made to work but in a more fragile manner: upper-level protocols will need to _know_ that although the API accepts any ubyte[], in fact the results would be weird and malfunctioning if the wrong things are being passed. A text-based transport would clarify at the type level that a text stream accepts only UTF-encoded characters.
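To make the type-level point concrete, here is a rough sketch in C rather than a proposed API. ByteSink, TextSink, byte_write, text_write and utf8_valid are all hypothetical names invented for this example. The byte transport takes anything; the text transport is a separate type that refuses input that is not well-formed UTF-8.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical byte transport: accepts any bytes. */
typedef struct { FILE *fp; } ByteSink;

/* Hypothetical text transport: a distinct type wrapping a byte
   transport, so the two cannot be mixed up by accident. */
typedef struct { ByteSink raw; } TextSink;

/* Returns 1 if s[0..n) is well-formed UTF-8, 0 otherwise. */
static int utf8_valid(const unsigned char *s, size_t n) {
    /* Smallest code point allowed for each sequence length. */
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    size_t i = 0;
    while (i < n) {
        unsigned char c = s[i];
        unsigned long cp;
        size_t len;
        if (c < 0x80)                { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return 0;                      /* stray continuation or invalid lead byte */
        if (len > n - i) return 0;          /* truncated sequence */
        for (size_t k = 1; k < len; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min_cp[len]) return 0;               /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogate code point */
        if (cp > 0x10FFFF) return 0;                  /* beyond Unicode range */
        i += len;
    }
    return 1;
}

/* Writes n raw bytes; returns the number of bytes written. */
size_t byte_write(ByteSink *s, const void *buf, size_t n) {
    return fwrite(buf, 1, n, s->fp);
}

/* Writes n bytes of text; returns 0 on success, -1 if the input is
   not valid UTF-8 or the underlying write came up short. */
int text_write(TextSink *t, const char *buf, size_t n) {
    if (!utf8_valid((const unsigned char *)buf, n))
        return -1;
    return byte_write(&t->raw, buf, n) == n ? 0 : -1;
}
```

With this split, an upper layer that holds a TextSink gets an error for malformed input at the call site, instead of having to know that the underlying transport would misbehave.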


I think either way is not a catastrophe. We can make it work.


Andrei
