Denis Koroskin wrote:
On Wed, 13 Oct 2010 20:55:04 +0400, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:

On 10/13/10 11:16 CDT, Denis Koroskin wrote:
On Wed, 13 Oct 2010 18:32:15 +0400, Andrei Alexandrescu
So far so good. I will point out, however, that the classic read/write
routines are not all that good. For example if you want to implement a
line-buffered stream on top of a block-buffered stream you'll be
forced to write inefficient code.


Never heard of filesystems that allow reading files in lines - they
always read in blocks, and that's what streams should do.

http://www.gnu.org/s/libc/manual/html_node/Buffering-Concepts.html

I don't think streams must mimic the low-level OS I/O interface.


I, in contrast, think that Streams should be the lowest-level possible platform-independent abstraction. No buffering besides what the OS provides, no additional functionality. If you need to be able to read something up to some character (besides, what should be considered a new-line separator: \r, \n, \r\n?), this should be done manually in "byLine".


Platform-independent? OS-independent, yes. But being independent of endianness, the availability of 80-bit reals, etc. is too much for a simple stream (of course we'd need an EndianStream that can wrap a simple stream and take care of the endianness).

That's because
most of the streams are binary streams, and there is no such thing as a
"line" in them (e.g. how often do you need to read a line from a
SocketStream?).

http://www.opengroup.org/onlinepubs/009695399/functions/isatty.html


These are special cases I don't like. There is no such thing in Windows anyway.

You need a line when e.g. you parse an HTML header or an email header or an FTP response. Again, if at a low level the transfer occurs in blocks, that doesn't mean the API must do the same at all levels.


BSD sockets transmit in blocks. If you need to find a special sequence in a socket stream, you are forced to fetch a chunk and manually search it for the needed sequence. My position is that you should do it with an external predicate (e.g. read until whitespace).
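A minimal sketch of the "external predicate" idea, for illustration only. BlockSource and DelimitedReader are hypothetical names, not part of any proposed or existing API; BlockSource merely imitates a source that delivers data in blocks, and the adapter fetches chunks and searches them for a delimiter outside the stream itself.

```d
// Hypothetical block-oriented source: each read() hands out at most
// blockSize bytes, the way a socket read returns whatever has arrived.
struct BlockSource
{
    const(ubyte)[] data;  // bytes not yet delivered
    size_t blockSize;

    // Copies at most min(buf.length, blockSize) bytes; returns 0 at end of input.
    size_t read(ubyte[] buf)
    {
        import std.algorithm.comparison : min;
        immutable n = min(buf.length, blockSize, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// External adapter (an assumption of this sketch): splits the byte
// stream at any byte the predicate accepts.
struct DelimitedReader
{
    BlockSource* source;
    bool function(ubyte) isDelimiter;
    ubyte[] pending;  // fetched but not yet returned

    // Stores the next token (delimiter excluded) in `token`.
    // Returns false once the input is exhausted.
    bool next(out ubyte[] token)
    {
        size_t scanned = 0;  // bytes of `pending` already checked
        for (;;)
        {
            foreach (i; scanned .. pending.length)
            {
                if (isDelimiter(pending[i]))
                {
                    token = pending[0 .. i].dup;
                    pending = pending[i + 1 .. $];
                    return true;
                }
            }
            scanned = pending.length;

            ubyte[64] chunk;
            immutable n = source.read(chunk[]);
            if (n == 0)
            {
                if (pending.length == 0)
                    return false;
                token = pending;  // final token without a trailing delimiter
                pending = null;
                return true;
            }
            pending ~= chunk[0 .. n];
        }
    }
}

unittest
{
    import std.string : representation;
    auto src = BlockSource("220 ready now".representation, 4);
    auto reader = DelimitedReader(&src, (ubyte b) => b == ' ');
    ubyte[] tok;
    assert(reader.next(tok) && tok == "220".representation);
    assert(reader.next(tok) && tok == "ready".representation);
    assert(reader.next(tok) && tok == "now".representation);
    assert(!reader.next(tok));
}
```

Note that the adapter has to keep the bytes that follow the delimiter somewhere (here, in `pending`), which is exactly the buffering question the rest of the thread argues about.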

I don't think streams should buffer anything either (what an underlying
OS I/O API caches should suffice), buffered streams adapters can do that
in a stream-independent way (why duplicate code when you can do that as
efficiently with external methods?).

Most OS primitives don't give access to their own internal buffers. Instead, they ask user code to provide a buffer and transfer data into it.

Right. This is why Stream may not cache.


Simple streams should not cache, but there must be a BufferedStream wrapping simple streams.
When you read from a non-buffered SocketStream, each read() (like readInt()) is a syscall - that's really expensive. In my project I got a speedup of about a factor of 4-5 by replacing std.Streams SocketStream with a custom BufferedSocketStream. I have to do further testing, but I think that shifted the bottleneck from socket I/O to something else, so in other cases the speedup may be even bigger.
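To illustrate why a buffering wrapper cuts the number of underlying reads, here is a small sketch. CountingSource and BufferedReader are hypothetical names invented for this example (they are not the BufferedSocketStream mentioned above); each call to CountingSource.read stands in for one system call.

```d
// Hypothetical unbuffered source that counts how often it is asked for
// data; each call stands in for one system call.
struct CountingSource
{
    const(ubyte)[] data;
    size_t calls;

    size_t read(ubyte[] buf)
    {
        ++calls;
        immutable n = buf.length < data.length ? buf.length : data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Buffering adapter (an assumption of this sketch): refills a private
// buffer with one large read and serves small reads from memory.
struct BufferedReader
{
    CountingSource* source;
    ubyte[] buffer;      // backing storage
    size_t pos, filled;  // unread bytes are buffer[pos .. filled]

    this(CountingSource* source, size_t capacity)
    {
        this.source = source;
        buffer = new ubyte[capacity];
    }

    // Copies up to dst.length bytes into dst; returns the number copied,
    // which is smaller than dst.length only at end of input.
    size_t read(ubyte[] dst)
    {
        size_t copied = 0;
        while (copied < dst.length)
        {
            if (pos == filled)
            {
                filled = source.read(buffer);
                pos = 0;
                if (filled == 0)
                    break;  // end of input
            }
            immutable avail = filled - pos;
            immutable want = dst.length - copied;
            immutable n = avail < want ? avail : want;
            dst[copied .. copied + n] = buffer[pos .. pos + n];
            pos += n;
            copied += n;
        }
        return copied;
    }
}

unittest
{
    auto payload = new ubyte[4096];
    ubyte[4] word;  // readInt-sized reads

    auto direct = CountingSource(payload);
    foreach (_; 0 .. 1024)
        direct.read(word[]);
    assert(direct.calls == 1024);  // one underlying read per 4 bytes

    auto backing = CountingSource(payload);
    auto buffered = BufferedReader(&backing, 1024);
    foreach (_; 0 .. 1024)
        assert(buffered.read(word[]) == 4);
    assert(backing.calls == 4);    // one underlying read per 1024 bytes
}
```

The call counts only show how the number of underlying reads changes; they say nothing about the actual speedup, which depends on the cost of a real syscall.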

So clearly buffering on the client side is a must.


I don't see how that follows from the above.

Besides, as you noted, the buffering is redundant for byChunk/byLine
adapter ranges. It means that byChunk/byLine should operate on
unbuffered streams.

Chunks keep their own buffer so indeed they could operate on streams that don't do additional buffering. The story with lines is a fair amount more complicated if it needs to be done efficiently.


Yes. But line-reading is a case that I don't see a need to handle specially.

I'll explain my I/O streams implementation below in case you didn't read
my message (I've changed some stuff a little since then).

Honest, I opened it to remember to read it but somehow your fonts are small and make my eyes hurt.

My Stream
interface is very simple:

// A generic stream
interface Stream
{
    @property InputStream input();
    @property OutputStream output();
    @property SeekableStream seekable();
    @property bool endOfStream();
    void close();
}

You may ask, why separate Input and Output streams?

I think my first question is: why doesn't Stream inherit InputStream and OutputStream? My hypothesis: you want to sometimes return null. Nice.


Right.
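A sketch of how calling code might use the null-returning properties. The Stream interface is copied from the message above; the bodies of InputStream, OutputStream and SeekableStream are not shown in this thread, so the single-method versions here are assumptions, as are MemoryInput, MemoryOutput and copyAll.

```d
// Assumed minimal capability interfaces (not taken from the thread).
interface InputStream    { size_t read(ubyte[] buffer); }
interface OutputStream   { size_t write(const(ubyte)[] data); }
interface SeekableStream { void seek(ulong position); }

// A generic stream, as posted above.
interface Stream
{
    @property InputStream input();
    @property OutputStream output();
    @property SeekableStream seekable();
    @property bool endOfStream();
    void close();
}

// Hypothetical read-only stream: output and seekable return null.
final class MemoryInput : Stream, InputStream
{
    private const(ubyte)[] data;
    this(const(ubyte)[] data) { this.data = data; }

    @property InputStream input() { return this; }
    @property OutputStream output() { return null; }
    @property SeekableStream seekable() { return null; }
    @property bool endOfStream() { return data.length == 0; }
    void close() { data = null; }

    size_t read(ubyte[] buffer)
    {
        immutable n = buffer.length < data.length ? buffer.length : data.length;
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Hypothetical write-only stream: input and seekable return null.
final class MemoryOutput : Stream, OutputStream
{
    ubyte[] written;

    @property InputStream input() { return null; }
    @property OutputStream output() { return this; }
    @property SeekableStream seekable() { return null; }
    @property bool endOfStream() { return false; }
    void close() {}

    size_t write(const(ubyte)[] chunk)
    {
        written ~= chunk;
        return chunk.length;
    }
}

// Copies everything from one stream to another; returns false when
// either stream lacks the required capability.
bool copyAll(Stream from, Stream to)
{
    auto src = from.input;
    auto dst = to.output;
    if (src is null || dst is null)
        return false;

    ubyte[16] chunk;
    for (;;)
    {
        immutable n = src.read(chunk[]);
        if (n == 0)
            return true;
        dst.write(chunk[0 .. n]);
    }
}

unittest
{
    import std.string : representation;
    auto source = new MemoryInput("hello".representation);
    auto sink = new MemoryOutput;
    assert(source.output is null);   // read-only: no output side
    assert(copyAll(source, sink));
    assert(sink.written == "hello".representation);
    assert(!copyAll(sink, source));  // wrong direction is rejected
}
```

The capability check happens at run time, which is the trade-off of returning null instead of having Stream inherit from InputStream and OutputStream.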

Well, that's because
you either read from them, write to them, or both.
Some streams are read-only (think Stdin), some write-only (Stdout), some
support both, like FileStream. Right?

Sounds good. But then where's flush()? Must be in OutputStream.


That's probably because unbuffered streams don't need it.

You may need to tell the OS to flush its buffer (fsync()).



I'm surprised there's no flush().


No buffering - no flush.

See above.


Cheers,
- Daniel
