Re: Streaming library

Daniel Gibson Wed, 13 Oct 2010 13:10:37 -0700

Denis Koroskin wrote:

On Wed, 13 Oct 2010 20:55:04 +0400, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:
On 10/13/10 11:16 CDT, Denis Koroskin wrote:
On Wed, 13 Oct 2010 18:32:15 +0400, Andrei Alexandrescu
So far so good. I will point out, however, that the classic read/write
routines are not all that good. For example if you want to implement a
line-buffered stream on top of a block-buffered stream you'll be
forced to write inefficient code.
Never heard of filesystems that allow reading files in lines - they
always read in blocks, and that's what streams should do.
http://www.gnu.org/s/libc/manual/html_node/Buffering-Concepts.html

I don't think streams must mimic the low-level OS I/O interface.
I, in contrast, think that Streams should be the lowest-level possible platform-independent abstraction. No buffering besides what an OS provides, no additional functionality. If you need to be able to read something up to some character (besides, what should be considered a new-line separator: \r, \n, \r\n?), this should be done manually in "byLine".

Platform-independent? OS-independent, yes. But being independent of endianness and of the availability of 80-bit real etc. is too much for a simple stream (of course we'd need an EndianStream that can wrap a simple stream and take care of the endianness).

That's because
most of the streams are binary streams, and there is no such thing as a
"line" in them (e.g. how often do you need to read a line from a
SocketStream?).
http://www.opengroup.org/onlinepubs/009695399/functions/isatty.html
These are special cases I don't like. There is no such thing in Windows anyway.
You need a line when e.g. you parse an HTML header or an email header or an FTP response. Again, if at a low level the transfer occurs in blocks, that doesn't mean the API must do the same at all levels.
BSD sockets transmit in blocks. If you need to find a special sequence in a socket stream, you are forced to fetch a chunk and manually search for the needed sequence. My position is that you should do it with an external predicate (e.g. read until whitespace).
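The "fetch a chunk, then search it with an external predicate" approach could look roughly like the sketch below. `TokenReader`, `readChunk`, `stop` and `splitDemo` are hypothetical names made up for illustration, not part of any proposed API, and the in-memory `fetch` function only stands in for a socket. Note that the adapter has to keep the bytes it fetched past the delimiter, which is exactly the buffering question argued about in this thread.

```d
import std.algorithm.comparison : min;
import std.ascii : isWhite;

// Hypothetical adapter: pulls chunks through `readChunk` and returns tokens
// that end where the external predicate `stop` first matches.
struct TokenReader
{
    size_t delegate(ubyte[]) readChunk; // returns bytes read, 0 at end of stream
    bool function(ubyte) stop;          // external predicate
    ubyte[] pending;                    // bytes fetched but not yet consumed

    // Returns the next token without the stop byte. An empty result means
    // end of stream (or two adjacent stop bytes; this sketch does not
    // tell the two apart).
    ubyte[] next()
    {
        ubyte[256] chunk;
        size_t scanned; // part of `pending` already searched in this call
        for (;;)
        {
            foreach (i; scanned .. pending.length)
            {
                if (stop(pending[i]))
                {
                    auto token = pending[0 .. i];
                    pending = pending[i + 1 .. $];
                    return token;
                }
            }
            scanned = pending.length;
            immutable n = readChunk(chunk[]);
            if (n == 0)
            {
                auto rest = pending;
                pending = null;
                return rest;
            }
            pending ~= chunk[0 .. n];
        }
    }
}

// Splits `text` on whitespace while handing out at most `chunkSize` bytes
// per fetch, the way a socket might.
string[] splitDemo(string text, size_t chunkSize)
{
    auto data = cast(const(ubyte)[]) text;
    size_t fetch(ubyte[] dst)
    {
        immutable n = min(dst.length, chunkSize, data.length);
        dst[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
    auto reader = TokenReader(&fetch, (ubyte b) => isWhite(b));
    string[] tokens;
    for (auto t = reader.next(); t.length; t = reader.next())
        tokens ~= cast(string) t.idup;
    return tokens;
}
```

Calling `splitDemo("GET /index.html HTTP/1.1", 4)` yields the three tokens of the request line even though no single fetch contains a whole token.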
I don't think streams should buffer anything either (what an underlying
OS I/O API caches should suffice), buffered streams adapters can do that
in a stream-independent way (why duplicate code when you can do that as
efficiently with external methods?).
Most OS primitives don't give access to their own internal buffers. Instead, they ask user code to provide a buffer and transfer data into it.
Right. This is why Stream may not cache.


Simple streams should not cache, but there must be a BufferedStream wrapping 
simple streams.

When you read from a non-buffered SocketStream, each read() (like readInt()) is a syscall - that's really expensive. In my project I got a speedup of about a factor of 4-5 by replacing std.Streams SocketStream with a custom BufferedSocketStream. I have to do further testing, but I think that shifted the bottleneck from socket I/O to something else, so in other cases the speedup may be even bigger.
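To make the cost argument concrete, here is a minimal sketch of a buffering wrapper around a simple stream. It is not the BufferedSocketStream mentioned above; `RawInput`, `CountingSource`, `BufferedInput` and `sourceCalls` are hypothetical names, and the in-memory source merely counts how often it is asked for data, with each call standing in for one syscall.

```d
import std.algorithm.comparison : min;

// Hypothetical minimal unbuffered input; each read() stands in for a syscall.
interface RawInput
{
    // Reads up to dst.length bytes and returns the count (0 at end of stream).
    size_t read(ubyte[] dst);
}

// In-memory stand-in for a socket that counts the calls it receives.
final class CountingSource : RawInput
{
    private const(ubyte)[] data;
    size_t calls;

    this(const(ubyte)[] data) { this.data = data; }

    size_t read(ubyte[] dst)
    {
        ++calls;
        immutable n = min(dst.length, data.length);
        dst[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Wrapper that refills a private buffer with one large read and then serves
// small requests from memory. It keeps reading until `dst` is full or the
// source is exhausted.
final class BufferedInput : RawInput
{
    private RawInput source;
    private ubyte[] buffer;
    private size_t pos, filled;

    this(RawInput source, size_t capacity = 4096)
    {
        this.source = source;
        buffer = new ubyte[capacity];
    }

    size_t read(ubyte[] dst)
    {
        size_t done;
        while (done < dst.length)
        {
            if (pos == filled)
            {
                filled = source.read(buffer);
                pos = 0;
                if (filled == 0) break; // end of stream
            }
            immutable n = min(dst.length - done, filled - pos);
            dst[done .. done + n] = buffer[pos .. pos + n];
            pos += n;
            done += n;
        }
        return done;
    }
}

// Reads `total` bytes four at a time (like repeated readInt() calls) and
// reports how many times the underlying source was hit.
size_t sourceCalls(bool buffered, size_t total = 4000)
{
    auto raw = new CountingSource(new ubyte[total]);
    RawInput input = raw;
    if (buffered) input = new BufferedInput(raw, 1024);
    ubyte[4] word;
    foreach (i; 0 .. total / 4)
        input.read(word[]);
    return raw.calls;
}
```

With these numbers the unbuffered path hits the source 1000 times and the buffered path 4 times. The real-world gain depends on how expensive each underlying call actually is.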

So clearly buffering on the client side is a must.
I don't see how that follows from the above.
Besides, as you noted, the buffering is redundant for byChunk/byLine
adapter ranges. It means that byChunk/byLine should operate on
unbuffered streams.
Chunks keep their own buffer so indeed they could operate on streams that don't do additional buffering. The story with lines is a fair amount more complicated if it needs to be done efficiently.
Yes. But line-reading is a case that I don't see a need to be handled specially.
I'll explain my I/O streams implementation below in case you didn't read
my message (I've changed some stuff a little since then).
Honest, I opened it to remember to read it but somehow your fonts are small and make my eyes hurt.
My Stream
interface is very simple:

// A generic stream
interface Stream
{
    @property InputStream input();
    @property OutputStream output();
    @property SeekableStream seekable();
    @property bool endOfStream();
    void close();
}

You may ask, why separate Input and Output streams?
I think my first question is: why doesn't Stream inherit InputStream and OutputStream? My hypothesis: you want to sometimes return null. Nice.
Right.
Well, that's because
you either read from them, write to them, or both.
Some streams are read-only (think Stdin), some write-only (Stdout), some
support both, like FileStream. Right?
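A sketch of how "sometimes return null" could work for a read-only stream follows. The bodies of `InputStream` and `OutputStream` are not shown in the post, so the ones here are assumptions, `seekable` is left out for brevity, and `MemoryInput` and `tryWrite` are hypothetical names.

```d
import std.algorithm.comparison : min;

// Assumed shapes of the member interfaces; the post does not show them.
interface InputStream  { size_t read(ubyte[] buffer); }
interface OutputStream { size_t write(const(ubyte)[] buffer); }

// Trimmed copy of the Stream interface from the post (seekable omitted).
interface Stream
{
    @property InputStream input();
    @property OutputStream output();
    @property bool endOfStream();
    void close();
}

// Read-only in-memory stream: `output` is null because writing is unsupported.
final class MemoryInput : Stream, InputStream
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data) { this.data = data; }

    @property InputStream input() { return this; }
    @property OutputStream output() { return null; }
    @property bool endOfStream() { return data.length == 0; }
    void close() { data = null; }

    size_t read(ubyte[] buffer)
    {
        immutable n = min(buffer.length, data.length);
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Callers probe a capability before using it.
bool tryWrite(Stream s, const(ubyte)[] bytes)
{
    auto sink = s.output;
    if (sink is null) return false; // write side not supported
    sink.write(bytes);
    return true;
}
```

The price of this design is a null check at every call site, where inheriting from both interfaces would let the type system reject the call at compile time.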
Sounds good. But then where's flush()? Must be in OutputStream.
That's probably because unbuffered streams don't need them.


You may need to tell the OS to flush its buffer (fsync()).
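For illustration, here is a POSIX-only sketch of an output stream that keeps no buffer of its own and still has a meaningful flush(): it forwards to fsync(). `RawFileOutput` and `roundTrip` are hypothetical names, and the use of the druntime POSIX bindings is my assumption about how one might write it, not something proposed in this thread.

```d
import core.sys.posix.fcntl : O_CREAT, O_TRUNC, O_WRONLY, open;
import core.sys.posix.unistd : fsync, posixClose = close, posixWrite = write;
import std.exception : errnoEnforce;
import std.file : read, remove, tempDir;
import std.path : buildPath;
import std.string : toStringz;

// Hypothetical unbuffered file output (POSIX only).
final class RawFileOutput
{
    private int fd = -1;

    this(string path)
    {
        // 0x1A4 is octal 0644: rw-r--r--
        fd = open(path.toStringz, O_WRONLY | O_CREAT | O_TRUNC, 0x1A4);
        errnoEnforce(fd >= 0, "open failed");
    }

    // One write(2) call per invocation; returns the bytes accepted.
    size_t write(const(ubyte)[] data)
    {
        immutable n = posixWrite(fd, data.ptr, data.length);
        errnoEnforce(n >= 0, "write failed");
        return cast(size_t) n;
    }

    // Nothing is buffered in user space, but the OS may still hold the data
    // in its cache; fsync asks it to push the data to the device.
    void flush()
    {
        errnoEnforce(fsync(fd) == 0, "fsync failed");
    }

    void close()
    {
        if (fd >= 0)
        {
            posixClose(fd);
            fd = -1;
        }
    }
}

// Writes three bytes to a temporary file, flushes, and reads them back.
bool roundTrip()
{
    immutable ubyte[] payload = [1, 2, 3];
    auto path = buildPath(tempDir, "raw_output_demo.bin");
    auto sink = new RawFileOutput(path);
    scope (exit) remove(path);   // runs second
    scope (exit) sink.close();   // runs first
    sink.write(payload);
    sink.flush();
    return cast(const(ubyte)[]) read(path) == payload;
}
```

The read-back in `roundTrip` would succeed without the flush too, since reads go through the same OS cache. What fsync adds is durability across a crash or power loss, which a round trip within one process cannot demonstrate.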


I'm surprised there's no flush().


No buffering - no flush.


see above


Cheers,
- Daniel
