Re: Streaming library

Denis Koroskin Wed, 13 Oct 2010 14:10:25 -0700

On Thu, 14 Oct 2010 00:19:45 +0400, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:

On 10/13/10 14:02 CDT, Denis Koroskin wrote:

On Wed, 13 Oct 2010 20:55:04 +0400, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:

http://www.gnu.org/s/libc/manual/html_node/Buffering-Concepts.html


I don't think streams must mimic the low-level OS I/O interface.


I in contrast think that Streams should be a lowest-level possible
platform-independent abstraction.
No buffering besides what an OS provides, no additional functionality.
If you need to be able to read something up to some character (besides,
what should be considered a new-line separator: \r, \n, \r\n?), this
should be done manually in "byLine".
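To make the separator question concrete, here is a minimal sketch of a terminator scan that accepts "\n", "\r\n" and a lone "\r". The names LineEnd and findLineEnd are invented for this example; they are not part of Phobos or of any API proposed in this thread.

```d
// Hypothetical helper, invented for illustration only.
struct LineEnd
{
    size_t pos;  // index of the first terminator byte
    size_t len;  // terminator length: 1 for "\n" or "\r", 2 for "\r\n"
    bool found;  // false if no complete terminator is in the data
}

// Scans `data` for the first line terminator.
// `atEnd` tells whether the stream has ended: a '\r' in the last position
// is ambiguous (it may be the first half of "\r\n") unless no more input
// can follow.
LineEnd findLineEnd(const(ubyte)[] data, bool atEnd)
{
    foreach (i, c; data)
    {
        if (c == '\n')
            return LineEnd(i, 1, true);
        if (c == '\r')
        {
            if (i + 1 < data.length)
                return LineEnd(i, data[i + 1] == '\n' ? 2 : 1, true);
            if (atEnd)
                return LineEnd(i, 1, true);
            return LineEnd(0, 0, false); // need more data to decide
        }
    }
    return LineEnd(0, 0, false);
}
```

The awkward case is a "\r" that falls on a chunk boundary: the scan cannot tell whether a "\n" follows until more data arrives, so whoever does the line splitting has to keep the unconsumed bytes around.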

This aggravates client code for the sake of simplicity in a library that was supposed to make streaming easy. I'm not seeing progress.

This library code needs to be put somewhere. I just believe it belongs to a line reader, not a generic stream. By putting line reading into a stream interface, you won't make it more efficient.

That's because
most of the streams are binary streams, and there is no such thing as a
"line" in them (e.g. how often do you need to read a line from a
SocketStream?).


http://www.opengroup.org/onlinepubs/009695399/functions/isatty.html


These are special cases I don't like. There is no such thing in Windows
anyway.

I didn't say I like them. Windows has _isatty: http://msdn.microsoft.com/en-us/library/f4s0ddew(v=VS.80).aspx

I stand corrected. Windows pretends to be Posix compliant, yes, but that's a sad story to tell. I don't see why would

You need a line when e.g. you parse a HTML header or a email header or
an FTP response. Again, if at a low level the transfer occurs in
blocks, that doesn't mean the API must do the same at all levels.
BSD sockets transmit in blocks. If you need to find a special sequence
in a socket stream, you are forced to fetch a chunk, and manually search
for a needed sequence. My position is that you should do it with an
external predicate (e.g. read until whitespace).
Problem is how you set up interfaces to avoid inefficiencies and contortions in the client.
I don't think streams should buffer anything either (what an underlying OS I/O API caches should suffice), buffered stream adapters can do that in a stream-independent way (why duplicate code when you can do that as
efficiently with external methods?).
Most OS primitives don't give access to their own internal buffers.
Instead, they ask user code to provide a buffer and transfer data into
it.
Right. This is why Stream may not cache.
This is a big misunderstanding. If the interface is:

size_t read(byte[] buffer);
then *I*, the client, need to provide the buffer. It's in client space. This means willing or not I need to do buffering, regardless of whatever internal buffering is going on under the wraps.

Use a BufferedStream adapter if you need buffering, and raw streams if you do the buffering manually. That's the way it's implemented in C#, Java, Tango and many many other APIs.
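As a rough illustration of the adapter approach, here is a minimal sketch of a buffering wrapper over the raw interface. It assumes that read returns 0 at end of stream; BufferedInput and CountingSource are names invented for this example and do not correspond to any existing library class.

```d
// Raw stream interface as discussed in this thread.
// Assumption: read returns the number of bytes stored, 0 at end of stream.
interface InputStream
{
    size_t read(ubyte[] buf);
}

// Hypothetical adapter: adds a read-ahead buffer to any InputStream.
class BufferedInput : InputStream
{
    private InputStream source;
    private ubyte[] buffer;     // backing storage
    private size_t start, end;  // unread bytes are buffer[start .. end]

    this(InputStream source, size_t capacity = 4096)
    {
        this.source = source;
        this.buffer = new ubyte[capacity];
    }

    // Serves reads from the buffer, refilling it from the source when empty.
    size_t read(ubyte[] buf)
    {
        if (start == end)
        {
            start = 0;
            end = source.read(buffer);
            if (end == 0)
                return 0; // end of stream
        }
        size_t n = end - start;
        if (n > buf.length)
            n = buf.length;
        buf[0 .. n] = buffer[start .. start + n];
        start += n;
        return n;
    }
}

// Hypothetical in-memory source that counts how often it is called,
// standing in for an OS-level stream.
class CountingSource : InputStream
{
    private const(ubyte)[] data;
    size_t calls;

    this(const(ubyte)[] data) { this.data = data; }

    size_t read(ubyte[] buf)
    {
        ++calls;
        size_t n = data.length < buf.length ? data.length : buf.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}
```

Reading "hello world" one byte at a time through a 64-byte BufferedInput reaches the source only twice: once to fill the buffer and once to observe the end of the stream. The adapter knows nothing about the concrete stream type, which is the stream-independent part of the argument.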

So clearly buffering on the client side is a must.


I don't see how that follows from the above.


Please implement an abstraction that given this:

interface InputStream
{
     size_t read(ubyte[] buf);
}

defines a line reader.

I thought we agreed that byLine/byChunk need to do buffering manually anyway.


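class ByLine
{
        // Returns the next line without its trailing '\n'
        ubyte[] nextLine()
        {
                ubyte[BUFFER_SIZE] buffer;
                while (true) {
                        // Scan what is already buffered first: data left over
                        // from the previous call may hold complete lines
                        foreach (i, ubyte c; appender.data) {
                                if (c != '\n') {
                                        continue;
                                }

                                ubyte[] line = appender.data[0..i].dup();
                                ubyte[] rest = appender.data[i+1..$].dup();
                                appender.reset();
                                appender.put(rest);

                                return line;
                        }

                        if (inputStream.endOfStream()) {
                                break;
                        }

                        // Append only the bytes actually read, not the whole buffer
                        size_t bytesRead = inputStream.read(buffer);
                        appender.put(buffer[0..bytesRead]);
                }

                // End of stream: whatever is left is the last line
                ubyte[] line = appender.data.dup();
                appender.reset();
                return line;
        }

        InputStream inputStream;
        Appender!(ubyte[]) appender;
}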

(I've skipped the range interface for the sake of simplicity, replaced it with nextLine() function. I also don't remember proper appender interface, so I've used imaginary function names).
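For reference, a compilable variant of the ByLine sketch above, written against std.array.Appender as it exists in Phobos (put, data, clear). This is one possible rendering under stated assumptions, not the author's code: read is assumed to return 0 at end of stream instead of relying on endOfStream(), nextLine reports exhaustion through its return value, and MemoryStream is a made-up helper used only to feed the reader.

```d
import std.array : Appender;

// Assumption: read returns the number of bytes stored, 0 at end of stream.
interface InputStream
{
    size_t read(ubyte[] buf);
}

class ByLine
{
    private InputStream input;
    private Appender!(ubyte[]) pending; // bytes read but not yet returned
    private bool eof;

    this(InputStream input) { this.input = input; }

    // Stores the next line (without its '\n') in `line`.
    // Returns false once the stream is exhausted.
    bool nextLine(out ubyte[] line)
    {
        ubyte[16] chunk;    // deliberately small for the example
        size_t scanned = 0; // bytes of `pending` already checked for '\n'
        while (true)
        {
            auto data = pending.data;
            foreach (i; scanned .. data.length)
            {
                if (data[i] != '\n')
                    continue;
                line = data[0 .. i].dup;
                // Copy the tail before clear(), which recycles the storage.
                ubyte[] rest = data[i + 1 .. $].dup;
                pending.clear();
                pending.put(rest);
                return true;
            }
            scanned = data.length;
            if (eof)
                break;
            size_t n = input.read(chunk[]);
            if (n == 0)
                eof = true;
            else
                pending.put(chunk[0 .. n]);
        }
        if (pending.data.length == 0)
            return false;
        line = pending.data.dup; // final line without a terminator
        pending.clear();
        return true;
    }
}

// Hypothetical in-memory stream used to exercise the reader.
class MemoryStream : InputStream
{
    private const(ubyte)[] data;

    this(const(char)[] text) { data = cast(const(ubyte)[]) text; }

    size_t read(ubyte[] buf)
    {
        size_t n = data.length < buf.length ? data.length : buf.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}
```

Typical use is a loop such as `ubyte[] line; while (reader.nextLine(line)) { ... }`. The reader has to hold on to bytes it fetched past the newline, so some buffering on the caller's side of read() is unavoidable with this interface.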

Once again, what's the point of byLine, if all it does is call stream.readLine()? That's moving code from one place to many unrelated ones. I don't agree with that.

I'm not convinced we need a line-based API at the core stream level. I don't think we need to sacrifice performance for the general case in order to avoid a performance hit in a special case. Who even told you it will be any less efficient that way?


Andrei
