On Thu, 14 Oct 2010 00:19:45 +0400, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
On 10/13/10 14:02 CDT, Denis Koroskin wrote:
On Wed, 13 Oct 2010 20:55:04 +0400, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
http://www.gnu.org/s/libc/manual/html_node/Buffering-Concepts.html
I don't think streams must mimic the low-level OS I/O interface.
I, in contrast, think that streams should be the lowest-level possible
platform-independent abstraction.
No buffering besides what an OS provides, no additional functionality.
If you need to be able to read something up to some character (besides,
what should be considered a new-line separator: \r, \n, \r\n?), this
should be done manually in "byLine".
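To make the separator question concrete, here is a minimal sketch of a helper that a line reader sitting outside the stream could use. It assumes all three conventions (\n, \r\n and a lone \r) count as terminators; findLineBreak and its parameters are hypothetical names, not part of any existing library. The awkward case is a '\r' at the very end of a chunk, which cannot be classified until more data arrives.

```d
/// Finds the first line terminator in `data`.
/// On success, `lineEnd` is the index one past the line's content and
/// `nextStart` is the index of the first byte after the terminator.
/// `atEnd` tells the helper that no more data will follow `data`.
bool findLineBreak(const(ubyte)[] data, bool atEnd,
                   out size_t lineEnd, out size_t nextStart)
{
    foreach (i, c; data)
    {
        if (c == '\n')
        {
            lineEnd = i;
            nextStart = i + 1;
            return true;
        }
        if (c == '\r')
        {
            if (i + 1 < data.length)
            {
                // We can see the next byte, so "\r\n" and a lone '\r'
                // can be told apart.
                lineEnd = i;
                nextStart = (data[i + 1] == '\n') ? i + 2 : i + 1;
                return true;
            }
            if (atEnd)
            {
                // Nothing follows: treat the lone '\r' as a terminator.
                lineEnd = i;
                nextStart = i + 1;
                return true;
            }
            // '\r' is the last byte seen so far and may be the first
            // half of "\r\n"; the caller has to fetch more data.
            return false;
        }
    }
    return false; // no terminator in this chunk
}
```

A caller would keep appending chunks to its own buffer and retry whenever the helper returns false, which is the kind of state a byLine implementation has to carry.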
This aggravates client code for the sake of simplicity in a library that
was supposed to make streaming easy. I'm not seeing progress.
This library code needs to be put somewhere. I just believe it belongs in a
line reader, not in a generic stream. By putting line reading into the stream
interface, you won't make it more efficient.
That's because most streams are binary streams, and there is no such thing
as a "line" in them (e.g. how often do you need to read a line from a
SocketStream?).
http://www.opengroup.org/onlinepubs/009695399/functions/isatty.html
These are special cases I don't like. There is no such thing in Windows
anyway.
I didn't say I like them. Windows has _isatty:
http://msdn.microsoft.com/en-us/library/f4s0ddew(v=VS.80).aspx
I stand corrected. Windows pretends to be POSIX-compliant, yes, but that's
a sad story to tell. I don't see why would
You need a line when e.g. you parse an HTML header or an email header or
an FTP response. Again, if at a low level the transfer occurs in
blocks, that doesn't mean the API must do the same at all levels.
BSD sockets transmit in blocks. If you need to find a special sequence
in a socket stream, you are forced to fetch a chunk and search it manually
for the sequence you need. My position is that you should do it with an
external predicate (e.g. read until whitespace).
Problem is how you set up interfaces to avoid inefficiencies and
contortions in the client.
I don't think streams should buffer anything either (what an underlying
OS I/O API caches should suffice); buffered stream adapters can do that
in a stream-independent way (why duplicate code when you can do that as
efficiently with external methods?).
Most OS primitives don't give access to their own internal buffers.
Instead, they ask user code to provide a buffer and transfer data into
it.
Right. This is why Stream may not cache.
This is a big misunderstanding. If the interface is:
size_t read(byte[] buffer);
then *I*, the client, need to provide the buffer. It's in client space.
This means that, willing or not, I need to do buffering, regardless of
whatever internal buffering is going on under the hood.
Use BufferedStream adapter if you need buffering, and raw streams if you
do the buffering manually.
That's the way it's implemented in C#, Java, Tango and many many other
APIs.
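A minimal sketch of what such an adapter could look like over the read(ubyte[]) interface discussed in this thread. BufferedInputStream and MemoryStream are hypothetical names used only for illustration (not the Phobos, Tango, C# or Java classes), and the sketch assumes read() returns 0 at end of stream.

```d
import std.algorithm : min;

interface InputStream
{
    size_t read(ubyte[] buf); // assumed to return 0 at end of stream
}

/// Hypothetical adapter: adds a client-side buffer to any InputStream.
class BufferedInputStream : InputStream
{
    this(InputStream source, size_t capacity = 4096)
    {
        this.source = source;
        this.buffer = new ubyte[capacity];
    }

    size_t read(ubyte[] buf)
    {
        if (buf.length == 0)
            return 0;
        if (pos == len)
        {
            // Buffer is empty. Large requests bypass it entirely.
            if (buf.length >= buffer.length)
                return source.read(buf);
            len = source.read(buffer);
            pos = 0;
            if (len == 0)
                return 0; // end of stream
        }
        // Serve the request from the buffered bytes.
        size_t n = min(buf.length, len - pos);
        buf[0 .. n] = buffer[pos .. pos + n];
        pos += n;
        return n;
    }

    private InputStream source;
    private ubyte[] buffer;
    private size_t pos, len; // valid bytes are buffer[pos .. len]
}

/// Hypothetical raw stream over memory that hands out at most
/// maxChunk bytes per call and counts how often it is called.
class MemoryStream : InputStream
{
    this(const(ubyte)[] data, size_t maxChunk)
    {
        this.data = data;
        this.maxChunk = maxChunk;
    }

    size_t read(ubyte[] buf)
    {
        ++calls;
        size_t n = min(buf.length, maxChunk, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }

    size_t calls;
    private const(ubyte)[] data;
    private size_t maxChunk;
}
```

With this arrangement a client that reads one byte at a time from the adapter triggers far fewer calls on the raw stream than bytes delivered, while code that does its own buffering can keep talking to the raw stream directly.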
So clearly buffering on the client side is a must.
I don't see how that follows from the above.
Please implement an abstraction that given this:
interface InputStream
{
    size_t read(ubyte[] buf);
}
defines a line reader.
I thought we agreed that byLine/byChunk need to do buffering manually
anyway.
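class ByLine
{
    ubyte[] nextLine()
    {
        ubyte[BUFFER_SIZE] buffer;
        while (true) {
            // Look for a newline in what is already buffered: an earlier
            // read may have left one or more complete lines behind.
            foreach (i, ubyte c; appender.data) {
                if (c != '\n') {
                    continue;
                }
                ubyte[] line = appender.data[0..i].dup;
                ubyte[] rest = appender.data[i+1..$].dup;
                appender.reset();
                appender.put(rest);
                return line;
            }
            if (inputStream.endOfStream()) {
                break;
            }
            size_t bytesRead = inputStream.read(buffer);
            // Only the first bytesRead bytes are valid, not the whole buffer.
            appender.put(buffer[0..bytesRead]);
        }
        // End of stream: the remainder is the last, unterminated line.
        ubyte[] line = appender.data.dup;
        appender.reset();
        return line;
    }

    InputStream inputStream;
    Appender!(ubyte[]) appender;
}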
(I've skipped the range interface for the sake of simplicity and replaced it
with a nextLine() function. I also don't remember the proper Appender
interface, so I've used imaginary function names.)
Once again, what's the point of byLine if all it does is call
stream.readLine()? That's moving code from one place to many unrelated
ones. I don't agree with that.
I'm not convinced we need a line-based API at the core stream level. I don't
think we need to sacrifice performance in the general case in order to
avoid a performance hit in a special case. Who even told you it will be any
less efficient that way?
Andrei