On 12/28/10 5:14 PM, Haruki Shigemori wrote:
> (2010/12/28 16:02), Andrei Alexandrescu wrote:
>> I've put together over the past days an embryonic streaming interface.
>> It separates transport from formatting, input from output, and buffered
>> from unbuffered operation.
>> http://erdani.com/d/phobos/std_stream2.html
>> There are a number of questions interspersed. It would be great to start
>> a discussion using that design as a baseline. Please voice any related
>> thoughts - thanks!
>> Andrei
> I've waited so long for this day.
> Excuse me, would you give me a user side code and librarian side code
> using std.stream2?
> I don't know a concrete implementation of the std.stream2 interfaces.
There isn't one. The source code is just support for documentation, and
I attach it with this message.
Thanks for participating! I know there has been some good stream-related
activity in the Japanese D community.
Andrei
// Written in the D programming language.
/**
Streams are structured in two layers. At the bottom there's the
transport layer, which is responsible for opening and closing a
stream, positioning in the stream, and transferring bytes. Atop the
transport layer sits the formatting layer, which is concerned with
formatting typed data into raw bytes which then are passed to the
underlying transport.
Macros:
WIKI = Phobos/StdAlgorithm
QUESTION = $(I <font color=red>Question:</font> $0)
Copyright: Andrei Alexandrescu 2010-.
License: $(WEB boost.org/LICENSE_1_0.txt, Boost License 1.0).
Authors: $(WEB erdani.com, Andrei Alexandrescu)
*/
module std.stream2;
import std.traits; // isSomeChar, used by the template constraints below
import std.variant;
/**
The base transport interface $(D TransportBase) supports primitives
for checking whether the transport is opened, closing the transport,
and positioning in the stream. Opening is not part of this interface;
it is assumed that a factory function opens the transport with the
appropriate parameters. Some streams may not actually be positionable,
in which case the positioning primitives throw.
$(QUESTION Should we offer an $(D open) primitive at this level? If
so, what parameter(s) should it take?)
$(QUESTION Should we offer a primitive $(D rewind) that takes the
stream back to the beginning? That might be supported even by some
streams that don't support general $(D seek) calls. Alternatively,
some streams might support $(D seek(0)) but not
other calls to $(D seek).)
*/
interface TransportBase
{
/**
Positions the stream $(D position) bytes from the beginning,
returns the new absolute _position. Throws on error.
*/
ulong seek(ulong position);
/**
Seeks the stream $(D position) bytes from the stream's current
position, returns the new absolute _position. Throws on error.
*/
ulong seekFromCurrent(long position);
/**
Seeks the stream $(D position) bytes from the stream's end, returns
the new absolute _position. Throws on error. The semantics of
this primitive for $(D position > 0) are defined by the stream
implementation (e.g. on certain file systems, such calls may
allow writing sparse files).
$(QUESTION May we eliminate $(D seekFromCurrent) and $(D
seekFromEnd) and just have $(D seek) with absolute positioning?
I don't know of streams that allow $(D seek) without allowing
$(D tell). Even if some stream doesn't, it's easy to add
support for $(D tell) in a wrapper. The marginal cost of
calling $(D tell) is small enough compared to the cost of $(D
seek).)
*/
ulong seekFromEnd(long position);
/**
Returns the absolute position in the stream. Throws on error.
*/
ulong tell() const;
/**
Returns whether the stream is at its logical end. If so, subsequent
reads from the stream will yield no data, and subsequent writes to the
stream will add new data.
*/
@property bool atEnd() const;
/**
Is this stream open?
*/
@property bool isOpen() const;
/**
Close the stream. Does nothing on an unopened stream. Throws on error.
$(QUESTION Should this throw on an unopened stream? I don't
think so, because throwing does not offer any additional
information that user code didn't have, and the idiom $(D if
(s.isOpen) s.close()) is verbose and frequently encountered.)
*/
void close();
}
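/*
A minimal sketch, not part of the proposal: an in-memory stand-in that mirrors
the positioning primitives of TransportBase, to make the question about
seekFromCurrent and seekFromEnd concrete. MemoryCursorSketch, its fields, and
seekRelative are hypothetical names; a real transport would be a class
implementing the interface. Both relative seeks are expressed here as an
absolute seek plus tell, and close does nothing on an already closed
transport. Seeking past the end throws in this sketch, whereas the proposal
leaves that case to the stream implementation.
*/
private struct MemoryCursorSketch
{
    ubyte[] data;      // backing store standing in for a file
    ulong pos;         // current absolute position
    bool open = true;  // cleared by close

    ulong seek(ulong position)
    {
        if (!open) throw new Exception("transport is closed");
        if (position > data.length) throw new Exception("seek past end");
        return pos = position;
    }

    ulong seekFromCurrent(long offset) { return seekRelative(tell(), offset); }

    ulong seekFromEnd(long offset) { return seekRelative(data.length, offset); }

    ulong tell() const
    {
        if (!open) throw new Exception("transport is closed");
        return pos;
    }

    @property bool atEnd() const { return pos >= data.length; }

    @property bool isOpen() const { return open; }

    void close() { open = false; }

    // Turns a signed offset relative to base into an absolute seek.
    private ulong seekRelative(ulong base, long offset)
    {
        if (offset >= 0) return seek(base + cast(ulong) offset);
        immutable back = cast(ulong)(-offset);
        if (back > base) throw new Exception("seek before start");
        return seek(base - back);
    }
}

unittest
{
    auto t = MemoryCursorSketch(new ubyte[10]);
    assert(t.seek(4) == 4);
    assert(t.seekFromCurrent(-3) == 1);
    assert(t.seekFromEnd(-2) == 8);
    assert(t.tell() == 8);
    assert(t.seekFromEnd(0) == 10 && t.atEnd);
    t.close();
    t.close(); // closing again does nothing
    assert(!t.isOpen);
}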
/**
Unbuffered transport interfaces hold no buffers of their own and
therefore rely on user-supplied buffers to do their deed.
*/
interface UnbufferedInputTransport : TransportBase
{
/**
Reads data off the stream and returns the data _read (which is
a slice of $(D buffer)). If this function returns an empty slice, the
stream has reached its end. Reading from a stream that is $(D
atEnd) just returns empty slices. If the stream is closed or
some error occurs during reading, an exception is thrown.
$(QUESTION Should we allow $(D read) to return an empty slice
even if $(D atEnd) is $(D false)? If we do, we allow
non-blocking streams with burst transfer. However, naive client
code on non-blocking streams will be inefficient because it
would essentially implement busy-waiting.)
*/
ubyte[] read(ubyte[] buffer);
}
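/*
A minimal sketch of user-side code for unbuffered input, under stated
assumptions. MemoryInputSketch and drainSketch are hypothetical names; the
struct only mirrors the shape of read and atEnd and serves bytes from memory.
The loop shows the point made above: read returns a slice of the caller's
buffer, so the data has to be copied out before the buffer is reused. The
loop assumes a blocking source, i.e. read never returns an empty slice unless
atEnd is true.
*/
private struct MemoryInputSketch
{
    const(ubyte)[] remaining;  // bytes not yet handed out

    @property bool atEnd() const { return remaining.length == 0; }

    ubyte[] read(ubyte[] buffer)
    {
        immutable n = buffer.length < remaining.length
            ? buffer.length : remaining.length;
        buffer[0 .. n] = remaining[0 .. n];
        remaining = remaining[n .. $];
        return buffer[0 .. n];
    }
}

// Reads src to its end through one reusable buffer of chunkSize bytes.
private ubyte[] drainSketch(Source)(ref Source src, size_t chunkSize)
{
    assert(chunkSize > 0);
    auto buffer = new ubyte[chunkSize];
    ubyte[] result;
    while (!src.atEnd)
    {
        // Appending copies the slice before the next read overwrites buffer.
        result ~= src.read(buffer);
    }
    return result;
}

unittest
{
    ubyte[] bytes = [1, 2, 3, 4, 5];
    auto src = MemoryInputSketch(bytes);
    assert(drainSketch(src, 2) == bytes);
    assert(src.atEnd);
    assert(src.read(new ubyte[2]).length == 0); // empty slice at end
}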
/**
Unbuffered output transport offers one primitive for writing. Client
code should never assume that unbuffered writes in fact go straight to
the hardware support of the stream. This is because of at least two
factors. First, the underlying operating system-specific primitives
might not offer guaranteed write-through (which is e.g. the case for
Linux unbuffered files). Second, $(D BufferedOutputTransport) (below)
inherits $(D UnbufferedOutputTransport) to offer guaranteed buffering.
So $(D UnbufferedOutputTransport) is best understood as "transport
without guaranteed buffering".
*/
interface UnbufferedOutputTransport : TransportBase
{
/**
Writes data to the stream. Throws on error.
*/
void write(in ubyte[] buffer);
/**
Alias for $(D write) that supports the output range interface.
*/
alias write put;
}
/**
Buffered transport interfaces hold internal buffers as intermediaries
between the data source and client code.
The $(D BufferedInputTransport) interface is formally an input range
of $(D ubyte[]), which means it can be used directly with a variety of
algorithms.
*/
interface BufferedInputTransport : UnbufferedInputTransport
{
/**
Alias for $(D atEnd) for compliance with the input range
interface.
*/
alias atEnd empty;
/**
If the internal buffer is not empty, returns the
already-buffered data, which user code may inspect or copy as
it sees fit. No reading from the stream is performed. If there is
no already-buffered data, reads more data off the
stream. The amount of data read depends on the actual stream.
$(QUESTION Should we allow an empty _front on a non-empty
stream? This goes back to handling non-blocking streams.)
*/
@property ubyte[] front();
/**
Discards the existing buffer, reads a new buffer.
*/
void popFront();
/**
Peeks $(D n) bytes forward in the stream. The buffer returned
may be shorter than $(D n) only in case the stream has
ended. Following a call $(D peek(n)), $(D front) will yield the
same buffer.
*/
ubyte[] peek(size_t n);
/**
Discards $(D n) bytes off the stream. Returns the number of
bytes discarded, which may be less than $(D n) if and only if
the stream has ended. The stream need not be seekable.
$(QUESTION Should we eliminate this function? Theoretically
calling $(D advance(n)) is equivalent to $(D
seekFromCurrent(n)). However, in practice a file-based stream
will have to implement $(D advance) even when the underlying
file is not seekable.)
*/
ulong advance(ulong n);
}
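/*
A minimal sketch of the input range view of a buffered transport, under
stated assumptions. ChunkedInputSketch and its fields are hypothetical; the
struct serves an in-memory array in chunks of chunkSize bytes to imitate
physical reads. It follows the contract described above: front returns the
already-buffered data or fills the buffer first, and popFront discards the
buffer and reads a new one. Because it exposes empty, front, and popFront, it
works with foreach and passes isInputRange.
*/
private struct ChunkedInputSketch
{
    ubyte[] source;    // stands in for the underlying stream
    size_t chunkSize;  // how much one imitated physical read fetches
    ubyte[] buffer;    // currently buffered data

    @property bool empty() const
    {
        return buffer.length == 0 && source.length == 0;
    }

    @property ubyte[] front()
    {
        if (buffer.length == 0) fill();
        return buffer;
    }

    void popFront()
    {
        buffer = null;
        fill();
    }

    private void fill()
    {
        immutable n = chunkSize < source.length ? chunkSize : source.length;
        buffer = source[0 .. n];
        source = source[n .. $];
    }
}

unittest
{
    import std.range : isInputRange;
    static assert(isInputRange!ChunkedInputSketch);

    ubyte[] bytes = [1, 2, 3, 4, 5];
    size_t chunks, total;
    foreach (chunk; ChunkedInputSketch(bytes, 2))
    {
        ++chunks;
        total += chunk.length;
    }
    assert(chunks == 3 && total == 5); // chunks of 2, 2, and 1 bytes
}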
/**
Buffered transport interfaces hold internal buffers as intermediaries
between the data source and client code.
The $(D BufferedOutputTransport) interface is formally an output range
of $(D ubyte[]), which means it can be used with a variety of
algorithms directly.
*/
interface BufferedOutputTransport : UnbufferedOutputTransport
{
/**
Written data may stay in the internal buffer for a while instead of
going to the stream immediately. $(D flush) makes sure that buffers
are actually written to the stream. It is up to the stream to ensure
that data is written to its actual destination (e.g. disk).
*/
void flush();
}
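/*
A minimal sketch of buffered output, under stated assumptions.
BufferedSinkSketch and its fields are hypothetical; the written array stands
in for the actual destination. It shows why flush exists: write only appends
to the pending buffer until capacity is reached, so the data becomes visible
at the destination after flush. The put alias mirrors the output range
spelling used by UnbufferedOutputTransport.
*/
private struct BufferedSinkSketch
{
    size_t capacity;  // flush automatically once this many bytes are pending
    ubyte[] pending;  // internal buffer
    ubyte[] written;  // stands in for the destination (e.g. a file)

    void write(in ubyte[] data)
    {
        pending ~= data;
        if (pending.length >= capacity) flush();
    }

    alias write put;

    void flush()
    {
        written ~= pending;
        pending = null;
    }
}

unittest
{
    auto sink = BufferedSinkSketch(8);
    ubyte[] payload = [10, 20, 30];
    sink.put(payload);
    assert(sink.written.length == 0); // still sitting in the buffer
    sink.flush();
    assert(sink.written == payload);
    assert(sink.pending.length == 0);
}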
/**
The $(D Formatter) interface is concerned with formatting typed
objects into bytes. The resulting bytes are passed to a backend
transport object.
*/
interface Formatter
{
/**
Gets and sets the underlying _transport object. Each formatter
is associated with one _transport object and forwards to it the
bytes produced by formatting. It is an error to attempt
writes to a $(D Formatter) that has a $(D null)
_transport. Also, certain formatters might enforce at
run time that the _transport must be buffered.
$(QUESTION Should all formatters require buffered _transport?
Otherwise they might need to keep their own buffering, which
ends up being less efficient with buffered transports.)
*/
@property UnbufferedOutputTransport transport();
/// Ditto
@property void transport(UnbufferedOutputTransport);
/**
Formats and writes an integral _value, including a UTF character.
*/
void put(ubyte value);
/// Ditto
void put(ushort value);
/// Ditto
void put(uint value);
/// Ditto
void put(ulong value);
/// Ditto
void put(byte value);
/// Ditto
void put(short value);
/// Ditto
void put(int value);
/// Ditto
void put(long value);
/// Ditto
void put(char value);
/// Ditto
void put(wchar value);
/// Ditto
void put(dchar value);
/**
Formats and writes a floating-point _value.
*/
void put(float value);
/// Ditto
void put(double value);
/// Ditto
void put(real value);
/**
Formats and writes a UTF-encoded string.
$(QUESTION Should we also define $(D putln) that writes the string
and then a line terminator?)
*/
void put(in char[] value);
/// Ditto
void put(in wchar[] value);
/// Ditto
void put(in dchar[] value);
/**
Formats and writes an array (other than strings). The type of
the array element is passed dynamically as $(D elementType).
*/
void put(in void[] value, TypeInfo elementType);
/**
Convenience generic function that accepts an array of any type
and forwards it to $(D put(array, typeid(T.init))). Due to a
bug in the implementation, this function has temporarily the
name $(D put_) although it will ultimately be $(D put).
*/
final void put_(T)(in T[] array) if (!isSomeChar!T) {
return put(array, typeid(T.init));
}
/**
Writes a class object to the stream. The object must implement
$(D toString(Formatter)). This function simply calls $(D
obj.toString(this)), thereby closing a double dispatch
loop. The responsibility of formatting the object's contents is
left to the object.
$(QUESTION Should we define a more involved protocol? For
example, even for objects that don't implement formatting, a
$(D Formatter) might define a reasonable output routine by
using introspection to figure out the object's layout. This
approach has the nice consequence that one implementation can
be applied to many objects. But that also means we need to wait
for better reflection support. We also need to figure out a way
to detect that an object does not override $(D
toString(Formatter)), which at the moment I consider a
to-be-added primitive method of $(D Object).)
*/
void put(Object obj);
/**
Writes a struct to the stream. This final function writes a
customizable "header" and a customizable "footer". Inside, the
elements of the struct are formatted transitively. Due to a bug
in the implementation, this function has temporarily the name
$(D put_) although it will ultimately be $(D put).
$(QUESTION Should we put some support for avoiding writing the
same subobject twice, or is that more of a charter of
serialization?)
*/
final void put_(S)(auto ref S) if (is(S == struct)) {
}
/**
Overridable hooks called before and after writing a $(D
struct)'s fields.
$(QUESTION How to handle associative arrays? They don't have a
common base, as arrays do. Should we offer some overridable
hooks similar to these? For example, $(D beforeAssocArray), $(D
afterAssocArray), $(D beforeAssocArrayElement), $(D
afterAssocArrayElement).)
*/
void beforeStruct(void * s, TypeInfo ti);
/// Ditto
void afterStruct(void * s, TypeInfo ti);
/**
Formats and writes _data according to an extended $(D
printf)-like format specifier.
$(QUESTION How to define format specifiers for $(D struct)s and
$(D class)es in ways that extend $(D printf) specifiers naturally?)
$(QUESTION Should we define $(D writefln) too? Note that that
only makes sense for streams that use a text-based transport.)
*/
void writef(in char[] format, Variant[] data...);
}
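/*
A minimal sketch of the struct-writing protocol described above, under stated
assumptions. TextFormatterSketch and PointSketch are hypothetical names, the
output array stands in for the transport, and the textual layout (type name,
parentheses, comma-separated fields) is only one possible choice. The
generic put writes a header, formats each field transitively through the
other put overloads, then writes a footer; the header and footer play the
role that beforeStruct and afterStruct would play in an implementation of
Formatter.
*/
private struct PointSketch
{
    int x;
    int y;
}

private struct TextFormatterSketch
{
    char[] output;  // stands in for the transport

    void put(long value)
    {
        import std.conv : to;
        output ~= to!string(value);
    }

    void put(in char[] value) { output ~= value; }

    void put(S)(auto ref S s) if (is(S == struct))
    {
        output ~= S.stringof ~ "(";          // header
        foreach (i, field; s.tupleof)
        {
            static if (i > 0) output ~= ", ";
            put(field);                      // fields are formatted transitively
        }
        output ~= ")";                       // footer
    }
}

unittest
{
    TextFormatterSketch f;
    f.put(PointSketch(3, 4));
    assert(f.output == "PointSketch(3, 4)");
}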
/**
$(D Unformatter) is an interface for formatted read. The name $(D
Parser) has been avoided in order to prevent confusion with the
meaning of "parser" in formal grammars.
*/
interface Unformatter
{
/**
Gets and sets the underlying _transport object. Each
unformatter is associated with one _transport object. It is an
error to attempt reads from an $(D Unformatter) that has a $(D
null) _transport. Also, certain unformatters might enforce at
run time that the transport must be buffered.
*/
@property UnbufferedInputTransport transport();
/// Ditto
@property void transport(UnbufferedInputTransport);
/**
Reads an integral _value, including a UTF character.
*/
void read(ref ubyte value);
/// Ditto
void read(ref ushort value);
/// Ditto
void read(ref uint value);
/// Ditto
void read(ref ulong value);
/// Ditto
void read(ref byte value);
/// Ditto
void read(ref short value);
/// Ditto
void read(ref int value);
/// Ditto
void read(ref long value);
/// Ditto
void read(ref char value);
/// Ditto
void read(ref wchar value);
/// Ditto
void read(ref dchar value);
/**
Reads a floating-point _value.
*/
void read(ref float value);
/// Ditto
void read(ref double value);
/// Ditto
void read(ref real value);
/**
Reads a UTF-encoded string.
$(QUESTION Should we pass the size in advance, or make the
stream responsible for inferring it?)
*/
void read(ref char[] value);
/// Ditto
void read(ref wchar[] value);
/// Ditto
void read(ref dchar[] value);
/**
Reads an array (other than strings). The type of
the array element is passed dynamically as $(D elementType).
*/
void read(ref void[] value, TypeInfo elementType);
/**
Convenience generic function that accepts an array of any type
and forwards it to $(D read(array, typeid(T.init))). Due to a
bug in the implementation, this function has temporarily the
name $(D read_) although it will ultimately be $(D read).
*/
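final void read_(T)(ref T[] array) if (!isSomeChar!T) {
// read takes a ref void[], so go through a temporary of that type.
void[] raw = array;
read(raw, typeid(T.init));
array = cast(T[]) raw;
}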
/**
Reads a class object from the stream. Mirroring $(D
Formatter.put(Object)), the responsibility of reading the
object's contents is left to the object, presumably through a
counterpart of $(D toString(Formatter)) that accepts an $(D
Unformatter).
$(QUESTION Should we define a more involved protocol? For
example, even for objects that don't implement formatting, a
$(D Formatter) might define a reasonable output routine by
using introspection to figure out the object's layout. This
approach has the nice consequence that one implementation can
be applied to many objects. But that also means we need to wait
for better reflection support. We also need to figure out a way
to detect that an object does not override $(D
toString(Formatter)), which at the moment I consider a
to-be-added primitive method of $(D Object).)
*/
void read(ref Object obj);
/**
Reads a struct from the stream. This final function reads a
customizable "header" and a customizable "footer". Inside, the
elements of the struct are read transitively. Due to a bug
in the implementation, this function has temporarily the name
$(D read_) although it will ultimately be $(D read).
*/
final void read_(S)(ref S) if (is(S == struct)) {
}
/**
Overridable hooks called before and after reading a $(D
struct)'s fields.
$(QUESTION How to handle associative arrays? They don't have a
common base, as arrays do. Should we offer some overridable
hooks similar to these? For example, $(D beforeAssocArray), $(D
afterAssocArray), $(D beforeAssocArrayElement), $(D
afterAssocArrayElement).)
*/
void beforeStruct(void * s, TypeInfo ti);
/// Ditto
void afterStruct(void * s, TypeInfo ti);
/**
Convenience function that forwards to the appropriate
by-reference overload. Due to a bug in the implementation, this
function has temporarily the name $(D read_) although it will
ultimately be $(D read).
*/
final T read_(T)() {
T result;
read(result);
return result;
}
/**
Reads _data according to an extended $(D scanf)-like format
specifier.
*/
void readf(in char[] format, Variant[] data...);
}
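/*
A minimal sketch of formatted reading, under stated assumptions.
TextUnformatterSketch is a hypothetical name, the input string stands in for
the transport, and values are assumed to be whitespace-separated decimal
text. It shows the two calling styles of Unformatter: the by-reference read
and the convenience read_ that forwards to it and returns the value.
*/
private struct TextUnformatterSketch
{
    string input;  // stands in for the transport

    void read(ref int value)
    {
        import std.conv : parse;
        import std.string : stripLeft;
        input = stripLeft(input);  // skip separators before the number
        value = parse!int(input);  // parse consumes the characters it used
    }

    // Forwards to the by-reference overload, as read_ does in Unformatter.
    T read_(T)()
    {
        T result;
        read(result);
        return result;
    }
}

unittest
{
    auto u = TextUnformatterSketch("12 -7  30");
    assert(u.read_!int() == 12);
    int v;
    u.read(v);
    assert(v == -7);
    assert(u.read_!int() == 30);
    assert(u.input.length == 0);
}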