Re: [RFC] I/O and Buffer Range

Dmitry Olshansky Thu, 16 Jan 2014 14:31:28 -0800

17-Jan-2014 00:00, Steven Schveighoffer wrote:

On Thu, 16 Jan 2014 13:44:08 -0500, Dmitry Olshansky
<dmitry.o...@gmail.com> wrote:

16-Jan-2014 19:55, Steven Schveighoffer wrote:

On Tue, 07 Jan 2014 05:04:07 -0500, Dmitry Olshansky
<dmitry.o...@gmail.com> wrote:

[snip]

In essence a transcoding filter for UTF-16 would wrap a buffer of
ubyte and itself present a buffer interface (but of wchar).


My intended interface allows you to specify the desired type per read.
Think of the case of stdin, where the clients will be varied and written
by many different people, and its interface is decided by Phobos.

But a transcoding buffer may make some optimizations. For instance,
reading a UTF32 file as utf-8 can re-use the same buffer, as no code
unit uses more than 4 code points (did I get that right?).


The other way around :) 4 code units - 1 code point.
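
To make the corrected relation concrete, here is a minimal D sketch. It assumes only std.utf.encode from Phobos; the helper name utf8Units is made up for illustration. Since one UTF-32 code unit occupies 4 bytes and the UTF-8 form of the same code point never needs more than 4 code units, the transcoded output cannot outgrow the space the input occupied, which is what makes the buffer re-use plausible.

```d
import std.utf : encode;

// Hypothetical helper: number of UTF-8 code units needed for one code point.
size_t utf8Units(dchar c)
{
    char[4] buf;            // 4 code units is the maximum for any code point
    return encode(buf, c);  // std.utf.encode returns how many units it wrote
}
```

For example, utf8Units('a') is 1 and utf8Units('\U0001F600') is 4, while the same code points each take exactly one dchar (4 bytes) in UTF-32.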

I am going to study your code some more and see how I can update my code
to use it. I still need to maintain the std.stdio.File interface, and
Walter is insistent that the initial state of stdout/err/in must be
synchronous with C (which kind of sucks, but I have plans on how to make
it not be so bad).


I'm seriously not seeing how interfacing with the C runtime could be
fast enough.


It's not. But an important stipulation in order for this to all be
accepted is that it doesn't break existing code that expects things like
printf and writef to interleave properly.

However, I think we can have an opt-in scheme, and there are certain
cases where we can proactively switch to a D-buffer scheme. For example,
if you get a ByLine range, it expects to exhaust the data from stream,
and may not properly work with C printf.

The idea is that stdio.File can switch at runtime from FILE * to D
streams as needed or directed.

There is still a lot of work left to do, but I think one of the hard
parts is done, namely dealing with UTF transcoding. The remaining sticky
part is dealing with shared. But with structs, this should make things
much easier.


I'm thinking a generic locking wrapper is possible along the lines of:

shared Locked!(GenericBuffer!char) stdin; //usage

struct Locked(T){
shared:
private:
    T _this;
    Mutex mut;
public:
    //forwarded methods
}

The wrapper will introduce a lock, and implement every method of
wrapped struct roughly like this:
mut.lock();
scope(exit) mut.unlock();
(cast(T*)&_this).method(args); // cast away shared while the lock is held

I'm sure it could be pretty automatic.


This would be a key addition for ANY type in order to properly work with
shared. BUT, I don't see how it works safely generically because you
necessarily have to cast away shared in order to call the methods. You
would have to limit this to only working on types it was intended for.

The requirement may be that it's pure or, should I say,
"well-contained". In other words, as long as it doesn't smuggle
references somewhere else it should be fine. That is not to say it's
100% fool-proof, nor do I think that essentially simulating a
synchronized class is always a good thing to do...

I've been expecting to have to do something like this, but not looking
forward to it :(

One question, is there a reason a buffer type has to be a range at all?
I can see where it's easy to make it a range, but I don't see
higher-level code using the range primitives when dealing with chunks of
a stream.


Lexers/parsers enjoy it - i.e. they work pretty much as ranges
especially when skipping spaces and the like. As I said the main
reason was: if it fits as range why not? After all it makes one-pass
processing of data trivial as it rides on top of foreach:

foreach(octet; mybuffer)
{
    if(interesting(octet))
        do_cool_stuff();
}

Things like countUntil make perfect sense when called on buffer (e.g.
to find matching sentinel).
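
A minimal sketch of the sentinel search, assuming the buffer's current window can be viewed as a slice of ubyte. The helper name findSentinel is hypothetical; countUntil is the real std.algorithm function and returns -1 when nothing matches.

```d
import std.algorithm : countUntil;

// Hypothetical helper: index of the first sentinel byte in the window,
// or -1 if the sentinel does not occur.
ptrdiff_t findSentinel(const(ubyte)[] window, ubyte sentinel)
{
    return window.countUntil(sentinel);
}
```

A result of -1 would be the cue to load more data into the buffer and search again; how that is done depends on the buffer interface and is not shown here.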


I think I misstated my question. What I am curious about is why a type
must be a forward range to pass isBuffer. Of course, if it makes sense
for a buffer type to also be a range, it can certainly implement that
interface as well. But I don't know that I would need those primitives
in all cases. I don't have any specific use case for having a buffer
that doesn't implement a range interface, but I am hesitant to
necessarily couple the buffer interface to ranges just because we can't
think of a counter-case :)

Convenient to work with does ring good to me. I simply see no need to
reinvent std.algorithm on buffers, especially the ones that just scan
left-to-right. An example would be calculating a checksum of a stream
(say data comes from a pipe or socket, i.e. buffered). It's a trivial
application of std.algorithm.reduce and there is no need to reinvent
that wheel IMHO.
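
A rough sketch of the checksum case, with a plain slice standing in for the buffer range. The helper name byteSum and the choice of a simple additive checksum are assumptions for illustration; any input range of ubyte could be passed to reduce in the same way.

```d
import std.algorithm : reduce;

// Hypothetical helper: additive checksum, i.e. the sum of all bytes
// with uint wrap-around (modulo 2^32).
uint byteSum(const(ubyte)[] data)
{
    // The seed 0u fixes the accumulator type to uint.
    return reduce!((acc, b) => acc + b)(0u, data);
}
```

The fold itself is a single line, and it relies on nothing beyond the input range primitives that the buffer already exposes.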


--
Dmitry Olshansky
