On Tue, 06 Sep 2011 19:46:35 -0400, Christophe <trav...@phare.normalesup.org> wrote:

I've had a look at the readUntil API, and it's not completely clear. Is the
delegate supposed to remember what it has read and interpreted so far,
or does it have to start from scratch each time?

The start parameter is the index at which new data was added. The deal is, the stream continually appends more data to the array until the delegate is satisfied. The start index keeps some context of "how much of this haven't I seen before?"

So depending on how your delegate is implemented, you can avoid reading anything before start if you wish. However, you still have to take into account the data prior to start when returning how much data was processed.
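To make that concrete, here is a hypothetical sketch (the name untilNewline is mine) of a delegate that scans only the newly-added bytes, but still counts everything from the beginning of the buffer when reporting how much was processed. It assumes the convention suggested in this thread: the delegate returns the number of bytes consumed, or size_t.max to request more data.

```d
// Hypothetical stop condition for readUntil: find a newline.
size_t untilNewline(const(ubyte)[] data, size_t start)
{
    // only scan the bytes we haven't seen in a previous call...
    foreach(i; start .. data.length)
    {
        if(data[i] == '\n')
            return i + 1; // ...but report all bytes up to and including it
    }
    return size_t.max; // no newline yet, ask the stream for more data
}
```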

I can see where this scheme has its downsides for parsing that needs to keep state. It might be aggravating or even impossible to do this when your delegate has to exit when not enough data is present.

However, the default buffer size is something like 10 pages, so the likelihood that you have to return "get me more data" is pretty low, and even if you do, restarting the parse would be a rare occurrence.

So I agree, a delegate *callable* by the "readFrom" function would be preferable and easier to deal with than using readUntil.

Where could I see an implementation of a delegate suitable for readUntil?

In the source code for the revamped stdio. Here is a byChunk range which uses it:

https://github.com/schveiguy/phobos/blob/ceb4ec43057d18d42371128a614e81dbec45a5f6/std/stdio.d#L1665


Basically, in both your and my API, a stream is giving some more
characters to a readFrom method, as long as it asks for more. What I am
not sure about is whether readFrom is supposed to build the read object like
in my API, or if it is supposed to be built afterwards from the string
returned by readUntil.

It should be processed while the delegate is called to check whether readUntil should stop. In other words, the data returned by readUntil will be ignored.
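Since the returned data can be ignored, the parsing can live entirely inside the delegate. Here is a hypothetical sketch (names are mine), again assuming the "return bytes consumed, or size_t.max for more" contract; note that the accumulated value survives across multiple calls of the delegate, since digits before start were already handled in an earlier call:

```d
// Parse an unsigned integer while readUntil scans the buffer.
uint value = 0;
size_t parseUint(const(ubyte)[] data, size_t start)
{
    foreach(i; start .. data.length)
    {
        auto c = data[i];
        if(c < '0' || c > '9')
            return i; // stop, consuming only the digits
        value = value * 10 + (c - '0');
    }
    return size_t.max; // all digits so far, there may be more
}

readUntil(&parseUint); // afterwards, `value` holds the result
```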

I think the main difference is that your API is written from the stream's
point of view, whereas my API is written from the point of view of the
object being read, which will make readFrom easier for users to
implement, since they will not have to worry about their delegate being
called multiple times.

If I have more time, I may look deeper into Phobos stdin and your stdin
proposal, but I'm not sure I can afford that...
In the meantime, I hope I gave you nice ideas to improve your own
proposal. Here are some more...

Yes, I'm thinking readFrom, instead of being a readUntil delegate itself, should probably just accept a DInput (or whatever it gets renamed to). Then it has the choice of running the show itself, or just using readUntil.

I will sum up the different ways to deal with buffering, given either
your readUntil API or my proposed API:

1/ _use only peek_
- the API is written to peek only one character at a time. You
definitely lose the possibility for a stream to give a char[] directly
to the parsing function, even for streams that are not files...

I plan in the next iteration of my revamped stdio to implement a peek function. It's actually pretty simple to implement in terms of readUntil:

const(ubyte)[] peek(size_t nbytes)
{
    const(ubyte)[] retval;
    size_t stopCond(const(ubyte)[] data, size_t start)
    {
        retval = data;
        if(data.length == start)
            return 0; // no new data was added, we hit EOF; stop
        // 0 means stop without consuming anything (so the data stays
        // buffered); size_t.max means "get me more data"
        return data.length >= nbytes ? 0 : size_t.max;
    }

    readUntil(&stopCond);
    // the buffer may hold more than requested; trim to nbytes
    return retval.length > nbytes ? retval[0..nbytes] : retval;
}
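For illustration, a hypothetical use of peek (the stream variable name and the format-sniffing scenario are assumptions, not part of the proposal) to look ahead without consuming anything:

```d
// Look at the first two bytes without consuming them.
auto magic = stream.peek(2);
if(magic.length == 2 && magic[0] == 0x1f && magic[1] == 0x8b)
{
    // looks like gzip data; both bytes are still buffered for the
    // next read call
}
```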

2/ _use c for low level stdin_
- the default stream derived from stdin or from a file peeks only one
character at a time. Everything works fine with c functions.
- you can still explicitly create a stream object from a File to do
double buffering and return several characters, but that makes the File
no longer suitable for c functions, since some unread data can be
hidden in the buffer of the object performing the streaming operations.

If you are going this route, I think you're better off using a rewritten buffering scheme. You've already lost the only reason to use C stdio to begin with -- compatibility with C functions.

3/ _hack into c functions_
- the default stream hacks into FILE* to use its own internal
buffer. This may not be easy to implement, but should be feasible for a
system programmer, shouldn't it?

Yes and no.  There are issues:

- What if the implementation is opaque?
- What if you run out of buffer?
- What if the implementation is open-source, but uses static functions?

There are also other issues with FILE * not related to this discussion which make it a good idea to avoid.

4/ _WTH, d should not rely on c functions to do all low level jobs_
- the default stream peeks several characters. c functions are broken.
- you can still rewrite c-like functions. For example, scanf could be the
same as readf, but would support 0-terminated strings, and be
implemented as a c-style variadic function (avoiding the multiple template
instantiations which make the generated code so big that Walter refuses to
use them).
- if you need to, you can still instantiate a FILE* that will never be seen
by the d library, and that will work fine with c functions.

5/ _variation on 2 and 4_
- Files are still compatible with the current Phobos API, and the default
streaming mode for files only peeks one character at a time.
- Some new struct can perform file operations in a d-like way that is
incompatible with c functions. However, no accessible File object is ever
created for this structure, so no one will mix c and d read/write
functions.

This is somewhat what my new strategy is, except File will seamlessly support both the existing Phobos implementation and my new implementation. I'll be outlining how it works once I've settled on the API (and I'll probably have the implementation ready too).

One last point: any comments about using writeTo with my "stream" API
like readFrom?

I think this is what writeTo (as proposed) already does.

-Steve
