On 24-mar-10, at 03:51, Andrei Alexandrescu wrote:

The Phobos file I/O functions all avoid doing any more buffering than the backing FILE* does. They achieve performance by locking the file once with flockfile/funlockfile and then using fgetc_unlocked().

This puts me in real trouble with the formatted reading functions (a la fscanf but generalized to all input ranges), which I'm gestating about. The problem with the current API is that if you call input.front(), it will call fgetc(). But then say I decide I'm done with the range, as is the case with e.g. reading an integer and stopping at the first non-digit. That non-digit character will be lost. So there's a need to say, hey, put this guy back because whoever reads after me will need to look at it. So I need a putBackFront() or something (which would call fungetc()). I wish things were simpler.

I had a pushBack (I called that unget) in http://github.com/fawzi/blip/blob/master/blip/text/TextParser.d , but I recently removed that in favor of a peek function that I think is much more flexible. What I did is to base most parsing on CharReaders (for example the char based ones from BasicIO):
{{{
/// extent of a slice of a buffer
enum SliceExtent{ Partial, Maximal, ToEnd }

/// a delegate that reads in from a character source
alias size_t delegate(char[]buf, SliceExtent slice,out bool iterate) CharReader;

/// a handler of CharReader, returns true if something was read
alias bool delegate(CharReader)CharReaderHandler;
}}}

a char reader reads from the given buffer buf, and can either request more (by returning EOF), or eat some characters out of it. If it sets iterate to true it wants to iterate with the eaten buffer (useful to for example skip undefined amount of whitespace that might overflow the buffer).

Once you have that you can easily create a Peeker structure that wraps a CharReader, and exposes a CharReaded that tries to match it, but always eats 0 characters, even if the match was successful. With it you can have a peek method that returns true if the CharReader that you pass in matches, false if it does not match, and what you want if the buffer is too small to resolve the issue.

Most of these things are templates that work for any type T. Then I built buffered types that using a size_t delegate(T[]) give a Reader based interface.

All this is not based on single elements anymore, but on arrays (ranges? :), but I think that is what is needed for efficient i/o.

On 03/23/2010 09:12 PM, Steven Schveighoffer wrote:
I don't think we should give up on trying to make a stream range that
is not awkward, I really dislike the way today's input ranges map to
streams.

Me too. Let's keep on looking, I have the feeling something good is right behind the corner. But then I felt that way for a year :o).

give a try to
bool popFront(ref T) ( or next, or another name, or even just a delegate with that signature) I was surprised how well it works, not perfect but better than the other alternatives I had tried.

loop on a T[] array:
        bool popFront(ref T* el);

mapped to
        opApply(int delegate(ref T x) loopBody);

loop on a source of elements T:
        bool popFront(ref T el);

mapped to
        opApply(int delegate(ref T x) loopBody);
(well the ref there does not make much sense, but that is how opApply works to avoid the explosion of opApply). All it takes is a check for pointers in the templates, and dereference the type of opApply.

also direct loops often look reasonable thanks to with D automatically dereferencing with ".":
while (it.popFront(el)){
        el.doSomething;
}

yes assigning stuff directly to el, and not its components you need a T* iterator and you have to write
        *el=...
and
        x= *el
but that is not so terrible.

filter applied on an iterator it is just
bool popNext(ref T el){
  while (it.popNext(el)){
    if (acceptable(el)){
      return true;
    }
  }
  return false;
}

combiners of iterators are likewise quite simple to write.

Reply via email to