Re: buffered input

Andrei Alexandrescu Sat, 05 Feb 2011 19:55:58 -0800

On 2/5/11 7:28 PM, Don wrote:

spir wrote:

On 02/05/2011 10:44 PM, Nick Sabalausky wrote:

"Andrei Alexandrescu"<seewebsiteforem...@erdani.org> wrote in message
news:iijq99$1a5o$1...@digitalmars.com...

On 2/5/11 6:46 AM, spir wrote:

On 02/05/2011 10:36 AM, Nick Sabalausky wrote:

On a separate note, I think a good way of testing the design (and
end up
getting something useful anyway) would be to try to use it to
create a
range
that's automatically-buffered in a more traditional way. Ie, Given
any
input
range 'myRange', "buffered(myRange, 2048)" (or something like that)
would
wrap it in a new input range that automatically buffers the next 2048
elements (asynchronously?) whenever its internal buffer is
exhausted. Or
something like that. It's late and I'm tired and I can't think
anymore
;)


That's exactly what I'm expecting.
Funnily enough, I was about to start a thread on the topic after
reading
related posts. My point was:
"I'm not a specialist in efficiency (rather the opposite), I just know
there is --theoretically-- relevant performance loss to expect from
unbuffered input in various cases. Could we define a generic
input-buffering primitive allowing people to benefit from others'
competence? Just like Appender."

Denis


The buffered range interface as I defined it supports infinite
lookahead.
The interface mentioned by Nick has lookahead between 1 and 2048. So I
don't think my interface is appropriate for that.

Infinite lookahead is a wonderful thing. Consider reading lines from a
file. Essentially what you need to do is to keep on reading blocks
of data
until you see \n (possibly followed by some more stuff). Then you offer
the client the line up to the \n. When the client wants a new line, you
combine the leftover data you already have with new stuff you read. On
occasion you need to move over leftovers, but if your internal
buffers are
large enough that is negligible (I actually tested this recently).

Another example: consider dealing with line continuations in reading
CSV
files. Under certain conditions, you need to read one more line and
stitch
it with the existing one. This is easy with infinite lookahead, but
quite
elaborate with lookahead 1.


I think I can see how it might be worthwhile to discourage the
traditional
buffer interface I described in favor of the above. It wouldn't be as
trivial to use as what people are used to, but I can see that it
could avoid
a lot of unnessisary copying, especially with other people's
suggestion of
allowing the user to provide their own buffer to be filled (and it seems
easy enough to learn).

But what about when you want a circular buffer? Ie, When you know a
certain
maximum lookahead is fine and you want to minimize memory usage and
buffer-appends. Circular buffers don't do infinite lookahead so the
interface maybe doesn't work as well. Plus you probably wouldn't want to
provide an interface for slicing into the buffer, since the slice could
straddle the wrap-around point which would require a new allocation (ie
"return buffer[indexOfFront+sliceStart..$] ~
buffer[0..sliceLength-($-(frontIndex+sliceStart))]"). I guess maybe that
would just call for another type of range.


Becomes too complicated, doesnt it?

Denis


Circular buffers don't seem like an 'optional' use case to me. Most real
I/O works that way.
I think if the abstraction can't handle it, the abstraction is a failure.

The abstraction does handle it implicitly, except that it doesn't fixthe buffer size. If you ask for appendToFront() with large numberswithout calling shiftFront() too, the size of the buffer will ultimatelyincrease to accommodate the entire input. That's the infinite ininfinite lookahead.

Most uses, however, stick with front/popFront (i.e. let the range choosethe buffer size and handle circularity transparently) and on occasioncall appendToFront()/shiftFront(). Whenever a part of the buffer hasbeen released by calling shiftFront(), the implementation may use it asa circular buffer.

Circularity is all transparent, as I think it should be. But the poweris there. Consider FIE* for contrast. The C API provides setbuf andsetvbuf, but no way to otherwise take advantage of buffering - at all.This really hurts; you can only call fungetc() once and be guaranteedsuccess - i.e. all FILE* offer L(1) capabilities no matter how big abuffer you set!



Andrei

Re: buffered input

Reply via email to