On 05/16/12 21:38, H. S. Teoh wrote:
> On Wed, May 16, 2012 at 12:48:49PM -0500, Andrei Alexandrescu wrote:
>> On 5/16/12 12:34 PM, Steven Schveighoffer wrote:
>>> In other words, ranges aren't enough.
>>
>> This is copiously clear to me, but the way I like to think about it
>> is by extending the notion of range (with notions such as e.g.
>> BufferedRange, LookaheadRange, and such) instead of developing an
>> abstraction independent from ranges and then working on stitching
>> that with ranges.
> [...]
>
> One direction that _could_ be helpful, perhaps, is to extend the concept
> of range to include, let's tentatively call it, a ChunkedRange.
> Basically a ChunkedRange implements the usual InputRange operations
> (empty, front, popfront) but adds the following new primitives:
>
> - bool hasAtLeast(R)(R range, int n) - true if underlying range has at
>   least n elements left;
>
> - E[] frontN(R)(R range, int n) - returns a slice containing the front n
>   elements from the range: this will buffer the next n elements from the
>   range if they aren't already; repeated calls will just return the
>   buffer;
>
> - void popN(R)(R range, int n) - discards the first n elements from the
>   buffer, thus causing the next call to frontN() to fetch more data if
>   necessary.
>
> These are all tentative names, of course. But the idea is that you can
> keep N elements of the range "in view" at a time, without having to
> individually read them out and save them in a second buffer, and you can
> advance this view once you're done with the current data and want to
> move on.
>
> Existing range operations like popFrontN, take, takeExactly, drop, etc.,
> can be extended to take advantage of the extra functionality of
> ChunkedRanges. (Perhaps popFrontN can even be merged with popN, since
> they amount to the same thing.)
>
> Using a ChunkedRange allows you to write functions that parse a
> particular range and return a range of chunks (say, a deserializer that
> returns a range of objects given a range of bytes).
>
> Thinking on it a bit further, perhaps we can call this a WindowedRange,
> since it somewhat resembles the sliding window protocol where you keep a
> "window" of sequential packet ids in an active buffer, and remove them
> from the buffer as they get ack'ed (consumed by popN). The buffer thus
> acts like a "window" into the next n elements in the range, which can be
> "slid forward" as data is consumed.
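
For reference, the three primitives quoted above could be sketched roughly as follows. This is only an illustration under my own assumptions: the names 'Windowed' and 'windowed' are hypothetical, the wrapper buffers into a GC-allocated array, and size_t is used for the counts instead of int. It is not an existing Phobos type.

```d
import std.range.primitives;

/// Hypothetical wrapper: buffers elements of any input range so that the
/// next n of them can be viewed as a slice (the "window").
struct Windowed(R) if (isInputRange!R)
{
    alias E = ElementType!R;
    private R source;
    private E[] buf; // pulled from source, not yet popped

    this(R source) { this.source = source; }

    // Pull from source until buf holds n elements or source runs dry.
    private void fill(size_t n)
    {
        while (buf.length < n && !source.empty)
        {
            buf ~= source.front;
            source.popFront();
        }
    }

    /// True if at least n elements are left.
    bool hasAtLeast(size_t n) { fill(n); return buf.length >= n; }

    /// Slice of the front n elements (fewer if the range is shorter).
    E[] frontN(size_t n)
    {
        fill(n);
        return buf[0 .. (n < buf.length ? n : buf.length)];
    }

    /// Discard the first n elements (or all that remain, if fewer).
    void popN(size_t n)
    {
        fill(n);
        buf = buf[(n < buf.length ? n : buf.length) .. $];
    }

    // The usual input range primitives, expressed via the window.
    @property bool empty() { return buf.length == 0 && source.empty; }
    @property E front() { fill(1); return buf[0]; }
    void popFront() { popN(1); }
}

/// Hypothetical convenience constructor.
auto windowed(R)(R r) { return Windowed!R(r); }
```

Repeated calls to frontN() with the same n return the same buffered elements until popN() slides the window forward, which is the behaviour the quoted description asks for.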

Right now, everybody reinvents this, with a slightly different interface...
It's really obvious, needed and just has to be standardized.

A few notes:

hasAtLeast is redundant, as it can be better expressed as .length; what would
be the point of wrapping 'r.length>=n'? An '.available' property would be
useful to find out e.g. how much can be consumed w/o blocking, but that one
should return a size_t too.

'E[] frontN' should have a version that returns all available elements; I
called it '@property E[] fronts()' here. It's more efficient that way and
doesn't rely on the compiler to inline and optimize the limit checks away.

PopN -- well, its signature here is 'void popFronts(size_t n)'; other than
that, there's no difference.

Similar things are necessary for output ranges. Here, what I needed was:

   void put(ref E el)
   void puts(E[] els)
   @property size_t free() // Not the most intuitive name w/o context;
                           // returns the number of E's that can be 'put()'
                           // w/o blocking.

Note that all of this doesn't address the consume-variable-sized-chunks issue.
But that can now be efficiently handled by another layer on top.

On 05/16/12 22:15, Steven Schveighoffer wrote:
> I still don't get the need to "add" this to ranges. The streaming API works
> fine on its own.

This is not an argument against a streaming API (at least not for me), but
for efficient ranges. With the API above I can shift tens of gigabytes of data
per second between threads. And still use the 'std' range API and everything
that works with it...

> But there is an omission with your proposed API regardless -- reading data is
> a mutating event. It destructively mutates the underlying data stream so
> that you cannot get the data again. This means you must double-buffer data
> in order to support frontN and popN that are not necessary with a simple read
> API.
>
> For example:
>
> auto buf = new ubyte[1000000];
> stream.read(buf);
>
> does not need to first buffer the data inside the stream and then copy it to
> buf, it can read it from the OS *directly* into buf.

Sometimes having the buffer managed by 'stream' and 'read()' returning a slice
into it works (this is what 'fronts' above does). Reusing a caller-managed
buffer can be useful in other cases, yes.

artur
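
To make the signatures in this reply concrete, here is a minimal single-threaded sketch of a buffer exposing both the input side (fronts, popFronts, available) and the output side (put, puts, free). The type name 'Fifo', the fixed capacity, and the "assert instead of block" behaviour are all assumptions made for illustration; the actual inter-thread implementation is not shown in this thread.

```d
/// Hypothetical fixed-capacity FIFO illustrating the primitives named above.
/// Not thread-safe; a real version would block or synchronize instead of
/// asserting.
struct Fifo(E)
{
    private E[] store;   // backing storage, allocated once
    private size_t head; // index of the first unread element
    private size_t tail; // index one past the last written element

    this(size_t capacity) { store = new E[capacity]; }

    // --- output side ---

    /// Number of E's that can be put() without blocking.
    @property size_t free() const { return store.length - tail; }

    void put(ref E el) { puts((&el)[0 .. 1]); }

    void puts(E[] els)
    {
        assert(els.length <= free, "would block");
        store[tail .. tail + els.length] = els[];
        tail += els.length;
    }

    // --- input side ---

    /// Number of E's that can be consumed without blocking.
    @property size_t available() const { return tail - head; }

    @property bool empty() const { return available == 0; }
    @property E front() { return store[head]; }

    /// All currently available elements, as a slice into the buffer (no copy).
    @property E[] fronts() { return store[head .. tail]; }

    void popFront() { popFronts(1); }

    void popFronts(size_t n)
    {
        assert(n <= available);
        head += n;
        if (head == tail) head = tail = 0; // drained: reclaim the whole buffer
    }
}
```

Because empty/front/popFront are still present, such a type keeps working with code written against the ordinary input range API, while callers that know about fronts() can process a whole slice per call.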