On 12/11/2010 01:00 AM, Christopher Nicholson-Sauls wrote:
> On 12/10/10 22:36, Matthias Walter wrote:
>> On 12/10/2010 09:57 PM, Matthias Walter wrote:
>>> Hi all,
>>>
>>> I currently work on a parser for some file format. I wanted to use the
>>> std.stdio.ByChunk Range to read from a file and extract tokens from the
>>> chunks. Obviously it can happen that the current chunk ends before a
>>> token can be extracted, in which case I can ask for the next chunk from
>>> the Range. In order to keep the already-read part in mind, I need to dup
>>> at least the unprocessed part of the older chunk and concatenate it in
>>> front of the next part, or at least write code that works as if they
>>> were concatenated. This looks like a stupid approach to me.
>>>
>>> Here is a small example:
>>>
>>> file contents: "Hello world"
>>> chunks: "Hello w" "orld"
>>>
>>> First I read the token "Hello" from the first chunk and maybe skip the
>>> whitespace. Then I have the "w" (which I need to move away from the
>>> buffer, because ByChunk will overwrite it) and get "orld".
>>>
>>> My idea was to have a ByChunk-related object, which the user can tell
>>> how much of the buffer he/she actually used, such that it can move the
>>> unused data to the beginning of the buffer and append the next chunk.
>>> This wouldn't need further allocations and would give the user
>>> contiguous data he/she can work with.
>> I coded something that works like this:
>>
>> foreach (ref ubyte[] data; byBuffer(file, 12))
>> {
>>     writefln("[%s]", cast(string) data);
>>     data = data[$-2 .. $];
>> }
>>
>> The 2nd line in the loop tells ByBuffer that we didn't process the last
>> two chars and would like to get them again along with newly read data.
>> And as long as we do process something, the internal buffer does not get
>> reallocated.
>>
>> It works and respects the formal requirements of ranges. Whether it
>> respects the intended semantics is open to discussion. Any comments on
>> whether the above makes sense, or is it an evil exploit of the
>> provided syntax sugar?
> I don't think it's a bad approach, but I have a suggestion.
>
> It leaves a lot of room for abuse or misuse if you require the user code
> to modify the data[] array in order to send this "protect some
> characters" message. I think it would be better to provide an explicit
> function/method that means precisely that. Maybe return a transparent
> struct wrapping a view to the buffer's data, that further provides a
> function for doing precisely this.
>
> foreach( data; byBuffer( file, 12 )) {
>     // do things with data, decide we need to keep 2 chars
>     data.save( 2 );
> }
>
> Or something like it. With regards to this, you may want to allow the
> internal buffer to grow (if you aren't already) as needed. Imagine what
> would otherwise happen if you needed to 'save' the entire current buffer.
>
> -- Chris N-S

Thank you! This is a really good idea. So I basically wrap the buffer
array and implement it such that the default behavior (without explicitly
doing anything) is like the ByChunk mechanism.
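
A rough sketch of how that wrapper could look. This is untested against
the code discussed above and makes several assumptions: the names
ByBuffer, BufferView, State and keep are all made up for illustration
(nothing like them exists in Phobos), the chunk size is assumed to be
greater than zero, and the scratch file name in main is arbitrary. I
called the method keep rather than save, since save is already the name
of the forward range primitive.

```d
import core.stdc.string : memmove;
import std.stdio : File;

// Shared state lives on the heap: foreach copies the range struct, and the
// views handed out must still reach the same buffer. (All names hypothetical.)
private struct State
{
    File file;
    ubyte[] buffer;   // internal storage, grows only when needed
    size_t chunkSize; // bytes requested per read
    size_t filled;    // number of valid bytes in buffer
    size_t kept;      // trailing bytes to present again next time
    bool done;
}

// Transparent wrapper around the current buffer contents.
struct BufferView
{
    ubyte[] data;
    alias data this;  // usable like a plain ubyte[]
    private State* state;

    // Ask for the last n bytes of this view to be delivered again,
    // in front of the newly read data.
    void keep(size_t n)
    {
        assert(n <= data.length, "cannot keep more than the current view");
        state.kept = n;
    }
}

struct ByBuffer
{
    private State* state;

    this(File file, size_t chunkSize)
    {
        assert(chunkSize > 0);
        state = new State;
        state.file = file;
        state.chunkSize = chunkSize;
        state.buffer = new ubyte[chunkSize];
        fill();
    }

    @property bool empty() const { return state.done; }
    @property BufferView front() { return BufferView(state.buffer[0 .. state.filled], state); }
    void popFront() { fill(); }

    private void fill()
    {
        auto s = state;
        // Slide the unprocessed tail to the start of the buffer.
        if (s.kept > 0)
            memmove(s.buffer.ptr, s.buffer.ptr + (s.filled - s.kept), s.kept);
        // Grow only if the tail plus one new chunk no longer fits.
        if (s.buffer.length < s.kept + s.chunkSize)
            s.buffer.length = s.kept + s.chunkSize;
        auto got = s.file.rawRead(s.buffer[s.kept .. s.kept + s.chunkSize]);
        s.filled = s.kept + got.length;
        s.done = got.length == 0; // nothing new was read: stop iterating
        s.kept = 0;               // default: behave like ByChunk
    }
}

void main()
{
    import std.file : remove, write;
    import std.stdio : writefln;

    enum name = "bybuffer_demo.txt"; // arbitrary scratch file
    write(name, "Hello world");
    scope (exit) remove(name);

    auto f = File(name, "rb");
    bool first = true;
    foreach (data; ByBuffer(f, 7))
    {
        writefln("[%s]", cast(string) data.data); // prints [Hello w], then [world]
        if (first) { data.keep(1); first = false; } // the "w" was not processed yet
    }
    f.close();
}
```

If keep is never called, each view is just the next chunk, as with
ByChunk. One gap in this sketch: if the user keeps something and the file
is then exhausted, iteration stops and the kept tail is never handed out
again, so a real version would need some way to deliver that remainder.
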
Matthias