On Monday, 19 March 2012 at 17:23:36 UTC, Andrei Alexandrescu wrote:

[.....]


I have wanted for a long time to improve byLine by allowing it to do its own buffering. That means that once you use byLine, it's not possible to stop it, get back to the original File, and continue reading from it. Using byLine is a commitment. This is what most uses of it do anyway.

Great!! Perhaps we don't have to choose. We may have both!!
Allow me to suggest:

      byLineBuffered(bufferSize, keepTerminator);
or    byLineOnly(bufferSize, keepTerminator);
or    byLineChunked(bufferSize, keepTerminator);
or    byLineFastAndDangerous :-) hahah :-)

Or the other way around:

      byLine(keepTerminator, underlyingBufferSize);
renaming the current one to:
      byLineUnbuffered(keepTerminator);

Other ideas (I think I read them somewhere in an earlier
discussion of this same byLine topic):
  * If buffering is added, it would be cool if 'line' could be
a slice of the underlying buffer whenever possible.
  * Another good idea would be a new argument, maxLineLength.
If the file contains no newlines, this avoids reading the
whole file into one huge line string when the maximum
useful line length is known in advance.
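
To make the two ideas above concrete, here is a rough sketch of what such a range could look like. Everything in it is an assumption for illustration: BufferedLines and byLineBuffered are hypothetical names, not part of Phobos, and the parameter list (bufferSize, maxLineLength) is just one possible shape. It only recognizes '\n' as a terminator and never keeps it.

```d
import core.stdc.string : memmove;
import std.algorithm.comparison : max;
import std.algorithm.searching : countUntil;
import std.exception : enforce;
import std.stdio : File, writeln;

/// Hypothetical buffered line range (not a Phobos API).
struct BufferedLines
{
    private File file;
    private ubyte[] buf;           // one buffer, reused for every read
    private size_t start, end;     // unread data is buf[start .. end]
    private size_t maxLineLength;
    private const(char)[] line;    // current line, a slice of buf
    private bool done;

    this(File file, size_t bufferSize, size_t maxLineLength)
    {
        enforce(maxLineLength > 0, "maxLineLength must be positive");
        this.file = file;
        this.maxLineLength = maxLineLength;
        // One extra byte so a full-length line plus its '\n' always fits.
        buf = new ubyte[max(bufferSize, maxLineLength + 1)];
        popFront();
    }

    @property bool empty() const { return done; }
    @property const(char)[] front() const { return line; }

    void popFront()
    {
        for (;;)
        {
            auto pending = buf[start .. end];
            immutable nl = pending.countUntil(cast(ubyte) '\n');
            if (nl >= 0 && cast(size_t) nl <= maxLineLength)
            {
                // Complete line: hand out a slice, no copy.
                line = cast(const(char)[]) pending[0 .. nl];
                start += nl + 1;
                return;
            }
            if (pending.length > maxLineLength)
            {
                // Line is too long: return a capped piece.
                line = cast(const(char)[]) pending[0 .. maxLineLength];
                start += maxLineLength;
                return;
            }
            if (start > 0)
            {
                // Move the partial line to the front to make room.
                memmove(buf.ptr, buf.ptr + start, pending.length);
                end = pending.length;
                start = 0;
            }
            immutable got = file.rawRead(buf[end .. $]).length;
            end += got;
            if (got == 0)
            {
                // End of file: emit an unterminated last line, if any.
                if (end == 0) { done = true; return; }
                line = cast(const(char)[]) buf[0 .. end];
                start = end;
                return;
            }
        }
    }
}

/// Hypothetical convenience wrapper, named after the suggestion above.
auto byLineBuffered(File file, size_t bufferSize, size_t maxLineLength)
{
    return BufferedLines(file, bufferSize, maxLineLength);
}

void main(string[] args)
{
    if (args.length < 2) { writeln("usage: ", args[0], " <file>"); return; }
    foreach (line; byLineBuffered(File(args[1], "rb"), 64 * 1024, 4096))
        writeln(line);
}
```

As with the current byLine, each 'line' is only valid until the next popFront, so callers who want to keep it have to .dup it. A line longer than maxLineLength comes out in several pieces here; throwing instead would be an equally reasonable choice.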

--jm



Ok, this was the good surprise. Reading by chunks was faster than
reading the whole file, by several ms.

What may be at work here are cache effects. Reusing the same 1 MB buffer may keep it in faster cache memory, whereas reading 20 MB at once may spill into slower memory.
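
For anyone who wants to try this on their own machine, a minimal timing sketch follows. It is not the benchmark used in this thread: sumWhole and sumChunked are made-up helper names, the byte sum is just a stand-in for real work, and the 1 MB chunk size simply mirrors the figure above. The second measurement may also benefit from the OS file cache, so the numbers should be taken with a grain of salt.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.file : read;
import std.stdio : File, writefln;

/// Reads the whole file into one array, then touches every byte.
ulong sumWhole(string path)
{
    ulong sum;
    foreach (b; cast(const(ubyte)[]) read(path))
        sum += b;
    return sum;
}

/// Reads through byChunk, which reuses a single buffer of chunkSize bytes.
ulong sumChunked(string path, size_t chunkSize)
{
    ulong sum;
    foreach (chunk; File(path, "rb").byChunk(chunkSize))
        foreach (b; chunk)
            sum += b;
    return sum;
}

void main(string[] args)
{
    if (args.length < 2) { writefln("usage: %s <file>", args[0]); return; }

    auto sw = StopWatch(AutoStart.yes);
    immutable whole = sumWhole(args[1]);
    immutable tWhole = sw.peek;

    sw.reset();
    immutable chunked = sumChunked(args[1], 1024 * 1024);
    immutable tChunked = sw.peek;

    assert(whole == chunked); // both strategies must see the same bytes
    writefln("whole file: %s, 1 MB chunks: %s", tWhole, tChunked);
}
```

Running it several times, and swapping the order of the two measurements, gives a better idea of whether the difference is real.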


Andrei



