The other thing is, how many people are really going to use readln(char[] buf) directly instead of using the version that returns a newly allocated string or using byLine()? If they're using the version that returns a newly allocated string, this is a non-issue. If they're using byLine(), then a lot of the logic for recycling buffers could be moved there, where it is safer because it is well-encapsulated.
On Thu, Feb 4, 2010 at 12:02 PM, Andrei Alexandrescu <[email protected]>wrote: > So it seems like due to the low-level APIs that readlnImpl calls, it flies > under the radar of Steve's detection. > > I think using an Appender with the useExistingBuffer primitive would work, > but would be suboptimal - each read can only use as many characters as read > the last time, which would trigger quite a few allocation (whenever the line > size increases). Please let me know whether this assessment is correct. > > A possible solution would be to cache the last buffer and its maximum > length in static variables inside readlnImpl. I think it's reasonable to say > that readln is the owner of the char[] and you shouldn't take portions of it > and expect they won't be changed. However, it will not trump over existing > arrays. > > Andrei > > David Simcha wrote: > >> >> >> On Thu, Feb 4, 2010 at 12:10 AM, Andrei Alexandrescu >> <[email protected]<mailto: >> [email protected]>> wrote: >> >> David Simcha wrote: >> >> I recently filed bug 3673 >> (http://d.puremagic.com/issues/show_bug.cgi?id=3763) and started >> looking at possible fixes. This is a bug in >> std.stdio.readlnImpl(). However, the more I think about it the >> more I think the function needs to be completely rethought, not >> just patched. I'm not sure if this is a high priority given >> that it has nothing to do with the language spec and TDPL but >> it's a pretty embarrassing quality of implementation issue. >> >> Anyhow, readlnImpl() takes a ref char[] and tries to recycle the >> memory to read in another line. In doing so, it queries >> GC.capacity for the relevant memory block and resizes the array >> to the size of the memory block. This is arguably unsafe in the >> general case because, if someone passes in a slice that starts >> at the beginning of a GC block to use as the buffer, everything >> else in the same GC block can get overwritten. This massively >> violates the principle of least surprise. On the other hand, >> when encapsulated in the higher level API of byLine(), it's both >> safe and efficient. >> >> >> Could you please give an example? >> >> import std.stdio; >> >> void main() { >> auto writer = File("foo.txt", "wb"); >> foreach(i; 0..1000) { >> writer.write('a'); >> } >> writer.writeln(); >> writer.close(); >> >> auto chars = new char[500]; >> chars[] = 'b'; >> auto buf = chars[0..100]; >> >> auto reader = File("foo.txt", "rb"); >> reader.readln(buf); >> writeln(chars[$ - 1]); // a >> assert(chars[$ - 1] == 'b'); // FAILS. >> } >> >> >> ------------------------------------------------------------------------ >> >> >> _______________________________________________ >> phobos mailing list >> [email protected] >> http://lists.puremagic.com/mailman/listinfo/phobos >> > _______________________________________________ > phobos mailing list > [email protected] > http://lists.puremagic.com/mailman/listinfo/phobos >
_______________________________________________ phobos mailing list [email protected] http://lists.puremagic.com/mailman/listinfo/phobos
