On Tue, 7 Dec 2010, Sven Van Caekenberghe wrote:
Levente,
On 07 Dec 2010, at 16:20, Levente Uzonyi wrote:
That's because filestreams are read buffered, but not write buffered. I
implemented a subclass of FileStream (intended as a possible replacement of
StandardFileStream) which is read and write buffered. It gives the same
performance for reading as the current implementation and a significant boost
for writes, so it can be done. But write buffering has side effects, while read
buffering doesn't. Maybe it can be added as a separate subclass of FileStream
if there's need for it, but the multibyte stuff has to be duplicated in this
case (note that it's already duplicated in MultiByteFileStream and
MultiByteBinaryOrTextStream). I also had an idea to create MultiByteStream
which would be a stream that wraps another stream and does the conversion stuff
using a TextConverter. It'd be a lot of work to do it and I don't expect more
than 30% performance improvement (for the read performance).
Thanks for the explanation, some quick and dirty buffering makes a huge
difference:
[ FileStream fileNamed: '/tmp/numbers.txt' do: [ :fileStream |
    1000 timesRepeat: [
        fileStream nextPutAll:
            (String streamContents: [ :stream |
                100 timesRepeat: [ stream print: 100 atRandom; space ] ]) ] ] ] timeToRun.
159
Still, the asymmetry is a bit strange.
Can't the side effects be dealt with using #flush?
Let's go back in time. A year ago there was no read buffering (Pharo 1.0
was not released, Squeak 3.10.2 was out) and reading from a file was as
slow as writing is currently. Read buffering could be added transparently,
so it could give a huge speed improvement to all existing code.
Write buffering could be done the same way, but it would break code,
because currently a write is immediately done, while with buffering it
wouldn't be. Some files would be written only when the finalization
process closes the file. The solution for this could be automatic flushing
on each write, which could be turned off by a method. But that would be
the same as not using write buffering at all.
But with the same effort you could use another stream implementation that
does write buffering. And write buffering can't be used to speed up
existing code without reviewing it.
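
To make the visibility problem concrete, here is a minimal workspace sketch. It is not the FileStream subclass mentioned earlier in the thread; it only illustrates the idea under stated assumptions. An in-memory WriteStream (target) stands in for the file, and the names buffer, limit, write and flush are hypothetical, chosen just for this example.

```smalltalk
"Workspace sketch of deferred writes. target stands in for the file;
 buffer, limit, write and flush are hypothetical names for illustration."
| target buffer limit flush write |
target := WriteStream on: String new.
buffer := WriteStream on: String new.
limit := 4096.

"Push everything buffered so far to the target, then empty the buffer."
flush := [ target nextPutAll: buffer contents. buffer reset ].

"Collect data in the buffer; only touch the target once the limit is reached."
write := [ :aString |
    buffer nextPutAll: aString.
    buffer position >= limit ifTrue: [ flush value ] ].

write value: '42 '.
target contents isEmpty.    "true: nothing has reached the target yet"
flush value.
target contents = '42 '.    "true: the data arrives only after the flush"
```

Code that writes and then expects the data to be on disk right away would see the first state, which is why each existing caller would have to be reviewed before buffering is switched on.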
Levente
There are several stream libraries (for example XTreams) that can easily
support write buffering without the need to care about compatibility.
Yeah, although the Smalltalk Collection and Stream classes were better than
everything else 20, 30 years ago, lots of things have changed and there is lots
of competition. The fact that these classes are so nice to use seems to have
prevented necessary improvements.
I think I might file this as a Pharo issue.
Sven