On Tue, 7 Dec 2010, Sven Van Caekenberghe wrote:

Levente,

On 07 Dec 2010, at 16:20, Levente Uzonyi wrote:

That's because filestreams are read buffered, but not write buffered. I 
implemented a subclass of FileStream (intended as a possible replacement of 
StandardFileStream) which is read and write buffered. It gives the same 
performance for reading as the current implementation and a significant boost 
for writes, so it can be done. But write buffering has side effects, while read 
buffering doesn't. Maybe it can be added as a separate subclass of FileStream 
if there's need for it, but the multibyte stuff has to be duplicated in this 
case (note that it's already duplicated in MultiByteFileStream and 
MultiByteBinaryOrTextStream). I also had an idea to create MultiByteStream 
which would be a stream that wraps another stream and does the conversion stuff 
using a TextConverter. It'd be a lot of work to do it and I don't expect more 
than 30% performance improvement (for the read performance).

Thanks for the explanation, some quick and dirty buffering makes a huge 
difference:

[ FileStream fileNamed: '/tmp/numbers.txt' do: [ :fileStream |
        1000 timesRepeat: [
                fileStream nextPutAll:
                        (String streamContents: [ :stream |
                                100 timesRepeat: [
                                        stream print: 100 atRandom; space ] ]) ] ] ] timeToRun.
159

Still, the asymmetry is a bit strange.
Can't the side effects be dealt with using #flush?

Let's go back in time. A year ago there was no read buffering (Pharo 1.0 was 
not released, Squeak 3.10.2 was out) and reading from a file was as slow as 
writing is currently. Read buffering could be added transparently, so it could 
give a huge speed improvement to all existing code. Write buffering could be 
added the same way, but it would break code, because a write currently happens 
immediately, while with buffering it wouldn't. Some files would be written 
only when the finalization process closes the file. The solution for this 
could be automatic flushing on each write, which could be turned off by a 
method. But that would be the same as not using write buffering at all, and 
with the same effort you could use another stream implementation that does 
write buffering. Either way, write buffering can't be used to speed up 
existing code without reviewing it.
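
To make the trade-off concrete, here is a minimal sketch of manual write 
buffering in the style of the snippet above. It is not the FileStream subclass 
mentioned earlier, only an illustration: the 4096-character threshold and the 
variable name buffer are arbitrary assumptions. Nothing reaches the file until 
the buffer is handed over, so code that forgets the final write loses data, 
which is exactly the kind of side effect existing code would have to be 
reviewed for.

```smalltalk
| buffer |
FileStream fileNamed: '/tmp/numbers.txt' do: [ :fileStream |
        "Collect output in memory first (threshold of 4096 is an assumption)."
        buffer := WriteStream on: (String new: 4096).
        100000 timesRepeat: [
                buffer print: 100 atRandom; space.
                buffer position >= 4096 ifTrue: [
                        "Buffer is full: one large write instead of many small ones."
                        fileStream nextPutAll: buffer contents.
                        buffer := WriteStream on: (String new: 4096) ] ].
        "Write whatever is left over; skipping this step would drop data."
        fileStream nextPutAll: buffer contents; flush ]
```

With the current unbuffered writes every nextPutAll: goes to the file 
immediately, so no such final step is needed, and that is the behaviour 
existing code relies on.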


Levente


There are several stream libraries (for example XTreams) that can easily 
support write buffering without the need to care about compatibility.

Yeah, although the Smalltalk Collection and Stream classes were better than 
everything else 20, 30 years ago, lots of things have changed and there is lots 
of competition. The fact that these classes are so nice to use seems to have 
prevented necessary improvements.

I think I might file this as a Pharo issue.

Sven



