Re: [Pharo-project] Why is FileStream writing almost an order of a magnitude slower than reading ?

Levente Uzonyi Wed, 08 Dec 2010 04:04:02 -0800

On Wed, 8 Dec 2010, Sven Van Caekenberghe wrote:


On 08 Dec 2010, at 00:25, Levente Uzonyi wrote:

Thanks for the explanation, some quick and dirty buffering makes a huge 
difference:

[ FileStream fileNamed: '/tmp/numbers.txt' do: [ :fileStream |
        1000 timesRepeat: [
                fileStream nextPutAll:
                        (String streamContents: [ :stream |
                                100 timesRepeat: [ stream print: 100 atRandom; space ] ]) ] ] ] timeToRun.
159

Still, the asymmetry is a bit strange.
Can't the side effects be dealt with using #flush ?


Let's go back in time. A year ago there was no read buffering (Pharo 1.0 was not 
released, Squeak 3.10.2 was out) and reading from a file was as slow as writing 
is currently. Read buffering could be added transparently, so it could give a 
huge speed improvement to all existing code.
Write buffering could be done the same way, but it would break code, because 
currently a write is done immediately, while with buffering it wouldn't be. 
Some files would be written only when the finalization process closes the file. 
The solution for this could be automatic flushing on each write, which could be 
turned off by a method. But that would be the same as not using write buffering 
at all.
But with the same effort you could use another stream implementation that does 
write buffering. And write buffering can't be used to speed up existing code 
without reviewing it.
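
To illustrate the point about delayed writes, here is a minimal workspace 
sketch. It is only a model: 'disk' and 'buffer' are made-up names, and both 
are in-memory streams standing in for the file and for a write buffer. It is 
not how StandardFileStream works internally.

```smalltalk
"Hypothetical model: 'disk' stands in for the file, 'buffer' for a write buffer."
| disk buffer |
disk := WriteStream on: String new.
buffer := WriteStream on: String new.

"Unbuffered write: the data reaches 'disk' immediately."
disk nextPutAll: 'first '.

"Buffered write: the data stays in memory until someone flushes."
buffer nextPutAll: 'second '.
Transcript show: disk contents printString; cr.   "shows 'first ' only"

"Explicit flush: copy the buffered data to 'disk' and empty the buffer."
disk nextPutAll: buffer contents.
buffer reset.
Transcript show: disk contents printString; cr.   "shows 'first second '"
```

Code that expects the second write to be visible right away would have to be 
reviewed and given an explicit flush.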


Thanks again for the explanation.

OK, I tried writing my own buffered write stream class:

[ FileStream fileNamed: '/tmp/numbers.txt' do: [ :fileStream | | bufferedStream |
        bufferedStream := ZnBufferedWriteStream on: fileStream.
        100000 timesRepeat: [ bufferedStream print: 100 atRandom; space ].
        bufferedStream flush ] ] timeToRun.
165

That wasn't too hard. And indeed, it is necessary to manually send #flush or 
#close to force the buffer out.
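
For readers without such a class at hand, the same idea can be sketched in a 
workspace with a plain in-memory WriteStream as the buffer. This is not the 
ZnBufferedWriteStream implementation, just an assumption-laden illustration: 
'buffer', 'limit' and 'flushBuffer' are made-up names, the 4096 limit is 
arbitrary, and FileStream fileNamed:do: is used as in the snippets above. 
The #ensure: block plays the role of the manual #flush.

```smalltalk
"Hypothetical sketch: 'buffer', 'limit' and 'flushBuffer' are illustrative names."
| buffer limit flushBuffer |
limit := 4096.
buffer := WriteStream on: (String new: limit).
FileStream fileNamed: '/tmp/numbers.txt' do: [ :fileStream |
    "Push everything collected so far to the file in one write, then empty the buffer."
    flushBuffer := [
        fileStream nextPutAll: buffer contents.
        buffer reset ].
    [ 100000 timesRepeat: [
        buffer print: 100 atRandom; space.
        "Only touch the file once the buffer has filled up."
        buffer position >= limit ifTrue: [ flushBuffer value ] ] ]
            ensure: [ flushBuffer value ] ]
```

Wrapping the loop in #ensure: means the remaining buffered data is written 
even if the loop is interrupted by an error.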

But I do not completely agree that it would be that much work. 
Stream>>#flush is already a no-op. Adding it to #streamContents: and some 
others cannot be that much work. In fact, SocketStream already does both input 
and output buffering (and thus requires #flush or #close), so it would potentially 
fail in certain situations according to your reasoning. No ?

It would be much work to add the write buffering to StandardFileStream (it 
should also work with MultiByteFileStream) and fix all places in the image 
that use StandardFileStream or MultiByteFileStream for writing to a file.
I don't get how #streamContents: could be used to send #flush. That's a 
method of SequenceableCollection IIRC.
SocketStream is unrelated here, because it doesn't write to files and 
buffering was always implemented in it AFAIK.



Levente


Sven
