Related to the Matrix CSV input/output optimization quest, I was puzzled why
writing seemed so much slower than reading.
Here is a simple example (timeToRun answers milliseconds):
[ FileStream fileNamed: '/tmp/numbers.txt' do: [ :stream |
100000 timesRepeat: [ stream print: 100 atRandom; space ] ] ] timeToRun.
1558
[ FileStream fileNamed: '/tmp/numbers.txt' do: [ :stream |
100000 timesRepeat: [ Integer readFrom: stream. stream peekFor: $ ] ] ]
timeToRun.
183
[ FileStream fileNamed: '/tmp/numbers.txt' do: [ :stream |
100000 timesRepeat: [ stream nextPut: ($a to: $z) atRandom; space ] ] ]
timeToRun.
1705
[ FileStream fileNamed: '/tmp/numbers.txt' do: [ :stream |
100000 timesRepeat: [ stream next. stream peekFor: $ ] ] ] timeToRun.
47
Clearly, writing is much slower than reading: about 8 times slower for integers
(1558 ms vs. 183 ms) and over 30 times slower for single characters (1705 ms vs.
47 ms).
This was on Pharo 1.1 with Cog, but I double-checked with Pharo 1.2 and Squeak
4.1.
On my machine (MacBook Pro), this is what another dynamic language, Common Lisp,
does (times in seconds):
> (time (with-output-to-file (out "/tmp/numbers.txt")
(loop repeat 100000 do (format out "~d " (random 100)))))
Timing the evaluation of (WITH-OUTPUT-TO-FILE (OUT "/tmp/numbers.txt") (LOOP
REPEAT 100000 DO (FORMAT OUT "~d " (RANDOM 100))))
User time = 0.413
System time = 0.002
Elapsed time = 0.401
Allocation = 2502320 bytes
0 Page faults
Calls to %EVAL 1700063
NIL
> (time (with-open-file (in "/tmp/numbers.txt")
(loop repeat 100000 do (read in))))
Timing the evaluation of (WITH-OPEN-FILE (IN "/tmp/numbers.txt") (LOOP REPEAT
100000 DO (READ IN)))
User time = 0.328
System time = 0.001
Elapsed time = 0.315
Allocation = 2500764 bytes
0 Page faults
Calls to %EVAL 1400056
NIL
So Pharo Smalltalk clearly matches the read/parse speed (183 ms vs. about 315 ms
elapsed), which is great, but it is roughly four times slower at simple writing
(1558 ms vs. about 401 ms elapsed).
Maybe I am doing something wrong here (I know these are MultiByteFileStreams),
but I fail to see what. Is it something to do with buffering/flushing?
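One way to test the buffering hypothesis, as a minimal sketch only: print
everything into an in-memory stream first and hand the result to the file in a
single nextPutAll: call. If this runs much faster than the first example, the
cost is presumably in the many small writes to the file stream rather than in
printing the numbers. The in-memory WriteStream used as a buffer is an
assumption made for this experiment; nothing here has been measured.

```smalltalk
"Sketch: buffer the output in memory, then write the file in one call."
[ | buffer |
  "Assumption: a plain in-memory WriteStream stands in for a file buffer."
  buffer := WriteStream on: String new.
  100000 timesRepeat: [ buffer print: 100 atRandom; space ].
  FileStream fileNamed: '/tmp/numbers.txt' do: [ :stream |
    stream nextPutAll: buffer contents ] ] timeToRun.
```

Comparing this timing with the 1558 ms above would show how much of the write
cost comes from calling the file stream once per number.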
Does anybody have any idea?
Sven