On Thu, 2 Sep 1999, Ingo Molnar wrote:
> >               -------Sequential Output-------- ---Sequential Input-- --Random--
> >               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> > Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
> >          2000 19712 97.6 85378 85.8 30903 73.0 26272 97.1 83648 92.1 323.2  3.8
First, thank you for your comments. You are right that around 85 MByte/s
read and write for bulk operations on a 2 GB file (with 256 MB of memory)
is not that bad :)
However, if you have some time, I would appreciate some comments on the
following:
> >               -------Sequential Output-------- ---Sequential Input-- --Random--
> >               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> > Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
> >          2000 19662 97.4 63567 64.3 21655 50.0 26064 96.0 68886 69.6 225.2  2.4
Note that the rewrite speed is quite low. Could this be because, when I
rewrite 512 bytes, Linux has to read 512 bytes first? This is true at
least for fs/block_dev.c when the device blksize is not 512 (it's 1024 for
SCSI, 4096 for IDE, and when using md it looks like it depends on the
underlying devices, but I might be wrong).
Is this behavior also always the case for fs operations? Or does the low
rewrite speed have to do with something else?
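To make the question concrete, here is a toy model of the read-before-write effect I have in mind. It is only a sketch, not kernel code: ToyBlockDevice, uncached_write and count_io are made-up names, and it assumes nothing is cached between writes, which is the worst case (a real buffer cache would normally keep the block around after the first read).

```python
# Toy model (NOT kernel code): count the block reads that small writes
# trigger when the device block size is larger than the write size.
# Assumption: nothing is cached between writes (worst case).

class ToyBlockDevice:
    def __init__(self, size, blksize):
        self.data = bytearray(size)
        self.blksize = blksize
        self.reads = 0    # blocks read from the "disk"
        self.writes = 0   # blocks written to the "disk"

    def read_block(self, n):
        self.reads += 1
        start = n * self.blksize
        return bytearray(self.data[start:start + self.blksize])

    def write_block(self, n, buf):
        self.writes += 1
        start = n * self.blksize
        self.data[start:start + self.blksize] = buf


def uncached_write(dev, offset, payload):
    """Write payload at offset, reading a block first only if it is
    going to be partially overwritten."""
    pos = 0
    while pos < len(payload):
        blk, inner = divmod(offset + pos, dev.blksize)
        n = min(dev.blksize - inner, len(payload) - pos)
        if n == dev.blksize:
            # Whole block overwritten: no need to read the old contents.
            buf = bytearray(payload[pos:pos + n])
        else:
            # Partial block: fetch the old contents, then patch them.
            buf = dev.read_block(blk)
            buf[inner:inner + n] = payload[pos:pos + n]
        dev.write_block(blk, buf)
        pos += n


def count_io(blksize, write_size=512, total=8192):
    """Return (block reads, block writes) for sequential small writes."""
    dev = ToyBlockDevice(total, blksize)
    for off in range(0, total, write_size):
        uncached_write(dev, off, b"x" * write_size)
    return dev.reads, dev.writes


if __name__ == "__main__":
    for bs in (512, 1024, 4096):
        print(bs, count_io(bs))
```

In this model, 512-byte writes to a 512-byte-block device need no reads at all, while the same writes to a 1024- or 4096-byte-block device cost one block read per write.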
> i think you are hitting hardware limits. First, getting 85.3 MB/sec out of
> your RAID0 array isnt all that bad :) But CPU load seems to be pretty
> high, that could already be a limit. Also, DMA load is probably very high
I am going to try with raw IO (with the sct patch) to see if it's the
memcpy_to_fs() (or 2.x equivalent) which is responsible for the slowdown.
I am also going to try with two QLOGIC ISP1080 since they seem even faster
than the AIC7895 which was already quite fast.
But you are right, it could be DMA contention (except that this
motherboard, an L440GX+, has two PCI buses, very fast memory, and an AGP
memory bus). Also I could try 66 MHz PCI, but I haven't got any HA for this
yet :)
> (and coming from several devices) as well. It would be interesting to
> check out the very same benchmarks with an identical but higher-clocked
> CPU, to see how much the saturation point depends on CPU speed. (this
> might not be possible with your system i guess)
In your opinion, could a dual-CPU system help in any way if the speed
saturation is due to memcpy_to_fs(), assuming a single Bonnie process?
I would think not.