On Thu, 24 May 2012, Edgar Fuß wrote:

> > Keep in mind mpt uses a rather inefficient communication protocol and does
> > tagged queuing.
> You mean the protocol the main CPU uses to communicate with an MPT adapter is
> inefficient? Or do you mean SAS is inefficient?

The protocol used to communicate between the CPU and the adapter is
inefficient; it is not well designed. They redesigned it for SAS2.

> > The former means the overhead for each command is not so good, but the
> > latter means it can keep lots of commands in the air at the same time.
> I'm sorry, I'm unable to conclude why this explains my results.

dd will send the kernel individual write operations. sd and physio() will
break them up into MAXPHYS chunks. Each chunk will be queued at the HBA.
The HBA will dispatch them all as fast as it can. Tagged queuing will
overlap them.

With smaller transfers, the per-command setup overhead becomes significant
and you see poor performance. With large transfers (larger than MAXPHYS)
the writes are split up into MAXPHYS chunks and the disk handles them in
parallel, hence the performance increase even beyond MAXPHYS.

Also keep in mind: When using the block device, the data is copied from the
process buffer into the buffer cache and the I/O happens from the buffer
cache pages. When using the raw device, the I/O happens directly from
process memory, with no copying involved.

Eduardo
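
The splitting described above can be illustrated with a rough sketch. This is a toy model, not the kernel code: the 64 KiB value for MAXPHYS is an assumption (the real limit depends on the platform and driver), and the names split_transfer and commands_for are hypothetical helpers invented for this example.

```python
# Toy model of how one write is broken into MAXPHYS-sized chunks.
# MAXPHYS = 64 KiB is an assumed value; the real limit is platform-dependent.
MAXPHYS = 64 * 1024


def split_transfer(offset, length, maxphys=MAXPHYS):
    """Return (offset, size) pairs covering one write, each at most maxphys bytes."""
    chunks = []
    end = offset + length
    while offset < end:
        size = min(maxphys, end - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


def commands_for(total_bytes, block_size, maxphys=MAXPHYS):
    """Count the commands queued at the HBA when total_bytes is written
    as a sequence of block_size writes (hypothetical helper)."""
    count = 0
    done = 0
    while done < total_bytes:
        this_write = min(block_size, total_bytes - done)
        count += len(split_transfer(done, this_write, maxphys))
        done += this_write
    return count
```

Under these assumptions, writing 1 MiB with 512-byte writes costs 2048 commands, each paying the per-command overhead, while 64 KiB writes need only 16. A single 1 MiB write also becomes 16 commands, but all of them come from one request and can be queued at once, which is where tagged queuing can overlap them.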