> 
> Zhihui Zhang wrote:
> 
> <snip>
> 
>> ... I also do not read anything during the partial block write,
>> and I think the disk controller should not do that either.
> 
> If you do a partial block write, surely at some point the block must be read
> in order to preserve that segment of data you are _not_ overwriting?

This was *exactly* my experience in FreeBSD 3.2, which was the last time I
looked into this in detail.  Writing full blocks instead of partial blocks
was at least an order of magnitude faster.  (By "blocks" here I mean the
block size the filesystem was formatted with, the -b parameter to newfs.)
I found that a filesystem formatted as -b8192 -f8192 performed so much
faster than the usual -b8192 -f1024 that it was well worth taking the hit
in wasted allocation space for small files.
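
To put a rough number on that "wasted allocation space" hit, here is a small
sketch in Python.  It assumes a deliberately simplified model in which a file
occupies a whole number of fragments; it ignores inodes, indirect blocks, and
the real FFS rules about when fragments are used.  The function names are made
up for illustration.

```python
def allocated_bytes(file_size, frag_size):
    """Round file_size up to a whole number of fragments (simplified model)."""
    if file_size == 0:
        return 0
    frags = -(-file_size // frag_size)  # ceiling division
    return frags * frag_size


def wasted_bytes(file_size, frag_size):
    """Bytes allocated but not used by file data under the simplified model."""
    return allocated_bytes(file_size, frag_size) - file_size


if __name__ == "__main__":
    # Compare the usual -f1024 against -f8192 for a few small file sizes.
    for size in (100, 1500, 8192, 9000):
        print(size, wasted_bytes(size, 1024), wasted_bytes(size, 8192))
```

Under this model a 100-byte file wastes 924 bytes with 1024-byte fragments
and 8092 bytes with 8192-byte fragments, so the cost is only significant when
the filesystem holds a lot of small files.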

When I instrumented code in various places to try to track down why there
was such a huge difference when fragsize != blocksize I found that the
killer was repeated read-modify-write cycles, especially on filesystem
metadata.  Creating a file and writing a few bytes to it could result in
dozens of blocks read then written, and some of the blocks got re-read
several times in the process.  It was always a mystery to me why the same
sectors would get read over and over again (isn't that what buffer and
filesystem caches are for?).  But I know for certain that the physical reads
were happening, because the instrumentation for that was in a custom RAID
driver of our own.
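
For anyone who hasn't watched this happen, here is a toy model of the
read-modify-write cycle in Python.  It is only a sketch: CountingDisk and
write_range are hypothetical names, there is no cache at all, and none of
this is the actual FreeBSD code path or our driver's instrumentation.  It
just counts the physical I/O that a partial-block write forces compared to
a full-block write.

```python
class CountingDisk:
    """Toy block device that only transfers whole blocks and counts I/O."""

    def __init__(self, block_size, num_blocks):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.reads = 0
        self.writes = 0

    def read_block(self, n):
        self.reads += 1
        return self.blocks[n]

    def write_block(self, n, data):
        assert len(data) == self.block_size
        self.writes += 1
        self.blocks[n] = bytes(data)


def write_range(disk, offset, data):
    """Write data at a byte offset, doing read-modify-write where needed."""
    bs = disk.block_size
    pos = 0
    while pos < len(data):
        block_no, start = divmod(offset + pos, bs)
        chunk = data[pos:pos + bs - start]
        if len(chunk) == bs:
            # Full, aligned block: nothing to preserve, so no read is needed.
            disk.write_block(block_no, chunk)
        else:
            # Partial block: read it first to keep the bytes we don't touch.
            buf = bytearray(disk.read_block(block_no))
            buf[start:start + len(chunk)] = chunk
            disk.write_block(block_no, buf)
        pos += len(chunk)


if __name__ == "__main__":
    disk = CountingDisk(8192, 4)
    write_range(disk, 0, b"x" * 8192)   # full block
    print(disk.reads, disk.writes)
    write_range(disk, 100, b"abc")      # partial block
    print(disk.reads, disk.writes)
```

With no cache in the model, every partial write to the same block costs
another physical read, which is the pattern I kept seeing.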

But FreeBSD 3.2 is ancient history now, and I have no idea whether
filesystem performance is still this bad (and surely softupdates would
ameliorate this problem anyway).  Also, this may not be relevant to Zhihui
Zhang's situation, because filesystem behavior is probably different from
working directly with the /dev/daxxxx device.  (Or maybe not; I guess there
must be an implied blocksize from an incore disklabel or something.)

It would be interesting to see if formatting a filesystem with blocksize ==
fragsize still makes a big difference in performance these days.  But I
remember that all the instrumentation I had to do last time to prove the
read-modify-write was happening was a BIG hassle, and nobody is paying me
to do it anymore.  :-)


-- Ian


