:Paul Richards said in "Re: patches for test / review":
:
:> Richard, do you want to post a summary of your tests?
:
:Well I'd best post the working draft of my report on the issues
:I've seen, as I'm not going to have time to work on it in the near
:future, and it raises serious performance issues that are best
:looked at soon. Note none of these detailed results are from
:current, but Paul Richards has checked that these issues are still
:present in current.
:
: (lots of good stuff)
Interesting. The behavior is probably closely related to the
write-behind methodology that UFS uses.

A while back while fixing an O(N^2) degenerate condition in the buffer
cache queueing code, DG and I had a long discussion of the write_behind
behavior. I added a sysctl to 4.x that changes the write_behind
behavior:
sysctl vfs.write_behind
0 Turned off
1 Normal (default)
2 Backed off
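
For reference, the knob can also be read and set programmatically.
This is a hypothetical FreeBSD-only sketch using sysctlbyname(3),
mirroring what the sysctl(8) command does (it is not code from the
benchmark or the kernel patch):

```c
/* FreeBSD-only sketch: read and set vfs.write_behind via
 * sysctlbyname(3).  Setting the value requires root, just as
 * with sysctl -w. */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int main(void)
{
    int old, newval = 0;          /* 0 = write-behind turned off */
    size_t len = sizeof(old);

    /* fetch the old value and install the new one in a single call */
    if (sysctlbyname("vfs.write_behind", &old, &len,
                     &newval, sizeof(newval)) != 0) {
        perror("sysctlbyname");
        return 1;
    }
    printf("vfs.write_behind: %d -> %d\n", old, newval);
    return 0;
}
```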

It would be interesting to see how the benchmark performs with
write_behind turned off (set to 0). Note that a setting of 2 is
highly experimental and will probably suffer from the same problem(s)
that normal mode suffers from (see below; I ran the benchmark).

In general, turning off write behind is *NOT* a good idea: it lets
the buffer cache saturate with dirty blocks, which can lead to
seriously degraded performance on a normal system due to write
hogging. On the flip side, this was all before I put in the new
buffer cache flushing code, so it is possible that 4.x will not
degrade as seriously with write behind turned off. I haven't run
saturation tests recently with write_behind turned off.

A secondary issue (actually the reason *why* performance is so bad)
is that the buffer cache nominally locks the underlying VM pages when
issuing a write, and this is almost certainly the cause of the
program stalls. When a program writes a piece of data (and I/O is
started immediately) and then reads it back later on, the read
operation may stall, even though the data is in the cache, because
the write has not yet completed. The write operation might also
stall if another nearby write is in progress (I'm not sure on that
last point).
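
The write-then-read-back pattern just described can be sketched as
follows. This is NOT the actual seekreadwrite benchmark source (see
the links in Richard's report for that); it is only a minimal,
hypothetical illustration of the access pattern that can trip over
the locked VM pages:

```c
/* Hypothetical sketch of a seek/write/read-back loop over a small
 * file.  With write-behind enabled the pwrite() may start I/O at
 * once, and the immediate pread() of the same block is where the
 * stall described above can occur. */
#include <sys/types.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLKSZ 512
#define NBLKS 64               /* keep the file small, revisit blocks */

/* Returns 0 on success, -1 on any I/O error or data mismatch. */
static int run_bench(const char *path, int iters)
{
    char wbuf[BLKSZ], rbuf[BLKSZ];
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    memset(wbuf, 'x', sizeof(wbuf));
    for (int i = 0; i < iters; i++) {
        off_t off = (off_t)(i % NBLKS) * BLKSZ;
        /* write a block; write-behind may start the I/O immediately */
        if (pwrite(fd, wbuf, BLKSZ, off) != BLKSZ)
            goto fail;
        /* read it straight back; this read can stall on the VM pages
         * locked by the still-in-flight write */
        if (pread(fd, rbuf, BLKSZ, off) != BLKSZ)
            goto fail;
        if (memcmp(wbuf, rbuf, BLKSZ) != 0)
            goto fail;
    }
    close(fd);
    unlink(path);
    return 0;
fail:
    close(fd);
    unlink(path);
    return -1;
}
```

A main() wrapper parsing argv, as in "time ./seekreadwrite xxx
10000", would simply call run_bench("xxx", 10000).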

Kirk has made significant improvements to stalls related to bitmap
operations; I'm not sure whether softupdates must be turned on to
get these improvements. The data blocks can still stall, though;
part of the plan for later this year is to fix that too.
:The benchmark program source code is available, and easy to run,
:the bottom of the report has links.
test3:/test/tmp# sysctl -w vfs.write_behind=0 (turned off)
test3:/test/tmp# time ./seekreadwrite xxx 10000
0.125u 0.807s 0:00.93 98.9% 5+181k 0+0io 0pf+0w
test3:/test/tmp# sysctl -w vfs.write_behind=1 (normal)
test3:/test/tmp# time ./seekreadwrite xxx 10000
0.040u 1.709s 0:32.57 5.3% 4+174k 0+8750io 0pf+0w
:I also have a range of results from an ATA (IDE) cheap deskside
:Dell system running FreeBSD 3.3-RELEASE, with a range of wd(4)
:flags. This system exhibits much better performance than the SCSI
:systems above at this benchmark, perhaps related to better DMA
:ability.
:
:ATA being faster than SCSI on this benchmark is a bit of a side-issue
:to the thrust of this report, but the performance numbers may give
:hints diagnosing the problem.
IDE drives sometimes appear to be faster because they fake the
write-completion response (they return the response prior to the
write actually completing). It could also simply be that the
lack of any real mixed I/O (due to the file being so small) is
a slightly faster operation on an IDE drive. I wouldn't read much
into it... where SCSI really shines is in more heavily loaded
environments.
-Matt
Matthew Dillon
<[EMAIL PROTECTED]>
:Thanks,
: Richard
:-
:Richard Wendland [EMAIL PROTECTED]
To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message