Vu Pham wrote:
Cameron Harr wrote:
One thing that makes the results hard to interpret is that they vary
enormously. I've been doing more testing with 3 physical LUNs
(instead of two) on the target, srpt_thread=0, and changing between
scst_threads=[1,2,3]. With scst_threads=1, IOPS are fairly low (~50K),
while at 2 and 3 threads the results are higher, though in all
cases the context-switch rate is low, often less than 1:1 (fewer than
one context switch per I/O).
Can you test again with srpt_thread=0,1 and scst_threads=1,2,3 in
NULLIO mode (exporting 1, 2, and 3 NULLIO LUNs)?
srpt_thread=0:
scst_t: |      1      |      2      |      3      |
--------|-------------|-------------|-------------|
1 LUN*  |     54K     |   54K-75K   |   54K-75K   |
2 LUNs* |  120K-200K  | 150K-200K** | 120K-180K** |
3 LUNs* |  170K-195K  |  160K-195K  | 130K-170K** |
srpt_thread=1:
scst_t: |      1      |      2      |      3      |
--------|-------------|-------------|-------------|
1 LUN*  |     74K     |     54K     |     55K     |
2 LUNs* |  140K-190K  |  130K-200K  |  150K-220K  |
3 LUNs* |  170K-195K  |  170K-195K  |  175K-195K  |
* One FIO (benchmark) process was run per LUN, so when there were three
LUNs, there were three FIO processes running simultaneously (see the
rough launcher sketch after the summary below).
** Sometimes the benchmark "zombied" (the process did no work but
couldn't be killed) after running for a while. It wasn't reproducible
in a reliable way, so the ** just marks that this particular
configuration has zombied at least once.
- Note 1: There were a number of outliers (often between 98K and 230K),
but I tried to capture where the bulk of the activity happened. It's
still somewhat of a rough guess, though. Where the range is large, it
usually means the results were just really scattered.
Summary: It's hard to draw firm conclusions because of the variation in
the results. I would say the runs with srpt_thread=1 tended to have
fewer outliers at the beginning, but as time went on they scattered as
well. Running with 2 or 3 threads almost seems to be a toss-up.
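
For reference, here is a rough sketch of how the per-LUN FIO processes
could be launched. The device paths and the fio options (bs=512,
direct=1, random writes, queue depth, runtime) are assumptions for
illustration, not the exact job parameters used in the runs above.

    #!/usr/bin/env python
    # Launch one fio worker per SRP-attached LUN, all at once.
    import subprocess

    LUNS = ["/dev/sdb", "/dev/sdc", "/dev/sdd"]   # hypothetical SRP LUNs

    def start_fio(dev):
        """Start one fio process doing 512-byte direct random writes on dev."""
        cmd = [
            "fio",
            "--name=lun-%s" % dev.split("/")[-1],
            "--filename=%s" % dev,
            "--rw=randwrite",
            "--bs=512",          # 512-byte requests, as in the test description
            "--direct=1",        # O_DIRECT, bypassing the initiator page cache
            "--ioengine=libaio",
            "--iodepth=32",      # assumed queue depth
            "--runtime=60",
            "--time_based",
        ]
        return subprocess.Popen(cmd)

    procs = [start_fio(dev) for dev in LUNS]
    for p in procs:
        p.wait()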
Also a little disconcerting is that my average request size on the
target has gotten larger. I'm always writing 512B requests, and when I
run on one initiator, the average request size is around 600-800B. When
I add an initiator, the average request size basically doubles and is
now around 1200-1600B. I'm specifying direct I/O in the test and SCST
is configured as blockio (and thus direct I/O), but it appears
something is cached at some point and the requests seem to be coalesced
when another initiator is involved. Does this seem odd or normal? This
holds true whether the initiators are writing to different partitions
on the same LUN or to the same LUN with no partitions.
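
One way to see where the coalescing happens would be to watch the block
layer's merge counters and the average request size straight from
/proc/diskstats on the target while the test runs. A minimal sketch; the
device name is a placeholder and 512-byte sectors are assumed:

    #!/usr/bin/env python
    # Sample /proc/diskstats twice and report the average request size plus
    # how many requests the block layer merged during the interval.
    import time

    DEV = "sdb"          # hypothetical backing device of the exported LUN
    INTERVAL = 5         # seconds between samples

    def snapshot(dev):
        """Return (reads, reads_merged, sectors_read, writes, writes_merged, sectors_written)."""
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                if len(fields) > 9 and fields[2] == dev:
                    v = [int(x) for x in fields[3:10]]
                    return (v[0], v[1], v[2], v[4], v[5], v[6])
        raise ValueError("device %s not found in /proc/diskstats" % dev)

    a = snapshot(DEV)
    time.sleep(INTERVAL)
    b = snapshot(DEV)
    reads, r_merged, r_sectors, writes, w_merged, w_sectors = \
        [y - x for x, y in zip(a, b)]

    ios = reads + writes
    if ios:
        avg_bytes = 512.0 * (r_sectors + w_sectors) / ios
        print("avg request size: %.0f bytes over %d I/Os" % (avg_bytes, ios))
        print("merged by block layer: %d reads, %d writes" % (r_merged, w_merged))
    else:
        print("no completed I/Os in the interval")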
What I/O scheduler are you running on the local storage? Since you are
using blockio, you should play around with the I/O scheduler's tunable
parameters (for the deadline scheduler, for example: front_merges,
writes_starved, ...). Please see ~/Documentation/block/*.txt
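
As a rough illustration of that suggestion, something along these lines
(run as root) switches a device to the deadline scheduler and pokes a
couple of its tunables through sysfs; the device name and the values
are placeholders, not recommendations:

    #!/usr/bin/env python
    # Select the deadline elevator for one device and adjust its tunables.
    import os

    DEV = "sdb"                                   # hypothetical backing device
    QUEUE = "/sys/block/%s/queue" % DEV

    def write_sysfs(path, value):
        with open(path, "w") as f:
            f.write(str(value))

    # Pick the deadline scheduler for this device.
    write_sysfs(os.path.join(QUEUE, "scheduler"), "deadline")

    # Deadline tunables appear under queue/iosched/ once it is active.
    write_sysfs(os.path.join(QUEUE, "iosched", "front_merges"), 1)
    write_sysfs(os.path.join(QUEUE, "iosched", "writes_starved"), 4)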
I'm using CFQ. Months ago I tried different schedulers with their
default options and saw basically no difference. I can try some of that
again; however, I don't believe I can tune the schedulers because my
back end doesn't give me a "queue" directory in /sys/block/<dev>/.
-Cameron