Cameron Harr wrote:
New results, with markers.
----
type=randwrite bs=512 drives=1 scst_threads=1 srptthread=1 iops=65612.40
type=randwrite bs=4k drives=1 scst_threads=1 srptthread=1 iops=54934.31
type=randwrite bs=512 drives=2 scst_threads=1 srptthread=1 iops=82514.57
type=randwrite bs=4k drives=2 scst_threads=1 srptthread=1 iops=79680.42
type=randwrite bs=512 drives=1 scst_threads=2 srptthread=1 iops=60439.73
type=randwrite bs=4k drives=1 scst_threads=2 srptthread=1 iops=51510.68
type=randwrite bs=512 drives=2 scst_threads=2 srptthread=1 iops=102735.07
type=randwrite bs=4k drives=2 scst_threads=2 srptthread=1 iops=78558.77
type=randwrite bs=512 drives=1 scst_threads=3 srptthread=1 iops=62941.35
type=randwrite bs=4k drives=1 scst_threads=3 srptthread=1 iops=51924.17
type=randwrite bs=512 drives=2 scst_threads=3 srptthread=1 iops=120961.39
type=randwrite bs=4k drives=2 scst_threads=3 srptthread=1 iops=75411.52
type=randwrite bs=512 drives=1 scst_threads=1 srptthread=0 iops=50891.13
type=randwrite bs=4k drives=1 scst_threads=1 srptthread=0 iops=50199.90
type=randwrite bs=512 drives=2 scst_threads=1 srptthread=0 iops=58711.87
type=randwrite bs=4k drives=2 scst_threads=1 srptthread=0 iops=74504.65
type=randwrite bs=512 drives=1 scst_threads=2 srptthread=0 iops=61043.73
type=randwrite bs=4k drives=1 scst_threads=2 srptthread=0 iops=49951.89
type=randwrite bs=512 drives=2 scst_threads=2 srptthread=0 iops=83195.60
type=randwrite bs=4k drives=2 scst_threads=2 srptthread=0 iops=75224.25
type=randwrite bs=512 drives=1 scst_threads=3 srptthread=0 iops=60277.98
type=randwrite bs=4k drives=1 scst_threads=3 srptthread=0 iops=49874.57
type=randwrite bs=512 drives=2 scst_threads=3 srptthread=0 iops=84851.43
type=randwrite bs=4k drives=2 scst_threads=3 srptthread=0 iops=73238.46
I think srptthread=0 performs worse in this case because, with it, part of the processing is done in SIRQ, and it seems the scheduler runs that processing on the same CPU as fct0-worker, the thread that does the data transfer to your SSD device. Since that thread always consumes about 100% CPU, the work sharing its CPU gets less CPU time, hence lower overall performance.
So, try binding fctX-worker, the SCST threads and SIRQ processing to different CPUs and check again. You can set thread affinity using the utility from http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/; for how to set IRQ affinity, see Documentation/IRQ-affinity.txt in your kernel tree.
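If you would rather script the binding than use the utility, something like the rough Python sketch below should do the same job on Linux. It is only an illustration: the helper names (cpu_mask, pin_task, pin_irq) are made up here, the PIDs and the IRQ number in the usage comment are placeholders you would have to look up on your own system (e.g. with ps and /proc/interrupts), and the mask format assumes a machine with no more than 32 CPUs. It relies on os.sched_setaffinity() for threads and on writing a hex mask to /proc/irq/<n>/smp_affinity for interrupts, which needs root. As far as I know, SIRQ work normally runs on the CPU that handled the hard IRQ, so moving the IRQ should move the SIRQ processing along with it.

```python
import os


def cpu_mask(cpus):
    """Return the hex bitmask string for a list of CPU numbers.

    This is the format /proc/irq/<n>/smp_affinity accepts
    (assumption: no more than 32 CPUs, so a single hex word suffices).
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def pin_task(pid, cpus):
    """Restrict a process or kernel thread, given by PID, to the listed CPUs."""
    os.sched_setaffinity(pid, set(cpus))


def pin_irq(irq, cpus):
    """Route a hardware interrupt to the listed CPUs (requires root)."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpu_mask(cpus))


# Example layout (all numbers below are placeholders, not real values):
#   pin_task(1234, [0])   # PID of fct0-worker     -> CPU 0
#   pin_task(1240, [1])   # PID of an SCST thread  -> CPU 1
#   pin_irq(58, [2])      # IRQ of the HCA         -> CPU 2
```

After applying it, you can confirm the result with os.sched_getaffinity(pid) for a thread, or by reading /proc/irq/<n>/smp_affinity back for an interrupt.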
Vlad
