As previously mentioned, I have been investigating performance issues with 9pfs. Raw file read/write performance of 9pfs is actually quite good, provided that the client picked a reasonably high msize (maximum message size). For that reason I would recommend logging a warning on the 9p server side if a client attaches with a small msize that would cause performance issues.
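
To illustrate the kind of server-side msize check suggested above, here is a minimal standalone sketch. It is not taken from the series and does not use QEMU's logging API: check_msize(), MsizeVerdict and the 64 KiB warning threshold are hypothetical names and values chosen for illustration only. The 4096 lower bound is the one introduced by patch 2.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Lower bound required by patch 2 of this series. */
#define MSIZE_MIN 4096u
/* HYPOTHETICAL warning threshold (64 KiB), picked only for this sketch. */
#define MSIZE_WARN_BELOW 65536u

typedef enum { MSIZE_REJECT, MSIZE_WARN, MSIZE_OK } MsizeVerdict;

/*
 * Classify the msize a client proposed when attaching.
 * Prints a message to stderr for rejected or suspiciously small values;
 * a real server would use its own logging facility instead.
 */
static MsizeVerdict check_msize(uint32_t msize)
{
    if (msize < MSIZE_MIN) {
        fprintf(stderr, "9p: msize %" PRIu32 " is below the minimum of %u\n",
                msize, MSIZE_MIN);
        return MSIZE_REJECT;
    }
    if (msize < MSIZE_WARN_BELOW) {
        fprintf(stderr, "9p: warning: msize %" PRIu32
                " is small and may degrade performance\n", msize);
        return MSIZE_WARN;
    }
    return MSIZE_OK;
}
```

With these assumed values, check_msize(8192) would be accepted with a warning, whereas check_msize(1024) would be rejected.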

However, there are other aspects where 9pfs currently performs suboptimally. In particular, readdir handling is extremely slow: a simple readdir request from a guest typically blocks for several hundred milliseconds or even several seconds, no matter how powerful the underlying hardware is. The reason for this performance issue is latency. Currently 9pfs dispatches a T_readdir request back and forth between the main I/O thread and a background I/O thread numerous times; in fact it hops between threads multiple times for every single directory entry during T_readdir request handling, which adds up to a huge latency for a single T_readdir request.

This patch series aims to address this severe performance issue of 9pfs T_readdir request handling. The actual performance fix is patch 10. I also provided a convenient benchmark for comparing the performance improvements by using the 9pfs "synth" driver (see patch 8 for instructions on how to run the benchmark), so no guest OS installation is required to perform this A/B benchmark comparison. With patch 10 I achieved a performance improvement of factor 40 on my test machine.

** NOTE: ** As outlined by patch 7, there seems to be an outstanding issue (both with the current, unoptimized readdir code and with the new, optimized readdir code) causing a transport error with split readdir requests. This issue only occurs if patch 7 is applied. I haven't investigated the cause of this issue yet; it looks like a memory issue though. I am not sure whether it is a problem with the actual 9p server or rather "just" with the test environment. Apart from that issue, split readdir seems to work well with the new performance-optimized readdir code.

v2->v3:

  * NEW patch: require msize >= 4096 [patch 2].
  * Shortened commit log message [patch 3] (since the previously mentioned issue is now addressed by new patch 2).
  * Merged the previous 2 test case patches into one -> [patch 5] (since trivial enough for one patch).
  * Fixed code style issue [patch 5].
  * Fixed memory leak in test case [patch 5] (missing v9fs_req_free() in v9fs_rreaddir()).
  * NEW patch: added split readdir test [patch 6].
  * NEW patch: failing split readdir issue [patch 7] (see issue description above).
  * Adjusted commit log message [patch 9] (noting that this patch would break the new split readdir test).
  * Fixed comment in code [patch 10].

Christian Schoenebeck (11):
  tests/virtio-9p: add terminating null in v9fs_string_read()
  9pfs: require msize >= 4096
  9pfs: validate count sent by client with T_readdir
  hw/9pfs/9p-synth: added directory for readdir test
  tests/virtio-9p: added readdir test
  tests/virtio-9p: added splitted readdir test
  tests/virtio-9p: failing splitted readdir test
  9pfs: readdir benchmark
  hw/9pfs/9p-synth: avoid n-square issue in synth_readdir()
  9pfs: T_readdir latency optimization
  hw/9pfs/9p.c: benchmark time on T_readdir request

 hw/9pfs/9p-synth.c     |  48 ++++++-
 hw/9pfs/9p-synth.h     |   5 +
 hw/9pfs/9p.c           | 163 +++++++++++++----------
 hw/9pfs/9p.h           |  34 +++++
 hw/9pfs/codir.c        | 183 ++++++++++++++++++++++++--
 hw/9pfs/coth.h         |   3 +
 tests/virtio-9p-test.c | 287 ++++++++++++++++++++++++++++++++++++++++-
 7 files changed, 640 insertions(+), 83 deletions(-)

-- 
2.20.1