Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Tue, 15 Dec 2020 12:33:34 +0100 Halil Pasic wrote:
> On Tue, 15 Dec 2020 09:26:56 +0100
> Cornelia Huck wrote:
>
> > Do we have a better idea now about which values would make sense here?
>
> Hi Conny!
>
> I have nothing new since then. Capping at 4 queues still looks like a
> reasonable compromise to me. @Mimu: anything new since then?
>
> If you like I can spin a new version (we need compat handling now).
>
> Halil

If your tests show that 4 queues are a reasonable value, this is fine by
me (with an explanatory comment in the code). Feel free to respin.
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Tue, 15 Dec 2020 09:26:56 +0100 Cornelia Huck wrote:
> On Tue, 10 Nov 2020 14:18:39 +0100
> Christian Borntraeger wrote:
>
> > On 10.11.20 11:40, Halil Pasic wrote:
> > > On Tue, 10 Nov 2020 09:47:51 +0100
> > > Christian Borntraeger wrote:
> > >
> > > > On 09.11.20 19:53, Halil Pasic wrote:
> > > > > On Mon, 9 Nov 2020 17:06:16 +0100
> > > > > Cornelia Huck wrote:
> > > > >
> > > > > > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > > > > > {
> > > > > > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > > > > > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > > > > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > > > > > +
> > > > > > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > > > > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > > > > > +    }
> > > > > >
> > > > > > I would like to have a comment explaining the numbers here,
> > > > > > however.
> > > > > >
> > > > > > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus
> > > > > > if possible, apply some other capping). 4 seems to be a bit
> > > > > > arbitrary without explanation, although I'm sure you did some
> > > > > > measurements :)
> > > > >
> > > > > Frankly, I don't have any measurements yet. For the secure case,
> > > > > I think Mimu has assessed the impact of multiqueue, hence adding
> > > > > Mimu to the cc list. @Mimu can you help us out.
> > > > >
> > > > > Regarding the normal non-protected VMs I'm in the middle of
> > > > > producing some measurement data. This was admittedly a bit
> > > > > rushed because of where we are in the cycle. Sorry to disappoint
> > > > > you.
> > > > >
> > > > > The number 4 was suggested by Christian, maybe Christian does
> > > > > have some readily available measurement data for the normal VM
> > > > > case. @Christian: can you help me out?
> > > >
> > > > My point was to find a balance between performance gain and memory
> > > > usage. As a matter of fact, virtqueues do consume memory. So 4
> > > > looked like a reasonable default for me for large guests as long
> > > > as we do not have directed interrupts.
> > > >
> > > > Now, thinking about this again: If we want to change the default
> > > > to something else in the future (e.g. to num vcpus) then the
> > > > compat handling will get really complicated.
> > >
> > > Regarding compat handling, I believe we would need a new property
> > > for virtio-blk-ccw: something like def_num_queues_max. Then the
> > > logic would morph to MIN(def_num_queues_max,
> > > current_machine->smp.cpus), and we could relatively freely do
> > > compat stuff on def_num_queues_max.
> > >
> > > IMHO not pretty but certainly doable.
> > >
> > > > So we can
> > > > - go with num queues = num cpus. But this will consume memory
> > > >   for guests with lots of CPUs.
> > >
> > > In the absence of data that showcases the benefit outweighing the
> > > obvious detriment, I lean towards finding this option the least
> > > favorable.
> > >
> > > > - go with the proposed logic of min(4,vcpus) and accept that
> > > >   future compat handling is harder
> > >
> > > IMHO not a bad option, but I think I would still feel better about
> > > a more informed decision. In the end, the end user can already
> > > specify num_queues explicitly, so I don't think this is urgent.
> > >
> > > > - defer this change
> > >
> > > So I tend to lean towards deferring.
> >
> > Yes, I was pushing this for 5.2 to avoid compat handling. But maybe
> > it is better to wait and do it later. But we should certainly
> > continue the discussion to have something for the next release.
>
> Do we have a better idea now about which values would make sense here?

Hi Conny!

I have nothing new since then. Capping at 4 queues still looks like a
reasonable compromise to me. @Mimu: anything new since then?

If you like I can spin a new version (we need compat handling now).

Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Tue, 10 Nov 2020 14:18:39 +0100 Christian Borntraeger wrote:
> On 10.11.20 11:40, Halil Pasic wrote:
> > On Tue, 10 Nov 2020 09:47:51 +0100
> > Christian Borntraeger wrote:
> >
> > > On 09.11.20 19:53, Halil Pasic wrote:
> > > > On Mon, 9 Nov 2020 17:06:16 +0100
> > > > Cornelia Huck wrote:
> > > >
> > > > > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > > > > {
> > > > > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > > > > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > > > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > > > > +
> > > > > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > > > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > > > > +    }
> > > > >
> > > > > I would like to have a comment explaining the numbers here,
> > > > > however.
> > > > >
> > > > > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus
> > > > > if possible, apply some other capping). 4 seems to be a bit
> > > > > arbitrary without explanation, although I'm sure you did some
> > > > > measurements :)
> > > >
> > > > Frankly, I don't have any measurements yet. For the secure case,
> > > > I think Mimu has assessed the impact of multiqueue, hence adding
> > > > Mimu to the cc list. @Mimu can you help us out.
> > > >
> > > > Regarding the normal non-protected VMs I'm in the middle of
> > > > producing some measurement data. This was admittedly a bit rushed
> > > > because of where we are in the cycle. Sorry to disappoint you.
> > > >
> > > > The number 4 was suggested by Christian, maybe Christian does have
> > > > some readily available measurement data for the normal VM case.
> > > > @Christian: can you help me out?
> > >
> > > My point was to find a balance between performance gain and memory
> > > usage. As a matter of fact, virtqueues do consume memory. So 4
> > > looked like a reasonable default for me for large guests as long as
> > > we do not have directed interrupts.
> > >
> > > Now, thinking about this again: If we want to change the default to
> > > something else in the future (e.g. to num vcpus) then the compat
> > > handling will get really complicated.
> >
> > Regarding compat handling, I believe we would need a new property for
> > virtio-blk-ccw: something like def_num_queues_max. Then the logic
> > would morph to MIN(def_num_queues_max, current_machine->smp.cpus),
> > and we could relatively freely do compat stuff on def_num_queues_max.
> >
> > IMHO not pretty but certainly doable.
> >
> > > So we can
> > > - go with num queues = num cpus. But this will consume memory
> > >   for guests with lots of CPUs.
> >
> > In the absence of data that showcases the benefit outweighing the
> > obvious detriment, I lean towards finding this option the least
> > favorable.
> >
> > > - go with the proposed logic of min(4,vcpus) and accept that future
> > >   compat handling is harder
> >
> > IMHO not a bad option, but I think I would still feel better about a
> > more informed decision. In the end, the end user can already specify
> > num_queues explicitly, so I don't think this is urgent.
> >
> > > - defer this change
> >
> > So I tend to lean towards deferring.
>
> Yes, I was pushing this for 5.2 to avoid compat handling. But maybe it
> is better to wait and do it later. But we should certainly continue
> the discussion to have something for the next release.

Do we have a better idea now about which values would make sense here?

> > Another thought is, provided the load is about evenly spread on the
> > different virtqueues, if the game is about vCPU locality, one could
> > think of decreasing the size of each individual virtqueue while
> > increasing their number, with the idea of not paying much more in
> > terms of memory. The queue size however needs to be a power of 2, so
> > there is a limit on the granularity.
> >
> > Regarding secure VMs, currently we have to cram at least the swiotlb
> > and the virtqueues into ZONE_DMA. So increasing the number of
> > virtqueues heavily may get us into new trouble with exotic configs.
> >
> > Regards,
> > Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Wed, 11 Nov 2020 13:49:08 +0100 Michael Mueller wrote:
> Halil,
>
> still I would like to know what the exact memory consumption per queue
> is that you are talking about. Have you made a calculation? Thanks.

Hi!

The default queue size for virtio-blk seems to be 256 ring entries,
which translates to 6668 bytes for the split ring(s). The queue size is
user configurable, and the guest, in theory, can decide to use a
smaller queue. The indirect descriptors are allocated separately, and
bounced via swiotlb in the case of secure guests.

Regards,
Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On 11.11.20 13:38, Cornelia Huck wrote:
> On Wed, 11 Nov 2020 13:26:11 +0100
> Michael Mueller wrote:
>
> > On 10.11.20 15:16, Michael Mueller wrote:
> > >
> > > On 09.11.20 19:53, Halil Pasic wrote:
> > > > On Mon, 9 Nov 2020 17:06:16 +0100
> > > > Cornelia Huck wrote:
> > > >
> > > > > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > > > > {
> > > > > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > > > > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > > > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > > > > +
> > > > > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > > > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > > > > +    }
> > > > >
> > > > > I would like to have a comment explaining the numbers here,
> > > > > however.
> > > > >
> > > > > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus
> > > > > if possible, apply some other capping). 4 seems to be a bit
> > > > > arbitrary without explanation, although I'm sure you did some
> > > > > measurements :)
> > > >
> > > > Frankly, I don't have any measurements yet. For the secure case,
> > > > I think Mimu has assessed the impact of multiqueue, hence adding
> > > > Mimu to the cc list. @Mimu can you help us out.
> > > >
> > > > Regarding the normal non-protected VMs I'm in the middle of
> > > > producing some measurement data. This was admittedly a bit rushed
> > > > because of where we are in the cycle. Sorry to disappoint you.
> > >
> > > I'm talking with the perf team tomorrow. They have done some
> > > measurements with multiqueue for PV guests and I asked for a
> > > comparison to non PV guests as well.
> >
> > The perf team has performed measurements for us that show that a *PV
> > KVM guest* benefits in terms of throughput for random read, random
> > write and sequential read (no difference for sequential write) by a
> > multi queue setup. CPU costs are reduced as well due to reduced
> > spinlock contention.
>
> Just to be clear, that was with 4 queues?

Yes, we have seen it with 4 and also with 9 queues.

Halil, still I would like to know what the exact memory consumption per
queue is that you are talking about. Have you made a calculation?
Thanks.

> > For a *standard KVM guest* it currently has no throughput effect. No
> > benefit and no harm. I have asked them to finalize their measurements
> > by comparing the CPU cost as well. I will receive that information
> > on Friday.
>
> Thank you for checking!
>
> > Michael
> >
> > > Michael
> > >
> > > > The number 4 was suggested by Christian, maybe Christian does
> > > > have some readily available measurement data for the normal VM
> > > > case. @Christian: can you help me out?
> > > >
> > > > Regards,
> > > > Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Wed, 11 Nov 2020 13:26:11 +0100 Michael Mueller wrote:
> On 10.11.20 15:16, Michael Mueller wrote:
> >
> > On 09.11.20 19:53, Halil Pasic wrote:
> > > On Mon, 9 Nov 2020 17:06:16 +0100
> > > Cornelia Huck wrote:
> > >
> > > > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > > > {
> > > > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > > > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > > > +
> > > > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > > > +    }
> > > >
> > > > I would like to have a comment explaining the numbers here,
> > > > however.
> > > >
> > > > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus
> > > > if possible, apply some other capping). 4 seems to be a bit
> > > > arbitrary without explanation, although I'm sure you did some
> > > > measurements :)
> > >
> > > Frankly, I don't have any measurements yet. For the secure case,
> > > I think Mimu has assessed the impact of multiqueue, hence adding
> > > Mimu to the cc list. @Mimu can you help us out.
> > >
> > > Regarding the normal non-protected VMs I'm in the middle of
> > > producing some measurement data. This was admittedly a bit rushed
> > > because of where we are in the cycle. Sorry to disappoint you.
> >
> > I'm talking with the perf team tomorrow. They have done some
> > measurements with multiqueue for PV guests and I asked for a
> > comparison to non PV guests as well.
>
> The perf team has performed measurements for us that show that a *PV
> KVM guest* benefits in terms of throughput for random read, random
> write and sequential read (no difference for sequential write) by a
> multi queue setup. CPU costs are reduced as well due to reduced
> spinlock contention.

Just to be clear, that was with 4 queues?

> For a *standard KVM guest* it currently has no throughput effect. No
> benefit and no harm. I have asked them to finalize their measurements
> by comparing the CPU cost as well. I will receive that information on
> Friday.

Thank you for checking!

> Michael
>
> > Michael
> >
> > > The number 4 was suggested by Christian, maybe Christian does have
> > > some readily available measurement data for the normal VM case.
> > > @Christian: can you help me out?
> > >
> > > Regards,
> > > Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On 10.11.20 15:16, Michael Mueller wrote:
>
> On 09.11.20 19:53, Halil Pasic wrote:
> > On Mon, 9 Nov 2020 17:06:16 +0100
> > Cornelia Huck wrote:
> >
> > > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > > {
> > > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > > +
> > > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > > +    }
> > >
> > > I would like to have a comment explaining the numbers here,
> > > however.
> > >
> > > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus if
> > > possible, apply some other capping). 4 seems to be a bit arbitrary
> > > without explanation, although I'm sure you did some measurements :)
> >
> > Frankly, I don't have any measurements yet. For the secure case,
> > I think Mimu has assessed the impact of multiqueue, hence adding Mimu
> > to the cc list. @Mimu can you help us out.
> >
> > Regarding the normal non-protected VMs I'm in the middle of producing
> > some measurement data. This was admittedly a bit rushed because of
> > where we are in the cycle. Sorry to disappoint you.
>
> I'm talking with the perf team tomorrow. They have done some
> measurements with multiqueue for PV guests and I asked for a
> comparison to non PV guests as well.

The perf team has performed measurements for us that show that a *PV
KVM guest* benefits in terms of throughput for random read, random
write and sequential read (no difference for sequential write) by a
multi queue setup. CPU costs are reduced as well due to reduced
spinlock contention.

For a *standard KVM guest* it currently has no throughput effect. No
benefit and no harm. I have asked them to finalize their measurements
by comparing the CPU cost as well. I will receive that information on
Friday.

Michael

> Michael
>
> > The number 4 was suggested by Christian, maybe Christian does have
> > some readily available measurement data for the normal VM case.
> > @Christian: can you help me out?
> >
> > Regards,
> > Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On 09.11.20 19:53, Halil Pasic wrote:
> On Mon, 9 Nov 2020 17:06:16 +0100
> Cornelia Huck wrote:
>
> > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > {
> > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > +
> > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > +    }
> >
> > I would like to have a comment explaining the numbers here, however.
> >
> > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus if
> > possible, apply some other capping). 4 seems to be a bit arbitrary
> > without explanation, although I'm sure you did some measurements :)
>
> Frankly, I don't have any measurements yet. For the secure case,
> I think Mimu has assessed the impact of multiqueue, hence adding Mimu
> to the cc list. @Mimu can you help us out.
>
> Regarding the normal non-protected VMs I'm in the middle of producing
> some measurement data. This was admittedly a bit rushed because of
> where we are in the cycle. Sorry to disappoint you.

I'm talking with the perf team tomorrow. They have done some
measurements with multiqueue for PV guests and I asked for a comparison
to non PV guests as well.

Michael

> The number 4 was suggested by Christian, maybe Christian does have some
> readily available measurement data for the normal VM case. @Christian:
> can you help me out?
>
> Regards,
> Halil

--
Mit freundlichen Grüßen / Kind regards
Michael Müller

IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Gregor Pillen
Management: Dirk Wittkopp
Registered office: Böblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On 10.11.20 11:40, Halil Pasic wrote:
> On Tue, 10 Nov 2020 09:47:51 +0100
> Christian Borntraeger wrote:
>
> > On 09.11.20 19:53, Halil Pasic wrote:
> > > On Mon, 9 Nov 2020 17:06:16 +0100
> > > Cornelia Huck wrote:
> > >
> > > > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > > > {
> > > > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > > > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > > > +
> > > > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > > > +    }
> > > >
> > > > I would like to have a comment explaining the numbers here,
> > > > however.
> > > >
> > > > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus
> > > > if possible, apply some other capping). 4 seems to be a bit
> > > > arbitrary without explanation, although I'm sure you did some
> > > > measurements :)
> > >
> > > Frankly, I don't have any measurements yet. For the secure case,
> > > I think Mimu has assessed the impact of multiqueue, hence adding
> > > Mimu to the cc list. @Mimu can you help us out.
> > >
> > > Regarding the normal non-protected VMs I'm in the middle of
> > > producing some measurement data. This was admittedly a bit rushed
> > > because of where we are in the cycle. Sorry to disappoint you.
> > >
> > > The number 4 was suggested by Christian, maybe Christian does have
> > > some readily available measurement data for the normal VM case.
> > > @Christian: can you help me out?
> >
> > My point was to find a balance between performance gain and memory
> > usage. As a matter of fact, virtqueues do consume memory. So 4 looked
> > like a reasonable default for me for large guests as long as we do
> > not have directed interrupts.
> >
> > Now, thinking about this again: If we want to change the default to
> > something else in the future (e.g. to num vcpus) then the compat
> > handling will get really complicated.
>
> Regarding compat handling, I believe we would need a new property for
> virtio-blk-ccw: something like def_num_queues_max. Then the logic would
> morph to MIN(def_num_queues_max, current_machine->smp.cpus), and we
> could relatively freely do compat stuff on def_num_queues_max.
>
> IMHO not pretty but certainly doable.
>
> > So we can
> > - go with num queues = num cpus. But this will consume memory
> >   for guests with lots of CPUs.
>
> In the absence of data that showcases the benefit outweighing the
> obvious detriment, I lean towards finding this option the least
> favorable.
>
> > - go with the proposed logic of min(4,vcpus) and accept that future
> >   compat handling is harder
>
> IMHO not a bad option, but I think I would still feel better about a
> more informed decision. In the end, the end user can already specify
> num_queues explicitly, so I don't think this is urgent.
>
> > - defer this change
>
> So I tend to lean towards deferring.

Yes, I was pushing this for 5.2 to avoid compat handling. But maybe it
is better to wait and do it later. But we should certainly continue the
discussion to have something for the next release.

> Another thought is, provided the load is about evenly spread on the
> different virtqueues, if the game is about vCPU locality, one could
> think of decreasing the size of each individual virtqueue while
> increasing their number, with the idea of not paying much more in
> terms of memory. The queue size however needs to be a power of 2, so
> there is a limit on the granularity.
>
> Regarding secure VMs, currently we have to cram at least the swiotlb
> and the virtqueues into ZONE_DMA. So increasing the number of
> virtqueues heavily may get us into new trouble with exotic configs.
>
> Regards,
> Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Tue, 10 Nov 2020 09:47:51 +0100 Christian Borntraeger wrote:
> On 09.11.20 19:53, Halil Pasic wrote:
> > On Mon, 9 Nov 2020 17:06:16 +0100
> > Cornelia Huck wrote:
> >
> > > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > > {
> > > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > > +
> > > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > > +    }
> > >
> > > I would like to have a comment explaining the numbers here,
> > > however.
> > >
> > > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus if
> > > possible, apply some other capping). 4 seems to be a bit arbitrary
> > > without explanation, although I'm sure you did some measurements :)
> >
> > Frankly, I don't have any measurements yet. For the secure case,
> > I think Mimu has assessed the impact of multiqueue, hence adding Mimu
> > to the cc list. @Mimu can you help us out.
> >
> > Regarding the normal non-protected VMs I'm in the middle of producing
> > some measurement data. This was admittedly a bit rushed because of
> > where we are in the cycle. Sorry to disappoint you.
> >
> > The number 4 was suggested by Christian, maybe Christian does have
> > some readily available measurement data for the normal VM case.
> > @Christian: can you help me out?
>
> My point was to find a balance between performance gain and memory
> usage. As a matter of fact, virtqueues do consume memory. So 4 looked
> like a reasonable default for me for large guests as long as we do not
> have directed interrupts.
>
> Now, thinking about this again: If we want to change the default to
> something else in the future (e.g. to num vcpus) then the compat
> handling will get really complicated.

Regarding compat handling, I believe we would need a new property for
virtio-blk-ccw: something like def_num_queues_max. Then the logic would
morph to MIN(def_num_queues_max, current_machine->smp.cpus), and we
could relatively freely do compat stuff on def_num_queues_max.

IMHO not pretty but certainly doable.

> So we can
> - go with num queues = num cpus. But this will consume memory
>   for guests with lots of CPUs.

In the absence of data that showcases the benefit outweighing the
obvious detriment, I lean towards finding this option the least
favorable.

> - go with the proposed logic of min(4,vcpus) and accept that future
>   compat handling is harder

IMHO not a bad option, but I think I would still feel better about a
more informed decision. In the end, the end user can already specify
num_queues explicitly, so I don't think this is urgent.

> - defer this change

So I tend to lean towards deferring.

Another thought is, provided the load is about evenly spread on the
different virtqueues, if the game is about vCPU locality, one could
think of decreasing the size of each individual virtqueue while
increasing their number, with the idea of not paying much more in terms
of memory. The queue size however needs to be a power of 2, so there is
a limit on the granularity.

Regarding secure VMs, currently we have to cram at least the swiotlb
and the virtqueues into ZONE_DMA. So increasing the number of
virtqueues heavily may get us into new trouble with exotic configs.

Regards,
Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Tue, 10 Nov 2020 09:47:51 +0100 Christian Borntraeger wrote:
> On 09.11.20 19:53, Halil Pasic wrote:
> > On Mon, 9 Nov 2020 17:06:16 +0100
> > Cornelia Huck wrote:
> >
> > > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > > {
> > > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > > +
> > > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > > +    }
> > >
> > > I would like to have a comment explaining the numbers here,
> > > however.
> > >
> > > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus if
> > > possible, apply some other capping). 4 seems to be a bit arbitrary
> > > without explanation, although I'm sure you did some measurements :)
> >
> > Frankly, I don't have any measurements yet. For the secure case,
> > I think Mimu has assessed the impact of multiqueue, hence adding Mimu
> > to the cc list. @Mimu can you help us out.
> >
> > Regarding the normal non-protected VMs I'm in the middle of producing
> > some measurement data. This was admittedly a bit rushed because of
> > where we are in the cycle. Sorry to disappoint you.
> >
> > The number 4 was suggested by Christian, maybe Christian does have
> > some readily available measurement data for the normal VM case.
> > @Christian: can you help me out?
>
> My point was to find a balance between performance gain and memory
> usage. As a matter of fact, virtqueues do consume memory. So 4 looked
> like a reasonable default for me for large guests as long as we do not
> have directed interrupts.

Yes, 4 does not look like a bad number, but I still don't feel really
comfortable with it without at least some data. What about large guests
with slow vs. fast storage?

> Now, thinking about this again: If we want to change the default to
> something else in the future (e.g. to num vcpus) then the compat
> handling will get really complicated.

Yes, I fear that will be messy. Just picking a value later will need
compat handling, but not a really complicated one.

> So we can
> - go with num queues = num cpus. But this will consume memory
>   for guests with lots of CPUs.

I'm not sure that would be a good choice, as we don't have the benefits
that pci has.

> - go with the proposed logic of min(4,vcpus) and accept that future
>   compat handling is harder

With a bit more data, I'd be way more comfortable. Might still be ok
for the next rc.

> - defer this change

We might end up with that, given the timing :( (not blaming anyone)
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On 09.11.20 19:53, Halil Pasic wrote:
> On Mon, 9 Nov 2020 17:06:16 +0100
> Cornelia Huck wrote:
>
> > > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > > {
> > >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> > >     DeviceState *vdev = DEVICE(&dev->vdev);
> > > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > > +
> > > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > > +    }
> >
> > I would like to have a comment explaining the numbers here, however.
> >
> > virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus if
> > possible, apply some other capping). 4 seems to be a bit arbitrary
> > without explanation, although I'm sure you did some measurements :)
>
> Frankly, I don't have any measurements yet. For the secure case,
> I think Mimu has assessed the impact of multiqueue, hence adding Mimu
> to the cc list. @Mimu can you help us out.
>
> Regarding the normal non-protected VMs I'm in the middle of producing
> some measurement data. This was admittedly a bit rushed because of
> where we are in the cycle. Sorry to disappoint you.
>
> The number 4 was suggested by Christian, maybe Christian does have some
> readily available measurement data for the normal VM case. @Christian:
> can you help me out?

My point was to find a balance between performance gain and memory
usage. As a matter of fact, virtqueues do consume memory. So 4 looked
like a reasonable default for me for large guests as long as we do not
have directed interrupts.

Now, thinking about this again: If we want to change the default to
something else in the future (e.g. to num vcpus) then the compat
handling will get really complicated.

So we can
- go with num queues = num cpus. But this will consume memory
  for guests with lots of CPUs.
- go with the proposed logic of min(4,vcpus) and accept that future
  compat handling is harder
- defer this change
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Mon, 9 Nov 2020 17:06:16 +0100 Cornelia Huck wrote:
> > @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
> > {
> >     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
> >     DeviceState *vdev = DEVICE(&dev->vdev);
> > +    VirtIOBlkConf *conf = &dev->vdev.conf;
> > +
> > +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> > +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> > +    }
>
> I would like to have a comment explaining the numbers here, however.
>
> virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus if
> possible, apply some other capping). 4 seems to be a bit arbitrary
> without explanation, although I'm sure you did some measurements :)

Frankly, I don't have any measurements yet. For the secure case,
I think Mimu has assessed the impact of multiqueue, hence adding Mimu
to the cc list. @Mimu can you help us out.

Regarding the normal non-protected VMs I'm in the middle of producing
some measurement data. This was admittedly a bit rushed because of
where we are in the cycle. Sorry to disappoint you.

The number 4 was suggested by Christian, maybe Christian does have some
readily available measurement data for the normal VM case. @Christian:
can you help me out?

Regards,
Halil
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Mon, 9 Nov 2020 16:48:31 +0100 Halil Pasic wrote:
> Currently the default value of num_queues is effectively 1 for
> virtio-blk-ccw. Recently 9445e1e15e ("virtio-blk-pci: default
> num_queues to -smp N") changed the default for pci to the number of
> vcpus, citing better interrupt locality and better performance as a
> rationale.
>
> While virtio-blk-ccw does not yet benefit from better interrupt
> locality, measurements have shown that for secure VMs multiqueue does
> help with performance. Since the bounce buffering happens with the
> queue lock held (in the guest), this is hardly a surprise.
>
> As for non-secure VMs, the extra queues shouldn't hurt too much.
>
> Suggested-by: Christian Borntraeger
> Signed-off-by: Halil Pasic
> ---
>
> We would prefer to land this for 5.2. If we do, then commit 9445e1e15e
> ("virtio-blk-pci: default num_queues to -smp N") took care of all the
> necessary compat handling.
>
> If that's not possible, I will send a version that does the necessary
> compat handling.

I think we can still get this into 5.2, and that would indeed be less
hassle.

> ---
>  hw/s390x/virtio-ccw-blk.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/s390x/virtio-ccw-blk.c b/hw/s390x/virtio-ccw-blk.c
> index 2294ce1ce4..7296140dde 100644
> --- a/hw/s390x/virtio-ccw-blk.c
> +++ b/hw/s390x/virtio-ccw-blk.c
> @@ -10,6 +10,7 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "hw/boards.h"
>  #include "hw/qdev-properties.h"
>  #include "hw/virtio/virtio.h"
>  #include "qapi/error.h"
> @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
>  {
>      VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
>      DeviceState *vdev = DEVICE(&dev->vdev);
> +    VirtIOBlkConf *conf = &dev->vdev.conf;
> +
> +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> +    }

I would like to have a comment explaining the numbers here, however.

virtio-pci has a pretty good explanation (use 1:1 for vqs:vcpus if
possible, apply some other capping). 4 seems to be a bit arbitrary
without explanation, although I'm sure you did some measurements :)

Do we also want something similar for virtio-scsi (and vhost-scsi)?

>
>      qdev_realize(vdev, BUS(&ccw_dev->bus), errp);
>  }
>
> base-commit: 2a190a7256a3e0563b29ffd67e0164097b4a6dac
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On 09.11.20 16:48, Halil Pasic wrote:
> Currently the default value of num_queues is effectively 1 for
> virtio-blk-ccw. Recently 9445e1e15e ("virtio-blk-pci: default num_queues
> to -smp N") changed the default for pci to the number of vcpus, citing
> better interrupt locality and better performance as a rationale.
>
> While virtio-blk-ccw does not yet benefit from better interrupt
> locality, measurements have shown that for secure VMs multiqueue does
> help with performance. Since the bounce buffering happens with the queue
> lock held (in the guest) this is hardly a surprise.
>
> As for non-secure VMs the extra queues shouldn't hurt too much.

In fact we have also seen improvements from multiqueue for non-secure
guests, as a single virtqueue seems to have a hard cap on bandwidth that
can be smaller than what newer storage devices can deliver.

> Suggested-by: Christian Borntraeger
> Signed-off-by: Halil Pasic
> ---
>
> We would prefer to land this for 5.2. If we do then commit 9445e1e15e
> ("virtio-blk-pci: default num_queues to -smp N") took care of all the
> necessary compat handling.
>
> If that's not possible, I will send a version that does the necessary
> compat handling.

Right. At the moment the original patch has all the necessary compat
handling for 5.1 for all platforms. Adding multi-queue for s390x for > 5.2
would mean that we would need additional platform-specific compat
handling, and it would increase the patch size unnecessarily. In the end
you could consider this an enhancement (not calling it a fix) of the
original patch.

Acked-by: Christian Borntraeger

> ---
>  hw/s390x/virtio-ccw-blk.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/s390x/virtio-ccw-blk.c b/hw/s390x/virtio-ccw-blk.c
> index 2294ce1ce4..7296140dde 100644
> --- a/hw/s390x/virtio-ccw-blk.c
> +++ b/hw/s390x/virtio-ccw-blk.c
> @@ -10,6 +10,7 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "hw/boards.h"
>  #include "hw/qdev-properties.h"
>  #include "hw/virtio/virtio.h"
>  #include "qapi/error.h"
> @@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
>  {
>      VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
>      DeviceState *vdev = DEVICE(&dev->vdev);
> +    VirtIOBlkConf *conf = &dev->vdev.conf;
> +
> +    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
> +        conf->num_queues = MIN(4, current_machine->smp.cpus);
> +    }
>
>      qdev_realize(vdev, BUS(&ccw_dev->bus), errp);
>  }
>
> base-commit: 2a190a7256a3e0563b29ffd67e0164097b4a6dac
>
[PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
Currently the default value of num_queues is effectively 1 for
virtio-blk-ccw. Recently 9445e1e15e ("virtio-blk-pci: default num_queues
to -smp N") changed the default for pci to the number of vcpus, citing
better interrupt locality and better performance as a rationale.

While virtio-blk-ccw does not yet benefit from better interrupt
locality, measurements have shown that for secure VMs multiqueue does
help with performance. Since the bounce buffering happens with the queue
lock held (in the guest) this is hardly a surprise.

As for non-secure VMs the extra queues shouldn't hurt too much.

Suggested-by: Christian Borntraeger
Signed-off-by: Halil Pasic
---

We would prefer to land this for 5.2. If we do then commit 9445e1e15e
("virtio-blk-pci: default num_queues to -smp N") took care of all the
necessary compat handling.

If that's not possible, I will send a version that does the necessary
compat handling.
---
 hw/s390x/virtio-ccw-blk.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/s390x/virtio-ccw-blk.c b/hw/s390x/virtio-ccw-blk.c
index 2294ce1ce4..7296140dde 100644
--- a/hw/s390x/virtio-ccw-blk.c
+++ b/hw/s390x/virtio-ccw-blk.c
@@ -10,6 +10,7 @@
  */

 #include "qemu/osdep.h"
+#include "hw/boards.h"
 #include "hw/qdev-properties.h"
 #include "hw/virtio/virtio.h"
 #include "qapi/error.h"
@@ -20,6 +21,11 @@ static void virtio_ccw_blk_realize(VirtioCcwDevice *ccw_dev, Error **errp)
 {
     VirtIOBlkCcw *dev = VIRTIO_BLK_CCW(ccw_dev);
     DeviceState *vdev = DEVICE(&dev->vdev);
+    VirtIOBlkConf *conf = &dev->vdev.conf;
+
+    if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
+        conf->num_queues = MIN(4, current_machine->smp.cpus);
+    }

     qdev_realize(vdev, BUS(&ccw_dev->bus), errp);
 }

base-commit: 2a190a7256a3e0563b29ffd67e0164097b4a6dac
--
2.17.1