Re: [PATCH] Rescan the entire target on transport reset when LUN is 0

2020-09-08 Thread Felipe Franciosi



> On Sep 8, 2020, at 3:22 PM, Paolo Bonzini  wrote:
> 
> On 28/08/20 14:21, Matej Genci wrote:
>> VirtIO 1.0 spec says
>>    The removed and rescan events ... when sent for LUN 0, they MAY
>>    apply to the entire target so the driver can ask the initiator
>>    to rescan the target to detect this.
>> 
>> This change introduces the behaviour described above by scanning the
>> entire scsi target when LUN is set to 0. This is both a functional and a
>> performance fix. It aligns the driver with the spec and allows control
>> planes to hotplug targets with large numbers of LUNs without having to
>> request a RESCAN for each one of them.
>> 
>> Signed-off-by: Matej Genci 
>> Suggested-by: Felipe Franciosi 
>> ---
>> drivers/scsi/virtio_scsi.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index bfec84aacd90..a4b9bc7b4b4a 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -284,7 +284,12 @@ static void virtscsi_handle_transport_reset(struct virtio_scsi *vscsi,
>> 
>>  	switch (virtio32_to_cpu(vscsi->vdev, event->reason)) {
>>  	case VIRTIO_SCSI_EVT_RESET_RESCAN:
>> -		scsi_add_device(shost, 0, target, lun);
>> +		if (lun == 0) {
>> +			scsi_scan_target(&shost->shost_gendev, 0, target,
>> +					 SCAN_WILD_CARD, SCSI_SCAN_INITIAL);
>> +		} else {
>> +			scsi_add_device(shost, 0, target, lun);
>> +		}
>>  		break;
>>  	case VIRTIO_SCSI_EVT_RESET_REMOVED:
>>  		sdev = scsi_device_lookup(shost, 0, target, lun);
>> 
> 
> 
> Acked-by: Paolo Bonzini 

Cc: sta...@vger.kernel.org

Thanks, Paolo.

I'm Cc'ing stable as I believe this fixes a driver bug where it
doesn't follow the spec. Per the commit message, today devices are
required to issue RESCAN events for each LUN behind a target when
hotplugging, or risk the driver not seeing the new LUNs.
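
To illustrate the difference, here is a rough device-side sketch,
where emit_event() is a made-up helper standing in for queueing onto
the event virtqueue; the lun[] encoding follows struct
virtio_scsi_event (virtio 1.0 fields are little-endian):

	struct virtio_scsi_event ev = {
		.event  = cpu_to_le32(VIRTIO_SCSI_T_TRANSPORT_RESET),
		.reason = cpu_to_le32(VIRTIO_SCSI_EVT_RESET_RESCAN),
	};

	ev.lun[0] = 1;		/* fixed "address method" byte */
	ev.lun[1] = target;	/* ID of the hotplugged target */
	ev.lun[2] = 0;		/* LUN 0: with this patch the driver */
	ev.lun[3] = 0;		/* rescans the whole target once */
	emit_event(&ev);	/* made up: queue onto the event vq */

Without the fix, the device must emit one such event per LUN behind
the target for the driver to see them all.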

Is this enough? Or should we resend after merge per below?
https://www.kernel.org/doc/Documentation/process/stable-kernel-rules.rst

F.



[PATCH] virtio_ring: fix description of virtqueue_get_buf

2016-11-14 Thread Felipe Franciosi
The device (not the driver) populates the used ring, including the
length of the data it wrote into the buffer.

Signed-off-by: Felipe Franciosi <fel...@nutanix.com>
---
 drivers/virtio/virtio_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 489bfc6..8a0d6a9 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -649,7 +649,7 @@ static inline bool more_used(const struct vring_virtqueue *vq)
  * @vq: the struct virtqueue we're talking about.
  * @len: the length written into the buffer
  *
- * If the driver wrote data into the buffer, @len will be set to the
+ * If the device wrote data into the buffer, @len will be set to the
  * amount written.  This means you don't need to clear the buffer
  * beforehand to ensure there's no data leakage in the case of short
  * writes.
-- 
1.9.4
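
For context, the comment being fixed documents the usual consumption
pattern. A minimal driver-side sketch of it, where vq and
process_response() are illustrative assumptions rather than an
existing API:

	unsigned int len;
	void *buf;

	/* virtqueue_get_buf() returns buffers the device has marked
	 * used; len comes from the used ring and is the byte count the
	 * device (not the driver) wrote into buf. */
	while ((buf = virtqueue_get_buf(vq, &len)) != NULL)
		process_response(buf, len);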


Re: [PATCH net-next RFC V3 0/3] basic busy polling support for vhost_net

2015-11-17 Thread Felipe Franciosi
Hi Jason,

I understand your busy loop timeout is quite conservative at 50us. Did you try 
any other values?

Also, did you measure how polling affects many VMs talking to each other (e.g. 
20 VMs on each host, perhaps with several vNICs each, transmitting to a 
corresponding VM/vNIC pair on another host)?


In a completely separate experiment (busy waiting on storage I/O rings on Xen), I
observed that bigger timeouts gave bigger benefits. On the other hand, all
cases that contended for CPU were badly hurt by any sort of polling.

The cases that contended for CPU consisted of many VMs generating workload over
very fast I/O devices (in that case, several NVMe devices on a single host).
The metric that suffered was the aggregate throughput of all VMs.

The solution was to determine whether to poll depending on the host's overall 
CPU utilisation at that moment. That gave me the best of both worlds as polling 
made everything faster without slowing down any other metric.
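
In rough pseudo-kernel-C, the heuristic looked like this, where
host_cpu_utilisation(), ring_has_work() and both constants are
illustrative rather than an existing API:

	#define CPU_UTIL_THRESHOLD	80	/* percent; tuned per deployment */
	#define POLL_BUDGET_US		250	/* bigger budgets helped storage */

	static bool try_busy_poll(void)
	{
		ktime_t deadline = ktime_add_us(ktime_get(), POLL_BUDGET_US);

		/* Under CPU contention any polling hurt, so skip it. */
		if (host_cpu_utilisation() > CPU_UTIL_THRESHOLD)
			return false;

		while (ktime_before(ktime_get(), deadline)) {
			if (ring_has_work())
				return true;	/* work arrived while spinning */
			cpu_relax();
		}
		return false;	/* budget spent; fall back to notifications */
	}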

Thanks,
Felipe



On 12/11/2015 10:20, "kvm-ow...@vger.kernel.org on behalf of Jason Wang" wrote:

>
>
>On 11/12/2015 06:16 PM, Jason Wang wrote:
>> Hi all:
>>
>> This series tries to add basic busy polling for vhost net. The idea is
>> simple: at the end of tx/rx processing, busy poll for newly added tx
>> descriptors and for the rx socket for a while. The maximum amount of
>> time (in us) that may be spent busy polling is specified via ioctl.
>>
>> Tests were done with:
>>
>> - 50 us as busy loop timeout
>> - Netperf 2.6
>> - Two machines with back to back connected ixgbe
>> - Guest with 1 vcpu and 1 queue
>>
>> Results:
>> - For stream workloads, ioexits were reduced dramatically for medium
>>   tx sizes (1024-2048, at most -39%) and for almost all rx sizes (at
>>   most -79%) as a result of polling. This more or less compensates
>>   for the cpu cycles possibly wasted, which is probably why we can
>>   still see some increase in normalized throughput in some cases.
>> - tx throughput increased (at most +105%) except for the huge write
>>   size (16384), and we can send more packets in that case (+tpkts
>>   increased).
>> - Very minor rx regression in some cases.
>> - Improvement on TCP_RR (at most +16%).
>
>Forgot to mention, the following test results are, in order:
>
>1) Guest TX
>2) Guest RX
>3) TCP_RR
>
>> size/session/+thu%/+normalize%/+tpkts%/+rpkts%/+ioexits%/
>>    64/ 1/   +9%/  -17%/   +5%/  +10%/   -2%
>>    64/ 2/   +8%/  -18%/   +6%/  +10%/   -1%
>>    64/ 4/   +4%/  -21%/   +6%/  +10%/   -1%
>>    64/ 8/   +9%/  -17%/   +6%/   +9%/   -2%
>>   256/ 1/  +20%/   -1%/  +15%/  +11%/   -9%
>>   256/ 2/  +15%/   -6%/  +15%/   +8%/   -8%
>>   256/ 4/  +17%/   -4%/  +16%/   +8%/   -8%
>>   256/ 8/  -61%/  -69%/  +16%/  +10%/  -10%
>>   512/ 1/  +15%/   -3%/  +19%/  +18%/  -11%
>>   512/ 2/  +19%/    0%/  +19%/  +13%/  -10%
>>   512/ 4/  +18%/   -2%/  +18%/  +15%/  -10%
>>   512/ 8/  +17%/   -1%/  +18%/  +15%/  -11%
>>  1024/ 1/  +25%/   +4%/  +27%/  +16%/  -21%
>>  1024/ 2/  +28%/   +8%/  +25%/  +15%/  -22%
>>  1024/ 4/  +25%/   +5%/  +25%/  +14%/  -21%
>>  1024/ 8/  +27%/   +7%/  +25%/  +16%/  -21%
>>  2048/ 1/  +32%/  +12%/  +31%/  +22%/  -38%
>>  2048/ 2/  +33%/  +12%/  +30%/  +23%/  -36%
>>  2048/ 4/  +31%/  +10%/  +31%/  +24%/  -37%
>>  2048/ 8/ +105%/  +75%/  +33%/  +23%/  -39%
>> 16384/ 1/    0%/  -14%/   +2%/    0%/  +19%
>> 16384/ 2/    0%/  -13%/  +19%/  -13%/  +17%
>> 16384/ 4/    0%/  -12%/   +3%/    0%/   +2%
>> 16384/ 8/    0%/  -11%/   -2%/   +1%/   +1%
>> size/session/+thu%/+normalize%/+tpkts%/+rpkts%/+ioexits%/
>>    64/ 1/   -7%/  -23%/   +4%/   +6%/  -74%
>>    64/ 2/   -2%/  -12%/   +2%/   +2%/  -55%
>>    64/ 4/   +2%/   -5%/  +10%/   -2%/  -43%
>>    64/ 8/   -5%/   -5%/  +11%/  -34%/  -59%
>>   256/ 1/   -6%/  -16%/   +9%/  +11%/  -60%
>>   256/ 2/   +3%/   -4%/   +6%/   -3%/  -28%
>>   256/ 4/    0%/   -5%/   -9%/   -9%/  -10%
>>   256/ 8/   -3%/   -6%/  -12%/   -9%/  -40%
>>   512/ 1/   -4%/  -17%/  -10%/  +21%/  -34%
>>   512/ 2/    0%/   -9%/  -14%/   -3%/  -30%
>>   512/ 4/    0%/   -4%/  -18%/  -12%/   -4%
>>   512/ 8/   -1%/   -4%/   -1%/   -5%/   +4%
>>  1024/ 1/    0%/  -16%/  +12%/  +11%/  -10%
>>  1024/ 2/    0%/  -11%/    0%/   +5%/  -31%
>>  1024/ 4/    0%/   -4%/   -7%/   +1%/  -22%
>>  1024/ 8/   -5%/   -6%/  -17%/  -29%/  -79%
>>  2048/ 1/    0%/  -16%/   +1%/   +9%/  -10%
>>  2048/ 2/    0%/  -12%/   +7%/   +9%/  -26%
>>  2048/ 4/    0%/   -7%/   -4%/   +3%/  -64%
>>  2048/ 8/   -1%/   -5%/   -6%/   +4%/  -20%
>> 16384/ 1/    0%/  -12%/  +11%/   +7%/  -20%
>> 16384/ 2/    0%/   -7%/   +1%/   +5%/  -26%
>> 16384/ 4/    0%/   -5%/  +12%/  +22%/  -23%
>> 16384/ 8/    0%/   -1%/   -8%/   +5%/   -3%
>>