Re: [virtio-dev] [RFC PATCH V2] virtio_pci: Add SR-IOV support

2018-02-25 Thread Yan Vugenfirer


> On 22 Feb 2018, at 19:52, Mark Rustad  wrote:
> 
> Hardware-realized virtio-pci devices can implement SR-IOV, so this
> patch enables its use. The device in question is an upcoming Intel
> NIC that implements both a virtio-net PF and virtio-net VFs. These
> are hardware realizations of what has up to now been a software
> interface.
> 
> The device in question has the following 4-part PCI IDs:
> 
> PF: device: 1af4 vendor: 1041 subvendor: 8086 subdevice: 15fe
> VF: device: 1af4 vendor: 1041 subvendor: 8086 subdevice: 05fe

Small mistake in the commit message. Red Hat (Qumranet) vendor ID is 1af4, 
virtio-net device ID is 1041.
Should be:
PF: vendor: 1af4 device: 1041 subvendor: 8086 subdevice: 15fe
VF: vendor: 1af4 device: 1041 subvendor: 8086 subdevice: 05fe

Best regards,
Yan.

> 
> The patch needs no check for device ID, because the callback will
> never be made for devices that do not assert the capability or
> when run on a platform incapable of SR-IOV.
> 
> One reason for this patch is that the hardware requires the
> vendor ID of a VF to be the same as the vendor ID of the PF that
> created it. So it seemed logical to simply have a fully-functioning
> virtio-net PF create the VFs. This patch makes that possible.
> 
> Signed-off-by: Mark Rustad 
> Reviewed-by: Alexander Duyck 
> ---
> Changes in V2:
> - Simplified logic from previous version, removed added driver variable
> - Disable SR-IOV on driver removal except when VFs are assigned
> - Sent as RFC to virtio-dev, linux-pci, netdev, lkml and others
> ---
> drivers/virtio/virtio_pci_common.c |   47 +++++++++++++++++++++++++++++++++++
> 1 file changed, 47 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_pci_common.c 
> b/drivers/virtio/virtio_pci_common.c
> index 48d4d1cf1cb6..78b53ffc4cee 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -572,6 +572,47 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
>   return rc;
> }
> 
> +#ifdef CONFIG_PCI_IOV
> +static int virtio_pci_sriov_disable(struct pci_dev *pci_dev)
> +{
> + /* If vfs are assigned we cannot shut down SR-IOV without causing
> +  * issues, so just leave the hardware available.
> +  */
> + if (pci_vfs_assigned(pci_dev)) {
> + dev_warn(&pci_dev->dev,
> +  "Unloading driver while VFs are assigned - VFs will not be deallocated\n");
> + return -EPERM;
> + }
> + pci_disable_sriov(pci_dev);
> + return 0;
> +}
> +
> +static int virtio_pci_sriov_enable(struct pci_dev *pci_dev, int num_vfs)
> +{
> + int rc = 0;
> +
> + if (pci_num_vf(pci_dev))
> + return -EINVAL;
> +
> + rc = pci_enable_sriov(pci_dev, num_vfs);
> + if (rc) {
> + dev_warn(&pci_dev->dev, "Failed to enable PCI sriov: %d\n", rc);
> + return rc;
> + }
> + dev_info(&pci_dev->dev, "SR-IOV enabled with %d VFs\n", num_vfs);
> + return num_vfs;
> +}
> +
> +static int virtio_pci_sriov_configure(struct pci_dev *dev, int num_vfs)
> +{
> + if (num_vfs)
> + return virtio_pci_sriov_enable(dev, num_vfs);
> + if (!pci_num_vf(dev))
> + return -EINVAL;
> + return virtio_pci_sriov_disable(dev);
> +}
> +#endif /* CONFIG_PCI_IOV */
> +
> static void virtio_pci_remove(struct pci_dev *pci_dev)
> {
>   struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
> @@ -584,6 +625,9 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
>   else
>   virtio_pci_modern_remove(vp_dev);
> 
> +#ifdef CONFIG_PCI_IOV
> + virtio_pci_sriov_disable(pci_dev);
> +#endif
>   pci_disable_device(pci_dev);
>   put_device(dev);
> }
> @@ -596,6 +640,9 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
> #ifdef CONFIG_PM_SLEEP
>   .driver.pm  = &virtio_pci_pm_ops,
> #endif
> +#ifdef CONFIG_PCI_IOV
> + .sriov_configure = virtio_pci_sriov_configure,
> +#endif
> };
> 
> module_pci_driver(virtio_pci_driver);
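
For context (an editor's sketch, not part of the patch): the .sriov_configure
callback is not called by the driver itself. The PCI core invokes it when
userspace writes a VF count to the device's sriov_numvfs sysfs attribute
(e.g. "echo 2 > /sys/bus/pci/devices/<BDF>/sriov_numvfs"). A simplified model
of that contract, assuming only the standard struct pci_driver hook (this is
not the exact upstream sysfs code):

    /* Writing N to sriov_numvfs ends up in something like this.
     * Contract: return the number of VFs enabled on success, 0 after a
     * successful disable, or a negative errno. */
    static int sriov_numvfs_set(struct pci_dev *pdev, int num_vfs)
    {
            /* No callback registered: driver has no SR-IOV support */
            if (!pdev->driver || !pdev->driver->sriov_configure)
                    return -ENOSYS;

            /* num_vfs == 0 requests "disable all VFs" */
            return pdev->driver->sriov_configure(pdev, num_vfs);
    }

With the patch applied, writing 0 reaches virtio_pci_sriov_disable() and a
positive count reaches virtio_pci_sriov_enable(), which returns num_vfs on
success as the contract requires.
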
Re: [Qemu-devel] [PATCH 2/2] qemu: add linkspeed and duplex setting to virtio-net

2017-12-20 Thread Yan Vugenfirer

> On 20 Dec 2017, at 16:31, Michael S. Tsirkin <m...@redhat.com> wrote:
> 
> On Tue, Dec 19, 2017 at 11:52:39AM -0500, Jason Baron wrote:
>> 
>> 
>> On 12/19/2017 04:19 AM, Yan Vugenfirer wrote:
>>> 
>>>> On 18 Dec 2017, at 18:04, Jason Baron via Qemu-devel
>>>> <qemu-de...@nongnu.org> wrote:
>>>> 
>>>> 
>>>> 
>>>> On 12/18/2017 06:34 AM, Yan Vugenfirer wrote:
>>>>> 
>>>>>> On 14 Dec 2017, at 21:33, Jason Baron via Qemu-devel
>>>>>> <qemu-de...@nongnu.org> wrote:
>>>>>> 
>>>>>> Although they can currently be set in linux via 'ethtool -s', this
>>>>>> requires guest changes, and thus it would be nice to extend this
>>>>>> functionality such that it can be configured automatically from the
>>>>>> host (as other network devices do).
>>>>>> 
>>>>>> Linkspeed and duplex settings can be set as:
>>>>>> '-device virtio-net,speed=1,duplex=full'
>>>>>> 
>>>>>> where speed is [-1...INT_MAX], and duplex is ["half"|"full"].
>>>>>> 
>>>>>> Signed-off-by: Jason Baron <jba...@akamai.com>
>>>>>> Cc: "Michael S. Tsirkin" <m...@redhat.com>
>>>>>> Cc: Jason Wang <jasow...@redhat.com>
>>>>>> ---
>>>>>> hw/net/virtio-net.c                         | 29 ++++++++++++++++++++++
>>>>>> include/hw/virtio/virtio-net.h              |  3 +++
>>>>>> include/standard-headers/linux/virtio_net.h |  4 ++++
>>>>>> 3 files changed, 36 insertions(+)
>>>>>> 
>>>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>>>> index 38674b0..d63e790 100644
>>>>>> --- a/hw/net/virtio-net.c
>>>>>> +++ b/hw/net/virtio-net.c
>>>>>> @@ -40,6 +40,12 @@
>>>>>> #define VIRTIO_NET_RX_QUEUE_MIN_SIZE VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE
>>>>>> #define VIRTIO_NET_TX_QUEUE_MIN_SIZE VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE
>>>>>> 
>>>>>> +/* duplex and speed defines */
>>>>>> +#define DUPLEX_UNKNOWN  0xff
>>>>>> +#define DUPLEX_HALF 0x00
>>>>>> +#define DUPLEX_FULL 0x01
>>>>>> +#define SPEED_UNKNOWN   -1
>>>>>> +
>>>>>> /*
>>>>>> * Calculate the number of bytes up to and including the given 'field' of
>>>>>> * 'container'.
>>>>>> @@ -61,6 +67,8 @@ static VirtIOFeature feature_sizes[] = {
>>>>>> .end = endof(struct virtio_net_config, max_virtqueue_pairs)},
>>>>>>{.flags = 1 << VIRTIO_NET_F_MTU,
>>>>>> .end = endof(struct virtio_net_config, mtu)},
>>>>>> +{.flags = 1 << VIRTIO_NET_F_SPEED_DUPLEX,
>>>>>> + .end = endof(struct virtio_net_config, duplex)},
>>>>>>{}
>>>>>> };
>>>>>> 
>>>>>> @@ -88,6 +96,8 @@ static void virtio_net_get_config(VirtIODevice
>>>>>> *vdev, uint8_t *config)
>>>>>>     virtio_stw_p(vdev, &netcfg.status, n->status);
>>>>>>     virtio_stw_p(vdev, &netcfg.max_virtqueue_pairs, n->max_queues);
>>>>>>     virtio_stw_p(vdev, &netcfg.mtu, n->net_conf.mtu);
>>>>>> +    virtio_stl_p(vdev, &netcfg.speed, n->net_conf.speed);
>>>>>> +    netcfg.duplex = n->net_conf.duplex;
>>>>>>     memcpy(netcfg.mac, n->mac, ETH_ALEN);
>>>>>>     memcpy(config, &netcfg, n->config_size);
>>>>>> }
>>>>>> @@ -1941,6 +1951,23 @@ static void
>>>>>> virtio_net_device_realize(DeviceState *dev, Error **errp)
>>>>>>n->host_features |= (0x1 << VIRTIO_NET_F_MTU);
>>>>>>}
>>>>>> 
>>>>>> +n->host_features |= (0x1 << VIRTIO_NET_F_SPEED_DUPLEX);
>>>>>> +if (n->net_conf.duplex_str) {
>>>>>> +if (strncmp(n->net_conf.duplex_str, "half", 5) == 0) {
>>>>>> + 

Re: [Qemu-devel] [PATCH 2/2] qemu: add linkspeed and duplex setting to virtio-net

2017-12-18 Thread Yan Vugenfirer

> On 14 Dec 2017, at 21:33, Jason Baron via Qemu-devel wrote:
> 
> Although they can currently be set in linux via 'ethtool -s', this requires
> guest changes, and thus it would be nice to extend this functionality such
> that it can be configured automatically from the host (as other network
> devices do).
> 
> Linkspeed and duplex settings can be set as:
> '-device virtio-net,speed=1,duplex=full'
> 
> where speed is [-1...INT_MAX], and duplex is ["half"|"full"].
> 
> Signed-off-by: Jason Baron 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> ---
> hw/net/virtio-net.c                         | 29 ++++++++++++++++++++++
> include/hw/virtio/virtio-net.h              |  3 +++
> include/standard-headers/linux/virtio_net.h |  4 ++++
> 3 files changed, 36 insertions(+)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 38674b0..d63e790 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -40,6 +40,12 @@
> #define VIRTIO_NET_RX_QUEUE_MIN_SIZE VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE
> #define VIRTIO_NET_TX_QUEUE_MIN_SIZE VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE
> 
> +/* duplex and speed defines */
> +#define DUPLEX_UNKNOWN  0xff
> +#define DUPLEX_HALF 0x00
> +#define DUPLEX_FULL 0x01
> +#define SPEED_UNKNOWN   -1
> +
> /*
>  * Calculate the number of bytes up to and including the given 'field' of
>  * 'container'.
> @@ -61,6 +67,8 @@ static VirtIOFeature feature_sizes[] = {
>  .end = endof(struct virtio_net_config, max_virtqueue_pairs)},
> {.flags = 1 << VIRTIO_NET_F_MTU,
>  .end = endof(struct virtio_net_config, mtu)},
> +{.flags = 1 << VIRTIO_NET_F_SPEED_DUPLEX,
> + .end = endof(struct virtio_net_config, duplex)},
> {}
> };
> 
> @@ -88,6 +96,8 @@ static void virtio_net_get_config(VirtIODevice *vdev, 
> uint8_t *config)
>     virtio_stw_p(vdev, &netcfg.status, n->status);
>     virtio_stw_p(vdev, &netcfg.max_virtqueue_pairs, n->max_queues);
>     virtio_stw_p(vdev, &netcfg.mtu, n->net_conf.mtu);
> +    virtio_stl_p(vdev, &netcfg.speed, n->net_conf.speed);
> +    netcfg.duplex = n->net_conf.duplex;
>     memcpy(netcfg.mac, n->mac, ETH_ALEN);
>     memcpy(config, &netcfg, n->config_size);
> }
> @@ -1941,6 +1951,23 @@ static void virtio_net_device_realize(DeviceState 
> *dev, Error **errp)
> n->host_features |= (0x1 << VIRTIO_NET_F_MTU);
> }
> 
> +n->host_features |= (0x1 << VIRTIO_NET_F_SPEED_DUPLEX);
> +if (n->net_conf.duplex_str) {
> +if (strncmp(n->net_conf.duplex_str, "half", 5) == 0) {
> +n->net_conf.duplex = DUPLEX_HALF;
> +} else if (strncmp(n->net_conf.duplex_str, "full", 5) == 0) {
> +n->net_conf.duplex = DUPLEX_FULL;
> +} else {
> +error_setg(errp, "'duplex' must be 'half' or 'full'");
> +}
> +} else {
> +n->net_conf.duplex = DUPLEX_UNKNOWN;
> +}
> +if (n->net_conf.speed < SPEED_UNKNOWN) {
> +error_setg(errp, "'speed' must be between -1 (SPEED_UNKNOWN) and "
> +   "INT_MAX");
> +}
> +
> virtio_net_set_config_size(n, n->host_features);
> virtio_init(vdev, "virtio-net", VIRTIO_ID_NET, n->config_size);
> 
> @@ -2160,6 +2187,8 @@ static Property virtio_net_properties[] = {
> DEFINE_PROP_UINT16("host_mtu", VirtIONet, net_conf.mtu, 0),
> DEFINE_PROP_BOOL("x-mtu-bypass-backend", VirtIONet, mtu_bypass_backend,
>  true),
> +DEFINE_PROP_INT32("speed", VirtIONet, net_conf.speed, SPEED_UNKNOWN),

From the Windows guest perspective I prefer to have some reasonable default
(10G for example).

Thanks,
Yan.

> +DEFINE_PROP_STRING("duplex", VirtIONet, net_conf.duplex_str),
> DEFINE_PROP_END_OF_LIST(),
> };
> 
> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> index b81b6a4..af74a94 100644
> --- a/include/hw/virtio/virtio-net.h
> +++ b/include/hw/virtio/virtio-net.h
> @@ -38,6 +38,9 @@ typedef struct virtio_net_conf
> uint16_t rx_queue_size;
> uint16_t tx_queue_size;
> uint16_t mtu;
> +int32_t speed;
> +char *duplex_str;
> +uint8_t duplex;
> } virtio_net_conf;
> 
> /* Maximum packet size we can receive from tap device: header + 64k */
> diff --git a/include/standard-headers/linux/virtio_net.h 
> b/include/standard-headers/linux/virtio_net.h
> index 30ff249..0ff1447 100644
> --- a/include/standard-headers/linux/virtio_net.h
> +++ b/include/standard-headers/linux/virtio_net.h
> @@ -36,6 +36,7 @@
> #define VIRTIO_NET_F_GUEST_CSUM   1   /* Guest handles pkts w/ partial csum */
> #define VIRTIO_NET_F_CTRL_GUEST_OFFLOADS 2 /* Dynamic offload configuration. */
> #define VIRTIO_NET_F_MTU  3   /* Initial MTU advice */
> +#define VIRTIO_NET_F_SPEED_DUPLEX 4  /* Host set linkspeed and duplex */
> #define VIRTIO_NET_F_MAC  5   /* Host has given MAC address. */
> #define VIRTIO_NET_F_GUEST_TSO4   7   /* Guest can handle TSOv4 in. */
> #define 
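
For readers following the feature bit: once VIRTIO_NET_F_SPEED_DUPLEX is
negotiated, a guest driver can read the two new config fields instead of
relying on 'ethtool -s'. A minimal guest-side sketch, assuming the Linux
virtio config API and the matching virtio_net_config layout from the
companion kernel patch (illustrative, not the actual virtio_net driver code):

    #include <linux/virtio_config.h>
    #include <linux/virtio_net.h>

    static void read_speed_duplex(struct virtio_device *vdev,
                                  u32 *speed, u8 *duplex)
    {
            if (virtio_has_feature(vdev, VIRTIO_NET_F_SPEED_DUPLEX)) {
                    /* le32 speed in Mbit/s, u8 duplex as defined above */
                    virtio_cread(vdev, struct virtio_net_config, speed, speed);
                    virtio_cread(vdev, struct virtio_net_config, duplex, duplex);
            } else {
                    *speed = (u32)SPEED_UNKNOWN;   /* -1 */
                    *duplex = DUPLEX_UNKNOWN;      /* 0xff */
            }
    }

The guest's ethtool get_link_ksettings handler would then report exactly what
the host configured on the -device command line.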

Re: [PATCH net V2] vhost: net: switch to use data copy if pending DMAs exceed the limit

2014-03-17 Thread Yan Vugenfirer

On Mar 17, 2014, at 8:43 AM, Ronen Hod  wrote:

> On 03/13/2014 09:28 AM, Jason Wang wrote:
>> On 03/10/2014 04:03 PM, Michael S. Tsirkin wrote:
>>> On Fri, Mar 07, 2014 at 01:28:27PM +0800, Jason Wang wrote:
> We used to stop the handling of tx when the number of pending DMAs
> exceeds VHOST_MAX_PEND. This is used to reduce the memory occupation
> of both host and guest. But it was too aggressive in some cases, since
> any delay or blocking of a single packet may delay or block the guest
> transmission. Consider the following setup:
> 
>     +-----+        +-----+
>     | VM1 |        | VM2 |
>     +--+--+        +--+--+
>        |              |
>     +--+--+        +--+--+
>     | tap0|        | tap1|
>     +--+--+        +--+--+
>        |              |
>   pfifo_fast    htb(10Mbit/s)
>        |              |
>     +--+--+-----------+--+
>     |       bridge       |
>     +--------+-----------+
>              |
>         pfifo_fast
>              |
>           +-----+
>           | eth0| (100Mbit/s)
>           +-----+
> 
> - start two VMs and connect them to a bridge
> - add a physical card (100Mbit/s) to that bridge
> - setup htb on tap1 and limit its throughput to 10Mbit/s
> - run two netperfs at the same time, one from VM1 to VM2, another
>   from VM1 to an external host through eth0.
> - the result shows that not only the VM1 to VM2 traffic was throttled but
>   also the VM1 to external host traffic through eth0 was throttled somehow.
> 
> This is because the delay added by htb may delay the completion
> of DMAs and cause the pending DMAs for tap0 to exceed the limit
> (VHOST_MAX_PEND). In this case vhost stops handling tx requests until
> htb sends some packets. The problem here is that all packet
> transmission is blocked, even for packets that do not go to VM2.
> 
> We can solve this issue by relaxing it a little bit: switching to use
> data copy instead of stopping tx when the number of pending DMAs
> exceeds half of the vq size. This is safe because:
> 
> - The number of pending DMAs is still limited (half of the vq size)
> - The out of order completion during mode switch can make sure that
>   most of the tx buffers are freed in time in the guest.
> 
> So even if about 50% of packets are delayed in the zero-copy case, vhost
> can continue to do the transmission through data copy.
> 
> Test result:
> 
> Before this patch:
> VM1 to VM2 throughput is 9.3Mbit/s
> VM1 to External throughput is 40Mbit/s
> CPU utilization is 7%
> 
> After this patch:
> VM1 to VM2 throughput is 9.3Mbit/s
> VM1 to External throughput is 93Mbit/s
> CPU utilization is 16%
> 
> A completed performance test on 40GbE shows no obvious changes in both
> throughput and cpu utilization with this patch.
> 
> The patch only solves this issue with unlimited sndbuf. We still need a
> solution for limited sndbuf.
> 
> Cc: Michael S. Tsirkin 
> Cc: Qin Chuanyu 
> Signed-off-by: Jason Wang 
>>> I thought hard about this.
>>> Here's what worries me: if there are still head of line
>>> blocking issues lurking in the stack, they will still
>>> hurt guests such as windows which rely on timely
>>> completion of buffers, but it makes it
>>> that much harder to reproduce the problems with
>>> linux guests which don't.
>>> And this will make it even harder to figure out
>>> whether zero copy is actually active to diagnose
>>> high cpu utilization cases.
>> Yes.
>>> 
>>> So I think this is a good trick, but let's make
>>> this path conditional on a new debugging module parameter:
>>> how about head_of_line_blocking with default off?
>> Sure. But the head of line blocking was only partially solved in the
>> patch since we only support in-order completion of zerocopy packets.
>> Maybe we need to consider switching to out of order completion even for
>> zerocopy skbs?
> 
> Yan, Dima,
> 
> I remember that there is an issue with out-of-order packets and WHQL.
The test considers out-of-order packet reception as a failure.

Yan.

> 
> Ronen.
> 
>>> This way if we suspect packets are delayed forever
>>> somewhere, we can enable that and see guest networking block.
>>> 
>>> Additionally, I think we should add a way to count zero copy
>>> and non zero copy packets.
>>> I see two ways to implement this: add tracepoints in vhost-net
>>> or add counters in tun accessible with ethtool.
>>> This can be a patch on top and does not have to block
>>> this one though.
>>> 
>> Yes, I posted an RFC about 2 years ago, see
>> https://lkml.org/lkml/2012/4/9/478, which only traces generic vhost
>> behaviours. I can refresh this and add some -net specific tracepoints.
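
For context, the policy change under discussion fits in a few lines. A hedged
sketch of the relaxed zero-copy decision (illustrative only, not the exact
vhost_net code; upend_idx/done_idx follow struct vhost_net_virtqueue, and
tx_use_zerocopy is an invented helper name):

    /* Old policy: once pending zero-copy DMAs reached VHOST_MAX_PEND,
     * tx processing stopped entirely, so one throttled receiver could
     * stall all tx. New policy: keep transmitting, but fall back to
     * data copy once pending DMAs exceed half of the vq size. */
    static bool tx_use_zerocopy(struct vhost_net_virtqueue *nvq,
                                struct vhost_virtqueue *vq)
    {
            int pending = (nvq->upend_idx - nvq->done_idx + UIO_MAXIOV)
                          % UIO_MAXIOV;

            return pending < vq->num / 2;   /* else: copy, don't stall */
    }
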
Re: KVM VM(windows xp) reseted when running geekbench for about 2 days

2013-04-18 Thread Yan Vugenfirer

On Apr 18, 2013, at 3:55 PM, Gleb Natapov wrote:

> On Thu, Apr 18, 2013 at 12:00:49PM +, Zhanghaoyu (A) wrote:
>> I started 10 VMs (windows xp), then ran the geekbench tool on them; after
>> about 2 days, one of them was reset.
>> I found the reset operation is done by
>> int kvm_cpu_exec(CPUArchState *env)
>> {
>>...
>>   switch (run->exit_reason)
>>   ...
>>case KVM_EXIT_SHUTDOWN:
>>DPRINTF("shutdown\n");
>>qemu_system_reset_request();
>>ret = EXCP_INTERRUPT;
>>break;
>>...
>> }
>> 
>> The KVM_EXIT_SHUTDOWN exit reason was set previously in the triple fault
>> handler handle_triple_fault().
>> 
> How do you know that reset was done here? This is not the only place
> where qemu_system_reset_request() is called.

Make sure XP is not set to auto-reset in case of BSOD. 

Best regards,
Yan.

> 
>> What causes the triple fault?
>> 
> Are you asking what a triple fault is, or why it happened in your case?
> For the former see here: http://en.wikipedia.org/wiki/Triple_fault
> For the latter it is too late to tell after VM reset. You can run QEMU with
> -no-reboot -no-shutdown. The VM will pause instead of rebooting and then you
> can examine what is going on.
> 
> --
>   Gleb.


Re: [Qemu-devel] [PATCH v8] kvm: notify host when the guest is panicked

2012-08-15 Thread Yan Vugenfirer

On Aug 15, 2012, at 12:56 PM, Gleb Natapov wrote:

> On Tue, Aug 14, 2012 at 02:35:34PM -0500, Anthony Liguori wrote:
>>> Do you consider allowing support for Windows as overengineering?
>> 
>> I don't think there is a way to hook BSOD on Windows so attempting to
>> engineer something that works with Windows seems odd, no?
>> 
> Yan says in another email that it is possible to register a bugcheck callback.
> 

Here you go - 
http://msdn.microsoft.com/en-us/library/windows/hardware/ff553105(v=vs.85).aspx
Already done in virtio-net for two reasons: 1. we could configure virtio-net to
notify QEMU in a hacky way (write 1 to the VIRTIO_PCI_ISR register) that there
was a bugcheck. It was very useful for debugging complex WHQL issues that
involved host networking. 2. To store additional information (for example time
stamps of the last received packet, the last interrupt, etc.) in the crash dump.

Yan.
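
For readers unfamiliar with the mechanism: the MSDN link above documents the
kernel-mode bugcheck callback API. A hedged sketch of the trick as described
(the WDM calls are real; the mapped ISR pointer and the function names are
illustrative assumptions, not the actual Windows virtio-net driver code):

    #include <ntddk.h>

    static KBUGCHECK_CALLBACK_RECORD g_BugCheckRecord;
    static volatile UCHAR *g_VirtioIsr;  /* mapped VIRTIO_PCI_ISR (assumed) */

    static VOID BugCheckNotifyHost(PVOID Buffer, ULONG Length)
    {
        UNREFERENCED_PARAMETER(Buffer);
        UNREFERENCED_PARAMETER(Length);
        if (g_VirtioIsr != NULL) {
            /* The hacky notification: write 1 to the ISR register so
             * the host sees activity from a guest that just bugchecked. */
            WRITE_REGISTER_UCHAR((PUCHAR)g_VirtioIsr, 1);
        }
    }

    VOID RegisterBugCheckNotification(VOID)
    {
        KeInitializeCallbackRecord(&g_BugCheckRecord);
        KeRegisterBugCheckCallback(&g_BugCheckRecord, BugCheckNotifyHost,
                                   NULL, 0, (PUCHAR)"virtio_net_panic");
    }
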



Re: [Qemu-devel] [PATCH v8] kvm: notify host when the guest is panicked

2012-08-15 Thread Yan Vugenfirer

On Aug 14, 2012, at 10:35 PM, Anthony Liguori wrote:

> Marcelo Tosatti  writes:
> 
>> On Tue, Aug 14, 2012 at 01:53:01PM -0500, Anthony Liguori wrote:
>>> Marcelo Tosatti  writes:
>>> 
>>>> On Tue, Aug 14, 2012 at 05:55:54PM +0300, Yan Vugenfirer wrote:
>>>>> 
>>>>> On Aug 14, 2012, at 1:42 PM, Jan Kiszka wrote:
>>>>> 
>>>>>> On 2012-08-14 10:56, Daniel P. Berrange wrote:
>>>>>>> On Mon, Aug 13, 2012 at 03:21:32PM -0300, Marcelo Tosatti wrote:
>>>>>>>> On Wed, Aug 08, 2012 at 10:43:01AM +0800, Wen Congyang wrote:
>>>>>>>>> We can know the guest is panicked when the guest runs on xen.
>>>>>>>>> But we do not have such feature on kvm.
>>>>>>>>> 
>>>>>>>>> Another purpose of this feature is: management app(for example:
>>>>>>>>> libvirt) can do auto dump when the guest is panicked. If management
>>>>>>>>> app does not do auto dump, the guest's user can do dump by hand if
>>>>>>>>> he sees the guest is panicked.
>>>>>>>>> 
>>>>>>>>> We have three solutions to implement this feature:
>>>>>>>>> 1. use vmcall
>>>>>>>>> 2. use I/O port
>>>>>>>>> 3. use virtio-serial.
>>>>>>>>> 
>>>>>>>>> We have decided to avoid touching the hypervisor. The reason why I
>>>>>>>>> chose the I/O port is:
>>>>>>>>> 1. it is easier to implement
>>>>>>>>> 2. it does not depend on any virtual device
>>>>>>>>> 3. it can work when starting the kernel
>>>>>>>> 
>>>>>>>> How about searching for the "Kernel panic - not syncing" string 
>>>>>>>> in the guest's serial output? Say libvirtd could take an action upon
>>>>>>>> that?
>>>>>>> 
>>>>>>> No, this is not satisfactory. It depends on the guest OS being
>>>>>>> configured to use the serial port for console output which we
>>>>>>> cannot mandate, since it may well be required for other purposes.
>>>>>> 
>>>>> Please don't forget Windows guests, there is no console and no "Kernel 
>>>>> Panic" string ;)
>>>>> 
>>>>> What I used for debugging purposes on Windows guest is to register a 
>>>>> bugcheck callback in virtio-net driver and write 1 to VIRTIO_PCI_ISR 
>>>>> register.
>>>>> 
>>>>> Yan. 
>>>> 
>>>> Considering whether a "panic-device" should cover other OSes is also
>>>> something to consider. Even for Linux, is "panic" the only case which
>>>> should be reported via the mechanism? What about oopses without panic? 
>>>> 
>>>> Is the mechanism general enough for supporting new events, etc.
>>> 
>>> Hi,
>>> 
>>> I think this discussion is gone of the deep end.
>>> 
>>> Forget about !x86 platforms.  They have their own way to do this sort of
>>> thing.  
>> 
>> The panic function in kernel/panic.c has the following options, which
>> appear to be arch independent, on panic:
>> 
>> - reboot 
>> - blink
> 
> Not sure the semantics of blink but that might be a good place for a
> pvops hook.
> 
>> 
>> None are paravirtual interfaces however.
>> 
>>> Think of this feature like a status LED on a motherboard.  These
>>> are very common and usually controlled by IO ports.
>>> 
>>> We're simply reserving a "status LED" for the guest to indicate that it
>>> has panicked.  Let's not over engineer this.
>> 
>> My concern is that you end up with state that is dependant on x86.
>> 
>> Subject: [PATCH v8 3/6] add a new runstate: RUN_STATE_GUEST_PANICKED
>> 
>> Having the ability to stop/restart the guest (and even introducing a 
>> new VM runstate) is more than a status LED analogy.
> 
> I must admit, I don't know why a new runstate is necessary/useful.  The
> kernel shouldn't have to care about the difference between a halted guest
> and a panicked guest.  That level of information belongs in userspace IMHO.
> 
>> Can this new infrastructure be used by other architectures?
> 
> I guess I don't understand why the kernel side of this isn't anything
> more than a paravirt op hook that does a single outb() with the
> remaining logic handled 100% in QEMU.
> 
>> Do you consider allowing support for Windows as overengineering?
> 
> I don't think there is a way to hook BSOD on Windows so attempting to
> engineer something that works with Windows seems odd, no?

Actually there is a way
(http://msdn.microsoft.com/en-us/library/windows/hardware/ff553105(v=vs.85).aspx).
That's what I just mentioned, already done in the Windows virtio-net driver.

Best regards,
Yan.



Re: [PATCH v8] kvm: notify host when the guest is panicked

2012-08-14 Thread Yan Vugenfirer

On Aug 14, 2012, at 1:42 PM, Jan Kiszka wrote:

> On 2012-08-14 10:56, Daniel P. Berrange wrote:
>> On Mon, Aug 13, 2012 at 03:21:32PM -0300, Marcelo Tosatti wrote:
>>> On Wed, Aug 08, 2012 at 10:43:01AM +0800, Wen Congyang wrote:
>>>> We can know the guest is panicked when the guest runs on xen.
>>>> But we do not have such feature on kvm.
>>>> 
>>>> Another purpose of this feature is: management app(for example:
>>>> libvirt) can do auto dump when the guest is panicked. If management
>>>> app does not do auto dump, the guest's user can do dump by hand if
>>>> he sees the guest is panicked.
>>>> 
>>>> We have three solutions to implement this feature:
>>>> 1. use vmcall
>>>> 2. use I/O port
>>>> 3. use virtio-serial.
>>>> 
>>>> We have decided to avoid touching the hypervisor. The reason why I
>>>> chose the I/O port is:
>>>> 1. it is easier to implement
>>>> 2. it does not depend on any virtual device
>>>> 3. it can work when starting the kernel
>>> 
>>> How about searching for the "Kernel panic - not syncing" string 
>>> in the guest's serial output? Say libvirtd could take an action upon
>>> that?
>> 
>> No, this is not satisfactory. It depends on the guest OS being
>> configured to use the serial port for console output which we
>> cannot mandate, since it may well be required for other purposes.
> 
Please don't forget Windows guests, there is no console and no "Kernel Panic" 
string ;)

What I used for debugging purposes on Windows guest is to register a bugcheck 
callback in virtio-net driver and write 1 to VIRTIO_PCI_ISR register.

Yan. 


> Well, we have more than a single serial port, even when leaving
> virtio-serial aside...
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux
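
A guest-side sketch of the I/O-port option discussed in this thread: hook the
kernel's panic notifier chain and issue a single port write. The port number
and bit value here follow what the later pvpanic patches settled on; treat
them as assumptions in the context of this v8 discussion:

    #include <linux/init.h>
    #include <linux/io.h>
    #include <linux/kernel.h>
    #include <linux/notifier.h>

    #define PVPANIC_PORT      0x505        /* assumed I/O port */
    #define PVPANIC_PANICKED  (1 << 0)

    static int panic_notify(struct notifier_block *nb,
                            unsigned long code, void *unused)
    {
            /* The single outb() "status LED" from the discussion above */
            outb(PVPANIC_PANICKED, PVPANIC_PORT);
            return NOTIFY_DONE;
    }

    static struct notifier_block panic_nb = {
            .notifier_call = panic_notify,
    };

    static int __init panic_notify_init(void)
    {
            atomic_notifier_chain_register(&panic_notifier_list, &panic_nb);
            return 0;
    }
    late_initcall(panic_notify_init);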