On Thu, Jul 16, 2020 at 05:14:32PM +0800, Jason Wang wrote:
> >+static void suspend_vqs(struct mlx5_vdpa_net *ndev)
> >+{
> >+	int i;
> >+
> >+	for (i = 0; i < MLX5_MAX_SUPPORTED_VQS; i++)
> >+		suspend_vq(ndev, &ndev->vqs[i]);
>
> In teardown_virtqueues() it has a check of mvq->num_ent, any reason
> not doing it here?
>
Looks like checking initialized is enough. Will fix this.

> >+
> >+static void mlx5_vdpa_set_vq_ready(struct vdpa_device *vdev, u16 idx, bool ready)
> >+{
> >+	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> >+	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> >+	struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
> >+	int err;
> >+
> >+	if (!mvq->ready && ready && mvq->fw_state != MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY) {
> >+		err = modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY);
> >+		if (err) {
> >+			mlx5_vdpa_warn(mvdev, "failed to modify virtqueue(%d)\n", err);
> >+			return;
> >+		}
> >+	}
>
> I wonder what's the reason of changing vq state on the hardware
> here. I think we can defer it to set_status().

I can defer this to set_status(). I just wonder if it is possible that the
core vdpa driver may call this function with ready equal to false and,
after some time, call it again with ready equal to true.

> (Anyhow we don't sync vq address in set_vq_address()).
>
> >+static u64 mlx5_vdpa_get_features(struct vdpa_device *vdev)
> >+{
> >+	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> >+	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> >+	u16 dev_features;
> >+
> >+	dev_features = MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, device_features_bits_mask);
> >+	ndev->mvdev.mlx_features = mlx_to_vritio_features(dev_features);
> >+	if (MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, virtio_version_1_0))
> >+		ndev->mvdev.mlx_features |= BIT(VIRTIO_F_VERSION_1);
>
> This is interesting. This suggests !VIRTIO_F_VERSION_1 &&
> VIRTIO_F_IOMMU_PLATFORM is valid. But virtio spec doesn't allow such
> configuration.
Will fix.

> So I think you need either:
>
> 1) Fail vDPA device probe when VERSION_1 is not supported
> 2) clear IOMMU_PLATFORM if VERSION_1 is not negotiated
>
> For 2) I guess it can't work, according to the spec, without
> IOMMU_PLATFORM, device need to use PA to access the memory
>
> >+	ndev->mvdev.mlx_features |= BIT(VIRTIO_F_ANY_LAYOUT);

I think this should be removed too.

> >+	ndev->mvdev.mlx_features |= BIT(VIRTIO_F_IOMMU_PLATFORM);
> >+	if (mlx5_vdpa_max_qps(ndev->mvdev.max_vqs) > 1)
> >+		ndev->mvdev.mlx_features |= BIT(VIRTIO_NET_F_MQ);

Also this, since multiqueue requires a control virtqueue, which is not
supported in this version.

> >+
> >+	print_features(mvdev, ndev->mvdev.mlx_features, false);
> >+	return ndev->mvdev.mlx_features;
> >+}
> >+
> >+static int verify_min_features(struct mlx5_vdpa_dev *mvdev, u64 features)
> >+{
> >+	/* FIXME: qemu currently does not set all the feaures due to a bug.
> >+	 * Add checks when this is fixed.
> >+	 */
>
> I think we should add the check now then qemu can get notified. (E.g
> IOMMU_PLATFORM)

Will do.

> >+}
> >+
> >+#define MLX5_VDPA_MAX_VQ_ENTRIES 256
>
> Is this a hardware limitation, qemu can support up to 1K which the
> requirement of some NFV cases.

Theoretically the device's limit is much higher. In the near future we will
have a device capability for this. I wanted to stay on the safe side here,
but I can change this if you think it's necessary.

> >+
> >+static void mlx5_vdpa_get_config(struct vdpa_device *vdev, unsigned int offset, void *buf,
> >+				 unsigned int len)
> >+{
> >+	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> >+	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> >+
> >+	if (offset + len < sizeof(struct virtio_net_config))
> >+		memcpy(buf, &ndev->config + offset, len);
>
> Note that guest expect LE, what's the endian for ndev->config?

This is struct virtio_net_config from include/uapi/linux/virtio_net.h.
Looking there I see it has mixed endianness.
I currently don't touch these fields, but if I do, I will follow the
endianness of each field per the struct definition. Since this code only
copies the bytes as-is, I don't think I need to care about endianness when
copying.

> >+}
> >+
> >+static void mlx5_vdpa_set_config(struct vdpa_device *vdev, unsigned int offset, const void *buf,
> >+				 unsigned int len)
> >+{
> >+	/* not supported */
> >+}
> >+
> >+static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
> >+{
> >+	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> >+
> >+	return mvdev->generation;
> >+}
> >+
> >+static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
> >+{
> >+	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> >+	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> >+	bool change_map;
> >+	int err;
> >+
> >+	err = mlx5_vdpa_handle_set_map(mvdev, iotlb, &change_map);
> >+	if (err) {
> >+		mlx5_vdpa_warn(mvdev, "set map failed(%d)\n", err);
> >+		return err;
> >+	}
> >+
> >+	if (change_map)
> >+		return mlx5_vdpa_change_map(ndev, iotlb);
>
> Any reason for not doing this inside mlx5_handle_set_map()?

All memory registration related operations are done inside mr.c in the
common code directory. But changing the map involves operations on other
objects which are managed in this file.

> >+
> >+void mlx5_vdpa_remove_dev(struct mlx5_vdpa_dev *mvdev)
> >+{
> >+	struct mlx5_vdpa_net *ndev;
> >+
> >+	mvdev->status = 0;
>
> This is probably unnecessary.

Right, will remove.