Hi Zhiyong,

On 11/30/2017 10:46 AM, Zhiyong Yang wrote:
The vhostpci PMD is a new driver that runs in the guest OS and drives the
vhostpci modern PCI device, which is a new virtio device.

The following link describes the vhostpci design:

An initial device design is presented at KVM Forum'16:
http://www.linux-kvm.org/images/5/55/02x07A-Wei_Wang-Design_of-Vhost-pci.pdf
The latest device design and implementation will be posted to the QEMU 
community soon.

The vhostpci PMD works in a pair with the virtio-net PMD to achieve
point-to-point communication between VMs. DPDK already has a virtio/vhost-user
PMD pair that handles RX/TX between guest and host. For VM2VM use cases,
however, the virtio PMD first has to transmit packets from VM1 to the host OS
through a vhost-user port, and the packets are then transmitted to the second
VM through another virtio PMD port. The virtio/vhostpci PMD pair uses shared
memory to receive/transmit packets directly between the two VMs. Currently,
the entire memory of the virtio-net side VM is shared with the vhost-pci side
VM and mapped via device BAR2; the first 4KB of BAR2 is reserved for metadata.
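
As a rough illustration of the BAR2 layout described above, the sketch below
translates a guest-physical address of the virtio-net side VM into a local
pointer inside the mapped BAR. The struct, its fields and the function name are
hypothetical and not taken from the series; it also assumes the shared memory
starts right after the 4KB metadata area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the metadata area at the start of BAR2 (4KB, as described above). */
#define METADATA_SIZE 4096u

/* Hypothetical description of one memory region of the remote VM. */
struct remote_region {
    uint64_t gpa_start; /* guest-physical start address in the remote VM */
    uint64_t size;      /* region length in bytes */
    uint64_t offset;    /* assumed offset within the area after the metadata */
};

/*
 * Translate a remote guest-physical address into a pointer inside the
 * locally mapped BAR2. Returns NULL if no region covers the address.
 */
static void *
remote_gpa_to_local(uint8_t *bar2_base, const struct remote_region *regions,
                    size_t nregions, uint64_t gpa)
{
    assert(bar2_base != NULL && regions != NULL);

    for (size_t i = 0; i < nregions; i++) {
        const struct remote_region *r = &regions[i];

        /* Subtract first so the range check cannot overflow. */
        if (gpa >= r->gpa_start && gpa - r->gpa_start < r->size)
            return bar2_base + METADATA_SIZE + r->offset +
                   (gpa - r->gpa_start);
    }
    return NULL;
}
```

A lookup like this would be needed whenever the vhostpci side follows an
address found in the remote vring, since those addresses are only meaningful in
the virtio-net side VM's address space.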

The vhostpci/virtio PMD workflow is as follows:

1. VM1 starts up with the vhostpci device. Bind the device to DPDK in guest1
and launch DPDK testpmd, which then waits for the remote memory info (VM2's
shared memory, memory regions and vring info).

2. VM2 starts up with the virtio-net device. Bind the virtio-net device to
DPDK in VM2 and run testpmd using the virtio PMD.

3. The vhostpci device negotiates virtio messages with the virtio-net device
over a socket, as vhost-user/virtio-net do.

4. The vhostpci device gets VM2's memory region and vring info and writes the
metadata to VM2's shared memory.

5. When the metadata is ready to be read by the vhostpci PMD, the PMD
receives a config interrupt with LINK_UP set in the status config.

6. The vhostpci PMD and the virtio PMD can then transmit/receive packets.

How to test?

1. Launch VM1 with the vhostpci device.
qemu/x86_64-softmmu/qemu-system-x86_64 -cpu host -M pc -enable-kvm \
-smp 16,threads=1,sockets=1 -m 8G -mem-prealloc -realtime mlock=on \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem \
-drive if=virtio,file=/root/vhost-pci/guest1.img,format=raw \
-kernel /opt/guest_kernel \
-append 'root=/dev/vda1 ro default_hugepagesz=1G hugepagesz=1G hugepages=2 console=ttyS0,115200,8n1 3' \
-netdev tap,id=net1,br=br0,script=/etc/qemu-ifup \
-chardev socket,id=slave1,server,wait=off,path=/opt/vhost-pci-slave1 \
-device vhost-pci-net-pci,chardev=slave1 \
-nographic

2. Bind the vhostpci device to DPDK using igb_uio, then start DPDK:
./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 -- -i

3. Launch VM2 with the virtio-net device.

qemu/x86_64-softmmu/qemu-system-x86_64 -cpu host -M pc -enable-kvm \
-smp 4,threads=1,sockets=1 -m 8G -mem-prealloc -realtime mlock=on \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem \
-drive if=virtio,file=/root/vhost-pci/guest2.img,format=raw \
-net none -no-hpet -kernel /opt/guest_kernel \
-append 'root=/dev/vda1 ro default_hugepagesz=1G hugepagesz=1G hugepages=2 console=ttyS0,115200,8n1 3' \
-chardev socket,id=sock2,path=/opt/vhost-pci-slave1 \
-netdev type=vhost-user,id=net2,chardev=sock2,vhostforce \
-device virtio-net-pci,mac=52:54:00:00:00:02,netdev=net2 \
-nographic

4. Bind virtio-net to DPDK using igb_uio, then run DPDK:

./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 --socket-mem 512,0 \
-- -i --rxq=1 --txq=1 --nb-cores=1

5. On the vhostpci PMD side, run "start".

6. On the virtio PMD side, run "start tx_first".

Loopback testing then works.

Notes:
1. Only igb_uio is supported for now.
2. The vhostpci device is a modern PCI device. The vhostpci PMD only supports
mergeable mode, so the virtio device side must also use mergeable mode.
3. The vhostpci PMD supports one queue pair for now.
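
Note 2 could translate into a check on the negotiated feature bits along these
lines. The bit numbers for VIRTIO_F_VERSION_1 (32) and VIRTIO_NET_F_MRG_RXBUF
(15) come from the virtio specification; the helper names are hypothetical and
this is not necessarily how the series performs the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_NET_F_MRG_RXBUF 15 /* mergeable RX buffers */
#define VIRTIO_F_VERSION_1     32 /* modern (virtio 1.0) device */

/* Build a 64-bit mask for one feature bit. */
static uint64_t
feature_mask(unsigned int bit)
{
    assert(bit < 64);
    return UINT64_C(1) << bit;
}

/*
 * Hypothetical helper: accept the negotiated features only if the device
 * is modern and mergeable RX buffers are enabled.
 */
static bool
vpnet_features_ok(uint64_t negotiated)
{
    uint64_t required = feature_mask(VIRTIO_F_VERSION_1) |
                        feature_mask(VIRTIO_NET_F_MRG_RXBUF);

    return (negotiated & required) == required;
}
```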

Zhiyong Yang (11):
   drivers/net: add vhostpci PMD base files
   net/vhostpci: public header files
   net/vhostpci: add debugging log macros
   net/vhostpci: add basic framework
   net/vhostpci: add queue setup
   net/vhostpci: add support for link status change
   net/vhostpci: get remote memory region and vring info
   net/vhostpci: add RX function
   net/vhostpci: add TX function
   net/vhostpci: support RX/TX packets statistics
   net/vhostpci: update release note

  MAINTAINERS                                       |    6 +
  config/common_base                                |    9 +
  config/common_linuxapp                            |    1 +
  doc/guides/rel_notes/release_18_02.rst            |    6 +
  drivers/net/Makefile                              |    1 +
  drivers/net/vhostpci/Makefile                     |   54 +
  drivers/net/vhostpci/rte_pmd_vhostpci_version.map |    3 +
  drivers/net/vhostpci/vhostpci_ethdev.c            | 1521 +++++++++++++++++++++
  drivers/net/vhostpci/vhostpci_ethdev.h            |  176 +++
  drivers/net/vhostpci/vhostpci_logs.h              |   69 +
  drivers/net/vhostpci/vhostpci_net.h               |   74 +
  drivers/net/vhostpci/vhostpci_pci.c               |  334 +++++
  drivers/net/vhostpci/vhostpci_pci.h               |  240 ++++
  mk/rte.app.mk                                     |    1 +
  14 files changed, 2495 insertions(+)
  create mode 100644 drivers/net/vhostpci/Makefile
  create mode 100644 drivers/net/vhostpci/rte_pmd_vhostpci_version.map
  create mode 100644 drivers/net/vhostpci/vhostpci_ethdev.c
  create mode 100644 drivers/net/vhostpci/vhostpci_ethdev.h
  create mode 100644 drivers/net/vhostpci/vhostpci_logs.h
  create mode 100644 drivers/net/vhostpci/vhostpci_net.h
  create mode 100644 drivers/net/vhostpci/vhostpci_pci.c
  create mode 100644 drivers/net/vhostpci/vhostpci_pci.h


Thanks for the RFC.
It seems there is a lot of code duplication between this series and
libvhost-user.

Would the non-RFC version reuse libvhost-user? I'm thinking of all
the code copied from virtio-net.c, for example.

If not, I think this is problematic as it will double the maintenance
cost.

Cheers,
Maxime
