On 1/26/2016 10:58 AM, Tetsuya Mukawa wrote:
> On 2016/01/25 19:15, Xie, Huawei wrote:
>> On 1/22/2016 6:38 PM, Tetsuya Mukawa wrote:
>>> On 2016/01/22 17:14, Xie, Huawei wrote:
>>>> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>>>>> virtio: Extend virtio-net PMD to support container environment
>>>>>
>>>>> The patch adds a new virtio-net PMD configuration that allows the PMD
>>>>> to work on the host as if the PMD were in a VM.
>>>>> Here is the new configuration for the virtio-net PMD.
>>>>>  - CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE
>>>>> To use this mode, EAL needs physically contiguous memory. To allocate
>>>>> such memory, add the "--shm" option to the application command line.
>>>>>
>>>>> To prepare a virtio-net device on the host, the user needs to invoke a
>>>>> QEMU process in the special qtest mode. This mode is mainly used for
>>>>> testing QEMU devices from an outer process. In this mode, no guest runs.
>>>>> Here is the QEMU command line.
>>>>>
>>>>>  $ qemu-system-x86_64 \
>>>>>      -machine pc-i440fx-1.4,accel=qtest \
>>>>>      -display none -qtest-log /dev/null \
>>>>>      -qtest unix:/tmp/socket,server \
>>>>>      -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
>>>>>      -device virtio-net-pci,netdev=net0,mq=on \
>>>>>      -chardev socket,id=chr1,path=/tmp/ivshmem,server \
>>>>>      -device ivshmem,size=1G,chardev=chr1,vectors=1
>>>>>
>>>>>  * One QEMU process is needed per port.
>>>> Does qtest support hot-plugging a virtio-net PCI device, so that we
>>>> could run one QEMU process on the host which provisions the virtio-net
>>>> virtual devices for the container?
>>> Theoretically, we can use hot plug in some cases.
>>> But I guess we have 3 concerns here.
>>>
>>> 1. Security.
>>> If we share a QEMU process between multiple DPDK applications, this QEMU
>>> process will have all the fds of the applications in different
>>> containers. In some cases, this will be a security concern.
>>> So I guess we need to support the current 1:1 configuration at least.
>>>
>>> 2. Shared memory.
>>> Currently, QEMU and the DPDK application map the shared memory at the
>>> same virtual address.
>>> So if multiple DPDK applications connect to one QEMU process, each DPDK
>>> application would need a different address for the shared memory. I
>>> guess this will be a big limitation.
>>>
>>> 3. PCI bridge.
>>> So far, QEMU has one PCI bridge, so we can connect roughly 10 PCI
>>> devices to QEMU.
>>> (I forget the exact number, but it is about 10, because some slots are
>>> reserved by QEMU.)
>>> A DPDK application needs both a virtio-net and an ivshmem device, so I
>>> guess about 5 DPDK applications can connect to one QEMU process, so far.
>>> Adding more PCI bridges would solve this, but we would need a lot of
>>> additional implementation to support cascaded PCI bridges and PCI devices.
>>> (Also we would need to solve the 2nd concern above.)
>>>
>>> Anyway, if we use the virtio-net PMD and the vhost-user PMD, the QEMU
>>> process will not do anything after initialization.
>>> (QEMU will try to read the qtest socket, then be stopped because there
>>> are no messages after initialization.)
>>> So I guess we can ignore the overhead of these QEMU processes.
>>> If someone cannot ignore it, I guess this is one of the cases where it
>>> is nice to use your lightweight container implementation.
>> Thanks for the explanation. Also, in your opinion, where is the best
>> place to run the QEMU instance? If we run the QEMU instances on the
>> host, for vhost-kernel support, we could get rid of the root privilege
>> issue.
> Do you mean below?
> If we deploy the QEMU instance on the host, we can start a container
> without the root privilege.
> (But on the host, the QEMU instance still needs the privilege to access
> vhost-kernel.)
There is no issue running the QEMU instance with root privilege on the host,
but I think it is not acceptable to grant the container root privilege.

> If so, I agree that deploying the QEMU instance on the host or in another
> privileged container will be nice.
> In the case of vhost-user, deploying on the host or in a non-privileged
> container will be good.
>
>> Another issue: do you plan to support multiple virtio devices in a
>> container? Currently I find the code assuming only one virtio-net device
>> in QEMU, right?
> Yes, so far, 1 port needs 1 QEMU instance.
> So if you need multiple virtio devices, you need to invoke multiple QEMU
> instances.
>
> Do you want to deploy 1 QEMU instance for each DPDK application, even if
> the application has multiple virtio-net ports?
>
> So far, I am not sure whether we need it, because this type of DPDK
> application will need only one port in most cases.
> But if you need this, yes, I can implement it using the QEMU PCI hotplug
> feature.
> (But probably we can only attach about 10 ports. This will be a
> limitation.)

I am OK with supporting one virtio device for the first version.

>> Btw, I have read most of your qtest code. No obvious issues found so far,
>> but quite a couple of nits. You must have spent a lot of time on this.
>> It is great work!
> I appreciate your reviewing!
>
> BTW, my container implementation needed a QEMU patch in the case of
> vhost-user.
> But the patch has been merged in upstream QEMU, so we don't have this
> limitation any more.

Great, better to put the QEMU dependency information in the commit message.

> Thanks,
> Tetsuya
>
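For completeness, a minimal sketch of how the 1:1 setup discussed above could
be driven on the host: one QEMU process per port in qtest mode (the command
line from the patch description), then the DPDK application started with the
proposed "--shm" EAL option. The virtual device argument in the testpmd line
below is only an assumption for illustration; the actual device name and
parameters are defined by the patch series, not by this sketch.

 # 1. Start one QEMU qtest process per port (no guest runs), using the
 #    command line from the patch description: -qtest unix:/tmp/socket,server
 #    plus the virtio-net-pci and ivshmem devices.
 #
 # 2. Start the DPDK application on the host with physically contiguous
 #    shared memory ("--shm"). The vdev name and arguments are hypothetical.
 $ testpmd -c 0x3 -n 4 --shm \
     --vdev 'eth_cvio0,qtest=/tmp/socket,ivshmem=/tmp/ivshmem' \
     -- -i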

