On Thu, Dec 17, 2009 at 03:39:05PM -0600, Anthony Liguori wrote:
> Chris Wright wrote:
> >
> > Doesn't sound useful. Low-level, sure worth being able to turn things
> > on and off for testing/debugging, but probably not something a user
> > should be burdened with in libvirt.
> >
> > But I don't understand your -net vhost,fd=X, that would still be -net
> > tap,fd=X, no? IOW, vhost is an internal qemu impl. detail of the virtio
> > backend (or if you get your wish, $nic_backend).
>
> I don't want to get bogged down in a qemu-devel discussion on
> libvirt-devel :-)
>
> But from a libvirt perspective, I assume that it wants to open up
> /dev/vhost in order to not have to grant the qemu instance privileges,
> which means that it needs to hand qemu the file descriptor to it.
>
> Given a file descriptor, I don't think qemu can easily tell whether it's
> a tun/tap fd or whether it's a vhost fd. Since they have different
> interfaces, we need libvirt to tell us which one it is. Whether that's
> -net tap,vhost or -net vhost, we can figure that part out on qemu-devel :-)
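[Editor's note: for readers unfamiliar with the mechanism under discussion, handing an already-open file descriptor from a management process (libvirt) to another process (QEMU) is done either by fd inheritance across exec or by sending SCM_RIGHTS ancillary data over a Unix socket. The following is a minimal sketch of the SCM_RIGHTS mechanism only; an ordinary temp file stands in for the real tap/vhost fd, and none of this is libvirt or QEMU code.]

```python
import array
import os
import socket
import tempfile

# Stand-in for a tap/vhost fd opened by the privileged management
# process; here it is just a temp file for illustration.
tmp = tempfile.TemporaryFile()
tmp.write(b"hello from the manager")
tmp.flush()

# In the real scenario this would be the control socket between
# libvirt and QEMU.
parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

# Sender: attach the fd as SCM_RIGHTS ancillary data.
fds = array.array("i", [tmp.fileno()])
parent.sendmsg([b"fd"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds.tobytes())])

# Receiver: recover the fd from the ancillary data.
msg, ancdata, flags, addr = child.recvmsg(16, socket.CMSG_LEN(fds.itemsize))
received_fd = None
for level, ctype, data in ancdata:
    if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
        received_fd = array.array("i", data[:fds.itemsize])[0]

# The received fd refers to the same open file description, so the
# receiver can use it directly.
os.lseek(received_fd, 0, os.SEEK_SET)
payload = os.read(received_fd, 64)
print(payload.decode())
```

Note that nothing about the received fd itself says what kind of device it refers to, which is exactly why the command line has to spell out tap vs. vhost.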
That is no problem; since we already do that kind of thing for TAP
devices, it is perfectly feasible for us to also do it for vhost FDs.

> >> The more interesting invocation of vhost-net though is one where the
> >> vhost-net device backs directly to a physical network card. In this
> >> mode, vhost should get considerably better performance than the
> >> current implementation. I don't know the syntax yet, but I think
> >> it's reasonable to assume that it will look something like -net
> >> tap,dev=eth0. The effect will be that eth0 is dedicated to the
> >> guest.
> >
> > tap? we'd want either macvtap or raw socket here.
>
> I screwed up. I meant to say, -net vhost,dev=eth0. But maybe it
> doesn't matter if libvirt is the one that initializes the vhost device,
> sets up the raw socket (or macvtap), and hands us a file descriptor.
>
> In general, I think it's best to avoid as much network configuration in
> qemu as humanly possible, so I'd rather see libvirt configure the vhost
> device ahead of time and pass us an fd that we can start using.

Agreed; if we can avoid needing to give QEMU CAP_NET_ADMIN then that is
preferred. Indeed, when libvirt runs QEMU as root, we already strip it
of CAP_NET_ADMIN (and all other capabilities).

> >> Another model would be to have libvirt see an SR-IOV adapter as a
> >> network pool where it handles all of the VF management.
> >> Considering how inflexible SR-IOV is today, I'm not sure whether
> >> this is the best model.
> >
> > We already need to know the VF<->PF relationship. For example, we
> > don't want to assign a VF to a guest, then a PF to another guest,
> > for basic sanity reasons. As we get better ability to manage the
> > embedded switch in an SR-IOV NIC we will need to manage them as
> > well. So we do need to have some concept of managing an SR-IOV
> > adapter.
>
> But we still need to support the notion of backing a VNIC to a NIC, no?
> If this just happens to also work with a naive usage of SR-IOV, is that
> so bad?
> :-)
>
> Long term, yes, I think you want to manage SR-IOV adapters as if they're
> a network pool. But since they're sufficiently inflexible right now,
> I'm not sure it's all that useful today.

FYI, we have generic capabilities for creating & deleting host devices
via the virNodeDevCreate / virNodeDevDestroy APIs. We use this for
creating & deleting NPIV SCSI adapters. If we need to support this for
some types of NICs too, that fits into the model fine.

> > So I think we want to maintain a concept of the qemu backend (virtio,
> > e1000, etc), the fd that connects the qemu backend to the host (tap,
> > socket, macvtap, etc), and the bridge. The bridge bit gets a little
> > complicated. We have the following bridge cases:
> >
> > - sw bridge
> >   - normal existing setup, w/ Linux bridging code
> >   - macvlan
> > - hw bridge
> >   - on SR-IOV card
> >     - configured to simply fwd to external hw bridge (like VEPA mode)
> >     - configured as a bridge w/ policies (QoS, ACL, port mirroring,
> >       etc. and allows inter-guest traffic and looks a bit like above
> >       sw switch)
> > - external
> >   - need to possibly inform switch of incoming vport
>
> I've got mixed feelings here. With respect to sw vs. hw bridge, I
> really think that that's an implementation detail that should not be
> exposed to a user. A user doesn't typically want to think about whether
> they're using a hardware switch vs. a software switch. Instead, they
> approach it from "I want to have this network topology, and these
> features enabled".

Agree, there is a lot of low-level detail there, and I think it will be
very hard for users, or apps, to gain enough knowledge to make
intelligent decisions about which they should use. So I don't think we
want to expose all that detail.
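[Editor's note: purely as an illustration of describing these bridge options by the capabilities they provide rather than by their implementation, here is a hypothetical sketch. Every name below is invented for this example; none of it is a real libvirt API or schema.]

```python
from dataclasses import dataclass

# Hypothetical capability-oriented description of the bridge options
# discussed in the thread. Names and fields are invented for
# illustration only.
@dataclass(frozen=True)
class NetBackendCaps:
    name: str
    inter_guest_traffic: bool    # can guests on the same host talk directly?
    qos_acl: bool                # QoS / ACL policy enforcement available?
    needs_external_switch: bool  # relies on an adjacent hw switch (VEPA-like)?

BACKENDS = [
    NetBackendCaps("linux-bridge", inter_guest_traffic=True,
                   qos_acl=False, needs_external_switch=False),
    NetBackendCaps("macvlan-vepa", inter_guest_traffic=False,
                   qos_acl=False, needs_external_switch=True),
    NetBackendCaps("sriov-bridge", inter_guest_traffic=True,
                   qos_acl=True, needs_external_switch=False),
]

# An application would then select by required capability, never by
# implementation name:
wanted = [b.name for b in BACKENDS if b.inter_guest_traffic]
print(wanted)  # → ['linux-bridge', 'sriov-bridge']
```

The point of the sketch is only that the sw/hw distinction disappears from the selection logic: the caller states what it needs and the management layer maps that onto whichever implementation satisfies it.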
For a libvirt representation, we need to consider it more in terms of
what capabilities each option provides, rather than what implementation
each option uses.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org        -o-        http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list