Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.
From: Rusty Russell <[EMAIL PROTECTED]>
Date: Sun, 20 Apr 2008 02:41:14 +1000

> If only there were some kind of, I don't know... summit... for kernel
> people...

I'm starting to disbelieve the myth that because we can discuss technical issues on mailing lists, we should talk primarily about process issues during the kernel summit.

There is a distinct advantage to discussing and hashing things out in person. You can't say "screw you, your idea sucks" when you're face to face with the other person, whereas online it's way too easy.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [PATCH 5/5] tun: vringfd xmit support.
On Sun, 20 Apr 2008 00:41:43 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Saturday 19 April 2008 05:06:34 Andrew Morton wrote:
> > On Sat, 19 Apr 2008 01:15:15 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > > > What is the maximum number of pages which an unprivileged user can
> > > > concurrently pin with this code?
> > >
> > > Since only root can open the tun device, it's currently OK. The old code
> > > kmalloced and copied: is there some mm-fu reason why pinning userspace
> > > memory is worse?
> >
> > We generally try to avoid it - it allows users to DoS the box.
>
> My question is: is pinning a page worse than allocating a (kernel) page in
> some way?

I guess pinning is not as bad as straight-out allocating. Pinning is limited to the size of the program's VM. Pinning is at least pinning something which is accounted and is exposed to admin tools.

But they're both pretty similar in effect and risk.
Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.
On Saturday 19 April 2008 05:38:50 Michael Kerrisk wrote:
> On 4/18/08, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > This may be our third high-bandwidth user/kernel interface to
> > transport bulk data ("hbukittbd") which was implemented because its
> > predecessors weren't quite right. In a year or two's time someone else
> > will need a hbukittbd and will find that the existing three aren't quite
> > right and will give us another one. One day we need to stop doing this
> > ;)

If only there were some kind of, I don't know... summit... for kernel people...

> > It could be that this person will look at Rusty's hbukittbd and find
> > that it _could_ be tweaked to do what he wants, but it's already shipping
> > and it's part of the kernel API and hence can't be made to do what he
> > wants.

Indeed. I marked it experimental because of these questions (i.e. it's not yet kernel ABI). Getting everyone's attention is hard though, so I figured we'd put it in as a device and move to a syscall if and when we feel it's ready.

> > So I think it would be good to plonk the proposed interface on the table
> > and have a poke at it. Is it compat-safe? Is it extensible in a
> > backward-compatible fashion? Are there future-safe changes we should
> > make to it? Can Michael Kerrisk understand, review and document it?
> > etc.
>
> Well, it helps if he's CCed

It is compat safe, and we've already extended it once, so I'm reasonably happy so far. If it were a syscall I'd add a flags arg; for the device it'd be an ioctl.

Starting with the virtio ABI seemed a reasonable first step, because *we* can use this today even if no one else does.

> I'm happy to work *with someone* on the documentation (pointless to do
> it on my own -- how do I know what Rusty's *intended* behavior for the
> interface is), and review, and testing.

Document coming up...
Rusty.
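The remark about a flags argument refers to a common way of keeping an entry point extensible: reject every flag bit that is not yet defined, so a later version can give those bits meaning without changing behaviour for existing callers. A minimal sketch of that idea follows. The names `RING_F_EXAMPLE`, `RING_F_KNOWN` and `ring_check_flags()` are hypothetical and appear nowhere in the posted patches.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits; the posted interface defines none of these. */
#define RING_F_EXAMPLE  0x1u
#define RING_F_KNOWN    (RING_F_EXAMPLE)

/*
 * Hypothetical validation step for a setup call that takes a flags word.
 * Any bit outside RING_F_KNOWN is refused, so a program written against
 * a newer interface fails cleanly instead of being silently misread.
 */
int ring_check_flags(unsigned int flags)
{
    if (flags & ~RING_F_KNOWN)
        return -EINVAL;
    return 0;
}
```

A caller passing `0` or `RING_F_EXAMPLE` is accepted, while any other bit yields `-EINVAL`.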
Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.
On Sunday 20 April 2008 02:33:22 Evgeniy Polyakov wrote:
> On Sun, Apr 20, 2008 at 02:05:31AM +1000, Rusty Russell ([EMAIL PROTECTED]) wrote:
> > There are two reasons not to grab the lock. It turns out that if we
> > tried to lock here, we'd deadlock, since the callbacks are called under
> > the lock. Secondly, it's possible to implement an atomic
> > vring_used_buffer variant, which could fail: this would avoid using the
> > thread most of the time.
>
> Yep, I decided that too. But it limits its usage to tun only or any
> other system where only single thread picks up results, so no generic
> userspace ring buffers?

I don't think so, it just externalizes the locking. The mutex protects the attaching and detaching of the ops structure; some other lock or code protects simultaneous kernel ring accesses.

Cheers,
Rusty.
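The split described in this message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct ring`, `ring_ops`, `ring_attach()`, `ring_notify()` and `ring_get_buffer()` are hypothetical stand-ins rather than the patch's `struct vring_info` API, and a C11 `atomic_flag` plays the role of the non-recursive mutex. One lock covers attach/detach and is held while the callback runs; the ring accessor takes no lock and relies on its caller for serialisation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the patch's structures. */
struct ring;

struct ring_ops {
    /* Invoked by ring_notify() with ring->ops_lock already held. */
    void (*pull)(struct ring *r);
};

struct ring {
    atomic_flag ops_lock;        /* toy non-recursive lock guarding ops */
    const struct ring_ops *ops;
    unsigned int avail;          /* buffers published by the other side */
    unsigned int consumed;       /* buffers taken by ring_get_buffer() */
    bool saw_lock_held;          /* recorded by the demo callback */
};

struct demo_result {
    unsigned int consumed;
    bool saw_lock_held;
};

static bool ring_trylock(struct ring *r)
{
    return !atomic_flag_test_and_set(&r->ops_lock);
}

static void ring_unlock(struct ring *r)
{
    atomic_flag_clear(&r->ops_lock);
}

/* Attach is the operation the lock exists for. */
void ring_attach(struct ring *r, const struct ring_ops *ops)
{
    while (!ring_trylock(r))
        ;                        /* spin; acceptable in a toy model */
    r->ops = ops;
    ring_unlock(r);
}

/* Takes no lock: callers must serialise ring access themselves. */
int ring_get_buffer(struct ring *r)
{
    if (r->avail == 0)
        return 0;
    r->avail--;
    r->consumed++;
    return 1;
}

/* Demo callback: runs under ops_lock and drains the ring. */
static void demo_pull(struct ring *r)
{
    if (ring_trylock(r))
        ring_unlock(r);          /* not expected: notify holds the lock */
    else
        r->saw_lock_held = true; /* a blocking lock here would hang */
    while (ring_get_buffer(r))
        ;
}

static const struct ring_ops demo_ops = { .pull = demo_pull };

/* Calls the attached callback with ops_lock held. */
void ring_notify(struct ring *r)
{
    while (!ring_trylock(r))
        ;
    if (r->ops)
        r->ops->pull(r);
    ring_unlock(r);
}

/* Attach a consumer, publish three buffers, and deliver one notification. */
struct demo_result ring_demo(void)
{
    struct ring r = { .ops_lock = ATOMIC_FLAG_INIT, .avail = 3 };

    ring_attach(&r, &demo_ops);
    ring_notify(&r);
    return (struct demo_result){ r.consumed, r.saw_lock_held };
}
```

In this model the callback observes the lock as already held, which is why an accessor that tried to take the same lock would never return. A consumer with several threads would wrap `ring_get_buffer()` in a lock of its own choosing.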
Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.
On Saturday 19 April 2008 20:22:15 Evgeniy Polyakov wrote:
> Hi.
>
> On Fri, Apr 18, 2008 at 02:39:48PM +1000, Rusty Russell ([EMAIL PROTECTED]) wrote:
> > +int vring_get_buffer(struct vring_info *vr,
> > +                     struct iovec *in_iov,
> > +                     unsigned int *num_in, unsigned long *in_len,
> > +                     struct iovec *out_iov,
> > +                     unsigned int *num_out, unsigned long *out_len)
> > +{
> > +        unsigned int i, in = 0, out = 0;
> > +        unsigned long dummy;
> > +        u16 avail, last_avail, head;
> > +        struct vring_desc d;
>
> Should this whole function and vring_used_buffer() be protected with
> vr->lock mutex?

No; it's up to the caller to make sure that they are serialized. In the case of tun that happens naturally.

There are two reasons not to grab the lock. It turns out that if we tried to lock here, we'd deadlock, since the callbacks are called under the lock. Secondly, it's possible to implement an atomic vring_used_buffer variant, which could fail: this would avoid using the thread most of the time.

Hope that helps,
Rusty.
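The second reason, an atomic variant that is allowed to fail, suggests a fast-path/slow-path arrangement: try to complete the buffer without sleeping, and hand it to the thread only when that attempt fails. The sketch below models this in userspace. Everything in it is an assumption made for illustration: `struct toy_ring`, `used_buffer_atomic()`, `used_buffer_worker()` and `complete_buffer()` are hypothetical names, and the `page_resident` flag merely stands in for whatever condition would force the real code to sleep, which the thread does not spell out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

/* Hypothetical model of a ring; not the patch's struct vring_info. */
struct toy_ring {
    unsigned int used;                    /* completions published so far */
    bool page_resident;                   /* stand-in for "no sleep needed" */
    unsigned int deferred[MAX_DEFERRED];  /* ids left for the worker */
    size_t ndeferred;
};

/* Fast path: must not sleep, so it gives up when it would have to. */
int used_buffer_atomic(struct toy_ring *r, unsigned int id)
{
    (void)id;                    /* a real ring would record the id */
    if (!r->page_resident)
        return -EAGAIN;
    r->used++;
    return 0;
}

/* Slow path, run from a context that may sleep (the "thread"). */
void used_buffer_worker(struct toy_ring *r)
{
    r->page_resident = true;     /* pretend the blocking work was done */
    r->used += (unsigned int)r->ndeferred;
    r->ndeferred = 0;
}

/* Try the fast path; queue for the worker only when it fails. */
int complete_buffer(struct toy_ring *r, unsigned int id)
{
    int err = used_buffer_atomic(r, id);

    if (err != -EAGAIN)
        return err;
    if (r->ndeferred == MAX_DEFERRED)
        return -ENOSPC;
    r->deferred[r->ndeferred++] = id;
    return 1;                    /* deferred to the worker */
}

/* Two fast completions, one deferred, then the worker catches up. */
unsigned int fallback_demo(void)
{
    struct toy_ring r = { .page_resident = true };

    complete_buffer(&r, 0);
    complete_buffer(&r, 1);
    r.page_resident = false;
    complete_buffer(&r, 2);
    assert(r.used == 2 && r.ndeferred == 1);
    used_buffer_worker(&r);
    return r.used;
}
```

Only the third completion in `fallback_demo()` reaches the worker, which matches the stated goal of avoiding the thread most of the time.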
Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.
> So I think it would be good to plonk the proposed interface on the table
> and have a poke at it. Is it compat-safe? Is it extensible in a
> backward-compatible fashion? Are there future-safe changes we should make
> to it? Can Michael Kerrisk understand, review and document it? etc.
>
> You know what I'm saying ;) What is the proposed interface?

So, I'm not Michael, but I *did* make an attempt to document this interface - user and kernel sides - so that it could be more easily understood:

http://lwn.net/Articles/276856/

That was the previous posting, but a quick look suggests it hasn't changed *that* much in this round.

jon
Re: [PATCH 5/5] tun: vringfd xmit support.
On Saturday 19 April 2008 05:06:34 Andrew Morton wrote:
> On Sat, 19 Apr 2008 01:15:15 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > > What is the maximum number of pages which an unprivileged user can
> > > concurrently pin with this code?
> >
> > Since only root can open the tun device, it's currently OK. The old code
> > kmalloced and copied: is there some mm-fu reason why pinning userspace
> > memory is worse?
>
> We generally try to avoid it - it allows users to DoS the box.

My question is: is pinning a page worse than allocating a (kernel) page in some way?

Cheers,
Rusty.