Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.

2008-04-19 Thread David Miller
From: Rusty Russell <[EMAIL PROTECTED]>
Date: Sun, 20 Apr 2008 02:41:14 +1000

> If only there were some kind of, I don't know... summit... for kernel 
> people... 

I'm starting to disbelieve the myth that because we can discuss
technical issues on mailing lists, we should talk primarily about
process issues during the kernel summit.

There is a distinct advantage to discussing and hashing things out in
person.  You can't say "screw you, your idea sucks" when you're face
to face with the other person, whereas online it's way too easy.


Re: [PATCH 5/5] tun: vringfd xmit support.

2008-04-19 Thread Andrew Morton
On Sun, 20 Apr 2008 00:41:43 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> On Saturday 19 April 2008 05:06:34 Andrew Morton wrote:
> > On Sat, 19 Apr 2008 01:15:15 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > > > What is the maximum number of pages which an unprivileged user can
> > > > concurrently pin with this code?
> > >
> > > Since only root can open the tun device, it's currently OK.  The old code
> > > kmalloced and copied: is there some mm-fu reason why pinning userspace
> > > memory is worse?
> >
> > We generally try to avoid it - it allows users to DoS the box.
>
> My question is: is pinning a page worse than allocating a (kernel) page in
> some way?

I guess pinning is not as bad as straight-out allocating.

Pinning is limited to the size of the program's VM.  And pinning at
least pins something which is accounted and is exposed to admin tools.

But they're both pretty similar in effect and risk.
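
For illustration, the accounting Andrew alludes to usually looks like the
sketch below in drivers that pin user memory: charge the pages against
RLIMIT_MEMLOCK before calling get_user_pages().  This is a sketch against
the 2.6.25-era API; the helper name and its use are hypothetical, not code
from the vring patches.

#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/capability.h>

/* Hypothetical helper: pin npages of user memory at uaddr, charging
 * them against RLIMIT_MEMLOCK so the pinned memory shows up in the
 * accounting that admin tools already watch. */
static int pin_user_ring(unsigned long uaddr, int npages,
			 struct page **pages)
{
	unsigned long locked, limit;
	int ret;

	down_write(&current->mm->mmap_sem);
	locked = current->mm->locked_vm + npages;
	limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur >> PAGE_SHIFT;
	if (locked > limit && !capable(CAP_IPC_LOCK)) {
		/* Would exceed the mlock limit: refuse rather than let an
		 * unprivileged user pin arbitrary amounts of memory. */
		ret = -ENOMEM;
		goto out;
	}
	ret = get_user_pages(current, current->mm, uaddr, npages,
			     1 /* write */, 0 /* force */, pages, NULL);
	if (ret == npages)
		current->mm->locked_vm = locked;
out:
	up_write(&current->mm->mmap_sem);
	return ret;
}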


Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.

2008-04-19 Thread Rusty Russell
On Saturday 19 April 2008 05:38:50 Michael Kerrisk wrote:
> On 4/18/08, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > This may be our third high-bandwidth user/kernel interface to
> > transport bulk data ("hbukittbd") which was implemented because its
> > predecessors weren't quite right.  In a year or two's time someone else
> > will need a hbukittbd and will find that the existing three aren't quite
> > right and will give us another one.  One day we need to stop doing this
> > ;)

If only there were some kind of, I don't know... summit... for kernel 
people... 

> >  It could be that this person will look at Rusty's hbukittbd and find
> > that it _could_ be tweaked to do what he wants, but it's already shipping
> > and it's part of the kernel API and hence can't be made to do what he
> > wants.

Indeed.  I marked it experimental because of these questions (i.e. it's not yet 
kernel ABI).  Getting everyone's attention is hard though, so I figured we'd put 
it in as a device and move to a syscall if and when we feel it's ready.

> > So I think it would be good to plonk the proposed interface on the table
> > and have a poke at it.  Is it compat-safe?  Is it extensible in a
> > backward-compatible fashion?  Are there future-safe changes we should
> > make to it?  Can Michael Kerrisk understand, review and document it?  etc.
>
> Well, it helps if he's CCed

It is compat-safe, and we've already extended it once, so I'm reasonably happy 
so far.  If it were a syscall I'd add a flags arg; for the device it'd be an 
ioctl.  Starting with the virtio ABI seemed a reasonable first step, because 
*we* can use this today even if no one else does.
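
For illustration, the extension pattern Rusty means is the usual one: accept
a flags argument, reject any bit you don't understand yet, and later give
those bits meaning without breaking old callers.  A minimal sketch; the
syscall name and flag are made up, not part of the patches:

#include <linux/linkage.h>
#include <linux/errno.h>

#define VRING_F_EXAMPLE	0x1	/* hypothetical future flag */

asmlinkage long sys_vring_new(unsigned int size, unsigned int flags)
{
	/* Unknown flag bits must fail today so they can safely acquire
	 * meaning tomorrow; silently ignoring them would freeze the ABI. */
	if (flags & ~VRING_F_EXAMPLE)
		return -EINVAL;

	/* ... create the ring ... */
	return 0;
}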

> I'm happy to work *with someone* on the documentation (pointless to do
> it on my own -- how do I know what Rusty's *intended* behavior for the
> interface is), and review, and testing.

Document coming up...
Rusty.


Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.

2008-04-19 Thread Rusty Russell
On Sunday 20 April 2008 02:33:22 Evgeniy Polyakov wrote:
> On Sun, Apr 20, 2008 at 02:05:31AM +1000, Rusty Russell ([EMAIL PROTECTED]) wrote:
> > There are two reasons not to grab the lock.  It turns out that if we
> > tried to lock here, we'd deadlock, since the callbacks are called under
> > the lock. Secondly, it's possible to implement an atomic
> > vring_used_buffer variant, which could fail: this would avoid using the
> > thread most of the time.
>
> Yep, I decided that too.  But it limits its usage to tun only, or any
> other system where only a single thread picks up results, so no generic
> userspace ring buffers?

I don't think so; it just externalizes the locking.  The mutex protects the 
attaching and detaching of the ops structure; some other lock or code 
protects simultaneous kernel ring accesses.
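
As a sketch of that split (paraphrased from the discussion, not copied from
the patch), vr->lock guards only the ops pointer, while ring accesses rely
on the caller's own serialization:

#include <linux/mutex.h>
#include <linux/errno.h>

struct vring_info {
	struct mutex lock;		/* guards attach/detach of ops only */
	const struct vring_ops *ops;
	/* ... ring state: serialized by the caller, not by vr->lock ... */
};

static int vring_attach(struct vring_info *vr, const struct vring_ops *ops)
{
	int err = 0;

	mutex_lock(&vr->lock);
	if (vr->ops)
		err = -EBUSY;		/* someone is already attached */
	else
		vr->ops = ops;
	mutex_unlock(&vr->lock);
	return err;
}

/* vring_get_buffer()/vring_used_buffer() deliberately take no lock:
 * callbacks run under vr->lock, so taking it again would deadlock, and
 * each kernel user (e.g. tun's xmit path) serializes ring access itself. */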

Cheers,
Rusty.


Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.

2008-04-19 Thread Rusty Russell
On Saturday 19 April 2008 20:22:15 Evgeniy Polyakov wrote:
> Hi.
>
> On Fri, Apr 18, 2008 at 02:39:48PM +1000, Rusty Russell ([EMAIL PROTECTED]) wrote:
> > +int vring_get_buffer(struct vring_info *vr,
> > +struct iovec *in_iov,
> > +unsigned int *num_in, unsigned long *in_len,
> > +struct iovec *out_iov,
> > +unsigned int *num_out, unsigned long *out_len)
> > +{
> > +   unsigned int i, in = 0, out = 0;
> > +   unsigned long dummy;
> > +   u16 avail, last_avail, head;
> > +   struct vring_desc d;
>
> Should this whole function and vring_used_buffer() be protected with
> vr->lock mutex?

No; it's up to the caller to make sure that they are serialized.  In the case 
of tun that happens naturally.

There are two reasons not to grab the lock.  It turns out that if we tried to 
lock here, we'd deadlock, since the callbacks are called under the lock.  
Secondly, it's possible to implement an atomic vring_used_buffer variant, 
which could fail: this would avoid using the thread most of the time.
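
For illustration, a serialized caller following that rule would look roughly
like the sketch below.  The signature comes from the quoted patch; the
return-value convention, buffer sizes, and the caller function are
assumptions:

#include <linux/uio.h>

/* Hypothetical single consumer, e.g. called only from tun's xmit path,
 * so calls are naturally serialized and no lock is taken here. */
static void consume_one(struct vring_info *vr)
{
	struct iovec in_iov[16], out_iov[16];
	unsigned int num_in = 16, num_out = 16;
	unsigned long in_len, out_len;
	int err;

	err = vring_get_buffer(vr, in_iov, &num_in, &in_len,
			       out_iov, &num_out, &out_len);
	if (err <= 0)
		return;		/* assumption: <= 0 means no buffer / error */

	/* ... move the packet through the iovecs, then hand the
	 * buffer back (e.g. via vring_used_buffer()) ... */
}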

Hope that helps,
Rusty.


Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.

2008-04-19 Thread Jonathan Corbet
> So I think it would be good to plonk the proposed interface on the table
> and have a poke at it.  Is it compat-safe?  Is it extensible in a
> backward-compatible fashion?  Are there future-safe changes we should make
> to it?  Can Michael Kerrisk understand, review and document it?  etc.
> 
> You know what I'm saying ;)  What is the proposed interface?

So, I'm not Michael, but I *did* make an attempt to document this
interface - user and kernel sides - so that it could be more easily
understood:

http://lwn.net/Articles/276856/

That was the previous posting, but a quick look suggests it hasn't
changed *that* much in this round.

jon


Re: [PATCH 5/5] tun: vringfd xmit support.

2008-04-19 Thread Rusty Russell
On Saturday 19 April 2008 05:06:34 Andrew Morton wrote:
> On Sat, 19 Apr 2008 01:15:15 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > > What is the maximum number of pages which an unprivileged user can
> > > concurrently pin with this code?
> >
> > Since only root can open the tun device, it's currently OK.  The old code
> > kmalloced and copied: is there some mm-fu reason why pinning userspace
> > memory is worse?
>
> We generally try to avoid it - it allows users to DoS the box.

My question is: is pinning a page worse than allocating a (kernel) page in 
some way?

Cheers,
Rusty.