On Wed, 2012-07-18 at 01:24 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 04:09:25PM -0600, Alex Williamson wrote:
> > On Wed, 2012-07-18 at 00:23 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 02:03:05PM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 21:58 +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 17, 2012 at 10:52:16AM -0600, Alex Williamson wrote:
> > > > > > On Tue, 2012-07-17 at 19:19 +0300, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 17, 2012 at 10:06:01AM -0600, Alex Williamson wrote:
> > > > > > > > On Tue, 2012-07-17 at 18:53 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > On Tue, Jul 17, 2012 at 09:41:09AM -0600, Alex Williamson 
> > > > > > > > > wrote:
> > > > > > > > > > On Tue, 2012-07-17 at 18:13 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > > On Tue, Jul 17, 2012 at 08:57:04AM -0600, Alex Williamson 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Tue, 2012-07-17 at 17:42 +0300, Michael S. Tsirkin 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > On Tue, Jul 17, 2012 at 08:29:43AM -0600, Alex 
> > > > > > > > > > > > > Williamson wrote:
> > > > > > > > > > > > > > On Tue, 2012-07-17 at 17:10 +0300, Michael S. 
> > > > > > > > > > > > > > Tsirkin wrote:
> > > > > > > > > > > > > > > On Tue, Jul 17, 2012 at 07:59:16AM -0600, Alex 
> > > > > > > > > > > > > > > Williamson wrote:
> > > > > > > > > > > > > > > > On Tue, 2012-07-17 at 13:21 +0300, Michael S. 
> > > > > > > > > > > > > > > > Tsirkin wrote:
> > > > > > > > > > > > > > > > > On Mon, Jul 16, 2012 at 02:33:55PM -0600, 
> > > > > > > > > > > > > > > > > Alex Williamson wrote:
> > > > > > > > > > > > > > > > > > +   if (args->flags & 
> > > > > > > > > > > > > > > > > > KVM_EOIFD_FLAG_LEVEL_IRQFD) {
> > > > > > > > > > > > > > > > > > +           struct _irqfd *irqfd = 
> > > > > > > > > > > > > > > > > > _irqfd_fdget_lock(kvm, args->irqfd);
> > > > > > > > > > > > > > > > > > +           if (IS_ERR(irqfd)) {
> > > > > > > > > > > > > > > > > > +                   ret = PTR_ERR(irqfd);
> > > > > > > > > > > > > > > > > > +                   goto fail;
> > > > > > > > > > > > > > > > > > +           }
> > > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > > +           gsi = irqfd->gsi;
> > > > > > > > > > > > > > > > > > +           level_irqfd = 
> > > > > > > > > > > > > > > > > > eventfd_ctx_get(irqfd->eventfd);
> > > > > > > > > > > > > > > > > > +           source = 
> > > > > > > > > > > > > > > > > > _irq_source_get(irqfd->source);
> > > > > > > > > > > > > > > > > > +           _irqfd_put_unlock(irqfd);
> > > > > > > > > > > > > > > > > > +           if (!source) {
> > > > > > > > > > > > > > > > > > +                   ret = -EINVAL;
> > > > > > > > > > > > > > > > > > +                   goto fail;
> > > > > > > > > > > > > > > > > > +           }
> > > > > > > > > > > > > > > > > > +   } else {
> > > > > > > > > > > > > > > > > > +           ret = -EINVAL;
> > > > > > > > > > > > > > > > > > +           goto fail;
> > > > > > > > > > > > > > > > > > +   }
> > > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > > +   INIT_LIST_HEAD(&eoifd->list);
> > > > > > > > > > > > > > > > > > +   eoifd->kvm = kvm;
> > > > > > > > > > > > > > > > > > +   eoifd->eventfd = eventfd;
> > > > > > > > > > > > > > > > > > +   eoifd->source = source;
> > > > > > > > > > > > > > > > > > +   eoifd->level_irqfd = level_irqfd;
> > > > > > > > > > > > > > > > > > +   eoifd->notifier.gsi = gsi;
> > > > > > > > > > > > > > > > > > +   eoifd->notifier.irq_acked = eoifd_event;
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > OK so this means eoifd keeps a reference to 
> > > > > > > > > > > > > > > > > the irqfd.
> > > > > > > > > > > > > > > > > And since this is the case, can't we drop the 
> > > > > > > > > > > > > > > > > reference counting
> > > > > > > > > > > > > > > > > around source ids now? Everything is 
> > > > > > > > > > > > > > > > > referenced through irqfd.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Holding a reference and using it as a reference 
> > > > > > > > > > > > > > > > count are not the same
> > > > > > > > > > > > > > > > thing.  What if another module holds a 
> > > > > > > > > > > > > > > > reference to this eventfd?  How
> > > > > > > > > > > > > > > > do we do anything on release?
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > We don't as there is no release, and using kref 
> > > > > > > > > > > > > > > on source id does not
> > > > > > > > > > > > > > > help: it just never gets invoked.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Please work out how you think it should work and 
> > > > > > > > > > > > > > let me know, I don't
> > > > > > > > > > > > > > see it.  We have an irq source id that needs to be 
> > > > > > > > > > > > > > allocated by irqfd
> > > > > > > > > > > > > > and returned when it's unused.  It becomes unused 
> > > > > > > > > > > > > > when neither irqfd nor
> > > > > > > > > > > > > > eoifd are making use of it.  irqfd and eoifd may be 
> > > > > > > > > > > > > > closed in any order.
> > > > > > > > > > > > > > Use of the source id is what we're reference 
> > > > > > > > > > > > > > counting, which is why it's
> > > > > > > > > > > > > > in struct _irq_source.  How can I use an eventfd 
> > > > > > > > > > > > > > reference for the same?
> > > > > > > > > > > > > > I don't know when it's unused.  I don't know who 
> > > > > > > > > > > > > > else holds a reference
> > > > > > > > > > > > > > to it...  Doesn't make sense to me.  Feels like 
> > > > > > > > > > > > > > attempting to squat on
> > > > > > > > > > > > > > someone else's object.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > eoifd should prevent irqfd from being released.
> > > > > > > > > > > > 
> > > > > > > > > > > > Why?  Note that this is actually quite difficult too.  
> > > > > > > > > > > > We can't fail a
> > > > > > > > > > > > release, nobody checks close(3p) return.  Blocking a 
> > > > > > > > > > > > release is likely
> > > > > > > > > > > > to cause all sorts of problems, so what you mean is 
> > > > > > > > > > > > that irqfd should
> > > > > > > > > > > > linger around until there are no references to it... 
> > > > > > > > > > > > but that's exactly
> > > > > > > > > > > > what struct _irq_source is for, is to hold the bits 
> > > > > > > > > > > > that we care about
> > > > > > > > > > > > references to and automatically release it when there 
> > > > > > > > > > > > are none.
> > > > > > > > > > > 
> > > > > > > > > > > No no. You *already* prevent it. You take a reference to 
> > > > > > > > > > > the eventfd
> > > > > > > > > > > context.
> > > > > > > > > > 
> > > > > > > > > > Right, which keeps the fd from going away, not the struct 
> > > > > > > > > > _irqfd.
> > > > > > > > > 
> > > > > > > > > _irqfd too.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > How so?
> > > > > > > 
> > > > > > > Normally irqfd_wakeup is called with POLLHUP and calls 
> > > > > > > irqfd_deactivate.
> > > > > > > If you get a ctx reference this does not happen.
> > > > > > 
> > > > > > I think you're mistaken.  wake_up_poll(,POLLHUP) is called from
> > > > > > eventfd_release (file_operations.release), not from ctx reference
> > > > > > release.
> > > > > 
> > > > > True. I was wrong. so close has the same bug as deassign. To fix,
> > > > > how about eoifd will hold a reference to the irqfd instead of the
> > > > > eventfd context?
> > > > 
> > > > What does it mean to hold a reference to the irqfd?
> > > 
> > > I meant file *reference: eventfd_fget. But there are other options see
> > > below.
> > 
> > That's no better than the eventfd context we already hold.
> 
> It means POLLHUP is not invoked until eoifd is closed.
> 
> > > > What state of functionality is an irqfd that has been
> > > > closed/de-assigned but is still attached to an eoifd?  It can't
> > > > continue to fire interrupts into the guest.
> > > >
> > > > I don't think close or de-assign have a bug, assign has a bug that it
> > > > can allow re-assignment using an in-use eventfd.  I think I'd rather
> > > > fix that.
> > > 
> > > Let me show you that the bug is in deassign:
> > >   assign irqfd for fd=1
> > >   assign for eoifd fd=2, irqfd=1
> > >   deassign irqfd 1
> > > 
> > > At this point eoifd has no meaning and there is also no way to deassign
> > > it,
> > 
> > Yes, there is.  This is exactly why I hold a reference to the eventfd
> > ctx.  It can still be deassigned by passing irqfd=1, we'll do an
> > eventfd_ctx_get and match it to that stored.
> 
> OK.
> What if instead we close irqfd 1?

Then the user isn't reading directions very well because the API clearly
indicates to pass the irqfd on both assign and de-assign of the eoifd.
However, it will still get de-assigned if they close the eoifd.

> > >  so the bug already triggered.
> > >
> > > I can see two ways out:
> > > 1. easy way - fail deassign
> > 
> > Then close() and deassign are not the same.
> > 
> > > 2. elegant way - shut down eoifd on irqfd deassign too
> > 
> > Sorry, I've always been told it's a bad idea to have one interface kill
> > another from inside the kernel.
> 
> Not kill merely deassign.

That's what I mean.  Unintended consequences should not be designed in.

> > Given that your assertion above is incorrect, I still stand by fixing
> > assign.
> 
> OK, but then you also would need to protect against someone binding
> an irqfd that is not level to same GSI.
> 
> Also if we go ahead with fixing assign - I do not think we need
> to rebind to the same source id - just failing assign
> of this irqfd with EBUSY should be enough.
> 



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to