Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Michael S. Tsirkin
On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > 
> > Look at this:
> > 
> > static inline int kvm_irq_line_state(unsigned long *irq_state,
> >  int irq_source_id, int level)
> > {
> > /* Logical OR for level trig interrupt */
> > if (level)
> > set_bit(irq_source_id, irq_state);
> > else
> > clear_bit(irq_source_id, irq_state);
> > 
> > return !!(*irq_state);
> > }
> > 
> > 
> > Now:
> > If other CPU changes some other bit after the atomic change,
> > it looks like !!(*irq_state) might return a stale value.
> > 
> > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > If CPU 0 sees a stale value now it will return 0 here
> > and interrupt will get cleared.
> > 
> This will hardly happen on x86 especially since bit is set with
> serialized instruction. But there is actually a race here.
> CPU 0 clears bit 0. CPU 0 read irq_state as 0. CPU 1 sets level to 1.
> CPU 1 calls kvm_ioapic_set_irq(1). CPU 0 calls kvm_ioapic_set_irq(0).
> No ioapic thinks the level is 0 but irq_state is not 0.
> 
> This untested and un-compiled patch should fix it.

Getting rid of atomics completely makes me more comfortable,
and by moving all bitmap handling to under pic/ioapic lock
we can do just that.
I just tested and posted a patch that fixes the race in this way.
Could you take a look pls?

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Gleb Natapov
On Wed, Jul 18, 2012 at 02:08:43PM +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 01:53:15PM +0300, Gleb Natapov wrote:
> > On Wed, Jul 18, 2012 at 01:51:05PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, Jul 18, 2012 at 01:36:08PM +0300, Gleb Natapov wrote:
> > > > On Wed, Jul 18, 2012 at 01:33:35PM +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, Jul 18, 2012 at 01:27:39PM +0300, Gleb Natapov wrote:
> > > > > > On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > > > > > > > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin 
> > > > > > > > wrote:
> > > > > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > > > > 
> > > > > > > > > Look at this:
> > > > > > > > > 
> > > > > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > > > >  int irq_source_id, int 
> > > > > > > > > level)
> > > > > > > > > {
> > > > > > > > > /* Logical OR for level trig interrupt */
> > > > > > > > > if (level)
> > > > > > > > > set_bit(irq_source_id, irq_state);
> > > > > > > > > else
> > > > > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > > > > 
> > > > > > > > > return !!(*irq_state);
> > > > > > > > > }
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Now:
> > > > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > > > > 
> > > > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > > > and interrupt will get cleared.
> > > > > > > > > 
> > > > > > > > This will hardly happen on x86 especially since bit is set with
> > > > > > > > serialized instruction.
> > > > > > > 
> > > > > > > Probably. But it does make me a bit uneasy.  Why don't we pass
> > > > > > > irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> > > > > > > kvm_irq_line_state to under pic_lock/ioapic_lock?  We can then use
> > > > > > > __set_bit/__clear_bit in kvm_irq_line_state, making the ordering 
> > > > > > > simpler
> > > > > > > and saving an atomic op in the process.
> > > > > > > 
> > > > > > With my patch I do not see why we can't change them to unlocked 
> > > > > > variant
> > > > > > without moving them anywhere. The only requirement is to not use RMW
> > > > > > sequence to set/clear bits. The ordering of setting does not 
> > > > > > matter. The
> > > > > > ordering of reading is.
> > > > > 
> > > > > You want to use __set_bit/__clear_bit on the same word
> > > > > from multiple CPUs, without locking?
> > > > > Why won't this lose information?
> > > > Because it is not RMW. If it is then yes, you can't do that.
> > > 
> > > You are saying __set_bit does not do RMW on x86? Interesting.
> > I think it doesn't.
> 
> Anywhere I can read about this?
> 
Well actually SDM says LOCK prefix is needed, so yes we cannot use
__set_bit/__clear_bit without moving it under lock.

> > > It's probably not a good idea to rely on this I think.
> > > 
> > The code is no in arch/x86 so probably no. Although it is used only on
> > x86 (and ia64 which has broken kvm anyway).
> 
> Yes but exactly the reverse is documented.
> 
> /**
>  * __set_bit - Set a bit in memory
>  * @nr: the bit to set
>  * @addr: the address to start counting from
>  *
>  * Unlike set_bit(), this function is non-atomic and may be reordered.
> 
> 
>  pls note the below
> 
>  * If it's called on the same region of memory simultaneously, the effect
>  * may be that only one operation succeeds.
>  until here
> 
>  */
> static inline void __set_bit(int nr, volatile unsigned long *addr)
> {
> asm volatile("bts %1,%0" : ADDR : "Ir" (nr) : "memory");
> }
> 
> 
> 
> 
> > > > > 
> > > > > In any case, it seems simpler and safer to do accesses under lock
> > > > > than rely on specific use.
> > > > > 
> > > > > > --
> > > > > > Gleb.
> > > > 
> > > > --
> > > > Gleb.
> > 
> > --
> > Gleb.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Michael S. Tsirkin
On Wed, Jul 18, 2012 at 01:53:15PM +0300, Gleb Natapov wrote:
> On Wed, Jul 18, 2012 at 01:51:05PM +0300, Michael S. Tsirkin wrote:
> > On Wed, Jul 18, 2012 at 01:36:08PM +0300, Gleb Natapov wrote:
> > > On Wed, Jul 18, 2012 at 01:33:35PM +0300, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 18, 2012 at 01:27:39PM +0300, Gleb Natapov wrote:
> > > > > On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > > > > > > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin 
> > > > > > > wrote:
> > > > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > > > 
> > > > > > > > Look at this:
> > > > > > > > 
> > > > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > > >  int irq_source_id, int 
> > > > > > > > level)
> > > > > > > > {
> > > > > > > > /* Logical OR for level trig interrupt */
> > > > > > > > if (level)
> > > > > > > > set_bit(irq_source_id, irq_state);
> > > > > > > > else
> > > > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > > > 
> > > > > > > > return !!(*irq_state);
> > > > > > > > }
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Now:
> > > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > > > 
> > > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > > and interrupt will get cleared.
> > > > > > > > 
> > > > > > > This will hardly happen on x86 especially since bit is set with
> > > > > > > serialized instruction.
> > > > > > 
> > > > > > Probably. But it does make me a bit uneasy.  Why don't we pass
> > > > > > irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> > > > > > kvm_irq_line_state to under pic_lock/ioapic_lock?  We can then use
> > > > > > __set_bit/__clear_bit in kvm_irq_line_state, making the ordering 
> > > > > > simpler
> > > > > > and saving an atomic op in the process.
> > > > > > 
> > > > > With my patch I do not see why we can't change them to unlocked 
> > > > > variant
> > > > > without moving them anywhere. The only requirement is to not use RMW
> > > > > sequence to set/clear bits. The ordering of setting does not matter. 
> > > > > The
> > > > > ordering of reading is.
> > > > 
> > > > You want to use __set_bit/__clear_bit on the same word
> > > > from multiple CPUs, without locking?
> > > > Why won't this lose information?
> > > Because it is not RMW. If it is then yes, you can't do that.
> > 
> > You are saying __set_bit does not do RMW on x86? Interesting.
> I think it doesn't.

Anywhere I can read about this?

> > It's probably not a good idea to rely on this I think.
> > 
> The code is no in arch/x86 so probably no. Although it is used only on
> x86 (and ia64 which has broken kvm anyway).

Yes but exactly the reverse is documented.

/**
 * __set_bit - Set a bit in memory
 * @nr: the bit to set
 * @addr: the address to start counting from
 *
 * Unlike set_bit(), this function is non-atomic and may be reordered.


 pls note the below

 * If it's called on the same region of memory simultaneously, the effect
 * may be that only one operation succeeds.
 until here

 */
static inline void __set_bit(int nr, volatile unsigned long *addr)
{
asm volatile("bts %1,%0" : ADDR : "Ir" (nr) : "memory");
}




> > > > 
> > > > In any case, it seems simpler and safer to do accesses under lock
> > > > than rely on specific use.
> > > > 
> > > > > --
> > > > >   Gleb.
> > > 
> > > --
> > >   Gleb.
> 
> --
>   Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Gleb Natapov
On Wed, Jul 18, 2012 at 01:51:05PM +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 01:36:08PM +0300, Gleb Natapov wrote:
> > On Wed, Jul 18, 2012 at 01:33:35PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, Jul 18, 2012 at 01:27:39PM +0300, Gleb Natapov wrote:
> > > > On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > > > > > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > > 
> > > > > > > Look at this:
> > > > > > > 
> > > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > >  int irq_source_id, int level)
> > > > > > > {
> > > > > > > /* Logical OR for level trig interrupt */
> > > > > > > if (level)
> > > > > > > set_bit(irq_source_id, irq_state);
> > > > > > > else
> > > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > > 
> > > > > > > return !!(*irq_state);
> > > > > > > }
> > > > > > > 
> > > > > > > 
> > > > > > > Now:
> > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > > 
> > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > and interrupt will get cleared.
> > > > > > > 
> > > > > > This will hardly happen on x86 especially since bit is set with
> > > > > > serialized instruction.
> > > > > 
> > > > > Probably. But it does make me a bit uneasy.  Why don't we pass
> > > > > irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> > > > > kvm_irq_line_state to under pic_lock/ioapic_lock?  We can then use
> > > > > __set_bit/__clear_bit in kvm_irq_line_state, making the ordering 
> > > > > simpler
> > > > > and saving an atomic op in the process.
> > > > > 
> > > > With my patch I do not see why we can't change them to unlocked variant
> > > > without moving them anywhere. The only requirement is to not use RMW
> > > > sequence to set/clear bits. The ordering of setting does not matter. The
> > > > ordering of reading is.
> > > 
> > > You want to use __set_bit/__clear_bit on the same word
> > > from multiple CPUs, without locking?
> > > Why won't this lose information?
> > Because it is not RMW. If it is then yes, you can't do that.
> 
> You are saying __set_bit does not do RMW on x86? Interesting.
I think it doesn't.

> It's probably not a good idea to rely on this I think.
> 
The code is no in arch/x86 so probably no. Although it is used only on
x86 (and ia64 which has broken kvm anyway).

> > > 
> > > In any case, it seems simpler and safer to do accesses under lock
> > > than rely on specific use.
> > > 
> > > > --
> > > > Gleb.
> > 
> > --
> > Gleb.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Michael S. Tsirkin
On Wed, Jul 18, 2012 at 01:36:08PM +0300, Gleb Natapov wrote:
> On Wed, Jul 18, 2012 at 01:33:35PM +0300, Michael S. Tsirkin wrote:
> > On Wed, Jul 18, 2012 at 01:27:39PM +0300, Gleb Natapov wrote:
> > > On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > > > > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > 
> > > > > > Look at this:
> > > > > > 
> > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > >  int irq_source_id, int level)
> > > > > > {
> > > > > > /* Logical OR for level trig interrupt */
> > > > > > if (level)
> > > > > > set_bit(irq_source_id, irq_state);
> > > > > > else
> > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > 
> > > > > > return !!(*irq_state);
> > > > > > }
> > > > > > 
> > > > > > 
> > > > > > Now:
> > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > 
> > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > and interrupt will get cleared.
> > > > > > 
> > > > > This will hardly happen on x86 especially since bit is set with
> > > > > serialized instruction.
> > > > 
> > > > Probably. But it does make me a bit uneasy.  Why don't we pass
> > > > irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> > > > kvm_irq_line_state to under pic_lock/ioapic_lock?  We can then use
> > > > __set_bit/__clear_bit in kvm_irq_line_state, making the ordering simpler
> > > > and saving an atomic op in the process.
> > > > 
> > > With my patch I do not see why we can't change them to unlocked variant
> > > without moving them anywhere. The only requirement is to not use RMW
> > > sequence to set/clear bits. The ordering of setting does not matter. The
> > > ordering of reading is.
> > 
> > You want to use __set_bit/__clear_bit on the same word
> > from multiple CPUs, without locking?
> > Why won't this lose information?
> Because it is not RMW. If it is then yes, you can't do that.

You are saying __set_bit does not do RMW on x86? Interesting.
It's probably not a good idea to rely on this I think.

> > 
> > In any case, it seems simpler and safer to do accesses under lock
> > than rely on specific use.
> > 
> > > --
> > >   Gleb.
> 
> --
>   Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Gleb Natapov
On Wed, Jul 18, 2012 at 01:33:35PM +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 01:27:39PM +0300, Gleb Natapov wrote:
> > On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > > > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > 
> > > > > Look at this:
> > > > > 
> > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > >  int irq_source_id, int level)
> > > > > {
> > > > > /* Logical OR for level trig interrupt */
> > > > > if (level)
> > > > > set_bit(irq_source_id, irq_state);
> > > > > else
> > > > > clear_bit(irq_source_id, irq_state);
> > > > > 
> > > > > return !!(*irq_state);
> > > > > }
> > > > > 
> > > > > 
> > > > > Now:
> > > > > If other CPU changes some other bit after the atomic change,
> > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > 
> > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > and interrupt will get cleared.
> > > > > 
> > > > This will hardly happen on x86 especially since bit is set with
> > > > serialized instruction.
> > > 
> > > Probably. But it does make me a bit uneasy.  Why don't we pass
> > > irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> > > kvm_irq_line_state to under pic_lock/ioapic_lock?  We can then use
> > > __set_bit/__clear_bit in kvm_irq_line_state, making the ordering simpler
> > > and saving an atomic op in the process.
> > > 
> > With my patch I do not see why we can't change them to unlocked variant
> > without moving them anywhere. The only requirement is to not use RMW
> > sequence to set/clear bits. The ordering of setting does not matter. The
> > ordering of reading is.
> 
> You want to use __set_bit/__clear_bit on the same word
> from multiple CPUs, without locking?
> Why won't this lose information?
Because it is not RMW. If it is then yes, you can't do that.
> 
> In any case, it seems simpler and safer to do accesses under lock
> than rely on specific use.
> 
> > --
> > Gleb.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Michael S. Tsirkin
On Wed, Jul 18, 2012 at 01:27:39PM +0300, Gleb Natapov wrote:
> On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> > On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > 
> > > > Look at this:
> > > > 
> > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > >  int irq_source_id, int level)
> > > > {
> > > > /* Logical OR for level trig interrupt */
> > > > if (level)
> > > > set_bit(irq_source_id, irq_state);
> > > > else
> > > > clear_bit(irq_source_id, irq_state);
> > > > 
> > > > return !!(*irq_state);
> > > > }
> > > > 
> > > > 
> > > > Now:
> > > > If other CPU changes some other bit after the atomic change,
> > > > it looks like !!(*irq_state) might return a stale value.
> > > > 
> > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > If CPU 0 sees a stale value now it will return 0 here
> > > > and interrupt will get cleared.
> > > > 
> > > This will hardly happen on x86 especially since bit is set with
> > > serialized instruction.
> > 
> > Probably. But it does make me a bit uneasy.  Why don't we pass
> > irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> > kvm_irq_line_state to under pic_lock/ioapic_lock?  We can then use
> > __set_bit/__clear_bit in kvm_irq_line_state, making the ordering simpler
> > and saving an atomic op in the process.
> > 
> With my patch I do not see why we can't change them to unlocked variant
> without moving them anywhere. The only requirement is to not use RMW
> sequence to set/clear bits. The ordering of setting does not matter. The
> ordering of reading is.

You want to use __set_bit/__clear_bit on the same word
from multiple CPUs, without locking?
Why won't this lose information?

In any case, it seems simpler and safer to do accesses under lock
than rely on specific use.

> --
>   Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Gleb Natapov
On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > 
> > > Look at this:
> > > 
> > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > >  int irq_source_id, int level)
> > > {
> > > /* Logical OR for level trig interrupt */
> > > if (level)
> > > set_bit(irq_source_id, irq_state);
> > > else
> > > clear_bit(irq_source_id, irq_state);
> > > 
> > > return !!(*irq_state);
> > > }
> > > 
> > > 
> > > Now:
> > > If other CPU changes some other bit after the atomic change,
> > > it looks like !!(*irq_state) might return a stale value.
> > > 
> > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > If CPU 0 sees a stale value now it will return 0 here
> > > and interrupt will get cleared.
> > > 
> > This will hardly happen on x86 especially since bit is set with
> > serialized instruction.
> 
> Probably. But it does make me a bit uneasy.  Why don't we pass
> irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> kvm_irq_line_state to under pic_lock/ioapic_lock?  We can then use
> __set_bit/__clear_bit in kvm_irq_line_state, making the ordering simpler
> and saving an atomic op in the process.
> 
With my patch I do not see why we can't change them to unlocked variant
without moving them anywhere. The only requirement is to not use RMW
sequence to set/clear bits. The ordering of setting does not matter. The
ordering of reading is.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-18 Thread Michael S. Tsirkin
On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > 
> > Look at this:
> > 
> > static inline int kvm_irq_line_state(unsigned long *irq_state,
> >  int irq_source_id, int level)
> > {
> > /* Logical OR for level trig interrupt */
> > if (level)
> > set_bit(irq_source_id, irq_state);
> > else
> > clear_bit(irq_source_id, irq_state);
> > 
> > return !!(*irq_state);
> > }
> > 
> > 
> > Now:
> > If other CPU changes some other bit after the atomic change,
> > it looks like !!(*irq_state) might return a stale value.
> > 
> > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > If CPU 0 sees a stale value now it will return 0 here
> > and interrupt will get cleared.
> > 
> This will hardly happen on x86 especially since bit is set with
> serialized instruction.

Probably. But it does make me a bit uneasy.  Why don't we pass
irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
kvm_irq_line_state to under pic_lock/ioapic_lock?  We can then use
__set_bit/__clear_bit in kvm_irq_line_state, making the ordering simpler
and saving an atomic op in the process.

> But there is actually a race here.
> CPU 0 clears bit 0. CPU 0 read irq_state as 0. CPU 1 sets level to 1.
> CPU 1 calls kvm_ioapic_set_irq(1). CPU 0 calls kvm_ioapic_set_irq(0).
> No ioapic thinks the level is 0 but irq_state is not 0.
> 
> This untested and un-compiled patch should fix it.
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index ef91d79..e22c78b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -825,7 +825,7 @@ int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct 
> kvm_mmu *mmu,
>  void kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault);
>  bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
>  
> -int kvm_pic_set_irq(void *opaque, int irq, int level);
> +int kvm_pic_set_irq(void *opaque, int irq);
>  
>  void kvm_inject_nmi(struct kvm_vcpu *vcpu);
>  
> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> index 81cf4fa..0d6988f 100644
> --- a/arch/x86/kvm/i8259.c
> +++ b/arch/x86/kvm/i8259.c
> @@ -188,12 +188,13 @@ void kvm_pic_update_irq(struct kvm_pic *s)
>   pic_unlock(s);
>  }
>  
> -int kvm_pic_set_irq(void *opaque, int irq, int level)
> +int kvm_pic_set_irq(void *opaque, int irq)
>  {
>   struct kvm_pic *s = opaque;
> - int ret = -1;
> + int ret = -1, level;
>  
>   pic_lock(s);
> + level = !!s->irq_states[irq];
>   if (irq >= 0 && irq < PIC_NUM_PINS) {
>   ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
>   pic_update_irq(s);
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index 26fd54d..6ad6a6b 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -191,14 +191,15 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, 
> int irq)
>   return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe);
>  }
>  
> -int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
> +int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq)
>  {
>   u32 old_irr;
>   u32 mask = 1 << irq;
>   union kvm_ioapic_redirect_entry entry;
> - int ret = 1;
> + int ret = 1, level;
>  
>   spin_lock(&ioapic->lock);
> + level = !!ioapic->irq_states[irq];
>   old_irr = ioapic->irr;
>   if (irq >= 0 && irq < IOAPIC_NUM_PINS) {
>   entry = ioapic->redirtbl[irq];
> diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
> index 32872a0..65894dd 100644
> --- a/virt/kvm/ioapic.h
> +++ b/virt/kvm/ioapic.h
> @@ -74,7 +74,7 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int 
> trigger_mode);
>  bool kvm_ioapic_handles_vector(struct kvm *kvm, int vector);
>  int kvm_ioapic_init(struct kvm *kvm);
>  void kvm_ioapic_destroy(struct kvm *kvm);
> -int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level);
> +int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq);
>  void kvm_ioapic_reset(struct kvm_ioapic *ioapic);
>  int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
>   struct kvm_lapic_irq *irq);
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index a6a0365..db0ccef 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -33,7 +33,7 @@
>  
>  #include "ioapic.h"
>  
> -static inline int kvm_irq_line_state(unsigned long *irq_state,
> +static inline void kvm_irq_line_state(unsigned long *irq_state,
>int irq_source_id, int level)
>  {
>   /* Logical OR for level trig interrupt */
> @@ -41,8 +41,6 @@ static inline int kvm_irq_line_state(unsigned long 
> *irq_state,
>   set_bit(irq_source_id, irq_state);
>   else
>   clear_b

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Gleb Natapov
On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > _Seems_ racy, or _is_ racy?  Please identify the race.
> 
> Look at this:
> 
> static inline int kvm_irq_line_state(unsigned long *irq_state,
>  int irq_source_id, int level)
> {
> /* Logical OR for level trig interrupt */
> if (level)
> set_bit(irq_source_id, irq_state);
> else
> clear_bit(irq_source_id, irq_state);
> 
> return !!(*irq_state);
> }
> 
> 
> Now:
> If other CPU changes some other bit after the atomic change,
> it looks like !!(*irq_state) might return a stale value.
> 
> CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> If CPU 0 sees a stale value now it will return 0 here
> and interrupt will get cleared.
> 
This will hardly happen on x86 especially since bit is set with
serialized instruction. But there is actually a race here.
CPU 0 clears bit 0. CPU 0 read irq_state as 0. CPU 1 sets level to 1.
CPU 1 calls kvm_ioapic_set_irq(1). CPU 0 calls kvm_ioapic_set_irq(0).
No ioapic thinks the level is 0 but irq_state is not 0.

This untested and un-compiled patch should fix it.

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ef91d79..e22c78b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -825,7 +825,7 @@ int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct 
kvm_mmu *mmu,
 void kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault);
 bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
 
-int kvm_pic_set_irq(void *opaque, int irq, int level);
+int kvm_pic_set_irq(void *opaque, int irq);
 
 void kvm_inject_nmi(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 81cf4fa..0d6988f 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -188,12 +188,13 @@ void kvm_pic_update_irq(struct kvm_pic *s)
pic_unlock(s);
 }
 
-int kvm_pic_set_irq(void *opaque, int irq, int level)
+int kvm_pic_set_irq(void *opaque, int irq)
 {
struct kvm_pic *s = opaque;
-   int ret = -1;
+   int ret = -1, level;
 
pic_lock(s);
+   level = !!s->irq_states[irq];
if (irq >= 0 && irq < PIC_NUM_PINS) {
ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
pic_update_irq(s);
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 26fd54d..6ad6a6b 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -191,14 +191,15 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int 
irq)
return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe);
 }
 
-int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
+int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq)
 {
u32 old_irr;
u32 mask = 1 << irq;
union kvm_ioapic_redirect_entry entry;
-   int ret = 1;
+   int ret = 1, level;
 
spin_lock(&ioapic->lock);
+   level = !!ioapic->irq_states[irq];
old_irr = ioapic->irr;
if (irq >= 0 && irq < IOAPIC_NUM_PINS) {
entry = ioapic->redirtbl[irq];
diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 32872a0..65894dd 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -74,7 +74,7 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int 
trigger_mode);
 bool kvm_ioapic_handles_vector(struct kvm *kvm, int vector);
 int kvm_ioapic_init(struct kvm *kvm);
 void kvm_ioapic_destroy(struct kvm *kvm);
-int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level);
+int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq);
 void kvm_ioapic_reset(struct kvm_ioapic *ioapic);
 int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
struct kvm_lapic_irq *irq);
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index a6a0365..db0ccef 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -33,7 +33,7 @@
 
 #include "ioapic.h"
 
-static inline int kvm_irq_line_state(unsigned long *irq_state,
+static inline void kvm_irq_line_state(unsigned long *irq_state,
 int irq_source_id, int level)
 {
/* Logical OR for level trig interrupt */
@@ -41,8 +41,6 @@ static inline int kvm_irq_line_state(unsigned long *irq_state,
set_bit(irq_source_id, irq_state);
else
clear_bit(irq_source_id, irq_state);
-
-   return !!(*irq_state);
 }
 
 static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e,
@@ -50,9 +48,9 @@ static int kvm_set_pic_irq(struct 
kvm_kernel_irq_routing_entry *e,
 {
 #ifdef CONFIG_X86
struct kvm_pic *pic = pic_irqchip(kvm);
-   level = kvm_irq_line_state(&pic->irq_states[e->irqchip.pin],
+   kvm_irq_line_state(&pic->irq_states[e->irqchip.pin],
   irq_source_id, level);
-   return kvm_pic_set_irq(pic, e->irqchip.pin, level);
+   return kvm_pic_set

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 04:22:10PM -0600, Alex Williamson wrote:
> On Wed, 2012-07-18 at 01:05 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 04:01:40PM -0600, Alex Williamson wrote:
> > > On Wed, 2012-07-18 at 00:05 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 01:51:27PM -0600, Alex Williamson wrote:
> > > > > On Tue, 2012-07-17 at 21:55 +0300, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jul 17, 2012 at 10:45:52AM -0600, Alex Williamson wrote:
> > > > > > > On Tue, 2012-07-17 at 19:21 +0300, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > > > > > > > > > >   And current code looks buggy if yes we need 
> > > > > > > > > > > > > > > > to fix it somehow.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Which to me seems to indicate this should be 
> > > > > > > > > > > > > > > handled as a separate
> > > > > > > > > > > > > > > effort.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > A separate patchset, sure. But likely a 
> > > > > > > > > > > > > > prerequisite: we still need to
> > > > > > > > > > > > > > look at all the code. Let's not copy bugs, need to 
> > > > > > > > > > > > > > fix them.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > This looks tangential to me unless you can come up 
> > > > > > > > > > > > > with an actual reason
> > > > > > > > > > > > > the above spinlock usage is incorrect or insufficient.
> > > > > > > > > > > > 
> > > > > > > > > > > > You copy the same pattern that seems racy. So you 
> > > > > > > > > > > > double the
> > > > > > > > > > > > amount of code that woul need to be fixed.
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > > > > > 
> > > > > > > > > > Look at this:
> > > > > > > > > > 
> > > > > > > > > > static inline int kvm_irq_line_state(unsigned long 
> > > > > > > > > > *irq_state,
> > > > > > > > > >  int irq_source_id, int 
> > > > > > > > > > level)
> > > > > > > > > > {
> > > > > > > > > > /* Logical OR for level trig interrupt */
> > > > > > > > > > if (level)
> > > > > > > > > > set_bit(irq_source_id, irq_state);
> > > > > > > > > > else
> > > > > > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > > > > > 
> > > > > > > > > > return !!(*irq_state);
> > > > > > > > > > }
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Now:
> > > > > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > > > > > 
> > > > > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > > > > and interrupt will get cleared.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Maybe this is not a problem. But in that case IMO it needs
> > > > > > > > > > a comment explaining why and why it's not a problem in
> > > > > > > > > > your code.
> > > > > > > > > 
> > > > > > > > > So you want to close the door on anything that uses 
> > > > > > > > > kvm_set_irq until
> > > > > > > > > this gets fixed... that's insane.
> > > > > > > > 
> > > > > > > > What does kvm_set_irq use have to do with it?  You posted this 
> > > > > > > > patch:
> > > > > > > > 
> > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > +struct kvm *kvm, int irq_source_id)
> > > > > > > > +{
> > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > +   int level =
> > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > +irq_source_id);
> > > > > > > > +   if (level)
> > > > > > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > > +   
> > > > > > > > !!pic->irq_states[e->irqchip.pin]);
> > > > > > > > +   return level;
> > > > > > > > +#else
> > > > > > > > +   return -1;
> > > > > > > > +#endif
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > 
> > > > > > > > it seems racy in the same way.
> > > > > > > 
> > > > > > > Now you're just misrepresenting how we got here, which was:
> > > > > > > 
> > > > > > > > > > > > > IMHO, we're going off into the weeds again with these 
> > > > > > > > > > > > > last
> > > > > > > > > > > > > two patches.  It may be a valid optimization, but it 
> > > > > > > > > > > > > really has no
> > > > > > > > > > > > > bearing on the meat of the series (and afaict, no 
> > > > > > > > > > > > > significant
> > > > > > > > > > > > > performance difference either).
> > > > > > > > > > > > 
> > > > > > > > > > > > For me 

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Wed, 2012-07-18 at 01:05 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 04:01:40PM -0600, Alex Williamson wrote:
> > On Wed, 2012-07-18 at 00:05 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 01:51:27PM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 21:55 +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 17, 2012 at 10:45:52AM -0600, Alex Williamson wrote:
> > > > > > On Tue, 2012-07-17 at 19:21 +0300, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > > > > > > > > >   And current code looks buggy if yes we need to 
> > > > > > > > > > > > > > > fix it somehow.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Which to me seems to indicate this should be 
> > > > > > > > > > > > > > handled as a separate
> > > > > > > > > > > > > > effort.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > A separate patchset, sure. But likely a prerequisite: 
> > > > > > > > > > > > > we still need to
> > > > > > > > > > > > > look at all the code. Let's not copy bugs, need to 
> > > > > > > > > > > > > fix them.
> > > > > > > > > > > > 
> > > > > > > > > > > > This looks tangential to me unless you can come up with 
> > > > > > > > > > > > an actual reason
> > > > > > > > > > > > the above spinlock usage is incorrect or insufficient.
> > > > > > > > > > > 
> > > > > > > > > > > You copy the same pattern that seems racy. So you double 
> > > > > > > > > > > the
> > > > > > > > > > > amount of code that woul need to be fixed.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > > > > 
> > > > > > > > > Look at this:
> > > > > > > > > 
> > > > > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > > > >  int irq_source_id, int 
> > > > > > > > > level)
> > > > > > > > > {
> > > > > > > > > /* Logical OR for level trig interrupt */
> > > > > > > > > if (level)
> > > > > > > > > set_bit(irq_source_id, irq_state);
> > > > > > > > > else
> > > > > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > > > > 
> > > > > > > > > return !!(*irq_state);
> > > > > > > > > }
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Now:
> > > > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > > > > 
> > > > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > > > and interrupt will get cleared.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Maybe this is not a problem. But in that case IMO it needs
> > > > > > > > > a comment explaining why and why it's not a problem in
> > > > > > > > > your code.
> > > > > > > > 
> > > > > > > > So you want to close the door on anything that uses kvm_set_irq 
> > > > > > > > until
> > > > > > > > this gets fixed... that's insane.
> > > > > > > 
> > > > > > > What does kvm_set_irq use have to do with it?  You posted this 
> > > > > > > patch:
> > > > > > > 
> > > > > > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry 
> > > > > > > *e,
> > > > > > > +struct kvm *kvm, int irq_source_id)
> > > > > > > +{
> > > > > > > +#ifdef CONFIG_X86
> > > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > +   int level =
> > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > +irq_source_id);
> > > > > > > +   if (level)
> > > > > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > +   
> > > > > > > !!pic->irq_states[e->irqchip.pin]);
> > > > > > > +   return level;
> > > > > > > +#else
> > > > > > > +   return -1;
> > > > > > > +#endif
> > > > > > > +}
> > > > > > > +
> > > > > > > 
> > > > > > > it seems racy in the same way.
> > > > > > 
> > > > > > Now you're just misrepresenting how we got here, which was:
> > > > > > 
> > > > > > > > > > > > IMHO, we're going off into the weeds again with these 
> > > > > > > > > > > > last
> > > > > > > > > > > > two patches.  It may be a valid optimization, but it 
> > > > > > > > > > > > really has no
> > > > > > > > > > > > bearing on the meat of the series (and afaict, no 
> > > > > > > > > > > > significant
> > > > > > > > > > > > performance difference either).
> > > > > > > > > > > 
> > > > > > > > > > > For me it's not a performance thing. IMO code is cleaner 
> > > > > > > > > > > without this locking:
> > > > > > > > > > > we add a lock but only use it in some cases, so the rules 
> > > > > > > > > > > become really
> > > > > > > > > > > complex.
> > > > > > 
> > > > > > So I'm happy to drop

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 04:01:40PM -0600, Alex Williamson wrote:
> On Wed, 2012-07-18 at 00:05 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 01:51:27PM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 21:55 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 10:45:52AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2012-07-17 at 19:21 +0300, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > > > > > > > >   And current code looks buggy if yes we need to 
> > > > > > > > > > > > > > fix it somehow.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Which to me seems to indicate this should be handled 
> > > > > > > > > > > > > as a separate
> > > > > > > > > > > > > effort.
> > > > > > > > > > > > 
> > > > > > > > > > > > A separate patchset, sure. But likely a prerequisite: 
> > > > > > > > > > > > we still need to
> > > > > > > > > > > > look at all the code. Let's not copy bugs, need to fix 
> > > > > > > > > > > > them.
> > > > > > > > > > > 
> > > > > > > > > > > This looks tangential to me unless you can come up with 
> > > > > > > > > > > an actual reason
> > > > > > > > > > > the above spinlock usage is incorrect or insufficient.
> > > > > > > > > > 
> > > > > > > > > > You copy the same pattern that seems racy. So you double the
> > > > > > > > > > amount of code that woul need to be fixed.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > > > 
> > > > > > > > Look at this:
> > > > > > > > 
> > > > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > > >  int irq_source_id, int 
> > > > > > > > level)
> > > > > > > > {
> > > > > > > > /* Logical OR for level trig interrupt */
> > > > > > > > if (level)
> > > > > > > > set_bit(irq_source_id, irq_state);
> > > > > > > > else
> > > > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > > > 
> > > > > > > > return !!(*irq_state);
> > > > > > > > }
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Now:
> > > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > > > 
> > > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > > and interrupt will get cleared.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Maybe this is not a problem. But in that case IMO it needs
> > > > > > > > a comment explaining why and why it's not a problem in
> > > > > > > > your code.
> > > > > > > 
> > > > > > > So you want to close the door on anything that uses kvm_set_irq 
> > > > > > > until
> > > > > > > this gets fixed... that's insane.
> > > > > > 
> > > > > > What does kvm_set_irq use have to do with it?  You posted this 
> > > > > > patch:
> > > > > > 
> > > > > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry 
> > > > > > *e,
> > > > > > +struct kvm *kvm, int irq_source_id)
> > > > > > +{
> > > > > > +#ifdef CONFIG_X86
> > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > +   int level =
> > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > +irq_source_id);
> > > > > > +   if (level)
> > > > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > +   !!pic->irq_states[e->irqchip.pin]);
> > > > > > +   return level;
> > > > > > +#else
> > > > > > +   return -1;
> > > > > > +#endif
> > > > > > +}
> > > > > > +
> > > > > > 
> > > > > > it seems racy in the same way.
> > > > > 
> > > > > Now you're just misrepresenting how we got here, which was:
> > > > > 
> > > > > > > > > > > IMHO, we're going off into the weeds again with these last
> > > > > > > > > > > two patches.  It may be a valid optimization, but it 
> > > > > > > > > > > really has no
> > > > > > > > > > > bearing on the meat of the series (and afaict, no 
> > > > > > > > > > > significant
> > > > > > > > > > > performance difference either).
> > > > > > > > > > 
> > > > > > > > > > For me it's not a performance thing. IMO code is cleaner 
> > > > > > > > > > without this locking:
> > > > > > > > > > we add a lock but only use it in some cases, so the rules 
> > > > > > > > > > become really
> > > > > > > > > > complex.
> > > > > 
> > > > > So I'm happy to drop the last 2 patches, which were done at your 
> > > > > request
> > > > > anyway, but you've failed to show how the locking in patches 1&2 is
> > > > > messy, inconsistent, or complex and now you're asking to block all
> > > > > progress.
> > > > 
> > > > I'm asking for bugs to get fixed and not duplicated. Adding more bugs is
>

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Wed, 2012-07-18 at 00:05 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 01:51:27PM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 21:55 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 10:45:52AM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 19:21 +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > > > > > > >   And current code looks buggy if yes we need to fix 
> > > > > > > > > > > > > it somehow.
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Which to me seems to indicate this should be handled as 
> > > > > > > > > > > > a separate
> > > > > > > > > > > > effort.
> > > > > > > > > > > 
> > > > > > > > > > > A separate patchset, sure. But likely a prerequisite: we 
> > > > > > > > > > > still need to
> > > > > > > > > > > look at all the code. Let's not copy bugs, need to fix 
> > > > > > > > > > > them.
> > > > > > > > > > 
> > > > > > > > > > This looks tangential to me unless you can come up with an 
> > > > > > > > > > actual reason
> > > > > > > > > > the above spinlock usage is incorrect or insufficient.
> > > > > > > > > 
> > > > > > > > > You copy the same pattern that seems racy. So you double the
> > > > > > > > > amount of code that woul need to be fixed.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > > 
> > > > > > > Look at this:
> > > > > > > 
> > > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > >  int irq_source_id, int level)
> > > > > > > {
> > > > > > > /* Logical OR for level trig interrupt */
> > > > > > > if (level)
> > > > > > > set_bit(irq_source_id, irq_state);
> > > > > > > else
> > > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > > 
> > > > > > > return !!(*irq_state);
> > > > > > > }
> > > > > > > 
> > > > > > > 
> > > > > > > Now:
> > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > > 
> > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > and interrupt will get cleared.
> > > > > > > 
> > > > > > > 
> > > > > > > Maybe this is not a problem. But in that case IMO it needs
> > > > > > > a comment explaining why and why it's not a problem in
> > > > > > > your code.
> > > > > > 
> > > > > > So you want to close the door on anything that uses kvm_set_irq 
> > > > > > until
> > > > > > this gets fixed... that's insane.
> > > > > 
> > > > > What does kvm_set_irq use have to do with it?  You posted this patch:
> > > > > 
> > > > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > > > +struct kvm *kvm, int irq_source_id)
> > > > > +{
> > > > > +#ifdef CONFIG_X86
> > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > +   int level =
> > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > +irq_source_id);
> > > > > +   if (level)
> > > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > +   !!pic->irq_states[e->irqchip.pin]);
> > > > > +   return level;
> > > > > +#else
> > > > > +   return -1;
> > > > > +#endif
> > > > > +}
> > > > > +
> > > > > 
> > > > > it seems racy in the same way.
> > > > 
> > > > Now you're just misrepresenting how we got here, which was:
> > > > 
> > > > > > > > > > IMHO, we're going off into the weeds again with these last
> > > > > > > > > > two patches.  It may be a valid optimization, but it really 
> > > > > > > > > > has no
> > > > > > > > > > bearing on the meat of the series (and afaict, no 
> > > > > > > > > > significant
> > > > > > > > > > performance difference either).
> > > > > > > > > 
> > > > > > > > > For me it's not a performance thing. IMO code is cleaner 
> > > > > > > > > without this locking:
> > > > > > > > > we add a lock but only use it in some cases, so the rules 
> > > > > > > > > become really
> > > > > > > > > complex.
> > > > 
> > > > So I'm happy to drop the last 2 patches, which were done at your request
> > > > anyway, but you've failed to show how the locking in patches 1&2 is
> > > > messy, inconsistent, or complex and now you're asking to block all
> > > > progress.
> > > 
> > > I'm asking for bugs to get fixed and not duplicated. Adding more bugs is
> > > not progress. Or maybe there is no bug. Let's see why and add a comment.
> > > 
> > > >  Those patches are just users of kvm_set_irq.
> > > 
> > > 
> > > Well these add calls to kvm_set_irq which scans all vcpus under
> > > spinlock. In the past Avi thought this is not a good idea too.
> > > Maybe things changed.
> > 
> 

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 01:51:27PM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 21:55 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 10:45:52AM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 19:21 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > > > > > >   And current code looks buggy if yes we need to fix it 
> > > > > > > > > > > > somehow.
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Which to me seems to indicate this should be handled as a 
> > > > > > > > > > > separate
> > > > > > > > > > > effort.
> > > > > > > > > > 
> > > > > > > > > > A separate patchset, sure. But likely a prerequisite: we 
> > > > > > > > > > still need to
> > > > > > > > > > look at all the code. Let's not copy bugs, need to fix them.
> > > > > > > > > 
> > > > > > > > > This looks tangential to me unless you can come up with an 
> > > > > > > > > actual reason
> > > > > > > > > the above spinlock usage is incorrect or insufficient.
> > > > > > > > 
> > > > > > > > You copy the same pattern that seems racy. So you double the
> > > > > > > > amount of code that woul need to be fixed.
> > > > > > > 
> > > > > > > 
> > > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > > 
> > > > > > Look at this:
> > > > > > 
> > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > >  int irq_source_id, int level)
> > > > > > {
> > > > > > /* Logical OR for level trig interrupt */
> > > > > > if (level)
> > > > > > set_bit(irq_source_id, irq_state);
> > > > > > else
> > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > 
> > > > > > return !!(*irq_state);
> > > > > > }
> > > > > > 
> > > > > > 
> > > > > > Now:
> > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > 
> > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > and interrupt will get cleared.
> > > > > > 
> > > > > > 
> > > > > > Maybe this is not a problem. But in that case IMO it needs
> > > > > > a comment explaining why and why it's not a problem in
> > > > > > your code.
> > > > > 
> > > > > So you want to close the door on anything that uses kvm_set_irq until
> > > > > this gets fixed... that's insane.
> > > > 
> > > > What does kvm_set_irq use have to do with it?  You posted this patch:
> > > > 
> > > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > > +struct kvm *kvm, int irq_source_id)
> > > > +{
> > > > +#ifdef CONFIG_X86
> > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > +   int level =
> > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > +irq_source_id);
> > > > +   if (level)
> > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > +   !!pic->irq_states[e->irqchip.pin]);
> > > > +   return level;
> > > > +#else
> > > > +   return -1;
> > > > +#endif
> > > > +}
> > > > +
> > > > 
> > > > it seems racy in the same way.
> > > 
> > > Now you're just misrepresenting how we got here, which was:
> > > 
> > > > > > > > > IMHO, we're going off into the weeds again with these last
> > > > > > > > > two patches.  It may be a valid optimization, but it really 
> > > > > > > > > has no
> > > > > > > > > bearing on the meat of the series (and afaict, no significant
> > > > > > > > > performance difference either).
> > > > > > > > 
> > > > > > > > For me it's not a performance thing. IMO code is cleaner 
> > > > > > > > without this locking:
> > > > > > > > we add a lock but only use it in some cases, so the rules 
> > > > > > > > become really
> > > > > > > > complex.
> > > 
> > > So I'm happy to drop the last 2 patches, which were done at your request
> > > anyway, but you've failed to show how the locking in patches 1&2 is
> > > messy, inconsistent, or complex and now you're asking to block all
> > > progress.
> > 
> > I'm asking for bugs to get fixed and not duplicated. Adding more bugs is
> > not progress. Or maybe there is no bug. Let's see why and add a comment.
> > 
> > >  Those patches are just users of kvm_set_irq.
> > 
> > 
> > Well these add calls to kvm_set_irq which scans all vcpus under
> > spinlock. In the past Avi thought this is not a good idea too.
> > Maybe things changed.
> 
> We can drop the spinlock if we don't care about spurious EOIs, which is
> only a theoretical scalability problem anyway.

Not theoretical at all IMO. We see the problem with virtio with old
guests today.

> We're talking about
> level interrupts here, how scalable do we need to be?
> 

The reason we are moving them into kernel at all 

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Tue, 2012-07-17 at 21:55 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 10:45:52AM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 19:21 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > > > > >   And current code looks buggy if yes we need to fix it 
> > > > > > > > > > > somehow.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Which to me seems to indicate this should be handled as a 
> > > > > > > > > > separate
> > > > > > > > > > effort.
> > > > > > > > > 
> > > > > > > > > A separate patchset, sure. But likely a prerequisite: we 
> > > > > > > > > still need to
> > > > > > > > > look at all the code. Let's not copy bugs, need to fix them.
> > > > > > > > 
> > > > > > > > This looks tangential to me unless you can come up with an 
> > > > > > > > actual reason
> > > > > > > > the above spinlock usage is incorrect or insufficient.
> > > > > > > 
> > > > > > > You copy the same pattern that seems racy. So you double the
> > > > > > > amount of code that woul need to be fixed.
> > > > > > 
> > > > > > 
> > > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > > 
> > > > > Look at this:
> > > > > 
> > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > >  int irq_source_id, int level)
> > > > > {
> > > > > /* Logical OR for level trig interrupt */
> > > > > if (level)
> > > > > set_bit(irq_source_id, irq_state);
> > > > > else
> > > > > clear_bit(irq_source_id, irq_state);
> > > > > 
> > > > > return !!(*irq_state);
> > > > > }
> > > > > 
> > > > > 
> > > > > Now:
> > > > > If other CPU changes some other bit after the atomic change,
> > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > 
> > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > and interrupt will get cleared.
> > > > > 
> > > > > 
> > > > > Maybe this is not a problem. But in that case IMO it needs
> > > > > a comment explaining why and why it's not a problem in
> > > > > your code.
> > > > 
> > > > So you want to close the door on anything that uses kvm_set_irq until
> > > > this gets fixed... that's insane.
> > > 
> > > What does kvm_set_irq use have to do with it?  You posted this patch:
> > > 
> > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > +struct kvm *kvm, int irq_source_id)
> > > +{
> > > +#ifdef CONFIG_X86
> > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > +   int level =
> > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > +irq_source_id);
> > > +   if (level)
> > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > +   !!pic->irq_states[e->irqchip.pin]);
> > > +   return level;
> > > +#else
> > > +   return -1;
> > > +#endif
> > > +}
> > > +
> > > 
> > > it seems racy in the same way.
> > 
> > Now you're just misrepresenting how we got here, which was:
> > 
> > > > > > > > IMHO, we're going off into the weeds again with these last
> > > > > > > > two patches.  It may be a valid optimization, but it really has 
> > > > > > > > no
> > > > > > > > bearing on the meat of the series (and afaict, no significant
> > > > > > > > performance difference either).
> > > > > > > 
> > > > > > > For me it's not a performance thing. IMO code is cleaner without 
> > > > > > > this locking:
> > > > > > > we add a lock but only use it in some cases, so the rules become 
> > > > > > > really
> > > > > > > complex.
> > 
> > So I'm happy to drop the last 2 patches, which were done at your request
> > anyway, but you've failed to show how the locking in patches 1&2 is
> > messy, inconsistent, or complex and now you're asking to block all
> > progress.
> 
> I'm asking for bugs to get fixed and not duplicated. Adding more bugs is
> not progress. Or maybe there is no bug. Let's see why and add a comment.
> 
> >  Those patches are just users of kvm_set_irq.
> 
> 
> Well these add calls to kvm_set_irq which scans all vcpus under
> spinlock. In the past Avi thought this is not a good idea too.
> Maybe things changed.

We can drop the spinlock if we don't care about spurious EOIs, which is
only a theoretical scalability problem anyway.  We're talking about
level interrupts here, how scalable do we need to be?



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 10:45:52AM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 19:21 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > > > >   And current code looks buggy if yes we need to fix it 
> > > > > > > > > > somehow.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Which to me seems to indicate this should be handled as a 
> > > > > > > > > separate
> > > > > > > > > effort.
> > > > > > > > 
> > > > > > > > A separate patchset, sure. But likely a prerequisite: we still 
> > > > > > > > need to
> > > > > > > > look at all the code. Let's not copy bugs, need to fix them.
> > > > > > > 
> > > > > > > This looks tangential to me unless you can come up with an actual 
> > > > > > > reason
> > > > > > > the above spinlock usage is incorrect or insufficient.
> > > > > > 
> > > > > > You copy the same pattern that seems racy. So you double the
> > > > > > amount of code that woul need to be fixed.
> > > > > 
> > > > > 
> > > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > > 
> > > > Look at this:
> > > > 
> > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > >  int irq_source_id, int level)
> > > > {
> > > > /* Logical OR for level trig interrupt */
> > > > if (level)
> > > > set_bit(irq_source_id, irq_state);
> > > > else
> > > > clear_bit(irq_source_id, irq_state);
> > > > 
> > > > return !!(*irq_state);
> > > > }
> > > > 
> > > > 
> > > > Now:
> > > > If other CPU changes some other bit after the atomic change,
> > > > it looks like !!(*irq_state) might return a stale value.
> > > > 
> > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > If CPU 0 sees a stale value now it will return 0 here
> > > > and interrupt will get cleared.
> > > > 
> > > > 
> > > > Maybe this is not a problem. But in that case IMO it needs
> > > > a comment explaining why and why it's not a problem in
> > > > your code.
> > > 
> > > So you want to close the door on anything that uses kvm_set_irq until
> > > this gets fixed... that's insane.
> > 
> > What does kvm_set_irq use have to do with it?  You posted this patch:
> > 
> > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > +struct kvm *kvm, int irq_source_id)
> > +{
> > +#ifdef CONFIG_X86
> > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > +   int level =
> > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > +irq_source_id);
> > +   if (level)
> > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > +   !!pic->irq_states[e->irqchip.pin]);
> > +   return level;
> > +#else
> > +   return -1;
> > +#endif
> > +}
> > +
> > 
> > it seems racy in the same way.
> 
> Now you're just misrepresenting how we got here, which was:
> 
> > > > > > > IMHO, we're going off into the weeds again with these last
> > > > > > > two patches.  It may be a valid optimization, but it really has no
> > > > > > > bearing on the meat of the series (and afaict, no significant
> > > > > > > performance difference either).
> > > > > > 
> > > > > > For me it's not a performance thing. IMO code is cleaner without 
> > > > > > this locking:
> > > > > > we add a lock but only use it in some cases, so the rules become 
> > > > > > really
> > > > > > complex.
> 
> So I'm happy to drop the last 2 patches, which were done at your request
> anyway, but you've failed to show how the locking in patches 1&2 is
> messy, inconsistent, or complex and now you're asking to block all
> progress.

I'm asking for bugs to get fixed and not duplicated. Adding more bugs is
not progress. Or maybe there is no bug. Let's see why and add a comment.

>  Those patches are just users of kvm_set_irq.


Well these add calls to kvm_set_irq which scans all vcpus under
spinlock. In the past Avi thought this is not a good idea too.
Maybe things changed.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Gleb Natapov
On Tue, Jul 17, 2012 at 07:36:49PM +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 10:08:21AM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 18:57 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 09:51:41AM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 18:36 +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> > > > > > On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > > > > > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson 
> > > > > > > > > wrote:
> > > > > > > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > This is an alternative to kvm_set_irq(,,,0) which 
> > > > > > > > > > > > returns the previous
> > > > > > > > > > > > assertion state of the interrupt and does nothing if it 
> > > > > > > > > > > > isn't changed.
> > > > > > > > > > > > 
> > > > > > > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > > > > > > 
> > > > > > > > > > > > ---
> > > > > > > > > > > > 
> > > > > > > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > > > > > > ++
> > > > > > > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/include/linux/kvm_host.h 
> > > > > > > > > > > > b/include/linux/kvm_host.h
> > > > > > > > > > > > index a7661c0..6c168f1 100644
> > > > > > > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > > > {
> > > > > > > > > > > > u32 type;
> > > > > > > > > > > > int (*set)(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > > > *e,
> > > > > > > > > > > >struct kvm *kvm, int irq_source_id, 
> > > > > > > > > > > > int level);
> > > > > > > > > > > > +   int (*clear)(struct 
> > > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > > +struct kvm *kvm, int 
> > > > > > > > > > > > irq_source_id);
> > > > > > > > > > > > union {
> > > > > > > > > > > > struct {
> > > > > > > > > > > > unsigned irqchip;
> > > > > > > > > > > > @@ -629,6 +631,7 @@ void 
> > > > > > > > > > > > kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> > > > > > > > > > > >unsigned long 
> > > > > > > > > > > > *deliver_bitmask);
> > > > > > > > > > > >  #endif
> > > > > > > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, 
> > > > > > > > > > > > u32 irq, int level);
> > > > > > > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, 
> > > > > > > > > > > > u32 irq);
> > > > > > > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > > > *irq_entry, struct kvm *kvm,
> > > > > > > > > > > > int irq_source_id, int level);
> > > > > > > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned 
> > > > > > > > > > > > irqchip, unsigned pin);
> > > > > > > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > > > > > > index 5afb431..76e8f22 100644
> > > > > > > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > > return kvm_ioapic_set_irq(ioapic, 
> > > > > > > > > > > > e->irqchip.pin, level);
> > > > > > > > > > > >  }
> > > > > > > > > > > >  
> > > > > > > > > > > > +static inline int kvm_clear_irq_line_state(unsigned 
> > > > > > > > > > > > long *irq_state,
> > > > > > > > > > > > +   int 
> > > > > > > > > > > > irq_source_id)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +   return !!test_and_clear_bit(irq_source_id, 
> > > > > > > > > > > > irq_state);
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > > +struct kvm *kvm, int 
> > > > > > > > > > > > irq_source_id)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > > > > > +   int level = 
> > > > > > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > > > > > +

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Tue, 2012-07-17 at 19:21 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > > >   And current code looks buggy if yes we need to fix it 
> > > > > > > > > somehow.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Which to me seems to indicate this should be handled as a 
> > > > > > > > separate
> > > > > > > > effort.
> > > > > > > 
> > > > > > > A separate patchset, sure. But likely a prerequisite: we still 
> > > > > > > need to
> > > > > > > look at all the code. Let's not copy bugs, need to fix them.
> > > > > > 
> > > > > > This looks tangential to me unless you can come up with an actual 
> > > > > > reason
> > > > > > the above spinlock usage is incorrect or insufficient.
> > > > > 
> > > > > You copy the same pattern that seems racy. So you double the
> > > > > amount of code that woul need to be fixed.
> > > > 
> > > > 
> > > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > > 
> > > Look at this:
> > > 
> > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > >  int irq_source_id, int level)
> > > {
> > > /* Logical OR for level trig interrupt */
> > > if (level)
> > > set_bit(irq_source_id, irq_state);
> > > else
> > > clear_bit(irq_source_id, irq_state);
> > > 
> > > return !!(*irq_state);
> > > }
> > > 
> > > 
> > > Now:
> > > If other CPU changes some other bit after the atomic change,
> > > it looks like !!(*irq_state) might return a stale value.
> > > 
> > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > If CPU 0 sees a stale value now it will return 0 here
> > > and interrupt will get cleared.
> > > 
> > > 
> > > Maybe this is not a problem. But in that case IMO it needs
> > > a comment explaining why and why it's not a problem in
> > > your code.
> > 
> > So you want to close the door on anything that uses kvm_set_irq until
> > this gets fixed... that's insane.
> 
> What does kvm_set_irq use have to do with it?  You posted this patch:
> 
> +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> +struct kvm *kvm, int irq_source_id)
> +{
> +#ifdef CONFIG_X86
> +   struct kvm_pic *pic = pic_irqchip(kvm);
> +   int level =
> kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> +irq_source_id);
> +   if (level)
> +   kvm_pic_set_irq(pic, e->irqchip.pin,
> +   !!pic->irq_states[e->irqchip.pin]);
> +   return level;
> +#else
> +   return -1;
> +#endif
> +}
> +
> 
> it seems racy in the same way.

Now you're just misrepresenting how we got here, which was:

> > > > > > IMHO, we're going off into the weeds again with these last
> > > > > > two patches.  It may be a valid optimization, but it really has no
> > > > > > bearing on the meat of the series (and afaict, no significant
> > > > > > performance difference either).
> > > > > 
> > > > > For me it's not a performance thing. IMO code is cleaner without this 
> > > > > locking:
> > > > > we add a lock but only use it in some cases, so the rules become 
> > > > > really
> > > > > complex.

So I'm happy to drop the last 2 patches, which were done at your request
anyway, but you've failed to show how the locking in patches 1&2 is
messy, inconsistent, or complex and now you're asking to block all
progress.  Those patches are just users of kvm_set_irq.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 10:08:21AM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 18:57 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 09:51:41AM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 18:36 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > > > > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > > > > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson 
> > > > > > > > > > wrote:
> > > > > > > > > > > This is an alternative to kvm_set_irq(,,,0) which returns 
> > > > > > > > > > > the previous
> > > > > > > > > > > assertion state of the interrupt and does nothing if it 
> > > > > > > > > > > isn't changed.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > > > > > 
> > > > > > > > > > > ---
> > > > > > > > > > > 
> > > > > > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > > > > > ++
> > > > > > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/include/linux/kvm_host.h 
> > > > > > > > > > > b/include/linux/kvm_host.h
> > > > > > > > > > > index a7661c0..6c168f1 100644
> > > > > > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > > > > > > > >   u32 type;
> > > > > > > > > > >   int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > >  struct kvm *kvm, int irq_source_id, int 
> > > > > > > > > > > level);
> > > > > > > > > > > + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > +  struct kvm *kvm, int irq_source_id);
> > > > > > > > > > >   union {
> > > > > > > > > > >   struct {
> > > > > > > > > > >   unsigned irqchip;
> > > > > > > > > > > @@ -629,6 +631,7 @@ void 
> > > > > > > > > > > kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> > > > > > > > > > >  unsigned long 
> > > > > > > > > > > *deliver_bitmask);
> > > > > > > > > > >  #endif
> > > > > > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 
> > > > > > > > > > > irq, int level);
> > > > > > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, 
> > > > > > > > > > > u32 irq);
> > > > > > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > > *irq_entry, struct kvm *kvm,
> > > > > > > > > > >   int irq_source_id, int level);
> > > > > > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned 
> > > > > > > > > > > irqchip, unsigned pin);
> > > > > > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > > > > > index 5afb431..76e8f22 100644
> > > > > > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > >   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, 
> > > > > > > > > > > level);
> > > > > > > > > > >  }
> > > > > > > > > > >  
> > > > > > > > > > > +static inline int kvm_clear_irq_line_state(unsigned long 
> > > > > > > > > > > *irq_state,
> > > > > > > > > > > + int irq_source_id)
> > > > > > > > > > > +{
> > > > > > > > > > > + return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > > > > > > > +}
> > > > > > > > > > > +
> > > > > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > +  struct kvm *kvm, int irq_source_id)
> > > > > > > > > > > +{
> > > > > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > > > > + struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > > > > + int level = 
> > > > > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > > > > +  irq_source_id);
> > > > > > > > > > > + if (level)
> > > > > > > > > > > + kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > > > > > + 
> > > > > > > > > > > !!pic->irq_states[e->irqchip.pin]);
> > > > > > > > > > > + return level;
> > > > > > > > > > 
> > > > > > > > > > I think I begin to understand: if (level) checks it was 
> > > > > > > > > > previously set,
> > > > > > > > > > and then we clear if needed?
> > > > > > > > > 
> > > > > > > > > It's actually very simple, if

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 10:17:03AM -0600, Alex Williamson wrote:
> > > > > > > >   And current code looks buggy if yes we need to fix it somehow.
> > > > > > > 
> > > > > > > 
> > > > > > > Which to me seems to indicate this should be handled as a separate
> > > > > > > effort.
> > > > > > 
> > > > > > A separate patchset, sure. But likely a prerequisite: we still need 
> > > > > > to
> > > > > > look at all the code. Let's not copy bugs, need to fix them.
> > > > > 
> > > > > This looks tangential to me unless you can come up with an actual 
> > > > > reason
> > > > > the above spinlock usage is incorrect or insufficient.
> > > > 
> > > > You copy the same pattern that seems racy. So you double the
> > > > amount of code that woul need to be fixed.
> > > 
> > > 
> > > _Seems_ racy, or _is_ racy?  Please identify the race.
> > 
> > Look at this:
> > 
> > static inline int kvm_irq_line_state(unsigned long *irq_state,
> >  int irq_source_id, int level)
> > {
> > /* Logical OR for level trig interrupt */
> > if (level)
> > set_bit(irq_source_id, irq_state);
> > else
> > clear_bit(irq_source_id, irq_state);
> > 
> > return !!(*irq_state);
> > }
> > 
> > 
> > Now:
> > If other CPU changes some other bit after the atomic change,
> > it looks like !!(*irq_state) might return a stale value.
> > 
> > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > If CPU 0 sees a stale value now it will return 0 here
> > and interrupt will get cleared.
> > 
> > 
> > Maybe this is not a problem. But in that case IMO it needs
> > a comment explaining why and why it's not a problem in
> > your code.
> 
> So you want to close the door on anything that uses kvm_set_irq until
> this gets fixed... that's insane.

What does kvm_set_irq use have to do with it?  You posted this patch:

+static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
+struct kvm *kvm, int irq_source_id)
+{
+#ifdef CONFIG_X86
+   struct kvm_pic *pic = pic_irqchip(kvm);
+   int level =
kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
+irq_source_id);
+   if (level)
+   kvm_pic_set_irq(pic, e->irqchip.pin,
+   !!pic->irq_states[e->irqchip.pin]);
+   return level;
+#else
+   return -1;
+#endif
+}
+

it seems racy in the same way.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Tue, 2012-07-17 at 19:14 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 10:08:21AM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 18:57 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 09:51:41AM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 18:36 +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> > > > > > On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > > > > > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson 
> > > > > > > > > wrote:
> > > > > > > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > This is an alternative to kvm_set_irq(,,,0) which 
> > > > > > > > > > > > returns the previous
> > > > > > > > > > > > assertion state of the interrupt and does nothing if it 
> > > > > > > > > > > > isn't changed.
> > > > > > > > > > > > 
> > > > > > > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > > > > > > 
> > > > > > > > > > > > ---
> > > > > > > > > > > > 
> > > > > > > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > > > > > > ++
> > > > > > > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/include/linux/kvm_host.h 
> > > > > > > > > > > > b/include/linux/kvm_host.h
> > > > > > > > > > > > index a7661c0..6c168f1 100644
> > > > > > > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > > > {
> > > > > > > > > > > > u32 type;
> > > > > > > > > > > > int (*set)(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > > > *e,
> > > > > > > > > > > >struct kvm *kvm, int irq_source_id, 
> > > > > > > > > > > > int level);
> > > > > > > > > > > > +   int (*clear)(struct 
> > > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > > +struct kvm *kvm, int 
> > > > > > > > > > > > irq_source_id);
> > > > > > > > > > > > union {
> > > > > > > > > > > > struct {
> > > > > > > > > > > > unsigned irqchip;
> > > > > > > > > > > > @@ -629,6 +631,7 @@ void 
> > > > > > > > > > > > kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> > > > > > > > > > > >unsigned long 
> > > > > > > > > > > > *deliver_bitmask);
> > > > > > > > > > > >  #endif
> > > > > > > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, 
> > > > > > > > > > > > u32 irq, int level);
> > > > > > > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, 
> > > > > > > > > > > > u32 irq);
> > > > > > > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > > > *irq_entry, struct kvm *kvm,
> > > > > > > > > > > > int irq_source_id, int level);
> > > > > > > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned 
> > > > > > > > > > > > irqchip, unsigned pin);
> > > > > > > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > > > > > > index 5afb431..76e8f22 100644
> > > > > > > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > > return kvm_ioapic_set_irq(ioapic, 
> > > > > > > > > > > > e->irqchip.pin, level);
> > > > > > > > > > > >  }
> > > > > > > > > > > >  
> > > > > > > > > > > > +static inline int kvm_clear_irq_line_state(unsigned 
> > > > > > > > > > > > long *irq_state,
> > > > > > > > > > > > +   int 
> > > > > > > > > > > > irq_source_id)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +   return !!test_and_clear_bit(irq_source_id, 
> > > > > > > > > > > > irq_state);
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > > +struct kvm *kvm, int 
> > > > > > > > > > > > irq_source_id)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > > > > > +   int level = 
> > > > > > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > > > > > +   

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 10:08:21AM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 18:57 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 09:51:41AM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 18:36 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > > > > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > > > > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson 
> > > > > > > > > > wrote:
> > > > > > > > > > > This is an alternative to kvm_set_irq(,,,0) which returns 
> > > > > > > > > > > the previous
> > > > > > > > > > > assertion state of the interrupt and does nothing if it 
> > > > > > > > > > > isn't changed.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > > > > > 
> > > > > > > > > > > ---
> > > > > > > > > > > 
> > > > > > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > > > > > ++
> > > > > > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/include/linux/kvm_host.h 
> > > > > > > > > > > b/include/linux/kvm_host.h
> > > > > > > > > > > index a7661c0..6c168f1 100644
> > > > > > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > > > > > > > >   u32 type;
> > > > > > > > > > >   int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > >  struct kvm *kvm, int irq_source_id, int 
> > > > > > > > > > > level);
> > > > > > > > > > > + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > +  struct kvm *kvm, int irq_source_id);
> > > > > > > > > > >   union {
> > > > > > > > > > >   struct {
> > > > > > > > > > >   unsigned irqchip;
> > > > > > > > > > > @@ -629,6 +631,7 @@ void 
> > > > > > > > > > > kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> > > > > > > > > > >  unsigned long 
> > > > > > > > > > > *deliver_bitmask);
> > > > > > > > > > >  #endif
> > > > > > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 
> > > > > > > > > > > irq, int level);
> > > > > > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, 
> > > > > > > > > > > u32 irq);
> > > > > > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > > *irq_entry, struct kvm *kvm,
> > > > > > > > > > >   int irq_source_id, int level);
> > > > > > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned 
> > > > > > > > > > > irqchip, unsigned pin);
> > > > > > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > > > > > index 5afb431..76e8f22 100644
> > > > > > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > >   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, 
> > > > > > > > > > > level);
> > > > > > > > > > >  }
> > > > > > > > > > >  
> > > > > > > > > > > +static inline int kvm_clear_irq_line_state(unsigned long 
> > > > > > > > > > > *irq_state,
> > > > > > > > > > > + int irq_source_id)
> > > > > > > > > > > +{
> > > > > > > > > > > + return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > > > > > > > +}
> > > > > > > > > > > +
> > > > > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > > +  struct kvm *kvm, int irq_source_id)
> > > > > > > > > > > +{
> > > > > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > > > > + struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > > > > + int level = 
> > > > > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > > > > +  irq_source_id);
> > > > > > > > > > > + if (level)
> > > > > > > > > > > + kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > > > > > + 
> > > > > > > > > > > !!pic->irq_states[e->irqchip.pin]);
> > > > > > > > > > > + return level;
> > > > > > > > > > 
> > > > > > > > > > I think I begin to understand: if (level) checks it was 
> > > > > > > > > > previously set,
> > > > > > > > > > and then we clear if needed?
> > > > > > > > > 
> > > > > > > > > It's actually very simple, if

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Tue, 2012-07-17 at 18:57 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 09:51:41AM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 18:36 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > > > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > > > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson 
> > > > > > > > > wrote:
> > > > > > > > > > This is an alternative to kvm_set_irq(,,,0) which returns 
> > > > > > > > > > the previous
> > > > > > > > > > assertion state of the interrupt and does nothing if it 
> > > > > > > > > > isn't changed.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > > > > ---
> > > > > > > > > > 
> > > > > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > > > > ++
> > > > > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/include/linux/kvm_host.h 
> > > > > > > > > > b/include/linux/kvm_host.h
> > > > > > > > > > index a7661c0..6c168f1 100644
> > > > > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > > > > > > > u32 type;
> > > > > > > > > > int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > >struct kvm *kvm, int irq_source_id, int 
> > > > > > > > > > level);
> > > > > > > > > > +   int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > +struct kvm *kvm, int irq_source_id);
> > > > > > > > > > union {
> > > > > > > > > > struct {
> > > > > > > > > > unsigned irqchip;
> > > > > > > > > > @@ -629,6 +631,7 @@ void 
> > > > > > > > > > kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> > > > > > > > > >unsigned long 
> > > > > > > > > > *deliver_bitmask);
> > > > > > > > > >  #endif
> > > > > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 
> > > > > > > > > > irq, int level);
> > > > > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 
> > > > > > > > > > irq);
> > > > > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > *irq_entry, struct kvm *kvm,
> > > > > > > > > > int irq_source_id, int level);
> > > > > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned 
> > > > > > > > > > irqchip, unsigned pin);
> > > > > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > > > > index 5afb431..76e8f22 100644
> > > > > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, 
> > > > > > > > > > level);
> > > > > > > > > >  }
> > > > > > > > > >  
> > > > > > > > > > +static inline int kvm_clear_irq_line_state(unsigned long 
> > > > > > > > > > *irq_state,
> > > > > > > > > > +   int irq_source_id)
> > > > > > > > > > +{
> > > > > > > > > > +   return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > +struct kvm *kvm, int irq_source_id)
> > > > > > > > > > +{
> > > > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > > > +   int level = 
> > > > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > > > +irq_source_id);
> > > > > > > > > > +   if (level)
> > > > > > > > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > > > > +   
> > > > > > > > > > !!pic->irq_states[e->irqchip.pin]);
> > > > > > > > > > +   return level;
> > > > > > > > > 
> > > > > > > > > I think I begin to understand: if (level) checks it was 
> > > > > > > > > previously set,
> > > > > > > > > and then we clear if needed?
> > > > > > > > 
> > > > > > > > It's actually very simple, if we change anything in irq_states, 
> > > > > > > > then
> > > > > > > > update via the chip specific set_irq function.
> > > > > > > > 
> > > > > > > > >  I think it's worthwhile to rename
> > > > > > > > > level to orig_leve

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Gleb Natapov
On Tue, Jul 17, 2012 at 06:57:01PM +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 09:51:41AM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 18:36 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > > > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > > > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson 
> > > > > > > > > wrote:
> > > > > > > > > > This is an alternative to kvm_set_irq(,,,0) which returns 
> > > > > > > > > > the previous
> > > > > > > > > > assertion state of the interrupt and does nothing if it 
> > > > > > > > > > isn't changed.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > > > > ---
> > > > > > > > > > 
> > > > > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > > > > ++
> > > > > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/include/linux/kvm_host.h 
> > > > > > > > > > b/include/linux/kvm_host.h
> > > > > > > > > > index a7661c0..6c168f1 100644
> > > > > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > > > > > > > u32 type;
> > > > > > > > > > int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > >struct kvm *kvm, int irq_source_id, int 
> > > > > > > > > > level);
> > > > > > > > > > +   int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > +struct kvm *kvm, int irq_source_id);
> > > > > > > > > > union {
> > > > > > > > > > struct {
> > > > > > > > > > unsigned irqchip;
> > > > > > > > > > @@ -629,6 +631,7 @@ void 
> > > > > > > > > > kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> > > > > > > > > >unsigned long 
> > > > > > > > > > *deliver_bitmask);
> > > > > > > > > >  #endif
> > > > > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 
> > > > > > > > > > irq, int level);
> > > > > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 
> > > > > > > > > > irq);
> > > > > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > > *irq_entry, struct kvm *kvm,
> > > > > > > > > > int irq_source_id, int level);
> > > > > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned 
> > > > > > > > > > irqchip, unsigned pin);
> > > > > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > > > > index 5afb431..76e8f22 100644
> > > > > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, 
> > > > > > > > > > level);
> > > > > > > > > >  }
> > > > > > > > > >  
> > > > > > > > > > +static inline int kvm_clear_irq_line_state(unsigned long 
> > > > > > > > > > *irq_state,
> > > > > > > > > > +   int irq_source_id)
> > > > > > > > > > +{
> > > > > > > > > > +   return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > > +struct kvm *kvm, int irq_source_id)
> > > > > > > > > > +{
> > > > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > > > +   int level = 
> > > > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > > > +irq_source_id);
> > > > > > > > > > +   if (level)
> > > > > > > > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > > > > +   
> > > > > > > > > > !!pic->irq_states[e->irqchip.pin]);
> > > > > > > > > > +   return level;
> > > > > > > > > 
> > > > > > > > > I think I begin to understand: if (level) checks it was 
> > > > > > > > > previously set,
> > > > > > > > > and then we clear if needed?
> > > > > > > > 
> > > > > > > > It's actually very simple, if we change anything in irq_states, 
> > > > > > > > then
> > > > > > > > update via the chip specific set_irq function.
> > > > > > > > 
> > > > > > > > >  I think it's worthwhile to rename
> > > > > > > > > level to or

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 09:51:41AM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 18:36 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > > > > > > > > This is an alternative to kvm_set_irq(,,,0) which returns the 
> > > > > > > > > previous
> > > > > > > > > assertion state of the interrupt and does nothing if it isn't 
> > > > > > > > > changed.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > > > ---
> > > > > > > > > 
> > > > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > > > ++
> > > > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > > > 
> > > > > > > > > diff --git a/include/linux/kvm_host.h 
> > > > > > > > > b/include/linux/kvm_host.h
> > > > > > > > > index a7661c0..6c168f1 100644
> > > > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > > > > > >   u32 type;
> > > > > > > > >   int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > >  struct kvm *kvm, int irq_source_id, int 
> > > > > > > > > level);
> > > > > > > > > + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > > +  struct kvm *kvm, int irq_source_id);
> > > > > > > > >   union {
> > > > > > > > >   struct {
> > > > > > > > >   unsigned irqchip;
> > > > > > > > > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct 
> > > > > > > > > kvm_ioapic *ioapic,
> > > > > > > > >  unsigned long 
> > > > > > > > > *deliver_bitmask);
> > > > > > > > >  #endif
> > > > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, 
> > > > > > > > > int level);
> > > > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 
> > > > > > > > > irq);
> > > > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
> > > > > > > > > *irq_entry, struct kvm *kvm,
> > > > > > > > >   int irq_source_id, int level);
> > > > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, 
> > > > > > > > > unsigned pin);
> > > > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > > > index 5afb431..76e8f22 100644
> > > > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > >   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, 
> > > > > > > > > level);
> > > > > > > > >  }
> > > > > > > > >  
> > > > > > > > > +static inline int kvm_clear_irq_line_state(unsigned long 
> > > > > > > > > *irq_state,
> > > > > > > > > + int irq_source_id)
> > > > > > > > > +{
> > > > > > > > > + return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > > +  struct kvm *kvm, int irq_source_id)
> > > > > > > > > +{
> > > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > > + struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > > + int level = 
> > > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > > +  irq_source_id);
> > > > > > > > > + if (level)
> > > > > > > > > + kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > > > + 
> > > > > > > > > !!pic->irq_states[e->irqchip.pin]);
> > > > > > > > > + return level;
> > > > > > > > 
> > > > > > > > I think I begin to understand: if (level) checks it was 
> > > > > > > > previously set,
> > > > > > > > and then we clear if needed?
> > > > > > > 
> > > > > > > It's actually very simple, if we change anything in irq_states, 
> > > > > > > then
> > > > > > > update via the chip specific set_irq function.
> > > > > > > 
> > > > > > > >  I think it's worthwhile to rename
> > > > > > > > level to orig_level and rewrite as:
> > > > > > > > 
> > > > > > > > if (orig_level && !pic->irq_states[e->irqchip.pin])
> > > > > > > > kvm_pic_set_irq(pic, e->irqchip.pin, 0);
> > > > > > > > 
> > > > > > > > This bo

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Tue, 2012-07-17 at 18:36 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > > > > > > > This is an alternative to kvm_set_irq(,,,0) which returns the 
> > > > > > > > previous
> > > > > > > > assertion state of the interrupt and does nothing if it isn't 
> > > > > > > > changed.
> > > > > > > > 
> > > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > > ---
> > > > > > > > 
> > > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > > ++
> > > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > > > > > index a7661c0..6c168f1 100644
> > > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > > > > > u32 type;
> > > > > > > > int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > >struct kvm *kvm, int irq_source_id, int 
> > > > > > > > level);
> > > > > > > > +   int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > > +struct kvm *kvm, int irq_source_id);
> > > > > > > > union {
> > > > > > > > struct {
> > > > > > > > unsigned irqchip;
> > > > > > > > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct 
> > > > > > > > kvm_ioapic *ioapic,
> > > > > > > >unsigned long 
> > > > > > > > *deliver_bitmask);
> > > > > > > >  #endif
> > > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, 
> > > > > > > > int level);
> > > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
> > > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
> > > > > > > > *irq_entry, struct kvm *kvm,
> > > > > > > > int irq_source_id, int level);
> > > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, 
> > > > > > > > unsigned pin);
> > > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > > index 5afb431..76e8f22 100644
> > > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, 
> > > > > > > > level);
> > > > > > > >  }
> > > > > > > >  
> > > > > > > > +static inline int kvm_clear_irq_line_state(unsigned long 
> > > > > > > > *irq_state,
> > > > > > > > +   int irq_source_id)
> > > > > > > > +{
> > > > > > > > +   return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static int kvm_clear_pic_irq(struct 
> > > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > > > +struct kvm *kvm, int irq_source_id)
> > > > > > > > +{
> > > > > > > > +#ifdef CONFIG_X86
> > > > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > > +   int level = 
> > > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > > +irq_source_id);
> > > > > > > > +   if (level)
> > > > > > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > > +   
> > > > > > > > !!pic->irq_states[e->irqchip.pin]);
> > > > > > > > +   return level;
> > > > > > > 
> > > > > > > I think I begin to understand: if (level) checks it was 
> > > > > > > previously set,
> > > > > > > and then we clear if needed?
> > > > > > 
> > > > > > It's actually very simple, if we change anything in irq_states, then
> > > > > > update via the chip specific set_irq function.
> > > > > > 
> > > > > > >  I think it's worthwhile to rename
> > > > > > > level to orig_level and rewrite as:
> > > > > > > 
> > > > > > >   if (orig_level && !pic->irq_states[e->irqchip.pin])
> > > > > > >   kvm_pic_set_irq(pic, e->irqchip.pin, 0);
> > > > > > > 
> > > > > > > This both makes the logic clear without need for comments and
> > > > > > > saves some cycles on pic in case nothing actually changed.
> > > > > > 
> > > > > > That may work, but it's not actually the same thing.  
> > > > > > kvm_set_irq(,,,0)
> > > > > > will clear the bit and cal

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 09:20:11AM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > > > > > > This is an alternative to kvm_set_irq(,,,0) which returns the 
> > > > > > > previous
> > > > > > > assertion state of the interrupt and does nothing if it isn't 
> > > > > > > changed.
> > > > > > > 
> > > > > > > Signed-off-by: Alex Williamson 
> > > > > > > ---
> > > > > > > 
> > > > > > >  include/linux/kvm_host.h |3 ++
> > > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > > ++
> > > > > > >  2 files changed, 81 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > > > > index a7661c0..6c168f1 100644
> > > > > > > --- a/include/linux/kvm_host.h
> > > > > > > +++ b/include/linux/kvm_host.h
> > > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > > > >   u32 type;
> > > > > > >   int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > >  struct kvm *kvm, int irq_source_id, int level);
> > > > > > > + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > > +  struct kvm *kvm, int irq_source_id);
> > > > > > >   union {
> > > > > > >   struct {
> > > > > > >   unsigned irqchip;
> > > > > > > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct 
> > > > > > > kvm_ioapic *ioapic,
> > > > > > >  unsigned long *deliver_bitmask);
> > > > > > >  #endif
> > > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int 
> > > > > > > level);
> > > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
> > > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, 
> > > > > > > struct kvm *kvm,
> > > > > > >   int irq_source_id, int level);
> > > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, 
> > > > > > > unsigned pin);
> > > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > > index 5afb431..76e8f22 100644
> > > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > >   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
> > > > > > >  }
> > > > > > >  
> > > > > > > +static inline int kvm_clear_irq_line_state(unsigned long 
> > > > > > > *irq_state,
> > > > > > > + int irq_source_id)
> > > > > > > +{
> > > > > > > + return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > > > +}
> > > > > > > +
> > > > > > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry 
> > > > > > > *e,
> > > > > > > +  struct kvm *kvm, int irq_source_id)
> > > > > > > +{
> > > > > > > +#ifdef CONFIG_X86
> > > > > > > + struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > > + int level = 
> > > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > > +  irq_source_id);
> > > > > > > + if (level)
> > > > > > > + kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > > + !!pic->irq_states[e->irqchip.pin]);
> > > > > > > + return level;
> > > > > > 
> > > > > > I think I begin to understand: if (level) checks it was previously 
> > > > > > set,
> > > > > > and then we clear if needed?
> > > > > 
> > > > > It's actually very simple, if we change anything in irq_states, then
> > > > > update via the chip specific set_irq function.
> > > > > 
> > > > > >  I think it's worthwhile to rename
> > > > > > level to orig_level and rewrite as:
> > > > > > 
> > > > > > if (orig_level && !pic->irq_states[e->irqchip.pin])
> > > > > > kvm_pic_set_irq(pic, e->irqchip.pin, 0);
> > > > > > 
> > > > > > This both makes the logic clear without need for comments and
> > > > > > saves some cycles on pic in case nothing actually changed.
> > > > > 
> > > > > That may work, but it's not actually the same thing.  
> > > > > kvm_set_irq(,,,0)
> > > > > will clear the bit and call kvm_pic_set_irq with the new irq_states
> > > > > value, whether it's 0 or 1.  The optimization I make is to only call
> > > > > kvm_pic_set_irq if we've "changed" irq_states.  You're taking that one
> > > > > step further to "changed and is now 0".  I don't know if that's 
> > > > > correct
> > > > > behavior.
> > > > 
> > > > If not then I don't understand. You clear a bit
> > > > in a word. You never change it to 1, do you?
> > > 

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Tue, 2012-07-17 at 17:53 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > > > > > This is an alternative to kvm_set_irq(,,,0) which returns the 
> > > > > > previous
> > > > > > assertion state of the interrupt and does nothing if it isn't 
> > > > > > changed.
> > > > > > 
> > > > > > Signed-off-by: Alex Williamson 
> > > > > > ---
> > > > > > 
> > > > > >  include/linux/kvm_host.h |3 ++
> > > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > > ++
> > > > > >  2 files changed, 81 insertions(+)
> > > > > > 
> > > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > > > index a7661c0..6c168f1 100644
> > > > > > --- a/include/linux/kvm_host.h
> > > > > > +++ b/include/linux/kvm_host.h
> > > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > > > u32 type;
> > > > > > int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > > >struct kvm *kvm, int irq_source_id, int level);
> > > > > > +   int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > > +struct kvm *kvm, int irq_source_id);
> > > > > > union {
> > > > > > struct {
> > > > > > unsigned irqchip;
> > > > > > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct 
> > > > > > kvm_ioapic *ioapic,
> > > > > >unsigned long *deliver_bitmask);
> > > > > >  #endif
> > > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int 
> > > > > > level);
> > > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
> > > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, 
> > > > > > struct kvm *kvm,
> > > > > > int irq_source_id, int level);
> > > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, 
> > > > > > unsigned pin);
> > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > index 5afb431..76e8f22 100644
> > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > > kvm_kernel_irq_routing_entry *e,
> > > > > > return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
> > > > > >  }
> > > > > >  
> > > > > > +static inline int kvm_clear_irq_line_state(unsigned long 
> > > > > > *irq_state,
> > > > > > +   int irq_source_id)
> > > > > > +{
> > > > > > +   return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > > +}
> > > > > > +
> > > > > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry 
> > > > > > *e,
> > > > > > +struct kvm *kvm, int irq_source_id)
> > > > > > +{
> > > > > > +#ifdef CONFIG_X86
> > > > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > +   int level = 
> > > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > +irq_source_id);
> > > > > > +   if (level)
> > > > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > > +   !!pic->irq_states[e->irqchip.pin]);
> > > > > > +   return level;
> > > > > 
> > > > > I think I begin to understand: if (level) checks it was previously 
> > > > > set,
> > > > > and then we clear if needed?
> > > > 
> > > > It's actually very simple, if we change anything in irq_states, then
> > > > update via the chip specific set_irq function.
> > > > 
> > > > >  I think it's worthwhile to rename
> > > > > level to orig_level and rewrite as:
> > > > > 
> > > > >   if (orig_level && !pic->irq_states[e->irqchip.pin])
> > > > >   kvm_pic_set_irq(pic, e->irqchip.pin, 0);
> > > > > 
> > > > > This both makes the logic clear without need for comments and
> > > > > saves some cycles on pic in case nothing actually changed.
> > > > 
> > > > That may work, but it's not actually the same thing.  kvm_set_irq(,,,0)
> > > > will clear the bit and call kvm_pic_set_irq with the new irq_states
> > > > value, whether it's 0 or 1.  The optimization I make is to only call
> > > > kvm_pic_set_irq if we've "changed" irq_states.  You're taking that one
> > > > step further to "changed and is now 0".  I don't know if that's correct
> > > > behavior.
> > > 
> > > If not then I don't understand. You clear a bit
> > > in a word. You never change it to 1, do you?
> > 
> > Correct, but kvm_set_irq(,,,0) may call kvm_pic_set_irq(,,1) if other
> > source IDs are still asserting the interrupt.  Your proposal assumes
> > that unless irq_states is also 0 we don't need to call kvm_pic_set_irq,
> > and I don't k

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 08:21:51AM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > > > > This is an alternative to kvm_set_irq(,,,0) which returns the previous
> > > > > assertion state of the interrupt and does nothing if it isn't changed.
> > > > > 
> > > > > Signed-off-by: Alex Williamson 
> > > > > ---
> > > > > 
> > > > >  include/linux/kvm_host.h |3 ++
> > > > >  virt/kvm/irq_comm.c  |   78 
> > > > > ++
> > > > >  2 files changed, 81 insertions(+)
> > > > > 
> > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > > index a7661c0..6c168f1 100644
> > > > > --- a/include/linux/kvm_host.h
> > > > > +++ b/include/linux/kvm_host.h
> > > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > >   u32 type;
> > > > >   int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > > >  struct kvm *kvm, int irq_source_id, int level);
> > > > > + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > > +  struct kvm *kvm, int irq_source_id);
> > > > >   union {
> > > > >   struct {
> > > > >   unsigned irqchip;
> > > > > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct 
> > > > > kvm_ioapic *ioapic,
> > > > >  unsigned long *deliver_bitmask);
> > > > >  #endif
> > > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int 
> > > > > level);
> > > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
> > > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, 
> > > > > struct kvm *kvm,
> > > > >   int irq_source_id, int level);
> > > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, 
> > > > > unsigned pin);
> > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > index 5afb431..76e8f22 100644
> > > > > --- a/virt/kvm/irq_comm.c
> > > > > +++ b/virt/kvm/irq_comm.c
> > > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > > kvm_kernel_irq_routing_entry *e,
> > > > >   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
> > > > >  }
> > > > >  
> > > > > +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> > > > > + int irq_source_id)
> > > > > +{
> > > > > + return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > > +}
> > > > > +
> > > > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > > > +  struct kvm *kvm, int irq_source_id)
> > > > > +{
> > > > > +#ifdef CONFIG_X86
> > > > > + struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > + int level = 
> > > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > +  irq_source_id);
> > > > > + if (level)
> > > > > + kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > > + !!pic->irq_states[e->irqchip.pin]);
> > > > > + return level;
> > > > 
> > > > I think I begin to understand: if (level) checks it was previously set,
> > > > and then we clear if needed?
> > > 
> > > It's actually very simple, if we change anything in irq_states, then
> > > update via the chip specific set_irq function.
> > > 
> > > >  I think it's worthwhile to rename
> > > > level to orig_level and rewrite as:
> > > > 
> > > > if (orig_level && !pic->irq_states[e->irqchip.pin])
> > > > kvm_pic_set_irq(pic, e->irqchip.pin, 0);
> > > > 
> > > > This both makes the logic clear without need for comments and
> > > > saves some cycles on pic in case nothing actually changed.
> > > 
> > > That may work, but it's not actually the same thing.  kvm_set_irq(,,,0)
> > > will clear the bit and call kvm_pic_set_irq with the new irq_states
> > > value, whether it's 0 or 1.  The optimization I make is to only call
> > > kvm_pic_set_irq if we've "changed" irq_states.  You're taking that one
> > > step further to "changed and is now 0".  I don't know if that's correct
> > > behavior.
> > 
> > If not then I don't understand. You clear a bit
> > in a word. You never change it to 1, do you?
> 
> Correct, but kvm_set_irq(,,,0) may call kvm_pic_set_irq(,,1) if other
> source IDs are still asserting the interrupt.  Your proposal assumes
> that unless irq_states is also 0 we don't need to call kvm_pic_set_irq,
> and I don't know if that's correct.

Well you are asked to clear some id and level was 1. So we know
interrupt was asserted. Either we clear it or we don't. No?

> > 
> > But this brings another question:
> > 
> > static inline int kvm_irq_line_state(unsigned long *irq_state,
> >   

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Tue, 2012-07-17 at 17:08 +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> > On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > > > This is an alternative to kvm_set_irq(,,,0) which returns the previous
> > > > assertion state of the interrupt and does nothing if it isn't changed.
> > > > 
> > > > Signed-off-by: Alex Williamson 
> > > > ---
> > > > 
> > > >  include/linux/kvm_host.h |3 ++
> > > >  virt/kvm/irq_comm.c  |   78 
> > > > ++
> > > >  2 files changed, 81 insertions(+)
> > > > 
> > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > index a7661c0..6c168f1 100644
> > > > --- a/include/linux/kvm_host.h
> > > > +++ b/include/linux/kvm_host.h
> > > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > > > u32 type;
> > > > int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > > >struct kvm *kvm, int irq_source_id, int level);
> > > > +   int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > > +struct kvm *kvm, int irq_source_id);
> > > > union {
> > > > struct {
> > > > unsigned irqchip;
> > > > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct 
> > > > kvm_ioapic *ioapic,
> > > >unsigned long *deliver_bitmask);
> > > >  #endif
> > > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int 
> > > > level);
> > > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
> > > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct 
> > > > kvm *kvm,
> > > > int irq_source_id, int level);
> > > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned 
> > > > pin);
> > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > index 5afb431..76e8f22 100644
> > > > --- a/virt/kvm/irq_comm.c
> > > > +++ b/virt/kvm/irq_comm.c
> > > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > > kvm_kernel_irq_routing_entry *e,
> > > > return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
> > > >  }
> > > >  
> > > > +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> > > > +   int irq_source_id)
> > > > +{
> > > > +   return !!test_and_clear_bit(irq_source_id, irq_state);
> > > > +}
> > > > +
> > > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > > +struct kvm *kvm, int irq_source_id)
> > > > +{
> > > > +#ifdef CONFIG_X86
> > > > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > > > +   int level = 
> > > > kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > +irq_source_id);
> > > > +   if (level)
> > > > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > > > +   !!pic->irq_states[e->irqchip.pin]);
> > > > +   return level;
> > > 
> > > I think I begin to understand: if (level) checks it was previously set,
> > > and then we clear if needed?
> > 
> > It's actually very simple, if we change anything in irq_states, then
> > update via the chip specific set_irq function.
> > 
> > >  I think it's worthwhile to rename
> > > level to orig_level and rewrite as:
> > > 
> > >   if (orig_level && !pic->irq_states[e->irqchip.pin])
> > >   kvm_pic_set_irq(pic, e->irqchip.pin, 0);
> > > 
> > > This both makes the logic clear without need for comments and
> > > saves some cycles on pic in case nothing actually changed.
> > 
> > That may work, but it's not actually the same thing.  kvm_set_irq(,,,0)
> > will clear the bit and call kvm_pic_set_irq with the new irq_states
> > value, whether it's 0 or 1.  The optimization I make is to only call
> > kvm_pic_set_irq if we've "changed" irq_states.  You're taking that one
> > step further to "changed and is now 0".  I don't know if that's correct
> > behavior.
> 
> If not then I don't understand. You clear a bit
> in a word. You never change it to 1, do you?

Correct, but kvm_set_irq(,,,0) may call kvm_pic_set_irq(,,1) if other
source IDs are still asserting the interrupt.  Your proposal assumes
that unless irq_states is also 0 we don't need to call kvm_pic_set_irq,
and I don't know if that's correct.

> 
> But this brings another question:
> 
> static inline int kvm_irq_line_state(unsigned long *irq_state,
>  int irq_source_id, int level)
> {
> /* Logical OR for level trig interrupt */
> if (level)
> set_bit(irq_source_id, irq_state);
> else
> clear_bit(irq_source_id, irq_state);
> 
> 
> ^^^
> above uses locked instructions
> 
> return !!(*irq_state);
> 
> 
> above 

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Tue, Jul 17, 2012 at 07:56:09AM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> > On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > > This is an alternative to kvm_set_irq(,,,0) which returns the previous
> > > assertion state of the interrupt and does nothing if it isn't changed.
> > > 
> > > Signed-off-by: Alex Williamson 
> > > ---
> > > 
> > >  include/linux/kvm_host.h |3 ++
> > >  virt/kvm/irq_comm.c  |   78 
> > > ++
> > >  2 files changed, 81 insertions(+)
> > > 
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index a7661c0..6c168f1 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > >   u32 type;
> > >   int (*set)(struct kvm_kernel_irq_routing_entry *e,
> > >  struct kvm *kvm, int irq_source_id, int level);
> > > + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > > +  struct kvm *kvm, int irq_source_id);
> > >   union {
> > >   struct {
> > >   unsigned irqchip;
> > > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic 
> > > *ioapic,
> > >  unsigned long *deliver_bitmask);
> > >  #endif
> > >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
> > > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
> > >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct 
> > > kvm *kvm,
> > >   int irq_source_id, int level);
> > >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned 
> > > pin);
> > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > index 5afb431..76e8f22 100644
> > > --- a/virt/kvm/irq_comm.c
> > > +++ b/virt/kvm/irq_comm.c
> > > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > > kvm_kernel_irq_routing_entry *e,
> > >   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
> > >  }
> > >  
> > > +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> > > + int irq_source_id)
> > > +{
> > > + return !!test_and_clear_bit(irq_source_id, irq_state);
> > > +}
> > > +
> > > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > +  struct kvm *kvm, int irq_source_id)
> > > +{
> > > +#ifdef CONFIG_X86
> > > + struct kvm_pic *pic = pic_irqchip(kvm);
> > > + int level = kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > +  irq_source_id);
> > > + if (level)
> > > + kvm_pic_set_irq(pic, e->irqchip.pin,
> > > + !!pic->irq_states[e->irqchip.pin]);
> > > + return level;
> > 
> > I think I begin to understand: if (level) checks it was previously set,
> > and then we clear if needed?
> 
> It's actually very simple, if we change anything in irq_states, then
> update via the chip specific set_irq function.
> 
> >  I think it's worthwhile to rename
> > level to orig_level and rewrite as:
> > 
> > if (orig_level && !pic->irq_states[e->irqchip.pin])
> > kvm_pic_set_irq(pic, e->irqchip.pin, 0);
> > 
> > This both makes the logic clear without need for comments and
> > saves some cycles on pic in case nothing actually changed.
> 
> That may work, but it's not actually the same thing.  kvm_set_irq(,,,0)
> will clear the bit and call kvm_pic_set_irq with the new irq_states
> value, whether it's 0 or 1.  The optimization I make is to only call
> kvm_pic_set_irq if we've "changed" irq_states.  You're taking that one
> step further to "changed and is now 0".  I don't know if that's correct
> behavior.

If not then I don't understand. You clear a bit
in a word. You never change it to 1, do you?

But this brings another question:

static inline int kvm_irq_line_state(unsigned long *irq_state,
 int irq_source_id, int level)
{
/* Logical OR for level trig interrupt */
if (level)
set_bit(irq_source_id, irq_state);
else
clear_bit(irq_source_id, irq_state);


^^^
above uses locked instructions

return !!(*irq_state);


above doesn't

}


why the insonsistency?

> > > +#else
> > > + return -1;
> > > +#endif
> > > +}
> > > +
> > > +static int kvm_clear_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > + struct kvm *kvm, int irq_source_id)
> > > +{
> > > + struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> > > + int level;
> > > +
> > > + level = kvm_clear_irq_line_state(&ioapic->irq_states[e->irqchip.pin],
> > > +  irq_source_id);
> > > + if (level)
> > > + kvm_ioapic_set_irq(ioapic, e->irqchip.pin,
> > > +!!ioapic->irq_states[e->irqchip.pin]);
> > > + return level;
> > > +}
> > > +
> > 

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Alex Williamson
On Tue, 2012-07-17 at 13:14 +0300, Michael S. Tsirkin wrote:
> On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > This is an alternative to kvm_set_irq(,,,0) which returns the previous
> > assertion state of the interrupt and does nothing if it isn't changed.
> > 
> > Signed-off-by: Alex Williamson 
> > ---
> > 
> >  include/linux/kvm_host.h |3 ++
> >  virt/kvm/irq_comm.c  |   78 
> > ++
> >  2 files changed, 81 insertions(+)
> > 
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index a7661c0..6c168f1 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > u32 type;
> > int (*set)(struct kvm_kernel_irq_routing_entry *e,
> >struct kvm *kvm, int irq_source_id, int level);
> > +   int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > +struct kvm *kvm, int irq_source_id);
> > union {
> > struct {
> > unsigned irqchip;
> > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic 
> > *ioapic,
> >unsigned long *deliver_bitmask);
> >  #endif
> >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
> > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
> >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm 
> > *kvm,
> > int irq_source_id, int level);
> >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
> > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > index 5afb431..76e8f22 100644
> > --- a/virt/kvm/irq_comm.c
> > +++ b/virt/kvm/irq_comm.c
> > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > kvm_kernel_irq_routing_entry *e,
> > return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
> >  }
> >  
> > +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> > +   int irq_source_id)
> > +{
> > +   return !!test_and_clear_bit(irq_source_id, irq_state);
> > +}
> > +
> > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > +struct kvm *kvm, int irq_source_id)
> > +{
> > +#ifdef CONFIG_X86
> > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > +   int level = kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > +irq_source_id);
> > +   if (level)
> > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > +   !!pic->irq_states[e->irqchip.pin]);
> > +   return level;
> 
> I think I begin to understand: if (level) checks it was previously set,
> and then we clear if needed?

It's actually very simple, if we change anything in irq_states, then
update via the chip specific set_irq function.

>  I think it's worthwhile to rename
> level to orig_level and rewrite as:
> 
>   if (orig_level && !pic->irq_states[e->irqchip.pin])
>   kvm_pic_set_irq(pic, e->irqchip.pin, 0);
> 
> This both makes the logic clear without need for comments and
> saves some cycles on pic in case nothing actually changed.

That may work, but it's not actually the same thing.  kvm_set_irq(,,,0)
will clear the bit and call kvm_pic_set_irq with the new irq_states
value, whether it's 0 or 1.  The optimization I make is to only call
kvm_pic_set_irq if we've "changed" irq_states.  You're taking that one
step further to "changed and is now 0".  I don't know if that's correct
behavior.

> > +#else
> > +   return -1;
> > +#endif
> > +}
> > +
> > +static int kvm_clear_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
> > +   struct kvm *kvm, int irq_source_id)
> > +{
> > +   struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> > +   int level;
> > +
> > +   level = kvm_clear_irq_line_state(&ioapic->irq_states[e->irqchip.pin],
> > +irq_source_id);
> > +   if (level)
> > +   kvm_ioapic_set_irq(ioapic, e->irqchip.pin,
> > +  !!ioapic->irq_states[e->irqchip.pin]);
> > +   return level;
> > +}
> > +
> >  inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
> >  {
> >  #ifdef CONFIG_IA64
> > @@ -190,6 +226,45 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, 
> > u32 irq, int level)
> > return ret;
> >  }
> >  
> > +/*
> > + * Return value:
> > + *  < 0   Error
> > + *  = 0   Interrupt was not set, did nothing
> > + *  > 0   Interrupt was pending, cleared
> > + */
> > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq)
> > +{
> > +   struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
> > +   int ret = -EINVAL, i = 0;
> > +   struct kvm_irq_routing_table *irq_rt;
> > +   struct hlist_node *n;
> > +
> > +   /* Not possible to detect if the guest uses the PIC or the
> > +* IOAPIC.  So clear the bit in both. The guest wi

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> This is an alternative to kvm_set_irq(,,,0) which returns the previous
> assertion state of the interrupt and does nothing if it isn't changed.
> 
> Signed-off-by: Alex Williamson 
> ---
> 
>  include/linux/kvm_host.h |3 ++
>  virt/kvm/irq_comm.c  |   78 
> ++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index a7661c0..6c168f1 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
>   u32 type;
>   int (*set)(struct kvm_kernel_irq_routing_entry *e,
>  struct kvm *kvm, int irq_source_id, int level);
> + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> +  struct kvm *kvm, int irq_source_id);
>   union {
>   struct {
>   unsigned irqchip;
> @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic 
> *ioapic,
>  unsigned long *deliver_bitmask);
>  #endif
>  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
> +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
>  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm 
> *kvm,
>   int irq_source_id, int level);
>  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index 5afb431..76e8f22 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> kvm_kernel_irq_routing_entry *e,
>   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
>  }
>  
> +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> + int irq_source_id)
> +{
> + return !!test_and_clear_bit(irq_source_id, irq_state);
> +}
> +
> +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> +  struct kvm *kvm, int irq_source_id)
> +{
> +#ifdef CONFIG_X86
> + struct kvm_pic *pic = pic_irqchip(kvm);
> + int level = kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> +  irq_source_id);
> + if (level)
> + kvm_pic_set_irq(pic, e->irqchip.pin,
> + !!pic->irq_states[e->irqchip.pin]);
> + return level;
> +#else
> + return -1;
> +#endif
> +}
> +
> +static int kvm_clear_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
> + struct kvm *kvm, int irq_source_id)
> +{
> + struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> + int level;
> +
> + level = kvm_clear_irq_line_state(&ioapic->irq_states[e->irqchip.pin],
> +  irq_source_id);
> + if (level)
> + kvm_ioapic_set_irq(ioapic, e->irqchip.pin,
> +!!ioapic->irq_states[e->irqchip.pin]);
> + return level;
> +}
> +
>  inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
>  {
>  #ifdef CONFIG_IA64
> @@ -190,6 +226,45 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 
> irq, int level)
>   return ret;
>  }
>  
> +/*
> + * Return value:
> + *  < 0   Error
> + *  = 0   Interrupt was not set, did nothing
> + *  > 0   Interrupt was pending, cleared
> + */
> +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq)
> +{
> + struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
> + int ret = -EINVAL, i = 0;
> + struct kvm_irq_routing_table *irq_rt;
> + struct hlist_node *n;
> +
> + /* Not possible to detect if the guest uses the PIC or the
> +  * IOAPIC.  So clear the bit in both. The guest will ignore
> +  * writes to the unused one.
> +  */
> + rcu_read_lock();
> + irq_rt = rcu_dereference(kvm->irq_routing);
> + if (irq < irq_rt->nr_rt_entries)
> + hlist_for_each_entry(e, n, &irq_rt->map[irq], link)
> + irq_set[i++] = *e;
> + rcu_read_unlock();
> +
> + while (i--) {
> + int r = -EINVAL;
> +
> + if (irq_set[i].clear)

I would normally suggest if (likely()) here but recently Marcelo
started pushing back against these tags. Maybe add in a separate patch
so it's easier to ignore ...

> + r = irq_set[i].clear(&irq_set[i], kvm, irq_source_id);
> +
> + if (r < 0)
> + continue;
> +
> + ret = r + ((ret < 0) ? 0 : ret);
> + }
> +
> + return ret;
> +}
> +
>  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
>  {
>   struct kvm_irq_ack_notifier *kian;
> @@ -344,16 +419,19 @@ static int setup_routing_entry(struct 
> kvm_irq_routing_table *rt,
>   switch (ue->u.irqchip.irqchip) {
> 

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-17 Thread Michael S. Tsirkin
On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> This is an alternative to kvm_set_irq(,,,0) which returns the previous
> assertion state of the interrupt and does nothing if it isn't changed.
> 
> Signed-off-by: Alex Williamson 
> ---
> 
>  include/linux/kvm_host.h |3 ++
>  virt/kvm/irq_comm.c  |   78 
> ++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index a7661c0..6c168f1 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
>   u32 type;
>   int (*set)(struct kvm_kernel_irq_routing_entry *e,
>  struct kvm *kvm, int irq_source_id, int level);
> + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> +  struct kvm *kvm, int irq_source_id);
>   union {
>   struct {
>   unsigned irqchip;
> @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic 
> *ioapic,
>  unsigned long *deliver_bitmask);
>  #endif
>  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
> +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
>  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm 
> *kvm,
>   int irq_source_id, int level);
>  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index 5afb431..76e8f22 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> kvm_kernel_irq_routing_entry *e,
>   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
>  }
>  
> +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> + int irq_source_id)
> +{
> + return !!test_and_clear_bit(irq_source_id, irq_state);
> +}
> +
> +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> +  struct kvm *kvm, int irq_source_id)
> +{
> +#ifdef CONFIG_X86
> + struct kvm_pic *pic = pic_irqchip(kvm);
> + int level = kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> +  irq_source_id);
> + if (level)
> + kvm_pic_set_irq(pic, e->irqchip.pin,
> + !!pic->irq_states[e->irqchip.pin]);
> + return level;

I think I begin to understand: if (level) checks it was previously set,
and then we clear if needed? I think it's worthwhile to rename
level to orig_level and rewrite as:

if (orig_level && !pic->irq_states[e->irqchip.pin])
kvm_pic_set_irq(pic, e->irqchip.pin, 0);

This both makes the logic clear without need for comments and
saves some cycles on pic in case nothing actually changed.

> +#else
> + return -1;
> +#endif
> +}
> +
> +static int kvm_clear_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
> + struct kvm *kvm, int irq_source_id)
> +{
> + struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> + int level;
> +
> + level = kvm_clear_irq_line_state(&ioapic->irq_states[e->irqchip.pin],
> +  irq_source_id);
> + if (level)
> + kvm_ioapic_set_irq(ioapic, e->irqchip.pin,
> +!!ioapic->irq_states[e->irqchip.pin]);
> + return level;
> +}
> +
>  inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
>  {
>  #ifdef CONFIG_IA64
> @@ -190,6 +226,45 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 
> irq, int level)
>   return ret;
>  }
>  
> +/*
> + * Return value:
> + *  < 0   Error
> + *  = 0   Interrupt was not set, did nothing
> + *  > 0   Interrupt was pending, cleared
> + */
> +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq)
> +{
> + struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
> + int ret = -EINVAL, i = 0;
> + struct kvm_irq_routing_table *irq_rt;
> + struct hlist_node *n;
> +
> + /* Not possible to detect if the guest uses the PIC or the
> +  * IOAPIC.  So clear the bit in both. The guest will ignore
> +  * writes to the unused one.
> +  */
> + rcu_read_lock();
> + irq_rt = rcu_dereference(kvm->irq_routing);
> + if (irq < irq_rt->nr_rt_entries)
> + hlist_for_each_entry(e, n, &irq_rt->map[irq], link)
> + irq_set[i++] = *e;
> + rcu_read_unlock();
> +
> + while (i--) {
> + int r = -EINVAL;
> +
> + if (irq_set[i].clear)
> + r = irq_set[i].clear(&irq_set[i], kvm, irq_source_id);
> +
> + if (r < 0)
> + continue;
> +
> + ret = r + ((ret < 0) ? 0 : ret);
> + }
> +
> + return ret;
> +}
> +
>  void kvm_notify_acked_irq(struct

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-16 Thread Alex Williamson
On Tue, 2012-07-17 at 03:51 +0300, Michael S. Tsirkin wrote:
> On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> > This is an alternative to kvm_set_irq(,,,0) which returns the previous
> > assertion state of the interrupt and does nothing if it isn't changed.
> > 
> > Signed-off-by: Alex Williamson 
> > ---
> > 
> >  include/linux/kvm_host.h |3 ++
> >  virt/kvm/irq_comm.c  |   78 
> > ++
> >  2 files changed, 81 insertions(+)
> > 
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index a7661c0..6c168f1 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
> > u32 type;
> > int (*set)(struct kvm_kernel_irq_routing_entry *e,
> >struct kvm *kvm, int irq_source_id, int level);
> > +   int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> > +struct kvm *kvm, int irq_source_id);
> > union {
> > struct {
> > unsigned irqchip;
> > @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic 
> > *ioapic,
> >unsigned long *deliver_bitmask);
> >  #endif
> >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
> > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
> >  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm 
> > *kvm,
> > int irq_source_id, int level);
> >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
> > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > index 5afb431..76e8f22 100644
> > --- a/virt/kvm/irq_comm.c
> > +++ b/virt/kvm/irq_comm.c
> > @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> > kvm_kernel_irq_routing_entry *e,
> > return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
> >  }
> >  
> > +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> > +   int irq_source_id)
> > +{
> > +   return !!test_and_clear_bit(irq_source_id, irq_state);
> > +}
> > +
> > +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > +struct kvm *kvm, int irq_source_id)
> > +{
> > +#ifdef CONFIG_X86
> > +   struct kvm_pic *pic = pic_irqchip(kvm);
> > +   int level = kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > +irq_source_id);
> > +   if (level)
> > +   kvm_pic_set_irq(pic, e->irqchip.pin,
> > +   !!pic->irq_states[e->irqchip.pin]);
> > +   return level;
> > +#else
> > +   return -1;
> > +#endif
> 
> What does this ifdef exclude exactly?

No pic on ia64.  Not that it works, but I figured the consistency with
kvm_set_pic_irq would make more sense whenever someone goes through and
cleans out the code.  Thanks,

Alex

> > +}
> > +
> > +static int kvm_clear_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
> > +   struct kvm *kvm, int irq_source_id)
> > +{
> > +   struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> > +   int level;
> > +
> > +   level = kvm_clear_irq_line_state(&ioapic->irq_states[e->irqchip.pin],
> > +irq_source_id);
> > +   if (level)
> > +   kvm_ioapic_set_irq(ioapic, e->irqchip.pin,
> > +  !!ioapic->irq_states[e->irqchip.pin]);
> > +   return level;
> > +}
> > +
> >  inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
> >  {
> >  #ifdef CONFIG_IA64
> > @@ -190,6 +226,45 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, 
> > u32 irq, int level)
> > return ret;
> >  }
> >  
> > +/*
> > + * Return value:
> > + *  < 0   Error
> > + *  = 0   Interrupt was not set, did nothing
> > + *  > 0   Interrupt was pending, cleared
> > + */
> > +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq)
> > +{
> > +   struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
> > +   int ret = -EINVAL, i = 0;
> > +   struct kvm_irq_routing_table *irq_rt;
> > +   struct hlist_node *n;
> > +
> > +   /* Not possible to detect if the guest uses the PIC or the
> > +* IOAPIC.  So clear the bit in both. The guest will ignore
> > +* writes to the unused one.
> > +*/
> > +   rcu_read_lock();
> > +   irq_rt = rcu_dereference(kvm->irq_routing);
> > +   if (irq < irq_rt->nr_rt_entries)
> > +   hlist_for_each_entry(e, n, &irq_rt->map[irq], link)
> > +   irq_set[i++] = *e;
> > +   rcu_read_unlock();
> > +
> > +   while (i--) {
> > +   int r = -EINVAL;
> > +
> > +   if (irq_set[i].clear)
> > +   r = irq_set[i].clear(&irq_set[i], kvm, irq_source_id);
> > +
> > +   if (r < 0)
> > +   continue;
> > +
> > +   ret = r + ((ret < 0) ? 0 : ret);
> > +   }
> > +
> > +   return ret;
> > +}
> > +
> >  void kvm_notif

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-16 Thread Michael S. Tsirkin
On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> This is an alternative to kvm_set_irq(,,,0) which returns the previous
> assertion state of the interrupt and does nothing if it isn't changed.
> 
> Signed-off-by: Alex Williamson 
> ---
> 
>  include/linux/kvm_host.h |3 ++
>  virt/kvm/irq_comm.c  |   78 
> ++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index a7661c0..6c168f1 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
>   u32 type;
>   int (*set)(struct kvm_kernel_irq_routing_entry *e,
>  struct kvm *kvm, int irq_source_id, int level);
> + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> +  struct kvm *kvm, int irq_source_id);
>   union {
>   struct {
>   unsigned irqchip;
> @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic 
> *ioapic,
>  unsigned long *deliver_bitmask);
>  #endif
>  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
> +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
>  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm 
> *kvm,
>   int irq_source_id, int level);
>  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index 5afb431..76e8f22 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> kvm_kernel_irq_routing_entry *e,
>   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
>  }
>  
> +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> + int irq_source_id)
> +{
> + return !!test_and_clear_bit(irq_source_id, irq_state);
> +}
> +
> +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> +  struct kvm *kvm, int irq_source_id)
> +{
> +#ifdef CONFIG_X86
> + struct kvm_pic *pic = pic_irqchip(kvm);
> + int level = kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> +  irq_source_id);
> + if (level)
> + kvm_pic_set_irq(pic, e->irqchip.pin,
> + !!pic->irq_states[e->irqchip.pin]);

This is a bit tricky: add a comment explaining the logic?

> + return level;
> +#else
> + return -1;
> +#endif
> +}
> +
> +static int kvm_clear_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
> + struct kvm *kvm, int irq_source_id)
> +{
> + struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> + int level;
> +
> + level = kvm_clear_irq_line_state(&ioapic->irq_states[e->irqchip.pin],
> +  irq_source_id);
> + if (level)
> + kvm_ioapic_set_irq(ioapic, e->irqchip.pin,
> +!!ioapic->irq_states[e->irqchip.pin]);

This is a bit tricky: add a comment explaining the logic?

> + return level;
> +}
> +
>  inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
>  {
>  #ifdef CONFIG_IA64
> @@ -190,6 +226,45 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 
> irq, int level)
>   return ret;
>  }
>  
> +/*
> + * Return value:
> + *  < 0   Error
> + *  = 0   Interrupt was not set, did nothing
> + *  > 0   Interrupt was pending, cleared
> + */
> +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq)
> +{
> + struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
> + int ret = -EINVAL, i = 0;
> + struct kvm_irq_routing_table *irq_rt;
> + struct hlist_node *n;
> +
> + /* Not possible to detect if the guest uses the PIC or the
> +  * IOAPIC.  So clear the bit in both. The guest will ignore
> +  * writes to the unused one.
> +  */
> + rcu_read_lock();
> + irq_rt = rcu_dereference(kvm->irq_routing);
> + if (irq < irq_rt->nr_rt_entries)
> + hlist_for_each_entry(e, n, &irq_rt->map[irq], link)
> + irq_set[i++] = *e;
> + rcu_read_unlock();
> +
> + while (i--) {
> + int r = -EINVAL;
> +
> + if (irq_set[i].clear)
> + r = irq_set[i].clear(&irq_set[i], kvm, irq_source_id);
> +
> + if (r < 0)
> + continue;
> +
> + ret = r + ((ret < 0) ? 0 : ret);
> + }
> +
> + return ret;
> +}
> +
>  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
>  {
>   struct kvm_irq_ack_notifier *kian;
> @@ -344,16 +419,19 @@ static int setup_routing_entry(struct 
> kvm_irq_routing_table *rt,
>   switch (ue->u.irqchip.irqchip) {
>   case KVM_IRQCHIP_PIC_MASTER:
>   

Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

2012-07-16 Thread Michael S. Tsirkin
On Mon, Jul 16, 2012 at 02:34:03PM -0600, Alex Williamson wrote:
> This is an alternative to kvm_set_irq(,,,0) which returns the previous
> assertion state of the interrupt and does nothing if it isn't changed.
> 
> Signed-off-by: Alex Williamson 
> ---
> 
>  include/linux/kvm_host.h |3 ++
>  virt/kvm/irq_comm.c  |   78 
> ++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index a7661c0..6c168f1 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -219,6 +219,8 @@ struct kvm_kernel_irq_routing_entry {
>   u32 type;
>   int (*set)(struct kvm_kernel_irq_routing_entry *e,
>  struct kvm *kvm, int irq_source_id, int level);
> + int (*clear)(struct kvm_kernel_irq_routing_entry *e,
> +  struct kvm *kvm, int irq_source_id);
>   union {
>   struct {
>   unsigned irqchip;
> @@ -629,6 +631,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic 
> *ioapic,
>  unsigned long *deliver_bitmask);
>  #endif
>  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
> +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq);
>  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm 
> *kvm,
>   int irq_source_id, int level);
>  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index 5afb431..76e8f22 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -68,6 +68,42 @@ static int kvm_set_ioapic_irq(struct 
> kvm_kernel_irq_routing_entry *e,
>   return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
>  }
>  
> +static inline int kvm_clear_irq_line_state(unsigned long *irq_state,
> + int irq_source_id)
> +{
> + return !!test_and_clear_bit(irq_source_id, irq_state);
> +}
> +
> +static int kvm_clear_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> +  struct kvm *kvm, int irq_source_id)
> +{
> +#ifdef CONFIG_X86
> + struct kvm_pic *pic = pic_irqchip(kvm);
> + int level = kvm_clear_irq_line_state(&pic->irq_states[e->irqchip.pin],
> +  irq_source_id);
> + if (level)
> + kvm_pic_set_irq(pic, e->irqchip.pin,
> + !!pic->irq_states[e->irqchip.pin]);
> + return level;
> +#else
> + return -1;
> +#endif

What does this ifdef exclude exactly?

> +}
> +
> +static int kvm_clear_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
> + struct kvm *kvm, int irq_source_id)
> +{
> + struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> + int level;
> +
> + level = kvm_clear_irq_line_state(&ioapic->irq_states[e->irqchip.pin],
> +  irq_source_id);
> + if (level)
> + kvm_ioapic_set_irq(ioapic, e->irqchip.pin,
> +!!ioapic->irq_states[e->irqchip.pin]);
> + return level;
> +}
> +
>  inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
>  {
>  #ifdef CONFIG_IA64
> @@ -190,6 +226,45 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 
> irq, int level)
>   return ret;
>  }
>  
> +/*
> + * Return value:
> + *  < 0   Error
> + *  = 0   Interrupt was not set, did nothing
> + *  > 0   Interrupt was pending, cleared
> + */
> +int kvm_clear_irq(struct kvm *kvm, int irq_source_id, u32 irq)
> +{
> + struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
> + int ret = -EINVAL, i = 0;
> + struct kvm_irq_routing_table *irq_rt;
> + struct hlist_node *n;
> +
> + /* Not possible to detect if the guest uses the PIC or the
> +  * IOAPIC.  So clear the bit in both. The guest will ignore
> +  * writes to the unused one.
> +  */
> + rcu_read_lock();
> + irq_rt = rcu_dereference(kvm->irq_routing);
> + if (irq < irq_rt->nr_rt_entries)
> + hlist_for_each_entry(e, n, &irq_rt->map[irq], link)
> + irq_set[i++] = *e;
> + rcu_read_unlock();
> +
> + while (i--) {
> + int r = -EINVAL;
> +
> + if (irq_set[i].clear)
> + r = irq_set[i].clear(&irq_set[i], kvm, irq_source_id);
> +
> + if (r < 0)
> + continue;
> +
> + ret = r + ((ret < 0) ? 0 : ret);
> + }
> +
> + return ret;
> +}
> +
>  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
>  {
>   struct kvm_irq_ack_notifier *kian;
> @@ -344,16 +419,19 @@ static int setup_routing_entry(struct 
> kvm_irq_routing_table *rt,
>   switch (ue->u.irqchip.irqchip) {
>   case KVM_IRQCHIP_PIC_MASTER:
>   e->set = kvm_set_pic_irq;
> + e->clear = kvm_clear_p