Re: [PATCH] x86: Add check for number of available vectors before CPU down [v7]
* Prarit Bhargava wrote:

> On 01/12/2014 04:56 AM, Ingo Molnar wrote:
> >
> > * Prarit Bhargava wrote:
> >
> >> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
> >>
> >> When a cpu is downed on a system, the irqs on the cpu are assigned to
> >> other cpus.  It is possible, however, that when a cpu is downed there
> >> aren't enough free vectors on the remaining cpus to account for the
> >> vectors from the cpu that is being downed.
> >>
> >> This results in an interesting "overflow" condition where irqs are
> >> "assigned" to a CPU but are not handled.
> >>
> >> For example, when downing cpus on a 1-64 logical processor system:
> >>
> >> <snip>
> >>
> >> [  232.021745] smpboot: CPU 61 is now offline
> >> [  238.480275] smpboot: CPU 62 is now offline
> >> [  245.991080] ------------[ cut here ]------------
> >> [  245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264
> >> dev_watchdog+0x246/0x250()
> >> [  246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
> >> [  246.013070] Modules linked in: lockd sunrpc iTCO_wdt
> >> iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core
> >> lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi
> >> acpi_cpufreq isci libsas scsi_transport_sas
> >> [  246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
> >> [  246.044451] Hardware name: Intel Corporation S4600LH
> >> ../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
> >> [  246.057371]  0009 88081fa03d40 8164fbf6 88081fa0ee48
> >> [  246.065728]  88081fa03d90 88081fa03d80 81054ecc 88081fa13040
> >> [  246.074073]  88200cce 0040
> >> [  246.082430] Call Trace:
> >> [  246.085174]  <IRQ>  [<ffffffff8164fbf6>] dump_stack+0x46/0x58
> >> [  246.091633]  [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
> >> [  246.098352]  [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
> >> [  246.104786]  [<ffffffff815710d6>] dev_watchdog+0x246/0x250
> >> [  246.110923]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
> >> [  246.119097]  [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
> >> [  246.125224]  [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
> >> [  246.132137]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
> >> [  246.140308]  [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
> >> [  246.146933]  [<ffffffff81059a80>] __do_softirq+0xe0/0x220
> >> [  246.152976]  [<ffffffff8165fedc>] call_softirq+0x1c/0x30
> >> [  246.158920]  [<ffffffff810045f5>] do_softirq+0x55/0x90
> >> [  246.164670]  [<ffffffff81059d35>] irq_exit+0xa5/0xb0
> >> [  246.170227]  [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
> >> [  246.177324]  [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
> >> [  246.184041]  <EOI>  [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
> >> [  246.191559]  [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
> >> [  246.198374]  [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
> >> [  246.204900]  [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
> >> [  246.210846]  [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
> >> [  246.217371]  [<ffffffff81646b47>] rest_init+0x77/0x80
> >> [  246.223028]  [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
> >> [  246.229165]  [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
> >> [  246.235787]  [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
> >> [  246.242990]  [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
> >> [  246.249610] ---[ end trace fb74fdef54d79039 ]---
> >> [  246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
> >> [  246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
> >> Last login: Mon Nov 11 08:35:14 from 10.18.17.119
> >> [root@(none) ~]# [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
> >> [  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow
> >> Control: RX/TX
> >> [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
> >> [  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow
> >> Control: RX/TX
> >>
> >> (last lines keep repeating.  ixgbe driver is dead until module reload.)
> >>
> >> If the downed cpu has more vectors than are free on the remaining cpus
> >> on the system, it is possible that some vectors are "orphaned" even
> >> though they are assigned to a cpu.  In this case, since the ixgbe
> >> driver had a watchdog, the watchdog fired and notified that something
> >> was wrong.
> >>
> >> This patch adds a function, check_vectors(), that compares the number
> >> of vectors on the CPU going down to the number of vectors available on
> >> the rest of the system.  If there aren't enough vectors for the CPU to
> >> go down, an error is returned and propagated back to userspace.
> >>
> >> v2: Do not need to look at percpu irqs
> >> v3: Need to check affinity to prevent counting of MSIs in IOAPIC
> >>     Lowest Priority Mode
> >> v4: Additional changes suggested by Gong Chen.
> >> v5/v6/v7: Updated comment text
> >>
> >> Signed-off-by: Prarit Bhargava
> >> Reviewed-by: Gong Chen
> >> Cc: Andi Kleen
> >> Cc: Michel Lespinasse
> >> Cc: Seiji Aguchi
> >> Cc: Yang Zhang
> >> Cc: Paul Gortmaker
> >> Cc: Janet Morgan
> >> Cc: Tony Luck
> >> Cc: Ruiv Wang
> >> Cc: Gong Chen
> >> Cc: H. Peter Anvin
> >> Cc: Gong Chen
> >> Cc: x...@kernel.org
> >> ---
> >>  arch/x86/include/asm/irq.h |    1 +
> >>  arch/x86/kernel/irq.c      |   69
> >>  arch/x86/kernel/smpboot.c  |    6
> >>  3 files changed, 76 insertions(+)
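The availability check the changelog describes can be sketched outside the kernel as a toy model. This is a hypothetical illustration only: the names (`can_offline`, `vectors_in_use`) and the simple capacity arithmetic are assumptions for exposition, not the kernel's actual `check_vectors()` accounting, which walks `vector_irq[]` and excludes per-cpu and inactive irqs.

```python
# Toy model of the idea described above: before a CPU is downed, compare
# the vectors it must give up against the free vector slots left on the
# remaining CPUs.  All names and constants here are illustrative.
NR_VECTORS = 256
FIRST_EXTERNAL_VECTOR = 0x20

def can_offline(cpu, vectors_in_use):
    """vectors_in_use: dict mapping cpu id -> set of assigned vector numbers."""
    needed = len(vectors_in_use[cpu])              # vectors to rehome
    capacity = NR_VECTORS - FIRST_EXTERNAL_VECTOR  # slots per remaining CPU
    free = sum(capacity - len(v)
               for c, v in vectors_in_use.items() if c != cpu)
    return free >= needed

# CPU 1 holds nearly all of its vectors; CPU 0 cannot absorb them all,
# which is the "orphaned vector" situation the patch refuses to enter.
usage = {0: set(range(0x20, 0x30)), 1: set(range(0x20, 0x100))}
```

In the failing configuration above, `can_offline(1, usage)` is false, which in the patch corresponds to returning an error to userspace instead of completing the CPU down.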
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v6]
On 01/07/2014 12:54 PM, Luck, Tony wrote:
> > +	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
> > +		irq = __this_cpu_read(vector_irq[vector]);
> > +		if (irq >= 0) {
> > +			desc = irq_to_desc(irq);
> > +			data = irq_desc_get_irq_data(desc);
> > +			cpumask_copy(&affinity_new, data->affinity);
> > +			cpu_clear(this_cpu, affinity_new);
> > +			/*
> > +			 * The check below determines if this irq requires
> > +			 * an empty vector_irq[irq] entry on an online
> > +			 * cpu.
> > +			 *
> > +			 * The code only counts active non-percpu irqs, and
> > +			 * those irqs that are not linked to on an online cpu.
> > +			 * The first test is trivial, the second is not.
> > +			 *
> > +			 * The second test takes into account the
> > +			 * account that a single irq may be mapped to multiple
> > +			 * cpu's vector_irq[] (for example IOAPIC cluster
> > +			 * mode).  In this case we have two
> > +			 * possibilities:
> > +			 *
> > +			 * 1) the resulting affinity mask is empty; that is
> > +			 * this the down'd cpu is the last cpu in the irq's
> > +			 * affinity mask, and
>
> Code says "||" - so I think comment should say "or".
>
> > +			 *
> > +			 * 2) the resulting affinity mask is no longer
> > +			 * a subset of the online cpus but the affinity
> > +			 * mask is not zero; that is the down'd cpu is the
> > +			 * last online cpu in a user set affinity mask.
> > +			 *
> > +			 * In both possibilities, we need to remap the irq
> > +			 * to a new vector_irq[].
> > +			 */
> > +			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
> > +			    (cpumask_empty(&affinity_new) ||
> > +			     !cpumask_subset(&affinity_new, &online_new)))
> > +				this_count++;
> > +		}
>
> That's an impressive 6:1 ratio of lines-of-comment to lines-of-code!

Heh -- I couldn't decide if I should keep it all together in one comment
or divide it up.  I guess it does look less scary if I divide it up.  So
how about (sorry for the cut-and-paste)

	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
		irq = __this_cpu_read(vector_irq[vector]);
		if (irq >= 0) {
			desc = irq_to_desc(irq);
			data = irq_desc_get_irq_data(desc);
			cpumask_copy(&affinity_new, data->affinity);
			cpu_clear(this_cpu, affinity_new);

			/* Do not count inactive or per-cpu irqs. */
			if (!irq_has_action(irq) || irqd_is_per_cpu(data))
				continue;

			/*
			 * A single irq may be mapped to multiple
			 * cpu's vector_irq[] (for example IOAPIC cluster
			 * mode).  In this case we have two
			 * possibilities:
			 *
			 * 1) the resulting affinity mask is empty; that is
			 * this the down'd cpu is the last cpu in the irq's
			 * affinity mask, or
			 *
			 * 2) the resulting affinity mask is no longer
			 * a subset of the online cpus but the affinity
			 * mask is not zero; that is the down'd cpu is the
			 * last online cpu in a user set affinity mask.
			 */
			if (cpumask_empty(&affinity_new) ||
			    !cpumask_subset(&affinity_new, &online_new))
				this_count++;
		}
	}

Everyone okay with that?

P.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
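The two-part test under discussion can be modeled with plain Python sets. This is an illustrative sketch only: `needs_new_home` and the set arithmetic are stand-ins for the kernel's cpumask operations, not the patch's code.

```python
# Set-based model of the test discussed above: after dropping the down'd
# CPU from an irq's affinity mask, the irq needs a new vector_irq[] home if
# the mask went empty (case 1) or if it is nonzero but no longer a subset
# of the still-online CPUs (case 2).
def needs_new_home(affinity, down_cpu, online):
    affinity_new = affinity - {down_cpu}   # cpumask minus the down'd cpu
    online_new = online - {down_cpu}       # online cpus after the down
    return not affinity_new or not affinity_new <= online_new
```

Case 2 is the subtle one from the review: a user-set affinity mask may still name CPUs, but if every remaining CPU in it is already offline, the irq must still be rehomed and therefore counted.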
RE: [PATCH] x86: Add check for number of available vectors before CPU down [v6]
+	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
+		irq = __this_cpu_read(vector_irq[vector]);
+		if (irq >= 0) {
+			desc = irq_to_desc(irq);
+			data = irq_desc_get_irq_data(desc);
+			cpumask_copy(&affinity_new, data->affinity);
+			cpu_clear(this_cpu, affinity_new);
+			/*
+			 * The check below determines if this irq requires
+			 * an empty vector_irq[irq] entry on an online
+			 * cpu.
+			 *
+			 * The code only counts active non-percpu irqs, and
+			 * those irqs that are not linked to on an online cpu.
+			 * The first test is trivial, the second is not.
+			 *
+			 * The second test takes into account the
+			 * account that a single irq may be mapped to multiple
+			 * cpu's vector_irq[] (for example IOAPIC cluster
+			 * mode).  In this case we have two
+			 * possibilities:
+			 *
+			 * 1) the resulting affinity mask is empty; that is
+			 * this the down'd cpu is the last cpu in the irq's
+			 * affinity mask, and

Code says "||" - so I think comment should say "or".

+			 *
+			 * 2) the resulting affinity mask is no longer
+			 * a subset of the online cpus but the affinity
+			 * mask is not zero; that is the down'd cpu is the
+			 * last online cpu in a user set affinity mask.
+			 *
+			 * In both possibilities, we need to remap the irq
+			 * to a new vector_irq[].
+			 */
+			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
+			    (cpumask_empty(&affinity_new) ||
+			     !cpumask_subset(&affinity_new, &online_new)))
+				this_count++;
+		}

That's an impressive 6:1 ratio of lines-of-comment to lines-of-code!

Perhaps it would be less scary if the test were broken up into the
easy/obvious part and the one that has taken all these revisions to
work out?  E.g.

	/* no need to count inactive or per-cpu irqs */
	if (!irq_has_action(irq) || irqd_is_per_cpu(data))
		continue;

	/*
	 * We need to look for a new home for this irq if:
	 * ... paste in 1), 2) from above here ...
	 * (but s/and/or/ to match code)
	 */
	if (cpumask_empty(&affinity_new) ||
	    !cpumask_subset(&affinity_new, &online_new))
		this_count++;

-Tony
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v6]
On 01/07/2014 12:54 PM, Luck, Tony wrote:
> +	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
> +		irq = __this_cpu_read(vector_irq[vector]);
> +		if (irq >= 0) {
> +			desc = irq_to_desc(irq);
> +			data = irq_desc_get_irq_data(desc);
> +			cpumask_copy(&affinity_new, data->affinity);
> +			cpu_clear(this_cpu, affinity_new);
> +			/*
> +			 * The check below determines if this irq requires
> +			 * an empty vector_irq[irq] entry on an online
> +			 * cpu.
> +			 *
> +			 * The code only counts active non-percpu irqs, and
> +			 * those irqs that are not linked to on an online cpu.
> +			 * The first test is trivial, the second is not.
> +			 *
> +			 * The second test takes into account the
> +			 * account that a single irq may be mapped to multiple
> +			 * cpu's vector_irq[] (for example IOAPIC cluster
> +			 * mode).  In this case we have two
> +			 * possibilities:
> +			 *
> +			 * 1) the resulting affinity mask is empty; that is
> +			 * this the down'd cpu is the last cpu in the irq's
> +			 * affinity mask, and
>
> Code says "||" - so I think comment should say "or".
>
> +			 *
> +			 * 2) the resulting affinity mask is no longer
> +			 * a subset of the online cpus but the affinity
> +			 * mask is not zero; that is the down'd cpu is the
> +			 * last online cpu in a user set affinity mask.
> +			 *
> +			 * In both possibilities, we need to remap the irq
> +			 * to a new vector_irq[].
> +			 */
> +			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
> +			    (cpumask_empty(&affinity_new) ||
> +			     !cpumask_subset(&affinity_new, &online_new)))
> +				this_count++;
> +		}
>
> That's an impressive 6:1 ratio of lines-of-comment to lines-of-code!

Heh -- I couldn't decide if I should keep it all together in one comment
or divide it up.  I guess it does look less scary if I divide it up.

So how about (sorry for the cut-and-paste)

	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
		irq = __this_cpu_read(vector_irq[vector]);
		if (irq >= 0) {
			desc = irq_to_desc(irq);
			data = irq_desc_get_irq_data(desc);
			cpumask_copy(&affinity_new, data->affinity);
			cpu_clear(this_cpu, affinity_new);

			/* Do not count inactive or per-cpu irqs. */
			if (!irq_has_action(irq) || irqd_is_per_cpu(data))
				continue;

			/*
			 * A single irq may be mapped to multiple
			 * cpu's vector_irq[] (for example IOAPIC cluster
			 * mode).  In this case we have two possibilities:
			 *
			 * 1) the resulting affinity mask is empty; that
			 * is, the down'd cpu is the last cpu in the irq's
			 * affinity mask, or
			 *
			 * 2) the resulting affinity mask is no longer
			 * a subset of the online cpus but the affinity
			 * mask is not zero; that is, the down'd cpu is
			 * the last online cpu in a user set affinity mask.
			 */
			if (cpumask_empty(&affinity_new) ||
			    !cpumask_subset(&affinity_new, &online_new))
				this_count++;
		}
	}

Everyone okay with that?

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v4]
Add some nitpicks below.

Reviewed-by: Chen, Gong

On Thu, Jan 02, 2014 at 07:47:24PM -0500, Prarit Bhargava wrote:
> +int check_irq_vectors_for_cpu_disable(void)
> +{
> +	int irq, cpu;
> +	unsigned int vector, this_count, count;
> +	struct irq_desc *desc;
> +	struct irq_data *data;
> +	struct cpumask affinity_new, online_new;
> +
> +	cpumask_copy(&online_new, cpu_online_mask);
> +	cpu_clear(smp_processor_id(), online_new);

I notice that you use smp_processor_id() many times. Maybe we can save
it first for speed.

> +
> +	this_count = 0;
> +	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
> +		irq = __this_cpu_read(vector_irq[vector]);
> +		if (irq >= 0) {
> +			desc = irq_to_desc(irq);
> +			data = irq_desc_get_irq_data(desc);
> +			cpumask_copy(&affinity_new, data->affinity);
> +			cpu_clear(smp_processor_id(), affinity_new);
> +			/*
> +			 * Only count active non-percpu irqs, and those
> +			 * irqs that are not linked to on an online cpu; in
> +			 * fixup_irqs(), chip->irq_set_affinity() will be
> +			 * called which will set vector_irq[irq] for each
> +			 * cpu.
> +			 */
> +			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
> +			    (cpumask_empty(&affinity_new) ||
> +			     !cpumask_subset(&affinity_new, &online_new)))
> +				this_count++;

Would you please add some extra comments to describe these two different
conditions, where the cpumask is empty and where it is non-empty?  I feel
it is a little bit subtle and I don't want to be confused by it myself
one day :-).
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 01/02/2014 07:57 AM, Prarit Bhargava wrote:
>
> On 01/01/2014 09:41 PM, Chen, Gong wrote:
>> On Tue, Dec 31, 2013 at 04:22:09PM -0500, Prarit Bhargava wrote:
>>> Okay, how about,
>>>	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>>>	    ((!cpumask_empty(&affinity_new)) &&
>>>	     !cpumask_subset(&affinity_new, &online_new)) ||
>>>	    cpumask_empty(&affinity_new))
>>>		this_count++;
>>>
>> I think it is good but a little bit complicated. How about this:
>>
>>	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>>	    /* add some comments to emphasize the importance of turn */
>>	    (cpumask_empty(&affinity_new) ||
>>	     !cpumask_subset(&affinity_new, &online_new)))
>
> Yeah :)  I thought of that after I sent it. :)
>
>>> I tried this with the following examples and AFAICT I get the correct
>>> result:
>>>
>>> 1) affinity mask = online mask = 0xf.  CPU 3 (1000b) is down'd.
>>>
>>> this_count is not incremented.
>>>
>>> 2) affinity mask is a non-zero subset of the online mask (which IMO is
>>> the "typical" case).  For example, affinity_mask = 0x9, online mask =
>>> 0xf.  CPU 3 is again down'd.
>>>
>>> this_count is not incremented.
>>>
>>> 3) affinity_mask = 0x1, online mask = 0x3.  (this is your example).
>>> CPU 1 is going down.
>>>
>>> this_count is incremented, as the resulting affinity mask will be 0.
>>>
>>> 4) affinity_mask = 0x0, online mask = 0x7.  CPU 1 is going down.
>>>
>>> this_count is incremented, as the affinity mask is 0.
>>>
>> The 4th scenario is very tricky. If you try to set affinity from user
>> space, it will return failure because before the kernel tries to change
>> the affinity it will verify it:
>>
>>	int __ioapic_set_affinity(...)
>>	{
>>		...
>>		if (!cpumask_intersects(mask, cpu_online_mask))
>>			return -EINVAL;
>>		...
>>	}
>>
>> So from this point of view, affinity can't be 0. But your patch is very
>> special because you change it by hand:
>>
>>	cpu_clear(smp_processor_id(), affinity_new);
>>
>> so it is reasonable. It makes me think a little bit more. In fixup_irqs
>> we have similar logic but we don't protect it. Maybe it is because
>> currently scenario 4 can't happen because we stop it in advance. But
>> who knows, if one day we use it in another situation we will probably
>> hit this subtle issue.
>>
>> So, Prarit, I suggest writing another patch to fix this potential issue
>> for fixup_irqs. What would you think?
>
> As you know Rui, I've been staring at that code wondering if it needs a
> fix.  I'd like to hear Gong Chen's thoughts about it too...

Oops -- got my sender header mixed up somehow.  Sorry 'bout that Gong :(

Gong, I think I'd like to go with the above patch for now and then take a
closer look at fixing fixup_irqs() in another patch.  The reason is that
I think the above patch is critical for getting cpu hotplug working on
large systems with many devices.

I'm going to do more testing and will resubmit the patch later in the day.

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 01/01/2014 09:41 PM, Chen, Gong wrote:
> On Tue, Dec 31, 2013 at 04:22:09PM -0500, Prarit Bhargava wrote:
>> Okay, how about,
>>	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>>	    ((!cpumask_empty(&affinity_new)) &&
>>	     !cpumask_subset(&affinity_new, &online_new)) ||
>>	    cpumask_empty(&affinity_new))
>>		this_count++;
>>
> I think it is good but a little bit complicated. How about this:
>
>	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>	    /* add some comments to emphasize the importance of turn */
>	    (cpumask_empty(&affinity_new) ||
>	     !cpumask_subset(&affinity_new, &online_new)))

Yeah :)  I thought of that after I sent it. :)

>
>> I tried this with the following examples and AFAICT I get the correct
>> result:
>>
>> 1) affinity mask = online mask = 0xf.  CPU 3 (1000b) is down'd.
>>
>> this_count is not incremented.
>>
>> 2) affinity mask is a non-zero subset of the online mask (which IMO is
>> the "typical" case).  For example, affinity_mask = 0x9, online mask =
>> 0xf.  CPU 3 is again down'd.
>>
>> this_count is not incremented.
>>
>> 3) affinity_mask = 0x1, online mask = 0x3.  (this is your example).
>> CPU 1 is going down.
>>
>> this_count is incremented, as the resulting affinity mask will be 0.
>>
>> 4) affinity_mask = 0x0, online mask = 0x7.  CPU 1 is going down.
>>
>> this_count is incremented, as the affinity mask is 0.
>>
> The 4th scenario is very tricky. If you try to set affinity from user
> space, it will return failure because before the kernel tries to change
> the affinity it will verify it:
>
>	int __ioapic_set_affinity(...)
>	{
>		...
>		if (!cpumask_intersects(mask, cpu_online_mask))
>			return -EINVAL;
>		...
>	}
>
> So from this point of view, affinity can't be 0. But your patch is very
> special because you change it by hand:
>
>	cpu_clear(smp_processor_id(), affinity_new);
>
> so it is reasonable. It makes me think a little bit more. In fixup_irqs
> we have similar logic but we don't protect it. Maybe it is because
> currently scenario 4 can't happen because we stop it in advance. But
> who knows, if one day we use it in another situation we will probably
> hit this subtle issue.
>
> So, Prarit, I suggest writing another patch to fix this potential issue
> for fixup_irqs. What would you think?

As you know Rui, I've been staring at that code wondering if it needs a
fix.  I'd like to hear Gong Chen's thoughts about it too...

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On Tue, Dec 31, 2013 at 04:22:09PM -0500, Prarit Bhargava wrote:
> Okay, how about,
>	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>	    ((!cpumask_empty(&affinity_new)) &&
>	     !cpumask_subset(&affinity_new, &online_new)) ||
>	    cpumask_empty(&affinity_new))
>		this_count++;
>
I think it is good but a little bit complicated. How about this:

	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
	    /* add some comments to emphasize the importance of turn */
	    (cpumask_empty(&affinity_new) ||
	     !cpumask_subset(&affinity_new, &online_new)))

> I tried this with the following examples and AFAICT I get the correct
> result:
>
> 1) affinity mask = online mask = 0xf.  CPU 3 (1000b) is down'd.
>
> this_count is not incremented.
>
> 2) affinity mask is a non-zero subset of the online mask (which IMO is
> the "typical" case).  For example, affinity_mask = 0x9, online mask =
> 0xf.  CPU 3 is again down'd.
>
> this_count is not incremented.
>
> 3) affinity_mask = 0x1, online mask = 0x3.  (this is your example).
> CPU 1 is going down.
>
> this_count is incremented, as the resulting affinity mask will be 0.
>
> 4) affinity_mask = 0x0, online mask = 0x7.  CPU 1 is going down.
>
> this_count is incremented, as the affinity mask is 0.
>
The 4th scenario is very tricky. If you try to set affinity from user
space, it will return failure because before the kernel tries to change
the affinity it will verify it:

int __ioapic_set_affinity(...)
{
	...
	if (!cpumask_intersects(mask, cpu_online_mask))
		return -EINVAL;
	...
}

So from this point of view, affinity can't be 0. But your patch is very
special because you change it by hand:

	cpu_clear(smp_processor_id(), affinity_new);

so it is reasonable. It makes me think a little bit more. In fixup_irqs
we have similar logic but we don't protect it. Maybe it is because
currently scenario 4 can't happen because we stop it in advance. But who
knows, if one day we use it in another situation we will probably hit
this subtle issue.

So, Prarit, I suggest writing another patch to fix this potential issue
for fixup_irqs. What would you think?
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/30/2013 09:58 PM, rui wang wrote:
> On 12/30/13, Prarit Bhargava wrote:
>>
>> On 12/30/2013 07:56 AM, rui wang wrote:
>>
> ...
>> Okay, so the big issue is that we need to do the calculation without
>> this cpu,
>
>> int check_irq_vectors_for_cpu_disable(void)
>> {
>>	int irq, cpu;
>>	unsigned int vector, this_count, count;
>>	struct irq_desc *desc;
>>	struct irq_data *data;
>>	struct cpumask online_new;	/* cpu_online_mask - this_cpu */
>>	struct cpumask affinity_new;	/* affinity - this_cpu */
>>
>>	cpumask_copy(&online_new, cpu_online_mask);
>>	cpu_clear(smp_processor_id(), online_new);
>>
>>	this_count = 0;
>>	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
>>		irq = __this_cpu_read(vector_irq[vector]);
>>		if (irq >= 0) {
>>			desc = irq_to_desc(irq);
>>			data = irq_desc_get_irq_data(desc);
>>			cpumask_copy(&affinity_new, data->affinity);
>>			cpu_clear(smp_processor_id(), affinity_new);
>>			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>>			    !cpumask_subset(&affinity_new, &online_new) &&
>>			    !cpumask_empty(&affinity_new))
>
> If this cpu is the only target, then affinity_new becomes empty.
> Should we count it for migration?

Okay, how about,

	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
	    ((!cpumask_empty(&affinity_new)) &&
	     !cpumask_subset(&affinity_new, &online_new)) ||
	    cpumask_empty(&affinity_new))
		this_count++;

I tried this with the following examples and AFAICT I get the correct
result:

1) affinity mask = online mask = 0xf.  CPU 3 (1000b) is down'd.

this_count is not incremented.

2) affinity mask is a non-zero subset of the online mask (which IMO is
the "typical" case).  For example, affinity_mask = 0x9, online mask =
0xf.  CPU 3 is again down'd.

this_count is not incremented.

3) affinity_mask = 0x1, online mask = 0x3.  (this is your example).
CPU 1 is going down.

this_count is incremented, as the resulting affinity mask will be 0.

4) affinity_mask = 0x0, online mask = 0x7.  CPU 1 is going down.

this_count is incremented, as the affinity mask is 0.

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/30/13, Prarit Bhargava wrote:
>
> On 12/30/2013 07:56 AM, rui wang wrote:
>
>> An irq can be mapped to only one vector number, but can have multiple
>> destination CPUs. i.e. the same irq/vector can appear on multiple
>> CPUs' vector_irq[]. So checking data->affinity is necessary I think.
>
> That's true Rui -- but here's what I think the scenario actually is.
>
> Suppose we have a 4-cpu system, and we have an IRQ that is mapped to
> multiple cpu's vector_irq[].  For example, we have IRQ 200 that is
> mapped to CPU 2 vector_irq[50], and CPU 3 vector_irq[60].
>
This should not happen. There's only one LSB to fill the vector number
in IRTE. So either 50 or 60 but not both for an irq.

...

> Okay, so the big issue is that we need to do the calculation without
> this cpu,

Yes if it's done before calling cpu_disable_common().

> so I think this works (sorry for the cut-and-paste)
>
> int check_irq_vectors_for_cpu_disable(void)
> {
>	int irq, cpu;
>	unsigned int vector, this_count, count;
>	struct irq_desc *desc;
>	struct irq_data *data;
>	struct cpumask online_new;	/* cpu_online_mask - this_cpu */
>	struct cpumask affinity_new;	/* affinity - this_cpu */
>
>	cpumask_copy(&online_new, cpu_online_mask);
>	cpu_clear(smp_processor_id(), online_new);
>
>	this_count = 0;
>	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
>		irq = __this_cpu_read(vector_irq[vector]);
>		if (irq >= 0) {
>			desc = irq_to_desc(irq);
>			data = irq_desc_get_irq_data(desc);
>			cpumask_copy(&affinity_new, data->affinity);
>			cpu_clear(smp_processor_id(), affinity_new);
>			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>			    !cpumask_subset(&affinity_new, &online_new) &&
>			    !cpumask_empty(&affinity_new))

If this cpu is the only target, then affinity_new becomes empty.
Should we count it for migration?

Thanks
Rui
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/30/2013 07:56 AM, rui wang wrote: > On 12/29/13, Prarit Bhargava wrote: >> >> >> On 12/20/2013 04:41 AM, rui wang wrote: > <> >>> The vector number for an irq is programmed in the LSB of the IOAPIC >>> IRTE (or MSI data register in the case of MSI/MSIx). So there can be >>> only one vector number (although multiple CPUs can be specified >>> through DM). An MSI-capable device can dynamically change the lower >>> few bits in the LSB to signal multiple interrupts with a contiguous >>> range of vectors in powers of 2,but each of these vectors is treated >>> as a separate IRQ. i.e. each of them has a separate irq desc, or a >>> separate line in the /proc/interrupt file. This patch shows the MSI >>> irq allocation in detail: >>> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=51906e779f2b13b38f8153774c4c7163d412ffd9 >>> >>> Thanks >>> Rui >>> >> >> Gong and Rui, >> >> After looking at this in detail I realized I made a mistake in my patch by >> including the check for the smp_affinity. Simply put, it shouldn't be >> there >> given Rui's explanation above. >> >> So I think the patch simply needs to do: >> >> this_count = 0; >> for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) >> { >> irq = __this_cpu_read(vector_irq[vector]); >> if (irq >= 0) { >> desc = irq_to_desc(irq); >> data = irq_desc_get_irq_data(desc); >> affinity = data->affinity; >> if (irq_has_action(irq) && !irqd_is_per_cpu(data)) >> this_count++; >> } >> } >> >> Can the two of you confirm the above is correct? It would be greatly >> appreciated. > > An irq can be mapped to only one vector number, but can have multiple > destination CPUs. i.e. the same irq/vector can appear on multiple > CPUs' vector_irq[]. So checking data->affinity is necessary I think. > But notice that data->affinity is updated in chip->irq_set_affinity() > inside fixup_irqs(), while cpu_online_mask is updated in > remove_cpu_from_maps() inside cpu_disable_common(). 
> They are updated in different places. So the algorithm to check them
> against each other should be different, depending on where you put the
> check_vectors(). That's my understanding.

Okay, so the big issue is that we need to do the calculation without this
cpu, so I think this works (sorry for the cut-and-paste):

int check_irq_vectors_for_cpu_disable(void)
{
	int irq, cpu;
	unsigned int vector, this_count, count;
	struct irq_desc *desc;
	struct irq_data *data;
	struct cpumask online_new;	/* cpu_online_mask - this_cpu */
	struct cpumask affinity_new;	/* affinity - this_cpu */

	cpumask_copy(&online_new, cpu_online_mask);
	cpu_clear(smp_processor_id(), online_new);

	this_count = 0;
	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
		irq = __this_cpu_read(vector_irq[vector]);
		if (irq >= 0) {
			desc = irq_to_desc(irq);
			data = irq_desc_get_irq_data(desc);
			cpumask_copy(&affinity_new, data->affinity);
			cpu_clear(smp_processor_id(), affinity_new);
			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
			    !cpumask_subset(&affinity_new, &online_new) &&
			    !cpumask_empty(&affinity_new))
				this_count++;
		}
	}
	...

If I go back to the various examples this appears to work. For example,
your previous case was: all cpus are online, CPU 1 goes down, and we have
an IRQ with affinity for CPUs (1,2). We skip this IRQ, which is correct.
And if we have another IRQ with affinity of only CPU 1 we will not skip
this IRQ, which is also correct. I've tried other examples and they
appear to work AFAICT.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/30/2013 07:56 AM, rui wang wrote:
> An irq can be mapped to only one vector number, but can have multiple
> destination CPUs, i.e. the same irq/vector can appear on multiple
> CPUs' vector_irq[]. So checking data->affinity is necessary I think.

That's true Rui -- but here's what I think the scenario actually is.
Suppose we have a 4-cpu system, and we have an IRQ that is mapped to
multiple cpus' vector_irq[]. For example, we have IRQ 200 that is mapped
to CPU 2 vector_irq[50], and CPU 3 vector_irq[60].

Now I 'echo 0 > /sys/devices/system/cpu/cpu3/online'. cpu_disable is
called and the kernel migrates IRQs off to other cpus. Regardless of
whether IRQ 200 is already mapped to CPU 2 vector_irq[50], the mapping
for CPU 3 vector_irq[60] *must be migrated* to another CPU. It has a
valid irq handler and the IRQ is active. It doesn't just disappear
because the CPU went down.

ie) AFAICT we should not differentiate between a multiply mapped IRQ and
a singly mapped IRQ when traversing the vector_irq[] for CPU 3. I'm
probably being dense on this but I'm not seeing a problem with migrating
the IRQ.

> But notice that data->affinity is updated in chip->irq_set_affinity()
> inside fixup_irqs(), while cpu_online_mask is updated in
> remove_cpu_from_maps() inside cpu_disable_common().

It shouldn't matter that the maps are updated in different areas during
the execution, as we're in stop_machine().

> They are updated in different places. So the algorithm to check them
> against each other should be different, depending on where you put the
> check_vectors(). That's my understanding.

P.

> Thanks
> Rui
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/30/2013 02:44 AM, Chen, Gong wrote:
> On Sat, Dec 28, 2013 at 12:10:38PM -0500, Prarit Bhargava wrote:
>> Gong and Rui,
>>
>> After looking at this in detail I realized I made a mistake in my patch by
>> including the check for the smp_affinity. Simply put, it shouldn't be there
>> given Rui's explanation above.
>>
>> So I think the patch simply needs to do:
>>
>> 	this_count = 0;
>> 	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
>> 		irq = __this_cpu_read(vector_irq[vector]);
>> 		if (irq >= 0) {
>> 			desc = irq_to_desc(irq);
>> 			data = irq_desc_get_irq_data(desc);
>> 			affinity = data->affinity;
>> 			if (irq_has_action(irq) && !irqd_is_per_cpu(data))
>> 				this_count++;
>> 		}
>> 	}
>>
>> Can the two of you confirm the above is correct? It would be greatly
>> appreciated.
>
> No, I don't think it is correct. We still need to consider smp_affinity.
>
> fixup_irqs
>   irq_set_affinity(native_ioapic_set_affinity)
>     __ioapic_set_affinity
>       assign_irq_vector
>         __assign_irq_vector
>           cpu_mask_to_apicid_and
>           /* now begin to set the ioapic RTE */
>
> __assign_irq_vector(int irq, struct irq_cfg *cfg, const struct cpumask *mask)
> {
> 	...
> 	apic->vector_allocation_domain(cpu, tmp_mask, mask);
> 	...
> 	for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask)
> 		per_cpu(vector_irq, new_cpu)[vector] = irq;
> 	cfg->vector = vector;
> 	cpumask_copy(cfg->domain, tmp_mask);
> 	...
> }
>
> The irq is set at the same vector in all related vector_irq[], so this
> kind of irq can appear in multiple vector_irq[]. In cpu_mask_to_apicid_and()
> (e.g. x2apic_cpu_mask_to_apicid_and() for cluster mode) the apic is updated
> depending on the new mask. That's why I think this kind of interrupt
> should be bypassed.

Hmm ... okay. I'll take a closer look at this. Thanks for the additional
information.

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/29/13, Prarit Bhargava wrote:
>
> On 12/20/2013 04:41 AM, rui wang wrote:
> <snip>
>> The vector number for an irq is programmed in the LSB of the IOAPIC
>> IRTE (or MSI data register in the case of MSI/MSIx). So there can be
>> only one vector number (although multiple CPUs can be specified
>> through DM). An MSI-capable device can dynamically change the lower
>> few bits in the LSB to signal multiple interrupts with a contiguous
>> range of vectors in powers of 2, but each of these vectors is treated
>> as a separate IRQ, i.e. each of them has a separate irq desc, or a
>> separate line in the /proc/interrupts file. This patch shows the MSI
>> irq allocation in detail:
>> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=51906e779f2b13b38f8153774c4c7163d412ffd9
>>
>> Thanks
>> Rui
>
> Gong and Rui,
>
> After looking at this in detail I realized I made a mistake in my patch by
> including the check for the smp_affinity. Simply put, it shouldn't be there
> given Rui's explanation above.
>
> So I think the patch simply needs to do:
>
> 	this_count = 0;
> 	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
> 		irq = __this_cpu_read(vector_irq[vector]);
> 		if (irq >= 0) {
> 			desc = irq_to_desc(irq);
> 			data = irq_desc_get_irq_data(desc);
> 			affinity = data->affinity;
> 			if (irq_has_action(irq) && !irqd_is_per_cpu(data))
> 				this_count++;
> 		}
> 	}
>
> Can the two of you confirm the above is correct? It would be greatly
> appreciated.

An irq can be mapped to only one vector number, but can have multiple
destination CPUs, i.e. the same irq/vector can appear on multiple CPUs'
vector_irq[]. So checking data->affinity is necessary I think. But
notice that data->affinity is updated in chip->irq_set_affinity() inside
fixup_irqs(), while cpu_online_mask is updated in remove_cpu_from_maps()
inside cpu_disable_common(). They are updated in different places. So
the algorithm to check them against each other should be different,
depending on where you put the check_vectors(). That's my understanding.

Thanks
Rui
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On Sat, Dec 28, 2013 at 12:10:38PM -0500, Prarit Bhargava wrote:
> Gong and Rui,
>
> After looking at this in detail I realized I made a mistake in my patch by
> including the check for the smp_affinity. Simply put, it shouldn't be there
> given Rui's explanation above.
>
> So I think the patch simply needs to do:
>
> 	this_count = 0;
> 	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
> 		irq = __this_cpu_read(vector_irq[vector]);
> 		if (irq >= 0) {
> 			desc = irq_to_desc(irq);
> 			data = irq_desc_get_irq_data(desc);
> 			affinity = data->affinity;
> 			if (irq_has_action(irq) && !irqd_is_per_cpu(data))
> 				this_count++;
> 		}
> 	}
>
> Can the two of you confirm the above is correct? It would be greatly
> appreciated.

No, I don't think it is correct. We still need to consider smp_affinity.

fixup_irqs
  irq_set_affinity(native_ioapic_set_affinity)
    __ioapic_set_affinity
      assign_irq_vector
        __assign_irq_vector
          cpu_mask_to_apicid_and
          /* now begin to set the ioapic RTE */

__assign_irq_vector(int irq, struct irq_cfg *cfg, const struct cpumask *mask)
{
	...
	apic->vector_allocation_domain(cpu, tmp_mask, mask);
	...
	for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask)
		per_cpu(vector_irq, new_cpu)[vector] = irq;
	cfg->vector = vector;
	cpumask_copy(cfg->domain, tmp_mask);
	...
}

The irq is set at the same vector in all related vector_irq[], so this
kind of irq can appear in multiple vector_irq[]. In cpu_mask_to_apicid_and()
(e.g. x2apic_cpu_mask_to_apicid_and() for cluster mode) the apic is updated
depending on the new mask. That's why I think this kind of interrupt
should be bypassed.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/30/13, Prarit Bhargava <pra...@redhat.com> wrote:
>
> On 12/30/2013 07:56 AM, rui wang wrote:
>> An irq can be mapped to only one vector number, but can have multiple
>> destination CPUs, i.e. the same irq/vector can appear on multiple
>> CPUs' vector_irq[]. So checking data->affinity is necessary I think.
>
> That's true Rui -- but here's what I think the scenario actually is.
> Suppose we have a 4-cpu system, and we have an IRQ that is mapped to
> multiple cpus' vector_irq[]. For example, we have IRQ 200 that is
> mapped to CPU 2 vector_irq[50], and CPU 3 vector_irq[60].

This should not happen. There's only one LSB to fill the vector number
in the IRTE. So either 50 or 60, but not both, for an irq.

> ...
> Okay, so the big issue is that we need to do the calculation without
> this cpu,

Yes, if it's done before calling cpu_disable_common().

> so I think this works (sorry for the cut-and-paste):
>
> int check_irq_vectors_for_cpu_disable(void)
> {
> 	int irq, cpu;
> 	unsigned int vector, this_count, count;
> 	struct irq_desc *desc;
> 	struct irq_data *data;
> 	struct cpumask online_new;	/* cpu_online_mask - this_cpu */
> 	struct cpumask affinity_new;	/* affinity - this_cpu */
>
> 	cpumask_copy(&online_new, cpu_online_mask);
> 	cpu_clear(smp_processor_id(), online_new);
>
> 	this_count = 0;
> 	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
> 		irq = __this_cpu_read(vector_irq[vector]);
> 		if (irq >= 0) {
> 			desc = irq_to_desc(irq);
> 			data = irq_desc_get_irq_data(desc);
> 			cpumask_copy(&affinity_new, data->affinity);
> 			cpu_clear(smp_processor_id(), affinity_new);
> 			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
> 			    !cpumask_subset(&affinity_new, &online_new) &&
> 			    !cpumask_empty(&affinity_new))
> 				this_count++;
> 		}
> 	}

If this cpu is the only target, then affinity_new becomes empty. Should
we count it for migration?

Thanks
Rui
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/20/2013 04:41 AM, rui wang wrote:
> On 12/20/13, Prarit Bhargava wrote:
>>
>> On 12/19/2013 01:05 PM, Tony Luck wrote:
>>> On Wed, Dec 18, 2013 at 11:50 AM, Tony Luck wrote:
>>>> Looks good to me.
>>>
>>> Though now I've been confused by an offline question about affinity.
>>
>> Heh :) I'm pursuing it now. Rui has asked a pretty good question that I
>> don't know the answer to off the top of my head. I'm still looking at
>> the code.
>>
>>> Suppose we have some interrupt that has affinity to multiple cpus. E.g.
>>> (real example from one of my machines):
>>>
>>> # cat /proc/irq/94/smp_affinity_list
>>> 26,54
>>>
>>> Now if I want to take either cpu26 or cpu54 offline - I'm guessing that
>>> I don't really need to find a new home for vector 94 - because the
>>> other one of that pair already has that set up. But your check_vectors
>>> code doesn't look like it accounts for that - if we take cpu26 offline
>>> - it would see that cpu54 doesn't have 94 free - but doesn't check that
>>> it is for the same interrupt.
>>>
>>> But I may be mixing "vectors" and "irqs" here.
>>
>> Yep. The question really is this: is the irq mapped to a single vector
>> or multiple vectors. (I think)
>
> The vector number for an irq is programmed in the LSB of the IOAPIC
> IRTE (or MSI data register in the case of MSI/MSIx). So there can be
> only one vector number (although multiple CPUs can be specified
> through DM). An MSI-capable device can dynamically change the lower
> few bits in the LSB to signal multiple interrupts with a contiguous
> range of vectors in powers of 2, but each of these vectors is treated
> as a separate IRQ, i.e. each of them has a separate irq desc, or a
> separate line in the /proc/interrupts file. This patch shows the MSI
> irq allocation in detail:
> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=51906e779f2b13b38f8153774c4c7163d412ffd9
>
> Thanks
> Rui

Gong and Rui,

After looking at this in detail I realized I made a mistake in my patch by
including the check for the smp_affinity. Simply put, it shouldn't be there
given Rui's explanation above.

So I think the patch simply needs to do:

	this_count = 0;
	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
		irq = __this_cpu_read(vector_irq[vector]);
		if (irq >= 0) {
			desc = irq_to_desc(irq);
			data = irq_desc_get_irq_data(desc);
			affinity = data->affinity;
			if (irq_has_action(irq) && !irqd_is_per_cpu(data))
				this_count++;
		}
	}

Can the two of you confirm the above is correct? It would be greatly
appreciated.

Tony, I apologize -- your comments made me think you were stating a fact
and not asking a question about the behavior of affinity. I completely
misunderstood what you were suggesting. I thought you were implying that
the affinity "tied" IRQ behavior together; it does not. It is simply a
suggestion of which IRQs should be assigned to a particular CPU. There is
an expectation that the system will attempt to honour the affinity;
however, it is not like each CPU is assigned a separate IRQ.

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v3]
On 12/23/2013 02:12 AM, Chen, Gong wrote:
> On Fri, Dec 20, 2013 at 10:50:09AM -0500, Prarit Bhargava wrote:
>> +int check_vectors(void)
>> +{
>> +	int irq, cpu;
>> +	unsigned int vector, this_count, count;
>> +	struct irq_desc *desc;
>> +	struct irq_data *data;
>> +	struct cpumask *affinity;
>> +
>> +	this_count = 0;
>> +	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
>> +		irq = __this_cpu_read(vector_irq[vector]);
>> +		if (irq >= 0) {
>> +			desc = irq_to_desc(irq);
>> +			data = irq_desc_get_irq_data(desc);
>> +			affinity = data->affinity;
>> +			if (irq_has_action(irq) || !irqd_is_per_cpu(data) ||
>
> This line looks weird. Why will the counter be increased once the irq
> has an action? Do you mean
>
> 	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
> 	    !cpumask_subset(affinity, cpu_online_mask))

Yes, you're absolutely right. I mixed up the ands and ors on this line of
code. I will send out a patch against tip shortly.

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v3]
On Fri, Dec 20, 2013 at 10:50:09AM -0500, Prarit Bhargava wrote:
> +int check_vectors(void)
> +{
> +	int irq, cpu;
> +	unsigned int vector, this_count, count;
> +	struct irq_desc *desc;
> +	struct irq_data *data;
> +	struct cpumask *affinity;
> +
> +	this_count = 0;
> +	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
> +		irq = __this_cpu_read(vector_irq[vector]);
> +		if (irq >= 0) {
> +			desc = irq_to_desc(irq);
> +			data = irq_desc_get_irq_data(desc);
> +			affinity = data->affinity;
> +			if (irq_has_action(irq) || !irqd_is_per_cpu(data) ||

This line looks weird.  Why will the counter be increased once the irq has an action?  Do you mean

	if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
	    !cpumask_subset(affinity, cpu_online_mask))

> +			    !cpumask_subset(affinity, cpu_online_mask))

It looks like cpu_online_mask will not be updated until cpu_disable_common() is called, but your check_vectors() is called before that.

[...]

> {
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 85dc05a..7ac6654 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1312,6 +1312,12 @@ void cpu_disable_common(void)
>
>  int native_cpu_disable(void)
>  {
> +	int ret;
> +
> +	ret = check_vectors();
> +	if (ret)
> +		return ret;
> +
>  	clear_local_APIC();
>
>  	cpu_disable_common();
> --
> 1.7.9.3
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v3]
On 12/20/2013 03:06 PM, Luck, Tony wrote:
>> I haven't double checked, but I'm assuming the hot plug locks are held
>> while you are doing this.
>
> I dug into that when looking at v2 - the whole thing is under "stop_machine()"
> so locking was not an issue.
>
> Reviewed-by: Tony Luck

Awesome, thank you!

I have queued this up for Linus already as a fix for 3.13, as it seems important enough.

	-hpa
RE: [PATCH] x86: Add check for number of available vectors before CPU down [v3]
> I haven't double checked, but I'm assuming the hot plug locks are held
> while you are doing this.

I dug into that when looking at v2 - the whole thing is under "stop_machine()" so locking was not an issue.

Reviewed-by: Tony Luck

-Tony
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v3]
On Fri, Dec 20, 2013 at 10:50:09AM -0500, Prarit Bhargava wrote:
> I tested across various systems with both the linux.git and linux-next.git
> trees and do not see any false positives.  The patch behaves as expected on
> a system with 590 IRQs allocated at boot time; the system refuses to down
> cpu 62.

I read the patch and it looks good to me.

I haven't double checked, but I'm assuming the hot plug locks are held while you are doing this.

Reviewed-by: Andi Kleen

-Andi
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/20/2013 04:41 AM, rui wang wrote:
> On 12/20/13, Prarit Bhargava wrote:
>> On 12/19/2013 01:05 PM, Tony Luck wrote:
>>> On Wed, Dec 18, 2013 at 11:50 AM, Tony Luck wrote:
>>>> Looks good to me.
>>>
>>> Though now I've been confused by an offline question about affinity.
>>
>> Heh :)  I'm pursuing it now.  Rui has asked a pretty good question that I
>> don't know the answer to off the top of my head.  I'm still looking at
>> the code.
>>
>>> Suppose we have some interrupt that has affinity to multiple cpus.  E.g.
>>> (real example from one of my machines):
>>>
>>> # cat /proc/irq/94/smp_affinity_list
>>> 26,54
>>>
>>> Now if I want to take either cpu26 or cpu54 offline - I'm guessing that
>>> I don't really need to find a new home for vector 94 - because the other
>>> one of that pair already has that set up.  But your check_vectors code
>>> doesn't look like it accounts for that - if we take cpu26 offline - it
>>> would see that cpu54 doesn't have 94 free - but doesn't check that it is
>>> for the same interrupt.
>>>
>>> But I may be mixing "vectors" and "irqs" here.
>>
>> Yep.  The question really is this: is the irq mapped to a single vector
>> or multiple vectors.  (I think)
>
> The vector number for an irq is programmed in the LSB of the IOAPIC
> IRTE (or the MSI data register in the case of MSI/MSI-x).  So there can
> be only one vector number (although multiple CPUs can be specified
> through DM).  An MSI-capable device can dynamically change the lower
> few bits in the LSB to signal multiple interrupts with a contiguous
> range of vectors in powers of 2, but each of these vectors is treated
> as a separate IRQ, i.e. each of them has a separate irq desc, or a
> separate line in the /proc/interrupts file.  This patch shows the MSI
> irq allocation in detail:
> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=51906e779f2b13b38f8153774c4c7163d412ffd9

Yep, that's where I got the idea for the affinity check in my previous post.

So far linux.git top-of-tree + my patch looks good.  I'm pulling down linux-next.git right now ...

P.

> Thanks
> Rui
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/20/13, Prarit Bhargava wrote:
> On 12/19/2013 01:05 PM, Tony Luck wrote:
>> On Wed, Dec 18, 2013 at 11:50 AM, Tony Luck wrote:
>>> Looks good to me.
>>
>> Though now I've been confused by an offline question about affinity.
>
> Heh :)  I'm pursuing it now.  Rui has asked a pretty good question that I
> don't know the answer to off the top of my head.  I'm still looking at the
> code.
>
>> Suppose we have some interrupt that has affinity to multiple cpus.  E.g.
>> (real example from one of my machines):
>>
>> # cat /proc/irq/94/smp_affinity_list
>> 26,54
>>
>> Now if I want to take either cpu26 or cpu54 offline - I'm guessing that I
>> don't really need to find a new home for vector 94 - because the other one
>> of that pair already has that set up.  But your check_vectors code doesn't
>> look like it accounts for that - if we take cpu26 offline - it would see
>> that cpu54 doesn't have 94 free - but doesn't check that it is for the
>> same interrupt.
>>
>> But I may be mixing "vectors" and "irqs" here.
>
> Yep.  The question really is this: is the irq mapped to a single vector or
> multiple vectors.  (I think)

The vector number for an irq is programmed in the LSB of the IOAPIC IRTE (or the MSI data register in the case of MSI/MSI-x).  So there can be only one vector number (although multiple CPUs can be specified through DM).  An MSI-capable device can dynamically change the lower few bits in the LSB to signal multiple interrupts with a contiguous range of vectors in powers of 2, but each of these vectors is treated as a separate IRQ, i.e. each of them has a separate irq desc, or a separate line in the /proc/interrupts file.  This patch shows the MSI irq allocation in detail:
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=51906e779f2b13b38f8153774c4c7163d412ffd9

Thanks
Rui
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On Thu, Dec 19, 2013 at 01:11:32PM -0500, Prarit Bhargava wrote:
>
> Yep.  The question really is this: is the irq mapped to a single vector or
> multiple vectors.  (I think)
>
> P.

Yes, this is the original thought I wanted to express on the bugzilla.  In an ideal environment, without irq balancing, all vector_irq should hold the same values, say, 0x.  So when one CPU is off-lined, your patch considers that there are no places to hold these to-be-migrated irqs, but in fact they are shared on different CPUs.  So if we can answer this question, the answer will be clear :-).
Re: [PATCH] x86: Add check for number of available vectors before CPU down
On 12/19/2013 02:19 AM, rui wang wrote:
> On 12/19/13, Prarit Bhargava wrote:
>> On 12/03/2013 09:48 PM, rui wang wrote:
>>> On 11/20/13, Prarit Bhargava wrote:
>>> Have you considered the case when an IRQ is destined to more than one
>>> CPU?  e.g.:
>>>
>>> bash# cat /proc/irq/89/smp_affinity_list
>>> 30,62
>>> bash#
>>>
>>> In this case offlining CPU30 does not seem to require an empty vector
>>> slot.  It seems that we only need to change the affinity mask of irq89.
>>> Your check_vectors() assumed that each irq on the offlining cpu
>>> requires a new vector slot.
>>
>> Rui,
>>
>> The smp_affinity_list only indicates a preferred destination of the IRQ,
>> not the *actual* location of the CPU.  So the IRQ is on one of cpu 30 or
>> 62 but not both simultaneously.
>
> It depends on how the IOAPIC (or MSI/MSI-x) is configured.  An IRQ can be
> simultaneously broadcast to all destination CPUs (Fixed Mode) or
> delivered to the CPU with the lowest priority task (Lowest Priority
> Mode).  It's programmed in the Delivery Mode bits of the IOAPIC's IO
> Redirection Table registers, or the Message Data Register in the case
> of MSI/MSI-x.

Thanks for clueing me in Rui :).  You're right.  I do need to do a

	if (irq_has_action(irq) || !irqd_is_per_cpu(data) ||
	    !cpumask_subset(affinity, cpu_online_mask))

instead of just

	if (!irqd_is_per_cpu(data))

I'm going to do some additional testing tonight across various systems and will repost tomorrow if the testing is successful.

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On 12/19/2013 01:05 PM, Tony Luck wrote:
> On Wed, Dec 18, 2013 at 11:50 AM, Tony Luck wrote:
>> Looks good to me.
>
> Though now I've been confused by an offline question about affinity.

Heh :)  I'm pursuing it now.  Rui has asked a pretty good question that I don't know the answer to off the top of my head.  I'm still looking at the code.

> Suppose we have some interrupt that has affinity to multiple cpus.  E.g.
> (real example from one of my machines):
>
> # cat /proc/irq/94/smp_affinity_list
> 26,54
>
> Now if I want to take either cpu26 or cpu54 offline - I'm guessing that I
> don't really need to find a new home for vector 94 - because the other one
> of that pair already has that set up.  But your check_vectors code doesn't
> look like it accounts for that - if we take cpu26 offline - it would see
> that cpu54 doesn't have 94 free - but doesn't check that it is for the
> same interrupt.
>
> But I may be mixing "vectors" and "irqs" here.

Yep.  The question really is this: is the irq mapped to a single vector or multiple vectors.  (I think)

P.

> -Tony
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On Wed, Dec 18, 2013 at 11:50 AM, Tony Luck wrote:
> Looks good to me.

Though now I've been confused by an offline question about affinity.

Suppose we have some interrupt that has affinity to multiple cpus.  E.g. (real example from one of my machines):

# cat /proc/irq/94/smp_affinity_list
26,54

Now if I want to take either cpu26 or cpu54 offline - I'm guessing that I don't really need to find a new home for vector 94 - because the other one of that pair already has that set up.  But your check_vectors code doesn't look like it accounts for that - if we take cpu26 offline - it would see that cpu54 doesn't have 94 free - but doesn't check that it is for the same interrupt.

But I may be mixing "vectors" and "irqs" here.

-Tony
Re: [PATCH] x86: Add check for number of available vectors before CPU down
On 12/19/2013 02:19 AM, rui wang wrote:
> On 12/19/13, Prarit Bhargava wrote:
>> On 12/03/2013 09:48 PM, rui wang wrote:
>>> On 11/20/13, Prarit Bhargava wrote:
>>> Have you considered the case when an IRQ is destined to more than one
>>> CPU?  e.g.:
>>>
>>> bash# cat /proc/irq/89/smp_affinity_list
>>> 30,62
>>> bash#
>>>
>>> In this case offlining CPU30 does not seem to require an empty vector
>>> slot.  It seems that we only need to change the affinity mask of irq89.
>>> Your check_vectors() assumed that each irq on the offlining cpu
>>> requires a new vector slot.
>>
>> Rui,
>>
>> The smp_affinity_list only indicates a preferred destination of the IRQ,
>> not the *actual* location of the CPU.  So the IRQ is on one of cpu 30 or
>> 62 but not both simultaneously.
>
> It depends on how the IOAPIC (or MSI/MSI-x) is configured.  An IRQ can be
> simultaneously broadcast to all destination CPUs (Fixed Mode) or
> delivered to the CPU with the lowest priority task (Lowest Priority
> Mode).  It's programmed in the Delivery Mode bits of the IOAPIC's IO
> Redirection Table registers, or the Message Data Register in the case
> of MSI/MSI-x.

Hmm ... I didn't realize that this was a possibility.  I'll go back and rework the patch.  Thanks for the info Rui!

P.

>> If the case is that 62 is being brought down, then the smp_affinity mask
>> will be updated to reflect only cpu 30 (and vice versa).
>
> Yes the affinity mask should be updated.  But if it was destined to
> more than one CPU, your "this_counter" does not seem to count the
> right numbers.  Are you saying that the smp_affinity mask is broken on
> Linux so that there's no way to configure an IRQ to target more than
> one CPU?
>
> Thanks
> Rui
Re: [PATCH] x86: Add check for number of available vectors before CPU down
On 12/19/13, Prarit Bhargava wrote:
> On 12/03/2013 09:48 PM, rui wang wrote:
>> On 11/20/13, Prarit Bhargava wrote:
>> Have you considered the case when an IRQ is destined to more than one
>> CPU?  e.g.:
>>
>> bash# cat /proc/irq/89/smp_affinity_list
>> 30,62
>> bash#
>>
>> In this case offlining CPU30 does not seem to require an empty vector
>> slot.  It seems that we only need to change the affinity mask of irq89.
>> Your check_vectors() assumed that each irq on the offlining cpu
>> requires a new vector slot.
>
> Rui,
>
> The smp_affinity_list only indicates a preferred destination of the IRQ,
> not the *actual* location of the CPU.  So the IRQ is on one of cpu 30 or
> 62 but not both simultaneously.

It depends on how the IOAPIC (or MSI/MSI-x) is configured.  An IRQ can be simultaneously broadcast to all destination CPUs (Fixed Mode) or delivered to the CPU with the lowest priority task (Lowest Priority Mode).  It's programmed in the Delivery Mode bits of the IOAPIC's IO Redirection Table registers, or the Message Data Register in the case of MSI/MSI-x.

> If the case is that 62 is being brought down, then the smp_affinity mask
> will be updated to reflect only cpu 30 (and vice versa).

Yes the affinity mask should be updated.  But if it was destined to more than one CPU, your "this_counter" does not seem to count the right numbers.  Are you saying that the smp_affinity mask is broken on Linux so that there's no way to configure an IRQ to target more than one CPU?

Thanks
Rui
Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
On Wed, Dec 18, 2013 at 11:29 AM, Prarit Bhargava wrote:
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
>
> When a cpu is downed on a system, the irqs on the cpu are assigned to other
> cpus. It is possible, however, that when a cpu is downed there aren't enough
> free vectors on the remaining cpus to account for the vectors from the cpu
> that is being downed.
>
> This results in an interesting "overflow" condition where irqs are "assigned"
> to a CPU but are not handled.

Looks good to me.

Acked-by: Tony Luck
Re: [PATCH] x86: Add check for number of available vectors before CPU down
On 12/03/2013 09:48 PM, rui wang wrote:
> On 11/20/13, Prarit Bhargava wrote:
> Have you considered the case when an IRQ is destined to more than one CPU?
> e.g.:
>
> bash# cat /proc/irq/89/smp_affinity_list
> 30,62
> bash#
>
> In this case offlining CPU30 does not seem to require an empty vector
> slot. It seems that we only need to change the affinity mask of irq89.
> Your check_vectors() assumed that each irq on the offlining cpu
> requires a new vector slot.

Rui,

The smp_affinity_list only indicates a preferred destination for the IRQ, not the *actual* CPU handling it. So the IRQ is on one of cpu 30 or 62, but not both simultaneously.

If the case is that 62 is being brought down, then the smp_affinity mask will be updated to reflect only cpu 30 (and vice versa).

P.
Re: [PATCH] x86: Add check for number of available vectors before CPU down
On 12/03/2013 06:42 PM, Yu, Fenghua wrote:
>> -----Original Message-----
>> From: Prarit Bhargava [mailto:pra...@redhat.com]
>>
>> Second try at this ...
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
>>
>> When a cpu is downed on a system, the irqs on the cpu are assigned to
>> other cpus. It is possible, however, that when a cpu is downed there
>> aren't enough free vectors on the remaining cpus to account for the
>> vectors from the cpu that is being downed.
>>
>> This results in an interesting "overflow" condition where irqs are
>> "assigned" to a CPU but are not handled.
>>
>> For example, when downing cpus on a 1-64 logical processor system:
>>
>> +	if (per_cpu(vector_irq, cpu)[vector] < 0)
>> +		count++;
>
> But later on fixup_irqs will set some of the vector_irq entries to -1 on this
> to-be-disabled cpu. That will release vectors assigned to this cpu. So
> checking vector_irq at this point, before fixup_irqs, doesn't make sense, right?

Fenghua,

IIUC the purpose of fixup_irqs() is to migrate irqs from the to-be-disabled cpu to other cpus by using affinity, etc. The to-be-disabled cpu's irqs are set to -1, however, only after they are migrated off to an enabled cpu. The check I'm introducing is well before that stage, in an effort to determine whether or not the system has "room" for the irqs on the remaining cpus.

So I think that the check is in the right spot. We should do it early. I suppose an argument could be made to do the check in fixup_irqs(); however, my feeling is that by then it is very late in the process of downing the cpu.

I noticed a subtle bug in my patch that I will send with a [v2]: I'm not taking per-cpu interrupts into account in the check_vectors() function.

P.

> Thanks.
>
> -Fenghua
Re: [PATCH] x86: Add check for number of available vectors before CPU down
On 11/20/13, Prarit Bhargava wrote: > Second try at this ... > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791 > > When a cpu is downed on a system, the irqs on the cpu are assigned to other > cpus. It is possible, however, that when a cpu is downed there aren't > enough free vectors on the remaining cpus to account for the vectors from > the cpu that is being downed. > > This results in an interesting "overflow" condition where irqs are > "assigned" to a CPU but are not handled. > > For example, when downing cpus on a 1-64 logical processor system: > > > [ 232.021745] smpboot: CPU 61 is now offline > [ 238.480275] smpboot: CPU 62 is now offline > [ 245.991080] [ cut here ] > [ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 > dev_watchdog+0x246/0x250() > [ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out > [ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support > sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp > mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas > scsi_transport_sas > [ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14 > [ 246.044451] Hardware name: Intel Corporation S4600LH > ../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 > [ 246.057371] 0009 88081fa03d40 8164fbf6 > 88081fa0ee48 > [ 246.065728] 88081fa03d90 88081fa03d80 81054ecc > 88081fa13040 > [ 246.074073] 88200cce 0040 > > [ 246.082430] Call Trace: > [ 246.085174][] dump_stack+0x46/0x58 > [ 246.091633] [] warn_slowpath_common+0x8c/0xc0 > [ 246.098352] [] warn_slowpath_fmt+0x46/0x50 > [ 246.104786] [] dev_watchdog+0x246/0x250 > [ 246.110923] [] ? > dev_deactivate_queue.constprop.31+0x80/0x80 > [ 246.119097] [] call_timer_fn+0x3a/0x110 > [ 246.125224] [] ? update_process_times+0x6f/0x80 > [ 246.132137] [] ? 
> dev_deactivate_queue.constprop.31+0x80/0x80 > [ 246.140308] [] run_timer_softirq+0x1f0/0x2a0 > [ 246.146933] [] __do_softirq+0xe0/0x220 > [ 246.152976] [] call_softirq+0x1c/0x30 > [ 246.158920] [] do_softirq+0x55/0x90 > [ 246.164670] [] irq_exit+0xa5/0xb0 > [ 246.170227] [] smp_apic_timer_interrupt+0x4a/0x60 > [ 246.177324] [] apic_timer_interrupt+0x6a/0x70 > [ 246.184041][] ? cpuidle_enter_state+0x5b/0xe0 > [ 246.191559] [] ? cpuidle_enter_state+0x57/0xe0 > [ 246.198374] [] cpuidle_idle_call+0xbd/0x200 > [ 246.204900] [] arch_cpu_idle+0xe/0x30 > [ 246.210846] [] cpu_startup_entry+0xd0/0x250 > [ 246.217371] [] rest_init+0x77/0x80 > [ 246.223028] [] start_kernel+0x3ee/0x3fb > [ 246.229165] [] ? repair_env_string+0x5e/0x5e > [ 246.235787] [] x86_64_start_reservations+0x2a/0x2c > [ 246.242990] [] x86_64_start_kernel+0xf8/0xfc > [ 246.249610] ---[ end trace fb74fdef54d79039 ]--- > [ 246.254807] ixgbe :c2:00.0 p786p1: initiating reset due to tx > timeout > [ 246.262489] ixgbe :c2:00.0 p786p1: Reset adapter > Last login: Mon Nov 11 08:35:14 from 10.18.17.119 > [root@(none) ~]# [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 > [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow > Control: RX/TX > [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 > [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow > Control: RX/TX > > (last lines keep repeating. ixgbe driver is dead until module reload.) > > If the downed cpu has more vectors than are free on the remaining cpus on > the > system, it is possible that some vectors are "orphaned" even though they > are > assigned to a cpu. In this case, since the ixgbe driver had a watchdog, > the > watchdog fired and notified that something was wrong. > > This patch adds a function, check_vectors(), to compare the number of > vectors > on the CPU going down and compares it to the number of vectors available on > the system. 
If there aren't enough vectors for the CPU to go down, an > error is returned and propagated back to userspace. > > Signed-off-by: Prarit Bhargava > Cc: x...@kernel.org > --- > arch/x86/include/asm/irq.h |1 + > arch/x86/kernel/irq.c | 33 + > arch/x86/kernel/smpboot.c |6 ++ > 3 files changed, 40 insertions(+) > > diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h > index 0ea10f27..dfd7372 100644 > --- a/arch/x86/include/asm/irq.h > +++ b/arch/x86/include/asm/irq.h > @@ -25,6 +25,7 @@ extern void irq_ctx_init(int cpu); > > #ifdef CONFIG_HOTPLUG_CPU > #include > +extern int check_vectors(void); > extern void fixup_irqs(void); > extern void irq_force_complete_move(int); > #endif > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > index 22d0687..ea5fa19 100644 > --- a/arch/x86/kernel/irq.c > +++ b/arch/x86/kernel/irq.c > @@ -262,6 +262,39 @@ __visible void
RE: [PATCH] x86: Add check for number of available vectors before CPU down
> -----Original Message-----
> From: Prarit Bhargava [mailto:pra...@redhat.com]
>
> Second try at this ...
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
>
> When a cpu is downed on a system, the irqs on the cpu are assigned to
> other cpus. It is possible, however, that when a cpu is downed there
> aren't enough free vectors on the remaining cpus to account for the
> vectors from the cpu that is being downed.
>
> This results in an interesting "overflow" condition where irqs are
> "assigned" to a CPU but are not handled.
>
> For example, when downing cpus on a 1-64 logical processor system:
>
> +	if (per_cpu(vector_irq, cpu)[vector] < 0)
> +		count++;

But later on fixup_irqs will set some of the vector_irq entries to -1 on this to-be-disabled cpu. That will release vectors assigned to this cpu. So checking vector_irq at this point, before fixup_irqs, doesn't make sense, right?

Thanks.

-Fenghua