Re: [Xen-devel] [PATCH v2] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-08-28 Thread Boris Ostrovsky
On 08/28/2017 11:41 AM, Jan Beulich wrote:
 On 28.08.17 at 16:35,  wrote:
>> On 08/28/2017 03:38 AM, Jan Beulich wrote:
> And finally I continue to be not really happy about the change as
> a whole. Despite what was discussed on v1, I'm concerned about the
> effects of this on hosts _not_ suffering from vector shortage.
> Could you live with the new behavior requiring a command line
> option to enable?
 I can add something like 'apic_share_vectors', defaulting to true,
 although it will not be useful in the case of hotplug. Defaulting to false?
>>> Along the lines of the above plus our desire to limit the number
>>> of top level options, how about "apic=isolate-vectors"?
>>>
>>> Also I don't understand your reference to hotplug, nor why you
>>> suggest two opposite default values.
>> Not two, just one --- not share vectors by default.
>>
>> As for hotplug, I was thinking of a case where a system is successfully
>> booted with shared vectors but then a device is added and we fail to
>> find enough free vectors. So the administrator would need to know in
>> advance whether a new card might be coming in.
>>
>> When defaulting to false (as in apic_share_vectors=false), if the
>> administrator decides to set it to true, then he would in some sense be
>> explicitly agreeing never to plug anything in (or at least to tolerate
>> such a failure).
> Ah, I see. But imo the default ought to be current behavior.
>
>>> But finally, you agreeing to a command line option here makes
>>> me come back to an earlier suggestion: Didn't we agree that
>>> "x2apic_phys" would be sufficient to eliminate the issue? In which
>>> case no patch would be needed at all.
>> x2apic_phys means that we never share vectors. With the 'isolate-vectors'
>> option we are still able to share them if the mask is explicitly specified.
> Well, aiui your primary goal is to address the vector shortage. For
> that you don't need the new option; you can get away with the
> existing one.


I don't have any numbers to prove (or disprove) that not sharing vectors
during boot but possibly sharing them later improves performance, so yes,
x2apic_phys is a possible solution, especially if with this (modified)
patch we'd default to the original cluster mode behavior, as you are suggesting.
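
For context, the two alternatives weighed here would be selected on the
hypervisor command line roughly as below; x2apic_phys is an existing
boolean option, while apic=isolate-vectors is only the spelling proposed
earlier in this thread and does not exist:

  x2apic_phys            # physical destination mode: vectors are never shared
  apic=isolate-vectors   # proposed: cluster mode, share only on explicit affinity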

-boris



Re: [Xen-devel] [PATCH v2] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-08-28 Thread Jan Beulich
>>> On 28.08.17 at 16:35,  wrote:
> On 08/28/2017 03:38 AM, Jan Beulich wrote:
>>
 And finally I continue to be not really happy about the change as
 a whole. Despite what was discussed on v1, I'm concerned about the
 effects of this on hosts _not_ suffering from vector shortage.
 Could you live with the new behavior requiring a command line
 option to enable?
>>> I can add something like 'apic_share_vectors', defaulting to true,
>>> although it will not be useful in the case of hotplug. Defaulting to false?
>> Along the lines of the above plus our desire to limit the number
>> of top level options, how about "apic=isolate-vectors"?
>>
>> Also I don't understand your reference to hotplug, nor why you
>> suggest two opposite default values.
> 
> Not two, just one --- not share vectors by default.
> 
> As for hotplug, I was thinking of a case where a system is successfully
> booted with shared vectors but then a device is added and we fail to
> find enough free vectors. So the administrator would need to know in
> advance whether a new card might be coming in.
> 
> When defaulting to false (as in apic_share_vectors=false), if the
> administrator decides to set it to true, then he would in some sense be
> explicitly agreeing never to plug anything in (or at least to tolerate
> such a failure).

Ah, I see. But imo the default ought to be current behavior.

>> But finally, you agreeing to a command line option here makes
>> me come back to an earlier suggestion: Didn't we agree that
>> "x2apic_phys" would be sufficient to eliminate the issue? In which
>> case no patch would be needed at all.
> 
> x2apic_phys means that we never share vectors. With the 'isolate-vectors'
> option we are still able to share them if the mask is explicitly specified.

Well, aiui your primary goal is to address the vector shortage. For
that you don't need the new option; you can get away with the
existing one.

Jan




Re: [Xen-devel] [PATCH v2] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-08-28 Thread Boris Ostrovsky
On 08/28/2017 03:38 AM, Jan Beulich wrote:
>
>>> And finally I continue to be not really happy about the change as
>>> a whole. Despite what was discussed on v1, I'm concerned about the
>>> effects of this on hosts _not_ suffering from vector shortage.
>>> Could you live with the new behavior requiring a command line
>>> option to enable?
>> I can add something like 'apic_share_vectors', defaulting to true,
>> although it will not be useful in the case of hotplug. Defaulting to false?
> Along the lines of the above plus our desire to limit the number
> of top level options, how about "apic=isolate-vectors"?
>
> Also I don't understand your reference to hotplug, nor why you
> suggest two opposite default values.

Not two, just one --- not share vectors by default.

As for hotplug, I was thinking of a case where a system is successfully
booted with shared vectors but then a device is added and we fail to
find enough free vectors. So the administrator would need to know in
advance whether a new card might be coming in.

When defaulting to false (as in apic_share_vectors=false), if the
administrator decides to set it to true, then he would in some sense be
explicitly agreeing never to plug anything in (or at least to tolerate
such a failure).

>
> But finally, you agreeing to a command line option here makes
> me come back to an earlier suggestion: Didn't we agree that
> "x2apic_phys" would be sufficient to eliminate the issue? In which
> case no patch would be needed at all.

x2apic_phys means that we never share vectors. With the 'isolate-vectors'
option we are still able to share them if the mask is explicitly specified.
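
To spell out that difference, here is a rough sketch (illustrative only,
not code from the patch) of the allocation mask each configuration would
end up handing to __assign_irq_vector(); 'phys' stands for booting with
x2apic_phys, and 'mask' is the cpumask passed down by the caller (NULL
when no explicit affinity was requested):

  static const cpumask_t *alloc_mask_sketch(int cpu, const cpumask_t *mask,
                                            bool phys)
  {
      if ( phys )
          return cpumask_of(cpu);               /* never share vectors */
      return mask ? per_cpu(cluster_cpus, cpu)  /* share within the cluster */
                  : cpumask_of(cpu);            /* no explicit mask: one CPU */
  }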


-boris




Re: [Xen-devel] [PATCH v2] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-08-28 Thread Jan Beulich
>>> On 25.08.17 at 18:00,  wrote:
> On 08/25/2017 10:56 AM, Jan Beulich wrote:
> On 08.08.17 at 17:59,  wrote:
>>> --- a/xen/arch/x86/genapic/delivery.c
>>> +++ b/xen/arch/x86/genapic/delivery.c
>>> @@ -30,7 +30,8 @@ void __init clustered_apic_check_flat(void)
>>> printk("Enabling APIC mode:  Flat.  Using %d I/O APICs\n", nr_ioapics);
>>>  }
>>>  
>>> -const cpumask_t *vector_allocation_cpumask_flat(int cpu)
>>> +const cpumask_t *vector_allocation_cpumask_flat(int cpu,
>>> +const cpumask_t *cpumask)
>>>  {
>>> return &cpu_online_map;
>>>  } 
>>> @@ -58,7 +59,8 @@ void __init clustered_apic_check_phys(void)
>>> printk("Enabling APIC mode:  Phys.  Using %d I/O APICs\n", nr_ioapics);
>>>  }
>>>  
>>> -const cpumask_t *vector_allocation_cpumask_phys(int cpu)
>>> +const cpumask_t *vector_allocation_cpumask_phys(int cpu,
>>> +const cpumask_t *cpumask)
>>>  {
>>> return cpumask_of(cpu);
>>>  }
>>> --- a/xen/arch/x86/genapic/x2apic.c
>>> +++ b/xen/arch/x86/genapic/x2apic.c
>>> @@ -72,8 +72,12 @@ static void __init clustered_apic_check_x2apic(void)
>>>  {
>>>  }
>>>  
>>> -static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu)
>>> +static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu,
>>> +const cpumask_t *cpumask)
>>>  {
>>> +if ( !cpumask )
>>> +return cpumask_of(cpu);
>>> +
>>>  return per_cpu(cluster_cpus, cpu);
>>>  }
>> It is a strange addition you're making here: None of the three
>> implementations care about the passed in mask. Why is this then
>> not a bool with a suitable name?
> 
> I can pass in a bool. Say, 'bool share_vectors'.

How about the opposite, 'isolate_vectors'? To me that would seem
to fit better with the intention of the change.
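
A minimal sketch of what that variant might look like (hypothetical, not
code from the posted patch; it also folds in the single conditional
return asked for further down):

  static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu,
                                                                    bool isolate_vectors)
  {
      return isolate_vectors ? cpumask_of(cpu) : per_cpu(cluster_cpus, cpu);
  }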

>> Further I'd prefer if you made it a single return statement here,
>> using a conditional expression.
>>
>> And finally I continue to be not really happy about the change as
>> a whole. Despite what was discussed on v1, I'm concerned about the
>> effects of this on hosts _not_ suffering from vector shortage.
>> Could you live with the new behavior requiring a command line
>> option to enable?
> 
> I can add something like 'apic_share_vectors', defaulting to true,
> although it will not be useful in the case of hotplug. Defaulting to false?

Along the lines of the above plus our desire to limit the number
of top level options, how about "apic=isolate-vectors"?

Also I don't understand your reference to hotplug, nor why you
suggest two opposite default values.

But finally, you agreeing to a command line option here makes
me come back to an earlier suggestion: Didn't we agree that
"x2apic_phys" would be sufficient to eliminate the issue? In which
case no patch would be needed at all.

Jan




Re: [Xen-devel] [PATCH v2] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-08-25 Thread Boris Ostrovsky
On 08/25/2017 10:56 AM, Jan Beulich wrote:
 On 08.08.17 at 17:59,  wrote:
>> --- a/xen/arch/x86/genapic/delivery.c
>> +++ b/xen/arch/x86/genapic/delivery.c
>> @@ -30,7 +30,8 @@ void __init clustered_apic_check_flat(void)
>>  printk("Enabling APIC mode:  Flat.  Using %d I/O APICs\n", nr_ioapics);
>>  }
>>  
>> -const cpumask_t *vector_allocation_cpumask_flat(int cpu)
>> +const cpumask_t *vector_allocation_cpumask_flat(int cpu,
>> +const cpumask_t *cpumask)
>>  {
>>  return &cpu_online_map;
>>  } 
>> @@ -58,7 +59,8 @@ void __init clustered_apic_check_phys(void)
>>  printk("Enabling APIC mode:  Phys.  Using %d I/O APICs\n", nr_ioapics);
>>  }
>>  
>> -const cpumask_t *vector_allocation_cpumask_phys(int cpu)
>> +const cpumask_t *vector_allocation_cpumask_phys(int cpu,
>> +const cpumask_t *cpumask)
>>  {
>>  return cpumask_of(cpu);
>>  }
>> --- a/xen/arch/x86/genapic/x2apic.c
>> +++ b/xen/arch/x86/genapic/x2apic.c
>> @@ -72,8 +72,12 @@ static void __init clustered_apic_check_x2apic(void)
>>  {
>>  }
>>  
>> -static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu)
>> +static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu,
>> +const cpumask_t *cpumask)
>>  {
>> +if ( !cpumask )
>> +return cpumask_of(cpu);
>> +
>>  return per_cpu(cluster_cpus, cpu);
>>  }
> It is a strange addition you're making here: None of the three
> implementations care about the passed in mask. Why is this then
> not a bool with a suitable name?

I can pass in a bool. Say, 'bool share_vectors'.

>
> Additionally, shouldn't vector_allocation_cpumask_flat() behave
> similarly to vector_allocation_cpumask_x2apic_cluster() then?

Yes, I should probably do that as well.

>
> Further I'd prefer if you made it a single return statement here,
> using a conditional expression.
>
> And finally I continue to be not really happy about the change as
> a whole. Despite what was discussed on v1, I'm concerned about the
> effects of this on hosts _not_ suffering from vector shortage.
> Could you live with the new behavior requiring a command line
> option to enable?

I can add something like 'apic_share_vectors', defaulting to true,
although it will not be useful in the case of hotplug. Defaulting to false?

-boris




Re: [Xen-devel] [PATCH v2] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-08-25 Thread Jan Beulich
>>> On 08.08.17 at 17:59,  wrote:
> --- a/xen/arch/x86/genapic/delivery.c
> +++ b/xen/arch/x86/genapic/delivery.c
> @@ -30,7 +30,8 @@ void __init clustered_apic_check_flat(void)
>   printk("Enabling APIC mode:  Flat.  Using %d I/O APICs\n", nr_ioapics);
>  }
>  
> -const cpumask_t *vector_allocation_cpumask_flat(int cpu)
> +const cpumask_t *vector_allocation_cpumask_flat(int cpu,
> +const cpumask_t *cpumask)
>  {
>   return &cpu_online_map;
>  } 
> @@ -58,7 +59,8 @@ void __init clustered_apic_check_phys(void)
>   printk("Enabling APIC mode:  Phys.  Using %d I/O APICs\n", nr_ioapics);
>  }
>  
> -const cpumask_t *vector_allocation_cpumask_phys(int cpu)
> +const cpumask_t *vector_allocation_cpumask_phys(int cpu,
> +const cpumask_t *cpumask)
>  {
>   return cpumask_of(cpu);
>  }
> --- a/xen/arch/x86/genapic/x2apic.c
> +++ b/xen/arch/x86/genapic/x2apic.c
> @@ -72,8 +72,12 @@ static void __init clustered_apic_check_x2apic(void)
>  {
>  }
>  
> -static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu)
> +static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu,
> +const cpumask_t *cpumask)
>  {
> +if ( !cpumask )
> +return cpumask_of(cpu);
> +
>  return per_cpu(cluster_cpus, cpu);
>  }

It is a strange addition you're making here: None of the three
implementations care about the passed in mask. Why is this then
not a bool with a suitable name?

Additionally, shouldn't vector_allocation_cpumask_flat() behave
similarly to vector_allocation_cpumask_x2apic_cluster() then?

Further I'd prefer if you made it a single return statement here,
using a conditional expression.

And finally I continue to be not really happy about the change as
a whole. Despite what was discussed on v1, I'm concerned about the
effects of this on hosts _not_ suffering from vector shortage.
Could you live with the new behavior requiring a command line
option to enable?

Jan




[Xen-devel] [PATCH v2] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-08-08 Thread Boris Ostrovsky
We have a limited number (slightly under NR_DYNAMIC_VECTORS=192) of IRQ
vectors available to each processor. Currently, when x2apic cluster mode
is used (which is the default), each vector is shared among all processors
in the cluster. With many IRQs (as is the case on systems with multiple
SR-IOV cards) and few clusters (e.g. a single socket), there is a good
chance that we will run out of vectors.
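
As a rough illustration of the arithmetic (the figures are made up to show
the scaling, they do not describe any particular machine):

  single socket, 16 CPUs, all in one x2APIC cluster
  cluster mode: every vector is reserved on all 16 cluster members, so the
                whole cluster shares a single budget of ~192 dynamic vectors
  4 SR-IOV cards x 64 VFs x 1 interrupt each = 256 vectors needed
  256 > ~192, so allocation fails even though per-CPU capacity
  (16 x ~192 = ~3072 vector slots) is nowhere near exhausted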

This patch tries to decrease vector sharing between processors by
assigning the vector to a single processor if the assignment request (via
__assign_irq_vector()) comes without explicitly specifying which
processors are expected to share the interrupt. This typically happens
at boot time (or possibly during PCI hotplug) when create_irq(NUMA_NO_NODE)
is called. When __assign_irq_vector() is called from set_desc_affinity(),
which provides a sharing mask, vector sharing continues to be performed
as before.
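
In other words, the two paths end up behaving roughly like this
(simplified; see the diff below for the actual changes):

  create_irq(NUMA_NO_NODE)                 /* boot time / PCI hotplug */
    -> assign_irq_vector(irq, NULL)
    -> vector_allocation_cpumask(cpu, NULL) -> cpumask_of(cpu)
       (vector placed on a single CPU)

  set_desc_affinity(desc, mask)            /* explicit affinity request */
    -> __assign_irq_vector(irq, desc, mask)
    -> vector_allocation_cpumask(cpu, mask) -> per_cpu(cluster_cpus, cpu)
       (vector shared within the cluster, as before)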

This patch to some extent mirrors Linux commit d872818dbbee
("x86/apic/x2apic: Use multiple cluster members for the irq destination
only with the explicit affinity").

Note that this change still does not guarantee that we never run out of
vectors. For example, on a single-core system we are effectively back to
the single cluster/socket case of the original code.

Signed-off-by: Boris Ostrovsky 
---
Changes in v2:
* Instead of relying on the mask being set to TARGET_CPUS as an indication that
the caller doesn't care about how vectors are shared, allow passing a NULL mask
to __assign_irq_vector() (and then to vector_allocation_cpumask()).


 xen/arch/x86/genapic/delivery.c  | 6 --
 xen/arch/x86/genapic/x2apic.c| 6 +-
 xen/arch/x86/irq.c   | 9 +
 xen/include/asm-x86/genapic.h| 9 ++---
 xen/include/asm-x86/mach-generic/mach_apic.h | 3 ++-
 5 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/genapic/delivery.c b/xen/arch/x86/genapic/delivery.c
index ced92a1..3cb65d2 100644
--- a/xen/arch/x86/genapic/delivery.c
+++ b/xen/arch/x86/genapic/delivery.c
@@ -30,7 +30,8 @@ void __init clustered_apic_check_flat(void)
printk("Enabling APIC mode:  Flat.  Using %d I/O APICs\n", nr_ioapics);
 }
 
-const cpumask_t *vector_allocation_cpumask_flat(int cpu)
+const cpumask_t *vector_allocation_cpumask_flat(int cpu,
+const cpumask_t *cpumask)
 {
return &cpu_online_map;
 } 
@@ -58,7 +59,8 @@ void __init clustered_apic_check_phys(void)
printk("Enabling APIC mode:  Phys.  Using %d I/O APICs\n", nr_ioapics);
 }
 
-const cpumask_t *vector_allocation_cpumask_phys(int cpu)
+const cpumask_t *vector_allocation_cpumask_phys(int cpu,
+const cpumask_t *cpumask)
 {
return cpumask_of(cpu);
 }
diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c
index 5fffb31..b12d529 100644
--- a/xen/arch/x86/genapic/x2apic.c
+++ b/xen/arch/x86/genapic/x2apic.c
@@ -72,8 +72,12 @@ static void __init clustered_apic_check_x2apic(void)
 {
 }
 
-static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu)
+static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu,
+const cpumask_t *cpumask)
 {
+if ( !cpumask )
+return cpumask_of(cpu);
+
 return per_cpu(cluster_cpus, cpu);
 }
 
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 57e6c18..a0385a3 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -450,11 +450,12 @@ static int __assign_irq_vector(
 static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
 int cpu, err, old_vector;
 cpumask_t tmp_mask;
+const cpumask_t *initial_mask = mask ?: TARGET_CPUS;
 vmask_t *irq_used_vectors = NULL;
 
 old_vector = irq_to_vector(irq);
 if (old_vector > 0) {
-cpumask_and(&tmp_mask, mask, &cpu_online_map);
+cpumask_and(&tmp_mask, initial_mask, &cpu_online_map);
 if (cpumask_intersects(&tmp_mask, desc->arch.cpu_mask)) {
 desc->arch.vector = old_vector;
 return 0;
@@ -476,7 +477,7 @@ static int __assign_irq_vector(
 else
 irq_used_vectors = irq_get_used_vector_mask(irq);
 
-for_each_cpu(cpu, mask) {
+for_each_cpu(cpu, initial_mask) {
 int new_cpu;
 int vector, offset;
 
@@ -484,7 +485,7 @@ static int __assign_irq_vector(
 if (!cpu_online(cpu))
 continue;
 
-cpumask_and(&tmp_mask, vector_allocation_cpumask(cpu),
+cpumask_and(&tmp_mask, vector_allocation_cpumask(cpu, mask),
 &cpu_online_map);
 
 vector = current_vector;
@@ -550,7 +551,7 @@ int assign_irq_vector(int irq, const cpumask_t *mask)
 BUG_ON(irq >= nr_irqs || irq <0);
 
spin_lock_irqsave(&vector_lock, flags);
-ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
+ret = __assign_irq_vector(irq, desc, mask);
 if (!ret) {
 ret = desc->arch.vector;
 cpumask_copy(desc->affinity, desc->arch.cpu_mask);
diff --git a/xen/include/asm-x86/genapic.h