On 10/25/2016 06:09 PM, Marc Zyngier wrote:
> On 15/10/16 08:23, Cheng Chao wrote:
>> On 10/15/2016 01:33 AM, Marc Zyngier wrote:
>>>> on 10/13/2016 11:31 PM, Marc Zyngier wrote:
>>>>> On Thu, 13 Oct 2016 18:57:14 +0800
>>>>> Cheng Chao <cs.os.ker...@gmail.com> wrote:
>>>>>
>>>>>> GIC can distribute an interrupt to more than one cpu,
>>>>>> but now, gic_set_affinity sets only one cpu to handle interrupt.
>>>>>
>>>>> What makes you think this is a good idea? What purpose does it serves?
>>>>> I can only see drawbacks to this: You're waking up more than one CPU,
>>>>> wasting power, adding jitter and clobbering the cache.
>>>>>
>>>>> I assume you see a benefit to that approach, so can you please spell it
>>>>> out?
>>>>>
>>>>
>>>> OK, you are right, but performance is another point that we should
>>>> consider.
>>>>
>>>> We use an E1 device to transmit/receive video streams. We find that the
>>>> E1's interrupt is handled on only one CPU, so that CPU's usage is almost
>>>> 100%, while the other CPUs have a much lower load, and the performance
>>>> is not good. The CPU is 4-core.
>>>
>>> It looks to me like you're barking up the wrong tree. We have
>>> NAPI-enabled network drivers for this exact reason, and adding more
>>> interrupts to an already overloaded system doesn't strike me as going in
>>> the right direction. May I suggest that you look at integrating NAPI
>>> into your E1 driver?
>>>
>>
>> Great, NAPI may be a good option. I can try to use NAPI. Thank you.
>>
>> On the other hand, gic_set_affinity sets only one CPU to handle an
>> interrupt, which really makes me a little confused. Why doesn't the GIC
>> driver, like the others (MPIC, APIC, etc.), support multiple CPUs
>> handling an interrupt?
>>
>> It seems that the GIC driver constrains too much.
> 
> There are several drawbacks to this:
> - Cache impacts and power efficiency, as already mentioned
> - Not virtualizable (you cannot efficiently implement this in a 
>   hypervisor that emulates a GICv2 distributor)
> - Doesn't scale (you cannot go beyond 8 CPUs)
> 
> I strongly suggest you give NAPI a go, and only then consider
> delivering interrupts to multiple CPUs, because multiple CPU
> delivery is not future proof.
> 

Thanks again, the E1 driver with NAPI is on the right track.

>> I think it is more reasonable to let the user decide what to do.
>>
>> If I care about power etc., then I echo only a single CPU to
>> /proc/irq/xx/smp_affinity, but if I expect more than one CPU to handle
>> one particular interrupt, I can echo the CPUs I expect to
>> /proc/irq/xx/smp_affinity.
> 
> If that's what you really want, a better patch may be something like this:
> 

I hope the GIC driver can be more flexible, so that gic_set_affinity() is not
constrained to a single CPU; after all, the GIC supports distributing an
interrupt to more than one CPU.
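For reference, the user-space side of this is just a bitmask write, one bit
per CPU. A minimal sketch of building such a mask (the IRQ number 42 below is
a made-up example, not one from this thread):

```shell
# Build an smp_affinity mask with one bit per CPU: CPUs 0 and 2 -> 0b0101
mask=$(( (1 << 0) | (1 << 2) ))
printf '%x\n' "$mask"   # prints 5, the hex form smp_affinity expects

# Applying it needs root and a real IRQ (42 is hypothetical):
# echo 5 > /proc/irq/42/smp_affinity
```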


> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index d6c404b..b301d72 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -326,20 +326,25 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
>  {
>       void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
>       unsigned int cpu, shift = (gic_irq(d) % 4) * 8;
> -     u32 val, mask, bit;
> -     unsigned long flags;
> +     u32 val, mask, bit = 0;
> +     unsigned long flags, aff = 0;
>  
> -     if (!force)
> -             cpu = cpumask_any_and(mask_val, cpu_online_mask);
> -     else
> -             cpu = cpumask_first(mask_val);
> +     for_each_cpu(cpu, mask_val) {
> +             if (force) {
> +                     aff = 1 << cpu;
> +                     break;
> +             }
> +
> +             aff |= cpu_online(cpu) << cpu;
> +     }
>  
> -     if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
> +     if (!aff)
>               return -EINVAL;
>  
>       gic_lock_irqsave(flags);
>       mask = 0xff << shift;
> -     bit = gic_cpu_map[cpu] << shift;
> +     for_each_set_bit(cpu, &aff, nr_cpu_ids)
> +             bit |= gic_cpu_map[cpu] << shift;
>       val = readl_relaxed(reg) & ~mask;
>       writel_relaxed(val | bit, reg);
>       gic_unlock_irqrestore(flags);
> 

This patch is better than before; I have added a small range check.

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 58e5b4e..b3d0f07 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -326,20 +326,28 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
 {
        void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
        unsigned int cpu, shift = (gic_irq(d) % 4) * 8;
-       u32 val, mask, bit;
-       unsigned long flags;
+       u32 val, mask, bit = 0;
+       unsigned long flags, aff = 0;

-       if (!force)
-               cpu = cpumask_any_and(mask_val, cpu_online_mask);
-       else
-               cpu = cpumask_first(mask_val);
+       for_each_cpu(cpu, mask_val) {
+               if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
+                       break;
+
+               if (force) {
+                       aff = 1 << cpu;
+                       break;
+               }
+
+               aff |= cpu_online(cpu) << cpu;
+       }

-       if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
+       if (!aff)
                return -EINVAL;

        gic_lock_irqsave(flags);
        mask = 0xff << shift;
-       bit = gic_cpu_map[cpu] << shift;
+       for_each_set_bit(cpu, &aff, nr_cpu_ids)
+               bit |= gic_cpu_map[cpu] << shift;
        val = readl_relaxed(reg) & ~mask;
        writel_relaxed(val | bit, reg);
        gic_unlock_irqrestore(flags);
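For anyone following the register arithmetic in these patches: GICD_ITARGETSR
packs one target byte per interrupt, four interrupts per 32-bit word, which is
where the `(gic_irq(d) % 4) * 8` shift comes from. A small sketch of how
shift, mask, and bit combine (plain Python, with an assumed identity
gic_cpu_map where CPU n maps to bit n):

```python
NR_GIC_CPU_IF = 8  # a GICv2 distributor addresses at most 8 CPU interfaces

def target_bits(irq, cpus, gic_cpu_map):
    # Each GIC_DIST_TARGET word covers 4 interrupts, one byte each.
    shift = (irq % 4) * 8
    mask = 0xff << shift
    bit = 0
    for cpu in cpus:  # OR together one target bit per requested CPU
        bit |= gic_cpu_map[cpu] << shift
    return mask, bit

# IRQ 34 targeting CPUs 0 and 2, with an identity CPU map:
cpu_map = [1 << c for c in range(NR_GIC_CPU_IF)]
mask, bit = target_bits(34, [0, 2], cpu_map)
print(hex(mask), hex(bit))  # 0xff0000 0x50000
```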

> Thanks,
> 
>       M.
> 

Thanks,
       Cheng
   
