Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-25 Thread Peter Zijlstra
On Thu, 2008-07-24 at 14:18 +0200, Sebastien Dugue wrote:
 On Thu, 24 Jul 2008 21:11:34 +1000 Nick Piggin [EMAIL PROTECTED] wrote:
 
  On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
   From: Sebastien Dugue [EMAIL PROTECTED]
   Date: Tue, 22 Jul 2008 11:56:41 +0200
   Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
   lockless
  
   The radix tree used by interrupt controllers for their irq reverse
   mapping (currently only the XICS found on pSeries) has a complex locking
   scheme dating back to before the advent of the concurrent radix tree on
   preempt-rt.
  
   Take advantage of the concurrent radix tree and of the fact that the items
   of the tree are pointers to elements of a static array (irq_map), which can
   never go away under us, to simplify the locking.
  
   Concurrency between readers and writers is handled by the intrinsic
   properties of the concurrent radix tree. Concurrency between the tree
   initialization, which is done asynchronously, and reader or writer accesses
   is handled via an atomic variable (revmap_trees_allocated) that is set once
   the tree has been initialized and checked before any reader or writer
   access, just as we used to check for tree.gfp_mask != 0 before.
  
  Hmm, RCU radix tree is in mainline too for quite a while. I thought
  Ben had already converted this code over ages ago...
 
   Mainline does not have the concurrent radix tree which this patch
 is based on, but maybe it's overkill and the RCU radix tree is enough.
 Not sure, will have to think about it a bit more.

Should be. The model of the concurrent radix tree can be mapped to
spinlock + rcu radix tree.

So instead of:

 +   DEFINE_RADIX_TREE_CONTEXT(ctx, tree);
 +   radix_tree_lock(&ctx);
 +   radix_tree_insert(ctx.tree, hwirq, &irq_map[virq]);
 +   radix_tree_unlock(&ctx);


you then write:

spin_lock(&host->revmap_data.tree_lock);
radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
spin_unlock(&host->revmap_data.tree_lock);


The only advantage of the concurrent radix tree over this model is that
it can potentially do multiple modification operations at the same time.
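
A rough user-space sketch of the model described above (writers serialized by
one lock, lookups taking no lock) might look like the following. It is only an
analogy, not kernel code: a flat array stands in for the radix tree, a C11
atomic_flag stands in for tree_lock, and release/acquire stores and loads stand
in for the RCU publication of tree slots. The names revmap_slot,
revmap_insert(), revmap_delete() and revmap_lookup(), the size REVMAP_SLOTS and
the choice of 0 for NO_IRQ are made up for the example. Because the entries
point into the static irq_map array, nothing is ever freed, so the sketch needs
no grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define REVMAP_SLOTS 64   /* hypothetical bound; a real radix tree is not sized like this */
#define NO_IRQ 0          /* assumption: virq 0 means "no mapping" in this sketch */

struct irq_map_entry {
	unsigned long hwirq;
};

/* Static array: entries are never freed, so a reader can never see a dangling pointer. */
static struct irq_map_entry irq_map[REVMAP_SLOTS];

/* Stand-ins for host->revmap_data.tree and host->revmap_data.tree_lock. */
static _Atomic(struct irq_map_entry *) revmap_slot[REVMAP_SLOTS];
static atomic_flag revmap_lock = ATOMIC_FLAG_INIT;

static void revmap_lock_acquire(void)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&revmap_lock, memory_order_acquire))
		;
}

static void revmap_lock_release(void)
{
	atomic_flag_clear_explicit(&revmap_lock, memory_order_release);
}

/* Writer: serialized against other writers, publishes the entry for readers. */
static int revmap_insert(unsigned long hwirq, unsigned int virq)
{
	if (hwirq >= REVMAP_SLOTS || virq == NO_IRQ || virq >= REVMAP_SLOTS)
		return -1;

	revmap_lock_acquire();
	irq_map[virq].hwirq = hwirq;
	atomic_store_explicit(&revmap_slot[hwirq], &irq_map[virq],
			      memory_order_release);
	revmap_lock_release();
	return 0;
}

/* Writer: removes the mapping under the same lock as inserts. */
static void revmap_delete(unsigned long hwirq)
{
	if (hwirq >= REVMAP_SLOTS)
		return;

	revmap_lock_acquire();
	atomic_store_explicit(&revmap_slot[hwirq], (struct irq_map_entry *)NULL,
			      memory_order_release);
	revmap_lock_release();
}

/* Reader: takes no lock; recovers the virq from the entry's array position. */
static unsigned int revmap_lookup(unsigned long hwirq)
{
	struct irq_map_entry *ptr;

	if (hwirq >= REVMAP_SLOTS)
		return NO_IRQ;

	ptr = atomic_load_explicit(&revmap_slot[hwirq], memory_order_acquire);
	return ptr ? (unsigned int)(ptr - irq_map) : NO_IRQ;
}
```

In this sketch only revmap_lookup() would sit on a hot path; inserts and deletes
happen at mapping setup and teardown, so a single lock around them should cost
little.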

Still, cool that you used it ;-)

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-25 Thread Benjamin Herrenschmidt
On Fri, 2008-07-25 at 09:49 +0200, Peter Zijlstra wrote:
 
 
 The only advantage of the concurrent radix tree over this model is that
 it can potentially do multiple modification operations at the same time.

Yup, we do not need that for the irq revmap... concurrent lookup is all we need.

Cheers,
Ben.




Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-25 Thread Sebastien Dugue

  Hi Peter,

On Fri, 25 Jul 2008 09:49:37 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote:

 On Thu, 2008-07-24 at 14:18 +0200, Sebastien Dugue wrote:
  On Thu, 24 Jul 2008 21:11:34 +1000 Nick Piggin [EMAIL PROTECTED] wrote:
  
   On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
    From: Sebastien Dugue [EMAIL PROTECTED]
    Date: Tue, 22 Jul 2008 11:56:41 +0200
    Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
    lockless
   
    The radix tree used by interrupt controllers for their irq reverse
    mapping (currently only the XICS found on pSeries) has a complex locking
    scheme dating back to before the advent of the concurrent radix tree on
    preempt-rt.
   
    Take advantage of the concurrent radix tree and of the fact that the items
    of the tree are pointers to elements of a static array (irq_map), which can
    never go away under us, to simplify the locking.
   
    Concurrency between readers and writers is handled by the intrinsic
    properties of the concurrent radix tree. Concurrency between the tree
    initialization, which is done asynchronously, and reader or writer accesses
    is handled via an atomic variable (revmap_trees_allocated) that is set once
    the tree has been initialized and checked before any reader or writer
    access, just as we used to check for tree.gfp_mask != 0 before.
   
   Hmm, RCU radix tree is in mainline too for quite a while. I thought
   Ben had already converted this code over ages ago...
  
    Mainline does not have the concurrent radix tree which this patch
  is based on, but maybe it's overkill and the RCU radix tree is enough.
  Not sure, will have to think about it a bit more.
 
 Should be. The model of the concurrent radix tree can be mapped to
 spinlock + rcu radix tree.
 
 So instead of:
 
  +   DEFINE_RADIX_TREE_CONTEXT(ctx, tree);
  +   radix_tree_lock(&ctx);
  +   radix_tree_insert(ctx.tree, hwirq, &irq_map[virq]);
  +   radix_tree_unlock(&ctx);
 
 
 you then write:
 
   spin_lock(&host->revmap_data.tree_lock);
   radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
   spin_unlock(&host->revmap_data.tree_lock);
 

  Cool, that will indeed make it much easier to have something applicable
to mainline which works with preempt-rt.

 
 The only advantage of the concurrent radix tree over this model is that
 it can potentially do multiple modification operations at the same time.

  Well, in theory that can happen if a module is loaded which creates a mapping
while another one is unloaded at the same time. The time window is pretty
narrow, but still present nonetheless. That's why I chose to use the concurrent
version.

 
 Still, cool that you used it ;-)


  Yep, looked like what was needed until I realized it was not available in
mainline. Nice work though and good paper for explaining it all.

  Sebastien.



Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-25 Thread Sebastien Dugue
On Fri, 25 Jul 2008 18:27:20 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

 On Fri, 2008-07-25 at 09:49 +0200, Peter Zijlstra wrote:
  
  
  The only advantage of the concurrent radix tree over this model is that
  it can potentially do multiple modification operations at the same time.
 
 Yup, we do not need that for the irq revmap... concurrent lookup is all we need.
 

  Shouldn't we care about concurrent insertion and deletion in the tree? I agree
that concern might be a bit artificial but in theory that can happen.

  Sebastien.


Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-25 Thread Sebastien Dugue
On Fri, 25 Jul 2008 18:40:21 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

 On Fri, 2008-07-25 at 10:36 +0200, Sebastien Dugue wrote:
  On Fri, 25 Jul 2008 18:27:20 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
  
   On Fri, 2008-07-25 at 09:49 +0200, Peter Zijlstra wrote:
    
    
    The only advantage of the concurrent radix tree over this model is that
    it can potentially do multiple modification operations at the same time.
   
   Yup, we do not need that for the irq revmap... concurrent lookup is all we need.
   
  
    Shouldn't we care about concurrent insertion and deletion in the tree? I agree
  that concern might be a bit artificial but in theory that can happen.
 
 Yes, we just need to protect it with a big hammer, like a spinlock, it's
 not a performance critical code path.

  Agreed. Will look into this in the next few days.

  Thanks,

  Sebastien.


Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-25 Thread Benjamin Herrenschmidt
On Fri, 2008-07-25 at 10:36 +0200, Sebastien Dugue wrote:
 On Fri, 25 Jul 2008 18:27:20 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
 
  On Fri, 2008-07-25 at 09:49 +0200, Peter Zijlstra wrote:
   
   
   The only advantage of the concurrent radix tree over this model is that
   it can potentially do multiple modification operations at the same time.
  
  Yup, we do not need that for the irq revmap... concurrent lookup is all we need.
  
 
   Shouldn't we care about concurrent insertion and deletion in the tree? I agree
 that concern might be a bit artificial but in theory that can happen.

Yes, we just need to protect it with a big hammer, like a spinlock, it's
not a performance critical code path.

Ben.




[PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-24 Thread Sebastien Dugue
From: Sebastien Dugue [EMAIL PROTECTED]
Date: Tue, 22 Jul 2008 11:56:41 +0200
Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree lockless

  The radix tree used by interrupt controllers for their irq reverse mapping
(currently only the XICS found on pSeries) has a complex locking scheme
dating back to before the advent of the concurrent radix tree on preempt-rt.

  Take advantage of the concurrent radix tree and of the fact that the items
of the tree are pointers to elements of a static array (irq_map), which can
never go away under us, to simplify the locking.

  Concurrency between readers and writers is handled by the intrinsic
properties of the concurrent radix tree. Concurrency between the tree
initialization, which is done asynchronously, and reader or writer accesses is
handled via an atomic variable (revmap_trees_allocated) that is set once the
tree has been initialized and checked before any reader or writer access, just
as we used to check for tree.gfp_mask != 0 before.

Signed-off-by: Sebastien Dugue [EMAIL PROTECTED]
Cc: Benjamin Herrenschmidt [EMAIL PROTECTED]
Cc: Paul Mackerras [EMAIL PROTECTED]

---
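
For illustration only, the "tree allocated" flag described in the changelog
could be sketched in user-space C11 as below. The release/acquire ordering, the
helper names revmap_init_trees(), revmap_tree_usable() and revmap_try_insert(),
and the plain integer tree_storage standing in for the radix tree are all
assumptions of this sketch, not part of the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* 0 until the trees have been set up, 1 afterwards (mirrors the patch's flag). */
static atomic_int revmap_trees_allocated;

/* Hypothetical stand-in for the radix tree itself. */
static int tree_storage;

/* Runs once, asynchronously with respect to readers and writers. */
static void revmap_init_trees(void)
{
	tree_storage = 0;	/* "initialize the tree" */

	/* Publish only after initialization is complete. */
	atomic_store_explicit(&revmap_trees_allocated, 1, memory_order_release);
}

/* Every reader and writer checks this before touching the tree. */
static bool revmap_tree_usable(void)
{
	return atomic_load_explicit(&revmap_trees_allocated,
				    memory_order_acquire) != 0;
}

/* Returns 0 on success, -1 if the tree does not exist yet. */
static int revmap_try_insert(int value)
{
	if (!revmap_tree_usable())
		return -1;	/* caller would fall back to a slower path */

	tree_storage = value;
	return 0;
}
```

An access attempted before revmap_init_trees() has run is simply refused, which
plays the same role as the old tree.gfp_mask != 0 test.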
 arch/powerpc/kernel/irq.c |  102 --
 1 file changed, 27 insertions(+), 75 deletions(-)

Index: linux-2.6.25.8-rt7/arch/powerpc/kernel/irq.c
===
--- linux-2.6.25.8-rt7.orig/arch/powerpc/kernel/irq.c
+++ linux-2.6.25.8-rt7/arch/powerpc/kernel/irq.c
@@ -403,8 +403,7 @@ void do_softirq(void)
 
 static LIST_HEAD(irq_hosts);
 static DEFINE_RAW_SPINLOCK(irq_big_lock);
-static DEFINE_PER_CPU(unsigned int, irq_radix_reader);
-static unsigned int irq_radix_writer;
+static atomic_t revmap_trees_allocated = ATOMIC_INIT(0);
 struct irq_map_entry irq_map[NR_IRQS];
 static unsigned int irq_virq_count = NR_IRQS;
 static struct irq_host *irq_default_host;
@@ -547,57 +546,6 @@ void irq_set_virq_count(unsigned int cou
 	irq_virq_count = count;
 }
 
-/* radix tree not lockless safe ! we use a brlock-type mecanism
- * for now, until we can use a lockless radix tree
- */
-static void irq_radix_wrlock(unsigned long *flags)
-{
-	unsigned int cpu, ok;
-
-	spin_lock_irqsave(&irq_big_lock, *flags);
-	irq_radix_writer = 1;
-	smp_mb();
-	do {
-		barrier();
-		ok = 1;
-		for_each_possible_cpu(cpu) {
-			if (per_cpu(irq_radix_reader, cpu)) {
-				ok = 0;
-				break;
-			}
-		}
-		if (!ok)
-			cpu_relax();
-	} while(!ok);
-}
-
-static void irq_radix_wrunlock(unsigned long flags)
-{
-	smp_wmb();
-	irq_radix_writer = 0;
-	spin_unlock_irqrestore(&irq_big_lock, flags);
-}
-
-static void irq_radix_rdlock(unsigned long *flags)
-{
-	local_irq_save(*flags);
-	__get_cpu_var(irq_radix_reader) = 1;
-	smp_mb();
-	if (likely(irq_radix_writer == 0))
-		return;
-	__get_cpu_var(irq_radix_reader) = 0;
-	smp_wmb();
-	spin_lock(&irq_big_lock);
-	__get_cpu_var(irq_radix_reader) = 1;
-	spin_unlock(&irq_big_lock);
-}
-
-static void irq_radix_rdunlock(unsigned long flags)
-{
-	__get_cpu_var(irq_radix_reader) = 0;
-	local_irq_restore(flags);
-}
-
 static int irq_setup_virq(struct irq_host *host, unsigned int virq,
 			    irq_hw_number_t hwirq)
 {
@@ -752,7 +700,6 @@ void irq_dispose_mapping(unsigned int vi
 {
 	struct irq_host *host;
 	irq_hw_number_t hwirq;
-	unsigned long flags;
 
 	if (virq == NO_IRQ)
 		return;
@@ -784,15 +731,20 @@ void irq_dispose_mapping(unsigned int vi
 		if (hwirq < host->revmap_data.linear.size)
 			host->revmap_data.linear.revmap[hwirq] = NO_IRQ;
 		break;
-	case IRQ_HOST_MAP_TREE:
+	case IRQ_HOST_MAP_TREE: {
+		DEFINE_RADIX_TREE_CONTEXT(ctx, &host->revmap_data.tree);
+
 		/* Check if radix tree allocated yet */
-		if (host->revmap_data.tree.gfp_mask == 0)
+		if (atomic_read(&revmap_trees_allocated) == 0)
 			break;
-		irq_radix_wrlock(&flags);
-		radix_tree_delete(&host->revmap_data.tree, hwirq);
-		irq_radix_wrunlock(flags);
+
+		radix_tree_lock(&ctx);
+		radix_tree_delete(ctx.tree, hwirq);
+		radix_tree_unlock(&ctx);
+
 		break;
 	}
+	}
 
 	/* Destroy map */
 	smp_mb();
@@ -845,22 +797,20 @@ unsigned int irq_radix_revmap(struct irq
 	struct radix_tree_root *tree;
 	struct irq_map_entry *ptr;
 	unsigned int virq;
-	unsigned long flags;
 
 	WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);
 
-	/* Check if the radix tree exist yet. We test the value of
-	 * the gfp_mask for that. Sneaky but

Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-24 Thread Nick Piggin
On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
 From: Sebastien Dugue [EMAIL PROTECTED]
 Date: Tue, 22 Jul 2008 11:56:41 +0200
 Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
 lockless

   The radix tree used by interrupt controllers for their irq reverse
 mapping (currently only the XICS found on pSeries) has a complex locking
 scheme dating back to before the advent of the concurrent radix tree on
 preempt-rt.

   Take advantage of the concurrent radix tree and of the fact that the items
 of the tree are pointers to elements of a static array (irq_map), which can
 never go away under us, to simplify the locking.

   Concurrency between readers and writers is handled by the intrinsic
 properties of the concurrent radix tree. Concurrency between the tree
 initialization, which is done asynchronously, and reader or writer accesses
 is handled via an atomic variable (revmap_trees_allocated) that is set once
 the tree has been initialized and checked before any reader or writer
 access, just as we used to check for tree.gfp_mask != 0 before.

Hmm, RCU radix tree is in mainline too for quite a while. I thought
Ben had already converted this code over ages ago...

Nothing against the -rt patch, but mainline should probably be updated
to use RCU as well?


Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-24 Thread Sebastien Dugue
On Thu, 24 Jul 2008 21:11:34 +1000 Nick Piggin [EMAIL PROTECTED] wrote:

 On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
  From: Sebastien Dugue [EMAIL PROTECTED]
  Date: Tue, 22 Jul 2008 11:56:41 +0200
  Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
  lockless
 
    The radix tree used by interrupt controllers for their irq reverse
  mapping (currently only the XICS found on pSeries) has a complex locking
  scheme dating back to before the advent of the concurrent radix tree on
  preempt-rt.
 
    Take advantage of the concurrent radix tree and of the fact that the items
  of the tree are pointers to elements of a static array (irq_map), which can
  never go away under us, to simplify the locking.
 
    Concurrency between readers and writers is handled by the intrinsic
  properties of the concurrent radix tree. Concurrency between the tree
  initialization, which is done asynchronously, and reader or writer accesses
  is handled via an atomic variable (revmap_trees_allocated) that is set once
  the tree has been initialized and checked before any reader or writer
  access, just as we used to check for tree.gfp_mask != 0 before.
 
 Hmm, RCU radix tree is in mainline too for quite a while. I thought
 Ben had already converted this code over ages ago...

  Mainline does not have the concurrent radix tree which this patch
is based on, but maybe it's overkill and the RCU radix tree is enough.
Not sure, will have to think about it a bit more.

 
 Nothing against the -rt patch, but mainline should probably be updated
 to use RCU as well?
 

  If rcu radix tree is enough, then definitely yes.

  Sebastien.



Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-24 Thread Benjamin Herrenschmidt

    Concurrency between readers and writers is handled by the intrinsic
  properties of the concurrent radix tree. Concurrency between the tree
  initialization, which is done asynchronously, and reader or writer accesses
  is handled via an atomic variable (revmap_trees_allocated) that is set once
  the tree has been initialized and checked before any reader or writer
  access, just as we used to check for tree.gfp_mask != 0 before.
 
 Hmm, RCU radix tree is in mainline too for quite a while. I thought
 Ben had already converted this code over ages ago...
 
 Nothing against the -rt patch, but mainline should probably be updated
 to use RCU as well?

No, I haven't updated that code yet, and yes, we should do it :-)

Cheers,
Ben.




[PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-23 Thread Sebastien Dugue
From: Sebastien Dugue [EMAIL PROTECTED]
Date: Tue, 22 Jul 2008 11:56:41 +0200
Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree lockless
