Re: [PATCH] synchronize_irq needs a barrier

2007-10-21 Thread Benjamin Herrenschmidt

 .../...

 This patch (mostly written by Linus) fixes this by using spin
  locks instead of memory barries on the synchronize_irq() path.
  
  Signed-off-by: Herbert Xu [EMAIL PROTECTED]
 
 Good for me.
 
 Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED]

Hrm... not on yet. Herbert, care to resend, looks like it fell down the
wrong hole in Linus mailbox :-)

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-20 Thread Benjamin Herrenschmidt

On Sat, 2007-10-20 at 08:06 +0200, Maxim Levitsky wrote:

   /* Disable interrupts, DMA, and rest of the chip*/
   saa_writel(SAA7134_IRQ1, 0);
   saa_writel(SAA7134_IRQ2, 0);
   saa_writel(SAA7134_MAIN_CTRL, 0);

   dev-insuspend = 1;
   synchronize_irq(pci_dev-irq);
 
   /* ACK pending interrupts just in case*/
   saa_writel(SAA7134_IRQ_REPORT,saa_readl(SAA7134_IRQ_REPORT));
 
   ..
 This should be bullet-proof.

Hopefully :-)

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Maxim Levitsky
On Thursday 18 October 2007 03:25:42 Benjamin Herrenschmidt wrote:
 synchronize_irq needs at the very least a compiler barrier and a
 read barrier on SMP, but there are enough cases around where a
 write barrier is also needed and it's not a hot path so I prefer
 using a full smp_mb() here.
 
 It will degrade to a compiler barrier on !SMP.
 
 Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
 ---
 
 Index: linux-work/kernel/irq/manage.c
 ===
 --- linux-work.orig/kernel/irq/manage.c   2007-10-18 11:22:16.0 
 +1000
 +++ linux-work/kernel/irq/manage.c2007-10-18 11:22:20.0 +1000
 @@ -33,6 +33,7 @@ void synchronize_irq(unsigned int irq)
   if (irq = NR_IRQS)
   return;
  
 + smp_mb();
   while (desc-status  IRQ_INPROGRESS)
   cpu_relax();
  }
 
 
 -
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
 

Hi,

I have read this thread and I concluded few things:

1) It is impossible to know that the card won't send more interrupts:
Even if I do a read from the device, the IRQ can be pending in the bus/APIC
It is even possible (and likely) that the IRQ line will be shared, thus the 
handler can be called by non-relevant device.

2) the synchronize_irq(); in .suspend is useless:
an IRQ can happen immediately after this synchronize_irq();
and interrupt even the .suspend()
(At least theoretically)


Thus I now understand that .suspend() should do:

saa_writel(SAA7134_IRQ1, 0);
saa_writel(SAA7134_IRQ2, 0);
saa_writel(SAA7134_MAIN_CTRL, 0);

dev-insuspend = 1;
smp_wmb();

/* at that point the _request to disable card's IRQs was issued, we 
don't know 
   that there will be no irqs anymore.
   the smp_mb(); guaranties that the IRQ handler will bail out in that 
case. */

/* ...*/

pci_save_state(pci_dev);
pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
return 0;

and the interrupt handler:

smp_rmb();
if (dev-insuspend)
goto out;

Am I right?
Best regards,
Maxim Levitsky
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Maxim Levitsky
On Saturday 20 October 2007 04:25:34 Linus Torvalds wrote:
 
 On Sat, 20 Oct 2007, Maxim Levitsky wrote:
  
  and the interrupt handler:
  
  smp_rmb();
  if (dev-insuspend)
  goto out;
 
 Something like that can work, yes. However, you need to make sure that:
 
  - even when you ignore the interrupt (because the driver doesn't care, 
it's suspending), you need to make sure the hardware gets shut up by 
reading (or writing) the proper interrupt status register.
I agree, but while device is powered off, its registers can't be accessed
Thus, if I ack the IRQ every time the handler is called, I will access the 
powered off device (this is probably won't hurt a lot, but a bit incorrectly)

 
Otherwise, with a level interrupt, the interrupt will continue to be 
held active (screaming) and the CPU will get interrupted over and 
over again, until the irq subsystem will just turn the irq off 
entirely.
 
  - when you resume, make sure that you get the engine going again, with 
the understanding that some interrupts may have gotten ignored.
Here it isn't a problem, this is a video capture card, and I suspend it by just 
stopping dma
on all active buffers even if in the middle of capture, and I send those 
buffers to card again
to fill them from the beginning during the resume.
 
 Also, in general, these kinds of things shouldn't always even be 
 neicessary. If you use the suspend_late()/resume_early() hooks, those will 
 be called with interrupts off, and while they can be harder to debug (and 
 may be worth avoiding for non-critical drivers), they do allow for simpler 
 models partly exactly because they don't need to worry about interrupts 
 etc.
Exactly, I am aware of suspend_late , but I don't want to abuse it.
 
   Linus
 


Best regards,
Maxim Levitsky

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Herbert Xu
On Sat, Oct 20, 2007 at 02:02:42AM +, Maxim Levitsky wrote:

 Thus I now understand that .suspend() should do:
 
   saa_writel(SAA7134_IRQ1, 0);
   saa_writel(SAA7134_IRQ2, 0);
   saa_writel(SAA7134_MAIN_CTRL, 0);
 
   dev-insuspend = 1;
   smp_wmb();

If we patch synchronize_irq() as discussed here then you can
use it in place of smp_wmb() and remove the smp_rmb from the
interrupt handler (the latter is the path that matters).

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Linus Torvalds


On Sat, 20 Oct 2007, Maxim Levitsky wrote:
 
 and the interrupt handler:
 
   smp_rmb();
   if (dev-insuspend)
   goto out;

Something like that can work, yes. However, you need to make sure that:

 - even when you ignore the interrupt (because the driver doesn't care, 
   it's suspending), you need to make sure the hardware gets shut up by 
   reading (or writing) the proper interrupt status register.

   Otherwise, with a level interrupt, the interrupt will continue to be 
   held active (screaming) and the CPU will get interrupted over and 
   over again, until the irq subsystem will just turn the irq off 
   entirely.

 - when you resume, make sure that you get the engine going again, with 
   the understanding that some interrupts may have gotten ignored.

Also, in general, these kinds of things shouldn't always even be 
neicessary. If you use the suspend_late()/resume_early() hooks, those will 
be called with interrupts off, and while they can be harder to debug (and 
may be worth avoiding for non-critical drivers), they do allow for simpler 
models partly exactly because they don't need to worry about interrupts 
etc.

Linus
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Benjamin Herrenschmidt

   - even when you ignore the interrupt (because the driver doesn't care, 
 it's suspending), you need to make sure the hardware gets shut up by 
 reading (or writing) the proper interrupt status register.
 
 Otherwise, with a level interrupt, the interrupt will continue to be 
 held active (screaming) and the CPU will get interrupted over and 
 over again, until the irq subsystem will just turn the irq off 
 entirely.
 
 His suspend routine wrote to the IRQ mask (or equivalent) register in
 his code example, thus the HW should shut up eventually, thus that isn't
 strictly necessary, the IRQ in that case is just a short
 interrupt (noticed by the PIC and delivered but possibly not asserted
 anymore at this stage or about to go down).

In fact, he -must not- ack it. Because is the HW is really down (in D3),
got knows what accessing the ACK register will do. I can give you
ideas... from nothing on most x86 desktops to machine checks on most
powerpc machines, though I could imagine some cards bad enough to lock
your bus up if you try to access a register while they are in D3 (I've
seen some of those critters and it seems not all bridges timeout on
cards that just keep sending retries).

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Benjamin Herrenschmidt

   - even when you ignore the interrupt (because the driver doesn't care, 
 it's suspending), you need to make sure the hardware gets shut up by 
 reading (or writing) the proper interrupt status register.
 I agree, but while device is powered off, its registers can't be accessed
 Thus, if I ack the IRQ every time the handler is called, I will access the 
 powered off device (this is probably won't hurt a lot, but a bit incorrectly)

It will actually crash your machine on some platforms. So no, best is to
-not- ack. The masking is enough, the IRQ will go down eventually.

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Benjamin Herrenschmidt

On Fri, 2007-10-19 at 19:25 -0700, Linus Torvalds wrote:
 
 On Sat, 20 Oct 2007, Maxim Levitsky wrote:
  
  and the interrupt handler:
  
  smp_rmb();
  if (dev-insuspend)
  goto out;
 
 Something like that can work, yes. However, you need to make sure that:
 
  - even when you ignore the interrupt (because the driver doesn't care, 
it's suspending), you need to make sure the hardware gets shut up by 
reading (or writing) the proper interrupt status register.

Otherwise, with a level interrupt, the interrupt will continue to be 
held active (screaming) and the CPU will get interrupted over and 
over again, until the irq subsystem will just turn the irq off 
entirely.

His suspend routine wrote to the IRQ mask (or equivalent) register in
his code example, thus the HW should shut up eventually, thus that isn't
strictly necessary, the IRQ in that case is just a short
interrupt (noticed by the PIC and delivered but possibly not asserted
anymore at this stage or about to go down).

We have many scenario generating such short interrupts (pretty much any
time we use interrupt disable/enable on the chip, for example it's
common with NAPI).

This is one of the reasons why we used to have a timebomb causing an
IRQ to erroneously get disabled by the IRQ core assuming the IRQ was
stuck, just because we accumulated too many short interrupts on that
line. A recent patch by Alan fixed that to use some more sensible
algorithm checking the time since the last spurrious interrupt, though I
suspect he's being too optimistic on his timeout value and some
scenario, such as NAPI with just the wrong packet rate, can still
trigger the bug. I need  to find a piece of HW slow enough in
de-asserting the IRQ to try to make a repro-case one of these days.

  - when you resume, make sure that you get the engine going again, with 
the understanding that some interrupts may have gotten ignored.

And you forgot that he still needs synchronize_irq(), or his IRQ handler
might well be in the middle of doing something on another CPU, past the
point where it tested insuspend, which will do no good when a
simultaneous pci_set_power_state() puts the chip into D3. On some
machines, this will machine check. Either that or he needs a spinlock in
his IRQ handler that he also takes around the code that sets insuspend. 

 Also, in general, these kinds of things shouldn't always even be 
 neicessary. If you use the suspend_late()/resume_early() hooks, those will 
 be called with interrupts off, and while they can be harder to debug (and 
 may be worth avoiding for non-critical drivers), they do allow for simpler 
 models partly exactly because they don't need to worry about interrupts 
 etc.

Yes, this is a easier way to deal with devices that are on a directly
mapped bus (path to them isn't gone by the time you hit suspend_late)
and don't need to do fancy things involing waiting for a queue to drain
using interrupts etc... (at one stage of HW complexity, having to
re-implement polled versions of everything is just not worth the
benefit). In general, suspend_late should cover the need of most PCI
devices I suppose.

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Benjamin Herrenschmidt

 I have read this thread and I concluded few things:
 
 1) It is impossible to know that the card won't send more interrupts:
 Even if I do a read from the device, the IRQ can be pending in the bus/APIC
 It is even possible (and likely) that the IRQ line will be shared, thus the 
 handler can be called by non-relevant device.
 
 2) the synchronize_irq(); in .suspend is useless:
 an IRQ can happen immediately after this synchronize_irq();
 and interrupt even the .suspend()
 (At least theoretically)

It's not totally useless not. It guarantees that by the time your
are out of your suspend(), a simultaneous instance of the IRQ handler
is either finished, or it (or any subsequent occurence) have seen
your insuspend flag set to 1.

 Thus I now understand that .suspend() should do:
 
   saa_writel(SAA7134_IRQ1, 0);
   saa_writel(SAA7134_IRQ2, 0);
   saa_writel(SAA7134_MAIN_CTRL, 0);
 
   dev-insuspend = 1;
   smp_wmb();
 
   /* at that point the _request to disable card's IRQs was issued, we 
 don't know 
  that there will be no irqs anymore.
  the smp_mb(); guaranties that the IRQ handler will bail out in that 
 case. */
   
   /* ...*/
 
   pci_save_state(pci_dev);
   pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
   return 0;

The above doesn't handle the case where the IRQ handle just passed the
if (insuspend) test. You may end up calling pci_set_power_state() while
in the middle of some further HW accesses in the handler.

You still need synchronize_irq() for that reason. Or use a spinlock.

 and the interrupt handler:
 
   smp_rmb();
   if (dev-insuspend)
   goto out;
 
 Am I right?

Not quite :-)

Also not that the work we are doing with synchronize_irq, if it gets
merged, renders it unnecessary for you to add those two memory barriers,
synchronize_irq will pretty much do the ordering for you.

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Maxim Levitsky
On Saturday 20 October 2007 05:56:01 Benjamin Herrenschmidt wrote:
 
  I have read this thread and I concluded few things:
  
  1) It is impossible to know that the card won't send more interrupts:
  Even if I do a read from the device, the IRQ can be pending in the bus/APIC
  It is even possible (and likely) that the IRQ line will be shared, thus the 
  handler can be called by non-relevant device.
  
  2) the synchronize_irq(); in .suspend is useless:
  an IRQ can happen immediately after this synchronize_irq();
  and interrupt even the .suspend()
  (At least theoretically)
 
 It's not totally useless not. It guarantees that by the time your
 are out of your suspend(), a simultaneous instance of the IRQ handler
 is either finished, or it (or any subsequent occurence) have seen
 your insuspend flag set to 1.
 
  Thus I now understand that .suspend() should do:
  
  saa_writel(SAA7134_IRQ1, 0);
  saa_writel(SAA7134_IRQ2, 0);
  saa_writel(SAA7134_MAIN_CTRL, 0);
  
  dev-insuspend = 1;
  smp_wmb();
  
  /* at that point the _request to disable card's IRQs was issued, we 
  don't know 
 that there will be no irqs anymore.
 the smp_mb(); guaranties that the IRQ handler will bail out in that 
  case. */
  
  /* ...*/
  
  pci_save_state(pci_dev);
  pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
  return 0;
 
 The above doesn't handle the case where the IRQ handle just passed the
 if (insuspend) test. You may end up calling pci_set_power_state() while
 in the middle of some further HW accesses in the handler.
 
 You still need synchronize_irq() for that reason. Or use a spinlock.
 
  and the interrupt handler:
  
  smp_rmb();
  if (dev-insuspend)
  goto out;
  
  Am I right?
 
 Not quite :-)
 
 Also not that the work we are doing with synchronize_irq, if it gets
 merged, renders it unnecessary for you to add those two memory barriers,
 synchronize_irq will pretty much do the ordering for you.
 
 Cheers,
 Ben.
 
 


Fine, I Agree, and this is why I used it in first place, I just forgot.
To wait for already running IRQ handler, that tested the dev-insuspend.
(And I assumed that interrupts are disabled on the cpu that runs .suspend, but 
I know now
that this isn't true.)

Besides that can I ask few more .suspend/resume questions:

1) some drivers use pci_disable_device(), and pci_enable_device().
should I use it too?

2) I accidentally did this:
pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
pci_save_state(pci_dev);

I somehow thought that this is correct, that I should save the pci config state
after the power-down, but now I know that it isn't correct.
I now need to send a patch for dmfe.c network driver that has the same commands 
written by me.
(but it works perfectly anyway)
Is it possible to access pci configuration space in D3?

And lastly speaking of network drivers, one issue came to my mind:
most network drivars has a packet queue and in case of dmfe it is located in 
main memory,
and card does dma from it.


in .suspend I ignore that some packets may be in that queue, and I want
to ask, whenever there are better ways to do that.


this is my dmfe .suspend routine.

/* Disable upper layer interface */
netif_device_detach(dev);

/* Disable Tx/Rx */
db-cr6_data = ~(CR6_RXSC | CR6_TXSC);
update_cr6(db-cr6_data, dev-base_addr);

/* Disable Interrupt */
outl(0, dev-base_addr + DCR7);
outl(inl (dev-base_addr + DCR5), dev-base_addr + DCR5);

/* Fre RX buffers */
dmfe_free_rxbuffer(db);

/* Enable WOL */
pci_read_config_dword(pci_dev, 0x40, tmp);
tmp = ~(DMFE_WOL_LINKCHANGE|DMFE_WOL_MAGICPACKET);

if (db-wol_mode  WAKE_PHY)
tmp |= DMFE_WOL_LINKCHANGE;
if (db-wol_mode  WAKE_MAGIC)
tmp |= DMFE_WOL_MAGICPACKET;

pci_write_config_dword(pci_dev, 0x40, tmp);

pci_enable_wake(pci_dev, PCI_D3hot, 1);
pci_enable_wake(pci_dev, PCI_D3cold, 1);

/* Power down device*/
pci_set_power_state(pci_dev, pci_choose_state (pci_dev,state));
pci_save_state(pci_dev);


I guess, everybody makes mistakes... :-)

Other network drivers has a bit more complicated .suspend/.resume routines, 
but I didn't see a driver waiting for output queue to finish

Best regards,
Maxim Levitsky


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Benjamin Herrenschmidt

 1) some drivers use pci_disable_device(), and pci_enable_device().
 should I use it too?

I generally don't do the former, and I would expect the late to be done
by pci_restore_state() for you. pci_disable_device(), last I looked,
only cleared the bus master bit though, which might be a good idea to
do.

People in ACPI/x86 land, are there more good reasons to do one or the
other ?

That reminds me that I volunteered to write a documentation on how
drivers should do all that stuff at KS and didn't get to actually doing
it yet. shame ... I'll try to start something asap.

 2) I accidentally did this:
   pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
   pci_save_state(pci_dev);
 
 I somehow thought that this is correct, that I should save the pci config 
 state
 after the power-down, but now I know that it isn't correct.

Right, you need to do the save_state before the power down. You need to
avoid pretty much any access to the device after the save state other
than the pending set_power_state on resume.

 I now need to send a patch for dmfe.c network driver that has the same 
 commands written by me.
 (but it works perfectly anyway)

On x86 desktop... might have surprises on others.

 Is it possible to access pci configuration space in D3?

It's only really safe to access the PM register itself, though I suppose
you should be able to walk the capability chain to do that. But I
wouldnt recommend doing anything else.

 And lastly speaking of network drivers, one issue came to my mind:
 most network drivars has a packet queue and in case of dmfe it is located in 
 main memory,
 and card does dma from it.

Note that the network stack nowadays does a fair bit of cleaning up for
you before your suspend routine is called
 
 in .suspend I ignore that some packets may be in that queue, and I want
 to ask, whenever there are better ways to do that.
 
 
 this is my dmfe .suspend routine.
 
   /* Disable upper layer interface */
   netif_device_detach(dev);

The above -might- not be needed any more, I have to check. I used to do
it too on my drivers.

   /* Disable Tx/Rx */
   db-cr6_data = ~(CR6_RXSC | CR6_TXSC);
   update_cr6(db-cr6_data, dev-base_addr);
 
   /* Disable Interrupt */
   outl(0, dev-base_addr + DCR7);
   outl(inl (dev-base_addr + DCR5), dev-base_addr + DCR5);
 
   /* Fre RX buffers */
   dmfe_free_rxbuffer(db);
 
   /* Enable WOL */
   pci_read_config_dword(pci_dev, 0x40, tmp);
   tmp = ~(DMFE_WOL_LINKCHANGE|DMFE_WOL_MAGICPACKET);
 
   if (db-wol_mode  WAKE_PHY)
   tmp |= DMFE_WOL_LINKCHANGE;
   if (db-wol_mode  WAKE_MAGIC)
   tmp |= DMFE_WOL_MAGICPACKET;
 
   pci_write_config_dword(pci_dev, 0x40, tmp);
 
   pci_enable_wake(pci_dev, PCI_D3hot, 1);
   pci_enable_wake(pci_dev, PCI_D3cold, 1);
 
   /* Power down device*/
   pci_set_power_state(pci_dev, pci_choose_state (pci_dev,state));
   pci_save_state(pci_dev);
 

Looks allright on a quick glance appart from the bits we already
discussed.

 I guess, everybody makes mistakes... :-)
 
 Other network drivers has a bit more complicated .suspend/.resume routines, 
 but I didn't see a driver waiting for output queue to finish

I think the network stack does that nowadays but we'll have to double
check, that's based on what DaveM told me at KS.

Ben. 

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Benjamin Herrenschmidt


 I probably need to add this synchronize_irq() logic in dmfe.c too, but I 
 probably do it later,
 I think I am overestimating this race, since most drivers don't do 
 dev-insuspend checks in IRQ handler.
 Maybe even just use free_irq() after all

Most drivers are probably underestimating the race :-)

free_irq() would work provided that you did the masking on chip before
(and unmask only after request_irq on the way back in). But it's a bit
like using a 10 tons truck to crush an ant... 

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-19 Thread Maxim Levitsky
On Saturday 20 October 2007 07:04:35 Benjamin Herrenschmidt wrote:
 
  1) some drivers use pci_disable_device(), and pci_enable_device().
  should I use it too?
 
 I generally don't do the former, and I would expect the late to be done
 by pci_restore_state() for you. pci_disable_device(), last I looked,
 only cleared the bus master bit though, which might be a good idea to
 do.
 
 People in ACPI/x86 land, are there more good reasons to do one or the
 other ?
 
 That reminds me that I volunteered to write a documentation on how
 drivers should do all that stuff at KS and didn't get to actually doing
 it yet. shame ... I'll try to start something asap.
 
  2) I accidentally did this:
  pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
  pci_save_state(pci_dev);
  
  I somehow thought that this is correct, that I should save the pci config 
  state
  after the power-down, but now I know that it isn't correct.
 
 Right, you need to do the save_state before the power down. You need to
 avoid pretty much any access to the device after the save state other
 than the pending set_power_state on resume.
 
  I now need to send a patch for dmfe.c network driver that has the same 
  commands written by me.
  (but it works perfectly anyway)
 
 On x86 desktop... might have surprises on others.
 
  Is it possible to access pci configuration space in D3?
 
 It's only really safe to access the PM register itself, though I suppose
 you should be able to walk the capability chain to do that. But I
 wouldnt recommend doing anything else.
 
  And lastly speaking of network drivers, one issue came to my mind:
  most network drivars has a packet queue and in case of dmfe it is located 
  in main memory,
  and card does dma from it.
 
 Note that the network stack nowadays does a fair bit of cleaning up for
 you before your suspend routine is called
  
  in .suspend I ignore that some packets may be in that queue, and I want
  to ask, whenever there are better ways to do that.
  
  
  this is my dmfe .suspend routine.
  
  /* Disable upper layer interface */
  netif_device_detach(dev);
 

 
 Looks allright on a quick glance appart from the bits we already
 discussed.


 
  I guess, everybody makes mistakes... :-)
  
  Other network drivers has a bit more complicated .suspend/.resume routines, 
  but I didn't see a driver waiting for output queue to finish
 
 I think the network stack does that nowadays but we'll have to double
 check, that's based on what DaveM told me at KS.
 
 Ben. 
 
 

Hi,

Thanks a lot.
I fix the order of calls in dmfe.c
and in saa7134-core.c.

I probably need to add this synchronize_irq() logic in dmfe.c too, but I 
probably do it later,
I think I am overestimating this race, since most drivers don't do 
dev-insuspend checks in IRQ handler.
Maybe even just use free_irq() after all


Best regards,
Maxim Levitsky
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Herbert Xu
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
 
 Note that some kind of read barrier or compiler barrier should be needed
 regardless, or we are just not sync'ing with anything at all (we may
 have loaded the value ages ago and thus operate on a totally stale
 value). I prefer a full barrier to also ensure all previous stores are
 pushed out.

We already have a compiler barrier there in the form of cpu_relax.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Herbert Xu
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

 Take a real life example:
 
 drivers/message/fusion/mptbase.c
 
/* Disable interrupts! */
CHIPREG_WRITE32(ioc-chip-IntMask, 0x);
 
ioc-active = 0;
synchronize_irq(pdev-irq);
 
 And we aren't in a spinlock here. 
 
 That's just a random example grepped I think I see a few more. Then,
 some drivers like tg3 actually do an smp_mb() before calling
 synchronize_irq(). But then, some don't.

I really don't see what the point of the barrier would be here.
Barriers are generally useless unless you have a counter-part
on the other side.

The counterpart here is presumably the interrupt handler, which
should be terminated by the IO write above regardless of the
memory barrier.

Of course I might have missed your point.  If so please give
a description like this for the race that you see:

CPU1CPU2
disable IRQ
whatever the race is
synchronize_irq

 I think trying to have all drivers be correct here is asking for
 trouble, we'd rather have synchronize_irq() be uber-safe. It's not like
 it was used in hot path anyway.

While in general I'd agree with you about give latitude to
drivers, memory barriers I think is something that we can't
afford to.

The reason is that memory barries tend to come in pairs, e.g.,

CPU1CPU2
write A
wmb
write B
read B
rmb
read A

Taking away either barrier would render the other useless.

So whenever we add only one barrier for the benefit of driver
writers who don't bother to think about barriers we may not
be helping them at all.

In any case, such writers should use easier tools like spin
locks rather than trying to go lockless.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

On Thu, 2007-10-18 at 22:35 +0800, Herbert Xu wrote:
 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
  
  Note that some kind of read barrier or compiler barrier should be needed
  regardless, or we are just not sync'ing with anything at all (we may
  have loaded the value ages ago and thus operate on a totally stale
  value). I prefer a full barrier to also ensure all previous stores are
  pushed out.
 
 We already have a compiler barrier there in the form of cpu_relax.

Isn't it too late ? The barrier should be before the test_bit, to
prevent it from moving up.

Ben.
 

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

On Thu, 2007-10-18 at 22:56 +0800, Herbert Xu wrote:
 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
 
  Take a real life example:
  
  drivers/message/fusion/mptbase.c
  
 /* Disable interrupts! */
 CHIPREG_WRITE32(ioc-chip-IntMask, 0x);
  
 ioc-active = 0;
 synchronize_irq(pdev-irq);
  
  And we aren't in a spinlock here. 
  
  That's just a random example grepped I think I see a few more. Then,
  some drivers like tg3 actually do an smp_mb() before calling
  synchronize_irq(). But then, some don't.
 
 I really don't see what the point of the barrier would be here.
 Barriers are generally useless unless you have a counter-part
 on the other side.

The barrier would guarantee that ioc-active (and in fact the write to
the chip too above) are globally visible before the read of the irq
status. There are two different issues not to mixup here. Any previous
store vs. a read (general case) and MMIO store of IRQ mask which has
issues of it's own.

 The counterpart here is presumably the interrupt handler, which
 should be terminated by the IO write above regardless of the
 memory barrier.

 Of course I might have missed your point.  If so please give
 a description like this for the race that you see:
 
 CPU1  CPU2
 disable IRQ
   whatever the race is
 synchronize_irq

Let's ignore for a moment the generic problem of loads crossing previous
stores and focus on the pure issue of using MMIO writes to mask
interrupts.

Writing to a chip to disable an IRQ is generally never going to provide
any synchronisation guarantee. The MMIO write itself is asynchronous and
not ordered vs. a subsequent read from main storage (ie memory). So
unless you do an MMIO read to flush it, you basically haven't yet
disabled your IRQ don't know when it will be. That's one thing.

Another one is that even if you do an MMIO read to flush, your IRQ may
already have been latched by your PIC, or may be an MSI already issued
and still travelling along busses, and thus might well occur in a few
instructions. In that later case, note that ignoring it is perfectly
fine since the IRQ line will eventually go down (for a level IRQ that
is, for an edge IRQ, ignoring is always fine). That's what we cause
short interrupts, they can be common, though linux has historical issues
with them (unrelated to this discussion). But your interrupt handler
_will_ be called, and thus should be aware that your intend is to ignore
it.

So for both of the reasons above, the MMIO write doesn't mean you IRQ
won't happen and in fact, synchronize_irq() here is totally useless
since it won't prevent the IRQ from actually still happening just after
the test_bit of IRQ_INPROGRESS.

Now, the way to actually do that properly is to _also_ have a flag to
indicate your handler you don't want to deal with that interrupt. That
could be something along the lines of:

writel(irq mask);
wmb();
chip-masked = 1;
smp_mb();
synchronize_irq();

Now, the IRQ handler can just do

if (chip-masked == 1)
return whatever;

Another way is to use spinlocks of course, but we are talking about high
perf stuff that tries to avoid spinlocks on every IRQs, which is fair
enough, thus putting the synchronization burden on the slow path.

In the above example, you need that smp_mb() to make sure that the
effect of chip-masked = 1 is globally visible before the read in
synchronize_irq() is performed. Without that, that read could cross the
previous write, in fact, it may even travel all the way to before the
MMIO in some cases, and thus cause synchronize_irq() to operate on a
completely stale version of irq_desc-status, thus possibly bailing out
early while the IRQ is still happening and chip-masked not yet visible
to the other CPUs. 

In general, synchronize_irq() alone is always bogus, because that read
can travel up. That's why I believe it would simply make things simpler
and avoid problems to put the smp_mb(); inside synchronize_irq() (and same
goes with napi_synchronize() for the exact same reasons).

 While in general I'd agree with you about give latitude to
 drivers, memory barriers I think is something that we can't
 afford to.


 The reason is that memory barries tend to come in pairs, e.g.,
 
 CPU1  CPU2
 write A
 wmb
 write B
   read B
   rmb
   read A
 
 Taking away either barrier would render the other useless.
 
 So whenever we add only one barrier for the benefit of driver
 writers who don't bother to think about barriers we may not
 be helping them at all.

In the case of synchronize_irq and my above example, yes, indeed, there
should be a pending barrier between setting IRQ_INPROGRESS and the
reading of chip-masked by the driver. This is a very good point because
I incorrectly assumed that the spin_unlock inside handle_*_irq before
calling the handler would do it, but 

Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

On Fri, 2007-10-19 at 12:48 +0800, Herbert Xu wrote:

 [IRQ]: Fix synchronize_irq races with IRQ handler
 
 As it is some callers of synchronize_irq rely on memory barriers
 to provide synchronisation against the IRQ handlers.  For example,
 the tg3 driver does
 
   tp-irq_sync = 1;
   smp_mb();
   synchronize_irq();
 
 and then in the IRQ handler:
 
   if (!tp-irq_sync)
   netif_rx_schedule(dev, tp-napi);
 
 Unfortunately memory barriers only work well when they come in
 pairs.  Because we don't actually have memory barriers on the
 IRQ path, the memory barrier before the synchronize_irq() doesn't
 actually protect us.
 
 In particular, synchronize_irq() may return followed by the
 result of netif_rx_schedule being made visible.
 
 This patch (mostly written by Linus) fixes this by using spin
 locks instead of memory barries on the synchronize_irq() path.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Good for me.

Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED]

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

 What may happen is that action can either float upwards to give
 
   spin_lock
   action
   set IRQ_INPROGRESS
   spin_unlock
 
   spin_lock
   clear IRQ_INPROGRESS
   spin_unlock
 
 or it can float downwards to give
 
   spin_lock
   set IRQ_INPROGRESS
   spin_unlock
 
   spin_lock
   clear IRQ_INPROGRESS
   action
   spin_unlock
 

Well, we are generally safer here. That is, unless action is a store,
it will not pass spin_lock, at least not on powerpc afaik.

In fact, if action, as it is in our case, is something like

if (foo)
return;

We cant execute the store inside spin_lock() without having loaded
foo, there is an implicit dependency here.

But anyway, Linus patch fixes that too if it was a problem. Now if
we grep for while (test_bit and while (!test_bit I'm sure we'll find
other similar surprises.

That's also one of the reasons why I _love_ nick patches that add a
proper lock/unlock API on bits, because at least those who are doing
those hand-made pseudo-locks with bitops to save space will be
getting a proper lock/unlock API, no more excuse.

The network stack is doing more fancy things so it's harder (though I
wonder sometimes if it's really worth the risks taken for not using
spinlocks... maybe).

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

On Fri, 2007-10-19 at 12:20 +0800, Herbert Xu wrote:
 
 That's why I think this patch is in fact the only one that
 solves all the races in this thread.  The case that it solves
 which the lock/unlock patch does not is the one where action
 flows downwards past the clearing of IRQ_INPROGRESS.  I missed
 this case earlier.
 
 In fact this bug exists elsewhere too.  For example, the network
 stack does this in net/sched/sch_generic.c:
 
 /* Wait for outstanding qdisc_run calls. */
 while (test_bit(__LINK_STATE_QDISC_RUNNING, dev-state))
 yield();
 
 This has the same problem as the current synchronize_irq code.

The network stack always made be nervous with it's bitops use as locks.
Any loop of test_bit() alone as a synchronisation method, without locks
or barriers is, imho, inherently buggy.

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

 The whole lock/set IRQ_INPROGRESS/unlock path can then only happen
 before the locked section above, in which case we see and wait nicely
 and all is good, or after, in which case the store to foo will be
 visible to the IRQ handler as it will be ordered with the unlock in the
 code above.

Note that napi_synchronize needs a slightly different treatement.

Here, the situation boils down to:

one CPU does:

foo = 1;
while(test_bit(bar))
barrier();

and the other:

if (!test_and_set_bit(bar)) {
read  use foo
smp_mb__before_clear_bit();
clear_bit(bar);
}

The good thing here is that read  use foo is part of the critical
section (I hate hand-made locks ...) defined by bar which makes
things somewhat easier than the synchronize_irq() case.

I think a simple smp_mb(); here after foo = 1; is enough, which means
basically just having an smp_mp(); inside napi_synchronize(), before
the test_bit(). Or do I miss something ?

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Herbert Xu
On Thu, Oct 18, 2007 at 08:26:45PM -0700, Linus Torvalds wrote:
 
 
 On Thu, 18 Oct 2007, Linus Torvalds wrote:
 
  I *think* it should work with something like
  
  for (;;) {
  smp_rmb();
  if (!spin_is_locked(desc-lock)) {
  smp_rmb();
  if (!(desc-status  IRQ_INPROGRESS)
  break;
  }
  cpu_relax();
  }
 
 I'm starting to doubt this. 
 
 One of the issues is that we still need the smp_mb() in front of the loop 
 (because we want to serialize the loop with any writes in the caller).

Right.  I think if we accept the definition of a spin lock/unlock
that Nick outlined earlier, then we can see that on the IRQ path
there simply isn't a memory barrier between the changing of the
status field and the execution of the action:

spin_lock
set IRQ_INPROGRESS
spin_unlock

action

spin_lock
clear IRQ_INPROGRESS
spin_unlock

What may happen is that action can either float upwards to give

spin_lock
action
set IRQ_INPROGRESS
spin_unlock

spin_lock
clear IRQ_INPROGRESS
spin_unlock

or it can float downwards to give

spin_lock
set IRQ_INPROGRESS
spin_unlock

spin_lock
clear IRQ_INPROGRESS
action
spin_unlock

As a result we can add as many barriers as we want on the slow
(synchronize_irq) path and it just won't do any good (not unless
we add some barriers on the IRQ path == fast path).

What we do have on the right though is the fact in some ways
action behaves as if it's inside the spin lock even though it's
not.  In particular, action cannot float up past the first spin
lock nor can it float down past the last spin unlock.

 See commit fa490cfd15d7ce0900097cc4e60cfd7a76381138 and ponder. Maybe we 
 should take the same approach here, and do something like
 
   repeat:
   /* Optimistic, no-locking loop */
   while (desc-status  IRQ_INPROGRESS)
   cpu_relax();
 
   /* Ok, that indicated we're done: double-check carefully */
   spin_lock_irqsave(desc-lock, flags);
   status = desc-status;
   spin_unlock_irqrestore(desc-lock, flags);
 
   /* Oops, that failed? */
   if (status  IRQ_INPROGRESS)
   goto repeat;

That's why I think this patch is in fact the only one that
solves all the races in this thread.  The case that it solves
which the lock/unlock patch does not is the one where action
flows downwards past the clearing of IRQ_INPROGRESS.  I missed
this case earlier.

In fact this bug exists elsewhere too.  For example, the network
stack does this in net/sched/sch_generic.c:

/* Wait for outstanding qdisc_run calls. */
while (test_bit(__LINK_STATE_QDISC_RUNNING, dev-state))
yield();

This has the same problem as the current synchronize_irq code.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

 
   repeat:
   /* Optimistic, no-locking loop */
   while (desc-status  IRQ_INPROGRESS)
   cpu_relax();
 
   /* Ok, that indicated we're done: double-check carefully */
   spin_lock_irqsave(desc-lock, flags);
   status = desc-status;
   spin_unlock_irqrestore(desc-lock, flags);
 
   /* Oops, that failed? */
   if (status  IRQ_INPROGRESS)
   goto repeat;
 
 Hmm?

Paulus and I convinced ourselves that this would work. If we call our
variable that gets set before synchronize_irq and read in the IRQ
handler foo, we get to:

- writing foo can travel down at most to just before the unlock in the
code above

- reading foo can travel up out of the IRQ handler at most just after
the lock in the code that sets IRQ_INPROGRESS.

The whole lock/set IRQ_INPROGRESS/unlock path can then only happen
before the locked section above, in which case we see and wait nicely
and all is good, or after, in which case the store to foo will be
visible to the IRQ handler as it will be ordered with the unlock in the
code above.

Pfiew !

So Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED]

Thanks !

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Herbert Xu
Nick Piggin [EMAIL PROTECTED] wrote:
 
 First of all let's agree on some basic assumptions:

 * A pair of spin lock/unlock subsumes the effect of a full mb.
 
 Not unless you mean a pair of spin lock/unlock as in
 2 spin lock/unlock pairs (4 operations).
 
 *X = 10;
 spin_lock(lock);
 /* *Y speculatively loaded here */
 /* store to *X leaves CPU store queue here */
 spin_unlock(lock);
 y = *Y;

Good point.

Although in this case we're still safe because in the worst
cases:

CPU0CPU1
irq_sync = 1
synchronize_irq
spin lock
load IRQ_INPROGRESS
irq_sync sync is visible
spin unlock
spin lock
load irq_sync
while (IRQ_INPROGRESS)
wait
return
set IRQ_INPROGRESS
spin unlock
tg3_msi
ack IRQ
if (irq_sync)
return
spin lock
clear IRQ_INPROGRESS
spin unlock



CPU0CPU1
spin lock
load irq_sync
irq_sync = 1
synchronize_irq
set IRQ_INPROGRESS
spin unlock
spin lock
load IRQ_INPROGRESS
irq_sync sync is visible
spin unlock
while (IRQ_INPROGRESS)
wait
tg3_msi
ack IRQ
if (irq_sync)
return
do work
spin lock
clear IRQ_INPROGRESS
spin unlock
return

So because we're using the same lock on both sides, it does
do the right thing even without the memory barrier.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Linus Torvalds


On Thu, 18 Oct 2007, Linus Torvalds wrote:

 I *think* it should work with something like
 
   for (;;) {
   smp_rmb();
   if (!spin_is_locked(desc-lock)) {
   smp_rmb();
   if (!(desc-status  IRQ_INPROGRESS)
   break;
   }
   cpu_relax();
   }

I'm starting to doubt this. 

One of the issues is that we still need the smp_mb() in front of the loop 
(because we want to serialize the loop with any writes in the caller).

The other issue is that I don't think it's enough that we saw the 
descriptor lock unlocked, and then the IRQ_INPROGRESS bit clear. It might 
have been unlocked *while* the IRQ was in progress, but the interrupt 
handler is now in its last throes, and re-takes the spinlock and clears 
the IRQ_INPROGRESS thing. But we're not actually happy until we've seen 
the IRQ_INPROGRESS bit clear and the spinlock has been released *again*.

So those two tests should actually be the other way around: we want to see 
the IRQ_INPROGRESS bit clear first.

It's all just too damn subtle and clever. Something like this should not 
need to be that subtle. 

Maybe the rigth thing to do is to not rely on *any* ordering what-so-ever, 
and just make the rule be: if you look at the IRQ_INPROGRESS bit, you'd 
better hold the descriptor spinlock, and not have any subtle ordering 
issues at all.

But that makes us have a loop with getting/releasing the lock all the 
time, and then we get back to horrid issues with cacheline bouncing and 
unfairness of cache accesses across cores (ie look at the issues we had 
with the runqueue starvation in wait_task_inactive()).

Those were fixed by starting out with the non-locked and totally unsafe 
versions, but then having one last check with lock held, and repeat only 
if that says things went south. 

See commit fa490cfd15d7ce0900097cc4e60cfd7a76381138 and ponder. Maybe we 
should take the same approach here, and do something like

repeat:
/* Optimistic, no-locking loop */
while (desc-status  IRQ_INPROGRESS)
cpu_relax();

/* Ok, that indicated we're done: double-check carefully */
spin_lock_irqsave(desc-lock, flags);
status = desc-status;
spin_unlock_irqrestore(desc-lock, flags);

/* Oops, that failed? */
if (status  IRQ_INPROGRESS)
goto repeat;

Hmm?

Linus
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Linus Torvalds


On Fri, 19 Oct 2007, Herbert Xu wrote:

 In other words I think this patch is great :)

Hey, I appreciate it, but I do think that the spinlock only to 
immediately unlock it again is pretty horrid.

I'm convinced that there should be some way to do this without actually 
taking the lock. I *think* it should work with something like

for (;;) {
smp_rmb();
if (!spin_is_locked(desc-lock)) {
smp_rmb();
if (!(desc-status  IRQ_INPROGRESS)
break;
}
cpu_relax();
}

instead. Which basically requires that we see the descriptor lock being 
not held, before we see the IRQ_INPROGRESS bit being clear. Put another 
way: it loops until it sees the lock having been released, and the 
IRQ_INPROGRESS bit being clear after that.

The above requires no serializing instructions on x86, which is a good 
goal (now that smp_rmb() is just a compiler barrier). And it doesn't 
actually have to bounce any cachelines.

And it doesn't have that ugly get lock only to release it, which just 
makes me go Eww!.

But it's a bit subtler. It basically depends on the fact that 
spin_unlock() obviously has to make sure that there is a release barrier 
in the unlock, so any writes done (to the IRQ_INPROGRESS bit) within the 
locked region *must* be visible before the spinlock itself has been 
released.

So somebody should:
 - use another pair of eyes and brains to back me up on this.
 - write up some coherent changelog entry, using the emails that have 
   passed back and forth.
 - actually turn the above into a tested patch with a comment.

And I'm pushing for that somebody being somebody else than me ;)

Linus
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Herbert Xu
On Thu, Oct 18, 2007 at 04:39:59PM -0700, Linus Torvalds wrote:

 Basic notion: the only thing that serializes the IRQ_INPROGRESS flag is 
 the descriptor lock. And we don't have to (or even want to!) hold it while 
 waiting for the thing, but we want to *have*held*it* in between whatever 
 we're synchronizing with.

Thinking about this again and checking code I have come to the
conclusion that

1) There is a race (in some drivers) even with the full mb.
2) Linus's patch fixes it.
3) It's difficult to fix it elegantly in the driver.

In other words I think this patch is great :)

First of all let's agree on some basic assumptions:

* A pair of spin lock/unlock subsumes the effect of a full mb.
* A spin lock in general only equates to (SS/SL/LL).
* A spin unlock in general only equates to (SS/LS).

In particular, a load after a spin unlock may pass before the
spin unlock.

Here is the race (with tg3 as the only example that I know of).
The driver attempts to quiesce interrupts such that after the
call to synchronize_irq it is intended that no further IRQ
handler calls for that device will do any work besides acking
the IRQ.

Here is how it does it:

CPU0CPU1
spin lock
load irq_sync
irq_sync = 1
smp_mb
synchronize_irq()
while (IRQ_INPROGRESS)
wait
return
set IRQ_INPROGRESS
spin unlock
tg3_msi
ack IRQ
if (irq_sync)
return
do work

The problem here is that load of irq_sync in the handler has
passed above the setting of IRQ_INPROGRESS.

Linus's patch fixes it because this becomes:

CPU0CPU1
spin lock
load irq_sync
irq_sync = 1
smp_mb
synchronize_irq
set IRQ_INPROGRESS
spin unlock
spin lock
spin unlock
tg3_msi
ack IRQ
if (irq_sync)
return
do work
while (IRQ_INPROGRESS)
wait
spin lock
clear IRQ_INPROGRESS
spin unlock
return

Even though we still do the work on the right we will now notice
the INPROGRESS flag on the left and wait.

It's hard to fix this in the drivers because they'd either have
to access the desc lock or add a full mb to the fast path on the
right.

Once this goes in we can also remove the smp_mb from tg3.c.  BTW,
a lot of drivers (including the fusion example Ben quoted) call
synchronize_irq before free_irq.  This is unnecessary because
the latter already calls it anyway.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

 So how about something like this (untested! not necessarily very well 
 thought through either!)
 
 Basic notion: the only thing that serializes the IRQ_INPROGRESS flag is 
 the descriptor lock. And we don't have to (or even want to!) hold it while 
 waiting for the thing, but we want to *have*held*it* in between whatever 
 we're synchronizing with.
 
 The internal irq handler functions already held the lock when they did 
 whatever they need to serialize - and they are possibly performance 
 critical too - so they use the internal function that doesn't get the 
 lock unnecessarily again.

That may do the trick as the read can't cross the spin_lock (it can
cross spin_unlock but not lock). Advantage over adding a barrier to
handle_IRQ_event() is that it keeps the overhead to the slow path
(synchronize_irq).

Note that I didn't actually experience a problem here. I just came upon
that by accident while thinking about a similar issue I have with
napi_synchronize().

Looks good to me on a first glance (unfortunately a bit ugly but heh).

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Linus Torvalds


On Fri, 19 Oct 2007, Benjamin Herrenschmidt wrote:
 
 I agree and you can see that in fact, we don't have enough barrier on
 the other side since spin_unlock doesn't prevent subsequent loads from
 crossing a previous store...
 
 I wonder if that's worth trying to address, adding a barrier in
 handle_IRQ_event for example, or we can continue ignoring the barrier
 and let some drivers do their own fixes in fancy ways.

So how about something like this (untested! not necessarily very well 
thought through either!)

Basic notion: the only thing that serializes the IRQ_INPROGRESS flag is 
the descriptor lock. And we don't have to (or even want to!) hold it while 
waiting for the thing, but we want to *have*held*it* in between whatever 
we're synchronizing with.

The internal irq handler functions already held the lock when they did 
whatever they need to serialize - and they are possibly performance 
critical too - so they use the internal function that doesn't get the 
lock unnecessarily again.

Hmm? 

Linus

---
 kernel/irq/manage.c |   22 ++
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 80eab7a..f3e9575 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -14,6 +14,18 @@
 
 #include internals.h
 
+/*
+ * Internally wait for IRQ_INPROGRESS to go away on other CPU's,
+ * after having serialized with the descriptor lock.
+ */
+static inline void do_synchronize_irq(struct irq_desc *desc)
+{
+#ifdef CONFIG_SMP
+   while (desc-status  IRQ_INPROGRESS)
+   cpu_relax();
+#endif
+}
+
 #ifdef CONFIG_SMP
 
 /**
@@ -28,13 +40,15 @@
  */
 void synchronize_irq(unsigned int irq)
 {
+   unsigned long flags;
struct irq_desc *desc = irq_desc + irq;
 
if (irq = NR_IRQS)
return;
 
-   while (desc-status  IRQ_INPROGRESS)
-   cpu_relax();
+   spin_lock_irqsave(desc-lock, flags);
+   spin_unlock_irqrestore(desc-lock, flags);
+   do_synchronize_irq(desc);
 }
 EXPORT_SYMBOL(synchronize_irq);
 
@@ -129,7 +143,7 @@ void disable_irq(unsigned int irq)
 
disable_irq_nosync(irq);
if (desc-action)
-   synchronize_irq(irq);
+   do_synchronize_irq(desc);
 }
 EXPORT_SYMBOL(disable_irq);
 
@@ -443,7 +457,7 @@ void free_irq(unsigned int irq, void *dev_id)
unregister_handler_proc(irq, action);
 
/* Make sure it's not being used on another CPU */
-   synchronize_irq(irq);
+   do_synchronize_irq(desc);
 #ifdef CONFIG_DEBUG_SHIRQ
/*
 * It's a shared IRQ -- the driver ought to be
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Benjamin Herrenschmidt

On Thu, 2007-10-18 at 15:52 -0700, Linus Torvalds wrote:
 
 On Fri, 19 Oct 2007, Benjamin Herrenschmidt wrote:
  
  The barrier would guarantee that ioc-active (and in fact the write to
  the chip too above) are globally visible
 
 No, it doesn't really guarantee that.
 
 The thing is, there is no such thing as globally visible.
 
 There is a ordering of visibility wrt CPU's, but it's not global, it's 
 quite potentially per-CPU. So a barrier on one CPU doesn't guarantee 
 anything at all without a barrier on the *other* CPU.
 
 That said, the interrupt handling itself contains various barriers on the 
 CPU's that receive interrupts, thanks to the spinlocking. But I do agree 
 with Herbert that adding a smb_mb() is certainly in no way obviously 
 correct, because it doesn't talk about what the other side does wrt 
 barriers and that word in memory.

I agree and you can see that in fact, we don't have enough barrier on
the other side since spin_unlock doesn't prevent subsequent loads from
crossing a previous store...

I wonder if that's worth trying to address, adding a barrier in
handle_IRQ_event for example, or we can continue ignoring the barrier
and let some drivers do their own fixes in fancy ways.

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Linus Torvalds


On Fri, 19 Oct 2007, Benjamin Herrenschmidt wrote:
 
 The barrier would guarantee that ioc-active (and in fact the write to
 the chip too above) are globally visible

No, it doesn't really guarantee that.

The thing is, there is no such thing as globally visible.

There is a ordering of visibility wrt CPU's, but it's not global, it's 
quite potentially per-CPU. So a barrier on one CPU doesn't guarantee 
anything at all without a barrier on the *other* CPU.

That said, the interrupt handling itself contains various barriers on the 
CPU's that receive interrupts, thanks to the spinlocking. But I do agree 
with Herbert that adding a smb_mb() is certainly in no way obviously 
correct, because it doesn't talk about what the other side does wrt 
barriers and that word in memory.

Linus
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Nick Piggin
On Friday 19 October 2007 12:32, Herbert Xu wrote:

 First of all let's agree on some basic assumptions:

 * A pair of spin lock/unlock subsumes the effect of a full mb.

Not unless you mean a pair of spin lock/unlock as in
2 spin lock/unlock pairs (4 operations).

*X = 10;
spin_lock(lock);
/* *Y speculatively loaded here */
/* store to *X leaves CPU store queue here */
spin_unlock(lock);
y = *Y;

 * A spin lock in general only equates to (SS/SL/LL).
 * A spin unlock in general only equates to (SS/LS).

I don't use the sparc barriers, so they don't come naturally to
me ;)

I think both loads and stores can pass into the critical section
by having the spin_lock pass earlier ops, or by having spin_unlock
be passed by later ones.


 In particular, a load after a spin unlock may pass before the
 spin unlock.

 Here is the race (with tg3 as the only example that I know of).
 The driver attempts to quiesce interrupts such that after the
 call to synchronize_irq it is intended that no further IRQ
 handler calls for that device will do any work besides acking
 the IRQ.

 Here is how it does it:

 CPU0  CPU1
   spin lock
   load irq_sync
 irq_sync = 1
 smp_mb
 synchronize_irq()
   while (IRQ_INPROGRESS)
   wait
   return
   set IRQ_INPROGRESS
   spin unlock
   tg3_msi
   ack IRQ
   if (irq_sync)
   return
   do work

 The problem here is that load of irq_sync in the handler has
 passed above the setting of IRQ_INPROGRESS.

 Linus's patch fixes it because this becomes:

 CPU0  CPU1
   spin lock
   load irq_sync
 irq_sync = 1
 smp_mb
 synchronize_irq
   set IRQ_INPROGRESS
   spin unlock
   spin lock
   spin unlock
   tg3_msi
   ack IRQ
   if (irq_sync)
   return
   do work
   while (IRQ_INPROGRESS)
   wait
   spin lock
   clear IRQ_INPROGRESS
   spin unlock
   return

 Even though we still do the work on the right we will now notice
 the INPROGRESS flag on the left and wait.

 It's hard to fix this in the drivers because they'd either have
 to access the desc lock or add a full mb to the fast path on the
 right.

 Once this goes in we can also remove the smp_mb from tg3.c.  BTW,
 a lot of drivers (including the fusion example Ben quoted) call
 synchronize_irq before free_irq.  This is unnecessary because
 the latter already calls it anyway.

 Cheers,
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-18 Thread Nick Piggin
On Friday 19 October 2007 13:28, Herbert Xu wrote:
 Nick Piggin [EMAIL PROTECTED] wrote:
  First of all let's agree on some basic assumptions:
 
  * A pair of spin lock/unlock subsumes the effect of a full mb.
 
  Not unless you mean a pair of spin lock/unlock as in
  2 spin lock/unlock pairs (4 operations).
 
  *X = 10;
  spin_lock(lock);
  /* *Y speculatively loaded here */
  /* store to *X leaves CPU store queue here */
  spin_unlock(lock);
  y = *Y;

 Good point.

 Although in this case we're still safe because in the worst
 cases:

 CPU0  CPU1
 irq_sync = 1
 synchronize_irq
   spin lock
   load IRQ_INPROGRESS
 irq_sync sync is visible
   spin unlock
   spin lock
   load irq_sync
   while (IRQ_INPROGRESS)
   wait
   return
   set IRQ_INPROGRESS
   spin unlock
   tg3_msi
   ack IRQ
   if (irq_sync)
   return
   spin lock
   clear IRQ_INPROGRESS
   spin unlock

 

 CPU0  CPU1
   spin lock
   load irq_sync
 irq_sync = 1
 synchronize_irq
   set IRQ_INPROGRESS
   spin unlock
   spin lock
   load IRQ_INPROGRESS
 irq_sync sync is visible
   spin unlock
   while (IRQ_INPROGRESS)
   wait
   tg3_msi
   ack IRQ
   if (irq_sync)
   return
   do work
   spin lock
   clear IRQ_INPROGRESS
   spin unlock
   return

 So because we're using the same lock on both sides, it does
 do the right thing even without the memory barrier.


Yeah, if you've got the lock on both sides there, then I
agree it will be correct.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[PATCH] synchronize_irq needs a barrier

2007-10-17 Thread Benjamin Herrenschmidt
synchronize_irq needs at the very least a compiler barrier and a
read barrier on SMP, but there are enough cases around where a
write barrier is also needed and it's not a hot path so I prefer
using a full smp_mb() here.

It will degrade to a compiler barrier on !SMP.

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---

Index: linux-work/kernel/irq/manage.c
===
--- linux-work.orig/kernel/irq/manage.c 2007-10-18 11:22:16.0 +1000
+++ linux-work/kernel/irq/manage.c  2007-10-18 11:22:20.0 +1000
@@ -33,6 +33,7 @@ void synchronize_irq(unsigned int irq)
if (irq = NR_IRQS)
return;
 
+   smp_mb();
while (desc-status  IRQ_INPROGRESS)
cpu_relax();
 }


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-17 Thread Andrew Morton
On Thu, 18 Oct 2007 11:25:42 +1000
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

 synchronize_irq needs at the very least a compiler barrier and a
 read barrier on SMP,

Why?

 but there are enough cases around where a
 write barrier is also needed and it's not a hot path so I prefer
 using a full smp_mb() here.
 
 It will degrade to a compiler barrier on !SMP.
 
 Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
 ---
 
 Index: linux-work/kernel/irq/manage.c
 ===
 --- linux-work.orig/kernel/irq/manage.c   2007-10-18 11:22:16.0 
 +1000
 +++ linux-work/kernel/irq/manage.c2007-10-18 11:22:20.0 +1000
 @@ -33,6 +33,7 @@ void synchronize_irq(unsigned int irq)
   if (irq = NR_IRQS)
   return;
  
 + smp_mb();
   while (desc-status  IRQ_INPROGRESS)
   cpu_relax();
  }

Anyone reading this code is going to ask wtf is that for.  It needs a
comment telling them.


mb() is the new lock_kernel().  Sigh.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-17 Thread Benjamin Herrenschmidt

  Index: linux-work/kernel/irq/manage.c
  ===
  --- linux-work.orig/kernel/irq/manage.c 2007-10-18 11:22:16.0 
  +1000
  +++ linux-work/kernel/irq/manage.c  2007-10-18 11:22:20.0 +1000
  @@ -33,6 +33,7 @@ void synchronize_irq(unsigned int irq)
  if (irq = NR_IRQS)
  return;
   
  +   smp_mb();
  while (desc-status  IRQ_INPROGRESS)
  cpu_relax();
   }
 
 Anyone reading this code is going to ask wtf is that for.  It needs a
 comment telling them.
 
 
 mb() is the new lock_kernel().  Sigh.

Ugh ?

That sounds fairly obvious to me :-) we are reading a value, that is
totally unordered, nothing to do about lock kernel or whatever, if we
want the above statement to make any sense in any kind of usage
scenario, it needs to be ordered vs. what happens before.

For example, take a construct like:

device-my_hw_is_off = 1;
synchronize_irq();
turn_off_hardware();

That basically makes sure the irq either sees device-my_hw_is_off
being set to 1, or if an irq handler is already in progress and hasn't
seen it, we wait for it to complete.

(You can replace hw_is_off with anything that we want to set and make
sure the IRQ handler sees it before proceeding. It could be clearing a
pointer to something and make sure the irq sees it before freeing the
data, etc...).

I think pretty much any use of synchronize_irq() I can imagine needs
such kind of ordering... or it simply doesn't synchronize anything :-)

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-17 Thread Linus Torvalds


On Thu, 18 Oct 2007, Benjamin Herrenschmidt wrote:
  
 + smp_mb();
   while (desc-status  IRQ_INPROGRESS)
   cpu_relax();

So, what exactly does it protect against? At a minimum, this needs a 
comment in the changelog, and probably preferably in the source code too.

The thing is, synchronize_irq() can only protect against interrupts that 
are *already*running* on another CPU, and the caller must have made sure 
that no new interrupts are coming in (or at least that whatever new 
interrupts that come in will not pick up a certain piece of data). 

So I can imagine that the smb_mb() is there so that the caller - who has 
cleared some list or done something like that - will have any preceding 
writes that it did be serialized with actually checking the old state of 
desc-status.

Fair enough - I can see that a smp_mb() is useful. But I think it needs 
documenting as such, and preferably with an example of how this actually 
happened in the first place (do you have one?)

The synchronize_irq() users that are really fundamental (ie the irq 
datastructures themselves) all should use the irq descriptor spinlock, and 
should *not* be needing any of this, since they would have serialized with 
whoever actually set the IRQ_INPROGRESS bit in the first place.

So in *that* sense, I think the memory barrier is useless, and I can't 
make up my mind whether it's good or bad. Which is why I'd really like to 
have an example scenario spelled out where it makes a difference..

Ok?

Linus
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-17 Thread Benjamin Herrenschmidt

On Wed, 2007-10-17 at 19:12 -0700, Linus Torvalds wrote:
 
 So, what exactly does it protect against? At a minimum, this needs a 
 comment in the changelog, and probably preferably in the source code too.

I replied to Andrew, but I agree, it's worth a comment, I'll add one.

 The thing is, synchronize_irq() can only protect against interrupts that 
 are *already*running* on another CPU, and the caller must have made sure 
 that no new interrupts are coming in (or at least that whatever new 
 interrupts that come in will not pick up a certain piece of data). 
 
 So I can imagine that the smb_mb() is there so that the caller - who has 
 cleared some list or done something like that - will have any preceding 
 writes that it did be serialized with actually checking the old state of 
 desc-status.

Yes.

 Fair enough - I can see that a smp_mb() is useful. But I think it needs 
 documenting as such, and preferably with an example of how this actually 
 happened in the first place (do you have one?)

One could argue that it's the caller that should do the mb() but that
would really be asking for trouble. Since most usage scenario require
it, and it's not a hot path, I prefer having it here.

Note that some kind of read barrier or compiler barrier should be needed
regardless, or we are just not sync'ing with anything at all (we may
have loaded the value ages ago and thus operate on a totally stale
value). I prefer a full barrier to also ensure all previous stores are
pushed out.

 The synchronize_irq() users that are really fundamental (ie the irq 
 datastructures themselves) all should use the irq descriptor spinlock, and 
 should *not* be needing any of this, since they would have serialized with 
 whoever actually set the IRQ_INPROGRESS bit in the first place.

That isn't always the case. You can have very efficient lock-less fast
path and use synchronize_irq as a kind of RCU-like way to have the slow
path do the right thing.

 So in *that* sense, I think the memory barrier is useless, and I can't 
 make up my mind whether it's good or bad. Which is why I'd really like to 
 have an example scenario spelled out where it makes a difference..

Well, typically, I can clear a pointer and want to make sure my IRQ
handler isn't using it anymore before freeing the memory. Or set a HW
is off flag. Having spinlocks all over isn't always the best approach
on things that are really hot path, it's fair enough to use lockless
approaches like that by moving the burden to the slow path that does
synchronize_irq.

In general, I tend to think that for this function to make any sense
(that is, to synchronize anything at all), it needs a barrier or you are
just making a decision based on a totally random value of desc-status
since it can have been re-ordered, speculatively loaded, pre-fetched,
whatever'ed... :-).

Want a respin with a comment ?

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] synchronize_irq needs a barrier

2007-10-17 Thread Benjamin Herrenschmidt

 
 In general, I tend to think that for this function to make any sense
 (that is, to synchronize anything at all), it needs a barrier or you are
 just making a decision based on a totally random value of desc-status
 since it can have been re-ordered, speculatively loaded, pre-fetched,
 whatever'ed... :-).

Take a real life example:

drivers/message/fusion/mptbase.c

/* Disable interrupts! */
CHIPREG_WRITE32(ioc-chip-IntMask, 0x);

ioc-active = 0;
synchronize_irq(pdev-irq);

And we aren't in a spinlock here. 

That's just a random example grepped I think I see a few more. Then,
some drivers like tg3 actually do an smp_mb() before calling
synchronize_irq(). But then, some don't.

I think trying to have all drivers be correct here is asking for
trouble, we'd rather have synchronize_irq() be uber-safe. It's not like
it was used in hot path anyway.

Ben.




___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev