Florian Fainelli <f.faine...@gmail.com> writes: > On 05/10/2017 03:31 AM, Phil Elwell wrote: >> On 10/05/2017 11:09, Marc Zyngier wrote: >>> On 10/05/17 10:05, Phil Elwell wrote: >>>> On 10/05/2017 09:55, Marc Zyngier wrote: >>>>> On Wed, May 10 2017 at 9:27:10 am BST, Phil Elwell >>>>> <p...@raspberrypi.org> wrote: >>>>>> On 10/05/2017 08:42, Marc Zyngier wrote: >>>>>>> On 09/05/17 20:02, Phil Elwell wrote: >>>>>>>> On 09/05/2017 19:53, Marc Zyngier wrote: >>>>>>>>> On 09/05/17 19:52, Phil Elwell wrote: >>>>>>>>>> On 09/05/2017 19:14, Marc Zyngier wrote: >>>>>>>>>>> On 09/05/17 19:08, Eric Anholt wrote: >>>>>>>>>>>> Marc Zyngier <marc.zyng...@arm.com> writes: >>>>>>>>>>>> >>>>>>>>>>>>> On 09/05/17 17:59, Eric Anholt wrote: >>>>>>>>>>>>>> Phil Elwell <p...@raspberrypi.org> writes: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> In order to reduce power consumption and bus traffic, it is >>>>>>>>>>>>>>> sensible >>>>>>>>>>>>>>> for secondary cores to enter a low-power idle state when >>>>>>>>>>>>>>> waiting to >>>>>>>>>>>>>>> be started. The wfe instruction causes a core to wait until an >>>>>>>>>>>>>>> event >>>>>>>>>>>>>>> or interrupt arrives before continuing to the next instruction. >>>>>>>>>>>>>>> The sev instruction sends a wakeup event to the other cores, so >>>>>>>>>>>>>>> call >>>>>>>>>>>>>>> it from bcm2836_smp_boot_secondary, the function that wakes up >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> waiting cores during booting. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It is harmless to use this patch without the corresponding >>>>>>>>>>>>>>> change >>>>>>>>>>>>>>> adding wfe to the ARMv7/ARMv8-32 stubs, but if the stubs are >>>>>>>>>>>>>>> updated >>>>>>>>>>>>>>> and this patch is not applied then the other cores will sleep >>>>>>>>>>>>>>> forever. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> See: https://github.com/raspberrypi/linux/issues/1989 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Signed-off-by: Phil Elwell <p...@raspberrypi.org> >>>>>>>>>>>>>>> --- >>>>>>>>>>>>>>> drivers/irqchip/irq-bcm2836.c | 3 +++ >>>>>>>>>>>>>>> 1 file changed, 3 insertions(+) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> diff --git a/drivers/irqchip/irq-bcm2836.c >>>>>>>>>>>>>>> b/drivers/irqchip/irq-bcm2836.c >>>>>>>>>>>>>>> index e10597c..6dccdf9 100644 >>>>>>>>>>>>>>> --- a/drivers/irqchip/irq-bcm2836.c >>>>>>>>>>>>>>> +++ b/drivers/irqchip/irq-bcm2836.c >>>>>>>>>>>>>>> @@ -248,6 +248,9 @@ static int __init >>>>>>>>>>>>>>> bcm2836_smp_boot_secondary(unsigned int cpu, >>>>>>>>>>>>>>> writel(secondary_startup_phys, >>>>>>>>>>>>>>> intc.base + LOCAL_MAILBOX3_SET0 + 16 * cpu); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> + dsb(sy); /* Ensure write has completed before waking >>>>>>>>>>>>>>> the other CPUs */ >>>>>>>>>>>>>>> + sev(); >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> return 0; >>>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> This is also the behavior that the standard arm64 spin-table >>>>>>>>>>>>>> method has, >>>>>>>>>>>>>> which we unfortunately can't quite use. >>>>>>>>>>>>> >>>>>>>>>>>>> And why is that so? Why do you have to reinvent the wheel (and >>>>>>>>>>>>> hide the >>>>>>>>>>>>> cloned wheel in an interrupt controller driver)? >>>>>>>>>>>>> >>>>>>>>>>>>> That doesn't seem right to me. >>>>>>>>>>>> >>>>>>>>>>>> The armv8 stubs (firmware-supplied code in the low page that do the >>>>>>>>>>>> spinning) do actually implement arm64's spin-table method. It's >>>>>>>>>>>> the >>>>>>>>>>>> armv7 stubs that use these registers in the irqchip instead of >>>>>>>>>>>> plain >>>>>>>>>>>> addresses in system memory. >>>>>>>>>>> >>>>>>>>>>> Let's put ARMv7 aside for the time being. If your firmware already >>>>>>>>>>> implements spin-tables, why don't you simply use that at least on >>>>>>>>>>> arm64? >>>>>>>>>> >>>>>>>>>> We do. >>>>>>>>> >>>>>>>>> Obviously not the way it is intended if you have to duplicate the core >>>>>>>>> architectural code in the interrupt controller driver, which couldn't >>>>>>>>> care less. >>>>>>>> >>>>>>>> If we were using this method on arm64 then the other cores would not >>>>>>>> start up >>>>>>>> because armstub8.S has always included a wfe. Nothing in the commit >>>>>>>> mentions >>>>>>>> arm64 - this is an ARCH=arm fix. >>>>>>> >>>>>>> Thanks for the clarification, which you could have added to the commit >>>>>>> message. >>>>>>> >>>>>>> The question still remains: why do we have CPU bring-up code in an >>>>>>> interrupt controller, instead of having it in the architecture code? >>>>>>> >>>>>>> The RPi-2 is the *only* platform to have its SMP bringup code outside of >>>>>>> arch/arm, so the first course of action would be to move that code where >>>>>>> it belongs. >>>>>> >>>>>> You were CC'd on the commit (41f4988cc287e5f836d3f6620c9f900bc9b560e9) >>>>>> that >>>>>> introduced bcm2836_smp_boot_secondary - it seems strange to start >>>>>> objecting >>>>>> now. >>>>> >>>>> Well, I'm far from being perfect. If I had noticed it, I'd have NACKed >>>>> it. >>>>> >>>>>> Yes, I think it is odd that it didn't go into arch/arm/mach-bcm, but in >>>>>> the interests of making changes in small, independent steps, do you have >>>>>> a >>>>>> problem with this commit? >>>>> >>>>> On its own, no. I'm just not keen on adding more unrelated stuff to this >>>>> file, so let's start with dealing with the original bug, and you can >>>>> then add this fix on top. >>>> >>>> That's an interesting use of the word "bug". From Wikipedia: >>>> >>>> "A software bug is an error, flaw, failure or fault in a computer program >>>> or >>>> system that causes it to produce an incorrect or unexpected result, or to >>>> behave in unintended ways." >>> >>> Whatever. Should I call it "pile of crap dumped in unsuitable locations" >>> instead? What does Wikipedia says about it? >>> >>>> Although your concerns are valid, the faults you are objecting to are not >>>> causing >>>> a malfunction of any kind. If we were to update the RPi firmware before >>>> this >>>> patch was merged then upstream users would be left with one wheel on their >>>> wagon. >>> >>> And that'd be your problem, not mine. Look, you can argue around this >>> all day, or you can fix this mess. Your choice. >> >> Is that the opinion of all here? > > The choice of word here got largely out of the original topic and I > surely did eat a ton of popcorn here. There are two things that need > fixing, and the time line and process for fixing these is clear: > > - your bugfix (Phil) is something that should be applied now, and > backported to -stable trees once the fix hits the irqchip tree (or Linus') > > - relocating the code that does the secondary boot out of > drivers/irqchip/ into arch/arm/mach-bcm/ needs to happen (Marc), and > this is 4.13 material, there is no urgency in doing this *right now*, > but it needs to happen > > Does that work for everyone?
Agreed. This patch, which we'll want to go to -stable, should clearly go in first. Marc's patch can go in after, since it's not a -stable candidate. Thomas, could you add the cc to stable when picking this patch?
signature.asc
Description: PGP signature