Re: [PATCH 0/7] riscv: Correctly handle IPIs already pending upon boot
On 9/10/20 3:08 AM, Rick Chen wrote: > Hi Sean > >> On 9/8/20 10:02 PM, Rick Chen wrote: >>> Hi Sean >>> On the K210, the prior stage bootloader does not clear IPIs. This presents a problem, because U-Boot up until this point assumes (with one exception) that IPIs are cleared when it starts. This series attempts to fix this in a robust manner, and fix several concurrency bugs I noticed while fixing these other areas. Heinrich previously submitted a patch addressing part of this problem in [1]. [1] https://patchwork.ozlabs.org/project/uboot/patch/20200811035648.3284-1-xypron.g...@gmx.de/ >>> >>> It sounds like that the bootloader does not deal with SMP flow well >>> and jump to u-boot-spl, right ? >>> >>> I have a question that why not try to fix the prior stage bootloader >>> to clear IPIs correctly? >> >> Because it is a ROM :) > > Is it a mask ROM or flash ROM ? I don't know, it's not documented. From what I can tell, it's controlled by the OTP device. However, I haven't thoroughly investigated it. > >> >>> >>> Actually I have encounter a similar SMP issue like you. >>> Our prior stage bootloader will jump to u-boot-spl with the incorrect >>> mstatus and result in the SMP working abnormal in u-boot-spl. >> >> Perhaps we should just clear MIE then? I originally had a patch in this >> series which moved the handle_ipi code into handle_trap, and got rid of >> the manual checks on the interrupt. Something like >> >> secondary_hart_loop: >> wfi >> j secondary_hart_loop >> >> Of course as part of that we would need to explicitly enable and disable >> interrupts. Perhaps not the worst idea, but I didn't include it here >> because I figure the current system work OK, even if it is not what one >> might expect. >> >>> I mean this is an individual case, not a general case. >>> If we try to cover any errors which come from any prior stage bootloaders, >>> the SMP flow will become more and more complicated and hard to debug. >> >> Of course, if a prior bootloader doesn't hold up its end of the >> contract, we can be left with some awful bugs to fix. U-Boot is >> generally not too bad to debug, but I've had an awful time whenever some >> concurrency sneaks into the mix. I think it's much better to confine the >> (necessary) complexity to as few files as possible, so that the rest of >> the code can be ignorant. I think part of that is verifying that we have >> everything in a known state, so that when we see something unexpected, >> we can handle it/panic/whatever instead of silently getting a bug. > > It sounds like an error handling and the errors come from the prior > stage bootloader. > Without U-Boot, does Kernel handle this kind of IPIs not clean > unexpected errors ? Well, software interrupts are disabled on each hart until riscv_intc_init is called (and enables soft irqs). This prevents the handler from being called before ipi_data is initialized. After that, handle_IPI can be called, but if ops == 0 (e.g. the IPI wasn't signaled by Linux), then it just goes to done. --Sean
Re: [PATCH 0/7] riscv: Correctly handle IPIs already pending upon boot
Hi Sean > On 9/8/20 10:02 PM, Rick Chen wrote: > > Hi Sean > > > >> On the K210, the prior stage bootloader does not clear IPIs. This presents > >> a problem, because U-Boot up until this point assumes (with one exception) > >> that IPIs are cleared when it starts. This series attempts to fix this in a > >> robust manner, and fix several concurrency bugs I noticed while fixing > >> these other areas. Heinrich previously submitted a patch addressing part of > >> this problem in [1]. > >> > >> [1] > >> https://patchwork.ozlabs.org/project/uboot/patch/20200811035648.3284-1-xypron.g...@gmx.de/ > >> > > > > It sounds like that the bootloader does not deal with SMP flow well > > and jump to u-boot-spl, right ? > > > > I have a question that why not try to fix the prior stage bootloader > > to clear IPIs correctly? > > Because it is a ROM :) Is it a mask ROM or flash ROM ? > > > > > Actually I have encounter a similar SMP issue like you. > > Our prior stage bootloader will jump to u-boot-spl with the incorrect > > mstatus and result in the SMP working abnormal in u-boot-spl. > > Perhaps we should just clear MIE then? I originally had a patch in this > series which moved the handle_ipi code into handle_trap, and got rid of > the manual checks on the interrupt. Something like > > secondary_hart_loop: > wfi > j secondary_hart_loop > > Of course as part of that we would need to explicitly enable and disable > interrupts. Perhaps not the worst idea, but I didn't include it here > because I figure the current system work OK, even if it is not what one > might expect. > > > I mean this is an individual case, not a general case. > > If we try to cover any errors which come from any prior stage bootloaders, > > the SMP flow will become more and more complicated and hard to debug. > > Of course, if a prior bootloader doesn't hold up its end of the > contract, we can be left with some awful bugs to fix. U-Boot is > generally not too bad to debug, but I've had an awful time whenever some > concurrency sneaks into the mix. I think it's much better to confine the > (necessary) complexity to as few files as possible, so that the rest of > the code can be ignorant. I think part of that is verifying that we have > everything in a known state, so that when we see something unexpected, > we can handle it/panic/whatever instead of silently getting a bug. It sounds like an error handling and the errors come from the prior stage bootloader. Without U-Boot, does Kernel handle this kind of IPIs not clean unexpected errors ? Thanks, Rick > > --Sean
Re: [PATCH 0/7] riscv: Correctly handle IPIs already pending upon boot
On 9/8/20 10:38 PM, Sean Anderson wrote: > On 9/8/20 10:02 PM, Rick Chen wrote: >> Hi Sean >> >>> On the K210, the prior stage bootloader does not clear IPIs. This presents >>> a problem, because U-Boot up until this point assumes (with one exception) >>> that IPIs are cleared when it starts. This series attempts to fix this in a >>> robust manner, and fix several concurrency bugs I noticed while fixing >>> these other areas. Heinrich previously submitted a patch addressing part of >>> this problem in [1]. >>> >>> [1] >>> https://patchwork.ozlabs.org/project/uboot/patch/20200811035648.3284-1-xypron.g...@gmx.de/ >>> >> >> It sounds like that the bootloader does not deal with SMP flow well >> and jump to u-boot-spl, right ? >> >> I have a question that why not try to fix the prior stage bootloader >> to clear IPIs correctly? > > Because it is a ROM :) Err, perhaps I should clarify. When I say "prior stage bootloader," I mean that in the general sense of "anything which comes before U-Boot," and not to refer specifically to SPL or TPL. For the k210, this is something akin to the ZSBL on an fu540. --Sean
Re: [PATCH 0/7] riscv: Correctly handle IPIs already pending upon boot
On 9/8/20 10:02 PM, Rick Chen wrote: > Hi Sean > >> On the K210, the prior stage bootloader does not clear IPIs. This presents >> a problem, because U-Boot up until this point assumes (with one exception) >> that IPIs are cleared when it starts. This series attempts to fix this in a >> robust manner, and fix several concurrency bugs I noticed while fixing >> these other areas. Heinrich previously submitted a patch addressing part of >> this problem in [1]. >> >> [1] >> https://patchwork.ozlabs.org/project/uboot/patch/20200811035648.3284-1-xypron.g...@gmx.de/ >> > > It sounds like that the bootloader does not deal with SMP flow well > and jump to u-boot-spl, right ? > > I have a question that why not try to fix the prior stage bootloader > to clear IPIs correctly? Because it is a ROM :) > > Actually I have encounter a similar SMP issue like you. > Our prior stage bootloader will jump to u-boot-spl with the incorrect > mstatus and result in the SMP working abnormal in u-boot-spl. Perhaps we should just clear MIE then? I originally had a patch in this series which moved the handle_ipi code into handle_trap, and got rid of the manual checks on the interrupt. Something like secondary_hart_loop: wfi j secondary_hart_loop Of course as part of that we would need to explicitly enable and disable interrupts. Perhaps not the worst idea, but I didn't include it here because I figure the current system work OK, even if it is not what one might expect. > I mean this is an individual case, not a general case. > If we try to cover any errors which come from any prior stage bootloaders, > the SMP flow will become more and more complicated and hard to debug. Of course, if a prior bootloader doesn't hold up its end of the contract, we can be left with some awful bugs to fix. U-Boot is generally not too bad to debug, but I've had an awful time whenever some concurrency sneaks into the mix. I think it's much better to confine the (necessary) complexity to as few files as possible, so that the rest of the code can be ignorant. I think part of that is verifying that we have everything in a known state, so that when we see something unexpected, we can handle it/panic/whatever instead of silently getting a bug. --Sean
Re: [PATCH 0/7] riscv: Correctly handle IPIs already pending upon boot
Hi Sean > On the K210, the prior stage bootloader does not clear IPIs. This presents > a problem, because U-Boot up until this point assumes (with one exception) > that IPIs are cleared when it starts. This series attempts to fix this in a > robust manner, and fix several concurrency bugs I noticed while fixing > these other areas. Heinrich previously submitted a patch addressing part of > this problem in [1]. > > [1] > https://patchwork.ozlabs.org/project/uboot/patch/20200811035648.3284-1-xypron.g...@gmx.de/ > It sounds like that the bootloader does not deal with SMP flow well and jump to u-boot-spl, right ? I have a question that why not try to fix the prior stage bootloader to clear IPIs correctly? Actually I have encounter a similar SMP issue like you. Our prior stage bootloader will jump to u-boot-spl with the incorrect mstatus and result in the SMP working abnormal in u-boot-spl. I mean this is an individual case, not a general case. If we try to cover any errors which come from any prior stage bootloaders, the SMP flow will become more and more complicated and hard to debug. That is why it does not need implement SMP flow in U-Boot proper with SBI v0.2 HSM extension. Thanks, Rick > > Sean Anderson (7): > Revert "riscv: Clear pending interrupts before enabling IPIs" > riscv: Match memory barriers between send_ipi_many and handle_ipi > riscv: Use NULL as a sentinel value for smp_call_function > riscv: Clear pending IPIs on initialization > riscv: Add fence to available_harts_lock > riscv: Ensure gp is NULL or points to valid data > riscv: Add some comments to start.S > > arch/riscv/cpu/cpu.c| 18 ++ > arch/riscv/cpu/start.S | 47 + > arch/riscv/lib/interrupts.c | 3 ++- > arch/riscv/lib/smp.c| 26 +--- > 4 files changed, 80 insertions(+), 14 deletions(-) > > -- > 2.28.0 >