On Thu, Feb 26, 2026 at 05:30:06PM +0100, Stefan Eichenberger wrote:
> Hi Francesco and Tom,
> 
> On Thu, Feb 26, 2026 at 05:11:49PM +0100, Francesco Dolcini wrote:
> > +Emanuele
> > 
> > Hello Tom,
> > 
> > On Thu, Feb 26, 2026 at 08:23:45AM -0600, Tom Rini wrote:
> > > On Thu, Feb 26, 2026 at 08:05:02AM +0100, Francesco Dolcini wrote:
> > > > Hello Tom,
> > > > 
> > > > On Fri, Mar 14, 2025 at 11:06:49AM +0100, Stefan Eichenberger wrote:
> > > > > From: Stefan Eichenberger <[email protected]>
> > > > > 
> > > > > The get_ram_size() function fails to restore the original RAM data 
> > > > > when
> > > > > the data cache is enabled. This issue was observed on an AM625 R5 SPL
> > > > > with 512MB of RAM and is a regression that became visible with
> > > > > commit bc07851897bd ("board: ti: Pull redundant DDR functions to a 
> > > > > common
> > > > > location and Fixup DDR size when ECC is enabled").
> > > > > 
> > > > > Observed boot failure messages:
> > > > >   Warning: Did not detect image signing certificate. Skipping 
> > > > > authentication to prevent boot failure. This will fail on Security 
> > > > > Enforcing(HS-SE) devices
> > > > >   Authentication passed
> > > > >   Starting ATF on ARM64 core...
> > > > > 
> > > > > The system then hangs. This indicates that without a data cache flush,
> > > > > data in the cache is not coherent with RAM, preventing the system from
> > > > > booting. This was verified by printing the content of this address 
> > > > > when
> > > > > the issue occurs.
> > > > > 
> > > > > Add a data cache flush after each restore operation to resolve this
> > > > > issue.
> > > > > 
> > > > > Fixes: bc07851897bd ("board: ti: Pull redundant DDR functions to a 
> > > > > common location and Fixup DDR size when ECC is enabled")
> > > > > Fixes: 1c64b98c1ec4 ("common/memsize.c: Fix get_ram_size() when cache 
> > > > > is enabled")
> > > > > Signed-off-by: Stefan Eichenberger <[email protected]>
> > > > 
> > > > Tom, can we merge this?
> > > > This is the last bit to solve the regression reported here,
> > > > https://lore.kernel.org/all/20260224152405.GD340942@francesco-nb/
> > > 
> > > I wasn't happy with this at the time, and Stefan's last email in the
> > > thread left me with the impression more investigation was needed and
> > > likely something else was the root cause.
> > 
> > I believe that this patch is needed.
> > 
> > On AM62 what is happening is the following.
> > 
> > We have a cortex-R5 that is the first core booting (there is also a
> > cortex-m4, but it's not relevant for this discussion).
> > 
> > It runs from internal memory and it configures the DDR ram
> > 
> > We load to DDR memory various pieces of firmware (TFA, U-Boot for the
> > cortex A53, ...)
> > 
> > We do execute get_ram_size(), that read/write the memory, and it is
> > supposed to restore it back the original content
> > 
> > However when we have the cache enabled, we might miss to write back the
> > original memory content, where the other pieces of firmware are.
> > 
> > And after that we start the cortex A53, running in DDR, and there the
> > memory content might not be correct, because there is no cache coherency
> > between the cortex-A and the cortex-R. And because of that we have
> > crashes.
> > 
> > Stefan: any comment here? Can you help?
> 
> I think what you wrote summarises the issue well. If I recall correctly,
> I "fixed" the issue last time by simply calling get_ram_size() once
> before enabling the cache. This was in commit 4164289db882e. The SPL
> then informs U-Boot of the memory size via fdt fixup. However, something
> has probably changed now (possibly in the R5 SPL), meaning the cache is
> enabled earlier, so the cache is enabled again when get_ram_size() is
> called.
> 
> For the AMP use case, either "get_ram_size" should not be called once
> the cache is enabled, or a similar patch to the one I proposed is
> required.

I would lean towards the former if at all possible.

-- 
Tom

Attachment: signature.asc
Description: PGP signature

Reply via email to