Hi Emanuele,

On 15.03.23 22:25, Emanuele Ghidoli wrote:
> On 15/03/2023 16:24, Frieder Schrempf wrote:
>> On 15.03.23 15:42, Frieder Schrempf wrote:
>>> On 15.03.23 15:17, Michael Nazzareno Trimarchi wrote:
>>>> Hi
>>>>
>>>> On Wed, Mar 15, 2023 at 3:13 PM Frieder Schrempf
>>>> <frieder.schre...@kontron.de> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I'm trying to bring up a new board based on the i.MX8MP and I have an
>>>>> issue I'm hoping someone can help solving.
>>>>>
>>>>> I'm seeing failures in the early SPL code, usually in the DDR
>>>>> initialization. Often they look like:
>>>>>
>>>>> U-Boot SPL 2023.04-rc3 (Mar 07 2023 - 14:32:34 +0000)
>>>>> Training FAILED
>>>>> Failed to initialize DDR RAM!
>>>>> ### ERROR ### Please RESET the board ###
>>>>>
>>>>> But sometimes ddr_init() doesn't even return an error and only the
>>>>> get_ram_size() afterwards which tries to allocate the memory fails.
>>>>>
>>>>
>>>> In my experience you don't have space inside the cpu internal
>>>> memory. It means that you overlap some stack with the code. Change
>>>> the printf means move a bit. So you have problem but depends what
>>>> you are going to destroy
>>>
>>> Thanks for your reply. That's exactly what I'm thinking, too.
>>>
>>>>
>>>>> The strange thing is that the issues appear or disappear
>>>>> deterministically on the binary level. This means I sometimes get a
>>>>> U-Boot binary which runs just fine in 100% of cases. Then I change for
>>>>> example one of the following:
>>>>>
>>>>> * Adding a single printf() somewhere in the boards spl.c
>>>>> * Using the same binary but booting from SD card instead of USB loader
>>>>> * Using the same source but switching from the OS cross compiler to
>>>>>   the one from Yocto/OE
>>>>>
>>>>> And afterwards I get 100% failure rate with an error as described
>>>>> above.
>>>>>
>>>>> My suspicion is that there is some memory corruption/conflict. My SPL
>>>>> is quite large and I wonder if it exceeds some limit.
>>>>>
>>>>> SPL is loaded to 0x920000 and CONFIG_SPL_STACK is set to 0x960000,
>>>>> which leaves 256 KiB in between for the SPL. But all i.MX8MP boards
>>>>> seem to set CONFIG_SPL_MAX_SIZE=0x26000 (152 KiB) for some reason. My
>>>>> u-boot-spl-ddr.bin currently has around 193 KiB but I don't get any
>>>>> warning about exceeding the SPL_MAX_SIZE.
>>>>>
>>>>> My questions:
>>>>>
>>>>> * Why is CONFIG_SPL_MAX_SIZE set to 152 KiB?
>>>
>>> I guess the remainder between the SPL code and the SPL stack is for the
>>> DDR firmware. Which explains why I get failures with SPL exceeding 152
>>> KiB size.
>>
>> Still, it doesn't really make sense to me at the moment as the
>> u-boot-spl-ddr.bin already contains the DDR firmware it should be fine
>> to exceed the 152 KiB size. My u-boot-spl.bin (without DDR firmware) is
>> only 135 KiB.
>>
>> Sorry for spamming you by thinking out loud... ;)
>>
>>>
>>> Now I also understand the reason why the power init code was implemented
>>> using legacy non-DM drivers in other i.MX8MP boards. I probably also
>>> need to do this to save some space.
>>>
>>>>> * Why is there no warning in my case?
>>>
>>> Still, I fail to see why there isn't any error or where the size check
>>> is even implemented.
>>>
>>>>> * Any other ideas or pointers?
>>>>>
>>>>> Thanks for your help!
>>>>>
>>>>> Best regards
>>>>> Frieder
>>
>
> Hello,
> I fall in a similar problem.
>
> Some hints:
> - commit 5004901efb3b ("board_init: Do not reserve MALLOC_F area on stack
>   if non-zero MALLOC_F_ADDR") - but you should already have it
> - Reduce (set to something different from default value)
>   SPL_SYS_MALLOC_F_LEN.
>   Normally that area is not used a lot. Stack start before heap area and,
>   if I remember well, start address of heap area depend upon this config.
>   And... its default value is equal to SYS_MALLOC_F_LEN, that normally
>   is high.
>
> Suggestions from Rasmus are precious. I adopt a rather similar approch
> to find that stack / gd (global data) was overlapping DDR firmware / cfg.

Thanks a lot for the additional pointers. I do have commit 5004901efb3b,
but I hadn't looked at MALLOC_F_ADDR before. It seems some i.MX8MP boards
use it to place the malloc area in the separate OCRAM_S (0x184000) instead
of OCRAM, which is interesting and another possibility I didn't know of.

Thanks
Frieder
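
For reference, the size arithmetic discussed in this thread can be checked with a minimal Python sketch. It only uses the addresses and sizes quoted above (load address 0x920000, CONFIG_SPL_STACK 0x960000, CONFIG_SPL_MAX_SIZE 0x26000, and the approximate image sizes of 193 KiB and 135 KiB). The variable and function names are made up for illustration, and the sketch makes no assumption about where the stack, gd, malloc area or DDR firmware actually end up at runtime.

```python
KIB = 1024

# Values quoted in the thread.
SPL_LOAD_ADDR = 0x920000   # address the SPL is loaded to
SPL_STACK = 0x960000       # CONFIG_SPL_STACK
SPL_MAX_SIZE = 0x26000     # CONFIG_SPL_MAX_SIZE on the i.MX8MP boards

# Approximate image sizes mentioned in the thread ("around 193 KiB").
SPL_DDR_BIN_SIZE = 193 * KIB   # u-boot-spl-ddr.bin, includes DDR firmware
SPL_BIN_SIZE = 135 * KIB       # u-boot-spl.bin, without DDR firmware


def overshoot(image_size, limit):
    """Return by how many bytes image_size exceeds limit (0 if it fits)."""
    return max(0, image_size - limit)


# Space between the load address and the initial stack pointer.
window = SPL_STACK - SPL_LOAD_ADDR

# Part of that window that is NOT covered by CONFIG_SPL_MAX_SIZE.
remainder = window - SPL_MAX_SIZE

print(f"load..stack window : {window // KIB} KiB")
print(f"SPL_MAX_SIZE       : {SPL_MAX_SIZE // KIB} KiB")
print(f"window - max size  : {remainder // KIB} KiB")
print(f"spl-ddr over limit : {overshoot(SPL_DDR_BIN_SIZE, SPL_MAX_SIZE) // KIB} KiB")
print(f"spl over limit     : {overshoot(SPL_BIN_SIZE, SPL_MAX_SIZE) // KIB} KiB")
print(f"spl-ddr end..stack : {(window - SPL_DDR_BIN_SIZE) // KIB} KiB")
```

With these numbers, the combined image is roughly 41 KiB over the configured maximum and ends roughly 63 KiB below the initial stack pointer, while the SPL without DDR firmware stays under the limit. Whether 63 KiB is enough depends on what else lives in that region, which the sketch does not model.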