On 30.03.19 23:37, Simon Glass wrote:
Hi Simon,

On Sat, 30 Mar 2019 at 15:40, Simon Goldschmidt
<simon.k.r.goldschm...@gmail.com> wrote:



Simon Glass <s...@chromium.org> schrieb am Sa., 30. März 2019, 22:18:

Hi Simon,

On Sat, 30 Mar 2019 at 14:50, Simon Goldschmidt
<simon.k.r.goldschm...@gmail.com> wrote:



Simon Goldschmidt <simon.k.r.goldschm...@gmail.com> schrieb am Sa., 30. März 
2019, 21:18:



Simon Glass <s...@chromium.org> schrieb am Sa., 30. März 2019, 21:06:

Hi Simon,

On Wed, 27 Mar 2019 at 13:40, Simon Goldschmidt
<simon.k.r.goldschm...@gmail.com> wrote:

This introduces a new Kconfig option SPL_CLEAR_BSS_F. If enabled, it clears
the bss before calling board_init_f() instead of clearing it before calling
board_init_r().

This also ensures that variables placed in BSS can be shared between
board_init_f() and board_init_r() in SPL.

Such global variables are used, for example, when loading things from FAT
before SDRAM is available: the full heap required for FAT uses global
variables and clearing BSS after board_init_f() would reset the heap state.
An example for such a usage is socfpa_arria10 where an FPGA configuration
is required before SDRAM can be used.

Make the new option depend on ARM for now until more implementations follow.


I still have objections to this series and I think we should discuss
other ways of solving this problem.

Does socfgpa have SRAM that could be used before SDRAM is available?
If so, can we not use that for the configuration? What various are
actually in BSS that are needed before board_init_r() is called? Can
they not be in a struct created from malloc()?


The problem is the board needs to load an FPGA configuration from FAT before 
SDRAM is available. Yes, this is loaded into SRAM of course, but the whole code 
until that is done uses so many malloc/free iterations that The simple mall of 
implementation would require too much memory.

And it's the full malloc state variables only that use BSS, not the FAT code.

One way out could be to move the full mall of state variables into 'gd'...


Maybe I should point out again that the whole purpose of this series is to have 
an SPL that uses full malloc right from the start. This is currently not 
possible as full malloc needs BSS.

Right, and our assumption/design is that full malloc() requires SRAM.

Here we have an architecture that requires FAT just to init its SDRAM.
FAT requires malloc() and free() and there is not enough SRAM to skip
the free() calls. So we have to use full malloc() and that uses BSS.
But BSS is not available before board_init_r(). But we need to init
SDRAM before board_init_r().

Firstly I'd point out that someone should speak to the chip designers.
Did anyone try to boot U-Boot on the SoC model?


Well, it's a U-Boot thing to load it from FAT. It could probably be loaded from 
RAW mmc without problems, so I don't know if it's a chip designers issue. I 
think it's an issue that we need to fix in U-Boot: we have a good full malloc 
implementation but it's not usable in all cases were it should be.

OK then why use FAT? I assumed it was a boot-ROM requirement. How does
the boot ROM load SPL?

Honestly, I don't know. It's Altera/Intel that decided for this boot flow. However, since FPGA development is often done from Windows, I guess it's convenient for FPGA development to write the resulting binary to a FAT partition on a sdcard.




I think it is possible to change dlmalloc to put its variables in a
struct. Then I suppose the struct pointer could be kept in gd. That
would be one solution. Then full malloc() could be inited in
spl_common_init() perhaps.


That might be worth a try.

Yes shouldn't be too painful. I suspect it would be neutral on code size, too.



Another option might be to update the FAT code to use
ALLOC_CACHE_ALIGN_BUFFER() instead of malloc(), as it already does in
places, and perhaps to disable all caching for this case. Then it
might work with simple malloc().


Hmm, then the next platform will have problems because allocating via malloc 
would be preferable. If really rather fix using dlmalloc instead.

Hmm but there is no obvious analysis behind your preference. If we
have code like this:

do_something_with_fat()
{
    void *buf = malloc(...);

  ...

    free(buf);
}

it seems to me we should convert it to use the stack instead, unless
there is some recursion problem, etc. Otherwise we are just causing
pointless churn on malloc().

I don't think it's that easy. There are platforms where a big stack size hurts. I don't think such big buffers like MAX_CLUSTERSIZE should be allocated on stack.




Another option would be to put the FPGA image in a known position on
the media, outside the FAT partition. The position could perhaps be
written into a FAT file, and reading just that file in SPL might not
involve many free() operations.


Sounds like a workaround, too. I think the U-Boot infrastructure should work 
for the boards, not placing restrictions on them.

I'll await your answer to my first question in this email before
passing judgement.



I hesitate to suggest enhancing simple malloc() to support a free
list. That would increase the code size and I'm not sure it would be
better than using full malloc().


Yes, I think that might result in making a second dlmalloc.... :-)

Thanks for your thoughts and input!

Overall I feel that the current trade-offs and phases of boot are
reasonable. We should be suspicious of attempts to make them more
complex.

I think denying BSS access in SPL before board_init_r is a bit outdated. In times where board_init_f was ASM, it could have been OK, but nowadays where this is C and you can't check the code easily to *not* use BSS, I'm not too sure this limitation should holf.

Anyway, I realize there are people wanting to keep this up, so I'll work in that direction.

The initial intention of this series was to use full heap just like tiny heap in SPL. Tiny heap is configured via Kconfig by setting its size, full heap (available some time during board_init_r) currently needs CONFIG_SYS_SPL_MALLOC_SIZE and CONFIG_SYS_SPL_MALLOC_START

I'd prefer it if the full heap implementation could use the same allocation strategy as the tiny heap. That should make configuration easier, not more complex. Maybe I started it the wrong way, maybe you have a better idea of how to make is simpler in that direction?

Regards,
Simon


Regards,
Simon

_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
https://lists.denx.de/listinfo/u-boot

Reply via email to