On 3/23/22 2:53 PM, Simon Glass wrote:
Hi Sean,

On Wed, 23 Mar 2022 at 12:33, Sean Anderson <sean...@gmail.com> wrote:

On 3/23/22 2:26 PM, Heiko Thiery wrote:
Hi Simon,

Am Mi., 23. März 2022 um 19:04 Uhr schrieb Simon Glass <s...@chromium.org>:

Hi Heinrich,

On Tue, 22 Mar 2022 at 03:25, Heinrich Schuchardt <xypron.g...@gmx.de> wrote:

On 3/21/22 15:26, Heiko Thiery wrote:
It was observed that enabling additional DM modules the configured
malloc value is not sufficient. So lets increase the value.

Signed-off-by: Heiko Thiery <heiko.thi...@gmail.com>
---
v2:
    - add a more proper commit message to explan why the value was increased

    configs/kontron_pitx_imx8m_defconfig | 1 +
    1 file changed, 1 insertion(+)

diff --git a/configs/kontron_pitx_imx8m_defconfig 
b/configs/kontron_pitx_imx8m_defconfig
index 76430213e3..30c3586937 100644
--- a/configs/kontron_pitx_imx8m_defconfig
+++ b/configs/kontron_pitx_imx8m_defconfig
@@ -2,6 +2,7 @@ CONFIG_ARM=y
    CONFIG_ARCH_IMX8M=y
    CONFIG_SYS_TEXT_BASE=0x40200000
    CONFIG_SYS_MALLOC_LEN=0x600000
+CONFIG_SYS_MALLOC_F_LEN=0x10000

@Heiko
Should we really adjust this on board level? Won't we have the same
problem on all imx8m boards?

Why don't you change the default for all i.mx8 boards in /Kconfig?

@Tom, @Simon
Shouldn't we replace the default of 0x400 by 0x2000 generally?

I don't think that is a good idea. That is a lot of memory! Many
platforms don't need that much.

I wonder what is driving this large amount. Is it pinctrl?

The increase comes from the introduction of a clock driver for the
imx8mq platform.

Yes, the problem is that CCF creates a udevice+clk+private data for
every clock. This runs about 150-200 bytes per clock on a 64-bit
platform. In addition, many physical clocks are modeled as several
logical clocks plus a composite. This means a platform with maybe
20-30 physical clocks can easily allocate 10k-20k to create
the clock tree.

With the new DM tag support we could move uncommon fields in struct
udevice to tags, such as parent_plat_, uclass_plat_, driver_data,
uclass_priv_, parent_priv_, dma_offset. It would add a small amount of
code but save data. We could call this DM_TINY_DATA and It might save
64 bytes per device.

Also the #ifdef CONFIG_DEVRES should use CONFIG_IS_ENABLED(DEVRES) so
that devres_head is not included in SPL.

Personally, I think we could get away with a much smaller amount of data
per-clock, and not make every clock a separate device. The primary reason
each CCF clock gets its own device is that we cache things like clock
rates. Because of this, we need to keep track of children so we can
invalidate their rates correctly. But caching rates is probably not that
much faster than just recalculating (especially since most of the time
it's just a division). We also don't try and find the best way to achieve
a clock speed like Linux does. If we don't cache clock rates, we don't
need to keep track of a clock's children.

Further, we could likely also eliminate tracking enable/disable counts.
We don't have fine-grained low power states like Linux does, so generally
when someone disables a clock we're getting in the way by trying to keep
track of the count.

This is all a rather big change, so we're stuck with the status quo for
now, but it's the direction I would like to move in going ahead. I know
Marek would rather just not create as many clocks in SPL/pre-reloc.

--Sean

Reply via email to