Hi Michal, On Tue, 10 Dec 2024 at 05:41, Michal Simek <[email protected]> wrote: > > Hi Simon, > > On 12/9/24 20:27, Simon Glass wrote: > > Hi Michal, > > > > On Mon, 9 Dec 2024 at 11:34, Michal Simek <[email protected]> wrote: > >> > >> > >> > >> On 12/9/24 16:47, Simon Glass wrote: > >>> Hi, > >>> > >>> On Mon, 9 Dec 2024 at 08:32, Tom Rini <[email protected]> wrote: > >>>> > >>>> On Mon, Dec 09, 2024 at 04:26:15PM +0100, Michal Simek wrote: > >>>>> > >>>>> > >>>>> On 12/6/24 20:20, Simon Glass wrote: > >>>>>> On Fri, 1 Nov 2024 at 03:18, Michal Simek <[email protected]> wrote: > >>>>>>> > >>>>>>> Calling empty function when BINMAN_FDT is adding +64B for nothing > >>>>>>> which is > >>>>>>> not helping on size sensitive configurations as Xilinx mini > >>>>>>> configurations. > >>>>>>> > >>>>>>> Signed-off-by: Michal Simek <[email protected]> > >>>>>>> --- > >>>>>>> > >>>>>>> Changes in v2: > >>>>>>> - new patch > >>>>>>> > >>>>>>> From my perspective there is no reason to call empty function. It > >>>>>>> is just > >>>>>>> increase footprint for nothing and we are not far from that limit now. > >>>>>>> > >>>>>>> --- > >>>>>>> common/board_r.c | 7 +++---- > >>>>>>> 1 file changed, 3 insertions(+), 4 deletions(-) > >>>>>> > >>>>>> Reviewed-by: Simon Glass <[email protected]> > >>>>>> > >>>>>> This is a bit odd, though. Do you have LTO enabled? > >>>>>> > >>>>> > >>>>> yes LTO is enabled. And there are other candidates like this. > >>>>> Is LTO able to fix function arrays which is calling empty function? > >>>>> > >>>>> (without this patch) > >>>>> > >>>>> 00000000fffc0eb4 <initr_of_live>: > >>>>> fffc0eb4: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0eb8: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0ebc <initr_dm_devices>: > >>>>> fffc0ebc: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0ec0: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0ec4 <initr_bootstage>: > >>>>> fffc0ec4: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0ec8: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0ecc <power_init_board>: > >>>>> fffc0ecc: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0ed0: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0ed4 <initr_announce>: > >>>>> fffc0ed4: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0ed8: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0edc <initr_binman>: > >>>>> fffc0edc: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0ee0: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0ee4 <initr_status_led>: > >>>>> fffc0ee4: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0ee8: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0eec <initr_boot_led_blink>: > >>>>> fffc0eec: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0ef0: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0ef4 <initr_boot_led_on>: > >>>>> fffc0ef4: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0ef8: d65f03c0 ret > >>>>> > >>>>> 00000000fffc0efc <initr_lmb>: > >>>>> fffc0efc: 52800000 mov w0, #0x0 > >>>>> // #0 > >>>>> fffc0f00: d65f03c0 ret > >>>> > >>>> No, but maybe Simon would prefer if we marked all of the could-be-empty > >>>> functions as __maybe_unused and did: > >>>> CONFIG_IS_ENABLED(BINMAN_FDT, initr_binman), > >>>> etc in the list instead? > >>> > >>> Yes that looks better. > >> > >> But we are talking about using macro inside array at best with using > >> #ifdefs. > >> Or maybe I am not seeing what you are saying. > >> > >>> > >>> Michal, see also [1] in case you can work out why it 'stopped > >>> working'. I could have sworn inlining the function was a win when it > >>> was applied, but no amount of toolchain juggling could make it be a > >>> win when I came back to it later. > >> > >> Are you saying that it worked in past? > > > > I wasn't able to verify that post facto, but I believe I do remember > > checking it at the time. If you read the original commit message: > > > > 47870afab92 initcall: Move to inline function > > > > The board_r init function was complaining that we are looping through > > an array, calling all our tiny init stubs sequentially via indirect > > function calls (which can't be speculated, so they are slow). > > > > The solution to that is pretty easy though. All we need to do is inline > > the function that loops through the functions and the compiler will > > automatically convert almost all indirect calls into direct inlined > > code. > > > > With this patch, the overall code size drops (by 40 bytes on riscv64) > > and boot time should become measurably faster for every target. > > > > Signed-off-by: Alexander Graf <[email protected]> > > > > Despite this hopeful sentiment, I seriously doubt any improvement in boot > > time. > > I am not able to replicate this observation on arm64 or riscv64. > > Loop unrolling is not happening even if you pass -funroll-all-loops flag. > > Maybe different toolchains should be used to see this behavior.
Yes, maybe. Or perhaps it changed for some reason. Regards, Simon

