Re: Runtime Loader Exported Symbols Address Size

2022-06-18 Thread Chris Johns
On 19/6/22 5:21 am, Joel Sherrill wrote:
> 
> 
> On Fri, Jun 17, 2022 at 8:13 PM Chris Johns  > wrote:
> 
> On 18/6/22 10:08 am, Joel Sherrill wrote:
> > On Fri, Jun 17, 2022, 2:28 PM Alex White  
> > >> wrote:
> >
> >     Hi,
> >
> >     While testing libdl on AArch64 QEMU, we found a bug where the 
> exported
> >     symbol table appears to always use 32-bit values for addresses, but 
> where
> >     the exported symbols table is read in 
> `rtems_rtl_symbol_global_add`, the
> >     addresses are expected to be of size `sizeof(unsigned long)`.
> >
> >     This did not cause a problem on ARM since `sizeof(unsigned long)` is
> 4, but
> >     with AArch64 `sizeof(unsigned long)` is 8. So it is trying to read 
> the
> >     address as 64 bits instead of 32.
> >
> >     The simple fix is to use an exact-width integer type - like
> >     `sizeof(uint32_t)`. But this would not allow for 64-bit 
> architectures
> to use
> >     the full address space. It would also break on 64-bit RISC-V (see 
> below).
> >     Perhaps we could have an ifdef to choose a 32 or 64-bit type based 
> on the
> >     architecture rather than relying on `sizeof(unsigned long)`?
> >
> >     It looks like there is an exception in the rtems-syms tool for 
> 64-bit
> RISC-V
> >     to emit 64-bit addresses rather than 32-bit addresses. Is this the 
> right
> >     solution? Should we add another exception for AArch64?
> >
> > After having given this some thought, I think the code that is generated
> and the
> > parsing code in rtl-sym.c should be conditionalized on the GCC CPP 
> predefine
> > that indicates the size of a void *.
> 
> Good idea. The symbol value is held as a `void*` in the symbol table so 
> the key
> is the generated asm size in rtems-syms.cpp matches the runtime `void*` 
> size.
> That code maybe conditional on the size selecting the correct variable 
> size.
> 
> > The code in rtl-sym.c should use a type definition for the type that the
> sizeof
> > is operating on. Using unsigned long is just not good enough. Both cases
> should
> > be fixed width types.
> 
> Sure `unsigned long` may not be the best type to use. It was selected 
> because it
> happened change size with the arch but I am sure there are types that are 
> better
> suited. The type needs to be the size of a full address space pointer.
> 
> 
> Glad you hear you agree on the basic approach. The cpp predefine is 
> __SIZEOF_POINTER__ and  it should be 4 or 8. If both sides use that, then this
> should be permanently fixed.
> 
> 
> > There also should be a very good comment above the parsing method in 
> rtl-sym.c
> > explaining where the file comes from that it is reading and why there 
> is the
> > need for a type which varies by architecture.
> 
> Sure, happy to see comments added. Is this something you would like me to
> handle?
> 
> The generation and source of the symbols can be handled in a number of 
> ways. I
> have provided a tool that should cover most use cases however I can 
> imagine
> more, eg symbol filters for defined exported symbols. I documented things 
> here:
> 
> 
> https://docs.rtems.org/branches/master/user/exe/loader.html#base-image-symbols
> 
> 
> 
> Does that documentation help cover what you are looking for?
> 
> 
> I'll let Ryan and Alex answer that since they are fixing this. I just was 
> helping Ryan debug the parsing loop and realized it was chopping off the 
> first four bytes of every symbol after the first one. 

OK

> > Based on our current understanding, I think this address is everything 
> and
> > leaves sufficient information and organization to cover all the cases 
> in the
> > future. Very likely without anyone actively having to fix anything.
> Running into
> > this issue again and again as architectures are added is not a good 
> long-term
> > solution.
> 
> Agreed. Figuring out stable 64bit addressing in libdl is a good idea and
> welcome. I think the main requirement is for the full address range to be
> supported.
> 
> > Chris.. is this a good path to you?
> 
> Yes this sounds good.
> 
> > Is there somewhere else in the dynamic loader system with a similar 
> issue?
> 
> This is the only external interface of this type where static checking 
> does not
> help. The other is ELF which I hope is OK. I am not sure if other places 
> may
> have a problem. All care was taken however only using the code with 
> 64bits will
> determine this.
> 
> The elftools libelf support was ported and tested with only 32bit 
> targets. I am
> not sure 

Re: Runtime Loader Exported Symbols Address Size

2022-06-18 Thread Joel Sherrill
On Fri, Jun 17, 2022 at 8:13 PM Chris Johns  wrote:

> On 18/6/22 10:08 am, Joel Sherrill wrote:
> > On Fri, Jun 17, 2022, 2:28 PM Alex White  > > wrote:
> >
> > Hi,
> >
> > While testing libdl on AArch64 QEMU, we found a bug where the
> exported
> > symbol table appears to always use 32-bit values for addresses, but
> where
> > the exported symbols table is read in `rtems_rtl_symbol_global_add`,
> the
> > addresses are expected to be of size `sizeof(unsigned long)`.
> >
> > This did not cause a problem on ARM since `sizeof(unsigned long)` is
> 4, but
> > with AArch64 `sizeof(unsigned long)` is 8. So it is trying to read
> the
> > address as 64 bits instead of 32.
> >
> > The simple fix is to use an exact-width integer type - like
> > `sizeof(uint32_t)`. But this would not allow for 64-bit
> architectures to use
> > the full address space. It would also break on 64-bit RISC-V (see
> below).
> > Perhaps we could have an ifdef to choose a 32 or 64-bit type based
> on the
> > architecture rather than relying on `sizeof(unsigned long)`?
> >
> > It looks like there is an exception in the rtems-syms tool for
> 64-bit RISC-V
> > to emit 64-bit addresses rather than 32-bit addresses. Is this the
> right
> > solution? Should we add another exception for AArch64?
> >
> > After having given this some thought, I think the code that is generated
> and the
> > parsing code in rtl-sym.c should be conditionalized on the GCC CPP
> predefine
> > that indicates the size of a void *.
>
> Good idea. The symbol value is held as a `void*` in the symbol table so
> the key
> is the generated asm size in rtems-syms.cpp matches the runtime `void*`
> size.
> That code maybe conditional on the size selecting the correct variable
> size.
>
> > The code in rtl-sym.c should use a type definition for the type that the
> sizeof
> > is operating on. Using unsigned long is just not good enough. Both cases
> should
> > be fixed width types.
>
> Sure `unsigned long` may not be the best type to use. It was selected
> because it
> happened change size with the arch but I am sure there are types that are
> better
> suited. The type needs to be the size of a full address space pointer.
>

Glad you hear you agree on the basic approach. The cpp predefine is
__SIZEOF_POINTER__ and  it should be 4 or 8. If both sides use that, then
this
should be permanently fixed.


> > There also should be a very good comment above the parsing method in
> rtl-sym.c
> > explaining where the file comes from that it is reading and why there is
> the
> > need for a type which varies by architecture.
>
> Sure, happy to see comments added. Is this something you would like me to
> handle?
>
> The generation and source of the symbols can be handled in a number of
> ways. I
> have provided a tool that should cover most use cases however I can imagine
> more, eg symbol filters for defined exported symbols. I documented things
> here:
>
>
> https://docs.rtems.org/branches/master/user/exe/loader.html#base-image-symbols
>
> Does that documentation help cover what you are looking for?
>

I'll let Ryan and Alex answer that since they are fixing this. I just was
helping Ryan debug the parsing loop and realized it was chopping off the
first four bytes of every symbol after the first one.

>
> > Based on our current understanding, I think this address is everything
> and
> > leaves sufficient information and organization to cover all the cases in
> the
> > future. Very likely without anyone actively having to fix anything.
> Running into
> > this issue again and again as architectures are added is not a good
> long-term
> > solution.
>
> Agreed. Figuring out stable 64bit addressing in libdl is a good idea and
> welcome. I think the main requirement is for the full address range to be
> supported.
>
> > Chris.. is this a good path to you?
>
> Yes this sounds good.
>
> > Is there somewhere else in the dynamic loader system with a similar
> issue?
>
> This is the only external interface of this type where static checking
> does not
> help. The other is ELF which I hope is OK. I am not sure if other places
> may
> have a problem. All care was taken however only using the code with 64bits
> will
> determine this.
>
> The elftools libelf support was ported and tested with only 32bit targets.
> I am
> not sure if there are issues there.
>

OK. Ryan and Alex will be fixing this as part of adding aarch64 libdl
support.
Just having you available to give insight is very helpful.

--joel

>
> Chris
>
___
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel

Re: Runtime Loader Exported Symbols Address Size

2022-06-17 Thread Chris Johns
On 18/6/22 10:08 am, Joel Sherrill wrote:
> On Fri, Jun 17, 2022, 2:28 PM Alex White  > wrote:
> 
> Hi,
> 
> While testing libdl on AArch64 QEMU, we found a bug where the exported
> symbol table appears to always use 32-bit values for addresses, but where
> the exported symbols table is read in `rtems_rtl_symbol_global_add`, the
> addresses are expected to be of size `sizeof(unsigned long)`.
> 
> This did not cause a problem on ARM since `sizeof(unsigned long)` is 4, 
> but
> with AArch64 `sizeof(unsigned long)` is 8. So it is trying to read the
> address as 64 bits instead of 32.
> 
> The simple fix is to use an exact-width integer type - like
> `sizeof(uint32_t)`. But this would not allow for 64-bit architectures to 
> use
> the full address space. It would also break on 64-bit RISC-V (see below).
> Perhaps we could have an ifdef to choose a 32 or 64-bit type based on the
> architecture rather than relying on `sizeof(unsigned long)`?
> 
> It looks like there is an exception in the rtems-syms tool for 64-bit 
> RISC-V
> to emit 64-bit addresses rather than 32-bit addresses. Is this the right
> solution? Should we add another exception for AArch64?
> 
> After having given this some thought, I think the code that is generated and 
> the
> parsing code in rtl-sym.c should be conditionalized on the GCC CPP predefine
> that indicates the size of a void *.

Good idea. The symbol value is held as a `void*` in the symbol table so the key
is the generated asm size in rtems-syms.cpp matches the runtime `void*` size.
That code maybe conditional on the size selecting the correct variable size.

> The code in rtl-sym.c should use a type definition for the type that the 
> sizeof
> is operating on. Using unsigned long is just not good enough. Both cases 
> should
> be fixed width types.

Sure `unsigned long` may not be the best type to use. It was selected because it
happened change size with the arch but I am sure there are types that are better
suited. The type needs to be the size of a full address space pointer.

> There also should be a very good comment above the parsing method in rtl-sym.c
> explaining where the file comes from that it is reading and why there is the
> need for a type which varies by architecture.

Sure, happy to see comments added. Is this something you would like me to 
handle?

The generation and source of the symbols can be handled in a number of ways. I
have provided a tool that should cover most use cases however I can imagine
more, eg symbol filters for defined exported symbols. I documented things here:

https://docs.rtems.org/branches/master/user/exe/loader.html#base-image-symbols

Does that documentation help cover what you are looking for?

> Based on our current understanding, I think this address is everything and
> leaves sufficient information and organization to cover all the cases in the
> future. Very likely without anyone actively having to fix anything. Running 
> into
> this issue again and again as architectures are added is not a good long-term
> solution.

Agreed. Figuring out stable 64bit addressing in libdl is a good idea and
welcome. I think the main requirement is for the full address range to be 
supported.

> Chris.. is this a good path to you? 

Yes this sounds good.

> Is there somewhere else in the dynamic loader system with a similar issue?

This is the only external interface of this type where static checking does not
help. The other is ELF which I hope is OK. I am not sure if other places may
have a problem. All care was taken however only using the code with 64bits will
determine this.

The elftools libelf support was ported and tested with only 32bit targets. I am
not sure if there are issues there.

Chris
___
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel


Re: Runtime Loader Exported Symbols Address Size

2022-06-17 Thread Joel Sherrill
On Fri, Jun 17, 2022, 2:28 PM Alex White  wrote:

> Hi,
>
> While testing libdl on AArch64 QEMU, we found a bug where the exported
> symbol table appears to always use 32-bit values for addresses, but where
> the exported symbols table is read in `rtems_rtl_symbol_global_add`, the
> addresses are expected to be of size `sizeof(unsigned long)`.
>
> This did not cause a problem on ARM since `sizeof(unsigned long)` is 4,
> but with AArch64 `sizeof(unsigned long)` is 8. So it is trying to read the
> address as 64 bits instead of 32.
>
> The simple fix is to use an exact-width integer type - like
> `sizeof(uint32_t)`. But this would not allow for 64-bit architectures to
> use the full address space. It would also break on 64-bit RISC-V (see
> below). Perhaps we could have an ifdef to choose a 32 or 64-bit type based
> on the architecture rather than relying on `sizeof(unsigned long)`?
>
> It looks like there is an exception in the rtems-syms tool for 64-bit
> RISC-V to emit 64-bit addresses rather than 32-bit addresses. Is this the
> right solution? Should we add another exception for AArch64?
>

After having given this some thought, I think the code that is generated
and the parsing code in rtl-sym.c should be conditionalized on the GCC CPP
predefine that indicates the size of a void *.

The code in rtl-sym.c should use a type definition for the type that the
sizeof is operating on. Using unsigned long is just not good enough. Both
cases should be fixed width types.

There also should be a very good comment above the parsing method in
rtl-sym.c explaining where the file comes from that it is reading and why
there is the need for a type which varies by architecture.

Based on our current understanding, I think this address is everything and
leaves sufficient information and organization to cover all the cases in
the future. Very likely without anyone actively having to fix anything.
Running into this issue again and again as architectures are added is not a
good long-term solution.

Chris.. is this a good path to you? Is there somewhere else in the dynamic
loader system with a similar issue?

--joel

>
> Thanks,
>
> Alex
> ___
> devel mailing list
> devel@rtems.org
> http://lists.rtems.org/mailman/listinfo/devel
>
___
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel

Runtime Loader Exported Symbols Address Size

2022-06-17 Thread Alex White
Hi,

While testing libdl on AArch64 QEMU, we found a bug where the exported symbol 
table appears to always use 32-bit values for addresses, but where the exported 
symbols table is read in `rtems_rtl_symbol_global_add`, the addresses are 
expected to be of size `sizeof(unsigned long)`.

This did not cause a problem on ARM since `sizeof(unsigned long)` is 4, but 
with AArch64 `sizeof(unsigned long)` is 8. So it is trying to read the address 
as 64 bits instead of 32.

The simple fix is to use an exact-width integer type - like `sizeof(uint32_t)`. 
But this would not allow for 64-bit architectures to use the full address 
space. It would also break on 64-bit RISC-V (see below). Perhaps we could have 
an ifdef to choose a 32 or 64-bit type based on the architecture rather than 
relying on `sizeof(unsigned long)`?

It looks like there is an exception in the rtems-syms tool for 64-bit RISC-V to 
emit 64-bit addresses rather than 32-bit addresses. Is this the right solution? 
Should we add another exception for AArch64?

Thanks,

Alex
___
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel