Do what's easiest for you.

On Sun, Mar 12, 2023 at 8:05 PM Kent Mcleod <kent.mcleo...@gmail.com> wrote:

> On Fri, Feb 3, 2023 at 10:29 AM Sam Leffler <sleff...@google.com> wrote:
> >
> > On Thu, Feb 2, 2023 at 1:56 PM Kent Mcleod <kent.mcleo...@gmail.com> wrote:
> >>
> >> On Fri, Feb 3, 2023 at 8:26 AM Sam Leffler via Devel <devel@sel4.systems> wrote:
> >> >
> >> > I have a target platform with only 4M of memory. When the system image is
> >> > generated and the shoehorn helper script is used to find a place in memory
> >> > to load the build artifacts it tacks on an extra 4M of memory use (aka
> >> > fudge_factor). The comment in the code
> >> > <https://github.com/AmbiML/sparrow-seL4_tools/blame/master/cmake-tool/helpers/shoehorn.py#L209>
> >> > says this is to accommodate sel4test_driver. Needless to say this breaks on
> >> > my 4M target platform. So I made the fudge-factor settable from the cmd
> >> > line with a default of 0 and changed the sel4test build glue to set 4M when
> >> > building elfloader. Works fine for my target platform. But this change
> >> > breaks building a bootable image for rpi3 (AARCH64=1 bcm28367)--shoehorn
> >> > places elfloader s.t. it overlaps the image; e.g.
> >> >
> >> > ELF-loader started on CPU: ARM Ltd. Cortex-A53 r0p4
> >> > >   paddr=[335000..51a0ff]
> >> > > No DTB passed in from boot loader.
> >> > > Looking for DTB in CPIO archive...found at 378778.
> >> > > Loaded DTB from 378778.
> >> > >    paddr=[237000..23afff]
> >> > > ELF-loading image 'kernel' to 0
> >> > >   paddr=[0..236fff]
> >> > >   vaddr=[ffffff8000000000..ffffff8000236fff]
> >> > >   virt_entry=ffffff8000000000
> >> > > ELF-loading image 'capdl-loader' to 23b000
> >> > >   paddr=[23b000..33bfff]
> >> > >   vaddr=[400000..500fff]
> >> > >   virt_entry=4009a8
> >> > > ERROR: image load address overlaps with ELF-loader!
> >> > > ERROR: Physical address range invalid
> >> > > ERROR: Could not load user image ELF
> >> >
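The overlap the elfloader complains about can be read straight off the paddr ranges in the log above. A minimal sketch of such a check follows; the helper name is hypothetical and this is not the elfloader's actual code, only the addresses are taken from the log.

```python
# Illustrative sketch; the helper name is hypothetical, not elfloader code.
def ranges_overlap(a_start, a_end, b_start, b_end):
    """Return True if two inclusive address ranges share at least one byte."""
    return a_start <= b_end and b_start <= a_end


# Inclusive paddr ranges copied from the boot log in this thread.
ELFLOADER = (0x335000, 0x51A0FF)
KERNEL = (0x0, 0x236FFF)
DTB = (0x237000, 0x23AFFF)
CAPDL_LOADER = (0x23B000, 0x33BFFF)

for name, image in [("kernel", KERNEL), ("dtb", DTB), ("capdl-loader", CAPDL_LOADER)]:
    # Only capdl-loader reaches past 0x335000, where the elfloader starts.
    print(name, ranges_overlap(*image, *ELFLOADER))
```

Only the capdl-loader range collides: it ends at 0x33bfff, while the elfloader was placed at 0x335000.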
> >> >
> >> > Debug output of shoehorn for this case:
> >> >
> >> > shoehorn: debug: found CPIO identifying sequence b'070701' at offset 0x40 in
> >> > > /usr/local/google/home/sleffler/shodan/out/cantrip/aarch64-unknown-elf/release/elfloader/archive.o
> >> > > shoehorn: debug: encountered CPIO entry name: kernel.elf
> >> > > shoehorn: debug: encountered CPIO entry name: kernel.dtb
> >> > > shoehorn: debug: encountered CPIO entry name: capdl-loader
> >> > > shoehorn: debug: setting marker to 0x0 (region 0 start)
> >> > > shoehorn: debug: setting marker to 0x237000 (kernel_end)
> >> > > shoehorn: debug: setting marker to 0x23b000 (dtb_end)
> >> > > shoehorn: debug: setting marker to 0x335000 (end of rootserver)
> >> >
> >> >
> >> > So two questions:
> >> > 1. Where is the 4M under-count of sel4test_driver? (the code indicates this
> >> > might be explained in JIRA SELFOUR-2335 but I couldn't locate it)
> >>
> >> Here is the referred to Jira issue, but it doesn't provide any
> >> additional context: https://sel4.atlassian.net/browse/SELFOUR-2335
> >>
> >> shoehorn is attempting to calculate how the kernel and root server
> >> binaries will be unpacked into memory in order to place the
> >> elfloader's start address above the unpacked region. shoehorn
> >> calculates the region by iterating over the PT_LOAD segments from each
> >> ELF file. The elfloader then unpacks each ELF file at runtime by
> >> iterating over the PT_LOAD segments.
> >>
> >> For some reason, the two implementations don't agree. In your case,
> >> the offline calculation expects that the root server is loaded from
> >> [0x23b000, 0x335000) whereas the online calculation attempts:
> >> [0x23b000, 0x33bfff). Are you able to print the segment headers for
> >> the root server image you are loading?
> >>
> >> I'm guessing (from quickly looking at the code) the issue is that the
> >> shoehorn calculation only sums the p_memsz amounts for each PT_LOAD
> >> segment and isn't taking into account any gaps between segments in the
> >> virtual address space.
> >
> >
> > Yes, that appears to be the issue. readelf of capdl-loader shows:
> >
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> >   LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
> >                  0x00000000000a9130 0x00000000000a9130  RWE    0x1000
> >   LOAD           0x0000000000000000 0x00000000004b0000 0x00000000004b0000
> >                  0x0000000000000000 0x0000000000050168  RW     0x1000
> >
> > so there's a gap between the two load segments that isn't accounted for.
> > Attached is a change that seems to DTRT. It also appears to eliminate the
> > need for fudge_factor (in quick testing). You'll probably want to write
> > your own fix as my python fu is basic.
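To make the mismatch concrete, here is a minimal sketch (not the shoehorn code and not the attached change) that plugs the two PT_LOAD entries from the readelf output above into both calculations. The function names and the 4 KiB page rounding are assumptions made for illustration; the rounding was picked because it reproduces the addresses seen in the logs.

```python
# Illustrative sketch only; not taken from shoehorn.py or the elfloader.
PAGE_SIZE = 0x1000  # assumption: regions are rounded up to 4 KiB pages

# (p_vaddr, p_memsz) for the two PT_LOAD segments of capdl-loader,
# copied from the readelf output in this thread.
SEGMENTS = [
    (0x400000, 0xA9130),
    (0x4B0000, 0x50168),
]


def round_up(value, align=PAGE_SIZE):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def footprint_by_sum(segments):
    """Add up p_memsz only; gaps between segments are ignored."""
    return round_up(sum(memsz for _, memsz in segments))


def footprint_by_span(segments):
    """Distance from the lowest start to the highest end, gaps included."""
    start = min(vaddr for vaddr, _ in segments)
    end = max(vaddr + memsz for vaddr, memsz in segments)
    return round_up(end - start)


LOAD_BASE = 0x23B000  # dtb_end marker from the shoehorn debug output

sum_end = LOAD_BASE + footprint_by_sum(SEGMENTS)
span_end = LOAD_BASE + footprint_by_span(SEGMENTS)

print(hex(sum_end))   # 0x335000
print(hex(span_end))  # 0x33c000
```

The first value matches shoehorn's "end of rootserver" marker (0x335000). The second is one byte past 0x33bfff, the inclusive end of the range the elfloader actually loads to. The 0x7000 difference is the page-rounded gap between the end of the first segment (0x4a9130) and the start of the second (0x4b0000).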
>
> Thanks for this fix Sam,
> This seems to be an appropriate fix. If
> https://github.com/seL4/seL4_tools/pull/158 passes the test suite then
> I'll try and get the fix merged.  Can I use your commit and sign-off
> the certificate of originality or would you prefer I rewrite it?
>
>
>
> >>
> >>
> >> A fudge-factor wouldn't be needed if these two calculations weren't out of sync.
> >>
> >>
> >> > 2. Should zero'ing fudge_factor work? If yes, where should I look to remedy
> >> > the above?
> >> >
> >> > I looked upstream for changes that might address this issue but didn't see
> >> > anything.
> >> >
> >> > I suspect I can invert my logic and default fudge_factor to some value and
> >> > then override as needed (e.g. 0 for my sparrow platform & 4M for sel4test
> >> > builds).
> >>
> >> This seems fine to me.
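For reference, the "default to some value, override as needed" approach could look roughly like the sketch below. The `--fudge-factor` flag name, the `parse_args` helper, and the 4 MiB default are hypothetical, chosen only to mirror the discussion; they are not taken from shoehorn.py.

```python
import argparse

# Hypothetical default, mirroring the 4M mentioned in the thread.
DEFAULT_FUDGE_FACTOR = 4 * 1024 * 1024


def parse_args(argv=None):
    """Parse a hypothetical fudge-factor option with a non-zero default."""
    parser = argparse.ArgumentParser(description="fudge-factor option sketch")
    parser.add_argument(
        "--fudge-factor",  # hypothetical flag name
        # int(text, 0) accepts decimal, hex (0x...), and octal (0o...) input.
        type=lambda text: int(text, 0),
        default=DEFAULT_FUDGE_FACTOR,
        help="extra bytes to reserve above the unpacked images",
    )
    return parser.parse_args(argv)


print(hex(parse_args([]).fudge_factor))                       # 0x400000
print(hex(parse_args(["--fudge-factor", "0"]).fudge_factor))  # 0x0
```

A small-memory platform would then pass `--fudge-factor 0`, while sel4test builds could rely on the default.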
> >>
> >> >
> >> > -Sam
> >> > _______________________________________________
> >> > Devel mailing list -- devel@sel4.systems
> >> > To unsubscribe send an email to devel-leave@sel4.systems
>
