Do what's easiest for you.

On Sun, Mar 12, 2023 at 8:05 PM Kent Mcleod <kent.mcleo...@gmail.com> wrote:
> On Fri, Feb 3, 2023 at 10:29 AM Sam Leffler <sleff...@google.com> wrote:
> >
> > On Thu, Feb 2, 2023 at 1:56 PM Kent Mcleod <kent.mcleo...@gmail.com> wrote:
> >>
> >> On Fri, Feb 3, 2023 at 8:26 AM Sam Leffler via Devel <devel@sel4.systems> wrote:
> >> >
> >> > I have a target platform with only 4M of memory. When the system image is
> >> > generated and the shoehorn helper script is used to find a place in memory
> >> > to load the build artifacts it tacks on an extra 4M of memory use (aka
> >> > fudge_factor). The comment in the code
> >> > <https://github.com/AmbiML/sparrow-seL4_tools/blame/master/cmake-tool/helpers/shoehorn.py#L209>
> >> > says this is to accommodate sel4test_driver. Needless to say this breaks on
> >> > my 4M target platform. So I made the fudge-factor settable from the cmd
> >> > line with a default of 0 and changed the sel4test build glue to set 4M when
> >> > building elfloader. Works fine for my target platform. But this change
> >> > breaks building a bootable image for rpi3 (AARCH64=1 bcm28367)--shoehorn
> >> > places elfloader s.t. it overlaps the image; e.g.
> >> >
> >> > ELF-loader started on CPU: ARM Ltd. Cortex-A53 r0p4
> >> > > paddr=[335000..51a0ff]
> >> > > No DTB passed in from boot loader.
> >> > > Looking for DTB in CPIO archive...found at 378778.
> >> > > Loaded DTB from 378778.
> >> > > paddr=[237000..23afff]
> >> > > ELF-loading image 'kernel' to 0
> >> > > paddr=[0..236fff]
> >> > > vaddr=[ffffff8000000000..ffffff8000236fff]
> >> > > virt_entry=ffffff8000000000
> >> > > ELF-loading image 'capdl-loader' to 23b000
> >> > > paddr=[23b000..33bfff]
> >> > > vaddr=[400000..500fff]
> >> > > virt_entry=4009a8
> >> > > ERROR: image load address overlaps with ELF-loader!
> >> > > ERROR: Physical address range invalid
> >> > > ERROR: Could not load user image ELF
> >> >
> >> >
> >> > Debug output of shoehorn for this case:
> >> >
> >> > shoehorn: debug: found CPIO identifying sequence b'070701' at offset 0x40 in /usr/local/google/home/sleffler/shodan/out/cantrip/aarch64-unknown-elf/release/elfloader/archive.o
> >> > > shoehorn: debug: encountered CPIO entry name: kernel.elf
> >> > > shoehorn: debug: encountered CPIO entry name: kernel.dtb
> >> > > shoehorn: debug: encountered CPIO entry name: capdl-loader
> >> > > shoehorn: debug: setting marker to 0x0 (region 0 start)
> >> > > shoehorn: debug: setting marker to 0x237000 (kernel_end)
> >> > > shoehorn: debug: setting marker to 0x23b000 (dtb_end)
> >> > > shoehorn: debug: setting marker to 0x335000 (end of rootserver)
> >> >
> >> >
> >> > So two questions:
> >> > 1. Where is the 4M under-count of sel4test_driver? (the code indicates this
> >> > might be explained in JIRA SELFOUR-2335 but I couldn't locate it)
> >>
> >> Here is the referred to Jira issue, but it doesn't provide any
> >> additional context: https://sel4.atlassian.net/browse/SELFOUR-2335
> >>
> >> shoehorn is attempting to calculate how the kernel and root server
> >> binaries will be unpacked into memory in order to place the
> >> elfloader's start address above the unpacked region. shoehorn
> >> calculates the region by iterating over the PT_LOAD segments from each
> >> ELF file. The elfloader then unpacks each ELF file at runtime by
> >> iterating over the PT_LOAD segments.
> >>
> >> For some reason, the two implementations don't agree. In your case,
> >> the offline calculation expects that the root server is loaded from
> >> [0x23b000, 0x335000) whereas the online calculation attempts:
> >> [0x23b000, 0x33bfff). Are you able to print the segment headers for
> >> the root server image you are loading?
> >>
> >> I'm guessing (from quickly looking at the code) the issue is that the
> >> shoehorn calculation only sums the p_memsz amounts for each PT_LOAD
> >> segment and isn't taking into account any gaps between segments in the
> >> virtual address space.
> >
> >
> > Yes, that appears to be the issue. readelf of capdl-loader shows:
> >
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> >   LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
> >                  0x00000000000a9130 0x00000000000a9130  RWE    0x1000
> >   LOAD           0x0000000000000000 0x00000000004b0000 0x00000000004b0000
> >                  0x0000000000000000 0x0000000000050168  RW     0x1000
> >
> > so there's a gap between the two load segments that isn't accounted for.
> > Attached is a change that seems to DTRT. It also appears to eliminate the
> > need for fudge_factor (in quick testing). You'll probably want to write
> > your own fix as my python fu is basic.
>
> Thanks for this fix Sam,
> This seems to be an appropriate fix. If
> https://github.com/seL4/seL4_tools/pull/158 passes the test suite then
> I'll try and get the fix merged. Can I use your commit and sign-off
> the certificate of originality or would you prefer I rewrite it?
>
> >
> >>
> >>
> >> A fudge-factor wouldn't be needed if these two calculations weren't out of sync.
> >>
> >>
> >> > 2. Should zero'ing fudge_factor work? If yes, where should I look to remedy
> >> > the above?
> >> >
> >> > I looked upstream for changes that might address this issue but didn't see
> >> > anything.
> >> >
> >> > I suspect I can invert my logic and default fudge_factor to some value and
> >> > then override as needed (e.g. 0 for my sparrow platform & 4M for sel4test
> >> > builds).
> >>
> >> This seems fine to me.
> >>
> >> >
> >> > -Sam
> >> > _______________________________________________
> >> > Devel mailing list -- devel@sel4.systems
> >> > To unsubscribe send an email to devel-leave@sel4.systems
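
For readers following the numbers in this thread, here is a minimal sketch of the two size calculations being compared. It is not the code from shoehorn.py or the elfloader, and it is not the attached fix; the names LoadSegment, size_by_summing and size_by_span are hypothetical, and 4 KiB page rounding is an assumption taken from the Align column of the readelf output. The segment values and the 0x23b000 base come from the readelf and shoehorn debug output quoted above.

```python
from dataclasses import dataclass

PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, as in the readelf Align column


@dataclass(frozen=True)
class LoadSegment:
    """Hypothetical stand-in for the two PT_LOAD fields that matter here."""
    vaddr: int  # p_vaddr
    memsz: int  # p_memsz


def round_up(value: int, align: int = PAGE_SIZE) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def size_by_summing(segments) -> int:
    """Add up p_memsz only, so any gap between segments is ignored."""
    return round_up(sum(s.memsz for s in segments))


def size_by_span(segments) -> int:
    """Measure from the lowest start to the highest end, so gaps count."""
    start = min(s.vaddr for s in segments) // PAGE_SIZE * PAGE_SIZE
    end = round_up(max(s.vaddr + s.memsz for s in segments))
    return end - start


# PT_LOAD segments of capdl-loader, copied from the readelf output.
CAPDL_LOADER = [
    LoadSegment(vaddr=0x400000, memsz=0xA9130),
    LoadSegment(vaddr=0x4B0000, memsz=0x50168),
]
ROOTSERVER_BASE = 0x23B000  # the dtb_end marker in the shoehorn debug output

summed_end = ROOTSERVER_BASE + size_by_summing(CAPDL_LOADER)
span_end = ROOTSERVER_BASE + size_by_span(CAPDL_LOADER)

print(hex(summed_end))             # 0x335000, shoehorn's "end of rootserver"
print(hex(span_end))               # 0x33c000, one past the elfloader's 33bfff
print(hex(span_end - summed_end))  # 0x7000 of memory the sum misses
```

Under these assumptions the summed figure reproduces the 0x335000 marker and the span figure reproduces the paddr=[23b000..33bfff] range from the boot log, so an elfloader placed at 0x335000 lands inside the last 0x7000 bytes of the root server image.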