On 2024-03-20 10:39, Jan Beulich wrote:
On 19.03.2024 21:58, Jason Andryuk wrote:
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -537,6 +537,97 @@ static paddr_t __init find_memory(
return INVALID_PADDR;
}
+static bool __init check_load_address(
+ const struct domain *d, const struct elf_binary *elf)
+{
+ paddr_t kernel_start = (paddr_t)elf->dest_base;
As before I think it is illegitimate to cast a pointer to paddr_t. The
variable type wants to remain such, but the cast wants to be to
unsigned long or uintptr_t (or else at least a comment wants adding).
Ok, thanks. This explains it clearly.
+/* Find an e820 RAM region that fits the kernel at a suitable alignment. */
+static paddr_t __init find_kernel_memory(
+ const struct domain *d, struct elf_binary *elf,
+ const struct elf_dom_parms *parms)
+{
+ paddr_t kernel_size = elf->dest_size;
+ unsigned int i;
+
+ for ( i = 0; i < d->arch.nr_e820; i++ )
+ {
+ paddr_t start = d->arch.e820[i].addr;
+ paddr_t end = d->arch.e820[i].addr + d->arch.e820[i].size;
+ paddr_t kstart, kend;
+
+ if ( d->arch.e820[i].type != E820_RAM ||
+ d->arch.e820[i].size < kernel_size )
+ continue;
+
+ kstart = ROUNDUP(start, parms->phys_align);
+ kstart = max_t(paddr_t, kstart, parms->phys_min);
I'd generally try to avoid max_t(), but I cannot think of any good way
of expressing this without using it.
+ kend = kstart + kernel_size;
+
+ if ( kend > parms->phys_max )
So despite its default phys_max is exclusive aiui. Otherwise with
kend being exclusive too, this would look to be off by one.
Yes, I'll fix the off-by-one. Hmmm, phys_max being 32bit, we want it to
be inclusive to represent the maximum range.
+ return 0;
+
+ if ( kend <= end )
+ return kstart;
+ }
+
+ return 0;
+}
+
+/* Check the kernel load address, and adjust if necessary and possible. */
+static bool __init check_and_adjust_load_address(
+ const struct domain *d, struct elf_binary *elf, struct elf_dom_parms
*parms)
+{
+ paddr_t reloc_base;
+
+ if ( check_load_address(d, elf) )
+ return true;
+
+ if ( !parms->phys_reloc )
+ {
+ printk("%pd kernel: Address conflict and not relocatable\n", d);
+ return false;
This better wouldn't result in -ENOMEM in the caller. Then again reasons
are logged, so the specific error code perhaos doesn't matter that much.
Failure here is turned into -ENOMEM by pvh_load_kernel(). -ENOMEM is
already returned for later failure with find_memory(), so I thought it
was acceptable. Without this code, elf_load_binary() would failed with
-1 and that would be returned. I'll change it to whatever you prefer.
+ }
+
+ reloc_base = find_kernel_memory(d, elf, parms);
+ if ( reloc_base == 0 )
With ! used above please be consistent and do so here, too.
phys_reloc is a bool, and reloc_base is a paddr_t.
+ {
+ printk("%pd kernel: Failed find a load address\n", d);
+ return false;
+ }
+
+ if ( opt_dom0_verbose )
+ printk("%pd kernel: Moving [%p, %p] -> [%"PRIpaddr", %"PRIpaddr"]\n",
d,
+ elf->dest_base, elf->dest_base + elf->dest_size - 1,
+ reloc_base, reloc_base + elf->dest_size - 1);
+
+ parms->phys_entry = reloc_base +
+ (parms->phys_entry - (paddr_t)elf->dest_base);
+ elf->dest_base = (char *)reloc_base;
Just as a remark, no request to change anything: We're not dealing with
strings here. Hence char * isn't quite right, just that "dest_base" is
of that type (for whatever reason).
I think the reason is just to be byte addressable.
@@ -225,6 +232,28 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
case XEN_ELFNOTE_PHYS32_ENTRY:
parms->phys_entry = val;
break;
+
+ case XEN_ELFNOTE_PHYS32_RELOC:
+ parms->phys_reloc = true;
+
+ if ( descsz >= 4 )
+ {
+ parms->phys_max = elf_note_numeric_array(elf, note, 4, 0);
+ elf_msg(elf, " = max: %#"PRIx32, parms->phys_max);
+ }
+ if ( descsz >= 8 )
+ {
+ parms->phys_min = elf_note_numeric_array(elf, note, 4, 1);
+ elf_msg(elf, " min: %#"PRIx32, parms->phys_min);
+ }
+ if ( descsz >= 12 )
+ {
+ parms->phys_align = elf_note_numeric_array(elf, note, 4, 2);
+ elf_msg(elf, " align: %#"PRIx32, parms->phys_align);
+ }
+
+ elf_msg(elf, "\n");
Imo the newline wants adding outside of the switch(), for everything
set to ELFNOTE_CUSTOM. In fact with that I think ELFNOTE_CUSTOM and
ELFNOTE_NAME could be folded (with what's NAME right now simply
printing nothing in the switch here). Which suggests that I may
better not commit the (slightly adjusted) earlier patch, yet.
Yes, please hold off and I'll re-work both.
@@ -536,6 +565,10 @@ elf_errorstatus elf_xen_parse(struct elf_binary *elf,
parms->p2m_base = UNSET_ADDR;
parms->elf_paddr_offset = UNSET_ADDR;
parms->phys_entry = UNSET_ADDR32;
+ parms->phys_min = 0;
+ parms->phys_max = 0xffffffff;
+ parms->phys_align = 0x200000;
I think this wants to be MB(2) (requiring a pre-patch to centralize MB()
in the tool stack to tools/include/xen-tools/common-macros.h). And I
further think this needs to be an arch-specific constant, even if right
now the note is expected to be present only for x86. Which then also
needs saying ...
Ok, I'll look into this.
--- a/xen/include/public/elfnote.h
+++ b/xen/include/public/elfnote.h
@@ -194,10 +194,26 @@
*/
#define XEN_ELFNOTE_PHYS32_ENTRY 18
+/*
+ * Physical loading constraints for PVH kernels
+ *
+ * Used to place constraints on the guest physical loading addresses and
+ * alignment for a PVH kernel.
+ *
+ * The presence of this note indicates the kernel supports relocating itself.
+ *
+ * The note may include up to three 32bit values in the following order:
+ * - a maximum address for the entire image to be loaded below (default
+ * 0xffffffff)
+ * - a minimum address for the start of the image (default 0)
+ * - a required start alignment (default 0x200000)
+ */
+#define XEN_ELFNOTE_PHYS32_RELOC 19
... in the comment here. The reasoning behind not (intermediately) falling
back to the alignment in the ELF headers also wants adding to the patch
description (or a code comment, albeit there's no really good place to put
it, imo).
Additionally I think the pieces of the comment want re-ordering. The primary
purpose is indicating relocatability; when not relocating, the constraints
aren't applied in any way.
Yes, this is a good point.
Thanks for the review.
-Jason