Jason, can you run objdump -Sdr on jump2ipl.o on a broken variant?
On 06.02.20 12:00, Thomas Huth wrote: > On 06/02/2020 11.09, Christian Borntraeger wrote: >> >> >> On 05.02.20 19:21, Jason J. Herne wrote: >>> This fixes vfio-ccw when booting non-Linux operating systems. Without this >>> struct being packed, a few extra bytes of low core memory get overwritten >>> when >>> we assign a value to memory address 0 in jump_to_IPL_2. This is enough to >>> cause some non-Linux OSes of fail when booting. >>> >>> The problem was introduced by: >>> 5c6f0d5f46a77d77 "pc-bios/s390x: Fix reset psw mask". >>> >>> The fix is to pack the struct thereby removing the 4 bytes of padding that >>> get >>> added at the end, likely to allow an array of these structs to naturally >>> align >>> on an 8-byte boundary. >>> >>> Fixes: 5c6f0d5f46a7 ("pc-bios/s390x: Fix reset psw mask") >>> CC: Janosch Frank <fran...@linux.ibm.com> >>> Signed-off-by: Jason J. Herne <jjhe...@linux.ibm.com> >>> --- >>> pc-bios/s390-ccw/jump2ipl.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c >>> index da13c43cc0..1e9eaa037f 100644 >>> --- a/pc-bios/s390-ccw/jump2ipl.c >>> +++ b/pc-bios/s390-ccw/jump2ipl.c >>> @@ -18,7 +18,7 @@ >>> typedef struct ResetInfo { >>> uint64_t ipl_psw; >>> uint32_t ipl_continue; >>> -} ResetInfo; >>> +} __attribute__((packed)) ResetInfo; >>> >>> static ResetInfo save; >> >> Just looked into that. >> >> We do save the old content in "save" and restore the old memory content. >> >> static void jump_to_IPL_2(void) >> { >> ResetInfo *current = 0; >> >> void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue; >> --->*current = save; >> ipl(); /* should not return */ >> } >> >> void jump_to_IPL_code(uint64_t address) >> { >> /* store the subsystem information _after_ the bootmap was loaded */ >> write_subsystem_identification(); >> >> /* prevent unknown IPL types in the guest */ >> if (iplb.pbt == S390_IPL_TYPE_QEMU_SCSI) { >> iplb.pbt = S390_IPL_TYPE_CCW; >> set_iplb(&iplb); >> } >> >> /* >> * The IPL PSW is at address 0. We also must not overwrite the >> * content of non-BIOS memory after we loaded the guest, so we >> * save the original content and restore it in jump_to_IPL_2. >> */ >> ResetInfo *current = 0; >> >> --->save = *current; > > Right, and this should also work without your modification. I've stared > at the code a couple of weeks ago, looking for a very similar issue: > > https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03484.html > > ... but in the end, the problem was something else: > > https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03520.html > > and the fix had been done in the startup code of the test: > > https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04225.html > > So I'd guess that you face the very same problem here. That means, you > either have to convince the non-Linux OS to check their startup code > whether they depend on zeroed registers somewhere, or we fix this issue > for good in jump_to_IPL_2() by clearing the registers there before > jumping into the OS code (which we likely should do anyway since the OS > may expect a clean state). > > Thomas > >