On 4/11/20 3:57 PM, Nicholas Piggin wrote:
> Nicholas Piggin's on April 11, 2020 7:32 pm:
>> Nathan Chancellor's on April 11, 2020 10:53 am:
>>> The tt.config values are needed to reproduce but I did not verify that
>>> ONLY tt.config was needed. Other than that, no, we are just building
>>> either pseries_defconfig or powernv_defconfig with those configs and
>>> letting it boot up with a simple initramfs, which prints the version
>>> string then shuts the machine down.
>>>
>>> Let me know if you need any more information, cheers!
>>
>> Okay I can reproduce it. Sometimes it eventually recovers after a long
>> pause, and some keyboard input often helps it along. So that seems like
>> it might be a lost interrupt.
>>
>> POWER8 vs POWER9 might just be a timing thing if P9 is still hanging
>> sometimes. I wasn't able to reproduce it with defconfig+tt.config, I
>> needed your other config with various other debug options.
>>
>> Thanks for the very good report. I'll let you know what I find.
>
> It looks like a qemu bug. Booting with '-d int' shows the decrementer
> simply stops firing at the point of the hang, even though MSR[EE]=1 and
> the DEC register is wrapping. Linux appears to be doing the right thing
> as far as I can tell (not losing interrupts).
>
> This qemu patch fixes the boot hang for me. I don't know that qemu
> really has the right idea of "context synchronizing" as defined in the
> powerpc architecture -- mtmsrd L=1 is not context synchronizing but that
> does not mean it can avoid looking at exceptions until the next such
> event. It looks like the decrementer exception goes high but the
> execution of mtmsrd L=1 is ignoring it.
>
> Prior to the Linux patch 3282a3da25b you bisected to, interrupt replay
> code would return with an 'rfi' instruction as part of interrupt return,
> which probably helped to get things moving along a bit. However it would
> not be foolproof, and Cedric did say he encountered some mysterious
> lockups under load with qemu powernv before that patch was merged, so
> maybe it's the same issue?

Nope :/ but this is a fix for an important problem reported by Anton in
November. Attached is the test case.

Thanks,

C.

/*

Mikey and I noticed that the decrementer isn't firing when it should.
If a decrementer is pending and an mtmsrd(MSR_EE) is executed then
we should take the decrementer exception. From the PPC AS:

  If MSR EE = 0 and an External, Decrementer, or Performance Monitor
  exception is pending, executing an mtmsrd instruction that sets
  MSR EE to 1 will cause the interrupt to occur before the next
  instruction is executed, if no higher priority exception exists

A test case is below. r31 is incremented for every decrementer
exception.

powerpc64le-linux-gcc -c test.S
powerpc64le-linux-ld -Ttext=0x0 -o test.elf test.o
powerpc64le-linux-objcopy -O binary test.elf test.bin

qemu-system-ppc64 -M powernv -cpu POWER9 -nographic -bios test.bin

"info registers" shows it looping in the lower loop, ie the decrementer
exception was never taken. r31 never moves.

If I build with:

powerpc64le-linux-gcc -DFIX_BROKEN -c test.S

I see r31 move.

*/

#include <ppc-asm.h>

/* Load an immediate 64-bit value into a register */
#define LOAD_IMM64(r, e)                        \
        lis     r,(e)@highest;                  \
        ori     r,r,(e)@higher;                 \
        rldicr  r,r, 32, 31;                    \
        oris    r,r, (e)@h;                     \
        ori     r,r, (e)@l;

#define FIXUP_ENDIAN                                              \
        tdi   0,0,0x48;   /* Reverse endian of b . + 8         */ \
        b     191f;       /* Skip trampoline if endian is good */ \
        .long 0xa600607d; /* mfmsr r11                         */ \
        .long 0x01006b69; /* xori r11,r11,1                    */ \
        .long 0x05009f42; /* bcl 20,31,$+4                     */ \
        .long 0xa602487d; /* mflr r10                          */ \
        .long 0x14004a39; /* addi r10,r10,20                   */ \
        .long 0xa64b5a7d; /* mthsrr0 r10                       */ \
        .long 0xa64b7b7d; /* mthsrr1 r11                       */ \
        .long 0x2402004c; /* hrfid                             */ \
191:

        .= 0x0
.globl _start
_start:
        b       1f

        .= 0x10
        FIXUP_ENDIAN
        b       1f

        .= 0x100
1:
        FIXUP_ENDIAN
        b       __initialize

#define EXCEPTION(nr)           \
        .= nr                   ;\
        b       .

        /* More exception stubs */
        EXCEPTION(0x300)
        EXCEPTION(0x380)
        EXCEPTION(0x400)
        EXCEPTION(0x480)
        EXCEPTION(0x500)
        EXCEPTION(0x600)
        EXCEPTION(0x700)
        EXCEPTION(0x800)

        .= 0x900
        LOAD_IMM64(r0, 0x1000000)
        mtdec   r0
        addi    r31,r31,1
        rfid

        EXCEPTION(0x980)
        EXCEPTION(0xa00)
        EXCEPTION(0xb00)
        EXCEPTION(0xc00)
        EXCEPTION(0xd00)
        EXCEPTION(0xe00)
        EXCEPTION(0xe20)
        EXCEPTION(0xe40)
        EXCEPTION(0xe60)
        EXCEPTION(0xe80)
        EXCEPTION(0xf00)
        EXCEPTION(0xf20)
        EXCEPTION(0xf40)
        EXCEPTION(0xf60)
        EXCEPTION(0xf80)
        EXCEPTION(0x1000)
        EXCEPTION(0x1100)
        EXCEPTION(0x1200)
        EXCEPTION(0x1300)
        EXCEPTION(0x1400)
        EXCEPTION(0x1500)
        EXCEPTION(0x1600)

__initialize:
        /* SF, HV, EE, RI, LE */
        LOAD_IMM64(r0, 0x9000000000008003)
        mtmsrd  r0

        /* HID0: HILE */
        LOAD_IMM64(r0, 0x800000000000000)
        mtspr   0x3f0,r0

        LOAD_IMM64(r0, 0x1000000)
        mtdec   r0

1:
        LOAD_IMM64(r30,0x8000)
        mtmsrd  r30,1

        /* We should take the decrementer here */

#ifdef FIX_BROKEN
        LOAD_IMM64(r29,0x100000000)
        mtctr   r29
2:      bdnz    2b
#endif

        li      r30,0x0
        mtmsrd  r30,1
        b       1b
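
For anyone who wants the expected behaviour without decoding the assembly, here is a toy Python model of the lower loop in the test case. It is only a sketch and not QEMU code: `ToyCpu`, `recheck_on_mtmsrd` and `run_loop` are made-up names, only MSR[EE] is modelled, and it assumes the decrementer has already wrapped every time EE is re-enabled (in the real test DEC is reloaded with 0x1000000, so it fires far less often). With `recheck_on_mtmsrd=False` the pending exception is never looked at when mtmsrd L=1 sets EE, so r31 never moves. With `recheck_on_mtmsrd=True` the interrupt is taken before the next instruction, as the PPC AS text quoted in the test case requires.

```python
# Toy model of the test loop. NOT QEMU code; all names are hypothetical.
MSR_EE = 0x8000  # external interrupt enable bit, same value the test loads


class ToyCpu:
    def __init__(self, recheck_on_mtmsrd):
        self.msr = 0
        self.dec_pending = False  # decrementer exception is "high"
        self.r31 = 0              # counts decrementer interrupts taken
        self.recheck_on_mtmsrd = recheck_on_mtmsrd

    def take_pending_interrupt(self):
        """Deliver the decrementer interrupt if pending and EE is set."""
        if self.dec_pending and (self.msr & MSR_EE):
            # Stand-in for the 0x900 handler: reload DEC, bump r31, rfid.
            self.dec_pending = False
            self.r31 += 1

    def mtmsrd_l1(self, value):
        """mtmsrd with L=1; only the EE bit is modelled here."""
        self.msr = (self.msr & ~MSR_EE) | (value & MSR_EE)
        if self.recheck_on_mtmsrd:
            # Behaviour the architecture text asks for: look at pending
            # exceptions before the next instruction executes.
            self.take_pending_interrupt()


def run_loop(cpu, iterations):
    """Model of the '1:' loop: set EE, then clear EE, repeat."""
    for _ in range(iterations):
        cpu.dec_pending = True   # assumption: DEC wrapped while EE was 0
        cpu.mtmsrd_l1(MSR_EE)    # "We should take the decrementer here"
        cpu.mtmsrd_l1(0)
    return cpu.r31


if __name__ == "__main__":
    print("no recheck:", run_loop(ToyCpu(recheck_on_mtmsrd=False), 5))
    print("recheck:   ", run_loop(ToyCpu(recheck_on_mtmsrd=True), 5))
```

The model deliberately says nothing about where QEMU actually samples pending exceptions. It only contrasts "ignore on mtmsrd L=1" with "check on mtmsrd L=1", which is the difference the test case is probing.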