Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
On Mon, Jun 13, 2022 at 07:44:24PM +0200, Mark Kettenis wrote:
> > From: kiltz
> > Date: Mon, 13 Jun 2022 18:12:27 +0200
> >
> > Dear Mark,
> > first of all, thank you very much for your explanations, the diff
> > and, indeed, the ultra swift reply!
> > That helps us a lot already.
> > A snapshot with a higher value of max CPUs out of the box, of course,
> > would be the proverbial icing on the cake.
> > Probably a strange question but I hazard it anyways - should we
> > monitor the snapshot directory, the /pub/OpenBSD/snapshots folder, or is
> > there a quicker way to find out what your fellow developers think?
> > Again, many thanks for your help and best wishes,
>
> Hi Stefan,
>
> Theo put that diff in snapshots.  I suspect that tomorrow's snapshot
> will have it.  You can easily tell, since all 80 CPUs should attach
> with that diff.
>
> Cheers,
>
> Mark

And it's nice to hear that SP install already worked.  I remember
booting it up on an Oracle machine with an Ampere Altra which led to
messages like

agintcmsi0 at agintc0: unsupported type 0x001700026f31

See http://ix.io/3GEX

While I had a diff somewhere to 'fix that', I never got the timer
interrupt to fire.  That you already had an SP install means all that
should be fine.

If this change/new snap works, I'd be interested to read a full dmesg!

Cheers,
Patrick
Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
> From: kiltz
> Date: Mon, 13 Jun 2022 18:12:27 +0200
>
> Dear Mark,
> first of all, thank you very much for your explanations, the diff
> and, indeed, the ultra swift reply!
> That helps us a lot already.
> A snapshot with a higher value of max CPUs out of the box, of course,
> would be the proverbial icing on the cake.
> Probably a strange question but I hazard it anyways - should we
> monitor the snapshot directory, the /pub/OpenBSD/snapshots folder, or is
> there a quicker way to find out what your fellow developers think?
> Again, many thanks for your help and best wishes,

Hi Stefan,

Theo put that diff in snapshots.  I suspect that tomorrow's snapshot
will have it.  You can easily tell, since all 80 CPUs should attach
with that diff.

Cheers,

Mark
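One way to apply the "all 80 CPUs should attach" check is to count the
"cpuN at ..." attach lines in a saved dmesg.  The following is only a
minimal sketch in C, not an OpenBSD tool: the function name
count_attached_cpus is hypothetical, and it assumes that every attached
CPU produces exactly one line starting with "cpuN at ".

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: count lines of the form "cpuN at ..." in a
 * dmesg-style text buffer.  Follow-up lines such as "cpu0: <details>"
 * are deliberately not counted.
 */
int
count_attached_cpus(const char *dmesg)
{
	const char *line = dmesg;
	int count = 0;

	assert(dmesg != NULL);
	while (*line != '\0') {
		const char *p = line;

		/* Match "cpu", at least one digit, then " at ". */
		if (strncmp(p, "cpu", 3) == 0 &&
		    isdigit((unsigned char)p[3])) {
			p += 3;
			while (isdigit((unsigned char)*p))
				p++;
			if (strncmp(p, " at ", 4) == 0)
				count++;
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line == NULL)
			break;
		line++;
	}
	return count;
}
```

With the dmesg text read into a buffer, a result of 80 would match the
expectation stated above; a result of 32 would suggest a kernel that
still has the old limit.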
Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
Dear Mark,

first of all, thank you very much for your explanations, the diff
and, indeed, the ultra swift reply!
That helps us a lot already.
A snapshot with a higher value of max CPUs out of the box, of course,
would be the proverbial icing on the cake.
Probably a strange question but I hazard it anyways - should we
monitor the snapshot directory, the /pub/OpenBSD/snapshots folder, or is
there a quicker way to find out what your fellow developers think?
Again, many thanks for your help and best wishes,

Stefan

--
Dr.-Ing. Stefan Kiltz

Otto-von-Guericke University of Magdeburg
ITI Research Group on
Multimedia and Security
Universitaetsplatz 2
39106 Magdeburg
Germany

Tel: +49-391-67-52838
Fax: +49-391-67-18110

eMail: ki...@iti.cs.uni-magdeburg.de
Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
> From: kiltz
> Date: Mon, 13 Jun 2022 14:46:39 +0200

Hi Stefan,

> Dear kind people at OpenBSD.org,
> we want to run OpenBSD as a firewall system on a Gigabyte R152_P30
> with the following specifications:
>
>     Ampere Altra Q80-33 processor (80 cores, 3.3 GHz)
>     512 GB RAM (3200 MHz ECC-reg.)
>     2 x 480 GB SSD SATA 6 Gb/s 2.5''
>     Dual-Port 1 GbE (RJ-45)
>     IPMI 2.0 Baseboard Management Controller (BMC)
>     1 x PCIe4.0 x16 (FHHL)
>     1 x PCIe3.0 x16 OCP2.0 (occupied)
>     1 x USB 3.0 (front), 3 x USB 3.0 (rear), 1 x VGA (rear)
>
> We tried both:
> - official stable 7.1 (/pub/OpenBSD/7.1/arm64) and
> - snapshot from 6th of June 2022 (/pub/OpenBSD/snapshots/arm64)
>
> The repeatable result is a working install in single CPU/Core
> installation mode, cpu panic after first reboot with mp kernel.  We use
> the serial to LAN console provided by the IPMI/BMC card.
> Attached you will find screenshots from:
>
> - the last 49 columns of the reboot into mp kernel
>   (Screenshot_boot_after_install_Gigabyte_R152_P30 at 2022-06-13
>   13-51-00.png),
> - the ddb trace output (Screenshot ddb_trace_2022-06-13 14-02-11.png),
> - the ddb ps output (Screenshot ddb_ps_at 2022-06-13 14-03-25.png),
> - the ddb show panic output (Screenshot ddb_show_panic_at 2022-06-13
>   14-04-28.png)
> - the ddb show registers output (Screenshot ddb_show_registers_at
>   2022-06-13 14-06-34.png)
>
> Due to the nature of the early boot panic, the kernel output is not
> accessible to us.
>
> Interestingly, FreeBSD only supports them in their current release,
> the stable fails with a similar panic.  They seem to have found a fix
> of sorts.  But we very much prefer OpenBSD for the firewalling role of
> aforementioned system.
>
> Of course we support your effort so if you need more info from us
> regarding the circumstances, we will happily try and supply the
> required information.

The immediate problem is that OpenBSD currently supports a maximum of
32 CPUs.  That limit is a bit arbitrary, so the diff below bumps it to
128.  You could try building a GENERIC.MP kernel with this diff after
booting the GENERIC (bsd.sp) single-processor kernel.  I'll see what
my fellow developers think about bumping MAXCPUS.  Depending on the
outcome of that a snapshot with this change may be available in a few
days.

I'm not sure how well OpenBSD/arm64 scales to 80 CPUs.  Probably not
very well but I guess there is only one way to find out...

Cheers,

Mark


Index: arch/arm64/include/cpu.h
===================================================================
RCS file: /cvs/src/sys/arch/arm64/include/cpu.h,v
retrieving revision 1.25
diff -u -p -r1.25 cpu.h
--- arch/arm64/include/cpu.h	23 Mar 2022 23:36:35 -0000	1.25
+++ arch/arm64/include/cpu.h	13 Jun 2022 15:09:32 -0000
@@ -184,7 +184,7 @@ extern struct cpu_info *cpu_info_list;
 #define CPU_INFO_FOREACH(cii, ci)	for (cii = 0, ci = cpu_info_list; \
 					ci != NULL; ci = ci->ci_next)
 #define CPU_INFO_UNIT(ci)	((ci)->ci_dev ? (ci)->ci_dev->dv_unit : 0)
-#define MAXCPUS	32
+#define MAXCPUS	128

 extern struct cpu_info *cpu_info[MAXCPUS];
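To illustrate why a compile-time constant like MAXCPUS caps the number
of usable CPUs, here is a minimal user-space sketch in C.  It is not
the kernel code: every name prefixed with sketch_ or SKETCH_ is
hypothetical, and the sketch simply refuses extra CPUs, whereas the
thread does not show how the real kernel behaves once the limit is
exceeded (the report only says the MP kernel panicked).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's MAXCPUS (32 before the diff). */
#define SKETCH_MAXCPUS 32

/* Hypothetical, heavily reduced stand-in for struct cpu_info. */
struct sketch_cpu {
	int unit;	/* CPU number as enumerated by the firmware */
};

/* Fixed-size table, sized at compile time like cpu_info[MAXCPUS]. */
static struct sketch_cpu *sketch_cpu_table[SKETCH_MAXCPUS];
static struct sketch_cpu sketch_cpu_storage[SKETCH_MAXCPUS];
static int sketch_ncpus;

/* Register one CPU.  Returns 0 on success, -1 once the table is full. */
int
sketch_cpu_attach(int unit)
{
	assert(unit >= 0);
	if (sketch_ncpus >= SKETCH_MAXCPUS)
		return -1;
	sketch_cpu_storage[sketch_ncpus].unit = unit;
	sketch_cpu_table[sketch_ncpus] = &sketch_cpu_storage[sketch_ncpus];
	sketch_ncpus++;
	return 0;
}

/* Try to attach `found` CPUs and return how many fit in the table. */
int
sketch_attach_all(int found)
{
	int unit, attached = 0;

	sketch_ncpus = 0;
	for (unit = 0; unit < found; unit++)
		if (sketch_cpu_attach(unit) == 0)
			attached++;
	return attached;
}
```

With SKETCH_MAXCPUS at 32, sketch_attach_all(80) can register only 32
of the 80 cores; raising the constant to 128, as the diff does for
MAXCPUS, leaves room for all of them.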
Re: System upgraded from 7.0 to 7.1 hangs after fs mounts
On 2022-05-21 15:33, Johan Huldtgren wrote:
> On 2022/05/21 14:23, Mark Kettenis wrote:
> >> Date: Sat, 21 May 2022 13:13:19 -0400
> >> From: Johan Huldtgren
> >>
> >> On 2022/05/21 12:43, Mark Kettenis wrote:
> >>>> Date: Sat, 21 May 2022 12:36:03 -0400
> >>>> From: Johan Huldtgren
> >>>>
> >>>> hello,
> >>>>
> >>>> On 2022/05/21 12:08, Mark Kettenis wrote:
> >>>>>> Date: Sat, 21 May 2022 10:31:37 -0400
> >>>>>> From: Johan Huldtgren
> >>>>>>
> >>>>>> hello,
> >>>>>>
> >>>>>> Details below, but commenting out 'ttyflags -a' from /etc/rc lets
> >>>>>> this host boot. I wrote much of this e-mail while going through it,
> >>>>>> so while we know now what the issue is I'm leaving my responses in
> >>>>>> case it sheds light on anything.
> >>>>>
> >>>>> So it seems your machine incorrectly advertises a serial port that
> >>>>> doesn't actually exist:
> >>>>>
> >>>>>> com1 at acpi0 UAR1 addr 0x2f8/0x8 irq 3: ti16750, 64 byte fifo
> >>>>>> com1: probed fifo depth: 0 bytes
> >>>>
> >>>> I think you're right, Crystal asked about it in a previous
> >>>> mail which I didn't get a chance to respond to, but I do not
> >>>> see com1 being reported in the 7.0 dmesg from last night nor
> >>>> in any older dmesgs I've been able to dig up and I don't
> >>>> believe anything with this hardware has changed as long as I've
> >>>> had it.
> >>>>
> >>>>> This may be a bug in our ACPI code.  Can you send the contents of
> >>>>> /var/db/acpi on your machine?
> >>>>
> >>>> root@www ~]# ls -al /var/db/acpi/
> >>>> total 164
> >>>> drwxr-xr-x   2 root  wheel    512 May 20 21:26 ./
> >>>> drwxr-xr-x  15 root  wheel   1024 May 21 06:10 ../
> >>>> -rw-r--r--   1 root  wheel    146 May 21 06:55 APIC.3
> >>>> -rw-r--r--   1 root  wheel    120 May 21 06:55 DMAR.12
> >>>> -rw-r--r--   1 root  wheel  44470 May 21 06:55 DSDT.2
> >>>> -rw-r--r--   1 root  wheel    244 May 21 06:55 FACP.1
> >>>> -rw-r--r--   1 root  wheel     68 May 21 06:55 FPDT.4
> >>>> -rw-r--r--   1 root  wheel     56 May 21 06:55 HPET.7
> >>>> -rw-r--r--   1 root  wheel     60 May 21 06:55 MCFG.5
> >>>> -rw-r--r--   1 root  wheel    190 May 21 06:55 PRAD.6
> >>>> -rw-r--r--   1 root  wheel     80 Sep 17  2019 RSDT.0
> >>>> -rw-r--r--   1 root  wheel     64 May 21 06:55 SPMI.9
> >>>> -rw-r--r--   1 root  wheel   2468 May 21 06:55 SSDT.10
> >>>> -rw-r--r--   1 root  wheel   2696 May 21 06:55 SSDT.11
> >>>> -rw-r--r--   1 root  wheel    877 May 21 06:55 SSDT.8
> >>>> -rw-r--r--   1 root  wheel    124 May 21 06:55 XSDT.0
> >>>> -rw-r--r--   1 root  wheel   2520 May 21 06:55 headers
> >>>>
> >>>> Do you need the files? I can tar that directory up and
> >>>> make it available.
> >>>
> >>> Right we need all of those.
> >>
> >> http://www.huldtgren.com/panics/20220520/acpi.tgz
> >
> > It looks as if the ACPI AML is properly checking that the UART is
> > enabled in the NCT6776F SuperIO chip.  Can you build a kernel with the
> > diff below and mail the dmesg from that kernel?
> >
> >
> > Index: dev/acpi/acpi.c
> > ===================================================================
> > RCS file: /cvs/src/sys/dev/acpi/acpi.c,v
> > retrieving revision 1.413
> > diff -u -p -r1.413 acpi.c
> > --- dev/acpi/acpi.c	17 Feb 2022 00:21:40 -0000	1.413
> > +++ dev/acpi/acpi.c	21 May 2022 18:20:20 -0000
> > @@ -3095,6 +3095,7 @@ acpi_foundhid(struct aml_node *node, voi
> >  		return (0);
> >
> >  	sta = acpi_getsta(sc, node->parent);
> > +	printf("_STA: 0x%02llx\n", sta);
> >  	if ((sta & (STA_PRESENT | STA_ENABLED)) != (STA_PRESENT | STA_ENABLED))
> >  		return (0);

Did this provide any clues as to what is going on? If not, and this
hardware is just odd and the workaround is simply to comment out the
ttyflags line in /etc/rc, I'll add that to my upgrade notes for this
machine.
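For reading the "_STA: 0x.." values that the debug printf in the diff
emits, the following minimal sketch in C decodes the status bits and
repeats the present-and-enabled test from the diff.  The SK_ and sk_
names are hypothetical and are not the kernel's own macros or
functions; the bit positions are assumed from the ACPI specification's
description of _STA (bit 0 present, bit 1 enabled, bit 2 shown in UI,
bit 3 functioning).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* _STA bit positions, assumed from the ACPI specification. */
#define SK_STA_PRESENT		(1u << 0)
#define SK_STA_ENABLED		(1u << 1)
#define SK_STA_SHOW_UI		(1u << 2)
#define SK_STA_FUNCTIONING	(1u << 3)

/*
 * Same test as in the diff: a device is only considered for attachment
 * when it is both present and enabled.  Returns 1 if so, 0 otherwise.
 */
int
sk_sta_would_attach(uint64_t sta)
{
	const uint64_t need = SK_STA_PRESENT | SK_STA_ENABLED;

	return (sta & need) == need;
}

/* Write a human-readable summary of a _STA value into buf. */
void
sk_sta_describe(uint64_t sta, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "0x%02llx:%s%s%s%s",
	    (unsigned long long)sta,
	    (sta & SK_STA_PRESENT) ? " present" : "",
	    (sta & SK_STA_ENABLED) ? " enabled" : "",
	    (sta & SK_STA_SHOW_UI) ? " show-ui" : "",
	    (sta & SK_STA_FUNCTIONING) ? " functioning" : "");
}
```

Under these assumptions a value of 0x0f passes the check, while 0x0d
(present but not enabled) or 0x00 does not, so a nonexistent UART that
still reports 0x0f would be attached.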
thanks,

.jh

> OpenBSD 7.1-current (GENERIC.MP) #1: Sat May 21 15:19:34 EDT 2022
>     jo...@xasthur.home.huldtgren.net:/sys/arch/amd64/compile/GENERIC.MP
> real mem = 17127677952 (16334MB)
> avail mem = 16591237120 (15822MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xeb4c0 (54 entries)
> bios0: vendor American Megatrends Inc. version "2.00" date 05/08/2012
> bios0: Supermicro X9SCD
> acpi0 at bios0: ACPI 4.0
> acpi0: sleep states S0 S4 S5
> acpi0: tables DSDT FACP APIC FPDT MCFG PRAD HPET SSDT SPMI SSDT SSDT DMAR
> acpi0: wakeup devices PS2K(S4) PS2M(S4) UAR1(S4) P0P1(S4) USB1(S4) USB2(S4)
> USB3(S4) USB4(S4) USB5(S4) USB6(S4) USB7(S4) RP01(S4) PXSX(S4) RP02(S4)
> PXSX(S4) RP03(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU E3-1240 V2 @ 3.40GHz, 3392.86 MHz, 06-3a-09
> cpu0:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSS