Re: aarch64 main-n263493-4e8d558c9d1c-dirty (so: 2023-Jun-10) Kyuafile run: "Fatal data abort" crash during vnet_register_sysinit

2023-06-24 Thread Mark Millard
On Jun 24, 2023, at 14:26, John F Carr  wrote:

> 
>> On Jun 24, 2023, at 13:00, Mark Millard  wrote:
>> 
>> The running system build is a non-debug build (but
>> with symbols not stripped).
>> 
>> The HoneyComb's console log shows:
>> 
>> . . .
>> GEOM_STRIPE: Device stripe.IMfBZr destroyed.
>> GEOM_NOP: Device md0.nop created.
>> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
>> GEOM_NOP: Device md0.nop removed.
>> GEOM_NOP: Device md0.nop created.
>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>> GEOM_NOP: Device md0.nop removed.
>> GEOM_NOP: Device md0.nop created.
>> GEOM_NOP: Device md0.nop removed.
>> Fatal data abort:
>> x0: a02506e64400
>> x1: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>> x2:   4b
>> x3: a343932b0b22fb30
>> x4:0
>> x5:  3310b0d062d0e1d
>> x6: 1d0e2d060d0b3103
>> x7:0
>> x8: ea325df8
>> x9: 0001eec946d0 ($d.6 + 0)
>> x10: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>> x11:0
>> x12:0
>> x13: 00cd8960 (lock_class_mtx_sleep + 0)
>> x14:0
>> x15: a02506e64405
>> x16: 0001eec94860 (_DYNAMIC + 160)
>> x17: 0063a450 (ifc_attach_cloner + 0)
>> x18: 0001eb290400 (g_raid3_post_sync + 48a3178)
>> x19: 0001eec94600 (vnet_epair_init_vnet_init + 0)
>> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
>> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x23: a042e500
>> x24: a042e500
>> x25: 00ce0788 (linker_lookup_set_desc + 0)
>> x26: a0203cdef780
>> x27: 0001eec94698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
>> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x29: 0001eb290430 (g_raid3_post_sync + 48a31a8)
>> sp: 0001eb290400
>> lr: 0001eec82a4c ($x.1 + 3c)
>> elr: 0001eec82a60 ($x.1 + 50)
>> spsr: 6045
>> far: 0002d8fba4c8
>> esr: 9646
>> panic: vm_fault failed: 0001eec82a60 error 1
>> cpuid = 14
>> time = 1687625470
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>> vpanic() at vpanic+0x13c
>> panic() at panic+0x44
>> data_abort() at data_abort+0x2fc
>> handle_el1h_sync() at handle_el1h_sync+0x14
>> --- exception, esr 0x9646
>> $x.1() at $x.1+0x50
>> vnet_register_sysinit() at vnet_register_sysinit+0x114
>> linker_load_module() at linker_load_module+0xae4
>> kern_kldload() at kern_kldload+0xfc
>> sys_kldload() at sys_kldload+0x60
>> do_el0_sync() at do_el0_sync+0x608
>> handle_el0_sync() at handle_el0_sync+0x44
>> --- exception, esr 0x5600
>> KDB: enter: panic
>> [ thread pid 70419 tid 101003 ]
>> Stopped at  kdb_enter+0x44: str xzr, [x19, #3200]
>> db> 
> 
> The failure appears to be initializing module if_epair.

Yep: trying:

# kldload if_epair.ko

was enough to cause the crash. (At that point I had tried
this only on the HoneyComb.)

I tried media dd'd from the recent main snapshot, booting the
same system. No crash. I moved my build boot media to some
other systems and tested them: crashes. I tried my boot media
built optimized for Cortex-A53 or Cortex-X1C/Cortex-A78C
instead of Cortex-A72: no crashes. (But only one system can
use the X1C/A78C code in that build.)

So, across these variations, only my builds that are
code-optimized for Cortex-A72 get the crashes. The same source
tree vintage built with Cortex-A53 or Cortex-X1C/Cortex-A78C
optimization does not get the crashes. But I also
demonstrated that a Cortex-A72-optimized build from 2023-Mar
gets the crash.

The last time I ran into one of these "crashes tied to
cortex-a72 code optimization" examples it turned out to be
some missing memory-model management code in FreeBSD's USB
code. But being lucky enough to help identify a FreeBSD
source code problem again seems not that likely. It could
easily be a code generation error by clang for all I know.

So, unless at some point I produce fairly solid evidence
that the code actually running is messed up by FreeBSD
source code, this should likely be treated as "blame the
operator" and should likely be largely ignored as things
are. (Just My Problem, as I want the Cortex-A72 optimized
builds.)

Sorry for the noise.

> I see no recent changes in that module that would be likely to break 
> initialization.
> 
> a9bfd080d09a if_epair: do not transmit packets that exceed the interface MTU
> 4d846d260e2b spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
> a6b55ee6be15 net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH
> c69ae8419734 if_epair: also remove vlan metadata from mbufs
> 29c9b1673305 epair: Remove unneeded includes and sort some of the rest

My kyua run examples included a Cortex-A72 optimized system build
from 2023-Mar. It also crashes. It looks like my last kyua
runs 

Re: aarch64 main-n263493-4e8d558c9d1c-dirty (so: 2023-Jun-10) Kyuafile run: "Fatal data abort" crash during vnet_register_sysinit

2023-06-24 Thread Mark Millard
On Jun 24, 2023, at 13:48, Mark Millard  wrote:

> On Jun 24, 2023, at 12:16, Mark Millard  wrote:
> 
>> On Jun 24, 2023, at 10:49, Mark Millard  wrote:
>> 
>>> On Jun 24, 2023, at 10:00, Mark Millard  wrote:
>>> 
>>>> The running system build is a non-debug build (but
>>>> with symbols not stripped).
>>>>
>>>> The HoneyComb's console log shows:
>>>>
>>>> . . .
>>>> GEOM_STRIPE: Device stripe.IMfBZr destroyed.
>>>> GEOM_NOP: Device md0.nop created.
>>>> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
>>>> GEOM_NOP: Device md0.nop removed.
>>>> GEOM_NOP: Device md0.nop created.
>>>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>>>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>>>> GEOM_NOP: Device md0.nop removed.
>>>> GEOM_NOP: Device md0.nop created.
>>>> GEOM_NOP: Device md0.nop removed.
>>>> Fatal data abort:
>>>> x0: a02506e64400
>>>> x1: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>>>> x2:   4b
>>>> x3: a343932b0b22fb30
>>>> x4:0
>>>> x5:  3310b0d062d0e1d
>>>> x6: 1d0e2d060d0b3103
>>>> x7:0
>>>> x8: ea325df8
>>>> x9: 0001eec946d0 ($d.6 + 0)
>>>> x10: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>>>> x11:0
>>>> x12:0
>>>> x13: 00cd8960 (lock_class_mtx_sleep + 0)
>>>> x14:0
>>>> x15: a02506e64405
>>>> x16: 0001eec94860 (_DYNAMIC + 160)
>>>> x17: 0063a450 (ifc_attach_cloner + 0)
>>>> x18: 0001eb290400 (g_raid3_post_sync + 48a3178)
>>>> x19: 0001eec94600 (vnet_epair_init_vnet_init + 0)
>>>> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
>>>> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>>>> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>>>> x23: a042e500
>>>> x24: a042e500
>>>> x25: 00ce0788 (linker_lookup_set_desc + 0)
>>>> x26: a0203cdef780
>>>> x27: 0001eec94698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
>>>> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>>>> x29: 0001eb290430 (g_raid3_post_sync + 48a31a8)
>>>> sp: 0001eb290400
>>>> lr: 0001eec82a4c ($x.1 + 3c)
>>>> elr: 0001eec82a60 ($x.1 + 50)
>>>> spsr: 6045
>>>> far: 0002d8fba4c8
>>>> esr: 9646
>>>> panic: vm_fault failed: 0001eec82a60 error 1
>>>> cpuid = 14
>>>> time = 1687625470
>>>> KDB: stack backtrace:
>>>> db_trace_self() at db_trace_self
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>>> vpanic() at vpanic+0x13c
>>>> panic() at panic+0x44
>>>> data_abort() at data_abort+0x2fc
>>>> handle_el1h_sync() at handle_el1h_sync+0x14
>>>> --- exception, esr 0x9646
>>>> $x.1() at $x.1+0x50
>>>> vnet_register_sysinit() at vnet_register_sysinit+0x114
>>>> linker_load_module() at linker_load_module+0xae4
>>>> kern_kldload() at kern_kldload+0xfc
>>>> sys_kldload() at sys_kldload+0x60
>>>> do_el0_sync() at do_el0_sync+0x608
>>>> handle_el0_sync() at handle_el0_sync+0x44
>>>> --- exception, esr 0x5600
>>>> KDB: enter: panic
>>>> [ thread pid 70419 tid 101003 ]
>>>> Stopped at  kdb_enter+0x44: str xzr, [x19, #3200]
>>>> db>
>>>>
>>>> I'll see if a re-run is repeatable.
>>>>
>>> 
>>> It repeats:
>>> 
>>> GEOM_STRIPE: Device stripe/stripe.VkbPk1 deactivated.
>>> GEOM_STRIPE: Disk md1 removed from stripe.VkbPk1.
>>> GEOM_STRIPE: Disk md0 removed from stripe.VkbPk1.
>>> GEOM_STRIPE: Device stripe.VkbPk1 destroyed.
>>> GEOM_NOP: Device md0.nop created.
>>> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
>>> GEOM_NOP: Device md0.nop removed.
>>> GEOM_NOP: Device md0.nop created.
>>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>>> GEOM_NOP: Device md0.nop removed.
>>> GEOM_NOP: Device md0.nop created.
>>> GEOM_NOP: Device md0.nop removed.
>>> Fatal data abort:
>>> x0: a0003b1a9500
>>> x1: 00021b530260
>>> x2:   4b
>>> x3: a343932b0b22fb30
>>> x4:0
>>> x5:  3310b0d062d0e1d
>>> x6: 1d0e2d060d0b3103
>>> x7:0
>>> x8: ea325df8
>>> x9: 00021d6946d0 ($d.6 + 0)
>>> x10: 00021b530260
>>> x11:0
>>> x12:0
>>> x13: 00cd8960 (lock_class_mtx_sleep + 0)
>>> x14:0
>>> x15: a0003b1a9505
>>> x16: 00021d694860 (_DYNAMIC + 160)
>>> x17: 0063a450 (ifc_attach_cloner + 0)
>>> x18: 00021a6ea400
>>> x19: 00021d694600 (vnet_epair_init_vnet_init + 0)
>>> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
>>> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>>> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>>> x23: a0431500
>>> x24: a0431500
>>> x25: 00ce0788 (linker_lookup_set_desc + 0)
>>> x26: a02e1ab6d180
>>> x27: 00021d694698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
>>> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_retu

Re: aarch64 main-n263493-4e8d558c9d1c-dirty (so: 2023-Jun-10) Kyuafile run: "Fatal data abort" crash during vnet_register_sysinit

2023-06-24 Thread John F Carr


> On Jun 24, 2023, at 13:00, Mark Millard  wrote:
> 
> The running system build is a non-debug build (but
> with symbols not stripped).
> 
> The HoneyComb's console log shows:
> 
> . . .
> GEOM_STRIPE: Device stripe.IMfBZr destroyed.
> GEOM_NOP: Device md0.nop created.
> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
> GEOM_NOP: Device md0.nop removed.
> GEOM_NOP: Device md0.nop created.
> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
> GEOM_NOP: Device md0.nop removed.
> GEOM_NOP: Device md0.nop created.
> GEOM_NOP: Device md0.nop removed.
> Fatal data abort:
>  x0: a02506e64400
>  x1: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>  x2:   4b
>  x3: a343932b0b22fb30
>  x4:0
>  x5:  3310b0d062d0e1d
>  x6: 1d0e2d060d0b3103
>  x7:0
>  x8: ea325df8
>  x9: 0001eec946d0 ($d.6 + 0)
> x10: 0001ea401880 (g_raid3_post_sync + 3a145f8)
> x11:0
> x12:0
> x13: 00cd8960 (lock_class_mtx_sleep + 0)
> x14:0
> x15: a02506e64405
> x16: 0001eec94860 (_DYNAMIC + 160)
> x17: 0063a450 (ifc_attach_cloner + 0)
> x18: 0001eb290400 (g_raid3_post_sync + 48a3178)
> x19: 0001eec94600 (vnet_epair_init_vnet_init + 0)
> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x23: a042e500
> x24: a042e500
> x25: 00ce0788 (linker_lookup_set_desc + 0)
> x26: a0203cdef780
> x27: 0001eec94698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x29: 0001eb290430 (g_raid3_post_sync + 48a31a8)
>  sp: 0001eb290400
>  lr: 0001eec82a4c ($x.1 + 3c)
> elr: 0001eec82a60 ($x.1 + 50)
> spsr: 6045
> far: 0002d8fba4c8
> esr: 9646
> panic: vm_fault failed: 0001eec82a60 error 1
> cpuid = 14
> time = 1687625470
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x13c
> panic() at panic+0x44
> data_abort() at data_abort+0x2fc
> handle_el1h_sync() at handle_el1h_sync+0x14
> --- exception, esr 0x9646
> $x.1() at $x.1+0x50
> vnet_register_sysinit() at vnet_register_sysinit+0x114
> linker_load_module() at linker_load_module+0xae4
> kern_kldload() at kern_kldload+0xfc
> sys_kldload() at sys_kldload+0x60
> do_el0_sync() at do_el0_sync+0x608
> handle_el0_sync() at handle_el0_sync+0x44
> --- exception, esr 0x5600
> KDB: enter: panic
> [ thread pid 70419 tid 101003 ]
> Stopped at  kdb_enter+0x44: str xzr, [x19, #3200]
> db> 

The failure appears to be initializing module if_epair.  I see no recent 
changes in that module that would be likely to break initialization.

a9bfd080d09a if_epair: do not transmit packets that exceed the interface MTU
4d846d260e2b spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
a6b55ee6be15 net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH
c69ae8419734 if_epair: also remove vlan metadata from mbufs
29c9b1673305 epair: Remove unneeded includes and sort some of the rest

Re: aarch64 main-n263493-4e8d558c9d1c-dirty (so: 2023-Jun-10) Kyuafile run: "Fatal data abort" crash during vnet_register_sysinit

2023-06-24 Thread Mark Millard
On Jun 24, 2023, at 12:16, Mark Millard  wrote:

> On Jun 24, 2023, at 10:49, Mark Millard  wrote:
> 
>> On Jun 24, 2023, at 10:00, Mark Millard  wrote:
>> 
>>> The running system build is a non-debug build (but
>>> with symbols not stripped).
>>> 
>>> The HoneyComb's console log shows:
>>> 
>>> . . .
>>> GEOM_STRIPE: Device stripe.IMfBZr destroyed.
>>> GEOM_NOP: Device md0.nop created.
>>> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
>>> GEOM_NOP: Device md0.nop removed.
>>> GEOM_NOP: Device md0.nop created.
>>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>>> GEOM_NOP: Device md0.nop removed.
>>> GEOM_NOP: Device md0.nop created.
>>> GEOM_NOP: Device md0.nop removed.
>>> Fatal data abort:
>>> x0: a02506e64400
>>> x1: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>>> x2:   4b
>>> x3: a343932b0b22fb30
>>> x4:0
>>> x5:  3310b0d062d0e1d
>>> x6: 1d0e2d060d0b3103
>>> x7:0
>>> x8: ea325df8
>>> x9: 0001eec946d0 ($d.6 + 0)
>>> x10: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>>> x11:0
>>> x12:0
>>> x13: 00cd8960 (lock_class_mtx_sleep + 0)
>>> x14:0
>>> x15: a02506e64405
>>> x16: 0001eec94860 (_DYNAMIC + 160)
>>> x17: 0063a450 (ifc_attach_cloner + 0)
>>> x18: 0001eb290400 (g_raid3_post_sync + 48a3178)
>>> x19: 0001eec94600 (vnet_epair_init_vnet_init + 0)
>>> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
>>> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>>> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>>> x23: a042e500
>>> x24: a042e500
>>> x25: 00ce0788 (linker_lookup_set_desc + 0)
>>> x26: a0203cdef780
>>> x27: 0001eec94698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
>>> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>>> x29: 0001eb290430 (g_raid3_post_sync + 48a31a8)
>>> sp: 0001eb290400
>>> lr: 0001eec82a4c ($x.1 + 3c)
>>> elr: 0001eec82a60 ($x.1 + 50)
>>> spsr: 6045
>>> far: 0002d8fba4c8
>>> esr: 9646
>>> panic: vm_fault failed: 0001eec82a60 error 1
>>> cpuid = 14
>>> time = 1687625470
>>> KDB: stack backtrace:
>>> db_trace_self() at db_trace_self
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>> vpanic() at vpanic+0x13c
>>> panic() at panic+0x44
>>> data_abort() at data_abort+0x2fc
>>> handle_el1h_sync() at handle_el1h_sync+0x14
>>> --- exception, esr 0x9646
>>> $x.1() at $x.1+0x50
>>> vnet_register_sysinit() at vnet_register_sysinit+0x114
>>> linker_load_module() at linker_load_module+0xae4
>>> kern_kldload() at kern_kldload+0xfc
>>> sys_kldload() at sys_kldload+0x60
>>> do_el0_sync() at do_el0_sync+0x608
>>> handle_el0_sync() at handle_el0_sync+0x44
>>> --- exception, esr 0x5600
>>> KDB: enter: panic
>>> [ thread pid 70419 tid 101003 ]
>>> Stopped at  kdb_enter+0x44: str xzr, [x19, #3200]
>>> db> 
>>> 
>>> I'll see if a re-run is repeatable.
>>> 
>> 
>> It repeats:
>> 
>> GEOM_STRIPE: Device stripe/stripe.VkbPk1 deactivated.
>> GEOM_STRIPE: Disk md1 removed from stripe.VkbPk1.
>> GEOM_STRIPE: Disk md0 removed from stripe.VkbPk1.
>> GEOM_STRIPE: Device stripe.VkbPk1 destroyed.
>> GEOM_NOP: Device md0.nop created.
>> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
>> GEOM_NOP: Device md0.nop removed.
>> GEOM_NOP: Device md0.nop created.
>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>> GEOM_NOP: Device md0.nop removed.
>> GEOM_NOP: Device md0.nop created.
>> GEOM_NOP: Device md0.nop removed.
>> Fatal data abort:
>> x0: a0003b1a9500
>> x1: 00021b530260
>> x2:   4b
>> x3: a343932b0b22fb30
>> x4:0
>> x5:  3310b0d062d0e1d
>> x6: 1d0e2d060d0b3103
>> x7:0
>> x8: ea325df8
>> x9: 00021d6946d0 ($d.6 + 0)
>> x10: 00021b530260
>> x11:0
>> x12:0
>> x13: 00cd8960 (lock_class_mtx_sleep + 0)
>> x14:0
>> x15: a0003b1a9505
>> x16: 00021d694860 (_DYNAMIC + 160)
>> x17: 0063a450 (ifc_attach_cloner + 0)
>> x18: 00021a6ea400
>> x19: 00021d694600 (vnet_epair_init_vnet_init + 0)
>> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
>> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x23: a0431500
>> x24: a0431500
>> x25: 00ce0788 (linker_lookup_set_desc + 0)
>> x26: a02e1ab6d180
>> x27: 00021d694698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
>> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x29: 00021a6ea430
>> sp: 00021a6ea400
>> lr: 00021d682a4c ($x.1 + 3c)
>> elr: 00021d682a60 ($x.1 + 50)
>> spsr: 6045
>> far: 0003079ba4c8
>

Re: aarch64 main-n263493-4e8d558c9d1c-dirty (so: 2023-Jun-10) Kyuafile run: "Fatal data abort" crash during vnet_register_sysinit

2023-06-24 Thread Mark Millard
On Jun 24, 2023, at 10:49, Mark Millard  wrote:

> On Jun 24, 2023, at 10:00, Mark Millard  wrote:
> 
>> The running system build is a non-debug build (but
>> with symbols not stripped).
>> 
>> The HoneyComb's console log shows:
>> 
>> . . .
>> GEOM_STRIPE: Device stripe.IMfBZr destroyed.
>> GEOM_NOP: Device md0.nop created.
>> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
>> GEOM_NOP: Device md0.nop removed.
>> GEOM_NOP: Device md0.nop created.
>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>> GEOM_NOP: Device md0.nop removed.
>> GEOM_NOP: Device md0.nop created.
>> GEOM_NOP: Device md0.nop removed.
>> Fatal data abort:
>> x0: a02506e64400
>> x1: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>> x2:   4b
>> x3: a343932b0b22fb30
>> x4:0
>> x5:  3310b0d062d0e1d
>> x6: 1d0e2d060d0b3103
>> x7:0
>> x8: ea325df8
>> x9: 0001eec946d0 ($d.6 + 0)
>> x10: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>> x11:0
>> x12:0
>> x13: 00cd8960 (lock_class_mtx_sleep + 0)
>> x14:0
>> x15: a02506e64405
>> x16: 0001eec94860 (_DYNAMIC + 160)
>> x17: 0063a450 (ifc_attach_cloner + 0)
>> x18: 0001eb290400 (g_raid3_post_sync + 48a3178)
>> x19: 0001eec94600 (vnet_epair_init_vnet_init + 0)
>> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
>> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x23: a042e500
>> x24: a042e500
>> x25: 00ce0788 (linker_lookup_set_desc + 0)
>> x26: a0203cdef780
>> x27: 0001eec94698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
>> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x29: 0001eb290430 (g_raid3_post_sync + 48a31a8)
>> sp: 0001eb290400
>> lr: 0001eec82a4c ($x.1 + 3c)
>> elr: 0001eec82a60 ($x.1 + 50)
>> spsr: 6045
>> far: 0002d8fba4c8
>> esr: 9646
>> panic: vm_fault failed: 0001eec82a60 error 1
>> cpuid = 14
>> time = 1687625470
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>> vpanic() at vpanic+0x13c
>> panic() at panic+0x44
>> data_abort() at data_abort+0x2fc
>> handle_el1h_sync() at handle_el1h_sync+0x14
>> --- exception, esr 0x9646
>> $x.1() at $x.1+0x50
>> vnet_register_sysinit() at vnet_register_sysinit+0x114
>> linker_load_module() at linker_load_module+0xae4
>> kern_kldload() at kern_kldload+0xfc
>> sys_kldload() at sys_kldload+0x60
>> do_el0_sync() at do_el0_sync+0x608
>> handle_el0_sync() at handle_el0_sync+0x44
>> --- exception, esr 0x5600
>> KDB: enter: panic
>> [ thread pid 70419 tid 101003 ]
>> Stopped at  kdb_enter+0x44: str xzr, [x19, #3200]
>> db> 
>> 
>> I'll see if a re-run is repeatable.
>> 
> 
> It repeats:
> 
> GEOM_STRIPE: Device stripe/stripe.VkbPk1 deactivated.
> GEOM_STRIPE: Disk md1 removed from stripe.VkbPk1.
> GEOM_STRIPE: Disk md0 removed from stripe.VkbPk1.
> GEOM_STRIPE: Device stripe.VkbPk1 destroyed.
> GEOM_NOP: Device md0.nop created.
> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
> GEOM_NOP: Device md0.nop removed.
> GEOM_NOP: Device md0.nop created.
> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
> GEOM_NOP: Device md0.nop removed.
> GEOM_NOP: Device md0.nop created.
> GEOM_NOP: Device md0.nop removed.
> Fatal data abort:
>  x0: a0003b1a9500
>  x1: 00021b530260
>  x2:   4b
>  x3: a343932b0b22fb30
>  x4:0
>  x5:  3310b0d062d0e1d
>  x6: 1d0e2d060d0b3103
>  x7:0
>  x8: ea325df8
>  x9: 00021d6946d0 ($d.6 + 0)
> x10: 00021b530260
> x11:0
> x12:0
> x13: 00cd8960 (lock_class_mtx_sleep + 0)
> x14:0
> x15: a0003b1a9505
> x16: 00021d694860 (_DYNAMIC + 160)
> x17: 0063a450 (ifc_attach_cloner + 0)
> x18: 00021a6ea400
> x19: 00021d694600 (vnet_epair_init_vnet_init + 0)
> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x23: a0431500
> x24: a0431500
> x25: 00ce0788 (linker_lookup_set_desc + 0)
> x26: a02e1ab6d180
> x27: 00021d694698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x29: 00021a6ea430
>  sp: 00021a6ea400
>  lr: 00021d682a4c ($x.1 + 3c)
> elr: 00021d682a60 ($x.1 + 50)
> spsr: 6045
> far: 0003079ba4c8
> esr: 9646
> panic: vm_fault failed: 00021d682a60 error 1
> cpuid = 1
> time = 1687628622
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trac

Re: Build failure for radlib.o during main-n263767-764464af4968 -> main-n263782-59833b089e78 src update

2023-06-24 Thread Ed Maste
> > > This could be a dependency issue; would you check if removing the
> > > following $OBJTOP subdirs addresses the issue:
> > >
> > > secure/lib/libcrypto
> > > secure/lib/libssl
> > > obj-lib32/secure/lib/libcrypto
> > > obj-lib32/secure/lib/libssl
> > >
> The build was successful; after the reboot, we see:
>
> g1-48(14.0-C)[1] uname -aUK
> FreeBSD g1-48.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #469 main-n263782-59833b089e78: Sat Jun 24 16:28:56 UTC 2023 r...@g1-48.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 1400092 1400092
>
>
> So: I believe we have a winner! :-)

Excellent, thanks for checking. I've opened review D40750[1] to have
this cleanup happen automatically.

[1] https://reviews.freebsd.org/D40750



Re: aarch64 main-n263493-4e8d558c9d1c-dirty (so: 2023-Jun-10) Kyuafile run: "Fatal data abort" crash during vnet_register_sysinit

2023-06-24 Thread Mark Millard
On Jun 24, 2023, at 10:00, Mark Millard  wrote:

> The running system build is a non-debug build (but
> with symbols not stripped).
> 
> The HoneyComb's console log shows:
> 
> . . .
> GEOM_STRIPE: Device stripe.IMfBZr destroyed.
> GEOM_NOP: Device md0.nop created.
> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
> GEOM_NOP: Device md0.nop removed.
> GEOM_NOP: Device md0.nop created.
> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
> GEOM_NOP: Device md0.nop removed.
> GEOM_NOP: Device md0.nop created.
> GEOM_NOP: Device md0.nop removed.
> Fatal data abort:
>  x0: a02506e64400
>  x1: 0001ea401880 (g_raid3_post_sync + 3a145f8)
>  x2:   4b
>  x3: a343932b0b22fb30
>  x4:0
>  x5:  3310b0d062d0e1d
>  x6: 1d0e2d060d0b3103
>  x7:0
>  x8: ea325df8
>  x9: 0001eec946d0 ($d.6 + 0)
> x10: 0001ea401880 (g_raid3_post_sync + 3a145f8)
> x11:0
> x12:0
> x13: 00cd8960 (lock_class_mtx_sleep + 0)
> x14:0
> x15: a02506e64405
> x16: 0001eec94860 (_DYNAMIC + 160)
> x17: 0063a450 (ifc_attach_cloner + 0)
> x18: 0001eb290400 (g_raid3_post_sync + 48a3178)
> x19: 0001eec94600 (vnet_epair_init_vnet_init + 0)
> x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
> x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x23: a042e500
> x24: a042e500
> x25: 00ce0788 (linker_lookup_set_desc + 0)
> x26: a0203cdef780
> x27: 0001eec94698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
> x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x29: 0001eb290430 (g_raid3_post_sync + 48a31a8)
>  sp: 0001eb290400
>  lr: 0001eec82a4c ($x.1 + 3c)
> elr: 0001eec82a60 ($x.1 + 50)
> spsr: 6045
> far: 0002d8fba4c8
> esr: 9646
> panic: vm_fault failed: 0001eec82a60 error 1
> cpuid = 14
> time = 1687625470
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x13c
> panic() at panic+0x44
> data_abort() at data_abort+0x2fc
> handle_el1h_sync() at handle_el1h_sync+0x14
> --- exception, esr 0x9646
> $x.1() at $x.1+0x50
> vnet_register_sysinit() at vnet_register_sysinit+0x114
> linker_load_module() at linker_load_module+0xae4
> kern_kldload() at kern_kldload+0xfc
> sys_kldload() at sys_kldload+0x60
> do_el0_sync() at do_el0_sync+0x608
> handle_el0_sync() at handle_el0_sync+0x44
> --- exception, esr 0x5600
> KDB: enter: panic
> [ thread pid 70419 tid 101003 ]
> Stopped at  kdb_enter+0x44: str xzr, [x19, #3200]
> db> 
> 
> I'll see if a re-run is repeatable.
> 

It repeats:

GEOM_STRIPE: Device stripe/stripe.VkbPk1 deactivated.
GEOM_STRIPE: Disk md1 removed from stripe.VkbPk1.
GEOM_STRIPE: Disk md0 removed from stripe.VkbPk1.
GEOM_STRIPE: Device stripe.VkbPk1 destroyed.
GEOM_NOP: Device md0.nop created.
g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
GEOM_NOP: Device md0.nop removed.
GEOM_NOP: Device md0.nop created.
g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
GEOM_NOP: Device md0.nop removed.
GEOM_NOP: Device md0.nop created.
GEOM_NOP: Device md0.nop removed.
Fatal data abort:
  x0: a0003b1a9500
  x1: 00021b530260
  x2:   4b
  x3: a343932b0b22fb30
  x4:0
  x5:  3310b0d062d0e1d
  x6: 1d0e2d060d0b3103
  x7:0
  x8: ea325df8
  x9: 00021d6946d0 ($d.6 + 0)
 x10: 00021b530260
 x11:0
 x12:0
 x13: 00cd8960 (lock_class_mtx_sleep + 0)
 x14:0
 x15: a0003b1a9505
 x16: 00021d694860 (_DYNAMIC + 160)
 x17: 0063a450 (ifc_attach_cloner + 0)
 x18: 00021a6ea400
 x19: 00021d694600 (vnet_epair_init_vnet_init + 0)
 x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
 x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
 x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
 x23: a0431500
 x24: a0431500
 x25: 00ce0788 (linker_lookup_set_desc + 0)
 x26: a02e1ab6d180
 x27: 00021d694698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
 x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
 x29: 00021a6ea430
  sp: 00021a6ea400
  lr: 00021d682a4c ($x.1 + 3c)
 elr: 00021d682a60 ($x.1 + 50)
spsr: 6045
 far: 0003079ba4c8
 esr: 9646
panic: vm_fault failed: 00021d682a60 error 1
cpuid = 1
time = 1687628622
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
data_abort() at data_abort+0x2fc
handle_el1h_sync() at handle_el1h_sync+0x14
--- exception, esr 0x9646
$x

aarch64 main-n263493-4e8d558c9d1c-dirty (so: 2023-Jun-10) Kyuafile run: "Fatal data abort" crash during vnet_register_sysinit

2023-06-24 Thread Mark Millard
The running system build is a non-debug build (but
with symbols not stripped).

The HoneyComb's console log shows:

. . .
GEOM_STRIPE: Device stripe.IMfBZr destroyed.
GEOM_NOP: Device md0.nop created.
g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
GEOM_NOP: Device md0.nop removed.
GEOM_NOP: Device md0.nop created.
g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
GEOM_NOP: Device md0.nop removed.
GEOM_NOP: Device md0.nop created.
GEOM_NOP: Device md0.nop removed.
Fatal data abort:
  x0: a02506e64400
  x1: 0001ea401880 (g_raid3_post_sync + 3a145f8)
  x2:   4b
  x3: a343932b0b22fb30
  x4:0
  x5:  3310b0d062d0e1d
  x6: 1d0e2d060d0b3103
  x7:0
  x8: ea325df8
  x9: 0001eec946d0 ($d.6 + 0)
 x10: 0001ea401880 (g_raid3_post_sync + 3a145f8)
 x11:0
 x12:0
 x13: 00cd8960 (lock_class_mtx_sleep + 0)
 x14:0
 x15: a02506e64405
 x16: 0001eec94860 (_DYNAMIC + 160)
 x17: 0063a450 (ifc_attach_cloner + 0)
 x18: 0001eb290400 (g_raid3_post_sync + 48a3178)
 x19: 0001eec94600 (vnet_epair_init_vnet_init + 0)
 x20: 00fa5b68 (vnet_sysinit_sxlock + 18)
 x21: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
 x22: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
 x23: a042e500
 x24: a042e500
 x25: 00ce0788 (linker_lookup_set_desc + 0)
 x26: a0203cdef780
 x27: 0001eec94698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
 x28: 00d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
 x29: 0001eb290430 (g_raid3_post_sync + 48a31a8)
  sp: 0001eb290400
  lr: 0001eec82a4c ($x.1 + 3c)
 elr: 0001eec82a60 ($x.1 + 50)
spsr: 6045
 far: 0002d8fba4c8
 esr: 9646
panic: vm_fault failed: 0001eec82a60 error 1
cpuid = 14
time = 1687625470
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
data_abort() at data_abort+0x2fc
handle_el1h_sync() at handle_el1h_sync+0x14
--- exception, esr 0x9646
$x.1() at $x.1+0x50
vnet_register_sysinit() at vnet_register_sysinit+0x114
linker_load_module() at linker_load_module+0xae4
kern_kldload() at kern_kldload+0xfc
sys_kldload() at sys_kldload+0x60
do_el0_sync() at do_el0_sync+0x608
handle_el0_sync() at handle_el0_sync+0x44
--- exception, esr 0x5600
KDB: enter: panic
[ thread pid 70419 tid 101003 ]
Stopped at  kdb_enter+0x44: str xzr, [x19, #3200]
db> 

I'll see if a re-run is repeatable.
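
For anyone reading the esr line in a dump like the one above, here is a
minimal sketch in POSIX sh that splits an AArch64 ESR_ELx value into a few
of its architectural fields: EC (bits 31:26), IL (bit 25) and, for data
aborts, WnR (bit 6) and DFSC (bits 5:0). The helper name decode_esr is
hypothetical, and the sample value 0x96000046 is an assumption: the log as
archived shows only "9646", so the sample is a guess at a full 32-bit
value, not something taken from the report.

```sh
#!/bin/sh
# decode_esr: hypothetical helper, not part of FreeBSD.
# Splits an AArch64 ESR_ELx value into a few fields.
decode_esr() {
    esr=$(( $1 ))
    ec=$(( (esr >> 26) & 0x3f ))    # exception class, bits 31:26
    il=$(( (esr >> 25) & 0x1 ))     # instruction length bit, bit 25
    wnr=$(( (esr >> 6) & 0x1 ))     # data aborts only: 1 = write, 0 = read
    dfsc=$(( esr & 0x3f ))          # data aborts only: fault status code
    printf 'ec=0x%02x il=%d wnr=%d dfsc=0x%02x\n' "$ec" "$il" "$wnr" "$dfsc"
}

# Assumed sample value; the archived log does not show the full register.
decode_esr 0x96000046
```

If the register really held that assumed value, EC 0x25 would mean a data
abort taken without a change in exception level, WnR=1 a write access, and
DFSC 0x06 a level 2 translation fault.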

===
Mark Millard
marklmi at yahoo.com




Re: Build failure for radlib.o during main-n263767-764464af4968 -> main-n263782-59833b089e78 src update

2023-06-24 Thread David Wolfskill
On Sat, Jun 24, 2023 at 09:09:00AM -0700, David Wolfskill wrote:
> On Sat, Jun 24, 2023 at 10:39:57AM -0400, Ed Maste wrote:
> > ...
> > > : "OPENSSL_API_COMPAT expresses an impossible API compatibility level"
> > > #  error "OPENSSL_API_COMPAT expresses an impossible API compatibility 
> > > level"
> > >^
> > 
> > This could be a dependency issue; would you check if removing the
> > following $OBJTOP subdirs addresses the issue:
> > 
> > secure/lib/libcrypto
> > secure/lib/libssl
> > obj-lib32/secure/lib/libcrypto
> > obj-lib32/secure/lib/libssl
> > 
> > If so I'll see if we can add a rule to tools/build/depend-cleanup.sh
> 
> After:
> ... 
> rm -fr /usr/obj/usr/src/amd64.amd64/{,obj-lib32/}secure/lib/lib{crypto,ssl}
> 
> then re-starting the "make buildworld", that process has completed the
> 
> >>> stage 4.2: building libraries
> 
> phase (and is now "building lib32 shim libraries").
> 

The build was successful; after the reboot, we see:

g1-48(14.0-C)[1] uname -aUK
FreeBSD g1-48.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #469 main-n263782-59833b089e78: Sat Jun 24 16:28:56 UTC 2023 r...@g1-48.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 1400092 1400092


So: I believe we have a winner! :-)

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
"Putin supports any set of ideas to end the conflict,” -- Dmitry Peskov
Putin is the source of the conflict.  Remove the source; end of conflict.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.



Re: Build failure for radlib.o during main-n263767-764464af4968 -> main-n263782-59833b089e78 src update

2023-06-24 Thread David Wolfskill
On Sat, Jun 24, 2023 at 10:39:57AM -0400, Ed Maste wrote:
> ...
> > : "OPENSSL_API_COMPAT expresses an impossible API compatibility level"
> > #  error "OPENSSL_API_COMPAT expresses an impossible API compatibility 
> > level"
> >^
> 
> This could be a dependency issue; would you check if removing the
> following $OBJTOP subdirs addresses the issue:
> 
> secure/lib/libcrypto
> secure/lib/libssl
> obj-lib32/secure/lib/libcrypto
> obj-lib32/secure/lib/libssl
> 
> If so I'll see if we can add a rule to tools/build/depend-cleanup.sh

After:

g1-48(14.0-C)[1] ls -lTd /usr/obj/usr/src/amd64.amd64/{,obj-lib32/}secure/lib/lib{crypto,ssl}
drwxrwxr-x  4 root  wheel   62464 Jun 23 06:08:19 2023 /usr/obj/usr/src/amd64.amd64/obj-lib32/secure/lib/libcrypto
drwxrwxr-x  2 root  wheel    5120 Jun 23 06:08:38 2023 /usr/obj/usr/src/amd64.amd64/obj-lib32/secure/lib/libssl
drwxrwxr-x  5 root  wheel  132608 Jun 24 04:11:55 2023 /usr/obj/usr/src/amd64.amd64/secure/lib/libcrypto
drwxrwxr-x  2 root  wheel    5632 Jun 24 04:11:56 2023 /usr/obj/usr/src/amd64.amd64/secure/lib/libssl
g1-48(14.0-C)[2] rm -fr !$
rm -fr /usr/obj/usr/src/amd64.amd64/{,obj-lib32/}secure/lib/lib{crypto,ssl}

then re-starting the "make buildworld", that process has completed the

>>> stage 4.2: building libraries

phase (and is now "building lib32 shim libraries").

So: definite progress.  (Build machine is busy with other stuff, so the
above was on a laptop; it will be a bit slow).

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
"Putin supports any set of ideas to end the conflict,” -- Dmitry Peskov
Putin is the source of the conflict.  Remove the source; end of conflict.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: OpenSSL 3.0 is in the tree

2023-06-24 Thread Dimitry Andric
On 24 Jun 2023, at 16:22, Ed Maste  wrote:
> 
> Last night I merged OpenSSL 3.0 to main. This, along with the update
> to Clang 16 and other recent changes may result in some challenges
> over the next few days or weeks for folks following -CURRENT, such as
> ports that need to be updated or unanticipated issues in the base
> system.
> 
> We need to get this work done so that we can continue moving on with
> FreeBSD 14; I apologize for the trouble it might cause in the short
> term. Please follow up to report any trouble you encounter.

Regarding affected ports, see also the llvm-16-update exp-run bug:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271047

and similarly, the openssl 3.0 exp-run bug:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271656

-Dimitry





Re: Build failure for radlib.o during main-n263767-764464af4968 -> main-n263782-59833b089e78 src update

2023-06-24 Thread Ed Maste
On Sat, 24 Jun 2023 at 07:11, David Wolfskill  wrote:
>
> Running:
> FreeBSD freebeast.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #405 
> main-n263767-764464af4968: Fri Jun 23 11:42:14 UTC 2023 
> r...@freebeast.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/GENERIC 
> amd64 1400091 1400091
>
> after updating sources to main-n263782-59833b089e78, then starting
> make -j 64 buildworld (in META mode)
>
> ...
> >>> stage 4.2: building libraries
> ...
> Building /common/S4/obj/usr/src/amd64.amd64/cddl/lib/libzfs/os/freebsd/nfs.o
> In file included from 
> /usr/src/sys/contrib/openzfs/lib/libzfs/libzfs_crypto.c:28
> :
> In file included from 
> /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/openssl
> /evp.h:14:
> /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/openssl/macros.h:155:4: 
> error
> : "OPENSSL_API_COMPAT expresses an impossible API compatibility level"
> #  error "OPENSSL_API_COMPAT expresses an impossible API compatibility level"
>^

This could be a dependency issue; would you check if removing the
following $OBJTOP subdirs addresses the issue:

secure/lib/libcrypto
secure/lib/libssl
obj-lib32/secure/lib/libcrypto
obj-lib32/secure/lib/libssl

If so I'll see if we can add a rule to tools/build/depend-cleanup.sh
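
A minimal sketch of that cleanup, assuming a POSIX sh and that the object
tree root is passed in by hand (for example /usr/obj/usr/src/amd64.amd64,
as used elsewhere in this thread). The function name clean_openssl_objdirs
is hypothetical and is not part of tools/build/depend-cleanup.sh; it only
removes the four subdirectories listed above and leaves everything else
in the object tree alone.

```sh
#!/bin/sh
# Hypothetical helper: remove stale libcrypto/libssl object directories
# (native and lib32) under the object tree root given as $1.
clean_openssl_objdirs() {
    objtop=$1
    # Refuse to run without an existing object tree root.
    if [ -z "$objtop" ] || [ ! -d "$objtop" ]; then
        echo "usage: clean_openssl_objdirs OBJTOP" >&2
        return 1
    fi
    for sub in \
        secure/lib/libcrypto \
        secure/lib/libssl \
        obj-lib32/secure/lib/libcrypto \
        obj-lib32/secure/lib/libssl
    do
        dir="$objtop/$sub"
        # Only touch directories that are actually present.
        if [ -d "$dir" ]; then
            echo "removing $dir"
            rm -rf -- "$dir"
        fi
    done
}
```

After running it against the object tree, "make buildworld" can be
restarted; only the affected libraries need to be rebuilt from scratch.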



OpenSSL 3.0 is in the tree

2023-06-24 Thread Ed Maste
Last night I merged OpenSSL 3.0 to main. This, along with the update
to Clang 16 and other recent changes may result in some challenges
over the next few days or weeks for folks following -CURRENT, such as
ports that need to be updated or unanticipated issues in the base
system.

We need to get this work done so that we can continue moving on with
FreeBSD 14; I apologize for the trouble it might cause in the short
term. Please follow up to report any trouble you encounter.



Build failure for radlib.o during main-n263767-764464af4968 -> main-n263782-59833b089e78 src update

2023-06-24 Thread David Wolfskill
Running:
FreeBSD freebeast.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #405 
main-n263767-764464af4968: Fri Jun 23 11:42:14 UTC 2023 
r...@freebeast.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/GENERIC 
amd64 1400091 1400091

after updating sources to main-n263782-59833b089e78, then starting
make -j 64 buildworld (in META mode)

...
>>> stage 4.2: building libraries
...
Building /common/S4/obj/usr/src/amd64.amd64/cddl/lib/libzfs/os/freebsd/nfs.o
In file included from /usr/src/sys/contrib/openzfs/lib/libzfs/libzfs_crypto.c:28:
In file included from /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/openssl/evp.h:14:
/common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/openssl/macros.h:155:4: error: "OPENSSL_API_COMPAT expresses an impossible API compatibility level"
#  error "OPENSSL_API_COMPAT expresses an impossible API compatibility level"
   ^
*** [radlib.o] Error code 1

make[4]: stopped in /usr/src/lib/libradius
.ERROR_TARGET='radlib.o'
.ERROR_META_FILE='/common/S4/obj/usr/src/amd64.amd64/lib/libradius/radlib.o.meta'
.MAKE.LEVEL='4'


The cited meta file pretty much re-states the above, but I have
attached a copy anyway.

Subsequently, one of my laptops has reproduced the failure (though with
only -j 16).

As of this writing, I see no more recent commits to head.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
"Putin supports any set of ideas to end the conflict,” -- Dmitry Peskov
Putin is the source of the conflict.  Remove the source; end of conflict.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.
# Meta data file /common/S4/obj/usr/src/amd64.amd64/lib/libradius/radlib.o.meta
CMD cc -target x86_64-unknown-freebsd14.0 
--sysroot=/common/S4/obj/usr/src/amd64.amd64/tmp 
-B/common/S4/obj/usr/src/amd64.amd64/tmp/usr/bin  -O2 -pipe -fno-common   -Wall 
-DOPENSSL_API_COMPAT=0x1010L -DWITH_SSL -g -gz=zlib -std=gnu99 
-Wno-format-zero-length -fstack-protector-strong -Wsystem-headers -Werror -Wall 
-Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes 
-Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign 
-Wdate-time -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable 
-Wno-error=unused-but-set-variable -Wno-error=unused-but-set-parameter 
-Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality 
-Wno-unused-function -Wno-enum-conversion -Wno-unused-local-typedef 
-Wno-address-of-packed-member  -Qunused-arguments -c 
/usr/src/lib/libradius/radlib.c -o radlib.o
CMD 
CWD /common/S4/obj/usr/src/amd64.amd64/lib/libradius
TARGET radlib.o
-- command output --
In file included from /usr/src/lib/libradius/radlib.c:38:
In file included from 
/common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/openssl/hmac.h:14:
/common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/openssl/macros.h:139:4: 
error: "The requested API level higher than the configured API compatibility 
level"
#  error "The requested API level higher than the configured API compatibility 
level"
   ^
1 error generated.

*** Error code 1

-- filemon acquired metadata --
# filemon version 5
# Target pid 20051
# Start 1687604049.922333
V 5
E 20054 /bin/sh
R 20054 /etc/libmap.conf
R 20054 /var/run/ld-elf.so.hints
R 20054 /lib/libedit.so.8
R 20054 /lib/libc.so.7
R 20054 /lib/libtinfow.so.9
R 20054 /usr/share/locale/en_US.UTF-8/LC_COLLATE
R 20054 /usr/share/locale/en_US.UTF-8/LC_CTYPE
R 20054 /usr/share/locale/en_US.UTF-8/LC_MONETARY
R 20054 /usr/share/locale/en_US.UTF-8/LC_NUMERIC
R 20054 /usr/share/locale/en_US.UTF-8/LC_TIME
R 20054 /usr/share/locale/en_US.UTF-8/LC_MESSAGES
F 20054 20056
E 20056 /usr/bin/cc
R 20056 /etc/libmap.conf
R 20056 /var/run/ld-elf.so.hints
R 20056 /lib/libz.so.6
R 20056 /usr/lib/libprivatezstd.so.5
R 20056 /usr/lib/libexecinfo.so.1
R 20056 /lib/libncursesw.so.9
R 20056 /lib/libtinfow.so.9
R 20056 /lib/libthr.so.3
R 20056 /lib/libc++.so.1
R 20056 /lib/libcxxrt.so.1
R 20056 /lib/libm.so.5
R 20056 /lib/libc.so.7
R 20056 /lib/libelf.so.2
R 20056 /lib/libgcc_s.so.1
R 20056 /usr/src/lib/libradius/radlib.c
R 20056 radlib-51f1ada4.o.tmp
W 20056 radlib-51f1ada4.o.tmp
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/sys/cdefs.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/sys/types.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/machine/endian.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/x86/endian.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/sys/_types.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/machine/_types.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/x86/_types.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/machine/_limits.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/x86/_limits.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/sys/_endian.h
R 20056 /common/S4/obj/usr/src/amd64.amd64/tmp/usr/include/sys/_pthreadtypes.h
R 20056 /common/S4/obj/usr/src/amd64

Re: Stuck on ERROR-tried-to-rebuild-during-make-install- How to move past?

2023-06-24 Thread parv/FreeBSD
On Thu, Jun 15, 2023 at 1:14 AM parv/FreeBSD  wrote:

>  Hi there,
>
> I am stuck in "installworld" step due to
> ERROR-tried-to-rebuild-during-make-install when going
> from 1cebc9298cf to b98fbf3781 on one host, or from
> 9fbeeb6e3831 to dfa1982352ee on another host.
>
> What should I be looking for?
>
>
> 1cebc9298cf-to-b98fbf3781:
> log:
> ...
> install  -d -m 0755 -o root  -g wheel  /boot
> install  -o root  -g wheel -m 444  loader.help.efi /boot/loader.help.efi
> cc -target x86_64-unknown-freebsd14.0
> --sysroot=/build/world/freebsd/src/amd64.amd64/tmp
> -B/build/world/freebsd/src/amd64.amd64/tmp/usr/bin  -O2 -pipe -fno-common
> -Wformat -fshort-wchar -mno-red-zone -nostdinc
>  -I/build/world/freebsd/src/amd64.amd64/stand/libsa
> -I/freebsd/src/stand/libsa -D_STANDALONE -I/freebsd/src/sys
> -Ddouble=jagged-little-pill -Dfloat=floaty-mcfloatface -ffunction-sections
> -fdata-sections -DLOADER_GELI_SUPPORT -I/freebsd/src/stand/libsa/geli
> -DLOADER_DISK_SUPPORT -ffreestanding -mno-mmx -mno-sse -mno-avx -mno-avx2
> -msoft-float -fPIC -mno-red-zone -mno-relax -I. -Iinclude
> -I/freebsd/src/stand/efi/loader_4th/../loader
> -I/freebsd/src/stand/libsa/zfs -I/freebsd/src/sys/contrib/openzfs/include
> -I/freebsd/src/sys/contrib/openzfs/include/os/freebsd/zfs -DEFI_ZFS_BOOT
> -fPIC -I/freebsd/src/stand/efi/loader_4th
> -I/freebsd/src/stand/efi/loader_4th/arch/amd64
> -I/freebsd/src/stand/efi/include -I/freebsd/src/stand/efi/include/amd64
> -I/freebsd/src/sys/contrib/dev/acpica/include
> -I/freebsd/src/stand/i386/libi386 -DEFI -I/freebsd/src/stand/common -fPIC
> -I/freebsd/src/stand/ficl -I/freebsd/src/stand/ficl/amd64
> -I/freebsd/src/stand/common -DBF_DICTSIZE=3 -DLOADER_MSDOS_SUPPORT
> -DLOADER_UFS_SUPPORT -DLOADER_NET_SUPPORT -DLOADER_GPT_SUPPORT
> -DLOADER_MBR_SUPPORT -DLOADER_ZFS_SUPPORT -I/freebsd/src/stand/libsa/zfs
> -I/freebsd/src/sys/cddl/boot/zfs
> -I/freebsd/src/sys/cddl/contrib/opensolaris/uts/common
> -DHELP_FILENAME=\"loader.help.efi\" -mretpoline
> -ftrivial-auto-var-init=zero
>  -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
> -g -gz=zlib  -std=gnu99 -Wno-format-zero-length -Wsystem-headers -Werror
> -Wno-pointer-sign -Wdate-time -Wno-empty-body -Wno-string-plus-int
> -Wno-unused-const-variable -Wno-error=unused-but-set-variable
> -Wno-error=array-parameter -Wno-error=deprecated-non-prototype
> -Wno-error=unused-but-set-parameter -Wno-tautological-compare
> -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function
> -Wno-enum-conversion -Wno-unused-local-typedef
> -Wno-address-of-packed-member -Wno-switch -Wno-switch-enum
> -Wno-knr-promoted-parameter -Wno-parentheses  -Oz -Qunused-arguments
> ERROR-tried-to-rebuild-during-make-install
> -c /freebsd/src/stand/efi/loader_4th/../loader/efiserialio.c -o
> efiserialio.o
> /tmp/install.PjBzUKhv0k/sh: cc: not found
> *** Error code 127
>
> Stop.
> make[6]: stopped in /freebsd/src/stand/efi/loader_4th
> ...
>
> /etc/{make,src}*conf: attached 1cebc9298cf-to-b98fbf3781.etc.list
>
...

Also asked at https://mastodon.social/@parvXm/110598357680164603
where I note that ...

ERROR-tried-to-rebuild-during-make-install

... was introduced circa 2017 per
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212877#c21 .