Re: Is it valid to combine CTLFLAG_TUN with CTLFLAG_VNET ?

2023-04-18 Thread Zhenlei Huang


> On Apr 6, 2023, at 3:56 AM, Hans Petter Selasky  wrote:
> 
> On 4/5/23 21:44, Hans Petter Selasky wrote:
>> On 4/5/23 20:23, Gleb Smirnoff wrote:
>>> What if we remove the CTLFLAG_VNET check from the code you posted above?
>>> I don't see anything going wrong, rather going right 
>>> 
>>> CTLFLAG_VNET will not mask away CTLFLAG_TUN.
>> Hi Gleb,
>> It's possible to bypass that check, but some work needs to be done first. 
>> Then all jails created, will also start from those sysctl tunable values.
>> The problem is, where does the VNET base pointer come from?
>> Especially those static sysctl's. You would need to make some design there I 
>> guess and look at the SYSINIT() order. When are SYSINIT's filled with 
>> tunable data's. And when is the default VNET created.
>> Because the data pointer passed to the register sysctl function is simply an 
>> offset pointer into a malloc'ed structure.
>> --HPS
> 
> Hi Zhenlei,
> 
> Feel free to work on this, and add me as a reviewer and complete phase two of:
> 
>> commit 3da1cf1e88f8448bb10c5f778ab56ff65c7a6938
>> Author: Hans Petter Selasky 
>> Date:   Fri Jun 27 16:33:43 2014 +
>>Extend the meaning of the CTLFLAG_TUN flag to automatically check if
>>there is an environment variable which shall initialize the SYSCTL
>>during early boot. This works for all SYSCTL types both statically and
>>dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
>>which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
> 
> --HPS

Posted to https://reviews.freebsd.org/D39638 
 

CC freebsd-current if some people are interested in the fix.

Best regards,
Zhenlei



The still-pending openzfs updates status upstream, as far as I can tell

2023-04-18 Thread Mark Millard
Some existing changes in FreeBSD need not be appropriate to
openzfs but it is more likely that many of the same files
would eventually have some sort of change in openzfs's main
dealing with what FreeBSD has run into from the import.

Looking around, the following are the FreeBSD adjustments
in place in files that did not seem to be adjusted yet in
openzfs's main. A few are just tracking changes in FreeBSD's
main that are not tied to the import. I added a few notes in
[]'s. The 3 [TEMPORARY]'s may well never have related openzfs
source code, for example.

The other changes did seem to have code updates intended to
deal with problems associated with the import. No claim to
know how well the changes work when they do not match what
my prior testing dealt with.


include/os/freebsd/spl/sys/simd_arm.h (not updated in openzfs master):
d6e24901349d zfs: disable kernel fpu usage on arm and aarc64 Mateusz Guzik [ARM PART]

include/os/freebsd/zfs/sys/zfs_context_os.h (not updated in openzfs master):
8e9db62e7423 zfs: Appease set by unused warnings for spl_fstrans_*mark stubs. John Baldwin [FREEBSD TRACKING]

include/os/freebsd/zfs/sys/zfs_vfsops_os.h (not updated in openzfs master):
068913e4ba3d zfs: Add vfs.zfs.bclone_enabled sysctl. Pawel Jakub Dawidek [TEMPORARY]

module/os/freebsd/zfs/zfs_ctldir.c (not updated in openzfs master):
e2d997d1cbb9 zfs: add missing vop_fplookup_vexec assignments Mateusz Guzik

module/os/freebsd/zfs/zfs_vfsops.c (not updated in openzfs master):
068913e4ba3d zfs: Add vfs.zfs.bclone_enabled sysctl. Pawel Jakub Dawidek [TEMPORARY]

module/os/freebsd/zfs/zfs_vnops_os.c (not updated in openzfs master):
eb1feadc201a zfs: fix null ap->a_fsizetd NULL pointer derefernce Martin Matuska
d012836fb616 zfs: fix up EXDEV handling for clone_range Mateusz Guzik
20be1b4fc4b7 zfs: try to fallback early if can't do optimized copy Mateusz Guzik [OPTIMIZATION?]
182b21d46276 openzfs: adopt to the new vn_lock_pair() interface Konstantin Belousov [FREEBSD TRACKING]
46ac8f2e7d96 zfs: don't use zfs_freebsd_copy_file_range Mateusz Guzik
068913e4ba3d zfs: Add vfs.zfs.bclone_enabled sysctl. Pawel Jakub Dawidek [TEMPORARY]

Warner's stand (loader) work caused by the import might be
associated with more openzfs changes at some point, allowing
for easily/directly avoiding use of "extra register sets" in
the various boot loaders.

===
Mark Millard
marklmi at yahoo.com




Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75

2023-04-18 Thread Mark Millard
On Apr 18, 2023, at 15:02, José Pérez  wrote:

> On 2023-04-18 21:37, Mark Millard wrote:
>>> In this case it does because the value is "active". If it's "enabled"
>>> you do not need to do anything.
>> Well, if block_cloning is disabled it would not become active.
> [...]
>> So, in progressing past the vintage that corrupt zfs data,
>> one could end up with block_cloning enabled in the process.
> 
> You still have to willingly issue the command
> zpool upgrade 
> so you might not just end up with the feature enabled by running this
> or that kernel, that's why I suggested step 0: verify if you are the
> worst case scenario before you begin.

I was not really worried about the no-zpool-upgrade/disabled
case. I was worried about "enabled" vs "active" as the
transition enabled -> active is automatic based on activity.

But there is overall disabled vs. enabled vs. active for the
block_cloning feature so I mentioned all 3.


>>> Boot in single user mode and check if your pool has block cloning in
>>> use:
>>> # zpool get feature@block_cloning zroot
>>> NAME PROPERTY VALUE SOURCE
>>> zroot feature@block_cloning active local
>>> In this case it does because the value is "active". If it's "enabled"
>>> you do not need to do anything.
> 
> If you did not upgrade the pool, the feature would just not be there and
> the pool is sane (*).

"Not being there" vs. "disabled" depends on context. I worded
things based on what my context shows.

My example context has:

# zpool get all zroot | grep compat
zroot  compatibility  openzfs-2.1-freebsd  local

which explains the particular list of disabled features
reported below. (It is a "never had zpool upgrade" context
as well.)

# zpool get all zroot | grep disabled
zroot  feature@edonr          disabled   local
zroot  feature@zilsaxattr     disabled   local
zroot  feature@head_errlog    disabled   local
zroot  feature@blake3         disabled   local
zroot  feature@block_cloning  disabled   local

so "not be there" seems to mean "disabled" as zpool presents
things based on compatibility. Just to see the command you
listed fully but in my type of context:

# zpool get feature@block_cloning zroot
NAME   PROPERTY   VALUE  SOURCE
zroot  feature@block_cloning  disabled   local

# zpool version
zfs-2.1.99-FreeBSD_g431083f75
zfs-kmod-2.1.99-FreeBSD_g431083f75

(Those are software versions, not properties of
specific pools.)

I'll note that I see:

# zpool get feature@JUNKNAME zroot
# 

So features that the software does not have in its
list of possibilities get an empty result.

> unaffected_machine# zpool get feature@block_cloning zroot
> unaffected_machine#

That is the same sort of output as in my feature@JUNKNAME
test above. It is not clear from what is presented that
the context had block_cloning in its list of possibilities.

In my normal environment (that still predates the import
of the openzfs update), I get the same sort of result
for feature@block_cloning as you show above.

> As said, if the feature has been enabled but no calls to
> copy_file_range() occurred, the pool is also sane.

That holds at the time, but more activity can change the status
because copy_file_range() could be called. So I
expect that the following step is relevant to avoid
ending up with block_cloning becoming active:

QUOTE
When in single user mode set compression property to "off" on any zfs 
active dataset that has compression other than "off" and the sync 
property to something other than "disabled".
END QUOTE

> To summarize:
> no feature -> sane
> feature "enabled" -> sane
> feature "active" -> might not be sane
> 
> BR,
> 
> (*) as per this bug.


===
Mark Millard
marklmi at yahoo.com




Re: The import of openzfs vs. armv7: boot crashs

2023-04-18 Thread Warner Losh
On Tue, Apr 18, 2023 at 5:04 PM Mark Millard  wrote:

> On Apr 18, 2023, at 15:46, Warner Losh  wrote:
>
> > Fun...
> >
> > I'm also fighting aarch64 issues...
>
> Of what kind? I've been able to use things as committed
> in FreeBSD (block_cloning never having been enabled but
> jumping from before the import to, effectively, after
> the FreeBSD adjustments). But I have not tried anything
> that is different as committed in openzfs.
>
> (I'm one of those that tested poudriere bulk activity
> via separate media from my normal aarch64 context. Those
> tests had no problems once the full set up adjustments
> was present in my context.)
>

All boot loader issues for special environments. Not in the kernel,
so maybe unrelated and I shouldn't have said anything.

I'm guessing that upstream needs a generic way to disable all
acceleration, even if that has bad performance. I'm looking for a
good way to do that (once I get done with the bugs I was fixing
when I noticed this issue).

Warner


> > Warner
> >
> > On Tue, Apr 18, 2023, 4:45 PM Mark Millard  wrote:
> >
> https://github.com/openzfs/zfs/commit/d0cbd9feaf5b82130f2e679256c71e0c7413aae9
> >
> > does not seem to cover armv7, just aarch64. (FreeBSD disabled
> > floating point for both armv7 and aarch64 but that is a
> > different change than above.)
>
> I probably should have explicitly noted that the fpu disabling
> was from after the snapshot being tested here.
>
> The point of the snapshot test (the most recent available) was
> to find out if armv7 crashed before the fpu-use disabling commit.
>
> > I used:
> >
> >
> FreeBSD-14.0-CURRENT-arm-armv7-GENERICSD-20230406-f21faa67ab6b-262010.img.xz
>
> That is from after the import and after:
>
> • git: eb1feadc201a - main - zfs: fix null ap->a_fsizetd NULL pointer
> derefernce Martin Matuska
>
> but with no other zfs changes. It does not contain:
>
> • git: d6e24901349d - main - zfs: disable kernel fpu usage on arm and
> aarc64 Mateusz Guzik
>
> (But the openzfs changes are different.)
>
> > booted an RPi2B v1.1 and tried (note the KSTACK_PAGES notice and the
> > "undefined floating point instruction" notice):
> >
> > # zpool import
> > ZFS NOTICE: KSTACK_PAGES is 2 which could result in stack overflow panic!
> > Please consider adding 'options KSTACK_PAGES=4' to your kernel config
> > panic: undefined floating point instruction in supervisor mode
> > cpuid = 2
> > time = 1680784610
> > KDB: stack backtrace:
> > db_trace_self() at db_trace_self
> >  pc = 0xc05eb154  lr = 0xc007a688 (db_trace_self_wrapper+0x30)
> >  sp = 0xdd25c480  fp = 0xdd25c598
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> >  pc = 0xc007a688  lr = 0xc02eb1b4 (vpanic+0x140)
> >  sp = 0xdd25c5a0  fp = 0xdd25c5c0
> >  r4 = 0x0100  r5 = 0x
> >  r6 = 0xc0736bfc  r7 = 0xc0b1aea8
> > vpanic() at vpanic+0x140
> >  pc = 0xc02eb1b4  lr = 0xc02eaf94 (doadump)
> >  sp = 0xdd25c5c8  fp = 0xdd25c5cc
> >  r4 = 0xc0b92210  r5 = 0x
> >  r6 = 0xc0610ca0  r7 = 0xf4210a0d
> >  r8 = 0xddf32e4c  r9 = 0x0013
> > r10 = 0xdd25c6c0
> > doadump() at doadump
> >  pc = 0xc02eaf94  lr = 0xc0610eb0 (vfp_new_thread)
> >  sp = 0xdd25c5d4  fp = 0xdd25c638
> >  r4 = 0xdd25c6c0  r5 = 0xdd25c5cc
> >  r6 = 0xc02eaf94 r10 = 0xdd25c5d4
> > vfp_new_thread() at vfp_new_thread
> >  pc = 0xc0610eb0  lr = 0xc060ff84 (undefinedinstruction+0x178)
> >  sp = 0xdd25c640  fp = 0xdd25c6b8
> > undefinedinstruction() at undefinedinstruction+0x178
> >  pc = 0xc060ff84  lr = 0xc05edaa8 (exception_exit)
> >  sp = 0xdd25c6c0  fp = 0xdd25c750
> >  r4 = 0x2013  r5 = 0xde45e000
> >  r6 = 0xdd25c890  r7 = 0xdd25c8b0
> >  r8 = 0x  r9 = 0x
> > r10 = 0xdd25c8c0
> > exception_exit() at exception_exit
> >  pc = 0xc05edaa8  lr = 0xddf31f20 (K256)
> >  sp = 0xdd25c750  fp = 0xdd25c750
> >  r0 = 0xdd25c890  r1 = 0xde45e000
> >  r2 = 0xde45e400  r3 = 0xddf309fc
> >  r4 = 0x0400  r5 = 0xde45e000
> >  r6 = 0xdd25c890  r7 = 0xdd25c8b0
> >  r8 = 0x  r9 = 0x
> > r10 = 0xdd25c8c0 r12 = 0xdd25c7a0
> > zfs_sha256_block_neon() at zfs_sha256_block_neon+0x1c
> >  pc = 0xddf32e4c  lr = 0xc0946e8c (pcpup)
> >  sp = 0xdd25c758  fp = 0xc0b0aeec
> >  r4 = 0xc0919610  r5 = 0xc0919630
> >  r6 = 0xc0919618  r7 = 0x642ebce2
> >  r8 = 0xc0b1b0ec  r9 = 0xc0915e88
> > r10 = 0xc0b1b0dc
> > Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> > trapframe: 0xdd25c330
> > FSR=0005, FAR=95e29398, spsr=20d3
> > r0 =dd25c424, r1 =8100, r2 =95e29395, r3 =
> > r4 =c08ae93c, r5 =4aa0, r6 =4aa0, r7 =c08d3e3c
> > r8 =0001, r9 =c079567a, r10=000b, r11=dd25c3e0
> > r12=, ssp=dd25c3c4, slr=0001, pc =c0610308

Re: The import of openzfs vs. armv7: boot crashs

2023-04-18 Thread Mark Millard
On Apr 18, 2023, at 15:46, Warner Losh  wrote:

> Fun...
> 
> I'm also fighting aarch64 issues...

Of what kind? I've been able to use things as committed
in FreeBSD (block_cloning never having been enabled but
jumping from before the import to, effectively, after
the FreeBSD adjustments). But I have not tried anything
that is different as committed in openzfs.

(I'm one of those that tested poudriere bulk activity
via separate media from my normal aarch64 context. Those
tests had no problems once the full set of adjustments
was present in my context.)

> Warner
> 
> On Tue, Apr 18, 2023, 4:45 PM Mark Millard  wrote:
> https://github.com/openzfs/zfs/commit/d0cbd9feaf5b82130f2e679256c71e0c7413aae9
> 
> does not seem to cover armv7, just aarch64. (FreeBSD disabled
> floating point for both armv7 and aarch64 but that is a
> different change than above.)

I probably should have explicitly noted that the fpu disabling
was from after the snapshot being tested here.

The point of the snapshot test (the most recent available) was
to find out if armv7 crashed before the fpu-use disabling commit.

> I used:
> 
> FreeBSD-14.0-CURRENT-arm-armv7-GENERICSD-20230406-f21faa67ab6b-262010.img.xz

That is from after the import and after:

• git: eb1feadc201a - main - zfs: fix null ap->a_fsizetd NULL pointer 
derefernce Martin Matuska

but with no other zfs changes. It does not contain:

• git: d6e24901349d - main - zfs: disable kernel fpu usage on arm and 
aarc64 Mateusz Guzik

(But the openzfs changes are different.)

> booted an RPi2B v1.1 and tried (note the KSTACK_PAGES notice and the
> "undefined floating point instruction" notice):
> 
> # zpool import
> ZFS NOTICE: KSTACK_PAGES is 2 which could result in stack overflow panic!
> Please consider adding 'options KSTACK_PAGES=4' to your kernel config
> panic: undefined floating point instruction in supervisor mode
> cpuid = 2
> time = 1680784610
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
>  pc = 0xc05eb154  lr = 0xc007a688 (db_trace_self_wrapper+0x30)
>  sp = 0xdd25c480  fp = 0xdd25c598
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>  pc = 0xc007a688  lr = 0xc02eb1b4 (vpanic+0x140)
>  sp = 0xdd25c5a0  fp = 0xdd25c5c0
>  r4 = 0x0100  r5 = 0x
>  r6 = 0xc0736bfc  r7 = 0xc0b1aea8
> vpanic() at vpanic+0x140
>  pc = 0xc02eb1b4  lr = 0xc02eaf94 (doadump)
>  sp = 0xdd25c5c8  fp = 0xdd25c5cc
>  r4 = 0xc0b92210  r5 = 0x
>  r6 = 0xc0610ca0  r7 = 0xf4210a0d
>  r8 = 0xddf32e4c  r9 = 0x0013
> r10 = 0xdd25c6c0
> doadump() at doadump
>  pc = 0xc02eaf94  lr = 0xc0610eb0 (vfp_new_thread)
>  sp = 0xdd25c5d4  fp = 0xdd25c638
>  r4 = 0xdd25c6c0  r5 = 0xdd25c5cc
>  r6 = 0xc02eaf94 r10 = 0xdd25c5d4
> vfp_new_thread() at vfp_new_thread
>  pc = 0xc0610eb0  lr = 0xc060ff84 (undefinedinstruction+0x178)
>  sp = 0xdd25c640  fp = 0xdd25c6b8
> undefinedinstruction() at undefinedinstruction+0x178
>  pc = 0xc060ff84  lr = 0xc05edaa8 (exception_exit)
>  sp = 0xdd25c6c0  fp = 0xdd25c750
>  r4 = 0x2013  r5 = 0xde45e000
>  r6 = 0xdd25c890  r7 = 0xdd25c8b0
>  r8 = 0x  r9 = 0x
> r10 = 0xdd25c8c0
> exception_exit() at exception_exit
>  pc = 0xc05edaa8  lr = 0xddf31f20 (K256)
>  sp = 0xdd25c750  fp = 0xdd25c750
>  r0 = 0xdd25c890  r1 = 0xde45e000
>  r2 = 0xde45e400  r3 = 0xddf309fc
>  r4 = 0x0400  r5 = 0xde45e000
>  r6 = 0xdd25c890  r7 = 0xdd25c8b0
>  r8 = 0x  r9 = 0x
> r10 = 0xdd25c8c0 r12 = 0xdd25c7a0
> zfs_sha256_block_neon() at zfs_sha256_block_neon+0x1c
>  pc = 0xddf32e4c  lr = 0xc0946e8c (pcpup)
>  sp = 0xdd25c758  fp = 0xc0b0aeec
>  r4 = 0xc0919610  r5 = 0xc0919630
>  r6 = 0xc0919618  r7 = 0x642ebce2
>  r8 = 0xc0b1b0ec  r9 = 0xc0915e88
> r10 = 0xc0b1b0dc
> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> trapframe: 0xdd25c330
> FSR=0005, FAR=95e29398, spsr=20d3
> r0 =dd25c424, r1 =8100, r2 =95e29395, r3 =
> r4 =c08ae93c, r5 =4aa0, r6 =4aa0, r7 =c08d3e3c
> r8 =0001, r9 =c079567a, r10=000b, r11=dd25c3e0
> r12=, ssp=dd25c3c4, slr=0001, pc =c0610308
> 
> panic: Fatal abort
> . . . (repeats over and over) . . .
> 




===
Mark Millard
marklmi at yahoo.com




Re: The import of openzfs vs. armv7: boot crashs

2023-04-18 Thread Warner Losh
Fun...

I'm also fighting aarch64 issues...

Warner

On Tue, Apr 18, 2023, 4:45 PM Mark Millard  wrote:

>
> https://github.com/openzfs/zfs/commit/d0cbd9feaf5b82130f2e679256c71e0c7413aae9
>
> does not seem to cover armv7, just aarch64. (FreeBSD disabled
> floating point for both armv7 and aarch64 but that is a
> different change than above.)
>
> I used:
>
>
> FreeBSD-14.0-CURRENT-arm-armv7-GENERICSD-20230406-f21faa67ab6b-262010.img.xz
>
> booted an RPi2B v1.1 and tried (note the KSTACK_PAGES notice and the
> "undefined floating point instruction" notice):
>
> # zpool import
> ZFS NOTICE: KSTACK_PAGES is 2 which could result in stack overflow panic!
> Please consider adding 'options KSTACK_PAGES=4' to your kernel config
> panic: undefined floating point instruction in supervisor mode
> cpuid = 2
> time = 1680784610
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
>  pc = 0xc05eb154  lr = 0xc007a688 (db_trace_self_wrapper+0x30)
>  sp = 0xdd25c480  fp = 0xdd25c598
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>  pc = 0xc007a688  lr = 0xc02eb1b4 (vpanic+0x140)
>  sp = 0xdd25c5a0  fp = 0xdd25c5c0
>  r4 = 0x0100  r5 = 0x
>  r6 = 0xc0736bfc  r7 = 0xc0b1aea8
> vpanic() at vpanic+0x140
>  pc = 0xc02eb1b4  lr = 0xc02eaf94 (doadump)
>  sp = 0xdd25c5c8  fp = 0xdd25c5cc
>  r4 = 0xc0b92210  r5 = 0x
>  r6 = 0xc0610ca0  r7 = 0xf4210a0d
>  r8 = 0xddf32e4c  r9 = 0x0013
> r10 = 0xdd25c6c0
> doadump() at doadump
>  pc = 0xc02eaf94  lr = 0xc0610eb0 (vfp_new_thread)
>  sp = 0xdd25c5d4  fp = 0xdd25c638
>  r4 = 0xdd25c6c0  r5 = 0xdd25c5cc
>  r6 = 0xc02eaf94 r10 = 0xdd25c5d4
> vfp_new_thread() at vfp_new_thread
>  pc = 0xc0610eb0  lr = 0xc060ff84 (undefinedinstruction+0x178)
>  sp = 0xdd25c640  fp = 0xdd25c6b8
> undefinedinstruction() at undefinedinstruction+0x178
>  pc = 0xc060ff84  lr = 0xc05edaa8 (exception_exit)
>  sp = 0xdd25c6c0  fp = 0xdd25c750
>  r4 = 0x2013  r5 = 0xde45e000
>  r6 = 0xdd25c890  r7 = 0xdd25c8b0
>  r8 = 0x  r9 = 0x
> r10 = 0xdd25c8c0
> exception_exit() at exception_exit
>  pc = 0xc05edaa8  lr = 0xddf31f20 (K256)
>  sp = 0xdd25c750  fp = 0xdd25c750
>  r0 = 0xdd25c890  r1 = 0xde45e000
>  r2 = 0xde45e400  r3 = 0xddf309fc
>  r4 = 0x0400  r5 = 0xde45e000
>  r6 = 0xdd25c890  r7 = 0xdd25c8b0
>  r8 = 0x  r9 = 0x
> r10 = 0xdd25c8c0 r12 = 0xdd25c7a0
> zfs_sha256_block_neon() at zfs_sha256_block_neon+0x1c
>  pc = 0xddf32e4c  lr = 0xc0946e8c (pcpup)
>  sp = 0xdd25c758  fp = 0xc0b0aeec
>  r4 = 0xc0919610  r5 = 0xc0919630
>  r6 = 0xc0919618  r7 = 0x642ebce2
>  r8 = 0xc0b1b0ec  r9 = 0xc0915e88
> r10 = 0xc0b1b0dc
> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> trapframe: 0xdd25c330
> FSR=0005, FAR=95e29398, spsr=20d3
> r0 =dd25c424, r1 =8100, r2 =95e29395, r3 =
> r4 =c08ae93c, r5 =4aa0, r6 =4aa0, r7 =c08d3e3c
> r8 =0001, r9 =c079567a, r10=000b, r11=dd25c3e0
> r12=, ssp=dd25c3c4, slr=0001, pc =c0610308
>
> panic: Fatal abort
> . . . (repeats over and over) . . .
>
> ===
> Mark Millard
> marklmi at yahoo.com
>
>
>


The import of openzfs vs. armv7: boot crashs

2023-04-18 Thread Mark Millard
https://github.com/openzfs/zfs/commit/d0cbd9feaf5b82130f2e679256c71e0c7413aae9

does not seem to cover armv7, just aarch64. (FreeBSD disabled
floating point for both armv7 and aarch64 but that is a
different change than above.)

I used:

FreeBSD-14.0-CURRENT-arm-armv7-GENERICSD-20230406-f21faa67ab6b-262010.img.xz

booted an RPi2B v1.1 and tried (note the KSTACK_PAGES notice and the
"undefined floating point instruction" notice):

# zpool import
ZFS NOTICE: KSTACK_PAGES is 2 which could result in stack overflow panic!
Please consider adding 'options KSTACK_PAGES=4' to your kernel config
panic: undefined floating point instruction in supervisor mode
cpuid = 2
time = 1680784610
KDB: stack backtrace:
db_trace_self() at db_trace_self
 pc = 0xc05eb154  lr = 0xc007a688 (db_trace_self_wrapper+0x30)
 sp = 0xdd25c480  fp = 0xdd25c598
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
 pc = 0xc007a688  lr = 0xc02eb1b4 (vpanic+0x140)
 sp = 0xdd25c5a0  fp = 0xdd25c5c0
 r4 = 0x0100  r5 = 0x
 r6 = 0xc0736bfc  r7 = 0xc0b1aea8
vpanic() at vpanic+0x140
 pc = 0xc02eb1b4  lr = 0xc02eaf94 (doadump)
 sp = 0xdd25c5c8  fp = 0xdd25c5cc
 r4 = 0xc0b92210  r5 = 0x
 r6 = 0xc0610ca0  r7 = 0xf4210a0d
 r8 = 0xddf32e4c  r9 = 0x0013
r10 = 0xdd25c6c0
doadump() at doadump
 pc = 0xc02eaf94  lr = 0xc0610eb0 (vfp_new_thread)
 sp = 0xdd25c5d4  fp = 0xdd25c638
 r4 = 0xdd25c6c0  r5 = 0xdd25c5cc
 r6 = 0xc02eaf94 r10 = 0xdd25c5d4
vfp_new_thread() at vfp_new_thread
 pc = 0xc0610eb0  lr = 0xc060ff84 (undefinedinstruction+0x178)
 sp = 0xdd25c640  fp = 0xdd25c6b8
undefinedinstruction() at undefinedinstruction+0x178
 pc = 0xc060ff84  lr = 0xc05edaa8 (exception_exit)
 sp = 0xdd25c6c0  fp = 0xdd25c750
 r4 = 0x2013  r5 = 0xde45e000
 r6 = 0xdd25c890  r7 = 0xdd25c8b0
 r8 = 0x  r9 = 0x
r10 = 0xdd25c8c0
exception_exit() at exception_exit
 pc = 0xc05edaa8  lr = 0xddf31f20 (K256)
 sp = 0xdd25c750  fp = 0xdd25c750
 r0 = 0xdd25c890  r1 = 0xde45e000
 r2 = 0xde45e400  r3 = 0xddf309fc
 r4 = 0x0400  r5 = 0xde45e000
 r6 = 0xdd25c890  r7 = 0xdd25c8b0
 r8 = 0x  r9 = 0x
r10 = 0xdd25c8c0 r12 = 0xdd25c7a0
zfs_sha256_block_neon() at zfs_sha256_block_neon+0x1c
 pc = 0xddf32e4c  lr = 0xc0946e8c (pcpup)
 sp = 0xdd25c758  fp = 0xc0b0aeec
 r4 = 0xc0919610  r5 = 0xc0919630
 r6 = 0xc0919618  r7 = 0x642ebce2
 r8 = 0xc0b1b0ec  r9 = 0xc0915e88
r10 = 0xc0b1b0dc
Fatal kernel mode data abort: 'Translation Fault (L1)' on read
trapframe: 0xdd25c330
FSR=0005, FAR=95e29398, spsr=20d3
r0 =dd25c424, r1 =8100, r2 =95e29395, r3 =
r4 =c08ae93c, r5 =4aa0, r6 =4aa0, r7 =c08d3e3c
r8 =0001, r9 =c079567a, r10=000b, r11=dd25c3e0
r12=, ssp=dd25c3c4, slr=0001, pc =c0610308

panic: Fatal abort
. . . (repeats over and over) . . .

===
Mark Millard
marklmi at yahoo.com




Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75

2023-04-18 Thread José Pérez

On 2023-04-18 21:37, Mark Millard wrote:
>> In this case it does because the value is "active". If it's "enabled"
>> you do not need to do anything.
> Well, if block_cloning is disabled it would not become active.
[...]
> So, in progressing past the vintage that corrupt zfs data,
> one could end up with block_cloning enabled in the process.

You still have to willingly issue the command
zpool upgrade 
so you might not just end up with the feature enabled by running this
or that kernel, that's why I suggested step 0: verify if you are the
worst case scenario before you begin.

>> Boot in single user mode and check if your pool has block cloning in
>> use:
>> # zpool get feature@block_cloning zroot
>> NAME PROPERTY VALUE SOURCE
>> zroot feature@block_cloning active local
>> In this case it does because the value is "active". If it's "enabled"
>> you do not need to do anything.


If you did not upgrade the pool, the feature would just not be there and
the pool is sane (*).

unaffected_machine# zpool get feature@block_cloning zroot
unaffected_machine#

As said, if the feature has been enabled but no calls to
copy_file_range() occurred, the pool is also sane.

To summarize:
no feature -> sane
feature "enabled" -> sane
feature "active" -> might not be sane

BR,

(*) as per this bug.
--
José Pérez



Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75

2023-04-18 Thread Mark Millard
José Pérez wrote on
Date: Tue, 18 Apr 2023 16:59:03 UTC :

> On 2023-04-17 21:59, Pawel Jakub Dawidek wrote:
> > José,
> > 
> > I can only speak of block cloning in details, but I'll try to address
> > everything.
> > 
> > The easiest way to avoid block_cloning-related corruption on the
> > kernel after the last OpenZFS merge, but before e0bb199925 is to set
> > the compress property to 'off' and the sync property to something
> > other than 'disabled'. This will avoid the block_cloning-related
> > corruption and zil_replaying() panic.
> > 
> > As for the other corruption, unfortunately I don't know the details,
> > but my understanding is that it is happening under higher load. Not
> > sure I'd trust a kernel built on a machine with this bug present. What
> > I would do is to compile the kernel as of 068913e4ba somewhere else,
> > boot the problematic machine in single-user mode and install the newly
> > built kernel.
> > 
> > As far as I can tell, contrary to some initial reports, none of the
> > problems introduced by the recent OpenZFS merge corrupt the pool
> > metadata, only file's data. You can locate the files modified with the
> > bogus kernel using find(1) with a proper modification time, but you
> > have to decide what to do with them (either throw them away, restore
> > them from backup or inspect them).
> 
> Sharing my experience on how to get out of the worst case scenario with 
> a building machine that is affected by the bug.
> 
> CAVEAT: this is my experience, take it at your own risk. It worked for 
> me, there is no guarantee that it will work for you. You may create 
> corrupted files and make your system harder to recover or definitely 
> brick it. Don't blame me, you have been warned. YMMV.
> 
> Boot in single user mode and check if your pool has block cloning in 
> use:
> # zpool get feature@block_cloning zroot
> NAME PROPERTY VALUE SOURCE
> zroot feature@block_cloning active local
> 
> In this case it does because the value is "active". If it's "enabled" 
> you do not need to do anything.

Well, if block_cloning is disabled it would not become active.

But, if it is enabled, it can automatically become active by
creating a first entry in the involved Block Reference Table
during any activity that meets the criteria for such. If the FreeBSD
vintage in place is one that corrupts zfs data for any reason,
one would still want to progress to a vintage that does not
corrupt zfs data, even if block_cloning is enabled but not
active just before starting such an update sequence.

So, in progressing past the vintage that corrupts zfs data,
one could end up with block_cloning enabled in the process.
At least, that is my understanding of the issue.

Maybe only a subset of the "causes data corruption" range of
vintages would have to worry about block_cloning becoming active
during the effort to get past all the sources of corruption.
(If so, I've no clue what range that would be.)

I expect that the "you do not need to do anything" for
block_cloning being "enabled" instead of "active" may be too
strong of a claim, depending on the specific starting-vintage
inside the range with zfs data corruption problems.

(From what I've read, when the last Block Reference Table
entry is removed for any reason, the matching block_cloning
changes back from being indicated as active to being indicated
as enabled.)

> 1) When in single user mode set compression property to "off" on any zfs 
> active dataset that has compression other than "off" and the sync 
> property to something other than "disabled".
> 2) Boot multiuser and update your current sources, e.g.
> git pull --rebase
> 3) Build and install a new kernel without too much pressure (e.g. with 
> -j 1):
> make -j 1 kernel
> 4) Reboot with the new kernel
> 5) Now you have to reinstall the kernel with
> make installkernel
> This is because the new kernel files were written by the old kernel 
> and need to be removed.
> 6) Find out when the pool was upgraded (I used command history) and 
> create a file with that date, in my case:
> touch -t 2304161957 /tmp/from
> 7) Find out when you booted the new kernel (I used fgrep Copyright 
> /var/log/messages | tail -n 1) and create a file with that date, in my 
> case:
> touch -t 2304172142 /tmp/to
> 8) Find the files/dirs created between the two dates:
> find / -newerBm /tmp/from -and -not -newerBm /tmp/to > /tmp/filelist.txt
> 9) Inspect /tmp/filelist.txt and save any important items. If the 
> important files are not corrupted you can do:
> cp important_file new; mv new important_file
> NOTA BENE: "touch important_file" would not work, you do need to 
> re-create the file.
> 10) Delete the remaining files/dirs in /tmp/filelist.txt. If you did 5) 
> you will remove /boot/kernel.old files, but not /boot/kernel files.
> 11) Restore your compression and sync properties where appropriate.
> 

===
Mark Millard
marklmi at yahoo.com




Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75

2023-04-18 Thread José Pérez

On 2023-04-17 21:59, Pawel Jakub Dawidek wrote:

> José,
>
> I can only speak of block cloning in details, but I'll try to address
> everything.
>
> The easiest way to avoid block_cloning-related corruption on the
> kernel after the last OpenZFS merge, but before e0bb199925 is to set
> the compress property to 'off' and the sync property to something
> other than 'disabled'. This will avoid the block_cloning-related
> corruption and zil_replaying() panic.
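
The property changes described above could be planned with a small
dry-run helper. This is just a sketch under assumptions: the function
names plan_compression and plan_sync are made up, dataset names are
assumed to contain no whitespace, and "standard" is picked as the sync
value only because it is one of the values other than "disabled".

```shell
#!/bin/sh
# Both helpers read "dataset<TAB>value" lines on stdin and only PRINT
# the zfs commands; nothing is changed on the system.

# Print a command for each dataset whose compression is not "off".
plan_compression() {
    awk -F '\t' '$2 != "off" { printf "zfs set compression=off %s\n", $1 }'
}

# Print a command for each dataset whose sync property is "disabled".
plan_sync() {
    awk -F '\t' '$2 == "disabled" { printf "zfs set sync=standard %s\n", $1 }'
}

# Possible use on the affected machine:
#   zfs get -H -o name,value -t filesystem,volume compression | plan_compression
#   zfs get -H -o name,value -t filesystem,volume sync | plan_sync
```

Saving the unmodified "zfs get" output to a file first keeps a record of
the original values for restoring them later.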

> As for the other corruption, unfortunately I don't know the details,
> but my understanding is that it is happening under higher load. Not
> sure I'd trust a kernel built on a machine with this bug present. What
> I would do is to compile the kernel as of 068913e4ba somewhere else,
> boot the problematic machine in single-user mode and install the newly
> built kernel.
>
> As far as I can tell, contrary to some initial reports, none of the
> problems introduced by the recent OpenZFS merge corrupt the pool
> metadata, only file's data. You can locate the files modified with the
> bogus kernel using find(1) with a proper modification time, but you
> have to decide what to do with them (either throw them away, restore
> them from backup or inspect them).


Sharing my experience on how to get out of the worst case scenario with 
a building machine that is affected by the bug.


CAVEAT: this is my experience, take it at your own risk. It worked for 
me, there is no guarantee that it will work for you. You may create 
corrupted files and make your system harder to recover or definitely 
brick it. Don't blame me, you have been warned. YMMV.


Boot in single user mode and check if your pool has block cloning in 
use:

# zpool get feature@block_cloning zroot
NAME   PROPERTY               VALUE   SOURCE
zroot  feature@block_cloning  active  local

In this case it does because the value is "active". If it's "enabled" 
you do not need to do anything.


1) When in single user mode set compression property to "off" on any zfs 
active dataset that has compression other than "off" and the sync 
property to something other than "disabled".

2) Boot multiuser and update your current sources, e.g.
   git update --rebase
3) Build and install a new kernel without too much pressure (e.g. with 
-j 1):

   make -j 1 kernel
4) Reboot with the new kernel
5) Now you have to reinstall the kernel with
   make installkernel
   This is because the new kernel files were written by the old kernel 
and need to be removed.
6) Find out when the pool was upgraded (I used command history) and 
create a file with that date, in my case:

   touch -t 2304161957 /tmp/from
7) Find out when you booted the new kernel (I used fgrep Copyright 
/var/log/messages | tail -n 1) and create a file with that date, in my 
case:

   touch -t 2304172142 /tmp/to
8) Find the files/dirs created between the two dates:
   find / -newerBm /tmp/from -and -not -newerBm /tmp/to > /tmp/filelist.txt
9) Inspect /tmp/filelist.txt and save any important items. If the 
important files are not corrupted you can do:

   cp important_file new; mv new important_file
   NOTA BENE: "touch important_file" would not work, you do need to 
re-create the file.
10) Delete the remaining files/dirs in /tmp/filelist.txt. If you did 5) 
you will remove /boot/kernel.old files, but not /boot/kernel files.

11) Restore your compression and sync properties where appropriate.
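
Step 9 can be sketched as a small helper. This is an illustration under assumptions, not part of the original procedure: the function names and the list file /tmp/keep.txt are hypothetical. It does the same cp-then-mv as step 9, but uses a temporary name in the same directory so that the final mv is a rename within the same dataset.

```shell
#!/bin/sh
# Sketch only: re-create a file so that its data is written again by the
# currently running (fixed) kernel, as in
# "cp important_file new; mv new important_file".
rewrite_file() {
    src=$1
    dir=$(dirname -- "$src")
    # Temporary file next to the original (hypothetical name pattern).
    tmp=$(mktemp "$dir/.rewrite.XXXXXX") || return 1
    # -p keeps mode, ownership and timestamps where possible.
    if cp -p -- "$src" "$tmp"; then
        mv -f -- "$tmp" "$src"
    else
        rm -f -- "$tmp"
        return 1
    fi
}

# Apply rewrite_file to every regular file named in a list, one path per
# line. Other entries (directories, missing paths) are skipped.
rewrite_list() {
    while IFS= read -r f; do
        if [ -f "$f" ]; then
            rewrite_file "$f" || return 1
        fi
    done < "$1"
}

# Hypothetical usage, with /tmp/keep.txt a hand-made list of files from
# /tmp/filelist.txt that were inspected and found intact:
#   rewrite_list /tmp/keep.txt
```

Any path rewritten this way is still listed in /tmp/filelist.txt, so it presumably has to be taken off that list before the deletion in step 10.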

BR,

--
José Pérez



Re: Problem compiling py-* ports

2023-04-18 Thread Filippo Moretti
 After completing installworld and rebuilding python39 I still get the same 
error as reported previously.
Filippo

On Tuesday, April 18, 2023 at 12:54:33 PM UTC, Mateusz Guzik 
 wrote:  
 
 On 4/18/23, Filippo Moretti  wrote:
> Good morning, I run this version of FreeBSD and all
> py-* ports fail with the following message. Sincerely, Filippo
>
> FreeBSD STING 14.0-CURRENT FreeBSD 14.0-CURRENT #6
> main-n261981-63b113af5706: Tue Apr  4 16:57:47 CEST 2023
> filippo@STING:/usr/obj/usr/src/amd64.amd64/sys/STING amd64
>
>

you are on a zfs commit with known data corruption (in fact, 2)

bare minimum you need to update to fresh main and reinstall all python stuff

>
>
>
>    return _bootstrap._gcd_import(name[level:], package,
> level)
>    File "", line 1030, in
> _gcd_import
>
>  File "", line 1007, in
> _find_and_load
>
>  File "", line 972, in
> _find_and_load_unlocked
>
>  File "", line 228, in
> _call_with_frames_removed
>  File "", line 1030, in _gcd_import
>  File "", line 1007, in _find_and_load
>  File "", line 986, in
> _find_and_load_unlocked
>  File "", line 680, in _load_unlocked
>  File "", line 850, in exec_module
>  File "", line 228, in
> _call_with_frames_removed
>  File "/usr/local/lib/python3.9/site-packages/setuptools/__init__.py", line
> 18, in 
>    from setuptools.dist import Distribution
>  File "/usr/local/lib/python3.9/site-packages/setuptools/dist.py", line 34,
> in 
>    from ._importlib import metadata
>  File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py",
> line 39, in 
>    disable_importlib_metadata_finder(metadata)
>  File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py",
> line 28, in disable_importlib_metadata_finder
>    to_remove = [
>  File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py",
> line 31, in 
>    if isinstance(ob, importlib_metadata.MetadataPathFinder)
> AttributeError: module 'importlib_metadata' has no attribute
> 'MetadataPathFinder'
>
> ERROR Backend subprocess exited when trying to invoke
> get_requires_for_build_wheel
> *** Error code 1
>
> Stop.
> make: stopped in /usr/ports/textproc/py-pygments
>
> ===>>> make build failed for textproc/py-pygments@py39
> ===>>> Aborting update
>
> ===>>> Update for textproc/py-pygments@py39 failed
> ===>>> Aborting update
>
>
> ===>>> You can restart from the point of failure with this command line:
>
>
>
>


-- 
Mateusz Guzik 

  

Re: Problem compiling py-* ports

2023-04-18 Thread Mateusz Guzik
On 4/18/23, Filippo Moretti  wrote:
> Good morning, I run this version of FreeBSD and all
> py-* ports fail with the following message. Sincerely, Filippo
>
> FreeBSD STING 14.0-CURRENT FreeBSD 14.0-CURRENT #6
> main-n261981-63b113af5706: Tue Apr  4 16:57:47 CEST 2023
> filippo@STING:/usr/obj/usr/src/amd64.amd64/sys/STING amd64
>
>

you are on a zfs commit with known data corruption (in fact, 2)

bare minimum you need to update to fresh main and reinstall all python stuff

>
>
>
>return _bootstrap._gcd_import(name[level:], package,
> level)
>File "", line 1030, in
> _gcd_import
>
>   File "", line 1007, in
> _find_and_load
>
>   File "", line 972, in
> _find_and_load_unlocked
>
>   File "", line 228, in
> _call_with_frames_removed
>   File "", line 1030, in _gcd_import
>   File "", line 1007, in _find_and_load
>   File "", line 986, in
> _find_and_load_unlocked
>   File "", line 680, in _load_unlocked
>   File "", line 850, in exec_module
>   File "", line 228, in
> _call_with_frames_removed
>   File "/usr/local/lib/python3.9/site-packages/setuptools/__init__.py", line
> 18, in 
> from setuptools.dist import Distribution
>   File "/usr/local/lib/python3.9/site-packages/setuptools/dist.py", line 34,
> in 
> from ._importlib import metadata
>   File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py",
> line 39, in 
> disable_importlib_metadata_finder(metadata)
>   File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py",
> line 28, in disable_importlib_metadata_finder
> to_remove = [
>   File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py",
> line 31, in 
> if isinstance(ob, importlib_metadata.MetadataPathFinder)
> AttributeError: module 'importlib_metadata' has no attribute
> 'MetadataPathFinder'
>
> ERROR Backend subprocess exited when trying to invoke
> get_requires_for_build_wheel
> *** Error code 1
>
> Stop.
> make: stopped in /usr/ports/textproc/py-pygments
>
> ===>>> make build failed for textproc/py-pygments@py39
> ===>>> Aborting update
>
> ===>>> Update for textproc/py-pygments@py39 failed
> ===>>> Aborting update
>
>
> ===>>> You can restart from the point of failure with this command line:
>
>
>
>


-- 
Mateusz Guzik 



Re: Problem compiling py-* ports

2023-04-18 Thread Filippo Moretti
 
After following the instructions provided I get the following:

===>>> All >> py39-pygments-2.14.0 (1/9)

===>  Building for py39-pygments-2.15.0
/usr/local/bin/python3.9: No module named build
*** Error code 1

Stop.
make: stopped in /usr/ports/textproc/py-pygments

===>>> make build failed for textproc/py-pygments@py39
===>>> Aborting update

===>>> Update for textproc/py-pygments@py39 failed
===>>> Aborting update



On Tuesday, April 18, 2023 at 09:57:31 AM UTC, Felix Palmen 
 wrote:  
 
 * Marek Zarychta  [20230418 11:41]:
> What is the culprit?

Most likely some leftovers from older package versions (so it's only a
problem when building in the live system. A clean jail won't have the
offending files).

>                      Removing blindly all py- packages for all affected
> hosts with dependent software is not the best solution.

Depending on your situation, it might or might not be faster than trying
to identify the exact offending files and just removing them ;)

-- 
 Felix Palmen     {private}  fe...@palmen-it.de
 -- ports committer (mentee) --            {web}  http://palmen-it.de
 {pgp public key}  http://palmen-it.de/pub.txt
 {pgp fingerprint} 6936 13D5 5BBF 4837 B212  3ACC 54AD E006 9879 F231
  

Re: another crash and going forward with zfs

2023-04-18 Thread Juraj Lutter



> On 18 Apr 2023, at 09:46, Martin Matuska  wrote:
> 
> Btw. I am open for setting up a pre-merge stress testing
> 
> I will check out if I can use the hourly-billed amd64 and arm64 cloud boxes 
> at Hetzner with FreeBSD.
> Otherwise there are monthly-billed as well.

I can provide a bhyve VM on some of my hosts. We can discuss it off-list.

jl

—
Juraj Lutter
o...@freebsd.org




Re: Reducing SIGINFO verbosity

2023-04-18 Thread Michael Gmelin



On Thu, 23 Jun 2022 11:15:55 -0600
Warner Losh  wrote:

> On Sun, Jun 19, 2022 at 6:06 AM Michael Gmelin 
> wrote:
> 
> >
> >
> > On Fri, 21 May 2021 08:36:49 -0600
> > Warner Losh  wrote:
> >  
> > > On Fri, May 21, 2021 at 7:38 AM Ceri Davies 
> > > wrote:
> > >  
> > > > On Thu, May 20, 2021 at 03:57:17PM -0700, Conrad Meyer wrote:  
> > > > > No, I don’t think there’s any reason to default it
> > > > > differently on stable  
> > > > vs  
> > > > > current. I think it’s useful (and I prefer the more verbose
> > > > > form, which isn’t the default).  
> > > >
> > > > I agree that there's no reason for the default to be different,
> > > > but I would say that it is much easier for someone who knows
> > > > that there is a default to be changed to change it, than it is
> > > > for someone who does not. Therefore, if the information is not
> > > > useful to someone who does not know how to get rid of it, then
> > > > it should default to not being displayed, IMHO.
> > > >  
> > >
> > > I plan on changing the default for non-INVARIANT kernels back to
> > > the old behavior.
> > >
> > > INVARIANT kernels will keep this behavior because it's a debugging
> > > kernel and this is one more thing to help debugging problems.
> > >  
> >
> > Did this ever happen? I just installed a fresh 13.1-RELEASE
> > production system (non-INVARIANT kernel) and it seems like SIGINFO
> > still outputs kernel stack information.
> >  
> 
> https://reviews.freebsd.org/D35576 for those who wish to weigh in.
> 
> Warner
> 
> 

Hi Warner,

I just installed 13.2-RELEASE, seems like this was never MFCd (it is in
main, but not in stable/13 or releng/13.2). TBH, I could've checked
myself back then (it's so easy to forget to MFC).

Cheers
Michael

p.s. Learned about it by hitting ctrl-t to check if freebsd-update on my
slow test machine is actually alive :D

-- 
Michael Gmelin



Re: Problem compiling py-* ports

2023-04-18 Thread Felix Palmen
* Marek Zarychta  [20230418 11:41]:
> What is the culprit?

Most likely some leftovers from older package versions (so it's only a
problem when building in the live system. A clean jail won't have the
offending files).

>  Removing blindly all py- packages for all affected
> hosts with dependent software is not the best solution.

Depending on your situation, it might or might not be faster than trying
to identify the exact offending files and just removing them ;)

-- 
 Felix Palmen  {private}   fe...@palmen-it.de
 -- ports committer (mentee) --{web}  http://palmen-it.de
 {pgp public key}  http://palmen-it.de/pub.txt
 {pgp fingerprint} 6936 13D5 5BBF 4837 B212  3ACC 54AD E006 9879 F231




Re: Problem compiling py-* ports

2023-04-18 Thread Christos Chatzaras


> On 18 Apr 2023, at 11:12, Filippo Moretti  wrote:
> 
> Good morning,
> I run this version of FreeBSD and all py-* ports fail 
> with the following message.
> sincerely
> Filippo

Maybe this helps:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269545#c6


Re: Problem compiling py-* ports

2023-04-18 Thread Marek Zarychta

On 18.04.2023 at 10:46, Felix Palmen wrote:

> * Filippo Moretti  [20230418 08:12]:
>
> >    File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py", line 31, in 
> >      if isinstance(ob, importlib_metadata.MetadataPathFinder)
> > AttributeError: module 'importlib_metadata' has no attribute 
> > 'MetadataPathFinder'

I am also hitting this bug with Python 3.8, so I recently considered an 
upgrade to 3.10, but it turns out that the flaw is not version dependent.

> Looks like you're building directly on your host (not in clean jails)
> and the build picks up "something" that's already installed.

What is the culprit? Removing blindly all py- packages for all affected 
hosts with dependent software is not the best solution.

> I'd probably try to uninstall all python packages, check whether there
> are leftovers in LOCALBASE/lib/python3.9 (and remove them if necessary)
> and then do a rebuild.
>
> You can avoid this class of issues using poudriere btw...



--
Marek Zarychta




Re: Problem compiling py-* ports

2023-04-18 Thread Felix Palmen
* Filippo Moretti  [20230418 08:12]:
>   File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py", 
> line 31, in 
>     if isinstance(ob, importlib_metadata.MetadataPathFinder)
> AttributeError: module 'importlib_metadata' has no attribute 
> 'MetadataPathFinder'

Looks like you're building directly on your host (not in clean jails)
and the build picks up "something" that's already installed.

I'd probably try to uninstall all python packages, check whether there
are leftovers in LOCALBASE/lib/python3.9 (and remove them if necessary)
and then do a rebuild.

You can avoid this class of issues using poudriere btw...
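
One way to look for such leftovers is to ask the interpreter which file a module is actually loaded from. The following is a minimal sketch under assumptions: the helper name module_origin is made up, and the interpreter path in the usage comment is the one from the error output in this thread.

```shell
#!/bin/sh
# Sketch only: print the file a Python module is loaded from, to see
# whether a stale copy is being picked up. The function name is hypothetical.
module_origin() {
    py=$1   # interpreter, e.g. /usr/local/bin/python3.9
    mod=$2  # module name, e.g. importlib_metadata
    "$py" -c 'import importlib, sys; print(importlib.import_module(sys.argv[1]).__file__)' "$mod"
}

# Hypothetical usage on the affected host:
#   f=$(module_origin /usr/local/bin/python3.9 importlib_metadata)
#   pkg which "$f"    # reports which installed package, if any, owns the file
```

If pkg which reports that the file does not belong to any installed package, it is a candidate for the kind of leftover described above.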

-- 
 Felix Palmen  {private}   fe...@palmen-it.de
 -- ports committer (mentee) --{web}  http://palmen-it.de
 {pgp public key}  http://palmen-it.de/pub.txt
 {pgp fingerprint} 6936 13D5 5BBF 4837 B212  3ACC 54AD E006 9879 F231




Problem compiling py-* ports

2023-04-18 Thread Filippo Moretti
Good morning,
I run this version of FreeBSD and all py-* ports fail with the following message.
Sincerely,
Filippo

FreeBSD STING 14.0-CURRENT FreeBSD 14.0-CURRENT #6 main-n261981-63b113af5706: 
Tue Apr  4 16:57:47 CEST 2023 
filippo@STING:/usr/obj/usr/src/amd64.amd64/sys/STING amd64





    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1030, in _gcd_import
  File "", line 1007, in _find_and_load
  File "", line 972, in _find_and_load_unlocked
  File "", line 228, in _call_with_frames_removed
  File "", line 1030, in _gcd_import
  File "", line 1007, in _find_and_load
  File "", line 986, in _find_and_load_unlocked
  File "", line 680, in _load_unlocked
  File "", line 850, in exec_module
  File "", line 228, in _call_with_frames_removed
  File "/usr/local/lib/python3.9/site-packages/setuptools/__init__.py", line 18, in
    from setuptools.dist import Distribution
  File "/usr/local/lib/python3.9/site-packages/setuptools/dist.py", line 34, in
    from ._importlib import metadata
  File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py", line 39, in
    disable_importlib_metadata_finder(metadata)
  File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py", line 28, in disable_importlib_metadata_finder
    to_remove = [
  File "/usr/local/lib/python3.9/site-packages/setuptools/_importlib.py", line 31, in
    if isinstance(ob, importlib_metadata.MetadataPathFinder)
AttributeError: module 'importlib_metadata' has no attribute 'MetadataPathFinder'

ERROR Backend subprocess exited when trying to invoke 
get_requires_for_build_wheel
*** Error code 1

Stop.
make: stopped in /usr/ports/textproc/py-pygments

===>>> make build failed for textproc/py-pygments@py39
===>>> Aborting update

===>>> Update for textproc/py-pygments@py39 failed
===>>> Aborting update


===>>> You can restart from the point of failure with this command line:





Re: another crash and going forward with zfs

2023-04-18 Thread Martin Matuska

Btw. I am open to setting up pre-merge stress testing.

I will check out if I can use the hourly-billed amd64 and arm64 cloud 
boxes at Hetzner with FreeBSD.

Otherwise there are monthly-billed ones as well.

Cheers,
mm

On 17. 4. 2023 22:14, Mateusz Guzik wrote:

On 4/17/23, Pawel Jakub Dawidek  wrote:

On 4/18/23 03:51, Mateusz Guzik wrote:

After bugfixes got committed I decided to zpool upgrade and sysctl
vfs.zfs.bclone_enabled=1 vs poudriere for testing purposes. I very
quickly got a new crash:

panic: VERIFY(arc_released(db->db_buf)) failed

cpuid = 9
time = 1681755046
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe0a90b8e5f0
vpanic() at vpanic+0x152/frame 0xfe0a90b8e640
spl_panic() at spl_panic+0x3a/frame 0xfe0a90b8e6a0
dbuf_redirty() at dbuf_redirty+0xbd/frame 0xfe0a90b8e6c0
dmu_buf_will_dirty_impl() at dmu_buf_will_dirty_impl+0xa2/frame
0xfe0a90b8e700
dmu_write_uio_dnode() at dmu_write_uio_dnode+0xe9/frame
0xfe0a90b8e780
dmu_write_uio_dbuf() at dmu_write_uio_dbuf+0x42/frame 0xfe0a90b8e7b0
zfs_write() at zfs_write+0x672/frame 0xfe0a90b8e960
zfs_freebsd_write() at zfs_freebsd_write+0x39/frame 0xfe0a90b8e980
VOP_WRITE_APV() at VOP_WRITE_APV+0xdb/frame 0xfe0a90b8ea90
vn_write() at vn_write+0x325/frame 0xfe0a90b8eb20
vn_io_fault_doio() at vn_io_fault_doio+0x43/frame 0xfe0a90b8eb80
vn_io_fault1() at vn_io_fault1+0x161/frame 0xfe0a90b8ecc0
vn_io_fault() at vn_io_fault+0x1b5/frame 0xfe0a90b8ed40
dofilewrite() at dofilewrite+0x81/frame 0xfe0a90b8ed90
sys_write() at sys_write+0xc0/frame 0xfe0a90b8ee00
amd64_syscall() at amd64_syscall+0x157/frame 0xfe0a90b8ef30
fast_syscall_common() at fast_syscall_common+0xf8/frame
0xfe0a90b8ef30
--- syscall (4, FreeBSD ELF64, write), rip = 0x103cddf7949a, rsp =
0x103cdc85dd48, rbp = 0x103cdc85dd80 ---
KDB: enter: panic
[ thread pid 95000 tid 135035 ]
Stopped at      kdb_enter+0x32: movq    $0,0x9e4153(%rip)

The posted 14.0 schedule plans to branch stable/14 on May 12, and
one cannot bet on the feature getting beaten into production shape
by that time. Given whatever non-block_cloning and not even zfs bugs
which are likely to come out, I think this makes the feature a
non-starter for said release.

I note:
1. the current problems did not make it into stable branches.
2. there was block_cloning-related data corruption (fixed) and there may
be more
3. there was unrelated data corruption (see
https://github.com/openzfs/zfs/issues/14753), sorted out by reverting
the problematic commit in FreeBSD, not yet sorted out upstream

As such people's data may be partially hosed as is.

Consequently the proposed plan is as follows:
1. whack the block cloning feature for the time being, but make sure
pools which upgraded to it can be mounted read-only
2. run ztest and whatever other stress testing on FreeBSD, along with
restoring openzfs CI -- I can do the first part, I'm sure pho will not
mind to run some tests of his own
3. recommend people create new pools and restore data from backup. if
restoring from backup is not an option, tar or cp (not zfs send) from
the read-only mount
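
Item 3 can be sketched as follows. This is an illustration only: the function name copy_tree and the pool names "oldpool" and "newpool" are placeholders, and the tar pipe is just one way of doing the file-level copy that the item recommends instead of zfs send.

```shell
#!/bin/sh
# Sketch only: copy a directory tree file by file with tar.
# The function name is hypothetical.
copy_tree() {
    src=$1
    dst=$2
    mkdir -p -- "$dst" || return 1
    # The first tar writes the tree to stdout, the second unpacks it
    # into the destination, preserving permissions (-p).
    tar -cf - -C "$src" . | tar -xpf - -C "$dst"
}

# Hypothetical usage; "oldpool" and "newpool" are placeholder pool names:
#   zpool import -o readonly=on oldpool
#   copy_tree /oldpool /newpool
```

Importing with readonly=on keeps the old pool from being written to while the data is copied off it.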

block cloning beaten into shape would use block_cloning_v2 or whatever
else, the key point being that the current feature name would be
considered bogus (not blocking RO import though) to prevent RW usage of
the current pools with it enabled.

Comments?

Correct me if I'm wrong, but from my understanding there were zero
problems with block cloning when it wasn't in use or now disabled.

The reason I've introduced vfs.zfs.bclone_enabled sysctl, was to exactly
avoid mess like this and give us more time to sort all the problems out
while making it easy for people to try it.

If there is no plan to revert the whole import, I don't see what value
removing just block cloning will bring if it is now disabled by default
and didn't cause any problems when disabled.


The feature definitely was not properly stress tested, and
trying to do so keeps running into panics. Given the complexity of the
feature I would expect there are many bugs lurking, some of which
are possibly related to the on-disk format. Not having to deal with any of
this can be arranged as described above and is imo the most
sensible route given the timeline for 14.0.





Re: another crash and going forward with zfs

2023-04-18 Thread Martin Matuska

On 18. 4. 2023 3:16, Warner Losh wrote:



Related question: what zfs branch is stable/14 going to track? With 13 
it was whatever the next stable branch was.


Warner

FreeBSD 14.0 is going to track the soon-to-be-branched OpenZFS 2.2