Re: dump -X of large LVM based FFSv2 with WAPBL panics

2017-11-17 Thread Matthias Petermann

Hello Jaromir,

I had actually already run a forced fsck on that file system beforehand,
while it was unmounted. To be sure, I just ran the command again; it passes
with no errors the second time. When I run dump -X again, the panic still occurs.


Best regards,
Matthias


nuc# fsck -P /dev/mapper/vg0-photo
** /dev/mapper/rvg0-photo
** File system is clean; not checking
nuc# fsck -P -f /dev/mapper/vg0-photo
** /dev/mapper/rvg0-photo
** File system is already clean
** Last Mounted on /p
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK

SALVAGE? [yn] y

59411 files, 63408414 used, 35694535 free (2079 frags, 4461557 blocks, 0.0% fragmentation)


* FILE SYSTEM WAS MODIFIED *
nuc# fsck -P -f /dev/mapper/vg0-photo
** /dev/mapper/rvg0-photo
** File system is already clean
** Last Mounted on /p
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
59411 files, 63408414 used, 35694535 free (2079 frags, 4461557 blocks, 0.0% fragmentation)

nuc# mount /p
nuc# touch /p/test.ignore
nuc# umount /p
nuc# fsck -P -f /dev/mapper/vg0-photo
** /dev/mapper/rvg0-photo
** File system is already clean
** Last Mounted on /p
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
59412 files, 63408414 used, 35694535 free (2079 frags, 4461557 blocks, 0.0% fragmentation)

nuc#

On 15.11.2017 at 20:29, Jaromír Doleček wrote:

Hi,

can you check whether a full forced fsck (fsck -f) resolves this?

I've seen several such persistent panics when I was debugging WAPBL. 
Even after kernel fixes I had persistent panics around ffs_newvnode() 
due to disk data corruption from previous runs. This is worth trying.


Some day I plan to add a counter, so that boot would actually force an
fsck every X boots even when the file system is clean, similar to what
Linux does with ext3/4.
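
The counter idea above could look roughly like the following sketch. It is
an illustration only: the structure, its field names and both functions are
hypothetical and are not existing NetBSD code. The ext3/4 analogue is the
maximum mount count that can be set for those file systems.

```c
#include <stdbool.h>

/* Hypothetical per-filesystem state; these field names are illustrative
 * and do not correspond to actual FFS superblock fields. */
struct ex_fs_state {
	bool     clean;             /* file system was unmounted cleanly */
	unsigned boots_since_fsck;  /* boots since the last full check */
	unsigned max_boots;         /* the "X" above; 0 disables the counter */
};

/* Decide at boot whether a full fsck should be forced. */
static bool
ex_should_force_fsck(const struct ex_fs_state *fs)
{
	if (!fs->clean)
		return true;
	return fs->max_boots != 0 && fs->boots_since_fsck >= fs->max_boots;
}

/* Update the counter once the boot-time decision has been acted on. */
static void
ex_record_boot(struct ex_fs_state *fs, bool fsck_ran)
{
	fs->boots_since_fsck = fsck_ran ? 0 : fs->boots_since_fsck + 1;
}
```

With max_boots set to 3, a clean file system would be checked on the boot
after three unchecked boots have been recorded, and the counter would then
start again from zero.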


Jaromir

2017-11-15 12:56 GMT+01:00 Matthias Petermann:


Hello,

on my system I have observed a serious panic when doing FFSv2 dumps
under certain conditions. I did some googling on my own and found
some references regarding the leading symptom

         "ffs_newvnode: ino=113 on /p: gen 55fd2f1f/55fd2f1f has non zero blocks ff00 or size 0"

but all of them were marked as solved back in 2016. So I wanted to
share my observation here, in the hope that somebody can give me some
pointers on how the issue could be narrowed down further.

1) Given:

- NetBSD 8.0_BETA (Kernel built from branches/netbsd-8 around
2017-11-06)

         NetBSD nuc.local 8.0_BETA NetBSD 8.0_BETA (XEN3_DOM0_XHCI)
#0: Mon Nov 6 14:31:17 CET 2017
admin@nuc.local:/s/src/sys/arch/amd64/compile/XEN3_DOM0_XHCI amd64

- A large (392 GB) LVM volume hosting a FFSv2 filesystem with WAPBL
enabled
   (/dev/mapper/vg0-photo mounted at /p)

- (An external USB 3.0 Drive)

2) What I tried:

- make a dump of the aforementioned filesystem, using snapshots

     # dump -X -0auf /mnt/photo.0.dump /p

3) What happens then:

- the system crashes, leaving a core dump with the following
indication:

     ffs_newvnode: ino=113 on /p: gen 55fd2f1f/55fd2f1f has non zero blocks ff00 or size 0
     fatal page fault in supervisor mode
     trap type 6 code 0x2 rip 0x8022c0cc cs 0x8 rflags 0x10246 cr2 0xfe82deaddf1d ilevel 0x3 rsp 0xfe810e6b1eb8
     curlwp 0xfe827f736000 pid 0.4 lowest kstack 0xfe810e6ae2c0
     panic: trap
     cpu0: Begin traceback...
     vpanic() at netbsd:vpanic+0x140
     snprintf() at netbsd:snprintf
     trap() at netbsd:trap+0xc6b
     --- trap (number 6) ---
     mutex_enter() at netbsd:mutex_enter+0xc
     biodone2() at netbsd:biodone2+0x9b
     biodone2() at netbsd:biodone2+0x9b
     biointr() at netbsd:biointr+0x3a
     softint_dispatch() at netbsd:softint_dispatch+0xd3
     DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfe810e6b1ff0
     Xsoftintr() at netbsd:Xsoftintr+0x4f
     --- interrupt ---
     0:
     cpu0: End traceback...

     dumping to dev 0,1 (offset=168119, size=2076255):
     dump

- gdb backtrace shows:

     (gdb) target kvm netbsd.3.core
     0x80229545 in cpu_reboot ()
     (gdb) bt
     #0  0x80229545 in cpu_reboot ()
     #1  0x809a4afc in vpanic ()
     #2  0x809a4bb0 in panic ()
     #3  0x8022b176 in trap ()

daily CVS update output

2017-11-17 Thread NetBSD source update

Updating src tree:
P src/doc/3RDPARTY
P src/doc/CHANGES
P src/external/bsd/tre/Makefile.inc
U src/external/bsd/tre/tre2netbsd
cvs update: `src/external/bsd/tre/dist/ABOUT-NLS' is no longer in the repository
cvs update: `src/external/bsd/tre/dist/ChangeLog' is no longer in the repository
U src/external/bsd/tre/dist/ChangeLog.old
cvs update: `src/external/bsd/tre/dist/INSTALL' is no longer in the repository
P src/external/bsd/tre/dist/Makefile.am
cvs update: `src/external/bsd/tre/dist/Makefile.in' is no longer in the repository
cvs update: `src/external/bsd/tre/dist/README' is no longer in the repository
U src/external/bsd/tre/dist/README.md
cvs update: `src/external/bsd/tre/dist/aclocal.m4' is no longer in the repository
cvs update: `src/external/bsd/tre/dist/config.h.in' is no longer in the repository
cvs update: `src/external/bsd/tre/dist/configure' is no longer in the repository
P src/external/bsd/tre/dist/configure.ac
U src/external/bsd/tre/dist/include/tre/tre-config.h
U src/external/bsd/tre/dist/include/tre/tre.h
P src/external/bsd/tre/dist/lib/Makefile.am
P src/external/bsd/tre/dist/lib/regcomp.c
P src/external/bsd/tre/dist/lib/regexec.c
P src/external/bsd/tre/dist/lib/tre-compile.c
U src/external/bsd/tre/dist/lib/tre-filter.c
U src/external/bsd/tre/dist/lib/tre-filter.h
P src/external/bsd/tre/dist/lib/tre-internal.h
P src/external/bsd/tre/dist/lib/tre-match-approx.c
P src/external/bsd/tre/dist/lib/tre-match-backtrack.c
P src/external/bsd/tre/dist/lib/tre-match-parallel.c
P src/external/bsd/tre/dist/lib/tre-match-utils.h
P src/external/bsd/tre/dist/lib/tre-parse.c
P src/external/bsd/tre/dist/lib/tre-parse.h
P src/external/bsd/tre/dist/lib/tre-stack.h
P src/external/bsd/tre/dist/lib/tre.h
P src/external/bsd/tre/dist/po/fi.po
P src/external/bsd/tre/dist/po/sv.po
P src/external/bsd/tre/dist/python/setup.py
U src/external/bsd/tre/dist/python/setup.py.in
P src/external/bsd/tre/dist/python/tre-python.c
P src/external/bsd/tre/dist/src/Makefile.am
P src/external/bsd/tre/dist/tests/Makefile.am
U src/external/bsd/tre/dist/tests/build-on-hosts.sh
U src/external/bsd/tre/dist/tests/build-run.sh
U src/external/bsd/tre/dist/tests/build-hosts/ahma
U src/external/bsd/tre/dist/tests/build-hosts/earthquake
U src/external/bsd/tre/dist/tests/build-hosts/hemuli
U src/external/bsd/tre/dist/tests/build-hosts/jolly
P src/external/bsd/tre/dist/utils/Makefile.am
P src/external/bsd/tre/dist/utils/autogen.sh
U src/external/bsd/tre/dist/utils/build-release.sh
U src/external/bsd/tre/dist/utils/build-sources.sh
U src/external/bsd/tre/dist/utils/replace-vars.sh
U src/external/bsd/tre/dist/vcbuild/tre.vcxproj
U src/external/bsd/tre/dist/vcbuild/tre.vcxproj.filters
U src/external/bsd/tre/dist/win32/retest.vcproj
U src/external/bsd/tre/dist/win32/tre-config.h.in
U src/external/bsd/tre/dist/win32/tre.sln
U src/external/bsd/tre/dist/win32/tre.vcproj
P src/external/bsd/tre/include/config.h
P src/external/bsd/tre/include/tre-config.h
U src/external/bsd/tre/lib/tre.pc
P src/external/cddl/osnet/sys/sys/kmem.h
P src/share/man/man9/driver.9
P src/sys/arch/amd64/stand/prekern/Makefile
P src/sys/arch/amd64/stand/prekern/console.c
P src/sys/arch/amd64/stand/prekern/elf.c
P src/sys/arch/amd64/stand/prekern/pdir.h
P src/sys/arch/amd64/stand/prekern/prekern.c
P src/sys/arch/arm/vexpress/vexpress_platform.c
P src/sys/dev/usb/ehci.c
P src/sys/dev/usb/if_run.c
P src/sys/dev/usb/if_runvar.h
P src/sys/dev/usb/if_urtwn.c
P src/sys/dev/usb/motg.c
P src/sys/dev/usb/ohci.c
P src/sys/dev/usb/uhci.c
P src/sys/dev/usb/xhci.c
P src/sys/external/bsd/dwc2/dwc2.c
P src/sys/kern/subr_localcount.c
P src/sys/net/bpf.c
P src/sys/net/if.c
P src/sys/net/if.h
P src/sys/net/if_bridge.c
P src/sys/net/if_loop.c
P src/sys/net/if_pppoe.c
P src/sys/net/rtsock.c
P src/sys/net/npf/npf_os.c
P src/sys/netinet/if_arp.c
P src/sys/netinet/igmp.c
P src/sys/netinet/in.c
P src/sys/netinet/ip_flow.c
P src/sys/netinet/ip_input.c
P src/sys/netinet/ip_output.c
P src/sys/netinet6/frag6.c
P src/sys/netinet6/in6.c
P src/sys/netinet6/ip6_flow.c
P src/sys/netinet6/ip6_input.c
P src/sys/netinet6/mld6.c
P src/sys/netinet6/nd6.c
P src/sys/netinet6/nd6_nbr.c
P src/sys/netipsec/ipsec_output.c
P src/sys/sys/localcount.h
P src/usr.bin/config/main.c
P src/usr.bin/systat/main.c

Updating xsrc tree:


Killing core files:


Updating tar files:
src/top-level: collecting... replacing... done
src/bin: collecting... replacing... done
src/common: collecting... replacing... done
src/compat: collecting... replacing... done
src/crypto: collecting... replacing... done
src/dist: collecting... replacing... done
src/distrib: collecting... replacing... done
src/doc: collecting... replacing... done
src/etc: collecting... replacing... done
src/external: collecting... replacing... done
src/extsrc: collecting... replacing... done
src/games: collecting... replacing... done
src/gnu: collecting...pax: Unable to access src/gnu (No such file or directory)
pax: WARNING! These file names were not selected:
src/gnu
 done

Re: toolchain/52722: config(8) busy-loops, then cores

2017-11-17 Thread John D. Baker
On Fri, 17 Nov 2017, Christos Zoulas wrote:

> Well, put the spinning kernel in the PR :-)

Just saw the above in the mailing list archive as I'm not on the default
distribution list for this PR (not originator).

I appended to it as the commit and the behavior seemed relevant.

Kernel config below.  I began using it with netbsd-6, with slight
adjustments over the years.  It has never had a problem until now.


# YGGDRASIL - big file server based on intel D945GCL board

include "arch/amd64/conf/GENERIC"
no options      INCLUDE_CONFIG_FILE
options         INCLUDE_JUST_CONFIG
#maxusers       64              # estimated number of users
no options      INSECURE        # disable kernel security levels - X needs this
no options      RTC_OFFSET      # hardware clock is this many mins. west of GMT
no est0         at cpu0         # Intel Enhanced SpeedStep (non-ACPI)
no powernow0    at cpu0         # AMD PowerNow! and Cool'n'Quiet (non-ACPI)
# Beep when it is safe to power down the system (requires sysbeep)
options         BEEP_ONHALT
# Some tunable details of the above feature (default values used below)
#options        BEEP_ONHALT_COUNT=3     # Times to beep
#options        BEEP_ONHALT_PITCH=1500  # Default frequency (in Hz)
#options        BEEP_ONHALT_PERIOD=250  # Default duration (in msecs)
no options      COMPAT_15       # compatibility with NetBSD 1.5,
#no options     COMPAT_70       # NetBSD 7.0, and
#no options     COMPAT_80       # NetBSD 8.0 binary compatibility.
no options      COMPAT_43       # and 4.3BSD
no options      COMPAT_OSSAUDIO
no options      COMPAT_LINUX
no options      COMPAT_LINUX32  # req. COMPAT_LINUX and COMPAT_NETBSD32
no file-system  MFS             # memory file system
no file-system  CODA            # Coda File System; also needs vcoda (below)
options         APPLE_UFS
no options      PPP_BSDCOMP     # BSD-Compress compression support for PPP
no options      PPP_DEFLATE     # Deflate compression support for PPP
no options      PPP_FILTER      # Active filter support for PPP (requires bpf)
no options      IPFILTER_LOG    # ipmon(8) log support
no options      IPFILTER_LOOKUP # ippool(8) support
no options      IPFILTER_COMPAT # Compat for IP-Filter
no ipmi0        at mainbus?
no options      MPBIOS          # configure CPUs and APICs using MPBIOS
no options      MPBIOS_SCANPCI  # MPBIOS configures PCI roots
no options      VGA_POST        # in-kernel support for VGA POST
no acpiacad*    at acpi?        # ACPI AC Adapter
no acpibat*     at acpi?        # ACPI Battery
no acpidalb*    at acpi?        # Direct Application Launch Button
no acpiec*      at acpi?        # ACPI Embedded Controller (late)
no acpiecdt*    at acpi?        # ACPI Embedded Controller (early)
no acpifan*     at acpi?        # ACPI Fan
no acpilid*     at acpi?        # ACPI Lid Switch
no acpitz*      at acpi?        # ACPI Thermal Zone
no acpivga*     at acpi?        # ACPI Display Adapter
no acpiout*     at acpivga?     # ACPI Display Output Device
no acpiwdrt*    at acpi?        # ACPI Watchdog Resource Table
no acpiwmi*     at acpi?        # ACPI WMI Mapper
no aibs*        at acpi?        # ASUSTeK AI Booster hardware monitor
no asus*        at acpi?        # ASUS hotkeys
com*            at acpi?        # Serial communications interface
fdc*            at acpi?        # Floppy disk controller
no fujbp*       at acpi?        # Fujitsu Brightness & Pointer
no fujhk*       at acpi?        # Fujitsu Hotkeys
#no hpet*       at acpihpetbus? # High Precision Event Timer (table)
no hpet*        at acpinodebus? # High Precision Event Timer (device)
no joy*         at acpi?        # Joystick/Game port
lpt*            at acpi?        # Parallel port
no mpu*         at acpi?        # Roland MPU-401 MIDI UART
no sdhc*        at acpi?        # SD Host Controller
no sony*        at acpi?        # Sony Notebook Controller
no spic*        at acpi?        # Sony Programmable I/O Controller
no wsmouse*     at spic?        # mouse
no thinkpad*    at acpi?        # IBM/Lenovo Thinkpad hotkeys
no ug*          at acpi?        # Abit uGuru Hardware monitor
no wb*          at acpi?        # Winbond W83L518D SD/MMC reader
no sdmmc*       at wb?          # SD/MMC bus
no wmidell*     at acpiwmibus?  # Dell WMI mappings
no wmieeepc*    at acpiwmibus?  # Asus Eee PC WMI mappings
no wmihp*       at acpiwmibus?  # HP WMI mappings
no wmimsi*      at acpiwmibus?  # MSI WMI mappings
no pwdog*       at pci?         # QUANCOM PWDOG1
no pci*         at pchb?
#no pcib*       at pci?         # PCI-ISA bridges
no puc*         at pci?         # PCI "universal" comm. cards
#no ichlpcib*   at pci?         # Intel ICH PCI-LPC w/ timecounter,
no fwhrng*      at ichlpcib?    # Intel 82802 FWH Random Number Generator
hpet*           at ichlpcib?
no aapic*       at pci?         # AMD 8131 IO apic
no isa0         at mainbus?
#no isa0        at pcib?
#no isa0        at ichlpcib?
no cbb*         at pci?
no cardslot*    at cbb?
no cardbus*     at cardslot?
no pcmcia*      at cardslot?
no pckbc0       at isa?         # pc keyboard controller

Re: zfs tests fails, pool related

2017-11-17 Thread Christos Zoulas
In article <20171117165032.ga1...@mail.soc.lip6.fr>,
Manuel Bouyer wrote:
>Hello
>http://www-soc.lip6.fr/~bouyer/NetBSD-tests/xen/HEAD/amd64/201711142340Z_atf.html#failed-tcs-summary
>
>all zfs tests fail, with:
>panic: kernel diagnostic assertion "!(flags & PR_NOWAIT) != !(flags &
>PR_WAITOK)" failed: file
>"/usr/src/lib/librump/../../sys/rump/../kern/subr_pool.c", line 2229 
>
>I guess this is related to recent pool changes, but looks like a bug
>in parameters to pool_cache_get_paddr() in zfs ...

I committed a fix to that.

christos



Re: toolchain/52722: config(8) busy-loops, then cores

2017-11-17 Thread John D. Baker
Following this commit:

  http://mail-index.netbsd.org/source-changes/2017/11/16/msg089749.html

the stock kernels in a release build fine, but when trying to build the
first of my custom kernels, "nbconfig" spins, consuming most of the CPU.
It doesn't seem to be eating memory, however.

In the most recent attempt it has been spinning for 45 minutes so far.

-- 
|/"\ John D. Baker, KN5UKS   NetBSD Darwin/MacOS X
|\ / jdbaker[snail]mylinuxisp[flyspeck]com  OpenBSD  FreeBSD
| X  No HTML/proprietary data in email.   BSD just sits there and works!
|/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645



zfs tests fails, pool related

2017-11-17 Thread Manuel Bouyer
Hello
http://www-soc.lip6.fr/~bouyer/NetBSD-tests/xen/HEAD/amd64/201711142340Z_atf.html#failed-tcs-summary

all zfs tests fail, with:
panic: kernel diagnostic assertion "!(flags & PR_NOWAIT) != !(flags & 
PR_WAITOK)" failed: file 
"/usr/src/lib/librump/../../sys/rump/../kern/subr_pool.c", line 2229 

I guess this is related to recent pool changes, but looks like a bug
in parameters to pool_cache_get_paddr() in zfs ...
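
For readers unfamiliar with the idiom, the asserted expression says that
exactly one of the two wait flags must be passed. A minimal sketch of the
check follows; the EX_ flag values and the helper name are hypothetical
stand-ins chosen for illustration, not the definitions from sys/pool.h.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration only; the real definitions
 * live in NetBSD's sys/pool.h and may differ. */
#define EX_PR_WAITOK 0x01
#define EX_PR_NOWAIT 0x02

/* Returns true when exactly one of the two flags is set. Each "!" turns
 * a masked value into 0 or 1, so "!=" acts as a logical XOR here. */
static bool
ex_wait_flags_valid(int flags)
{
	return !(flags & EX_PR_NOWAIT) != !(flags & EX_PR_WAITOK);
}
```

Under this reading, a caller passing neither flag (for example 0) or both
flags at once would trip the assertion, which matches the suspicion of a
wrong flags argument in the zfs code.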

-- 
Manuel Bouyer
 NetBSD: 26 years of experience will always make the difference
--


New panic with today's kernel

2017-11-17 Thread Chavdar Ivanov
Hi,

I just got:
...
/var/crash crash -M netbsd.9.core -N netbsd.9
Crash version 8.99.7, image version 8.99.7.
System panicked: kernel diagnostic assertion
"uvm_page_locked_p(old_pg)" failed: file
"/home/sysbuild/src/sys/arch/x86/x86/pmap.c", line 4251
Backtrace from time of crash is available.
crash> trace
_KERNEL_OPT_NARCNET() at 0
?() at 7ffdd110fbfa
vpanic() at vpanic+0x149
ch_voltag_convert_in() at ch_voltag_convert_in
pmap_enter_ma() at pmap_enter_ma+0xe99
pmap_enter_default() at pmap_enter_default+0x1d
uvm_fault_upper_enter.isra.4() at uvm_fault_upper_enter.isra.4+0xb4
uvm_fault_internal() at uvm_fault_internal+0x1692
trap() at trap+0x3f0
--- trap (number 6) ---
7f7f57e00ad8:
...

with a kernel from this morning. It happened fairly far into the rc
sequence, right around when mongod was being started.

The trouble is, I let the fsck complete and rebooted the same kernel,
and it didn't panic this time...

Chavdar Ivanov


--