Re: Rare NVME related freeze at boot (was: Re: NVME aborting outstanding i/o)
Hi!

> On 05.04.2019 at 16:36, Warner Losh wrote:
> What normally comes after the nvme6 line in boot? Often times it's the next
> thing after the last message that's the issue, not the last thing.

nvme7 ;-)

And I had hangs at nvme1, nvme3, … as well.

Patrick
-- 
punkt.de GmbH Internet - Dienstleistungen - Beratung
Kaiserallee 13a Tel.: 0721 9109-0 Fax: -100
76133 Karlsruhe i...@punkt.de http://punkt.de
AG Mannheim 108285 Gf: Juergen Egeling
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Rare NVME related freeze at boot (was: Re: NVME aborting outstanding i/o)
On Fri, Apr 5, 2019 at 6:41 AM Patrick M. Hausen wrote:

> Hi all,
>
> in addition to the aborted commands every dozen of system boots or so
> (this order of magnitude) the kernel simply hangs during initialisation of
> one of the NVME devices:
>
> https://cloud.hausen.com/s/TxPTDFJwMe6sJr2
>
> The particular device affected is not constant.
>
> A power cycle fixes it, the system has not shown hangs/freezes during
> multiuser operation, yet.
>
> Any ideas?

What normally comes after the nvme6 line in boot? Often times it's the next
thing after the last message that's the issue, not the last thing.

Warner
Rare NVME related freeze at boot (was: Re: NVME aborting outstanding i/o)
Hi all,

in addition to the aborted commands, roughly once in every dozen system boots
the kernel simply hangs during initialisation of one of the NVMe devices:

https://cloud.hausen.com/s/TxPTDFJwMe6sJr2

The affected device is not always the same one.

A power cycle fixes it. So far the system has not shown hangs or freezes
during multiuser operation.

Any ideas?

Patrick
Freebsd-11.2-p2/amd64: Black-screen freeze with base i915kms and xf86-video-intel
I need help getting Xorg working with the intel driver. Here is some info:

Machine: Dellox755 shipped 2009-May-16

dmesg snips:

FreeBSD 11.2-RELEASE-p2 #0: Tue Aug 14 21:45:40 UTC 2018
    r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
...
CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz (2992.19-MHz K8-class CPU)
  Origin="GenuineIntel" Id=0x1067a Family=0x6 Model=0x17 Stepping=10
  Features=0xbfebfbff
  Features2=0xc08e3fd
  AMD Features=0x20100800
  AMD Features2=0x1
  VT-x: (disabled in BIOS) HLT,PAUSE
  TSC: P-state invariant, performance statistics
real memory = 3221225472 (3072 MB)
...
vgapci0: port 0xec90-0xec97 mem 0xfea0-0xfea7,0xd000-0xdfff,0xfeb0-0xfebf irq 16 at device 2.0 on pci0
agp0: on vgapci0
agp0: aperture size is 256M, detected 7164k stolen memory
...

pciconf -lv snip:

vgapci0@pci0:0:2:0:class=0x03 card=0x02111028 chip=0x29b28086 rev=0x02 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82Q35 Express Integrated Graphics Controller'
    class = display
    subclass = VGA
vgapci1@pci0:0:2:1:class=0x038000 card=0x02111028 chip=0x29b38086 rev=0x02 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82Q35 Express Integrated Graphics Controller'
    class = display

Screen capture, install of xf86-video-intel:

# pkg install xf86-video-intel
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
    xf86-video-intel: 2.99.917.20180512

Number of packages to be installed: 1

The process will require 2 MiB more space.

Proceed with this action? [y/N]: y
[1/1] Installing xf86-video-intel-2.99.917.20180512...
[1/1] Extracting xf86-video-intel-2.99.917.20180512: 100%

If I do anything much different than the following, I get a black-screen
freeze, no network, no interrupts, dead keyboard. Only way out is
power-off/power-on, followed by single-user boot and fsck.
kldload i915
startx

which gets X going, but it uses VESA, not the intel driver. For example:

kldload i915kms --> black-screen freeze (BSF)
kldload drm2; kldload i915kms --> BSF
Don't load either --> BSF (xf86-video-intel installed, no drm or 915 modules)

HOWEVER: Exactly once, I forgot to kldload any of drm or i915kms or i915, and
then ran startx. Xorg used the intel driver and it seemed to work OK. It even
loaded i915kms and drm2 by itself, as was shown by kldstat after Xorg had shut
down. It never did that again! Unfortunately I can't find the Xorg log I think
I saved!

Some more detail, for the case where Xorg uses VESA:

kldstat right after boot:

Id Refs Address    Size    Name
 1    7 0x8020     20647f8 kernel
 2    1 0x82421000 1780    uhid.ko
 3    1 0x82423000 2328    ums.ko

kldstat right after loading i915.ko:

Id Refs Address    Size    Name
 1   13 0x8020     20647f8 kernel
 2    1 0x82421000 1780    uhid.ko
 3    1 0x82423000 2328    ums.ko
 4    1 0x82426000 6d44    i915.ko
 5    1 0x8242d000 10708   drm.ko

kldstat after startx and then exiting X:

Id Refs Address    Size    Name
 1   33 0x8020     20647f8 kernel
 2    1 0x82421000 1780    uhid.ko
 3    1 0x82423000 2328    ums.ko
 4    1 0x82426000 6d44    i915.ko
 5    1 0x8242d000 10708   drm.ko
 6    1 0x8243e000 7a2b8   i915kms.ko
 7    1 0x824b9000 3f8cc   drm2.ko
 8    4 0x824f9000 1ed0    iicbus.ko
 9    1 0x824fb000 e58     iic.ko
10    1 0x824fc000 1570    iicbb.ko

Note that Xorg loaded the last five itself.
all.log snip, evidently because of kldload i915:

Aug 31 15:02:00 dellox755 kernel: drm0: on vgapci0
Aug 31 15:02:00 dellox755 kernel: info: [drm] MSI enabled 1 message(s)
Aug 31 15:02:00 dellox755 kernel: info: [drm] AGP at 0xd000 256MB
Aug 31 15:02:00 dellox755 kernel: info: [drm] Initialized i915 1.6.0 20080730

all.log snip, evidently because of startx:

Aug 31 15:02:42 dellox755 kernel: info: [drm] Initialized drm 1.1.0 20060810
Aug 31 15:02:42 dellox755 kernel: drmn0: on vgapci0
Aug 31 15:02:42 dellox755 kernel: error: [drm:pid706:drm_get_minor] *ERROR* Failed to create cdev: 17
Aug 31 15:02:42 dellox755 kernel: device_attach: drmn0 attach returned -17

Xorg.0.log snips:

[ 3056.909] X.Org X Server 1.18.4 Release Date: 2016-07-19
...
[ 3057.252] (==) AIGLX enabled
[ 3057.253] (II) LoadModule: "intel"
[ 3057.253] (II) Loading /usr/local/lib/xorg/modules/drivers/intel_drv.so
[ 3057.335] (II) Module intel: vendor="X.Org Foundation"
[ 3057.335] compiled for 1.18.4, module version = 2.99.91
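One possible reading of the log above, offered as a guess rather than a diagnosis: error 17 is EEXIST, and it shows up right after the older drm.ko/i915.ko pair was loaded by hand, so the newer drm2/i915kms pair may have failed to create a device node that already existed. A minimal sketch for checking whether the older pair is loaded before running startx follows. The function name is made up for this example, and it only parses kldstat-style text from stdin.

```sh
#!/bin/sh
# Report the older DRM modules (drm.ko, i915.ko) found in kldstat-style
# output read from stdin. Module names are taken from the listings above;
# legacy_drm_loaded is a hypothetical helper name.
legacy_drm_loaded() {
    awk '$NF == "drm.ko" || $NF == "i915.ko" { print $NF; found = 1 }
         END { exit !found }'
}

# Typical use on the affected machine (not run here):
#   kldstat | legacy_drm_loaded && echo "older DRM modules are loaded"
```

The function exits 0 and prints the matching module names when either module is present, and exits non-zero otherwise, so it can gate a wrapper script around startx.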
Re: make kernel ctfmerge freeze on 11-STABLE
Completely cleaning out /usr/src and /usr/obj fixed it (both current and past revisions) On Mon, Jan 2, 2017 at 8:33 AM, Aryeh Friedman wrote: > > > On Mon, Jan 2, 2017 at 7:57 AM, Mateusz Guzik wrote: > >> On Mon, Jan 02, 2017 at 07:48:22AM -0500, Aryeh Friedman wrote: >> > On Mon, Jan 2, 2017 at 7:36 AM, Mateusz Guzik >> wrote: >> > >> > > On Mon, Jan 02, 2017 at 06:57:48AM -0500, Aryeh Friedman wrote: >> > > > FreeBSD lilith 11.0-STABLE FreeBSD 11.0-STABLE #7 r311003: Sun Jan >> 1 >> > > > 02:45:34 EST 2017 root@lilith:/usr/obj/usr/src/sys/GENERIC >> amd64 >> > > > >> > > > >> > > > -- >> > > > >>> stage 3.1: building everything >> > > > -- >> > > > cd /usr/obj/usr/src/sys/GENERIC; COMPILER_VERSION=30901 >> > > > COMPILER_TYPE=clang COMPILER_FREEBSD_VERSION=1100503 >> > > > MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=amd64 MACHINE=amd64 >> CPUTYPE= >> > > > GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin >> > > > GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font >> > > > GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac CC="cc >> > > -target >> > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp >> > > > -B/usr/obj/usr/src/tmp/usr/bin" CXX="c++ -target >> > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp >> > > > -B/usr/obj/usr/src/tmp/usr/bin" CPP="cpp -target >> > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp >> > > > -B/usr/obj/usr/src/tmp/usr/bin" AS="as" AR="ar" LD="ld" NM=nm >> > > > OBJDUMP=objdump OBJCOPY="objcopy" RANLIB=ranlib STRINGS= >> SIZE="size" >> > > > INSTALL="sh /usr/src/tools/install.sh" >> > > > PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/ >> > > src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/ >> > > usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/ >> > > sbin:/bin:/usr/sbin:/usr/bin >> > > > make -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ >> > > > linking kernel.full >> > > > ctfmerge -L VERSION -g -o kernel.full ... 
>> > > >
>> > >
>> > > How reproducible is the crash? What previous kernel was known to work?
>> > > Can you narrow it down to a particular revision, preferably with kernel
>> > > debugging enabled? (see the end of the mail)
>> > >
>> >
>> > It first appeared a few days ago (forget what revision) then disappeared
>> > the day after and reappeared yesterday. It is 100% reproducible (i.e.
>> > clearing out /usr/obj and doing a make kernel in either single or multiuser
>> > mode both cause it). Turning on debugging would be hard but perhaps I
>> > should slightly qualify "freeze": make freezes but the rest of the system
>> > is responsive and killing make leaves a zombie ctfmerge. If I still need
>> > kernel debugging based on the above I will do it but looking for an easier
>> > explanation first.
>> >
>>
>> I definitely don't run into anything of the sort and the problem
>> statement is quite vague.
>>
>> However, if the problem is indeed reproducible, the minimum you can do
>> is find the first revision where it started appearing and that would
>> definitely help with an investigation.
>>
> Any advice on how to do that since I update daily I can tell you when it
> started (the day) but not the actual revision ID.
>
> --
> Aryeh M. Friedman, Lead Developer, http://www.PetiteCloud.org

-- 
Aryeh M. Friedman, Lead Developer, http://www.PetiteCloud.org
Re: make kernel ctfmerge freeze on 11-STABLE
On Mon, Jan 02, 2017 at 08:33:29AM -0500, Aryeh Friedman wrote: > On Mon, Jan 2, 2017 at 7:57 AM, Mateusz Guzik wrote: > > > On Mon, Jan 02, 2017 at 07:48:22AM -0500, Aryeh Friedman wrote: > > > On Mon, Jan 2, 2017 at 7:36 AM, Mateusz Guzik wrote: > > > > > > > On Mon, Jan 02, 2017 at 06:57:48AM -0500, Aryeh Friedman wrote: > > > > > FreeBSD lilith 11.0-STABLE FreeBSD 11.0-STABLE #7 r311003: Sun Jan 1 > > > > > 02:45:34 EST 2017 root@lilith:/usr/obj/usr/src/sys/GENERIC > > amd64 > > > > > > > > > > > > > > > -- > > > > > >>> stage 3.1: building everything > > > > > -- > > > > > cd /usr/obj/usr/src/sys/GENERIC; COMPILER_VERSION=30901 > > > > > COMPILER_TYPE=clang COMPILER_FREEBSD_VERSION=1100503 > > > > > MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=amd64 MACHINE=amd64 > > CPUTYPE= > > > > > GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin > > > > > GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font > > > > > GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac CC="cc > > > > -target > > > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > > > -B/usr/obj/usr/src/tmp/usr/bin" CXX="c++ -target > > > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > > > -B/usr/obj/usr/src/tmp/usr/bin" CPP="cpp -target > > > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > > > -B/usr/obj/usr/src/tmp/usr/bin" AS="as" AR="ar" LD="ld" NM=nm > > > > > OBJDUMP=objdump OBJCOPY="objcopy" RANLIB=ranlib STRINGS= > > SIZE="size" > > > > > INSTALL="sh /usr/src/tools/install.sh" > > > > > PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/ > > > > src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/ > > > > usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/ > > > > sbin:/bin:/usr/sbin:/usr/bin > > > > > make -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ > > > > > linking kernel.full > > > > > ctfmerge -L VERSION -g -o kernel.full ... > > > > > > > > > > > > > How reproducible is the crash? 
What previous kernel was known to work?
> > > > Can you narrow it down to a particular revision, preferably with kernel
> > > > debugging enabled? (see the end of the mail)
> > > >
> > >
> > > It first appeared a few days ago (forget what revision) then disappeared
> > > the day after and reappeared yesterday. It is 100% reproducible (i.e.
> > > clearing out /usr/obj and doing a make kernel in either single or multiuser
> > > mode both cause it). Turning on debugging would be hard but perhaps I
> > > should slightly qualify "freeze": make freezes but the rest of the system
> > > is responsive and killing make leaves a zombie ctfmerge. If I still need
> > > kernel debugging based on the above I will do it but looking for an easier
> > > explanation first.
> > >
> >
> > I definitely don't run into anything of the sort and the problem
> > statement is quite vague.
> >
> > However, if the problem is indeed reproducible, the minimum you can do
> > is find the first revision where it started appearing and that would
> > definitely help with an investigation.
> >
> Any advice on how to do that since I update daily I can tell you when it
> started (the day) but not the actual revision ID.

Just get the source, e.g.:

svn checkout https://svn.freebsd.org/base/stable/11 /usr/src

You can then switch to a particular revision with svn update -r, e.g.:

svn update -r r310953

to switch to the revision prior to the cache merge.

Preferably, though, you would use git, as it allows easy bisection:
https://github.com/freebsd/freebsd, the branch is origin/stable/11.

-- 
Mateusz Guzik
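The search for the first bad revision can also be done by hand with svn as a plain binary search between a known-good and a known-bad revision. The following is only a sketch under stated assumptions: is_bad is a stub standing in for "update to that revision, rebuild, and see whether ctfmerge hangs", and FIRST_BAD is a made-up value that exists solely so the example runs anywhere. It does not claim that any particular revision is the real culprit.

```sh
#!/bin/sh
# Binary search for the first bad revision in the range (GOOD, BAD].
#
# is_bad is a placeholder. On a real system it would do something like
#   svn update -r "r$1" /usr/src && make -C /usr/src kernel
# with a clean /usr/obj, and report whether the build hung.
# FIRST_BAD is an invented value for demonstration only.
FIRST_BAD=${FIRST_BAD:-310959}

is_bad() {
    [ "$1" -ge "$FIRST_BAD" ]
}

# first_bad_revision GOOD BAD
# GOOD must be known to work, BAD must be known to fail.
first_bad_revision() {
    good=$1
    bad=$2
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if is_bad "$mid"; then
            bad=$mid
        else
            good=$mid
        fi
    done
    echo "$bad"
}

first_bad_revision 310953 311003   # prints 310959 with the stub above
```

With roughly 50 revisions between the two endpoints this takes about six builds. git bisect automates the same bookkeeping, which is why it is the more comfortable option.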
Re: make kernel ctfmerge freeze on 11-STABLE
On Mon, Jan 2, 2017 at 7:57 AM, Mateusz Guzik wrote: > On Mon, Jan 02, 2017 at 07:48:22AM -0500, Aryeh Friedman wrote: > > On Mon, Jan 2, 2017 at 7:36 AM, Mateusz Guzik wrote: > > > > > On Mon, Jan 02, 2017 at 06:57:48AM -0500, Aryeh Friedman wrote: > > > > FreeBSD lilith 11.0-STABLE FreeBSD 11.0-STABLE #7 r311003: Sun Jan 1 > > > > 02:45:34 EST 2017 root@lilith:/usr/obj/usr/src/sys/GENERIC > amd64 > > > > > > > > > > > > -- > > > > >>> stage 3.1: building everything > > > > -- > > > > cd /usr/obj/usr/src/sys/GENERIC; COMPILER_VERSION=30901 > > > > COMPILER_TYPE=clang COMPILER_FREEBSD_VERSION=1100503 > > > > MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=amd64 MACHINE=amd64 > CPUTYPE= > > > > GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin > > > > GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font > > > > GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac CC="cc > > > -target > > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > > -B/usr/obj/usr/src/tmp/usr/bin" CXX="c++ -target > > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > > -B/usr/obj/usr/src/tmp/usr/bin" CPP="cpp -target > > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > > -B/usr/obj/usr/src/tmp/usr/bin" AS="as" AR="ar" LD="ld" NM=nm > > > > OBJDUMP=objdump OBJCOPY="objcopy" RANLIB=ranlib STRINGS= > SIZE="size" > > > > INSTALL="sh /usr/src/tools/install.sh" > > > > PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/ > > > src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/ > > > usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/ > > > sbin:/bin:/usr/sbin:/usr/bin > > > > make -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ > > > > linking kernel.full > > > > ctfmerge -L VERSION -g -o kernel.full ... > > > > > > > > > > How reproducible is the crash? What previous kernel was known to work? > > > Can you narrow it down to a particular revision, preferably with kernel > > > debugging enabled? 
(see the end of the mail)
> > >
> >
> > It first appeared a few days ago (forget what revision) then disappeared
> > the day after and reappeared yesterday. It is 100% reproducible (i.e.
> > clearing out /usr/obj and doing a make kernel in either single or multiuser
> > mode both cause it). Turning on debugging would be hard but perhaps I
> > should slightly qualify "freeze": make freezes but the rest of the system
> > is responsive and killing make leaves a zombie ctfmerge. If I still need
> > kernel debugging based on the above I will do it but looking for an easier
> > explanation first.
> >
>
> I definitely don't run into anything of the sort and the problem
> statement is quite vague.
>
> However, if the problem is indeed reproducible, the minimum you can do
> is find the first revision where it started appearing and that would
> definitely help with an investigation.

Any advice on how to do that? Since I update daily, I can tell you the day it
started but not the actual revision ID.

-- 
Aryeh M. Friedman, Lead Developer, http://www.PetiteCloud.org
Re: make kernel ctfmerge freeze on 11-STABLE
On Mon, Jan 02, 2017 at 07:48:22AM -0500, Aryeh Friedman wrote: > On Mon, Jan 2, 2017 at 7:36 AM, Mateusz Guzik wrote: > > > On Mon, Jan 02, 2017 at 06:57:48AM -0500, Aryeh Friedman wrote: > > > FreeBSD lilith 11.0-STABLE FreeBSD 11.0-STABLE #7 r311003: Sun Jan 1 > > > 02:45:34 EST 2017 root@lilith:/usr/obj/usr/src/sys/GENERIC amd64 > > > > > > > > > -- > > > >>> stage 3.1: building everything > > > -- > > > cd /usr/obj/usr/src/sys/GENERIC; COMPILER_VERSION=30901 > > > COMPILER_TYPE=clang COMPILER_FREEBSD_VERSION=1100503 > > > MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=amd64 MACHINE=amd64 CPUTYPE= > > > GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin > > > GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font > > > GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac CC="cc > > -target > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > -B/usr/obj/usr/src/tmp/usr/bin" CXX="c++ -target > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > -B/usr/obj/usr/src/tmp/usr/bin" CPP="cpp -target > > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > > -B/usr/obj/usr/src/tmp/usr/bin" AS="as" AR="ar" LD="ld" NM=nm > > > OBJDUMP=objdump OBJCOPY="objcopy" RANLIB=ranlib STRINGS= SIZE="size" > > > INSTALL="sh /usr/src/tools/install.sh" > > > PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/ > > src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/ > > usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/ > > sbin:/bin:/usr/sbin:/usr/bin > > > make -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ > > > linking kernel.full > > > ctfmerge -L VERSION -g -o kernel.full ... > > > > > > > How reproducible is the crash? What previous kernel was known to work? > > Can you narrow it down to a particular revision, preferably with kernel > > debugging enabled? (see the end of the mail) > > > > It first appeared a few days ago (forget what revision) then disappeared > the day after and reappeared yesterday. 
It is 100% reproducible (i.e.
> clearing out /usr/obj and doing a make kernel in either single or multiuser
> mode both cause it). Turning on debugging would be hard but perhaps I
> should slightly qualify "freeze": make freezes but the rest of the system
> is responsive and killing make leaves a zombie ctfmerge. If I still need
> kernel debugging based on the above I will do it but looking for an easier
> explanation first.

I definitely don't run into anything of the sort and the problem
statement is quite vague.

However, if the problem is indeed reproducible, the minimum you can do
is find the first revision where it started appearing and that would
definitely help with an investigation.

-- 
Mateusz Guzik
Re: make kernel ctfmerge freeze on 11-STABLE
On Mon, Jan 2, 2017 at 7:36 AM, Mateusz Guzik wrote: > On Mon, Jan 02, 2017 at 06:57:48AM -0500, Aryeh Friedman wrote: > > FreeBSD lilith 11.0-STABLE FreeBSD 11.0-STABLE #7 r311003: Sun Jan 1 > > 02:45:34 EST 2017 root@lilith:/usr/obj/usr/src/sys/GENERIC amd64 > > > > > > -- > > >>> stage 3.1: building everything > > -- > > cd /usr/obj/usr/src/sys/GENERIC; COMPILER_VERSION=30901 > > COMPILER_TYPE=clang COMPILER_FREEBSD_VERSION=1100503 > > MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=amd64 MACHINE=amd64 CPUTYPE= > > GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin > > GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font > > GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac CC="cc > -target > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > -B/usr/obj/usr/src/tmp/usr/bin" CXX="c++ -target > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > -B/usr/obj/usr/src/tmp/usr/bin" CPP="cpp -target > > x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp > > -B/usr/obj/usr/src/tmp/usr/bin" AS="as" AR="ar" LD="ld" NM=nm > > OBJDUMP=objdump OBJCOPY="objcopy" RANLIB=ranlib STRINGS= SIZE="size" > > INSTALL="sh /usr/src/tools/install.sh" > > PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/ > src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/ > usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/ > sbin:/bin:/usr/sbin:/usr/bin > > make -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ > > linking kernel.full > > ctfmerge -L VERSION -g -o kernel.full ... > > > > How reproducible is the crash? What previous kernel was known to work? > Can you narrow it down to a particular revision, preferably with kernel > debugging enabled? (see the end of the mail) > It first appeared a few days ago (forget what revision) then disappeared the day after and reappeared yesterday. It is 100% reproducible (i.e. 
clearing out /usr/obj and doing a make kernel in either single or multiuser
mode both cause it). Turning on debugging would be hard, but perhaps I should
slightly qualify "freeze": make freezes but the rest of the system is
responsive, and killing make leaves a zombie ctfmerge. If I still need kernel
debugging based on the above I will do it, but I am looking for an easier
explanation first.

-- 
Aryeh M. Friedman, Lead Developer, http://www.PetiteCloud.org
Re: make kernel ctfmerge freeze on 11-STABLE
On Mon, Jan 02, 2017 at 01:36:31PM +0100, Mateusz Guzik wrote:
> On Mon, Jan 02, 2017 at 06:57:48AM -0500, Aryeh Friedman wrote:
> > FreeBSD lilith 11.0-STABLE FreeBSD 11.0-STABLE #7 r311003: Sun Jan 1
> > 02:45:34 EST 2017 root@lilith:/usr/obj/usr/src/sys/GENERIC amd64
> > ...
> > make -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ
> > linking kernel.full
> > ctfmerge -L VERSION -g -o kernel.full ...
> >
>
> How reproducible is the crash? What previous kernel was known to work?
> Can you narrow it down to a particular revision, preferably with kernel
> debugging enabled? (see the end of the mail)

FWIW, I did not see anything approaching such a freeze, either on my
build machine or my laptop, during the just-completed upgrade from:

FreeBSD g1-252.catwhisker.org 11.0-STABLE FreeBSD 11.0-STABLE #209 r311007M/311007:1100508: Sun Jan 1 03:51:25 PST 2017 r...@g1-252.catwhisker.org:/common/S1/obj/usr/src/sys/CANARY amd64

to:

FreeBSD g1-252.catwhisker.org 11.0-STABLE FreeBSD 11.0-STABLE #210 r311047M/311097:1100508: Mon Jan 2 04:23:25 PST 2017 r...@g1-252.catwhisker.org:/common/S1/obj/usr/src/sys/CANARY amd64

(Or any prior upgrade, that I recall).

Peace,
david
-- 
David H. Wolfskill da...@catwhisker.org
Epistemology for post-truthers: How do we select parts of reality to ignore?

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Re: make kernel ctfmerge freeze on 11-STABLE
On Mon, Jan 02, 2017 at 06:57:48AM -0500, Aryeh Friedman wrote:
> FreeBSD lilith 11.0-STABLE FreeBSD 11.0-STABLE #7 r311003: Sun Jan 1
> 02:45:34 EST 2017 root@lilith:/usr/obj/usr/src/sys/GENERIC amd64
>
>
> --
> >>> stage 3.1: building everything
> --
> cd /usr/obj/usr/src/sys/GENERIC; COMPILER_VERSION=30901
> COMPILER_TYPE=clang COMPILER_FREEBSD_VERSION=1100503
> MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=amd64 MACHINE=amd64 CPUTYPE=
> GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin
> GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font
> GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac CC="cc -target
> x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp
> -B/usr/obj/usr/src/tmp/usr/bin" CXX="c++ -target
> x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp
> -B/usr/obj/usr/src/tmp/usr/bin" CPP="cpp -target
> x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp
> -B/usr/obj/usr/src/tmp/usr/bin" AS="as" AR="ar" LD="ld" NM=nm
> OBJDUMP=objdump OBJCOPY="objcopy" RANLIB=ranlib STRINGS= SIZE="size"
> INSTALL="sh /usr/src/tools/install.sh"
> PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin
> make -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ
> linking kernel.full
> ctfmerge -L VERSION -g -o kernel.full ...
>

How reproducible is the crash? What previous kernel was known to work?
Can you narrow it down to a particular revision, preferably with kernel
debugging enabled? (see the end of the mail)

There was one invasive change merged - fine-grained namecache in r310959
and that can be treated as the likely culprit. That said, I would start
the search with verifying there are no issues with r310953 first.

Debug opts:

options   KDB                      # Enable kernel debugger support.
options   KDB_TRACE                # Print a stack trace for a panic.
# For full debugger support use (turn off in stable branch):
options   DDB                      # Support DDB.
options   GDB                      # Support remote GDB.
options   INVARIANTS               # Enable calls of extra sanity checking
options   INVARIANT_SUPPORT        # Extra sanity checks of internal structures, required by INVARIANTS
options   WITNESS                  # Enable checks to detect deadlocks and cycles
options   WITNESS_SKIPSPIN         # Don't run witness on spinlocks for speed
options   MALLOC_DEBUG_MAXZONES=8  # Separate malloc(9) zones
options   DEBUG_VFS_LOCKS

-- 
Mateusz Guzik
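One way to try the debug options above without editing GENERIC is a small config file that includes GENERIC and adds them. This is a sketch, not the poster's setup: the config name DEBUG and the CONF_DIR variable are assumptions. On an amd64 source tree the directory would normally be /usr/src/sys/amd64/conf. The sketch defaults to a temporary directory so it can be run safely anywhere. KDB and KDB_TRACE are left out on the assumption that GENERIC already sets them; add them if your base config does not.

```sh
#!/bin/sh
# Write a kernel config that layers the debug options from the mail on top
# of GENERIC. CONF_DIR and the name DEBUG are assumptions for this sketch.
CONF_DIR=${CONF_DIR:-$(mktemp -d)}
CONF_NAME=DEBUG

cat > "$CONF_DIR/$CONF_NAME" <<'EOF'
include GENERIC
ident DEBUG

options DDB
options GDB
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options WITNESS_SKIPSPIN
options MALLOC_DEBUG_MAXZONES=8
options DEBUG_VFS_LOCKS
EOF

echo "wrote $CONF_DIR/$CONF_NAME"

# Then, with CONF_DIR pointing at the real conf directory (not run here):
#   make -C /usr/src kernel KERNCONF=DEBUG
```

Keeping the additions in a separate file means a later svn update of GENERIC does not conflict with local changes.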
make kernel ctfmerge freeze on 11-STABLE
FreeBSD lilith 11.0-STABLE FreeBSD 11.0-STABLE #7 r311003: Sun Jan 1
02:45:34 EST 2017 root@lilith:/usr/obj/usr/src/sys/GENERIC amd64

--
>>> stage 3.1: building everything
--
cd /usr/obj/usr/src/sys/GENERIC; COMPILER_VERSION=30901
COMPILER_TYPE=clang COMPILER_FREEBSD_VERSION=1100503
MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=amd64 MACHINE=amd64 CPUTYPE=
GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin
GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font
GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac CC="cc -target
x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp
-B/usr/obj/usr/src/tmp/usr/bin" CXX="c++ -target
x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp
-B/usr/obj/usr/src/tmp/usr/bin" CPP="cpp -target
x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp
-B/usr/obj/usr/src/tmp/usr/bin" AS="as" AR="ar" LD="ld" NM=nm
OBJDUMP=objdump OBJCOPY="objcopy" RANLIB=ranlib STRINGS= SIZE="size"
INSTALL="sh /usr/src/tools/install.sh"
PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin
make -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ
linking kernel.full
ctfmerge -L VERSION -g -o kernel.full ...

-- 
Aryeh M. Friedman, Lead Developer, http://www.PetiteCloud.org
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Thu, Aug 27, 2015 at 11:29:28AM +0200, Johann Hugo wrote:
> It's working for me so far and I haven't seen any watchdog timeouts.
> With 10.2-RELEASE I got timeouts and lost connectivity in less than a
> minute.
>

Ok, great. Committed in r287238.
Thanks again.

> Johann
>
> On Wed, Aug 26, 2015 at 10:28 AM, Yonghyeon PYUN wrote:
> > On Wed, Aug 26, 2015 at 10:06:29AM +0200, Johann Hugo wrote:
> >> 10.2-RELEASE does not work for me. It works for a very short while and
> >> then it stops with "msk0 watchdog timeout" errors
> >>
> >
> > Thanks a lot for your report. This is the first report for
> > msk(4) watchdog timeouts on 10.2-RELEASE.
> >
> >> I'm not sure what patch Roosevelt was talking about, but the patch in
> >> this thread works for me:
> >> https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html
> >>
> >> I've changed MSK_STAT_ALIGN from 4096 to 8192 in if_mskreg.h and it's
> >> been running stable for the last week.
> >>
> >
> > I see. I'm under the impression that RX/TX descriptor ring
> > alignment shall trigger the same issue so it would be better to
> > know how attached patch works on your box.
> >
> > Thanks.
> >
> >> Johann
> >>
> >> On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN wrote:
> >> > On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
> >> >> Hi,
> >> >> So, I can confirm with the attached patch. I have a working msk0 that
> >> >> hasn't failed for the past month. I considered this problem fix for me.
> >> >> Since, I have went a long time without any problems. Thanks!
> >> >
> >> > I'm not sure which patch you used. Given that users reported
> >> > 10.2-RELEASE works, it would be great if you revert local patch
> >> > and try it again on 10.2-RELEASE.
Re: msk msk0 watchdog timeout freeze hang lock stop problem
It's working for me so far and I haven't seen any watchdog timeouts.
With 10.2-RELEASE I got timeouts and lost connectivity in less than a
minute.

Johann

On Wed, Aug 26, 2015 at 10:28 AM, Yonghyeon PYUN wrote:
> On Wed, Aug 26, 2015 at 10:06:29AM +0200, Johann Hugo wrote:
>> 10.2-RELEASE does not work for me. It works for a very short while and
>> then it stops with "msk0 watchdog timeout" errors
>>
>
> Thanks a lot for your report. This is the first report for
> msk(4) watchdog timeouts on 10.2-RELEASE.
>
>> I'm not sure what patch Roosevelt was talking about, but the patch in
>> this thread works for me:
>> https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html
>>
>> I've changed MSK_STAT_ALIGN from 4096 to 8192 in if_mskreg.h and it's
>> been running stable for the last week.
>>
>
> I see. I'm under the impression that RX/TX descriptor ring
> alignment shall trigger the same issue so it would be better to
> know how attached patch works on your box.
>
> Thanks.
>
>> Johann
>>
>> On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN wrote:
>> > On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
>> >> Hi,
>> >> So, I can confirm with the attached patch. I have a working msk0 that
>> >> hasn't failed for the past month. I considered this problem fix for me.
>> >> Since, I have went a long time without any problems. Thanks!
>> >
>> > I'm not sure which patch you used. Given that users reported
>> > 10.2-RELEASE works, it would be great if you revert local patch
>> > and try it again on 10.2-RELEASE.
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Wed, Aug 26, 2015 at 10:06:29AM +0200, Johann Hugo wrote: > 10.2-RELEASE does not work for me. It works for a very short while and > then it stops with "msk0 watchdog timeout" errors > Thanks a lot for your report. This is the first report for msk(4) watchdog timeouts on 10.2-RELEASE. > I'm not sure what patch Roosevelt was talking about, but the patch in > this thread works for me: > https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html > > I've changed MSK_STAT_ALIGN from 4096 to 8192 in if_mskreg.h and it's > been running stable for the last week. > I see. I'm under the impression that RX/TX descriptor ring alignment shall trigger the same issue so it would be better to know how attached patch works on your box. Thanks. > Johann > > On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN wrote: > > On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote: > >> Hi, > >> So, I can confirm with the attached patch. I have a working msk0 that > >> hasn't failed for the past month. I considered this problem fix for me. > >> Since, I have went a long time without any problems. Thanks! > > > > I'm not sure which patch you used. Given that users reported > > 10.2-RELEASE works, it would be great if you revert local patch > > and try it again on 10.2-RELEASE. Index: sys/dev/msk/if_mskreg.h === --- sys/dev/msk/if_mskreg.h (revision 281587) +++ sys/dev/msk/if_mskreg.h (working copy) @@ -2175,13 +2175,8 @@ #define MSK_ADDR_LO(x) ((uint64_t) (x) & 0xUL) #define MSK_ADDR_HI(x) ((uint64_t) (x) >> 32) -/* - * At first I guessed 8 bytes, the size of a single descriptor, would be - * required alignment constraints. But, it seems that Yukon II have 4096 - * bytes boundary alignment constraints. 
- */ -#define MSK_RING_ALIGN 4096 -#define MSK_STAT_ALIGN 4096 +#define MSK_RING_ALIGN 32768 +#define MSK_STAT_ALIGN 32768 /* Rx descriptor data structure */ struct msk_rx_desc { ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: msk msk0 watchdog timeout freeze hang lock stop problem
10.2-RELEASE does not work for me. It works for a very short while and then it stops with "msk0 watchdog timeout" errors I'm not sure what patch Roosevelt was talking about, but the patch in this thread works for me: https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html I've changed MSK_STAT_ALIGN from 4096 to 8192 in if_mskreg.h and it's been running stable for the last week. Johann On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN wrote: > On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote: >> Hi, >> So, I can confirm with the attached patch. I have a working msk0 that >> hasn't failed for the past month. I considered this problem fix for me. >> Since, I have went a long time without any problems. Thanks! > > I'm not sure which patch you used. Given that users reported > 10.2-RELEASE works, it would be great if you revert local patch > and try it again on 10.2-RELEASE. > ___ > freebsd-stable@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org" ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
> Hi,
> So, I can confirm with the attached patch. I have a working msk0 that
> hasn't failed for the past month. I considered this problem fix for me.
> Since, I have went a long time without any problems. Thanks!

I'm not sure which patch you used. Given that users have reported that
10.2-RELEASE works, it would be great if you could revert your local
patch and try again on 10.2-RELEASE.
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On 08/12/2015 04:44 PM, Roosevelt Littleton wrote:
> Hi,
> So, I can confirm with the attached patch. I have a working msk0 that
> hasn't failed for the past month. I considered this problem fix for me.
> Since, I have went a long time without any problems. Thanks!
>
> Roosevelt

It has worked for me, too, since 10.2-RC1; I am now on 10.2-RELEASE and
still don't use any patches.

-Alnis
Re: msk msk0 watchdog timeout freeze hang lock stop problem
Hi,

I can confirm that it works with the attached patch. I have a working
msk0 that hasn't failed for the past month. I consider this problem fixed
for me, since I have gone a long time without any problems. Thanks!

Roosevelt
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On 07/26/2015 01:40 PM, Yonghyeon PYUN wrote: On Sat, Jul 25, 2015 at 02:08:10PM +0300, Alnis Morics wrote: Just tried 10.2-RC1 amd64 GENERIC, and the problem seems to be gone. I was even able to scp a 500 MB file. Could it be related to this fix in BETA2, as mentioned in the announcement, "The watchdog(4) device has been fixed to print to the correct buffer."? msk(4) will show watchdog timeouts when it detects driver TX path is in stuck condition but I believe this has nothing to do with watchdog(4). There was no msk(4) code change in 10.2-RC1. If you happen to see the watchdog timeouts again, please try attached patch and let me know whether it makes any difference for you. I didn't get much feedbacks on the patch so I'm not sure whether it really fixes the root cause. pciconf -lv [..] mskc0@pci0:9:0:0:class=0x02 card=0xc072144d chip=0x435411ab rev=0x00 hdr=0x00 vendor = 'Marvell Technology Group Ltd.' device = '88E8040 PCI-E Fast Ethernet Controller' class = network subclass = ethernet ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org" Thanks, Pyun. If the watchdog timeouts reappear, I'll try the patch and give notice about the results. -Alnis ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Sat, Jul 25, 2015 at 02:08:10PM +0300, Alnis Morics wrote: > Just tried 10.2-RC1 amd64 GENERIC, and the problem seems to be gone. I > was even able to scp a 500 MB file. Could it be related to this fix in > BETA2, as mentioned in the announcement, "The watchdog(4) device has > been fixed to print to the correct buffer."? > msk(4) will show watchdog timeouts when it detects driver TX path is in stuck condition but I believe this has nothing to do with watchdog(4). There was no msk(4) code change in 10.2-RC1. If you happen to see the watchdog timeouts again, please try attached patch and let me know whether it makes any difference for you. I didn't get much feedbacks on the patch so I'm not sure whether it really fixes the root cause. > pciconf -lv > [..] > mskc0@pci0:9:0:0:class=0x02 card=0xc072144d chip=0x435411ab > rev=0x00 hdr=0x00 > vendor = 'Marvell Technology Group Ltd.' > device = '88E8040 PCI-E Fast Ethernet Controller' > class = network > subclass = ethernet > > Index: sys/dev/msk/if_mskreg.h === --- sys/dev/msk/if_mskreg.h (revision 281587) +++ sys/dev/msk/if_mskreg.h (working copy) @@ -2175,13 +2175,8 @@ #define MSK_ADDR_LO(x) ((uint64_t) (x) & 0xUL) #define MSK_ADDR_HI(x) ((uint64_t) (x) >> 32) -/* - * At first I guessed 8 bytes, the size of a single descriptor, would be - * required alignment constraints. But, it seems that Yukon II have 4096 - * bytes boundary alignment constraints. - */ -#define MSK_RING_ALIGN 4096 -#define MSK_STAT_ALIGN 4096 +#define MSK_RING_ALIGN 32768 +#define MSK_STAT_ALIGN 32768 /* Rx descriptor data structure */ struct msk_rx_desc { ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
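As noted in the reply above, msk(4) reports a watchdog timeout when its own TX path appears stuck, which is unrelated to watchdog(4). A minimal sketch of that general countdown pattern follows. The names (`tx_softc`, `tx_enqueue`, `tx_complete`, `tx_tick`) and the 5-tick budget are assumptions for illustration and are not taken from the msk(4) source.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_TIMEOUT_TICKS 5	/* hypothetical budget, in periodic ticks */

struct tx_softc {
	int watchdog_timer;	/* 0 means the watchdog is disarmed */
	int pending;		/* frames handed to hardware, not yet completed */
};

/* Queue one frame for transmission and (re)arm the watchdog. */
static void
tx_enqueue(struct tx_softc *sc)
{
	sc->pending++;
	sc->watchdog_timer = TX_TIMEOUT_TICKS;
}

/* Hardware reported one completed frame; disarm when nothing is pending. */
static void
tx_complete(struct tx_softc *sc)
{
	assert(sc->pending > 0);
	sc->pending--;
	if (sc->pending == 0)
		sc->watchdog_timer = 0;
}

/* Called once per tick; returns true when the TX path looks stuck. */
static bool
tx_tick(struct tx_softc *sc)
{
	if (sc->watchdog_timer == 0 || --sc->watchdog_timer > 0)
		return false;
	return true;	/* timer just expired with frames still pending */
}
```

In this sketch a timeout is reported only when frames were queued and no completion arrived within the budget, which is the situation a stalled status ring would produce.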
Re: msk msk0 watchdog timeout freeze hang lock stop problem
0: msk_handle_events: Break #5 cons=196 csrread=197 mskc0: msk_handle_events: Break #5 cons=197 csrread=198 ... mskc0: msk_handle_events: Break #5 cons=510 csrread=511 mskc0: msk_handle_events: Break #5 cons=511 csrread=512 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 ... mskc0: msk_handle_events: Break #1 cons=512 csrread=519 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=519 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 ...etc From: owner-freebsd-sta...@freebsd.org [owner-freebsd-sta...@freebsd.org] on behalf of Yonghyeon PYUN [pyu...@gmail.com] Sent: 13 April 2015 09:13 To: Gareth Wyn Roberts Cc: freebsd-stable@freebsd.org Subject: Re: msk msk0 watchdog timeout freeze hang lock stop problem On Sun, Apr 12, 2015 at 05:57:34PM +, Gareth Wyn Roberts wrote: I've run in to problems using the msk device where initially it works well enough to set DHCP etc. but stops/freezes as soon as any appreciable network traffic occurs . There are several threads describing similar symptoms over the past two years or more. 
I've been following several false leads but have finally found a solution (at least it solves my problem). I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as: mskc0: mem 0xfa00-0xfa003fff irq 19 at device 0.0 on pci6 msk0: on mskc0 msk0: Ethernet address: 00:13:77:e9:df:eb miibus0: on msk0 e1000phy0: PHY 0 on miibus0 e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-ma ster, auto, auto-flow The network worked when using the i386 release, but failed for the amd64 release (as reported previously) which prompted me to disable 64-bit DMA (the patch for this is attached below). This worked for the first kernel built but mysteriously failed when another unrelated part of the kernel was changed (a usb driver) and the kernel recompiled. So identical msk driver code worked in one kernel but not the second! This suggested that alignment differences between the two kernels were causing the msk driver to fail. Others have reported varying behaviour depending on different circumstances. It transpires that changing just one value in the if_mskreg.h file solved all my problems. Subsequently I have not been able to make it fail under heavy network traffic in either 32-bit or 64-bit mode. I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and if_mskreg.h revision 264442. Thanks for letting me know your findings. I really appreciate that. I recall that the alignment requirement of status LEs(List Elements in Marvell terms) is 2048 and the maximum size of the status LEs is 4096 bytes(Actual alignment seems to be much lower value like 32 or 64 bytes, but alignment 2048 is chosen to avoid silicon bugs). Later experiments showed some variants of Yukon II require 4096 bytes alignment and I changed the alignment to 4096 in the past. It seems your finding indicates msk(4) needs 8192 alignment for status LEs. 
However this does not explain how and why the same code in 8.x/9.x works well. In addition, it's not common to require alignment size greater than PAGE_SIZE on x86 given that the maximum size of DMA buffer is 4096 bytes. I have to check whether there was a change in bus_dma(9) between 8.x/9.x and 10.x but it needs more time due to lack of spare time. Probably you can verify the DMA address of status LEs meets the following requirements both on i386 and amd64. - Alignment is 4096. - Number of DMA segment is 1. - DMA segment base address plus DMA segment size does not cross a PAGE_SIZE boundary. Here's the patch to if_mskreg.h --- if_mskreg.h-orig2014-11-11 20:02:58.0 + +++ if_mskreg.h 2015-04-12 18:47:20.0 +0100 @@ -2179,9 +2179,11 @@ * At first I guessed 8 bytes, the size of a single descriptor, would be * required alignment cons
Heads-Up: stable/10 freeze in effect
For those not subscribed to svn commit email, the code freeze for the
upcoming 10.2-RELEASE is now in effect.

The full schedule as it stands now is available here:

https://www.FreeBSD.org/releases/10.2R/schedule.html

If you are aware of an issue that affects stable/10 that does not have a
corresponding PR, please file a bug report so we do not lose track.

Thank you.

Glen
On behalf of: re@
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Wed, Apr 15, 2015 at 09:52:09PM +, Gareth Wyn Roberts wrote: > I've inserted code to print some values which show the differences between > specifying 4096 or 8192 for MSK_STAT_ALIGN. In both cases the status buffer > has length 0x4000 (8x2048=16K) but the alignments are different as expected, > respectively start addresses 0x5c3b000 or 0xbdc2c000. > > The following values were output from functions msk_status_dma_alloc(), > msk_dmamap_cb() and msk_handle_events(). > The "Break #n" refer to breaks in msk_handle_events(). "#1" occurs if > ((control & HW_OWNER) == 0), "#5" is OP_RXSTAT and "#6" is OP_TXINDEXLE. > > The first output is for MSK_STAT_ALIGN=8192. It continues normally. > Although not shown here, it reaches cons=2047 then cons=0 as expected. > > The second output is for MSK_STAT_ALIGN=4096. Although there can be isolated > occurences of "Break #1" (e.g. cons=196) (?are these to be expected?), it > continues normally until cons=512. At this point it continually invokes the > "#1" block because the msk_control from msk_stat_ring[512] is always zero and > the network hangs immediately. This suggests the Yukon Ultra 2 88E8057 can't > access the next 4096 memory block, but why not? > Yes, it seems the status LE block is not updated at all for MSK_STAT_ALIGN == 4096 and some elements of the status block looks suspicious(put index increases but the value in the location is 0). I vaguely guess this indicates there are DMA alignment and/or DMA boundary issues. The maximum number of elements of the status block is 4096 so the maximum size of the status block is 32KB. For i386, msk(4) uses 8KB status block(1024 elements). For 64bit architectures, the block size is increased to 16KB(2048 elements). Probably the safe alignment value for the status block would be 32K. This looks excessive value to me but it shall avoid guessing DMA boundary issue. > Please let me know if any further information would be helpful. > Thanks a lot. 
I've attached a diff which sets the alignment of TX/RX ring and status
block to 32KB. Not sure whether this also addresses other msk(4) related
watchdog timeouts.

Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h	(revision 281587)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)	((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)	((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN	4096
-#define MSK_STAT_ALIGN	4096
+#define MSK_RING_ALIGN	32768
+#define MSK_STAT_ALIGN	32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {
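The sizes quoted above imply 8-byte status elements (8KB / 1024, 16KB / 2048, 32KB / 4096), so index 512, where the MSK_STAT_ALIGN=4096 run stalled, sits exactly 4096 bytes into the block. A small sketch of that arithmetic follows. The element size is inferred from those figures, and the helper names are hypothetical and do not appear in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STAT_LE_SIZE 8u	/* bytes per status element, inferred from 16KB / 2048 */

/* Size in bytes of a status block holding `count` elements. */
static uint32_t
stat_block_bytes(uint32_t count)
{
	return count * STAT_LE_SIZE;
}

/* Byte offset of element `index` from the start of the status block. */
static uint32_t
stat_le_offset(uint32_t index)
{
	return index * STAT_LE_SIZE;
}

/* True when `addr` is a multiple of `align`; `align` must be a power of two. */
static bool
is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}
```

Any address aligned to 32768 is also aligned to 4096 and 8192, which is why the larger value is the conservative choice even if it looks excessive.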
RE: msk msk0 watchdog timeout freeze hang lock stop problem
0: msk_handle_events: Break #5 cons=197 csrread=198 ... mskc0: msk_handle_events: Break #5 cons=510 csrread=511 mskc0: msk_handle_events: Break #5 cons=511 csrread=512 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=513 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 ... mskc0: msk_handle_events: Break #1 cons=512 csrread=519 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 mskc0: msk_handle_events: Break #1 cons=512 csrread=519 mskc0: msk_handle_events: sd=0xfe011e23c000 sd->msk_control=0 control=0 ...etc From: owner-freebsd-sta...@freebsd.org [owner-freebsd-sta...@freebsd.org] on behalf of Yonghyeon PYUN [pyu...@gmail.com] Sent: 13 April 2015 09:13 To: Gareth Wyn Roberts Cc: freebsd-stable@freebsd.org Subject: Re: msk msk0 watchdog timeout freeze hang lock stop problem On Sun, Apr 12, 2015 at 05:57:34PM +, Gareth Wyn Roberts wrote: > I've run in to problems using the msk device where initially it works well > enough to set DHCP etc. but stops/freezes as soon as any appreciable network > traffic occurs . There are several threads describing similar symptoms over > the past two years or more. 
I've been following several false leads but have > finally found a solution (at least it solves my problem). > > I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as: > > mskc0: mem 0xfa00-0xfa003fff irq > 19 at device 0.0 on pci6 > msk0: on mskc0 > msk0: Ethernet address: 00:13:77:e9:df:eb > miibus0: on msk0 > e1000phy0: PHY 0 on miibus0 > e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-ma > ster, auto, auto-flow > > The network worked when using the i386 release, but failed for the amd64 > release (as reported previously) which prompted me to disable 64-bit DMA (the > patch for this is attached below). This worked for the first kernel built > but mysteriously failed when another unrelated part of the kernel was changed > (a usb driver) and the kernel recompiled. So identical msk driver code > worked in one kernel but not the second! This suggested that alignment > differences between the two kernels were causing the msk driver to fail. > Others have reported varying behaviour depending on different circumstances. > > It transpires that changing just one value in the if_mskreg.h file solved all > my problems. Subsequently I have not been able to make it fail under heavy > network traffic in either 32-bit or 64-bit mode. > I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and > if_mskreg.h revision 264442. Thanks for letting me know your findings. I really appreciate that. I recall that the alignment requirement of status LEs(List Elements in Marvell terms) is 2048 and the maximum size of the status LEs is 4096 bytes(Actual alignment seems to be much lower value like 32 or 64 bytes, but alignment 2048 is chosen to avoid silicon bugs). Later experiments showed some variants of Yukon II require 4096 bytes alignment and I changed the alignment to 4096 in the past. It seems your finding indicates msk(4) needs 8192 alignment for status LEs. 
However this does not explain how and why the same code in 8.x/9.x works well. In addition, it's not common to require alignment size greater than PAGE_SIZE on x86 given that the maximum size of DMA buffer is 4096 bytes. I have to check whether there was a change in bus_dma(9) between 8.x/9.x and 10.x but it needs more time due to lack of spare time. Probably you can verify the DMA address of status LEs meets the following requirements both on i386 and amd64. - Alignment is 4096. - Number of DMA segment is 1. - DMA segment base address plus DMA segment size does not cross a PAGE_SIZE boundary. > > Here's the patch to if_mskreg.h > --- if_mskreg.h-orig2014-11-11 20:02:58.0 + > +++ if_mskreg.h 2015-04-12 18:47:20.0 +0100 > @@ -21
msk msk0 watchdog timeout freeze hang lock stop problem
Hm... I patched if_msk.c with if_msk.c.rev262524.dma.diff
(attachment-001.bin) and if_mskreg.h with if_mskreg.h.rev264442.dma.diff
(attachment-002.bin), and nothing changed: scp'ing 50 MB soon got
"stalled" and ended up with "broken pipe", as it was before.

I have 10.1-RELEASE-p9 amd64.

pciconf -lv:
[..]
mskc0@pci0:9:0:0: class=0x02 card=0xc072144d chip=0x435411ab rev=0x00 hdr=0x00
    vendor     = 'Marvell Technology Group Ltd.'
    device     = '88E8040 PCI-E Fast Ethernet Controller'
    class      = network
    subclass   = ethernet

Alnis
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Sun, Apr 12, 2015 at 05:57:34PM +, Gareth Wyn Roberts wrote: > I've run in to problems using the msk device where initially it works well > enough to set DHCP etc. but stops/freezes as soon as any appreciable network > traffic occurs . There are several threads describing similar symptoms over > the past two years or more. I've been following several false leads but have > finally found a solution (at least it solves my problem). > > I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as: > > mskc0: mem 0xfa00-0xfa003fff irq > 19 at device 0.0 on pci6 > msk0: on mskc0 > msk0: Ethernet address: 00:13:77:e9:df:eb > miibus0: on msk0 > e1000phy0: PHY 0 on miibus0 > e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-ma > ster, auto, auto-flow > > The network worked when using the i386 release, but failed for the amd64 > release (as reported previously) which prompted me to disable 64-bit DMA (the > patch for this is attached below). This worked for the first kernel built > but mysteriously failed when another unrelated part of the kernel was changed > (a usb driver) and the kernel recompiled. So identical msk driver code > worked in one kernel but not the second! This suggested that alignment > differences between the two kernels were causing the msk driver to fail. > Others have reported varying behaviour depending on different circumstances. > > It transpires that changing just one value in the if_mskreg.h file solved all > my problems. Subsequently I have not been able to make it fail under heavy > network traffic in either 32-bit or 64-bit mode. > I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and > if_mskreg.h revision 264442. Thanks for letting me know your findings. I really appreciate that. 
I recall that the alignment requirement of status LEs(List Elements in Marvell terms) is 2048 and the maximum size of the status LEs is 4096 bytes(Actual alignment seems to be much lower value like 32 or 64 bytes, but alignment 2048 is chosen to avoid silicon bugs). Later experiments showed some variants of Yukon II require 4096 bytes alignment and I changed the alignment to 4096 in the past. It seems your finding indicates msk(4) needs 8192 alignment for status LEs. However this does not explain how and why the same code in 8.x/9.x works well. In addition, it's not common to require alignment size greater than PAGE_SIZE on x86 given that the maximum size of DMA buffer is 4096 bytes. I have to check whether there was a change in bus_dma(9) between 8.x/9.x and 10.x but it needs more time due to lack of spare time. Probably you can verify the DMA address of status LEs meets the following requirements both on i386 and amd64. - Alignment is 4096. - Number of DMA segment is 1. - DMA segment base address plus DMA segment size does not cross a PAGE_SIZE boundary. > > Here's the patch to if_mskreg.h > --- if_mskreg.h-orig2014-11-11 20:02:58.0 + > +++ if_mskreg.h 2015-04-12 18:47:20.0 +0100 > @@ -2179,9 +2179,11 @@ > * At first I guessed 8 bytes, the size of a single descriptor, would be > * required alignment constraints. But, it seems that Yukon II have 4096 > * bytes boundary alignment constraints. > + * And it seems that the DMA status region for the Yukon Ultra 2 (88E8057) > + * requires 8192 byte alignment to prevent locking. > */ > #define MSK_RING_ALIGN 4096 > -#defineMSK_STAT_ALIGN 4096 > +#defineMSK_STAT_ALIGN 8192 > > > The patches to both files which also implement a MSK_64BIT_DMA_DISABLE flag > are attached. Perhaps the developers would consider committing these as it > may be useful for future debugging. > If you have more than 4GB memory installed and disables 64bit DMA addressing, msk(4) shall use bounce buffers. 
Passing packets through bounce buffers involves a copy operation, which
is expensive. You can check the hw.busdma sysctl node to see whether any
drivers are using bounce buffers. And if you want to disable 64-bit DMA
on 64-bit architectures, add '#undef MSK_64BIT_DMA' just below the
BUS_SPACE_MAXADDR check in if_mskreg.h.
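The three conditions listed earlier in this reply for the status LE DMA mapping (alignment, a single segment, and no boundary crossing) could be checked with a helper along these lines. This is only a sketch: `dma_seg` and `dma_mapping_ok` are hypothetical names, not bus_dma(9) interfaces, and the alignment and boundary values are supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA segment (bus address and length). */
struct dma_seg {
	uint64_t addr;
	uint64_t len;
};

static bool
is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Returns true when the mapping is a single non-empty segment that starts
 * on an `align` boundary and does not cross a `boundary`-sized window.
 */
static bool
dma_mapping_ok(const struct dma_seg *segs, int nsegs, uint64_t align,
    uint64_t boundary)
{
	assert(is_pow2(align) && is_pow2(boundary));
	if (nsegs != 1 || segs[0].len == 0)
		return false;
	uint64_t first = segs[0].addr;
	uint64_t last = first + segs[0].len - 1;
	if ((first & (align - 1)) != 0)
		return false;
	/* First and last byte must fall in the same boundary-sized window. */
	return (first / boundary) == (last / boundary);
}
```

Note that a 16KB block can never satisfy this check with a 4096-byte boundary, so the value passed as the boundary matters as much as the alignment.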
Re: msk msk0 watchdog timeout freeze hang lock stop problem
Hi!

> I've run in to problems using the msk device
[...]
> I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and
> if_mskreg.h revision 264442.
>
> Here's the patch to if_mskreg.h
[...]

Thanks for the suggested fix. There are five PRs that all describe
similar symptoms:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197887
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197002
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=189404
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=186872
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=166727

I added a pointer to your posting; maybe someone can test it?

--
p...@opsec.eu    +49 171 3101372    5 years to go !
msk msk0 watchdog timeout freeze hang lock stop problem
I've run in to problems using the msk device where initially it works well enough to set DHCP etc. but stops/freezes as soon as any appreciable network traffic occurs . There are several threads describing similar symptoms over the past two years or more. I've been following several false leads but have finally found a solution (at least it solves my problem). I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as: mskc0: mem 0xfa00-0xfa003fff irq 19 at device 0.0 on pci6 msk0: on mskc0 msk0: Ethernet address: 00:13:77:e9:df:eb miibus0: on msk0 e1000phy0: PHY 0 on miibus0 e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-ma ster, auto, auto-flow The network worked when using the i386 release, but failed for the amd64 release (as reported previously) which prompted me to disable 64-bit DMA (the patch for this is attached below). This worked for the first kernel built but mysteriously failed when another unrelated part of the kernel was changed (a usb driver) and the kernel recompiled. So identical msk driver code worked in one kernel but not the second! This suggested that alignment differences between the two kernels were causing the msk driver to fail. Others have reported varying behaviour depending on different circumstances. It transpires that changing just one value in the if_mskreg.h file solved all my problems. Subsequently I have not been able to make it fail under heavy network traffic in either 32-bit or 64-bit mode. I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and if_mskreg.h revision 264442. Here's the patch to if_mskreg.h --- if_mskreg.h-orig2014-11-11 20:02:58.0 + +++ if_mskreg.h 2015-04-12 18:47:20.0 +0100 @@ -2179,9 +2179,11 @@ * At first I guessed 8 bytes, the size of a single descriptor, would be * required alignment constraints. But, it seems that Yukon II have 4096 * bytes boundary alignment constraints. 
+ * And it seems that the DMA status region for the Yukon Ultra 2 (88E8057) + * requires 8192 byte alignment to prevent locking. */ #define MSK_RING_ALIGN 4096 -#defineMSK_STAT_ALIGN 4096 +#defineMSK_STAT_ALIGN 8192 The patches to both files which also implement a MSK_64BIT_DMA_DISABLE flag are attached. Perhaps the developers would consider committing these as it may be useful for future debugging. Gareth. --- if_mskreg.h-orig 2014-11-11 20:02:58.0 + +++ if_mskreg.h 2015-04-12 18:47:20.0 +0100 @@ -2179,9 +2179,11 @@ * At first I guessed 8 bytes, the size of a single descriptor, would be * required alignment constraints. But, it seems that Yukon II have 4096 * bytes boundary alignment constraints. + * And it seems that the DMA status region for the Yukon Ultra 2 (88E8057) + * requires 8192 byte alignment to prevent locking. */ #define MSK_RING_ALIGN 4096 -#define MSK_STAT_ALIGN 4096 +#define MSK_STAT_ALIGN 8192 /* Rx descriptor data structure */ struct msk_rx_desc { --- if_msk.c-orig 2014-11-11 20:02:58.0 + +++ if_msk.c 2015-04-12 02:15:12.551005000 +0100 @@ -2164,8 +2164,8 @@ error = bus_dma_tag_create( bus_get_dma_tag(sc->msk_dev), /* parent */ MSK_STAT_ALIGN, 0, /* alignment, boundary */ - BUS_SPACE_MAXADDR, /* lowaddr */ - BUS_SPACE_MAXADDR, /* highaddr */ + BUS_DMA_TAG_LOWADDR, /* lowaddr */ + BUS_DMA_TAG_HIGHADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ stat_sz, /* maxsize */ 1,/* nsegments */ @@ -2235,8 +2235,8 @@ error = bus_dma_tag_create( bus_get_dma_tag(sc_if->msk_if_dev), /* parent */ 1, 0, /* alignment, boundary */ - BUS_SPACE_MAXADDR, /* lowaddr */ - BUS_SPACE_MAXADDR, /* highaddr */ + BUS_DMA_TAG_LOWADDR, /* lowaddr */ + BUS_DMA_TAG_HIGHADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ BUS_SPACE_MAXSIZE_32BIT, /* maxsize */ 0,/* nsegments */ @@ -2252,8 +2252,8 @@ /* Create tag for Tx ring. 
*/ error = bus_dma_tag_create(sc_if->msk_cdata.msk_parent_tag,/* parent */ MSK_RING_ALIGN, 0, /* alignment, boundary */ - BUS_SPACE_MAXADDR, /* lowaddr */ - BUS_SPACE_MAXADDR, /* highaddr */ + BUS_DMA_TAG_LOWADDR, /* lowaddr */ + BUS_DMA_TAG_HIGHADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ MSK_TX_RING_SZ, /* maxsize */ 1,/* nsegments */ @@ -2270,8 +2270,8 @@ /* Create tag for Rx ring. */ error = bus_dma_tag_create(sc_if->msk_cdata.msk_parent_tag,/* parent */ MSK_RING_ALIGN, 0, /* alignment, boundary */ - BUS_SPACE_MAXADDR, /* lowaddr */ - BUS_SPACE_MAXADDR, /* highaddr */ + BUS_DMA_TAG_LOWADDR, /* lowaddr */ + BUS_DMA_TAG_HIGHADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ MSK_RX_RING_SZ, /* maxsize */ 1,/* nsegments */ @@ -2288,8 +2288,8 @@ /* Create
Re: FreeBSD 9-Stable + Atom D510 Freeze
Gary Palmer [gpal...@freebsd.org] wrote:
> It used to be that ports had MAKE_JOBS_SAFE in the Makefile to mark that
> the port could be built using parallel compiles with the '-j' argument
> to make. It appears that the logic has been switched and now you have
> to mark them as MAKE_JOBS_UNSAFE to say that parallel builds shouldn't be
> done, indicating that parallel builds are the default now (unless I'm
> misreading the code)
>
> You can try putting
>
> DISABLE_MAKE_JOBS=yes
>
> into /etc/make.conf to see if that stops the problem on port builds.
>
Gary:

Making that change worked for me. I built both Subversion and Tshark, my two problem children. The build time was not too much different than without the flag. Only 1 CPU was active with cc1 at a time. I had no 'pfault' states on any entries in top for both builds.

I guess that we can close out this issue. Thank you and the list for the suggestion.

Tom

--
Public Keys: PGP KeyID = 0x5F22FDC1 GnuPG KeyID = 0x620836CF
Re: FreeBSD 9-Stable + Atom D510 Freeze
Gary Palmer [gpal...@freebsd.org] wrote:
> It's not a compiler flag, it's a make flag. make -j n will fork off up to
> n compilers to do the build. If you just do "make buildworld" then there
> is no parallel compilation.
>
> It used to be that ports had MAKE_JOBS_SAFE in the Makefile to mark that
> the port could be built using parallel compiles with the '-j' argument
> to make. It appears that the logic has been switched and now you have
> to mark them as MAKE_JOBS_UNSAFE to say that parallel builds shouldn't be
> done, indicating that parallel builds are the default now (unless I'm
> misreading the code)
>
> You can try putting
>
> DISABLE_MAKE_JOBS=yes
>
> into /etc/make.conf to see if that stops the problem on port builds.
>
Gary:

I don't see that as an option in /usr/share/examples/etc/make.conf. Did you find that one by reading the source code? I will add that to my /etc/make.conf and see if it makes a difference. This issue is very intermittent and may not trigger for weeks or months. I'll repost to the list if any problems show up after setting the flag in my /etc/make.conf.

Thanks for the help.

Tom

--
Public Keys: PGP KeyID = 0x5F22FDC1 GnuPG KeyID = 0x620836CF
Re: FreeBSD 9-Stable + Atom D510 Freeze
On 20 September 2013 11:52, Gary Palmer wrote: > On Fri, Sep 20, 2013 at 10:49:28AM -0400, Thomas Laus wrote: >> Gary Palmer [gpal...@freebsd.org] wrote: >> > >> > When building kernel & world do you use the '-j' argument to do parallel >> > builds? AFAIK thats not done by default, but it is for some ports. >> > >> Gary: >> >> I just use the system defaults when building anything. If there is a >> '-j' argument passed to the compiler, I was not the one that did it. >> Does this mean that the port building process needs to determine the >> processor type in the configure stage? I only use portmaster to keep >> the ports updated. I don't know of a global hook that will change the >> compiler build flags in portmaster. > > Hi Tim, > > It's not a compiler flag, it's a make flag. make -j n will fork off up to > n compilers to do the build. If you just do "make buildworld" then there > is no parallel compilation. > > It used to be that ports had MAKE_JOBS_SAFE in the Makefile to mark that > the port could be built using parallel compiles with the '-j' argument > to make. It appears that the logic has been switched and now you have > to mark them as MAKE_JOBS_UNSAFE to say that parallel builds shouldn't be > done, indicating that parallel builds are the default now (unless I'm > misreading the code) > > You can try putting > > DISABLE_MAKE_JOBS=yes > > into /etc/make.conf to see if that stops the problem on port builds. > Alternatively I think you could do > > portmaster -m DISABLE_MAKE_JOBS=yes > > However you'd have to do that each time you run portmaster. I think > putting > > PM_MAKE_ARGS="DISABLE_MAKE_JOBS=yes" > > in your .portmasterrc may do the same thing (not tried it). > > Note: this is NOT a fix. If it works, it merely stops the ports builder > from triggering the problem by not doing parallel compiles. The compiles > will also take longer. 
>

I believe that both world/kernel & ports will honour

    MAKE_JOBS_NUMBER=1 #(in /etc/make.conf)

which should restrict all builds to 1 "parallel" thread, yes?

--
--
Re: FreeBSD 9-Stable + Atom D510 Freeze
On Fri, Sep 20, 2013 at 10:49:28AM -0400, Thomas Laus wrote:
> Gary Palmer [gpal...@freebsd.org] wrote:
> >
> > When building kernel & world do you use the '-j' argument to do parallel
> > builds? AFAIK thats not done by default, but it is for some ports.
> >
> Gary:
>
> I just use the system defaults when building anything. If there is a
> '-j' argument passed to the compiler, I was not the one that did it.
> Does this mean that the port building process needs to determine the
> processor type in the configure stage? I only use portmaster to keep
> the ports updated. I don't know of a global hook that will change the
> compiler build flags in portmaster.

Hi Tim,

It's not a compiler flag, it's a make flag. make -j n will fork off up to n compilers to do the build. If you just do "make buildworld" then there is no parallel compilation.

It used to be that ports had MAKE_JOBS_SAFE in the Makefile to mark that the port could be built using parallel compiles with the '-j' argument to make. It appears that the logic has been switched and now you have to mark them as MAKE_JOBS_UNSAFE to say that parallel builds shouldn't be done, indicating that parallel builds are the default now (unless I'm misreading the code)

You can try putting

    DISABLE_MAKE_JOBS=yes

into /etc/make.conf to see if that stops the problem on port builds. Alternatively I think you could do

    portmaster -m DISABLE_MAKE_JOBS=yes

However you'd have to do that each time you run portmaster. I think putting

    PM_MAKE_ARGS="DISABLE_MAKE_JOBS=yes"

in your .portmasterrc may do the same thing (not tried it).

Note: this is NOT a fix. If it works, it merely stops the ports builder from triggering the problem by not doing parallel compiles. The compiles will also take longer.

Regards,

Gary
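Gary's suggestion above amounts to a one-line change in /etc/make.conf. As a minimal sketch, the setting could be added idempotently with a small shell helper. The function name `ensure_make_conf_line` is made up for illustration and is not part of FreeBSD, the ports tree, or portmaster; only the DISABLE_MAKE_JOBS=yes line itself comes from the message.

```sh
#!/bin/sh
# Hypothetical helper (an assumption, not a FreeBSD tool): append a setting
# to a make.conf-style file only if that exact line is not already present.
ensure_make_conf_line() {
    conf_file=$1
    setting=$2
    # Create the file if it does not exist yet.
    [ -f "$conf_file" ] || : > "$conf_file"
    # -x: match the whole line, -F: fixed string, -q: no output.
    if ! grep -qxF -- "$setting" "$conf_file"; then
        printf '%s\n' "$setting" >> "$conf_file"
    fi
}

# Example (as root on the affected machine):
#   ensure_make_conf_line /etc/make.conf 'DISABLE_MAKE_JOBS=yes'
```

Running the helper twice leaves a single copy of the line, so it is safe to call from a provisioning script. As noted in the message, this only avoids the trigger and is not a fix.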
Re: FreeBSD 9-Stable + Atom D510 Freeze
Gary Palmer [gpal...@freebsd.org] wrote:
>
> When building kernel & world do you use the '-j' argument to do parallel
> builds? AFAIK thats not done by default, but it is for some ports.
>
Gary:

I just use the system defaults when building anything. If there is a '-j' argument passed to the compiler, I was not the one that did it. Does this mean that the port building process needs to determine the processor type in the configure stage? I only use portmaster to keep the ports updated. I don't know of a global hook that will change the compiler build flags in portmaster.

Tom

--
Public Keys: PGP KeyID = 0x5F22FDC1 GnuPG KeyID = 0x620836CF
Re: FreeBSD 9-Stable + Atom D510 Freeze
On Fri, Sep 20, 2013 at 09:12:09AM -0400, Thomas Laus wrote: > > Tom, > > I have had multiple D510's and now D525's that are part of my test > > systems, all are 4GB machines and all run the latest (ie 2 days old) 9.X > > Stable. They're faultless. I have a D510 in production serving 30 > > users - yes its a 1G system running, sendmail, squid, samba as PDC. > > It's been in place for at least 7 months and runs without any hiccups. > > > > Though I would point out that the Atom processor does NOT do out of > > order processing, so a VIA motherboard that is of lower GHz builds > > worlds/ports in less time that a supposedly faster Atom. > > > > Your question re HT, yes HT introduces some additional latency, but is > > unlikely to be the problem. > > > Thanks for the information about the HT CPU's. I asked the question to the > group because I did not know if they were functionally any different than a > traditional CPU. I successfully built my problem port, Tshark, yesterday > while monitoring 'top' on another console. I observed that all 4 cpu's were > in service for the build and at times were running at 100 percent each. The > State column on all 4 occasionally showed a 'pfault' on all 4 but recovered > and the build continued to successful completion. > > > When I experience something like spurious reboots and it is definately > > not hardware, then I delete /usr/src and /usr/ports and perform a > > complete rebuild. (Yes seriously, and on the Atom's we're talking days, > > aren't we :) ) > > > I have been using this Atom D510 since it was released about 3 years ago. It > ran on FreeBSD 8-Stable until about a month ago. I installed an Intel 520 > SSD and loaded a fresh copy of a FreeBSD 9 Snapshot. After getting the > source and ports tarballs, I used svnup to bring both up to date. I built > and installed world and the kernel to bring me up to Stable. I rebuilt all > of my ports using Portmaster. 
> > The spurious reboot issue existed for the last 3 years when running FreeBSD-8 > Stable. I never had the problem building world or kernel. It only occurred > when building some ports. Subversion and Tshark more often than others. > FreeBSD 9-Stable was frozen when I tried to build tshark, but I was able to > build it OK yesterday. Everything hardware related other than the Atom > microprocessor and the Intel motherboard itself is new. The OS is now a > different version and all of the source was rebuilt monthly. The ports have > been been built many times in the last 3 years. When building kernel & world do you use the '-j' argument to do parallel builds? AFAIK thats not done by default, but it is for some ports. Gary ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: FreeBSD 9-Stable + Atom D510 Freeze
> Tom, > I have had multiple D510's and now D525's that are part of my test > systems, all are 4GB machines and all run the latest (ie 2 days old) 9.X > Stable. They're faultless. I have a D510 in production serving 30 > users - yes its a 1G system running, sendmail, squid, samba as PDC. > It's been in place for at least 7 months and runs without any hiccups. > > Though I would point out that the Atom processor does NOT do out of > order processing, so a VIA motherboard that is of lower GHz builds > worlds/ports in less time that a supposedly faster Atom. > > Your question re HT, yes HT introduces some additional latency, but is > unlikely to be the problem. > Thanks for the information about the HT CPU's. I asked the question to the group because I did not know if they were functionally any different than a traditional CPU. I successfully built my problem port, Tshark, yesterday while monitoring 'top' on another console. I observed that all 4 cpu's were in service for the build and at times were running at 100 percent each. The State column on all 4 occasionally showed a 'pfault' on all 4 but recovered and the build continued to successful completion. > When I experience something like spurious reboots and it is definately > not hardware, then I delete /usr/src and /usr/ports and perform a > complete rebuild. (Yes seriously, and on the Atom's we're talking days, > aren't we :) ) > I have been using this Atom D510 since it was released about 3 years ago. It ran on FreeBSD 8-Stable until about a month ago. I installed an Intel 520 SSD and loaded a fresh copy of a FreeBSD 9 Snapshot. After getting the source and ports tarballs, I used svnup to bring both up to date. I built and installed world and the kernel to bring me up to Stable. I rebuilt all of my ports using Portmaster. The spurious reboot issue existed for the last 3 years when running FreeBSD-8 Stable. I never had the problem building world or kernel. It only occurred when building some ports. 
Subversion and Tshark more often than others. FreeBSD 9-Stable was frozen when I tried to build tshark, but I was able to build it OK yesterday. Everything hardware related other than the Atom microprocessor and the Intel motherboard itself is new. The OS is now a different version and all of the source was rebuilt monthly. The ports have been been built many times in the last 3 years. Tom -- Public Keys: PGP KeyID = 0x5F22FDC1 GnuPG KeyID = 0x620836CF ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
FreeBSD 9-Stable + Atom D510 Freeze
I have an Intel Atom D510 motherboard that has been used in my home router for the last several years. It started on FreeBSD 8-Stable and was recently upgraded to FreeBSD 9-Stable. Through the years I have observed spurious reboots when rebuilding ports, but never world or kernel. I have tried both schedulers in FreeBSD 8-Stable. I have also replaced memory, power supply and disk drives to attempt to isolate hardware from the equation.

Last evening I had a complete freeze when rebuilding tshark. The keyboard was dead, the screen display was frozen and there was no network access. I recovered by pressing the reset switch. As always, there are no log entries about a panic and no core dumps in the swap partition.

My question to the group is whether FreeBSD is correctly identifying the number of CPUs on this motherboard. I see 4 listed in the top utility and it appears that code is being run on all 4. Are HT CPUs equal in performance to 'real' ones and should they participate fully in the task scheduler operation? Since my problem is very intermittent and non-reproducible, is it possible that code may try to exercise something in a HT core that should only be run on a 'real' one?

My DMESG:

Sep 18 20:50:19 mail kernel: Copyright (c) 1992-2013 The FreeBSD Project.
Sep 18 20:50:19 mail kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Sep 18 20:50:19 mail kernel: The Regents of the University of California. All rights reserved.
Sep 18 20:50:19 mail kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Sep 18 20:50:19 mail kernel: FreeBSD 9.2-PRERELEASE #2: Sat Sep 14 18:27:55 EDT 2013
Sep 18 20:50:19 mail kernel: root@x.x.x:/usr/obj/usr/src/sys/ROUTER amd64
Sep 18 20:50:19 mail kernel: gcc version 4.2.1 20070831 patched [FreeBSD]
Sep 18 20:50:19 mail kernel: CPU: Intel(R) Atom(TM) CPU D510 @ 1.66GHz (1662.72-MHz K8-class CPU)
Sep 18 20:50:19 mail kernel: Origin = "GenuineIntel" Id = 0x106ca Family = 0x6 Model = 0x1c Stepping = 10
Sep 18 20:50:19 mail kernel: Features=0xbfebfbff
Sep 18 20:50:19 mail kernel: Features2=0x40e31d
Sep 18 20:50:19 mail kernel: AMD Features=0x20100800
Sep 18 20:50:19 mail kernel: AMD Features2=0x1
Sep 18 20:50:19 mail kernel: TSC: P-state invariant, performance statistics
Sep 18 20:50:19 mail kernel: real memory = 1073741824 (1024 MB)
Sep 18 20:50:19 mail kernel: avail memory = 1002127360 (955 MB)
Sep 18 20:50:19 mail kernel: Event timer "LAPIC" quality 400
Sep 18 20:50:19 mail kernel: ACPI APIC Table:
Sep 18 20:50:19 mail kernel: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
Sep 18 20:50:19 mail kernel: FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 HTT threads
Sep 18 20:50:19 mail kernel: cpu0 (BSP): APIC ID: 0
Sep 18 20:50:19 mail kernel: cpu1 (AP/HT): APIC ID: 1
Sep 18 20:50:19 mail kernel: cpu2 (AP): APIC ID: 2
Sep 18 20:50:19 mail kernel: cpu3 (AP/HT): APIC ID: 3

Tom

--
Public Keys: PGP KeyID = 0x5F22FDC1 GnuPG KeyID = 0x620836CF
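The boot messages above already say how the kernel classified each logical CPU: two of the four are tagged (AP/HT), matching "2 core(s) x 2 HTT threads". As a rough sketch, the tagged entries can be counted from saved boot messages. The helper name `count_ht_threads` is an assumption for illustration; the pattern is taken from the "cpuN (AP/HT)" lines in this post and may not match other FreeBSD versions.

```sh
#!/bin/sh
# Hypothetical helper: count logical CPUs that the kernel flagged as
# Hyper-Threading siblings, reading boot messages from standard input.
count_ht_threads() {
    # Matches lines such as "cpu1 (AP/HT): APIC ID: 1".
    grep -c 'cpu[0-9][0-9]* (AP/HT)'
}

# Example on the affected machine:
#   count_ht_threads < /var/run/dmesg.boot
```

For the dmesg posted here the count is 2, which is consistent with the kernel treating the D510 as two physical cores with one HT sibling each.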
Re: stopping amd causes a freeze
On 28/07/2013 08:24, Konstantin Belousov wrote: > On Sat, Jul 27, 2013 at 10:33:18AM +0200, Dominic Fandrey wrote: >> On 26/07/2013 19:10, Dominic Fandrey wrote: >>> On 25/07/2013 12:00, Konstantin Belousov wrote: >>>> On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote: >>>>> On 22/07/2013 12:07, Konstantin Belousov wrote: >>>>>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: >>>>>>> ... >>>>>>> >>>>>>> I run amd through sysutils/automounter, which is a scripting solution >>>>>>> that generates an amd.map file based on encountered devices and devd >>>>>>> events. The SIGHUP it sends to amd to tell it the map file was updated >>>>>>> does not cause problems, only a -SIGKILL- SIGTERM may cause the freeze. >>>>>>> >>>>>>> Nothing was mounted (by amd) during the last freeze. >>>>>>> >>>>>>> ... >>>>>> >>>>>> Are you sure that the machine did not paniced ? Do you have serial >>>>>> console ? >>>>>> >>>>>> The amd(8) locks itself into memory, most likely due to the fear of >>>>>> deadlock. There are some known issues with user wirings in stable/9. >>>>>> If the problem you see is indeed due to wiring, you might try to apply >>>>>> r253187-r253191. >>>>> >>>>> I tried that. Applying the diff was straightforward enough. But the >>>>> resulting kernel paniced as soon as it tried to mount the root fs. >>>> You did provided a useful info to diagnose the issue. >>>> >>>> Patch should keep KBI compatible, but, just in case, if you have any >>>> third-party module, rebuild it. >>>> >>>>> >>>>> So I'll wait for the MFC from someone who knows what he/she is doing. >>>> >>>> Patch below booted for me, and I run some sanity check tests for the >>>> mlockall(2), which also did not resulted in misbehaviour. >>>> >>> >>> Your patch applied cleanly and the system booted with the resulting >>> kernel. >>> >>> Amd exhibits several very strange behaviours. ... >> >> I can verify the whole thing with a clean world and kernel. 
>>
>> This time I'll concentrate on the first instance of amd:
>>
>> # tail -n3 /var/log/messages
>> Jul 27 10:08:56 mobileKamikaze kernel: newnfs server
>> pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding
>> Jul 27 10:09:41 mobileKamikaze kernel: newnfs server
>> pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding
>> Jul 27 10:11:41 mobileKamikaze last message repeated 3 times
>>
>> The process, it turns out, simply doesn't exist. There is another
>> process, though:
>> # ps auxww | grep -F sbin/amd
>> root 5869 0.0 0.1 12036 8020 ?? S 10:08am 0:00.01
>> /usr/sbin/amd -r -p -a /var/run/automounter.amd -c 4 -w 2
>> /var/run/automounter.amd.mnt /var/run/automounter.amd.map
>>
>> # cat /var/run/automounter.amd.pid
>> 5868
>>
>> Here is what I think happens, amd forks a subprocess and the main
>> process, silently dies after it wrote its pidfile.
> Nothing dies silently. Either process was killed by signal, or it
> exited with the explicit call to exit(2). In the first case, default
> kernel settings of kern.logsigexit should make a record in the syslog.
> The machdep.uprintf_signal might be also useful, but not for daemons.

Well, it finally turned out, that amd came up in this broken state with missing processes because rpcbind wasn't running.

I think it would be a good idea for amd to fail with a bit of noise instead of coming up broken, causing the kernel to spam syslog, and confusing the user.

At this point I'd usually pull whoever works on amd into the conversation, but the most recent change to src/contrib/amd is 4 years old.

--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Re: stopping amd causes a freeze
On 28/07/2013 11:00, Daniel Braniss wrote:
>> On 28/07/2013 08:24, Konstantin Belousov wrote:
>>> On Sat, Jul 27, 2013 at 10:33:18AM +0200, Dominic Fandrey wrote:
>>>> On 26/07/2013 19:10, Dominic Fandrey wrote:
>>>>> On 25/07/2013 12:00, Konstantin Belousov wrote:
>>>>>> On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote:
>>>>>>> On 22/07/2013 12:07, Konstantin Belousov wrote:
>>>>>>>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote:
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> I run amd through sysutils/automounter, which is a scripting solution
>>>>>>>>> that generates an amd.map file based on encountered devices and devd
>>>>>>>>> events. The SIGHUP it sends to amd to tell it the map file was updated
>>>>>>>>> does not cause problems, only a -SIGKILL- SIGTERM may cause the
>>>>>>>>> freeze.
>>>>>>>>>
>>>>>>>>> Nothing was mounted (by amd) during the last freeze.
>>>>>>>>>
>>>>>>>>> ...

Thank you everyone, after updating to stable/9 r254418 the problem has dissipated.

--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Re: stopping amd causes a freeze
> On 28/07/2013 08:24, Konstantin Belousov wrote: > > On Sat, Jul 27, 2013 at 10:33:18AM +0200, Dominic Fandrey wrote: > >> On 26/07/2013 19:10, Dominic Fandrey wrote: > >>> On 25/07/2013 12:00, Konstantin Belousov wrote: > >>>> On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote: > >>>>> On 22/07/2013 12:07, Konstantin Belousov wrote: > >>>>>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: > >>>>>>> ... > >>>>>>> > >>>>>>> I run amd through sysutils/automounter, which is a scripting solution > >>>>>>> that generates an amd.map file based on encountered devices and devd > >>>>>>> events. The SIGHUP it sends to amd to tell it the map file was updated > >>>>>>> does not cause problems, only a -SIGKILL- SIGTERM may cause the > >>>>>>> freeze. > >>>>>>> > >>>>>>> Nothing was mounted (by amd) during the last freeze. > >>>>>>> > >>>>>>> ... > >>>>>> > >>>>>> Are you sure that the machine did not paniced ? Do you have serial > >>>>>> console ? > >>>>>> > >>>>>> The amd(8) locks itself into memory, most likely due to the fear of > >>>>>> deadlock. There are some known issues with user wirings in stable/9. > >>>>>> If the problem you see is indeed due to wiring, you might try to apply > >>>>>> r253187-r253191. > >>>>> > >>>>> I tried that. Applying the diff was straightforward enough. But the > >>>>> resulting kernel paniced as soon as it tried to mount the root fs. > >>>> You did provided a useful info to diagnose the issue. > >>>> > >>>> Patch should keep KBI compatible, but, just in case, if you have any > >>>> third-party module, rebuild it. > >>>> > >>>>> > >>>>> So I'll wait for the MFC from someone who knows what he/she is doing. > >>>> > >>>> Patch below booted for me, and I run some sanity check tests for the > >>>> mlockall(2), which also did not resulted in misbehaviour. > >>>> > >>> > >>> Your patch applied cleanly and the system booted with the resulting > >>> kernel. > >>> > >>> Amd exhibits several very strange behaviours. ... 
> >> > >> I can verify the whole thing with a clean world and kernel. > >> > >> This time I'll concentrate on the first instance of amd: > >> > >> # tail -n3 /var/log/messages > >> Jul 27 10:08:56 mobileKamikaze kernel: newnfs server > >> pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding > >> Jul 27 10:09:41 mobileKamikaze kernel: newnfs server > >> pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding > >> Jul 27 10:11:41 mobileKamikaze last message repeated 3 times > >> > >> The process, it turns out, simply doesn't exist. There is another > >> process, though: > >> # ps auxww | grep -F sbin/amd > >> root 5869 0.0 0.1 12036 8020 ?? S10:08am 0:00.01 > >> /usr/sbin/amd -r -p -a /var/run/automounter.amd -c 4 -w 2 > >> /var/run/automounter.amd.mnt /var/run/automounter.amd.map > >> > >> # cat /var/run/automounter.amd.pid > >> 5868 > >> > >> Here is what I think happens, amd forks a subprocess and the main > >> process, silently dies after it wrote its pidfile. > > Nothing dies silently. Either process was killed by signal, or it > > exited with the explicit call to exit(2). In the first case, default > > kernel settings of kern.logsigexit should make a record in the syslog. > > The machdep.uprintf_signal might be also useful, but not for daemons. > > Well, after I reverted your patch I got some things in the syslog. > Sometimes amd works as expected, sometimes it dies right after starting: > Jul 28 10:19:42 mobileKamikaze kernel: pid 24217 (amd), uid 0: exited on > signal 11 (core dumped) > > This is just all over confusing. just to confuse you a bit more :-) I gave up with mlockall(2) so I compiled amd statically linked. my 5 cents. danny ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: stopping amd causes a freeze
On 28/07/2013 08:24, Konstantin Belousov wrote: > On Sat, Jul 27, 2013 at 10:33:18AM +0200, Dominic Fandrey wrote: >> On 26/07/2013 19:10, Dominic Fandrey wrote: >>> On 25/07/2013 12:00, Konstantin Belousov wrote: >>>> On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote: >>>>> On 22/07/2013 12:07, Konstantin Belousov wrote: >>>>>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: >>>>>>> ... >>>>>>> >>>>>>> I run amd through sysutils/automounter, which is a scripting solution >>>>>>> that generates an amd.map file based on encountered devices and devd >>>>>>> events. The SIGHUP it sends to amd to tell it the map file was updated >>>>>>> does not cause problems, only a -SIGKILL- SIGTERM may cause the freeze. >>>>>>> >>>>>>> Nothing was mounted (by amd) during the last freeze. >>>>>>> >>>>>>> ... >>>>>> >>>>>> Are you sure that the machine did not paniced ? Do you have serial >>>>>> console ? >>>>>> >>>>>> The amd(8) locks itself into memory, most likely due to the fear of >>>>>> deadlock. There are some known issues with user wirings in stable/9. >>>>>> If the problem you see is indeed due to wiring, you might try to apply >>>>>> r253187-r253191. >>>>> >>>>> I tried that. Applying the diff was straightforward enough. But the >>>>> resulting kernel paniced as soon as it tried to mount the root fs. >>>> You did provided a useful info to diagnose the issue. >>>> >>>> Patch should keep KBI compatible, but, just in case, if you have any >>>> third-party module, rebuild it. >>>> >>>>> >>>>> So I'll wait for the MFC from someone who knows what he/she is doing. >>>> >>>> Patch below booted for me, and I run some sanity check tests for the >>>> mlockall(2), which also did not resulted in misbehaviour. >>>> >>> >>> Your patch applied cleanly and the system booted with the resulting >>> kernel. >>> >>> Amd exhibits several very strange behaviours. ... >> >> I can verify the whole thing with a clean world and kernel. 
>> >> This time I'll concentrate on the first instance of amd: >> >> # tail -n3 /var/log/messages >> Jul 27 10:08:56 mobileKamikaze kernel: newnfs server >> pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding >> Jul 27 10:09:41 mobileKamikaze kernel: newnfs server >> pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding >> Jul 27 10:11:41 mobileKamikaze last message repeated 3 times >> >> The process, it turns out, simply doesn't exist. There is another >> process, though: >> # ps auxww | grep -F sbin/amd >> root 5869 0.0 0.1 12036 8020 ?? S10:08am 0:00.01 >> /usr/sbin/amd -r -p -a /var/run/automounter.amd -c 4 -w 2 >> /var/run/automounter.amd.mnt /var/run/automounter.amd.map >> >> # cat /var/run/automounter.amd.pid >> 5868 >> >> Here is what I think happens, amd forks a subprocess and the main >> process, silently dies after it wrote its pidfile. > Nothing dies silently. Either process was killed by signal, or it > exited with the explicit call to exit(2). In the first case, default > kernel settings of kern.logsigexit should make a record in the syslog. > The machdep.uprintf_signal might be also useful, but not for daemons. Well, after I reverted your patch I got some things in the syslog. Sometimes amd works as expected, sometimes it dies right after starting: Jul 28 10:19:42 mobileKamikaze kernel: pid 24217 (amd), uid 0: exited on signal 11 (core dumped) This is just all over confusing. -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
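The syslog record quoted above ("pid 24217 (amd), uid 0: exited on signal 11 (core dumped)") is the kind of entry kern.logsigexit produces. A small sketch for pulling such records out of a log file follows. The helper name `signal_exits` is invented for illustration, and the pattern is modelled only on the single line shown in this message, so other kernels or versions may format it differently.

```sh
#!/bin/sh
# Hypothetical helper: print syslog lines recording that a given program
# was terminated by a signal, in the format quoted in this thread:
#   "pid 24217 (amd), uid 0: exited on signal 11 (core dumped)"
# The program name is used inside a regular expression, so names that
# contain regex metacharacters would need escaping first.
signal_exits() {
    prog=$1
    log_file=$2
    grep -E "pid [0-9]+ \\($prog\\), uid [0-9]+: exited on signal [0-9]+" "$log_file"
}

# Example on the affected machine:
#   signal_exits amd /var/log/messages
```

An empty result does not prove a clean exit; it only means no matching record was logged in that file.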
Re: stopping amd causes a freeze
On Sat, Jul 27, 2013 at 10:33:18AM +0200, Dominic Fandrey wrote: > On 26/07/2013 19:10, Dominic Fandrey wrote: > > On 25/07/2013 12:00, Konstantin Belousov wrote: > >> On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote: > >>> On 22/07/2013 12:07, Konstantin Belousov wrote: > >>>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: > >>>>> ... > >>>>> > >>>>> I run amd through sysutils/automounter, which is a scripting solution > >>>>> that generates an amd.map file based on encountered devices and devd > >>>>> events. The SIGHUP it sends to amd to tell it the map file was updated > >>>>> does not cause problems, only a -SIGKILL- SIGTERM may cause the freeze. > >>>>> > >>>>> Nothing was mounted (by amd) during the last freeze. > >>>>> > >>>>> ... > >>>> > >>>> Are you sure that the machine did not paniced ? Do you have serial > >>>> console ? > >>>> > >>>> The amd(8) locks itself into memory, most likely due to the fear of > >>>> deadlock. There are some known issues with user wirings in stable/9. > >>>> If the problem you see is indeed due to wiring, you might try to apply > >>>> r253187-r253191. > >>> > >>> I tried that. Applying the diff was straightforward enough. But the > >>> resulting kernel paniced as soon as it tried to mount the root fs. > >> You did provided a useful info to diagnose the issue. > >> > >> Patch should keep KBI compatible, but, just in case, if you have any > >> third-party module, rebuild it. > >> > >>> > >>> So I'll wait for the MFC from someone who knows what he/she is doing. > >> > >> Patch below booted for me, and I run some sanity check tests for the > >> mlockall(2), which also did not resulted in misbehaviour. > >> > > > > Your patch applied cleanly and the system booted with the resulting > > kernel. > > > > Amd exhibits several very strange behaviours. ... > > I can verify the whole thing with a clean world and kernel. 
>
> This time I'll concentrate on the first instance of amd:
>
> # tail -n3 /var/log/messages
> Jul 27 10:08:56 mobileKamikaze kernel: newnfs server
> pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding
> Jul 27 10:09:41 mobileKamikaze kernel: newnfs server
> pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding
> Jul 27 10:11:41 mobileKamikaze last message repeated 3 times
>
> The process, it turns out, simply doesn't exist. There is another
> process, though:
> # ps auxww | grep -F sbin/amd
> root 5869 0.0 0.1 12036 8020 ?? S 10:08am 0:00.01
> /usr/sbin/amd -r -p -a /var/run/automounter.amd -c 4 -w 2
> /var/run/automounter.amd.mnt /var/run/automounter.amd.map
>
> # cat /var/run/automounter.amd.pid
> 5868
>
> Here is what I think happens, amd forks a subprocess and the main
> process, silently dies after it wrote its pidfile.

Nothing dies silently. Either process was killed by signal, or it exited with the explicit call to exit(2). In the first case, default kernel settings of kern.logsigexit should make a record in the syslog. The machdep.uprintf_signal might be also useful, but not for daemons.

If the process called exit(2), ktrace would show it.

>
> For completeness:
> # mount
> /dev/ufs/5root on / (ufs, local, noatime, soft-updates)
> devfs on /dev (devfs, local, multilabel)
> /dev/ufs/5stor on /pool/5stor (ufs, local, noatime, soft-updates)
> /pool/5stor/usr on /usr (nullfs, local, noatime)
> /pool/5stor/var on /var (nullfs, local, noatime)
> /usr/home/root on /root (nullfs, local, noatime)
> tmpfs on /var/log (tmpfs, local)
> tmpfs on /var/run (tmpfs, local)
> tmpfs on /tmp (tmpfs, local)
>
> Everything else seems to work. I'll revert your patch for now and
> wait for the MFC.

I was unable to get useful information from any of your posts.

My current plan is to merge the revisions after the 9.2 freeze is over.
Re: stopping amd causes a freeze
On 26/07/2013 19:10, Dominic Fandrey wrote: > On 25/07/2013 12:00, Konstantin Belousov wrote: >> On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote: >>> On 22/07/2013 12:07, Konstantin Belousov wrote: >>>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: >>>>> ... >>>>> >>>>> I run amd through sysutils/automounter, which is a scripting solution >>>>> that generates an amd.map file based on encountered devices and devd >>>>> events. The SIGHUP it sends to amd to tell it the map file was updated >>>>> does not cause problems, only a -SIGKILL- SIGTERM may cause the freeze. >>>>> >>>>> Nothing was mounted (by amd) during the last freeze. >>>>> >>>>> ... >>>> >>>> Are you sure that the machine did not paniced ? Do you have serial >>>> console ? >>>> >>>> The amd(8) locks itself into memory, most likely due to the fear of >>>> deadlock. There are some known issues with user wirings in stable/9. >>>> If the problem you see is indeed due to wiring, you might try to apply >>>> r253187-r253191. >>> >>> I tried that. Applying the diff was straightforward enough. But the >>> resulting kernel paniced as soon as it tried to mount the root fs. >> You did provided a useful info to diagnose the issue. >> >> Patch should keep KBI compatible, but, just in case, if you have any >> third-party module, rebuild it. >> >>> >>> So I'll wait for the MFC from someone who knows what he/she is doing. >> >> Patch below booted for me, and I run some sanity check tests for the >> mlockall(2), which also did not resulted in misbehaviour. >> > > Your patch applied cleanly and the system booted with the resulting > kernel. > > Amd exhibits several very strange behaviours. ... I can verify the whole thing with a clean world and kernel. 
This time I'll concentrate on the first instance of amd: # tail -n3 /var/log/messages Jul 27 10:08:56 mobileKamikaze kernel: newnfs server pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding Jul 27 10:09:41 mobileKamikaze kernel: newnfs server pid5868@mobileKamikaze:/var/run/automounter.amd.mnt: not responding Jul 27 10:11:41 mobileKamikaze last message repeated 3 times The process, it turns out, simply doesn't exist. There is another process, though: # ps auxww | grep -F sbin/amd root 5869 0.0 0.1 12036 8020 ?? S10:08am 0:00.01 /usr/sbin/amd -r -p -a /var/run/automounter.amd -c 4 -w 2 /var/run/automounter.amd.mnt /var/run/automounter.amd.map # cat /var/run/automounter.amd.pid 5868 Here is what I think happens, amd forks a subprocess and the main process, silently dies after it wrote its pidfile. For completeness: # mount /dev/ufs/5root on / (ufs, local, noatime, soft-updates) devfs on /dev (devfs, local, multilabel) /dev/ufs/5stor on /pool/5stor (ufs, local, noatime, soft-updates) /pool/5stor/usr on /usr (nullfs, local, noatime) /pool/5stor/var on /var (nullfs, local, noatime) /usr/home/root on /root (nullfs, local, noatime) tmpfs on /var/log (tmpfs, local) tmpfs on /var/run (tmpfs, local) tmpfs on /tmp (tmpfs, local) Everything else seems to work. I'll revert your patch for now and wait for the MFC. -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
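The mismatch described above (pidfile says 5868, the running amd is 5869) can be checked mechanically. This is only a sketch: pidfile_alive and its return codes are made up for this example, and kill -0 also fails for a live process you are not allowed to signal, so it should be run as root.

```sh
#!/bin/sh
# Sketch: report whether the PID recorded in a pidfile belongs to a live
# process. The function name and return codes are hypothetical.
#   0 = process exists
#   1 = no such process (or not permitted to signal it)
#   2 = pidfile missing, unreadable, or not a number
pidfile_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 2
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 2 ;;
    esac
    # Signal 0 performs the existence check without delivering a signal.
    if kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    return 1
}
```

For the case in this message, pidfile_alive /var/run/automounter.amd.pid would be expected to return 1, since PID 5868 no longer exists.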
Re: stopping amd causes a freeze
On 26/07/2013 20:37, Artem Belevich wrote: > On Fri, Jul 26, 2013 at 10:10 AM, Dominic Fandrey wrote: > >> Amd exhibits several very strange behaviours. >> >> a) >> During the first start it writes the wrong PID into the pidfile, >> it however still reacts to SIGTERM. >> >> b) >> After starting it again, it no longer reacts to SIGTERM. >> > > amd does block off signals in some of its sub-processes. For instance amd > process that works as NFS server and handles amd mount points does block > off INT/TERM/CHLD/HUP. See /usr/src/contrib/amd/amd/nfs_start.c Didn't know that. But so sending signals to the process in the pidfile, used to work™. >> c) >> It appear to be no longer reacting to SIGHUP, which is required to >> tell it that the amd.map was updated. >> >> > Try using 'amq -f' which would ask amd to reload its maps via RPC and > should work regardless of whether you know the right PID. amq -m or amq -p just hang there and do nothing right now. As soon as amd is unbroken this is good to know, though. Sending a SIGINFO: load: 0.58 cmd: amq 6071 [kqread] 4.71r 0.00u 0.00s 0% 2132k -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: stopping amd causes a freeze
On Fri, Jul 26, 2013 at 10:10 AM, Dominic Fandrey wrote: > Amd exhibits several very strange behaviours. > > a) > During the first start it writes the wrong PID into the pidfile, > it however still reacts to SIGTERM. > > b) > After starting it again, it no longer reacts to SIGTERM. > amd does block off signals in some of its sub-processes. For instance amd process that works as NFS server and handles amd mount points does block off INT/TERM/CHLD/HUP. See /usr/src/contrib/amd/amd/nfs_start.c > > c) > It appear to be no longer reacting to SIGHUP, which is required to > tell it that the amd.map was updated. > > Try using 'amq -f' which would ask amd to reload its maps via RPC and should work regardless of whether you know the right PID. Strangely enough amd man page does not mention SIGHUP at all. amd/doc/am-utils.texi in the source tree does, but only when it talks about hlfsd or about 'type:=auto' maps with 'cache' option. Documentation on am-utils.org matches am-utils.texi. As far as I can tell 'amq -f' is the official way to tell amd that it should reload maps. --Artem > d) > It doesn't work at all, I only get: > # cd /media/ufs/FreeBSD_Install > /media/ufs/FreeBSD_Install: Too many levels of symbolic links. > > e) > A SIGKILL without load will terminate the process. A SIGKILL while > there is heavy file system load panics the system. > > I'll try a clean buildworld buildkernel and repeat. > > -- > A: Because it fouls the order in which people normally read text. > Q: Why is top-posting such a bad thing? > A: Top-posting. > Q: What is the most annoying thing on usenet and in e-mail? > ___ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org" > ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
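A rough sketch of the RPC-based reload suggested above, so the script does not depend on the pidfile at all. It only uses amq -f and amq -m, both mentioned in this thread; reload_amd_maps and the DRY_RUN switch are invented for the example.

```sh
#!/bin/sh
# Sketch: ask the running amd to reload its maps over RPC instead of
# sending SIGHUP to whatever PID the pidfile contains.
# reload_amd_maps, run and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reload_amd_maps() {
    # Flush the caches and reload the maps.
    run amq -f || return 1
    # List what amd has mounted afterwards, as a quick sanity check.
    run amq -m
}
```

If amq itself hangs, as reported elsewhere in this thread, this will hang too; it is not a substitute for a responsive amd.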
Re: stopping amd causes a freeze
On 25/07/2013 12:00, Konstantin Belousov wrote: > On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote: >> On 22/07/2013 12:07, Konstantin Belousov wrote: >>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: >>>> ... >>>> >>>> I run amd through sysutils/automounter, which is a scripting solution >>>> that generates an amd.map file based on encountered devices and devd >>>> events. The SIGHUP it sends to amd to tell it the map file was updated >>>> does not cause problems, only a -SIGKILL- SIGTERM may cause the freeze. >>>> >>>> Nothing was mounted (by amd) during the last freeze. >>>> >>>> ... >>> >>> Are you sure that the machine did not paniced ? Do you have serial console >>> ? >>> >>> The amd(8) locks itself into memory, most likely due to the fear of >>> deadlock. There are some known issues with user wirings in stable/9. >>> If the problem you see is indeed due to wiring, you might try to apply >>> r253187-r253191. >> >> I tried that. Applying the diff was straightforward enough. But the >> resulting kernel paniced as soon as it tried to mount the root fs. > You did provided a useful info to diagnose the issue. > > Patch should keep KBI compatible, but, just in case, if you have any > third-party module, rebuild it. > >> >> So I'll wait for the MFC from someone who knows what he/she is doing. > > Patch below booted for me, and I run some sanity check tests for the > mlockall(2), which also did not resulted in misbehaviour. > Your patch applied cleanly and the system booted with the resulting kernel. Amd exhibits several very strange behaviours. a) During the first start it writes the wrong PID into the pidfile, it however still reacts to SIGTERM. b) After starting it again, it no longer reacts to SIGTERM. c) It appear to be no longer reacting to SIGHUP, which is required to tell it that the amd.map was updated. 
d) It doesn't work at all, I only get: # cd /media/ufs/FreeBSD_Install /media/ufs/FreeBSD_Install: Too many levels of symbolic links. e) A SIGKILL without load will terminate the process. A SIGKILL while there is heavy file system load panics the system. I'll try a clean buildworld buildkernel and repeat. -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: stopping amd causes a freeze
On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote: > On 22/07/2013 12:07, Konstantin Belousov wrote: > > On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: > >> ... > >> > >> I run amd through sysutils/automounter, which is a scripting solution > >> that generates an amd.map file based on encountered devices and devd > >> events. The SIGHUP it sends to amd to tell it the map file was updated > >> does not cause problems, only a -SIGKILL- SIGTERM may cause the freeze. > >> > >> Nothing was mounted (by amd) during the last freeze. > >> > >> ... > > > > Are you sure that the machine did not paniced ? Do you have serial console > > ? > > > > The amd(8) locks itself into memory, most likely due to the fear of > > deadlock. There are some known issues with user wirings in stable/9. > > If the problem you see is indeed due to wiring, you might try to apply > > r253187-r253191. > > I tried that. Applying the diff was straightforward enough. But the > resulting kernel paniced as soon as it tried to mount the root fs. You did provided a useful info to diagnose the issue. Patch should keep KBI compatible, but, just in case, if you have any third-party module, rebuild it. > > So I'll wait for the MFC from someone who knows what he/she is doing. Patch below booted for me, and I run some sanity check tests for the mlockall(2), which also did not resulted in misbehaviour. 
Index: kern/vfs_bio.c
===
--- kern/vfs_bio.c (revision 253643)
+++ kern/vfs_bio.c (working copy)
@@ -1614,7 +1614,8 @@ brelse(struct buf *bp)
 		(PAGE_SIZE - poffset) : resid;

 	KASSERT(presid >= 0, ("brelse: extra page"));
-	vm_page_set_invalid(m, poffset, presid);
+	if (pmap_page_wired_mappings(m) == 0)
+		vm_page_set_invalid(m, poffset, presid);
 	if (had_bogus)
 		printf("avoided corruption bug in bogus_page/brelse code\n");
 }
Index: vm/vm_fault.c
===
--- vm/vm_fault.c (revision 253643)
+++ vm/vm_fault.c (working copy)
@@ -286,6 +286,19 @@ RetryFault:;
 		(u_long)vaddr);
 }

+	if (fs.entry->eflags & MAP_ENTRY_IN_TRANSITION &&
+	    fs.entry->wiring_thread != curthread) {
+		vm_map_unlock_read(fs.map);
+		vm_map_lock(fs.map);
+		if (vm_map_lookup_entry(fs.map, vaddr, &fs.entry) &&
+		    (fs.entry->eflags & MAP_ENTRY_IN_TRANSITION)) {
+			fs.entry->eflags |= MAP_ENTRY_NEEDS_WAKEUP;
+			vm_map_unlock_and_wait(fs.map, 0);
+		} else
+			vm_map_unlock(fs.map);
+		goto RetryFault;
+	}
+
 	/*
 	 * Make a reference to this object to prevent its disposal while we
 	 * are messing with it. Once we have the reference, the map is free
Index: vm/vm_map.c
===
--- vm/vm_map.c (revision 253643)
+++ vm/vm_map.c (working copy)
@@ -2272,6 +2272,7 @@ vm_map_unwire(vm_map_t map, vm_offset_t start, vm_
 	 * above.)
 	 */
 	entry->eflags |= MAP_ENTRY_IN_TRANSITION;
+	entry->wiring_thread = curthread;
 	/*
 	 * Check the map for holes in the specified region.
 	 * If VM_MAP_WIRE_HOLESOK was specified, skip this check.
@@ -2304,8 +2305,24 @@ done:
 	else
 		KASSERT(result, ("vm_map_unwire: lookup failed"));
 }
-	entry = first_entry;
-	while (entry != &map->header && entry->start < end) {
+	for (entry = first_entry; entry != &map->header && entry->start < end;
+	    entry = entry->next) {
+		/*
+		 * If VM_MAP_WIRE_HOLESOK was specified, an empty
+		 * space in the unwired region could have been mapped
+		 * while the map lock was dropped for draining
+		 * MAP_ENTRY_IN_TRANSITION. Moreover, another thread
+		 * could be simultaneously wiring this new mapping
+		 * entry. Detect these cases and skip any entries
+		 * marked as in transition by us.
+		 */
+		if ((entry->eflags & MAP_ENTRY_IN_TRANSITION) == 0 ||
+		    entry->wiring_thread != curthread) {
+
Re: stopping amd causes a freeze
On 22/07/2013 12:07, Konstantin Belousov wrote: > On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: >> ... >> >> I run amd through sysutils/automounter, which is a scripting solution >> that generates an amd.map file based on encountered devices and devd >> events. The SIGHUP it sends to amd to tell it the map file was updated >> does not cause problems, only a -SIGKILL- SIGTERM may cause the freeze. >> >> Nothing was mounted (by amd) during the last freeze. >> >> ... > > Are you sure that the machine did not paniced ? Do you have serial console ? > > The amd(8) locks itself into memory, most likely due to the fear of > deadlock. There are some known issues with user wirings in stable/9. > If the problem you see is indeed due to wiring, you might try to apply > r253187-r253191. I tried that. Applying the diff was straightforward enough. But the resulting kernel paniced as soon as it tried to mount the root fs. So I'll wait for the MFC from someone who knows what he/she is doing. Regards -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: stopping amd causes a freeze
On Tue, Jul 23, 2013 at 10:43 AM, Dominic Fandrey wrote:
> > Don't use KILL or make sure that nobody tries to use amd mountpoints until
> > new instance starts. Manually unmounting them before killing amd may help.
> > Why not let amd do it itself with "/etc/rc.d/amd stop" ?
>
> That was a typo, I'm using SIGTERM. Sorry about that.

On SIGTERM amd will attempt to unmount its mountpoints. If someone is using them, the unmount may not succeed. I have no clue what amd does in that case. The point is that you should treat an amd restart like a reboot of an NFS server.

An amd map reload does not really require an amd restart. In some cases you may have to manually unmount an automounted filesystem if the underlying map has changed, but that's the only case I can think of off the top of my head. In most cases "amq -f" worked well enough for me.

By the way, are you absolutely sure that the script that restarts amd is guaranteed not to touch anything mounted with amd? Otherwise you're risking a deadlock. For example, if PATH contains an amd-mounted directory, then when it's time to execute the next command your script may try to touch that path and hang waiting for amd to respond, which will never happen because the script can't start it.

Now, back to debugging your problem. One way to check what's going on would be to figure out where the processes get stuck. Start with "ps -axl" and look at the STAT field. Chances are that stuck processes will be in uninterruptible sleep state 'D'. Check the MWCHAN field for those. Hitting '^T', which normally sends SIGINFO, should also produce a message that includes the process' wait channel, and is convenient when you have the console where you started the hung app. Dig further into the sleeping process with "procstat -kk PID". It will give you the in-kernel stack trace for the process' threads, which should show what's going on.

You may want to do it from a root login with a local home directory and a minimalistic PATH so it does not touch any amd mount points.

--Artem
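The debugging steps above could be scripted roughly like this. It is a sketch under assumptions: it uses ps -o with the pid, stat, mwchan and command keywords rather than parsing the "ps -axl" column layout, and filter_dstate and list_stuck are names made up for this example.

```sh
#!/bin/sh
# Sketch: list processes in uninterruptible sleep (STAT starting with "D")
# together with their wait channel. Function names are hypothetical.

# Reads "PID STAT MWCHAN COMMAND" lines on stdin and keeps the header line
# plus every row whose STAT column starts with D.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# FreeBSD ps: select the columns explicitly so the filter does not depend
# on the column order of the -l format.
list_stuck() {
    ps -ax -o pid,stat,mwchan,command | filter_dstate
}
```

For each PID that list_stuck prints, "procstat -kk PID" then gives the in-kernel stack, as described above.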
Re: stopping amd causes a freeze
On 22/07/2013 20:05, Artem Belevich wrote: > On Mon, Jul 22, 2013 at 2:50 AM, Dominic Fandrey wrote: > >> Occasionally stopping amd freezes my system. It's a rare occurrence, >> and I haven't found a reliable way to reproduce it. >> >> It's also a real freeze, so there's no way to get into the debugger >> or grab a core dump. I only can perform the 4 seconds hard shutdown to >> revive the system. >> >> I run amd through sysutils/automounter, which is a scripting solution >> that generates an amd.map file based on encountered devices and devd >> events. The SIGHUP it sends to amd to tell it the map file was updated >> does not cause problems, only a SIGKILL may cause the freeze. >> >> Nothing was mounted (by amd) during the last freeze. >> >> > ... > > >> I don't see any angle to tackle this, but I'm throwing it out here >> any way, in the hopes that someone actually has an idea how to approach >> the issue. >> > > Don't use KILL or make sure that nobody tries to use amd mountpoints until > new instance starts. Manually unmounting them before killing amd may help. > Why not let amd do it itself with "/etc/rc.d/amd stop" ? That was a typo, I'm using SIGTERM. Sorry about that. -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: stopping amd causes a freeze
On Mon, Jul 22, 2013 at 2:50 AM, Dominic Fandrey wrote:
> Occasionally stopping amd freezes my system. It's a rare occurrence,
> and I haven't found a reliable way to reproduce it.
>
> It's also a real freeze, so there's no way to get into the debugger
> or grab a core dump. I only can perform the 4 seconds hard shutdown to
> revive the system.
>
> I run amd through sysutils/automounter, which is a scripting solution
> that generates an amd.map file based on encountered devices and devd
> events. The SIGHUP it sends to amd to tell it the map file was updated
> does not cause problems, only a SIGKILL may cause the freeze.
>
> Nothing was mounted (by amd) during the last freeze.

amd itself is a primitive NFS server as far as the system is concerned, and amd mount points are mounted from it. If you just KILL it without giving it a chance to clean things up, you'll potentially end up in a situation similar to mounting from a remote NFS server that's unresponsive. From mount_nfs(8):

    If the server becomes unresponsive while an NFS file system is mounted, any new or outstanding file operations on that file system will hang uninterruptibly until the server comes back. To modify this default behaviour, see the intr and soft options.

> I don't see any angle to tackle this, but I'm throwing it out here
> any way, in the hopes that someone actually has an idea how to approach
> the issue.

Don't use KILL, or make sure that nobody tries to use amd mountpoints until the new instance starts. Manually unmounting them before killing amd may help. Why not let amd do it itself with "/etc/rc.d/amd stop"?
--Artem > > # uname -a > FreeBSD mobileKamikaze.norad 9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #0 > r253413: Wed Jul 17 13:12:46 CEST 2013 > root@mobileKamikaze.norad:/usr/obj/HP6510b-91/amd64/usr/src/sys/HP6510b-91 > amd64 > > That's amd's starting message: > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: no logfile defined; using > stderr > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: AM-UTILS VERSION > INFORMATION: > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Copyright (c) 1997-2006 > Erez Zadok > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Copyright (c) 1990 > Jan-Simon Pendry > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Copyright (c) 1990 > Imperial College of Science, Technology & Medicine > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Copyright (c) 1990 The > Regents of the University of California. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: am-utils version 6.1.5 > (build 901505). > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Report bugs to > https://bugzilla.am-utils.org/ or am-ut...@am-utils.org. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Configured by David > O'Brien on date 4-December-2007 PST. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Built by > root@mobileKamikaze.norad. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: cpu=amd64 (little-endian), > arch=amd64, karch=amd64. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: full_os=freebsd9.2, > os=freebsd9, osver=9.2, vendor=undermydesk, distro=The FreeBSD Project. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: domain=norad, > host=mobileKamikaze, hostd=mobileKamikaze.norad. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Map support for: root, > passwd, union, nis, ndbm, file, exec, error. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: AMFS: nfs, link, nfsx, > nfsl, host, linkx, program, union, ufs, cdfs, > Jul 22 11:32:28 mobileKamikaze amd[8176]/info:pcfs, auto, direct, > toplvl, error, inherit. 
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info: FS: cd9660, nfs, nfs3, > nullfs, msdosfs, ufs, unionfs. > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Network 1: > wire="192.168.1.0" (netnumber=192.168.1). > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Network 2: > wire="192.168.0.0" (netnumber=192.168). > Jul 22 11:32:28 mobileKamikaze amd[8176]/info: My ip addr is 127.0.0.1 > > amd is called with the flags -r -p -a -c 4 -w 2 > > -- > A: Because it fouls the order in which people normally read text. > Q: Why is top-posting such a bad thing? > A: Top-posting. > Q: What is the most annoying thing on usenet and in e-mail? > ___ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org" > ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
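As an illustration of the "don't use KILL" advice in this message, here is a sketch of a stop helper that sends SIGTERM and waits, but never escalates to SIGKILL. term_and_wait and its timeout argument are invented for this example; it is not what /etc/rc.d/amd or sysutils/automounter actually do.

```sh
#!/bin/sh
# Sketch: send SIGTERM and wait for the process to go away, without ever
# escalating to SIGKILL. term_and_wait is a hypothetical name.
#   returns 0 once the process is gone
#   returns 1 if it is still there after the timeout (in seconds)
term_and_wait() {
    pid=$1
    timeout=${2:-30}
    # If the signal cannot be delivered, the process is already gone
    # (or is not ours to signal); nothing left to wait for.
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    return 1
}
```

When this returns 1, the safer reaction is to find out who is still using the amd mount points rather than to reach for kill -9.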
Re: stopping amd causes a freeze
On 22/07/2013 14:35, Ronald Klop wrote: > On Mon, 22 Jul 2013 14:19:44 +0200, Dominic Fandrey > wrote: > >> On 22/07/2013 12:07, Konstantin Belousov wrote: >>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: >>>> Occasionally stopping amd freezes my system. It's a rare occurrence, >>>> and I haven't found a reliable way to reproduce it. >>>> >>>> It's also a real freeze, so there's no way to get into the debugger >>>> or grab a core dump. I only can perform the 4 seconds hard shutdown to >>>> revive the system. >>>> >>>> I run amd through sysutils/automounter, which is a scripting solution >>>> that generates an amd.map file based on encountered devices and devd >>>> events. The SIGHUP it sends to amd to tell it the map file was updated >>>> does not cause problems, only a SIGKILL may cause the freeze. >>>> >>>> Nothing was mounted (by amd) during the last freeze. >>>> >>>> I don't see any angle to tackle this, but I'm throwing it out here >>>> any way, in the hopes that someone actually has an idea how to approach >>>> the issue. >>> >>> Are you sure that the machine did not paniced ? Do you have serial console >>> ? >> >> No, I don't have one. All that I can tell is that everything freezes >> (i.e. Xorg screen and mouse). ACPI events like shutdown don't cause a >> reaction. And the system doesn't respond to ICMP queries. >> >>> The amd(8) locks itself into memory, most likely due to the fear of >>> deadlock. There are some known issues with user wirings in stable/9. >>> If the problem you see is indeed due to wiring, you might try to apply >>> r253187-r253191. >> >> From head? That may be worth a try. It would be better for testing if I >> managed to reproduce the problem reliably, before I test patches. >> >> I see it's scheduled for MFC, soon. >> > > Did you try a run with the INVARIANTS, etc. options in the kernel? That > enables more sanity checking for locks which is too slow for production. 
No I didn't, but I managed to reproduce it in combination with heavy tmpfs load. So now I've got a working test case and will be able to determine whether the suggested fix works. I suppose INVARIANTS would be the next step. -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: stopping amd causes a freeze
On Mon, 22 Jul 2013 14:19:44 +0200, Dominic Fandrey wrote:
> On 22/07/2013 12:07, Konstantin Belousov wrote:
>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote:
>>> Occasionally stopping amd freezes my system. It's a rare occurrence,
>>> and I haven't found a reliable way to reproduce it.
>>>
>>> It's also a real freeze, so there's no way to get into the debugger
>>> or grab a core dump. I only can perform the 4 seconds hard shutdown to
>>> revive the system.
>>>
>>> I run amd through sysutils/automounter, which is a scripting solution
>>> that generates an amd.map file based on encountered devices and devd
>>> events. The SIGHUP it sends to amd to tell it the map file was updated
>>> does not cause problems, only a SIGKILL may cause the freeze.
>>>
>>> Nothing was mounted (by amd) during the last freeze.
>>>
>>> I don't see any angle to tackle this, but I'm throwing it out here
>>> any way, in the hopes that someone actually has an idea how to approach
>>> the issue.
>>
>> Are you sure that the machine did not paniced ? Do you have serial console ?
>
> No, I don't have one. All that I can tell is that everything freezes
> (i.e. Xorg screen and mouse). ACPI events like shutdown don't cause a
> reaction. And the system doesn't respond to ICMP queries.
>
>> The amd(8) locks itself into memory, most likely due to the fear of
>> deadlock. There are some known issues with user wirings in stable/9.
>> If the problem you see is indeed due to wiring, you might try to apply
>> r253187-r253191.
>
> From head? That may be worth a try. It would be better for testing if I
> managed to reproduce the problem reliably, before I test patches.
>
> I see it's scheduled for MFC, soon.

Did you try a run with the INVARIANTS, etc. options in the kernel? That enables more sanity checking for locks which is too slow for production.

Ronald.
Re: stopping amd causes a freeze
On 22/07/2013 12:07, Konstantin Belousov wrote:
> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote:
>> Occasionally stopping amd freezes my system. It's a rare occurrence,
>> and I haven't found a reliable way to reproduce it.
>>
>> It's also a real freeze, so there's no way to get into the debugger
>> or grab a core dump. I only can perform the 4 seconds hard shutdown to
>> revive the system.
>>
>> I run amd through sysutils/automounter, which is a scripting solution
>> that generates an amd.map file based on encountered devices and devd
>> events. The SIGHUP it sends to amd to tell it the map file was updated
>> does not cause problems, only a SIGKILL may cause the freeze.
>>
>> Nothing was mounted (by amd) during the last freeze.
>>
>> I don't see any angle to tackle this, but I'm throwing it out here
>> any way, in the hopes that someone actually has an idea how to approach
>> the issue.
>
> Are you sure that the machine did not paniced ? Do you have serial console ?

No, I don't have one. All that I can tell is that everything freezes (i.e. Xorg screen and mouse). ACPI events like shutdown don't cause a reaction. And the system doesn't respond to ICMP queries.

> The amd(8) locks itself into memory, most likely due to the fear of
> deadlock. There are some known issues with user wirings in stable/9.
> If the problem you see is indeed due to wiring, you might try to apply
> r253187-r253191.

From head? That may be worth a try. It would be better for testing if I managed to reproduce the problem reliably, before I test patches.

I see it's scheduled for MFC, soon.

--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Re: stopping amd causes a freeze
On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote: > Occasionally stopping amd freezes my system. It's a rare occurrence, > and I haven't found a reliable way to reproduce it. > > It's also a real freeze, so there's no way to get into the debugger > or grab a core dump. I only can perform the 4 seconds hard shutdown to > revive the system. > > I run amd through sysutils/automounter, which is a scripting solution > that generates an amd.map file based on encountered devices and devd > events. The SIGHUP it sends to amd to tell it the map file was updated > does not cause problems, only a SIGKILL may cause the freeze. > > Nothing was mounted (by amd) during the last freeze. > > I don't see any angle to tackle this, but I'm throwing it out here > any way, in the hopes that someone actually has an idea how to approach > the issue. Are you sure that the machine did not paniced ? Do you have serial console ? The amd(8) locks itself into memory, most likely due to the fear of deadlock. There are some known issues with user wirings in stable/9. If the problem you see is indeed due to wiring, you might try to apply r253187-r253191. pgpsPkzdjccIf.pgp Description: PGP signature
stopping amd causes a freeze
Occasionally stopping amd freezes my system. It's a rare occurrence, and I haven't found a reliable way to reproduce it. It's also a real freeze, so there's no way to get into the debugger or grab a core dump. I only can perform the 4 seconds hard shutdown to revive the system. I run amd through sysutils/automounter, which is a scripting solution that generates an amd.map file based on encountered devices and devd events. The SIGHUP it sends to amd to tell it the map file was updated does not cause problems, only a SIGKILL may cause the freeze. Nothing was mounted (by amd) during the last freeze. I don't see any angle to tackle this, but I'm throwing it out here any way, in the hopes that someone actually has an idea how to approach the issue. # uname -a FreeBSD mobileKamikaze.norad 9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #0 r253413: Wed Jul 17 13:12:46 CEST 2013 root@mobileKamikaze.norad:/usr/obj/HP6510b-91/amd64/usr/src/sys/HP6510b-91 amd64 That's amd's starting message: Jul 22 11:32:28 mobileKamikaze amd[8176]/info: no logfile defined; using stderr Jul 22 11:32:28 mobileKamikaze amd[8176]/info: AM-UTILS VERSION INFORMATION: Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Copyright (c) 1997-2006 Erez Zadok Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Copyright (c) 1990 Jan-Simon Pendry Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Copyright (c) 1990 Imperial College of Science, Technology & Medicine Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Copyright (c) 1990 The Regents of the University of California. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: am-utils version 6.1.5 (build 901505). Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Report bugs to https://bugzilla.am-utils.org/ or am-ut...@am-utils.org. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Configured by David O'Brien on date 4-December-2007 PST. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Built by root@mobileKamikaze.norad. 
Jul 22 11:32:28 mobileKamikaze amd[8176]/info: cpu=amd64 (little-endian), arch=amd64, karch=amd64. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: full_os=freebsd9.2, os=freebsd9, osver=9.2, vendor=undermydesk, distro=The FreeBSD Project. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: domain=norad, host=mobileKamikaze, hostd=mobileKamikaze.norad. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Map support for: root, passwd, union, nis, ndbm, file, exec, error. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: AMFS: nfs, link, nfsx, nfsl, host, linkx, program, union, ufs, cdfs, Jul 22 11:32:28 mobileKamikaze amd[8176]/info:pcfs, auto, direct, toplvl, error, inherit. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: FS: cd9660, nfs, nfs3, nullfs, msdosfs, ufs, unionfs. Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Network 1: wire="192.168.1.0" (netnumber=192.168.1). Jul 22 11:32:28 mobileKamikaze amd[8176]/info: Network 2: wire="192.168.0.0" (netnumber=192.168). Jul 22 11:32:28 mobileKamikaze amd[8176]/info: My ip addr is 127.0.0.1 amd is called with the flags -r -p -a -c 4 -w 2 -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 6.2-Release ..ish.. CF + ata == freeze
This is a really old thread I thought I would bring back to life. I have heard that the flash card vendor has fessed up to a problem and said there is a software fix they can create. So far I have no ETA on when that is going to happen, and for the record I don't think I will get one. Oh well... here comes a crap load of RMAs.
Re: FreeBSD 8.3 and 9.0 freeze with firefox
On 04/15/12 03:29, Ronald Klop wrote: On Fri, 30 Mar 2012 04:28:06 +0200, Joseph Olatt wrote: Hi, Starting with 8.3, I've been experiencing FreeBSD freezing up completely after using firefox for a while. Thinking the problem would go away if I upgraded to 9.0, I did that and I am still experiencing the same freezing up. The mouse pointer freezes, the keyboard freezes (caps lock light will not come on; Ctrl-Alt-F[1-10] does not work etc.). The only way to get the system back is by pressing and holding down the power button. The problem seems similar to: kern/163145 There is nothing in /var/log/messages to indicate a problem. Output of pciconf -lv and uname -a are at: http://www.eskimo.com/~joji/wisdom/ Anybody else experiencing similar freeze ups with 8.3 or 9.0 while using firefox? Since Firefox uses all kinds of GPU stuff nowadays, is it possible that it locks up your graphics card? I suggest trying to turn off GPU hardware acceleration in Firefox. Ronald. Ronald and Joseph, I've been seeing this too on an AMD Phenom II X3 720 running Stable from 2012-03-23. A Prescott Dell running 9 Stable that I use at work does it too. The freezes last some number of seconds, longer if YouTube-type video is involved. Annoying. I'll try the graphics tip. Thanks, r
Re: FreeBSD 8.3 and 9.0 freeze with firefox
On Fri, 30 Mar 2012 04:28:06 +0200, Joseph Olatt wrote: Hi, Starting with 8.3, I've been experiencing FreeBSD freezing up completely after using firefox for a while. Thinking the problem would go away if I upgraded to 9.0, I did that and I am still experiencing the same freezing up. The mouse pointer freezes, the keyboard freezes (caps lock light will not come on; Ctrl-Alt-F[1-10] does not work etc.). The only way to get the system back is by pressing and holding down the power button. The problem seems similar to: kern/163145 There is nothing in /var/log/messages to indicate a problem. Output of pciconf -lv and uname -a are at: http://www.eskimo.com/~joji/wisdom/ Anybody else experiencing similar freeze ups with 8.3 or 9.0 while using firefox? Since Firefox uses all kinds of GPU stuff nowadays, is it possible that it locks up your graphics card? I suggest trying to turn off GPU hardware acceleration in Firefox. Ronald.
Re: FreeBSD 8.3 and 9.0 freeze with firefox
On Fri, Mar 30, 2012 at 10:41:54AM +0700, Erich Dollansky wrote: > Hi, > > On Friday 30 March 2012 09:28:06 Joseph Olatt wrote: > > > > Starting with 8.3, I've been experiencing FreeBSD freezing up completely > > after using firefox for a while. Thinking the problem would go away if I > > upgraded to 9.0, I did that and I am still experiencing the same > > freezing up. The mouse pointer freezes, the keyboard freezes (caps lock > > light will not come on; Ctrl-Alt-F[1-10] does not work etc.). The only > > way to get the system back is by pressing and holding down the power > > button. > > > > The problem seems similar to: kern/163145 > > > > > > There is nothing in /var/log/messages to indicate a problem. Output of > > pciconf -lv and uname -a are at: > > > > http://www.eskimo.com/~joji/wisdom/ > > > > Anybody else experiencing similar freeze ups with 8.3 or 9.0 while using > > firefox? > > > I use 8.3 and Firefox without problems. What extension did you install? Are > they all properly updated? > > Earlier, it helped deleting firefox' directory in the user directory. > > Erich Erich, Thanks for your response. I've removed the .mozilla directory from my home directory. Let's see if it will make a difference. It is quite possible it will. I had updated the firefox port when I updated from 8.2 to 8.3. Thanks for the suggestion. joseph
Re: FreeBSD 8.3 and 9.0 freeze with firefox
Hi, On Friday 30 March 2012 09:28:06 Joseph Olatt wrote: > > Starting with 8.3, I've been experiencing FreeBSD freezing up completely > after using firefox for a while. Thinking the problem would go away if I > upgraded to 9.0, I did that and I am still experiencing the same > freezing up. The mouse pointer freezes, the keyboard freezes (caps lock > light will not come on; Ctrl-Alt-F[1-10] does not work etc.). The only > way to get the system back is by pressing and holding down the power > button. > > The problem seems similar to: kern/163145 > > > There is nothing in /var/log/messages to indicate a problem. Output of > pciconf -lv and uname -a are at: > > http://www.eskimo.com/~joji/wisdom/ > > Anybody else experiencing similar freeze ups with 8.3 or 9.0 while using > firefox? > I use 8.3 and Firefox without problems. Which extensions did you install? Are they all properly updated? In the past, deleting Firefox's directory in the user's home directory has helped. Erich
FreeBSD 8.3 and 9.0 freeze with firefox
Hi, Starting with 8.3, I've been experiencing FreeBSD freezing up completely after using firefox for a while. Thinking the problem would go away if I upgraded to 9.0, I did that and I am still experiencing the same freezing up. The mouse pointer freezes, the keyboard freezes (caps lock light will not come on; Ctrl-Alt-F[1-10] does not work etc.). The only way to get the system back is by pressing and holding down the power button. The problem seems similar to: kern/163145 There is nothing in /var/log/messages to indicate a problem. Output of pciconf -lv and uname -a are at: http://www.eskimo.com/~joji/wisdom/ Anybody else experiencing similar freeze ups with 8.3 or 9.0 while using firefox?
Re: 6.2-Release ..ish.. CF + ata == freeze?
The plot is starting to thicken. I've noticed all the systems that have done this (so far) have this flash card in them. STEC M2+ CF 9.0.2 K1186-2 From talking to checkpoint this is a newer flash they have started using. I just had a 4th machine do the same thing yesterday. Basic install, about 70% disk space free, very new install, 1-2 months old, and the uptime on the machine in question was only 16 days. After rebooting I ran dd if=/dev/zero of=~/file bs=1m count=350 a few times and didn't get any errors. The latest machine is a 1 gig version of the flash listed above, so this ate almost all the free disk space. Checkpoint is asking that we RMA one of the flash cards so they can play with it. From: "jfleming...@yahoo.com" To: Jeremy Chadwick Cc: "freebsd-stable@freebsd.org" Sent: Tuesday, February 14, 2012 7:57 PM Subject: Re: 6.2-Release ..ish.. CF + ata == freeze? 2 of the 3 CF cards are very new, less than 6 months old. I think around 65-70 percent is in use. This number doesn't change unless the user dumps data in a home dir, which isn't the case so far. You are correct that only writes are failing. The msgbuf has more than what I pasted, but I'm pretty sure it's just more of the same errors. I'll double-check. The other slices are very small. One is 35 meg the other is 100 some odd meg. H is 1.2 gig. I don't know if I'll be able to try the dd test for a few reasons but I'll check it out. Let me ask you this. Say zeroing out the drive works without error. Does that tell me anything? I also don't have access to SMART tools as this is basically a closed system and the vendor would never give us access to a compiler. Granted I haven't tried just throwing on gcc from 6.2. I could play with that or maybe since said vendor's dev team is keeping track of this thread they could provide said binary :). I really don't like the idea of replacing hardware as I'm looking at around 200 boxes. I really hope it doesn't come to that. Thanks for the reply! 
Sent via BlackBerry from T-Mobile -Original Message- From: Jeremy Chadwick Date: Mon, 13 Feb 2012 21:18:28 To: john fleming Cc: freebsd-stable@freebsd.org Subject: Re: 6.2-Release ..ish.. CF + ata == freeze? On Mon, Feb 13, 2012 at 08:43:08PM -0800, john fleming wrote: > Just thought i would post over here as i'm not getting a warm fuzzy from > checkpoint about being able to find the root cause of an issue. I have a > large install base of IPSO checkpoint firewalls, which are based on FreeBSD > 6.2. I've had 3 firewalls hang basically the same way, with something that > looks like a filesystem issue or an?issue with a CF card. FreeBSD 6.2 was EOL'd in early-to-mid-2008. The ATA driver has changed significantly since then (present-day uses CAM). > Does anyone happen to know of any bugs (i've been looking around) that could > cause something like that? Granted, it could be a batch of bad CF cards, but > its odd that i'm seeing the same thing on 3 different boxes and once rebooted > they seem ok. > ? > Also is it possible to get useful info form the atacontroller when things go > south like this from the ddb prompt? Not particularly. What's shown below indicates that the driver had issued some form of ATA write command (there are multiple kinds per ATA specification), and either the underlying media (CF/disk) or controller stalled/locked up/took too long. I forget what the timeout value is in 6.2; I can't be bothered to remember such from 6 years ago. 
:-) > This is what shows in show msgbuf > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5 > ?g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5 error 5 = EIO = Input/output error. But this isn't too big of a surprise given the timeouts you see prior. Are these CF cards brand new -- meaning, are they completely unused (having never had any writes done to them), or have they been in use a while? I'm betting they've been in use a while, and have probably been doing many writes over the years. Two things to note here: 1) The errors you've shown are only happening on writes, not reads. Of course if you omitted information then this isn't an accurate statemen
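Editorial sketch, not from the original posters: the dd run mentioned above only reports something if a write fails outright. A per-block timing loop can also flag writes that stall without failing. The following is a minimal Python sketch under stated assumptions: the function names (write_all, timed_write_test), the 1-second "slow" threshold and the result layout are all hypothetical, and the defaults (1 MiB blocks, 350 blocks) simply mirror the bs=1m count=350 used in the thread. As the thread itself asks, a clean run on a file proves little about the card, because the filesystem and the card's wear levelling decide where the data actually lands.

```python
import os
import tempfile
import time


def write_all(fd, data):
    """Write all of data to fd, looping in case of short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def timed_write_test(path, block_size=1 << 20, count=350, slow_after=1.0):
    """Write `count` zero-filled blocks to `path`, syncing after each one.

    Returns a dict with the number of blocks written, the blocks that took
    longer than `slow_after` seconds, and the index/errno of the first
    failed block (None if every write succeeded).
    """
    block = bytes(block_size)  # zero-filled, like /dev/zero
    result = {"blocks_ok": 0, "slow_blocks": [], "failed_block": None, "errno": None}
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for index in range(count):
            started = time.monotonic()
            try:
                write_all(fd, block)
                os.fsync(fd)  # push the block to the device before timing it
            except OSError as exc:
                # e.g. errno 5 (EIO), as seen in the msgbuf output
                result["failed_block"] = index
                result["errno"] = exc.errno
                break
            elapsed = time.monotonic() - started
            if elapsed > slow_after:
                result["slow_blocks"].append((index, elapsed))
            result["blocks_ok"] += 1
    finally:
        os.close(fd)
    return result


if __name__ == "__main__":
    # Demo on a temporary file; point `path` at the suspect filesystem instead.
    with tempfile.TemporaryDirectory() as tmp:
        print(timed_write_test(os.path.join(tmp, "file"), count=8))
```

Syncing after every block makes the run much slower than plain dd, but it ties each delay or error to a specific block instead of hiding it in the buffer cache.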
Re: 6.2-Release ..ish.. CF + ata == freeze?
2 of the 3 CF cards are very new, less than 6 months old. I think around 65-70 percent is in use. This number doesn't change unless the user dumps data in a home dir, which isn't the case so far. You are correct that only writes are failing. The msgbuf has more than what I pasted, but I'm pretty sure it's just more of the same errors. I'll double-check. The other slices are very small. One is 35 meg the other is 100 some odd meg. H is 1.2 gig. I don't know if I'll be able to try the dd test for a few reasons but I'll check it out. Let me ask you this. Say zeroing out the drive works without error. Does that tell me anything? I also don't have access to SMART tools as this is basically a closed system and the vendor would never give us access to a compiler. Granted I haven't tried just throwing on gcc from 6.2. I could play with that or maybe since said vendor's dev team is keeping track of this thread they could provide said binary :). I really don't like the idea of replacing hardware as I'm looking at around 200 boxes. I really hope it doesn't come to that. Thanks for the reply! Sent via BlackBerry from T-Mobile -Original Message- From: Jeremy Chadwick Date: Mon, 13 Feb 2012 21:18:28 To: john fleming Cc: freebsd-stable@freebsd.org Subject: Re: 6.2-Release ..ish.. CF + ata == freeze? On Mon, Feb 13, 2012 at 08:43:08PM -0800, john fleming wrote: > Just thought i would post over here as i'm not getting a warm fuzzy from > checkpoint about being able to find the root cause of an issue. I have a > large install base of IPSO checkpoint firewalls, which are based on FreeBSD > 6.2. I've had 3 firewalls hang basically the same way, with something that > looks like a filesystem issue or an issue with a CF card. FreeBSD 6.2 was EOL'd in early-to-mid-2008. The ATA driver has changed significantly since then (present-day uses CAM). > Does anyone happen to know of any bugs (i've been looking around) that could > cause something like that? 
Granted, it could be a batch of bad CF cards, but > its odd that i'm seeing the same thing on 3 different boxes and once rebooted > they seem ok. > ? > Also is it possible to get useful info form the atacontroller when things go > south like this from the ddb prompt? Not particularly. What's shown below indicates that the driver had issued some form of ATA write command (there are multiple kinds per ATA specification), and either the underlying media (CF/disk) or controller stalled/locked up/took too long. I forget what the timeout value is in 6.2; I can't be bothered to remember such from 6 years ago. :-) > This is what shows in show msgbuf > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5 > ?g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5 error 5 = EIO = Input/output error. But this isn't too big of a surprise given the timeouts you see prior. Are these CF cards brand new -- meaning, are they completely unused (having never had any writes done to them), or have they been in use a while? I'm betting they've been in use a while, and have probably been doing many writes over the years. Two things to note here: 1) The errors you've shown are only happening on writes, not reads. Of course if you omitted information then this isn't an accurate statement. 2) Timeouts are seen when issuing writes to some LBA regions. How full is the CF card, disk-space-wise? Not just ad0s4h, I'm talking about the entire card. 
How much space is roughly available? They're very small CF cards (1.8GByte roughly), and the less space available, the less effectiveness of wear levelling (and in some cases the slower the writes are). Reason I ask: given that these are CF cards, this smells of cards which are simply "worn down". CF cards have limited numbers of writes, and the card may be "freaking out" internally when attempting to write to some LBAs which map to CF sectors that are, in effect, "bad". The CF cards' ECC implementation may be buggy, or may simply be "spinning hard" for too long. You can read about this sort of behaviour on Wikipedia's CompactFlash article. You wouldn't be able to verify this with dd if=/dev/ad0, because those are r
Re: 6.2-Release ..ish.. CF + ata == freeze?
On Tue, 2012-02-14 at 00:12 -0500, Jason Hellenthal wrote: > > On Mon, Feb 13, 2012 at 08:43:08PM -0800, john fleming wrote: > > Just thought i would post over here as i'm not getting a warm fuzzy from > > checkpoint about being able to find the root cause of an issue. I have a > > large install base of IPSO checkpoint firewalls, which are based on FreeBSD > > 6.2. I've had 3 firewalls hang basically the same way, with something that > > looks like a filesystem issue or an issue with a CF card. > > > > Does anyone happen to know of any bugs (i've been looking around) that > > could cause something like that? Granted, it could be a batch of bad CF > > cards, but its odd that i'm seeing the same thing on 3 different boxes and > > once rebooted they seem ok. > > > > Also is it possible to get useful info form the atacontroller when things > > go south like this from the ddb prompt? > > > > This is what shows in show msgbuf > > ad0: timeout waiting to issue command > > ad0: error issuing WRITE command > > ad0: timeout waiting to issue command > > ad0: error issuing WRITE command > > ad0: timeout waiting to issue command > > ad0: error issuing WRITE command > > ad0: timeout waiting to issue command > > ad0: error issuing WRITE command > > g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5 > > g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5 > > g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5 > > g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5 > > g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5 > > > > ad0: 1882MB at ata0-master PIO4 > > atapci0: port > > 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x5070-0x507f mem 0x80301000-0x803013ff > > at device 31.1 on pci0 > > ata0: on atapci0 > > ata1: on atapci0 > > atapci1: port > > 0x5088-0x508f,0x50a4-0x50a7,0x5080-0x5087,0x50a0-0x50a3,0x5060-0x506f irq > > 15 at device 31.2 on pci0 > > ata2: on atapci1 > > ata3: on atapci1ad0s4h is 
basically a r/w ufs partition on > > the box where almost anything that needs to be written goes. > > trace > > Tracing pid 1101 tid 100043 td 0x656d8460 > > kdb_enter(608cc388,6246,656d8460,64ba1400,6095d580,...) at kdb_enter+0x2b > > siointr1(64ba1400) at siointr1+0xf0 > > siointr(64ba1400) at siointr+0x38 > > intr_execute_handler(6095d580,f0a4ab04,6,6095d580,f0a4aafc,...) at > > intr_execute_handler+0x61 > > intr_execute_handlers(6095d580,f0a4ab04,6,0,656d8460,...) at > > intr_execute_handlers+0x40 > > atpic_handle_intr(4) at atpic_handle_intr+0x96 > > Xatpic_intr4() at Xatpic_intr4+0x20 > > --- interrupt, eip = 0x606044af, esp = 0xf0a4ab48, ebp = 0xf0a4ab5c --- > > lockmgr(e1456a04,6,0,656d8460) at lockmgr+0x58f > > getdirtybuf(e14569a4,60a405e4,1) at getdirtybuf+0x2e2 > > flush_deplist(68b30850,1,f0a4abb8) at flush_deplist+0x30 > > flush_inodedep_deps(656fa28c,1f235) at flush_inodedep_deps+0xcf > > softdep_sync_metadata(65964618) at softdep_sync_metadata+0x61 > > ffs_syncvnode(65964618,1) at ffs_syncvnode+0x3a2 > > ffs_fsync(f0a4ac74) at ffs_fsync+0x12 > > VOP_FSYNC_APV(60949260,f0a4ac74) at VOP_FSYNC_APV+0x38 > > fsync(656d8460,f0a4acb4) at fsync+0x170 > > syscall(805003b,806003b,5fbf003b,805,288be450,...) at syscall+0x2ee > > Xint0x80_syscall() at Xint0x80_syscall+0x1f > > This looks to be a problem with softupdates and CF cards. Can you get > this to repeat on a brand new (good) card ? > EIO errors on a write that lead to a panic nearly always backtrace into the softupdates code, because that code pretty much has to panic if it can't write things in the proper order. That doesn't imply that the softupdates code is at fault in any way, or that the errors would go away if softupdates were turned off. In fact, I consider it important to have softupdates enabled on CF and SDCard media. 
The number of writes (and especially of repeated re-writes of the same filesystem metadata sectors) goes way way up without SU enabled, and that's bad for media with a limited number of write cycles in its lifetime. We've been using 6.2 with SU enabled on CF cards for many years at Symmetricom; we're still shipping systems with that config. Depending on the motherboard or SBC, we often have to disable ata DMA, or limit it to a max of WDMA2 mode. The indication that you need to do so is typically a lockup either trying to load the kernel and modules, or sometimes that works but it locks up while initializing the ata driver. [1] If your systems have been running fine with DMA enabled, it's not the sort of problem that suddenly appears out of the blue. You find out pretty quickly on new hardware that you need to disable it, because it doesn't boot reliably. I tend to agree with Jeremy's assessment that you may have some CF cards that have neared the end of their life, and especially if they're full the automatic wear leveling can't find any un-worn cells to use. If the cards are old they may have primitive wear-leve
Re: 6.2-Release ..ish.. CF + ata == freeze?
On Mon, Feb 13, 2012 at 11:38:06PM -0600, Adam Vande More wrote: > On Mon, Feb 13, 2012 at 10:43 PM, john fleming wrote: > > > Just thought i would post over here as i'm not getting a warm fuzzy from > > checkpoint about being able to find the root cause of an issue. I have a > > large install base of IPSO checkpoint firewalls, which are based on FreeBSD > > 6.2. I've had 3 firewalls hang basically the same way, with something that > > looks like a filesystem issue or an issue with a CF card. > > > > There was a thread just the other day that mentioned lockup problems with DMA > and CF cards. Disabling DMA or reducing the mode helped. Not sure if > that applies to a FreeBSD version that old. > I saw that thread. I doubt it is related to his issue, since he is running 6.2. Besides, his dmesg shows otherwise: the card is attached in PIO4 mode, not a DMA mode. ad0: 1882MB at ata0-master PIO4
Re: 6.2-Release ..ish.. CF + ata == freeze?
On Mon, Feb 13, 2012 at 10:43 PM, john fleming wrote: > Just thought i would post over here as i'm not getting a warm fuzzy from > checkpoint about being able to find the root cause of an issue. I have a > large install base of IPSO checkpoint firewalls, which are based on FreeBSD > 6.2. I've had 3 firewalls hang basically the same way, with something that > looks like a filesystem issue or an issue with a CF card. > There was a thread just the other day that mentioned lockup problems with DMA and CF cards. Disabling DMA or reducing the mode helped. Not sure if that applies to a FreeBSD version that old. -- Adam Vande More
Re: 6.2-Release ..ish.. CF + ata == freeze?
I can't seem to replicate it at all. I've seen it happen on 3 different IPSO boxes so far. The last machine it happened on is maybe 4 months old. Basically, on all 3 machines the problem doesn't come back once rebooted. Checkpoint so far is telling me it's a known issue and they don't know what the fix is. What makes you think it's softupdates? Would that cause the write timeout as well? I'm just not sure at which level this is failing: filesystem, flash, or ATA controller. Thanks for the reply!
Re: 6.2-Release ..ish.. CF + ata == freeze?
On Mon, Feb 13, 2012 at 08:43:08PM -0800, john fleming wrote: > Just thought i would post over here as i'm not getting a warm fuzzy from > checkpoint about being able to find the root cause of an issue. I have a > large install base of IPSO checkpoint firewalls, which are based on FreeBSD > 6.2. I've had 3 firewalls hang basically the same way, with something that > looks like a filesystem issue or an issue with a CF card. FreeBSD 6.2 was EOL'd in early-to-mid-2008. The ATA driver has changed significantly since then (present-day uses CAM). > Does anyone happen to know of any bugs (i've been looking around) that could > cause something like that? Granted, it could be a batch of bad CF cards, but > its odd that i'm seeing the same thing on 3 different boxes and once rebooted > they seem ok. > > Also is it possible to get useful info form the atacontroller when things go > south like this from the ddb prompt? Not particularly. What's shown below indicates that the driver had issued some form of ATA write command (there are multiple kinds per ATA specification), and either the underlying media (CF/disk) or controller stalled/locked up/took too long. I forget what the timeout value is in 6.2; I can't be bothered to remember such from 6 years ago. :-) > This is what shows in show msgbuf > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5 error 5 = EIO = Input/output error. 
But this isn't too big of a surprise given the timeouts you see prior. Are these CF cards brand new -- meaning, are they completely unused (having never had any writes done to them), or have they been in use a while? I'm betting they've been in use a while, and have probably been doing many writes over the years. Two things to note here: 1) The errors you've shown are only happening on writes, not reads. Of course if you omitted information then this isn't an accurate statement. 2) Timeouts are seen when issuing writes to some LBA regions. How full is the CF card, disk-space-wise? Not just ad0s4h, I'm talking about the entire card. How much space is roughly available? They're very small CF cards (1.8GByte roughly), and the less space available, the less effectiveness of wear levelling (and in some cases the slower the writes are). Reason I ask: given that these are CF cards, this smells of cards which are simply "worn down". CF cards have limited numbers of writes, and the card may be "freaking out" internally when attempting to write to some LBAs which map to CF sectors that are, in effect, "bad". The CF cards' ECC implementation may be buggy, or may simply be "spinning hard" for too long. You can read about this sort of behaviour on Wikipedia's CompactFlash article. You wouldn't be able to verify this with dd if=/dev/ad0, because those are read operations. You could zero the media (dd if=/dev/zero of=/dev/ad0) as a form of verification if you wanted. Do you happen to know if these CF cards support SMART? If so, installing smartmontools (version 5.42 or newer please) and providing output from "smartctl -a /dev/ad0" may be helpful to me, but I make no guarantees anything of use will be shown there. Overall my advice would be to replace the CF cards, especially if they have been in use for a long while. It really doesn't matter to me that it's happening on 3 machines (honest), especially if these are 6.2 machines with CF cards that have been in use for years. 
We're lucky to get 2 years out of our CF cards on our Juniper M120/320s before they start spitting I/O errors. Pick larger CF cards as well; more space = more room for effective wear levelling. > > ad0: 1882MB at ata0-master PIO4 > atapci0: port > 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x5070-0x507f mem 0x80301000-0x803013ff > at device 31.1 on pci0 > ata0: on atapci0 > ata1: on atapci0 > atapci1: port > 0x5088-0x508f,0x50a4-0x50a7,0x5080-0x5087,0x50a0-0x50a3,0x5060-0x506f irq 15 > at device 31.2 on pci0 > ata2: on atapci1 > ata3: on atapci1 > ad0s4h is basically a r/w ufs partition on > the box where almost anything that needs to be written goes. > trace > Tracing pid 1101 tid 100043 td 0x656d8460 > kdb_enter(608cc388,6246,656d8460,64ba1400,6095d580,...) at kdb_enter+0x2b > siointr1(64ba1400) at siointr1+0xf0 > siointr(64ba1400) at siointr+0x38 > intr_execute_handler(6095d580,f0a4ab04,6,6095d580,f0a4aafc,...) at > intr_execute_handler+0x61 > int
Re: 6.2-Release ..ish.. CF + ata == freeze?
On Mon, Feb 13, 2012 at 08:43:08PM -0800, john fleming wrote: > Just thought i would post over here as i'm not getting a warm fuzzy from > checkpoint about being able to find the root cause of an issue. I have a > large install base of IPSO checkpoint firewalls, which are based on FreeBSD > 6.2. I've had 3 firewalls hang basically the same way, with something that > looks like a filesystem issue or an issue with a CF card. > > Does anyone happen to know of any bugs (i've been looking around) that could > cause something like that? Granted, it could be a batch of bad CF cards, but > its odd that i'm seeing the same thing on 3 different boxes and once rebooted > they seem ok. > > Also is it possible to get useful info form the atacontroller when things go > south like this from the ddb prompt? > > This is what shows in show msgbuf > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > ad0: timeout waiting to issue command > ad0: error issuing WRITE command > g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5 > g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5 > > ad0: 1882MB at ata0-master PIO4 > atapci0: port > 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x5070-0x507f mem 0x80301000-0x803013ff > at device 31.1 on pci0 > ata0: on atapci0 > ata1: on atapci0 > atapci1: port > 0x5088-0x508f,0x50a4-0x50a7,0x5080-0x5087,0x50a0-0x50a3,0x5060-0x506f irq 15 > at device 31.2 on pci0 > ata2: on atapci1 > ata3: on atapci1ad0s4h is basically a r/w ufs partition on > the box where almost anything that needs to be written goes. 
> trace > Tracing pid 1101 tid 100043 td 0x656d8460 > kdb_enter(608cc388,6246,656d8460,64ba1400,6095d580,...) at kdb_enter+0x2b > siointr1(64ba1400) at siointr1+0xf0 > siointr(64ba1400) at siointr+0x38 > intr_execute_handler(6095d580,f0a4ab04,6,6095d580,f0a4aafc,...) at > intr_execute_handler+0x61 > intr_execute_handlers(6095d580,f0a4ab04,6,0,656d8460,...) at > intr_execute_handlers+0x40 > atpic_handle_intr(4) at atpic_handle_intr+0x96 > Xatpic_intr4() at Xatpic_intr4+0x20 > --- interrupt, eip = 0x606044af, esp = 0xf0a4ab48, ebp = 0xf0a4ab5c --- > lockmgr(e1456a04,6,0,656d8460) at lockmgr+0x58f > getdirtybuf(e14569a4,60a405e4,1) at getdirtybuf+0x2e2 > flush_deplist(68b30850,1,f0a4abb8) at flush_deplist+0x30 > flush_inodedep_deps(656fa28c,1f235) at flush_inodedep_deps+0xcf > softdep_sync_metadata(65964618) at softdep_sync_metadata+0x61 > ffs_syncvnode(65964618,1) at ffs_syncvnode+0x3a2 > ffs_fsync(f0a4ac74) at ffs_fsync+0x12 > VOP_FSYNC_APV(60949260,f0a4ac74) at VOP_FSYNC_APV+0x38 > fsync(656d8460,f0a4acb4) at fsync+0x170 > syscall(805003b,806003b,5fbf003b,805,288be450,...) at syscall+0x2ee > Xint0x80_syscall() at Xint0x80_syscall+0x1f This looks to be a problem with softupdates and CF cards. Can you get this to repeat on a brand new (good) card ? -- ;s =; ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
6.2-Release ..ish.. CF + ata == freeze?
Just thought I would post over here, as I'm not getting a warm fuzzy from Check Point about being able to find the root cause of an issue. I have a large install base of IPSO Check Point firewalls, which are based on FreeBSD 6.2. I've had 3 firewalls hang in basically the same way, with something that looks like a filesystem issue or an issue with a CF card.

Does anyone happen to know of any bugs (I've been looking around) that could cause something like that? Granted, it could be a batch of bad CF cards, but it's odd that I'm seeing the same thing on 3 different boxes, and once rebooted they seem OK.

Also, is it possible to get useful info from the ATA controller at the ddb prompt when things go south like this?

This is what shows in "show msgbuf":

ad0: timeout waiting to issue command
ad0: error issuing WRITE command
ad0: timeout waiting to issue command
ad0: error issuing WRITE command
ad0: timeout waiting to issue command
ad0: error issuing WRITE command
ad0: timeout waiting to issue command
ad0: error issuing WRITE command
g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5
g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5
g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5
g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5
g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5

ad0: 1882MB at ata0-master PIO4
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x5070-0x507f mem 0x80301000-0x803013ff at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
atapci1: port 0x5088-0x508f,0x50a4-0x50a7,0x5080-0x5087,0x50a0-0x50a3,0x5060-0x506f irq 15 at device 31.2 on pci0
ata2: on atapci1
ata3: on atapci1

ad0s4h is basically a r/w UFS partition on the box where almost anything that needs to be written goes.

trace
Tracing pid 1101 tid 100043 td 0x656d8460
kdb_enter(608cc388,6246,656d8460,64ba1400,6095d580,...) at kdb_enter+0x2b
siointr1(64ba1400) at siointr1+0xf0
siointr(64ba1400) at siointr+0x38
intr_execute_handler(6095d580,f0a4ab04,6,6095d580,f0a4aafc,...) at intr_execute_handler+0x61
intr_execute_handlers(6095d580,f0a4ab04,6,0,656d8460,...) at intr_execute_handlers+0x40
atpic_handle_intr(4) at atpic_handle_intr+0x96
Xatpic_intr4() at Xatpic_intr4+0x20
--- interrupt, eip = 0x606044af, esp = 0xf0a4ab48, ebp = 0xf0a4ab5c ---
lockmgr(e1456a04,6,0,656d8460) at lockmgr+0x58f
getdirtybuf(e14569a4,60a405e4,1) at getdirtybuf+0x2e2
flush_deplist(68b30850,1,f0a4abb8) at flush_deplist+0x30
flush_inodedep_deps(656fa28c,1f235) at flush_inodedep_deps+0xcf
softdep_sync_metadata(65964618) at softdep_sync_metadata+0x61
ffs_syncvnode(65964618,1) at ffs_syncvnode+0x3a2
ffs_fsync(f0a4ac74) at ffs_fsync+0x12
VOP_FSYNC_APV(60949260,f0a4ac74) at VOP_FSYNC_APV+0x38
fsync(656d8460,f0a4acb4) at fsync+0x170
syscall(805003b,806003b,5fbf003b,805,288be450,...) at syscall+0x2ee
Xint0x80_syscall() at Xint0x80_syscall+0x1f
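The g_vfs_done() errors in the msgbuf output above can be checked mechanically. What follows is a minimal sketch, assuming the messages have been saved as text with one message per line; the function name check_contiguous_writes is made up for illustration. It pulls out the offset and length of every failed WRITE and reports whether each one starts exactly where the previous one ended. Error 5 is EIO.

```shell
#!/bin/sh
# Hypothetical helper: reads msgbuf/console text on stdin (one message
# per line) and summarises the failed g_vfs_done() WRITE requests.
check_contiguous_writes() {
    # Reduce each matching line to "offset length errno".
    sed -n 's/.*g_vfs_done():[^[]*\[WRITE(offset=\([0-9][0-9]*\), length=\([0-9][0-9]*\))]error = \([0-9][0-9]*\).*/\1 \2 \3/p' |
    awk '
        # A gap means this write does not start where the last one ended.
        NR > 1 && $1 != expected { printf "gap before offset %s\n", $1; gaps++ }
        { expected = $1 + $2; total += $2 }
        END { printf "%d failed writes, %d bytes, %d gaps\n", NR, total, gaps }'
}
```

Run as `check_contiguous_writes < msgbuf.txt` (the file name is an assumption). For the five errors quoted in this message the offsets advance by exactly 131072 each time, so the result is one contiguous 640 KiB run with no gaps. That pattern would be consistent with a single burst of sequential writes failing, though it does not by itself tell a bad card from a driver or filesystem problem.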
Re: FreeBSD 9.0-RC1 freeze after swapoff/swapon procedure on md/vnode-backend file
On 26 October 2011 23:31, Subbsd wrote: > Hi > > I get easy reproducible a hang-up servers that use the file-based > swap file after swapoff / swapon procedure (in this case, some of the > data must be swapped). For example: > > 1) dd if=/dev/zero of=/usr/swp1 bs=1m count=100 > 2) mdconfig -a -t vnode -f /usr/swp1 > 3) swapon /dev/md0 > 4) begin to allocated memory, for example by simple: > tail /dev/zero > > 5) after a filling of some percent, do swapoff /dev/md0, then swapon > /dev/md0. you can try this procedure again. > > The system may stop responding to commands and freezes or locks up > after some time. From the outside - the core lives (icmp response > goes) but the disk system is not available. > > PS: one of my server to my mind is frozen without swapoff/on - just > had three swapfile, a day after he crashed. Something interesting while trying to reproduce your problem: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1048970, size: 4096 swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1057174, size: 4096 panic: swapoff: failed to locate 16056 swap blocks cpuid = 1 KDB: stack backtrace: db_trace_self_wrapper() at 0x802e009a = db_trace_self_wrapper+0x2a kdb_backtrace() at 0x80486d87 = kdb_backtrace+0x37 panic() at 0x8044f6ee = panic+0x2ee swapoff_one() at 0x80687425 = swapoff_one+0x475 sys_swapoff() at 0x8068789b = sys_swapoff+0x1bb amd64_syscall() at 0x806c60c9 = amd64_syscall+0x299 Xfast_syscall() at 0x806b1467 = Xfast_syscall+0xf7 --- syscall (424, FreeBSD ELF64, sys_swapoff), rip = 0x800ab307c, rsp = 0x7fffd9c8, rbp = 0 --- KDB: enter: panic [ thread pid 63255 tid 100211 ] Stopped at 0x8048696b = kdb_enter+0x3b:movq $0,0x735f72(%rip) Below is a trace for a process on another CPU that's doing intensive malloc+bzero in userland: db> bt 63066 Tracing pid 63066 tid 100199 td 0xfe000e89f000 cpustop_handler() at 0x806bb46b = cpustop_handler+0x2b ipi_nmi_handler() at 0x806bb540 = ipi_nmi_handler+0x50 trap() at 0x806c7035 = trap+0x2a5 nmi_calltrap() 
at 0x806b15bf = nmi_calltrap+0x8 --- trap 0x13, rip = 0x8043e0d0, rsp = 0x80dc4dc0, rbp = 0xff80908de750 --- _mtx_unlock_flags() at 0x8043e0d0 = _mtx_unlock_flags+0x170 swp_pager_meta_ctl() at 0x806841aa = swp_pager_meta_ctl+0xea swap_pager_haspage() at 0x80684272 = swap_pager_haspage+0x42 vm_fault_hold() at 0x8068e379 = vm_fault_hold+0x599 trap_pfault() at 0x806c6c26 = trap_pfault+0xe6 trap() at 0x806c733f = trap+0x5af calltrap() at 0x806b1183 = calltrap+0x8 --- trap 0xc, rip = 0x4006ed, rsp = 0x7fffdad0, rbp = 0x7fffdae0 --- That corresponds to kgdb: #9 0x8044f6e4 in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:624 #10 0x80687425 in swapoff_one (sp=0xfe000dc3dc00, cred=0xff808e5bf340) at /usr/src/sys/vm/swap_pager.c:1774 #11 0x8068789b in sys_swapoff (td=0xfe000e90f000, uap=Variable "uap" is not available. ) at /usr/src/sys/vm/swap_pager.c:2236 #12 0x806c60c9 in amd64_syscall (td=0xfe000e90f000, traced=0) at subr_syscall.c:131 ---Type to continue, or q to quit--- #13 0x806b1467 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:387 and (kgdb) thr 108 [Switching to thread 108 (Thread 100199)]#0 cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1394 1394CPU_SET_ATOMIC(cpu, &stopped_cpus); (kgdb) bt #0 cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1394 #1 0x806bb540 in ipi_nmi_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1376 #2 0x806c7035 in trap (frame=0x80dc4d10) at /usr/src/sys/amd64/amd64/trap.c:200 #3 0x806b15bf in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:501 #4 0x8043e0d0 in _mtx_unlock_flags (m=0x80d8af80, opts=0, file=0x80791e48 "/usr/src/sys/vm/swap_pager.c", line=2040) at /usr/src/sys/kern/kern_mutex.c:221 [smth. wrong there -- no further stack: swap_pager_*, etc] Here both swap_pager_swapoff() and swp_pager_meta_ctl() contend on swhash_mtx. Or rather that's due to low limit set on retries counter? 
Let's see again for another crash: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1062783, size: 4096 panic: swapoff: failed to locate 22133 swap blocks cpuid = 2 KDB: stack backtrace: db_trace_self_wrapper() at 0x802e009a = db_trace_self_wrapper+0x2a kdb_backtrace() at 0x80486d87 = kdb_backtrace+0x37 panic() at 0x8044f6ee = panic+0x2ee swapoff_one() at 0x80687425 = swapoff_one+0x475 sys_swapoff() at 0x8068789b = sys_swapoff+0x1bb amd64_syscall() at 0x806c60c9 = amd64_syscall+0x299 Xfast_syscall() at 0x806b1467 = Xfast_syscall+0xf7 --- syscall (424, FreeBSD ELF64, sys_swapoff), rip = 0x800ab307c, rsp = 0x7ff
FreeBSD 9.0-RC1 freeze after swapoff/swapon procedure on md/vnode-backend file
Hi,

I get an easily reproducible hang on servers that use a file-backed swap file after a swapoff/swapon procedure (for this to happen, some data must already be swapped out). For example:

1) dd if=/dev/zero of=/usr/swp1 bs=1m count=100
2) mdconfig -a -t vnode -f /usr/swp1
3) swapon /dev/md0
4) Begin to allocate memory, for example simply with: tail /dev/zero
5) After swap has filled by some percent, do swapoff /dev/md0, then swapon /dev/md0. You can repeat this step.

The system may stop responding to commands and freeze or lock up after some time. From the outside, the kernel is still alive (ICMP replies keep coming), but the disk subsystem is not available.

PS: I believe one of my servers froze even without swapoff/swapon: it simply had three swap files, and a day later it crashed.
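The numbered steps above can be collected into a small script. This is only a sketch: the commands are the ones from the report, while the run wrapper, the DRY_RUN switch and the function name are assumptions added for illustration. By default it only prints what it would do, because actually running the sequence may hang or panic the machine.

```shell
#!/bin/sh
# Sketch of the reproduction steps. DRY_RUN defaults to 1, so commands
# are only printed; set DRY_RUN=0 on a disposable FreeBSD test box.
SWAPFILE=${SWAPFILE:-/usr/swp1}   # path used in the original report
MD_DEV=${MD_DEV:-/dev/md0}        # assumes mdconfig hands out unit 0

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce_swap_hang() {
    run dd if=/dev/zero of="$SWAPFILE" bs=1m count=100   # 100 MB backing file
    run mdconfig -a -t vnode -f "$SWAPFILE"              # attach as memory disk
    run swapon "$MD_DEV"
    # Step 4 (memory pressure, e.g. "tail /dev/zero") has to run in another
    # terminal until part of the swap is in use; it is not automated here.
    run swapoff "$MD_DEV"
    run swapon "$MD_DEV"
}

reproduce_swap_hang
```

Keeping the memory-pressure step manual is deliberate: the hang only shows up once pages have really been swapped out, and how long that takes depends on the amount of RAM in the box.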
Re: [HEADSUP]: ports feature freeze starts soon
On Fri, Oct 07, 2011 at 11:20:28AM +0200, Erwin Lansing wrote:
> In preparation for 9.0 the ports tree will be in feature freeze
> after release candidate 2 (RC2) is released, currently planned for
> October 17.
>

Depending on your timezone, October 17 has come and gone and the ports tree has not frozen yet. As always, we'll follow the actual dates during the release cycle and not the estimated dates in the tentative schedule. A rough guess would be that RC2, and thus the ports feature freeze, will happen at the end of the month, so please take this as a reminder to get anything you want included in the release into the tree as soon as possible.

Erwin

--
Erwin Lansing http://droso.org
Prediction is very difficult
especially about the future er...@freebsd.org
Fwd: Re: [HEADSUP]: ports feature freeze starts soon
Just in case anyone's not on ports@:

-- Forwarded message --
From: "Chris Rees"
Date: 8 Oct 2011 10:30
Subject: Re: [HEADSUP]: ports feature freeze starts soon
To: "Thomas Mueller"
Cc: , "Erwin Lansing"

On 8 October 2011 10:22, Thomas Mueller wrote:
> from Erwin Lansing :
>
>> In preparation for 9.0 the ports tree will be in feature freeze
>> after release candidate 1 (RC2)is released, currently planned for
>> October 17.
>
> Was there a typo here? Did you mean release candidate 1 or 2?
>
> RC1 seems more logical, since RC1 has not been released yet,
> and October 17 is only nine days away.
>

-- Forwarded message --
From: Erwin Lansing
Date: 7 October 2011 17:34
Subject: Re: [HEADSUP]: ports feature freeze starts soon
To: "develop...@freebsd.org"

On Oct 7, 2011, at 11:20, Erwin Lansing wrote:
> In preparation for 9.0 the ports tree will be in feature freeze
> after release candidate 1 (RC2)is released, currently planned for
> October 17.
>

Sorry about the typo. Just to be clear, I did mean RC2, not RC1 as usual, since an RC3 has been planned in this release cycle.

Erwin
[HEADSUP]: ports feature freeze starts soon
In preparation for 9.0 the ports tree will be in feature freeze after release candidate 1 (RC2) is released, currently planned for October 17. If you have any high-impact commits planned, get them into the tree before then, and if they require an experimental build, have a request for one in portmgr's hands within the next few days.

Note that this again will be a feature freeze and not a full freeze. Normal upgrades, new ports, and changes that only affect other branches will be allowed without prior approval, but with the extra "Feature safe: yes" tag in the commit message. Any sweeping commit, i.e. one that touches a large number of ports, infrastructural changes, commits to ports with an unusually high number of dependencies, and any other commit that requires rebuilding many packages will not be allowed after that date without prior explicit approval from portmgr.

-erwin

--
Erwin Lansing http://droso.org
Prediction is very difficult
especially about the future er...@freebsd.org
Re: System freeze: Adaptec (aac) timeouts (releng 8)
On Thu, Sep 15, 2011 at 12:13 AM, Dennis Koegel wrote:
> I'm not aware how licensing issues play in here, but apart from that, it
> should be easy to patch this into base. (I was already half-way there
> yesterday and I think I could work up a patch against HEAD and 8.x).

I won't worry about filing a PR then. Sounds like you've got it under control.

Thanks!
Elliot
Re: System freeze: Adaptec (aac) timeouts (releng 8)
On Wed, Sep 14, 2011 at 07:26:20PM -0700, Jeremy Chadwick wrote:
> I'm actually very surprised to hear there's an official FreeBSD driver
> on Adaptec's site that's actually intended for FreeBSD 8.x.

As far as I can tell from the source, it's the very same driver (same source code and copyright notices), only that Adaptec has taken over development; FreeBSD has 2.1, and Adaptec has continued development up to version 2.4.

I'm not aware how licensing issues play in here, but apart from that, it should be easy to patch this into base. (I was already half-way there yesterday and I think I could work up a patch against HEAD and 8.x).

- D.
Re: System freeze: Adaptec (aac) timeouts (releng 8)
On Thu, Sep 15, 2011 at 09:36:43AM +0800, Adrian Chadd wrote:
> On 15 September 2011 00:38, Elliot Finley wrote:
> > I was having the exact same problem using an Adaptec 52445. After
> > downloading and using the latest driver from the adaptec website, the
> > problems stopped. I haven't had a single freeze since using the new
> > code. The newest driver from the website has source code with it, so
> > it shouldn't be that big of a deal to incorporate it into the base
> > system. I emailed the authors of the aac driver (Mike Smith and
> > Scott Long), but they have both retired. So I'm not really sure how
> > to get this code into the base. If anyone knows, please take up the
> > charge.
>
> File a PR and hound people on freebsd-current until it gets done?

...which will either be ignored, given that (TMK) nobody is maintaining the Adaptec drivers, or will be addressed in HEAD, which won't help the OP who runs RELENG_8 until an MFC happens -- if it happens at all (I forget how MFC approvals work). :-)

As for the lack of an aac(4) maintainer, I'm not sure how this should be addressed in the aac(4) man page. AUTHORS tends to indicate the names of the people who created or were involved in creating/maintaining said driver, which is sometimes (but on FreeBSD hardly always) the individual(s) who currently support it. In the case that there is a different maintainer, how does this get denoted in the man page?

I'm actually very surprised to hear there's an official FreeBSD driver on Adaptec's site that's actually intended for FreeBSD 8.x. Last I knew they had basically blown off FreeBSD support. I wonder who at Adaptec is responsible for the FreeBSD driver? It would be good to know, so as to involve them in all communiqués.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
Re: System freeze: Adaptec (aac) timeouts (releng 8)
On 15 September 2011 00:38, Elliot Finley wrote:
> I was having the exact same problem using an Adaptec 52445. After
> downloading and using the latest driver from the adaptec website, the
> problems stopped. I haven't had a single freeze since using the new
> code. The newest driver from the website has source code with it, so
> it shouldn't be that big of a deal to incorporate it into the base
> system. I emailed the authors of the aac driver (Mike Smith and
> Scott Long), but they have both retired. So I'm not really sure how
> to get this code into the base. If anyone knows, please take up the
> charge.

File a PR and hound people on freebsd-current until it gets done?

Adrian
Re: System freeze: Adaptec (aac) timeouts (releng 8)
I was having the exact same problem using an Adaptec 52445. After downloading and using the latest driver from the Adaptec website, the problems stopped. I haven't had a single freeze since using the new code. The newest driver from the website has source code with it, so it shouldn't be that big of a deal to incorporate it into the base system. I emailed the authors of the aac driver (Mike Smith and Scott Long), but they have both retired. So I'm not really sure how to get this code into the base. If anyone knows, please take up the charge.

On Wed, Sep 14, 2011 at 2:08 AM, Dennis Koegel wrote:
> Cheers,
>
> we have a reproducible system freeze due to Adaptec driver (aac) timeouts:
>
> Sep 3 05:26:44 foo kernel: aac0: COMMAND 0xff80005ae4c0 (TYPE 502) TIMEOUT AFTER 129 SECONDS
> Sep 3 05:26:44 foo kernel: aac0: COMMAND 0xff80005ac0e0 (TYPE 502) TIMEOUT AFTER 129 SECONDS
> Sep 3 05:26:44 foo kernel: aac0: COMMAND 0xff80005b0fa0 (TYPE 502) TIMEOUT AFTER 129 SECONDS
>
> Once this happens, the userland seems to be alive, but the controller is
> completely dead. As soon as the disk subsystem is involved, any process
> hangs forever (e.g. SSH crypto-exchange still happens, but a shell won't
> even start anymore).
>
> We observe the same issue on two systems of (mostly) identical spec, so
> it's not a hardware issue.
>
> Apparently this only happens under heavy disk i/o and high cpu load.
> Notably high write throughput plus a 'zpool scrub' on a large
> GELI-backed zpool usually triggers the problem after a few hours.
> Without high activity, they run smooth for weeks.
>
> Both systems are amd64 with an Adaptec 5805 controller and 16 disks (of
> which two form a RAID-1 system volume (UFS), and the remaining 14 serve
> as JBOD for a large zpool -- a total of 15 "aacd" devices).
>
> Both were running 8.2R originally. I've taken them to 8-STABLE now and
> also applied svn r222951 (where the MFC was forgotten, it seems), but
> the problem remains.
>
> Any help is greatly appreciated.
>
> Thanks,
> - D.
System freeze: Adaptec (aac) timeouts (releng 8)
Cheers,

we have a reproducible system freeze due to Adaptec driver (aac) timeouts:

Sep 3 05:26:44 foo kernel: aac0: COMMAND 0xff80005ae4c0 (TYPE 502) TIMEOUT AFTER 129 SECONDS
Sep 3 05:26:44 foo kernel: aac0: COMMAND 0xff80005ac0e0 (TYPE 502) TIMEOUT AFTER 129 SECONDS
Sep 3 05:26:44 foo kernel: aac0: COMMAND 0xff80005b0fa0 (TYPE 502) TIMEOUT AFTER 129 SECONDS

Once this happens, the userland seems to be alive, but the controller is completely dead. As soon as the disk subsystem is involved, any process hangs forever (e.g. the SSH crypto exchange still happens, but a shell won't even start anymore).

We observe the same issue on two systems of (mostly) identical spec, so it's not a hardware issue.

Apparently this only happens under heavy disk I/O and high CPU load. Notably, high write throughput plus a 'zpool scrub' on a large GELI-backed zpool usually triggers the problem after a few hours. Without high activity, the systems run smoothly for weeks.

Both systems are amd64 with an Adaptec 5805 controller and 16 disks (of which two form a RAID-1 system volume (UFS), and the remaining 14 serve as JBOD for a large zpool -- a total of 15 "aacd" devices).

Both were running 8.2R originally. I've taken them to 8-STABLE now and also applied svn r222951 (where the MFC was forgotten, it seems), but the problem remains.

Any help is greatly appreciated.

Thanks,
- D.
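When comparing the two affected systems it can help to reduce the log to a per-controller summary. A minimal sketch, assuming syslog-style lines in the format quoted above; the function name summarize_aac_timeouts is hypothetical and not part of any FreeBSD tool:

```shell
#!/bin/sh
# Hypothetical helper: summarise aac(4) command timeouts per controller
# from syslog-style text on stdin.
summarize_aac_timeouts() {
    awk '
        /aac[0-9]+: COMMAND .* TIMEOUT AFTER [0-9]+ SECONDS/ {
            for (i = 1; i <= NF; i++) {
                # Field such as "aac0:" names the controller.
                if ($i ~ /^aac[0-9]+:$/) ctrl = substr($i, 1, length($i) - 1)
                # The number after "AFTER" is the timeout in seconds.
                if ($i == "AFTER") secs = $(i + 1) + 0
            }
            count[ctrl]++
            if (secs > max[ctrl]) max[ctrl] = secs
        }
        END {
            for (c in count)
                printf "%s: %d timed-out commands, longest %d seconds\n", c, count[c], max[c]
        }' | sort
}
```

A typical call would be `summarize_aac_timeouts < /var/log/messages`, assuming the kernel messages end up in that file. A count that keeps growing on a single controller matches the "controller is completely dead" symptom described here.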
zfsboot from 8.2RC1 freeze at boot time
Hello and merry Xmas to everybody,

I upgraded a remote server from 8.1-RELEASE to 8.2-RC1. This server has one disk:

[r...@tignes ~]# gpart show
=>       63  488397105  ada0  MBR  (233G)
         63   12583809     1  freebsd  (6.0G)
   12583872  475813296     2  freebsd  [active]  (227G)

=>       0  12583809  ada0s1  BSD  (6.0G)
         0   8388608       1  freebsd-ufs  (4.0G)
   8388608   4195201       2  freebsd-swap  (2.0G)

=>        0  475813296  ada0s2  BSD  (227G)
          0  475813296       1  freebsd-zfs  (227G)

It boots with zfsboot from ada0s2, which contains a ZFS pool. After I upgraded zfsboot, just to be able to upgrade the pool to v15, the server doesn't boot anymore.

It is a remote server, so I reproduced this configuration under VirtualBox. The boot freezes after zfsboot displays "-". I grabbed an older zfsboot from another server running 8.1-STABLE (r213582), which boots fine. I put the zfsboot from r213582 (zpool v15 aware) on ada0s2 and bingo, the server boots normally.

Henri
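The partition layout in the gpart output can be sanity-checked with a little arithmetic: each partition should begin where the previous one ends, and the pieces should add up to the size of the enclosing slice. A small sketch using the sector counts quoted in this message; the helper name end_of is made up for illustration, and 512-byte sectors are an assumption:

```shell
#!/bin/sh
# end_of START SIZE: print the first sector after a partition.
end_of() {
    echo $(( $1 + $2 ))
}

# ada0 (MBR): slice 1 starts at sector 63; slice 2 is listed at 12583872.
end_of 63 12583809
# Slice 2 should end where the usable area (63 + 488397105) ends.
end_of 12583872 475813296
end_of 63 488397105
# ada0s1 (BSD label): UFS plus swap should fill all 12583809 sectors.
end_of 0 8388608
end_of 8388608 4195201
# Size of the ZFS slice in whole GiB, assuming 512-byte sectors.
echo $(( 475813296 * 512 / 1024 / 1024 / 1024 ))
```

The sums line up with the listing, so the layout itself looks consistent; the last line prints 226 because shell division truncates, where gpart rounds to 227G. This supports the observation that the freeze followed the zfsboot change rather than any change to the partition table.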
Re: ichwd causes freeze instead of reset
On Sat, Aug 21, 2010 at 11:09:04PM +0200, Stefan Bethke wrote: > Am 21.08.2010 um 23:02 schrieb Andriy Gapon: > > > on 21/08/2010 23:33 Stefan Bethke said the following: > >> Hi, > >> > >> somewhat foolishly, I activated watchdogd and ichwd on a remote box, and > >> while testing it (by suspending watchdogd), apparently the watchdog > >> triggered. But instead of resetting, the machine does not react anymore on > >> the serial console. I will have to wait until Monday to get physical > >> access, > >> so it might be hanging or just switched itself off; I have no way of > >> telling > >> right now. > >> > >> ichwd probes as: ichwd0: on isa0 ichwd0: Intel > >> ICH7 watchdog timer (ICH7 or equivalent) ppc0: cannot reserve I/O port > >> range > >> > >> (not sure why ppc0 is getting involved at that point.) > >> > >> FreeBSD lokschuppen.zs64.net 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #30: Thu > >> Jul 15 12:58:20 UTC 2010 > >> r...@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT amd64 > >> > >> Once the box is up again, is it worthwile trying ichwd again, should I try > >> and use SW_WATCHDOG, or forget about it? > > > > Just test it more while having physical access before making any > > conclusions. > > There could be a number of radically different possibilities ranging from > > hardware peculiarities to configuration problems to pilot errors to etc. > > I guess what I'm looking for is some confirmation that ichwd is working > properly on this particular hardware: Asus Pundit P4 P5G41 with a G41 chipset. 
> > Below are pciconv -lvb and dmesg: > > hos...@pci0:0:0:0:class=0x06 card=0x836d1043 chip=0x2e308086 rev=0x03 > hdr=0x00 > vendor = 'Intel Corporation' > class = bridge > subclass = HOST-PCI > vgap...@pci0:0:2:0: class=0x03 card=0x836d1043 chip=0x2e328086 rev=0x03 > hdr=0x00 > vendor = 'Intel Corporation' > device = 'Intel G41 express graphics > (PCIVEN_8086&DEV_2E32&SUBSYS_31031565&REV_033&115)' > class = display > subclass = VGA > bar [10] = type Memory, range 64, base 0xfe40, size 4194304, enabled > bar [18] = type Prefetchable Memory, range 64, base 0xe000, size > 268435456, enabled > bar [20] = type I/O Port, range 32, base 0xbc00, size 8, enabled > vgap...@pci0:0:2:1: class=0x038000 card=0x836d1043 chip=0x2e338086 rev=0x03 > hdr=0x00 > vendor = 'Intel Corporation' > class = display > bar [10] = type Memory, range 64, base 0xfe80, size 1048576, enabled > no...@pci0:0:27:0:class=0x040300 card=0x82fe1043 chip=0x27d88086 rev=0x01 > hdr=0x00 > vendor = 'Intel Corporation' > device = 'IDT High Definition Audio Driver (BA101897)' > class = multimedia > subclass = HDA > bar [10] = type Memory, range 64, base 0xfe3f8000, size 16384, enabled > pc...@pci0:0:28:0:class=0x060400 card=0x81791043 chip=0x27d08086 rev=0x01 > hdr=0x01 > vendor = 'Intel Corporation' > device = '82801G (ICH7 Family) PCIe Root Port' > class = bridge > subclass = PCI-PCI > pc...@pci0:0:28:2:class=0x060400 card=0x81791043 chip=0x27d48086 rev=0x01 > hdr=0x01 > vendor = 'Intel Corporation' > device = '82801G (ICH7 Family) PCIe Root Port' > class = bridge > subclass = PCI-PCI > pc...@pci0:0:28:3:class=0x060400 card=0x81791043 chip=0x27d68086 rev=0x01 > hdr=0x01 > vendor = 'Intel Corporation' > device = '82801G (ICH7 Family) PCIe Root Port' > class = bridge > subclass = PCI-PCI > uh...@pci0:0:29:0:class=0x0c0300 card=0x81791043 chip=0x27c88086 rev=0x01 > hdr=0x00 > vendor = 'Intel Corporation' > device = '82801G (ICH7 Family) USB Universal Host Controller' > class = serial bus > subclass = USB > bar 
[20] = type I/O Port, range 32, base 0xb400, size 32, enabled > uh...@pci0:0:29:1:class=0x0c0300 card=0x81791043 chip=0x27c98086 rev=0x01 > hdr=0x00 > vendor = 'Intel Corporation' > device = '82801G (ICH7 Family) USB Universal Host Controller' > class = serial bus > subclass = USB > bar [20] = type I/O Port, range 32, base 0xb480, size 32, enabled > uh...@pci0:0:29:2:class=0x0c0300 card=0x81791043 chip=0x27ca8086 rev=0x01 > hdr=0x00 > vendor = 'Intel Corporation' > device = '82801G (ICH7 Family) USB Universal Host Controller' > class = serial bus > subclass = USB > bar [20] = type I/O Port, range 32, base 0xb800, size 32, enabled > uh...@pci0:0:29:3:class=0x0c0300 card=0x81791043 chip=0x27cb8086 rev=0x01 > hdr=0x00 > vendor = 'Intel Corporation' > device = '82801G (ICH7 Family) USB Universal Host Controller' > class = serial bus > subclass = USB > bar [20] = type I/O Port, range 32, base 0xb880, size 32, enabled > eh...@pci0:0:29:7:class=0x0c0320 card=0x81791043 chip=0x27cc8086 rev=0
Re: ichwd causes freeze instead of reset
On 21.08.2010 at 23:24, Mike Tancsa wrote:

> At 05:09 PM 8/21/2010, Stefan Bethke wrote:
>
>> I guess what I'm looking for is some confirmation that ichwd is working
>> properly on this particular hardware: Asus Pundit P4 P5G41 with a G41
>> chipset.
>>
>
> Don't know about that particular MB implementation, but I have a number of
> various ICH7-based boards where ichwd works as expected. The freeze could
> be something as simple as the box waiting for keyboard input at the BIOS
> prompt, or the BIOS option after a watchdog reset being set to power off.
> However, I have only seen that option in later boards.

Thanks, I'll check that out Monday morning.

Stefan

--
Stefan Bethke Fon +49 151 14070811
Re: ichwd causes freeze instead of reset
At 05:09 PM 8/21/2010, Stefan Bethke wrote:

> I guess what I'm looking for is some confirmation that ichwd is working
> properly on this particular hardware: Asus Pundit P4 P5G41 with a G41
> chipset.

Don't know about that particular MB implementation, but I have a number of various ICH7-based boards where ichwd works as expected. The freeze could be something as simple as the box waiting for keyboard input at the BIOS prompt, or the BIOS option after a watchdog reset being set to power off. However, I have only seen that option in later boards.

---Mike

Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike
Re: ichwd causes freeze instead of reset
On 21.08.2010 at 23:02, Andriy Gapon wrote:

> on 21/08/2010 23:33 Stefan Bethke said the following:
>> Hi,
>>
>> somewhat foolishly, I activated watchdogd and ichwd on a remote box, and
>> while testing it (by suspending watchdogd), apparently the watchdog
>> triggered. But instead of resetting, the machine does not react anymore on
>> the serial console. I will have to wait until Monday to get physical access,
>> so it might be hanging or just switched itself off; I have no way of telling
>> right now.
>>
>> ichwd probes as:
>> ichwd0: on isa0
>> ichwd0: Intel ICH7 watchdog timer (ICH7 or equivalent)
>> ppc0: cannot reserve I/O port range
>>
>> (not sure why ppc0 is getting involved at that point.)
>>
>> FreeBSD lokschuppen.zs64.net 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #30: Thu Jul 15 12:58:20 UTC 2010 r...@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT amd64
>>
>> Once the box is up again, is it worthwhile trying ichwd again, should I try
>> and use SW_WATCHDOG, or forget about it?
>
> Just test it more while having physical access before making any conclusions.
> There could be a number of radically different possibilities ranging from
> hardware peculiarities to configuration problems to pilot errors to etc.

I guess what I'm looking for is some confirmation that ichwd is working properly on this particular hardware: Asus Pundit P4 P5G41 with a G41 chipset.
Below are pciconf -lvb and dmesg:

hos...@pci0:0:0:0: class=0x06 card=0x836d1043 chip=0x2e308086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = HOST-PCI
vgap...@pci0:0:2:0: class=0x03 card=0x836d1043 chip=0x2e328086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel G41 express graphics (PCIVEN_8086&DEV_2E32&SUBSYS_31031565&REV_033&115)'
    class      = display
    subclass   = VGA
    bar   [10] = type Memory, range 64, base 0xfe40, size 4194304, enabled
    bar   [18] = type Prefetchable Memory, range 64, base 0xe000, size 268435456, enabled
    bar   [20] = type I/O Port, range 32, base 0xbc00, size 8, enabled
vgap...@pci0:0:2:1: class=0x038000 card=0x836d1043 chip=0x2e338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = display
    bar   [10] = type Memory, range 64, base 0xfe80, size 1048576, enabled
no...@pci0:0:27:0: class=0x040300 card=0x82fe1043 chip=0x27d88086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'IDT High Definition Audio Driver (BA101897)'
    class      = multimedia
    subclass   = HDA
    bar   [10] = type Memory, range 64, base 0xfe3f8000, size 16384, enabled
pc...@pci0:0:28:0: class=0x060400 card=0x81791043 chip=0x27d08086 rev=0x01 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) PCIe Root Port'
    class      = bridge
    subclass   = PCI-PCI
pc...@pci0:0:28:2: class=0x060400 card=0x81791043 chip=0x27d48086 rev=0x01 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) PCIe Root Port'
    class      = bridge
    subclass   = PCI-PCI
pc...@pci0:0:28:3: class=0x060400 card=0x81791043 chip=0x27d68086 rev=0x01 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) PCIe Root Port'
    class      = bridge
    subclass   = PCI-PCI
uh...@pci0:0:29:0: class=0x0c0300 card=0x81791043 chip=0x27c88086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) USB Universal Host Controller'
    class      = serial bus
    subclass   = USB
    bar   [20] = type I/O Port, range 32, base 0xb400, size 32, enabled
uh...@pci0:0:29:1: class=0x0c0300 card=0x81791043 chip=0x27c98086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) USB Universal Host Controller'
    class      = serial bus
    subclass   = USB
    bar   [20] = type I/O Port, range 32, base 0xb480, size 32, enabled
uh...@pci0:0:29:2: class=0x0c0300 card=0x81791043 chip=0x27ca8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) USB Universal Host Controller'
    class      = serial bus
    subclass   = USB
    bar   [20] = type I/O Port, range 32, base 0xb800, size 32, enabled
uh...@pci0:0:29:3: class=0x0c0300 card=0x81791043 chip=0x27cb8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) USB Universal Host Controller'
    class      = serial bus
    subclass   = USB
    bar   [20] = type I/O Port, range 32, base 0xb880, size 32, enabled
eh...@pci0:0:29:7: class=0x0c0320 card=0x81791043 chip=0x27cc8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) USB 2.0 Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
    bar   [10] = type Memory, range 32, base 0xfe3f7c00, size 1024, enabled
pc...@pci0:0:30:0: class
Re: ichwd causes freeze instead of reset
on 21/08/2010 23:33 Stefan Bethke said the following:
> Hi,
>
> somewhat foolishly, I activated watchdogd and ichwd on a remote box, and
> while testing it (by suspending watchdogd), apparently the watchdog
> triggered. But instead of resetting, the machine does not react anymore on
> the serial console. I will have to wait until Monday to get physical access,
> so it might be hanging or just switched itself off; I have no way of telling
> right now.
>
> ichwd probes as: ichwd0: on isa0 ichwd0: Intel
> ICH7 watchdog timer (ICH7 or equivalent) ppc0: cannot reserve I/O port range
>
> (not sure why ppc0 is getting involved at that point.)
>
> FreeBSD lokschuppen.zs64.net 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #30: Thu
> Jul 15 12:58:20 UTC 2010
> r...@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT amd64
>
> Once the box is up again, is it worthwhile trying ichwd again, should I try
> and use SW_WATCHDOG, or forget about it?

Just test it more while having physical access before making any conclusions. There could be a number of radically different possibilities ranging from hardware peculiarities to configuration problems to pilot errors to etc.

-- 
Andriy Gapon
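For a supervised retest of the setup described in this thread, a minimal configuration sketch might look like the following. It assumes the stock ichwd kernel module and the base-system watchdogd; the 30-second timeout is an arbitrary illustrative value and does not come from the thread.

```shell
# /boot/loader.conf: load the Intel ICH watchdog driver at boot
ichwd_load="YES"

# /etc/rc.conf: start watchdogd so that it keeps patting the hardware watchdog
watchdogd_enable="YES"
# -t sets the watchdog timeout in seconds; 30 is an assumed example value
watchdogd_flags="-t 30"
```

With someone at the console, suspending watchdogd as in the original test should then show whether the board actually resets or hangs once the timer expires.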
ichwd causes freeze instead of reset
Hi,

somewhat foolishly, I activated watchdogd and ichwd on a remote box, and while testing it (by suspending watchdogd), apparently the watchdog triggered. But instead of resetting, the machine does not react anymore on the serial console. I will have to wait until Monday to get physical access, so it might be hanging or just switched itself off; I have no way of telling right now.

ichwd probes as:
ichwd0: on isa0
ichwd0: Intel ICH7 watchdog timer (ICH7 or equivalent)
ppc0: cannot reserve I/O port range

(not sure why ppc0 is getting involved at that point.)

FreeBSD lokschuppen.zs64.net 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #30: Thu Jul 15 12:58:20 UTC 2010 r...@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT amd64

Once the box is up again, is it worthwhile trying ichwd again, should I try and use SW_WATCHDOG, or forget about it?

Thanks,
Stefan

-- 
Stefan Bethke Fon +49 151 14070811
Re: [HEADSUP]: Ports feature freeze for 8.1 now in effect
On Fri, 18 Jun 2010 14:10:28 +0200 Erwin Lansing wrote:
> In preparation for 8.1-RELEASE, the ports tree is now in feature
> freeze.
>
> Normal upgrade, new ports, and changes that only affect other branches
> are allowed without prior approval but with the extra Feature safe:
> yes tag in the commit message. Any commit that is sweeping, i.e.
> touches a large number of ports, infrastructural changes, commits to
> ports with unusually high number of dependent ports, and any other
> commit that requires the rebuilding of many packages is not allowed
> without prior explicit approval from portmgr after that date.
>
> When in doubt, please do not hesitate to contact portmgr.

>>>> "any commit that requires the rebuilding of many packages"

And this time we will ask for instant back-out of everything that should not have been committed in the first place.

If you have time, you can always help with unmaintained ports:
http://qat.tecnik93.com/index.php?action=failed_buildports&maintainer=ports%40freebsd.org&;
or even maintained ones:
http://qat.tecnik93.com/index.php?action=failed_buildports

Help us get a good, stable package set for the release, please.

-- 
IOnut - Un^d^dregistered ;) FreeBSD "user"
"Intellectual Property" is nowhere near as valuable as "Intellect"
FreeBSD committer -> ite...@freebsd.org, PGP Key ID 057E9F8B493A297B
[HEADSUP]: Ports feature freeze for 8.1 now in effect
In preparation for 8.1-RELEASE, the ports tree is now in feature freeze.

Normal upgrade, new ports, and changes that only affect other branches are allowed without prior approval but with the extra Feature safe: yes tag in the commit message. Any commit that is sweeping, i.e. touches a large number of ports, infrastructural changes, commits to ports with unusually high number of dependent ports, and any other commit that requires the rebuilding of many packages is not allowed without prior explicit approval from portmgr after that date.

When in doubt, please do not hesitate to contact portmgr.

-- 
Erwin Lansing http://droso.org
Prediction is very difficult especially about the future
er...@freebsd.org
Re: [HEADS UP] Ports feature freeze coming soon
On Tue, Jun 08, 2010 at 02:20:53PM -0400, FreeBSD portmgr secretary wrote:
> In preparation for 8.1-RELEASE, the ports tree will be in feature freeze
> after release candidate 1 (RC1) is released, currently planned for June 11.

As you may have noticed, RC1 has not been released yet, but the delay is not expected to be more than a few days. The ports feature freeze will therefore be postponed until this Friday, June 18, 12pm UTC. We do still ask you to be conservative with your changes until then.

-erwin

>
> If you have any commits with high impact planned, get them in the tree
> before then and if they require an experimental build, have a request for
> one in portmgr@ hands within the next few days.
>
> Note that this again will be a feature freeze and not a full freeze.
> Normal upgrade, new ports, and changes that only affect other branches will
> be allowed without prior approval but with the extra Feature safe: yes tag
> in the commit message. Any commit that is sweeping, i.e. touches a large
> number of ports, infrastructural changes, commits to ports with unusually
> high number of dependencies, and any other commit that requires the
> rebuilding of many packages will not be allowed without prior explicit
> approval from portmgr@ after that date.
>
> Thomas
> with portmgr-secretary@ hat on
>
> --
> Thomas Abthorpe | FreeBSD Ports Management Team Secretary
> tabtho...@freebsd.org | portmgr-secret...@freebsd.org

-- 
Erwin Lansing http://droso.org
Prediction is very difficult especially about the future
er...@freebsd.org
[HEADSUP] ports feature freeze starts soon
In preparation for 8.1-RELEASE, the ports tree will be in feature freeze after release candidate 1 (RC1) is released, currently planned for June 11.

If you have any commits with high impact planned, get them in the tree before then and if they require an experimental build, have a request for one in portmgr@ hands within the next few days.

Note that this again will be a feature freeze and not a full freeze. Normal upgrade, new ports, and changes that only affect other branches will be allowed without prior approval but with the extra Feature safe: yes tag in the commit message. Any commit that is sweeping, i.e. touches a large number of ports, infrastructural changes, commits to ports with unusually high number of dependencies, and any other commit that requires the rebuilding of many packages will not be allowed without prior explicit approval from portmgr@ after that date.

Thomas
with portmgr-secretary@ hat on

-- 
Thomas Abthorpe | FreeBSD Ports Management Team Secretary
tabtho...@freebsd.org | portmgr-secret...@freebsd.org
Re: Give freeze a chance
On Mon, May 17, 2010 at 5:29 PM, Thomas Abthorpe wrote:
> The next wave of the challenge, fear, there is one more already
> composed to be released with 8.1!
>
> --
>
> Give Freeze a chance
> with apologies to John Lennon et al
>
> Ev'rybody's talkin' 'bout
> portism, srcism, docism, cvsism, svnism, tagism
> This-ism, that-ism, ism ism ism
> All we are saying is give freeze a chance
> All we are saying is give freeze a chance
>
> C'mon
> Ev'rybody's talkin' 'bout
> re@, core@, doceng@, donations@, secteam@,
> marketing@, portmgr@, vendor-relations@
> All we are saying is give freeze a chance
> All we are saying is give freeze a chance
>
> Let me tell you now
> Ev'rybody's talkin' 'bout
> Revolution, evolution, i18n, l10n, documentation,
> Integration, administration, applications, congratulations
> All we are saying is give freeze a chance
> All we are saying is give freeze a chance
>
> Ev'rybody's talkin' 'bout
> Erwin Lansing, Mark Linimon, Martin Wilke,
> Pav Lucistnik, Florent Thoumie, Ion-Mihai Tetcu,
> Kris Kennaway, Joe Marcus Clarke, Thomas Abthorpe too
> All we are saying is give freeze a chance
> All we are saying is give freeze a chance

Nice, it reminds me of the old "Breaking the Ports" song...

http://www.mail-archive.com/freebsd-po...@freebsd.org/msg02907.html

-- 
Renato Botelho
Give freeze a chance
The next wave of the challenge, fear, there is one more already composed to be released with 8.1!

--

Give Freeze a chance
with apologies to John Lennon et al

Ev'rybody's talkin' 'bout
portism, srcism, docism, cvsism, svnism, tagism
This-ism, that-ism, ism ism ism
All we are saying is give freeze a chance
All we are saying is give freeze a chance

C'mon
Ev'rybody's talkin' 'bout
re@, core@, doceng@, donations@, secteam@,
marketing@, portmgr@, vendor-relations@
All we are saying is give freeze a chance
All we are saying is give freeze a chance

Let me tell you now
Ev'rybody's talkin' 'bout
Revolution, evolution, i18n, l10n, documentation,
Integration, administration, applications, congratulations
All we are saying is give freeze a chance
All we are saying is give freeze a chance

Ev'rybody's talkin' 'bout
Erwin Lansing, Mark Linimon, Martin Wilke,
Pav Lucistnik, Florent Thoumie, Ion-Mihai Tetcu,
Kris Kennaway, Joe Marcus Clarke, Thomas Abthorpe too
All we are saying is give freeze a chance
All we are saying is give freeze a chance

Thomas

-- 
Thomas Abthorpe | FreeBSD Committer
tabtho...@freebsd.org | http://people.freebsd.org/~tabthorpe
Re: Freeze on my laptop.
On Wed, Apr 14, 2010 at 3:31 AM, Demelier David wrote:
> Hi,
>
> I'm so sad because FreeBSD is the one which can run almost perfectly
> on my laptop. But it freezes. Sometimes I'm just doing something
> ordinary, like clicking on a link in Firefox or opening a terminal,
> and then it freezes.
>
> There are no messages, no reboot, nothing. I can't tell where that
> comes from.
>
> I'm running 8.0-STABLE on an HP ProBook 4510s.
>
> Kind regards,
> --
> Demelier David

Do you have dumpdev set in your rc.conf? Try setting it to AUTO; you might get a core dump on the next reboot:

dumpdev="AUTO"

There is very little in your mail to infer anything from. Do you use wireless network access? What graphics card do you have?

These threads might be of help to you:
http://lists.freebsd.org/pipermail/freebsd-questions/2010-March/214339.html
http://lists.freebsd.org/pipermail/freebsd-stable/2010-April/056096.html
http://lists.freebsd.org/pipermail/freebsd-hackers/2006-April/016107.html
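A minimal sketch of the rc.conf change suggested in the reply above. The dumpdir line is an assumption added for illustration (the reply only mentions dumpdev), and a hard freeze that never reaches a kernel panic may still leave no dump behind.

```shell
# /etc/rc.conf
# Let the system pick a suitable swap device for kernel crash dumps
dumpdev="AUTO"
# Assumed here for illustration: directory where savecore(8) stores the
# dump (vmcore.N) on the next boot after a panic
dumpdir="/var/crash"
```

If a dump does show up after the next crash, it can be inspected with kgdb(1) against the running kernel image to obtain a backtrace for the list.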
Re: Freeze on my laptop.
On Wed, Apr 14, 2010 at 09:43:22AM +1000, Andrew Snow wrote:
> Demelier David wrote:
> > I'm so sad because FreeBSD is the one which can run almost perfectly on
> > my laptop. But it freezes. Sometimes I'm just doing something ordinary,
> > like clicking on a link in Firefox or opening a terminal, and then it
> > freezes.
>
> Sounds like a problem with the X graphics driver. When it next
> happens, can you press Alt+F1 or Ctrl+Alt+F1 to get back to a text console?
>
> You might like to try upgrading your version of X to a newer version.
>
> - Andrew

I'll try with vesa; maybe you're right, but the last(1) command shows many `crash' entries. And I can't get back to the console.

The odd thing is that it often happens when I use GTK-based applications (pidgin, firefox).

-- 
Demelier David
Re: Freeze on my laptop.
Demelier David wrote:
> I'm so sad because FreeBSD is the one which can run almost perfectly on
> my laptop. But it freezes. Sometimes I'm just doing something ordinary,
> like clicking on a link in Firefox or opening a terminal, and then it
> freezes.

Sounds like a problem with the X graphics driver. When it next happens, can you press Alt+F1 or Ctrl+Alt+F1 to get back to a text console?

You might like to try upgrading your version of X to a newer version.

- Andrew
Freeze on my laptop.
Hi,

I'm so sad because FreeBSD is the one which can run almost perfectly on my laptop. But it freezes. Sometimes I'm just doing something ordinary, like clicking on a link in Firefox or opening a terminal, and then it freezes.

There are no messages, no reboot, nothing. I can't tell where that comes from.

I'm running 8.0-STABLE on an HP ProBook 4510s.

Kind regards,
-- 
Demelier David
Re: Freeze on closing terminal that runs wpa_supplicant
On 3/17/10, Mathias Sogorski wrote:
> Hello!
> I am running 8.0-RELEASE on a notebook with the Intel 3945 WiFi. I usually
> start wpa_supplicant [...]& on a terminal when entering gnome followed by
> the dhcpcd call to use the WiFi connection. After having finished work and
> closing the terminal that runs wpa_supplicant, everything freezes and I have
> to turn the power off. Any suggestions?

That should not happen, so please report the bug. Did you manage to get a backtrace? Did the kernel actually crash?