Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-06-16 Thread Andre Albsmeier
On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote:
 On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote:
  Each day at 5:15 we are generating snapshots on various machines.
  This used to work perfectly under 7-STABLE for years but since
  we started to use 9.1-STABLE the machine reboots in about 10%
  of all cases.
  
  After rebooting we find a new snapshot file which is a bit
  smaller than the good ones and has different permissions.
  It does not pass fsck. In this example it is the one
  whose name begins with s3:
  
  -r--r-   1 root  operator  snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04
  -r   1 root  operator  snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03
  -r--r-   1 root  operator  snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44
  -r--r-   1 root  operator  snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03
  -r--r-   1 root  operator  snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03
  
  After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel
  I see the following LORs (mksnap_ffs starts exactly at 5:15):
  
  May 29 05:15:00 kern.crit palveli kernel: lock order reversal:
  May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
  May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
  May 29 05:15:04 kern.crit palveli kernel: lock order reversal:
  May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
  May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626
  
  Unfortunately no corefiles are being generated ;-(.
  
  I have checked and even rebuilt the (UFS1) fs in question
  from scratch. I have also seen this happen on a UFS2 fs on
  another machine and on a third one when running dump -L
  on a root fs.
  
  Any hints on how to proceed?
 
 Would it be possible to set up a serial console that is logged on this machine
 to see if it is panic'ing but failing to write out a crashdump?

Couldn't attach the serial console yet ;-(. But I had people
attach a KVMoverIP switch and enabled the various KDB options
in the kernel. Now we can see a bit more (see below) -- no
crashdump is being generated though.

Some comments on what the crontab script does at 5:01 (I switched
it from 5:15 to 5:01 for some reason):

1. Unmount all snapshots
2. Remove all /dev/md devices
3. Delete the oldest snapshot
4. Generate a new snapshot
5. mdconfig and mount all snapshots

I assume the first LOR (sys_unmount) is related to the unmount
and the second one (sys_unlink) to the rm.

I have added some sleep(1) and sync(1) commands between the
different steps but this didn't help.
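
For readers who want to reproduce the setup, the five steps above (with
sync and sleep between them) might look roughly like the sketch below.
It is not the script used on these machines: the function names, the
DRYRUN switch, the directory arguments, the starting md unit and the
"snap-DATE" file name are all assumptions. The job in this thread calls
mksnap_ffs; the sketch swaps in the equivalent "mount -u -o snapshot"
form documented in mount(8).

```sh
#!/bin/sh
# Hypothetical sketch of the nightly job; NOT the script used in this thread.
# Paths, md unit numbers, the DRYRUN switch and file naming are assumptions.

# Echo the command when DRYRUN is set, otherwise execute it.
run() {
    if [ -n "${DRYRUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Usage: rotate_snapshots <filesystem> <snapshot-dir> <mount-base>
rotate_snapshots() {
    fs=$1; snapdir=$2; mntbase=$3

    # 1. Unmount all snapshots.
    for mp in "$mntbase"/*; do
        if [ -d "$mp" ]; then run umount "$mp"; fi
    done
    run sync; run sleep 1

    # 2. Remove all md devices ("mdconfig -l" lists the configured units;
    #    a unit that is still in use will refuse to detach).
    for md in $(mdconfig -l 2>/dev/null); do
        run mdconfig -d -u "${md#md}"
    done
    run sync; run sleep 1

    # 3. Delete the oldest snapshot, judged by the date in its file name.
    oldest=$(ls "$snapdir" | sort -t- -k2 | head -n 1)
    if [ -n "$oldest" ]; then run rm -f "$snapdir/$oldest"; fi
    run sync; run sleep 1

    # 4. Take a new snapshot (mount(8) form, swapped in for mksnap_ffs).
    new="snap-$(date -u +%Y.%m.%d-%H.%M.%S)"
    run mount -u -o snapshot "$snapdir/$new" "$fs"
    run sync; run sleep 1

    # 5. Attach every snapshot file to an md unit and mount it read-only.
    unit=10
    for snap in "$snapdir"/*; do
        if [ -f "$snap" ]; then
            mp="$mntbase/$(basename "$snap")"
            run mkdir -p "$mp"
            run mdconfig -a -t vnode -f "$snap" -u "$unit"
            run mount -r "/dev/md$unit" "$mp"
            unit=$((unit + 1))
        fi
    done
}
```

Calling it as "DRYRUN=1 rotate_snapshots /share2 /share2/.snap /snapshots"
(both directories are made up for the example) only prints the commands,
which makes it easy to check the order of operations before running it
for real.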

Here is the log of three days; we can see another LOR after booting:

--- cronjob start, day 1 ---

Jun 11 05:01:00 kern.crit typhon kernel: lock order reversal:
Jun 11 05:01:00 kern.crit typhon kernel: 1st 0xc53644c8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
Jun 11 05:01:00 kern.crit typhon kernel: 2nd 0xc5361290 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
Jun 11 05:01:00 kern.crit typhon kernel: KDB: stack backtrace:
Jun 11 05:01:00 kern.crit typhon kernel: db_trace_self_wrapper(c0815bdd,662f7366,765f7366,706f7366,3a632e73,...) at db_trace_self_wrapper+0x26/frame 0xec3c98d4
Jun 11 05:01:00 kern.crit typhon kernel: kdb_backtrace(c063659b,c08198b4,c09e8bb0,586,ec3c99a0,...) at kdb_backtrace+0x2a/frame 0xec3c9930
Jun 11 05:01:00 kern.crit typhon kernel: _witness_debugger(c08198b4,c5361290,c080d6e7,c4c29338,c0835390,...) at _witness_debugger+0x25/frame 0xec3c9948
Jun 11 05:01:00 kern.crit typhon kernel: witness_checkorder(c5361290,9,c0835390,586,c53612b0,...) at witness_checkorder+0x86f/frame 0xec3c99a0
Jun 11 05:01:00 kern.crit typhon kernel: __lockmgr_args(c5361290,80400,c53612b0,0,0,...) at __lockmgr_args+0x829/frame 0xec3c9a58
Jun 11 05:01:00 kern.crit typhon kernel: vop_stdlock(ec3c9abc,246,c08bcd9c,80400,c5361238,...) at vop_stdlock+0x62/frame 0xec3c9a8c
Jun 11 05:01:00 kern.crit typhon kernel: VOP_LOCK1_APV(c08611e0,ec3c9abc,c09e8bb4,c0890120,c5361238,...) at VOP_LOCK1_APV+0xb5/frame 0xec3c9aa8
Jun 11 05:01:00 kern.crit typhon kernel: _vn_lock(c5361238,80400,c0835390,586,ec3c9b14,...) at _vn_lock+0x5e/frame 0xec3c9adc
Jun 11 05:01:00 kern.crit typhon kernel: ffs_flushfiles(c5365d34,0,c67ec600,0,c5365d34,...) at ffs_flushfiles+0x133/frame 0xec3c9b1c
Jun 11 05:01:00 kern.crit typhon kernel: ffs_unmount(c5365d34,800,c0821043,513,c4c00c08,...) at ffs_unmount+0x180/frame 0xec3c9b5c
Jun 11 05:01:00 kern.crit typhon kernel: dounmount(c5365d34,800,c67ec600,494,c67e8378,...) at dounmount+0x423/frame 0xec3c9bac
Jun 11 05:01:00 kern.crit typhon kernel: sys_unmount(c67ec600,ec3c9ccc,c0846650,c081a478,206,...) at sys_unmount+0x3d1/frame 

Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-06-16 Thread Jeremy Chadwick
On Fri, May 31, 2013 at 07:25:23PM +0200, Andre Albsmeier wrote:
 On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote:
  Would it be possible to set up a serial console that is logged on this
  machine to see if it is panic'ing but failing to write out a crashdump?
 
 I'll try to arrange that. It'll take a bit since this
 box is 200 km away... 
 
 Maybe I'll find another one nearby to reproduce it...

SPECIFICALLY regarding lack of crash dumps: I need to see the
following:

* cat /etc/rc.conf
* cat /etc/fstab

I may need output from other commands, but shall deal with that when I
see output from the above.  Thanks.

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-06-16 Thread Andre Albsmeier
On Sun, 16-Jun-2013 at 08:54:41 +0200, Jeremy Chadwick wrote:
 On Fri, May 31, 2013 at 07:25:23PM +0200, Andre Albsmeier wrote:
  On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote:
   Would it be possible to set up a serial console that is logged on this
   machine to see if it is panic'ing but failing to write out a crashdump?
  
  I'll try to arrange that. It'll take a bit since this
  box is 200 km away... 
  
  Maybe I'll find another one nearby to reproduce it...
 
 SPECIFICALLY regarding lack of crash dumps: I need to see the
 following:
 
 * cat /etc/rc.conf
 * cat /etc/fstab
 
 I may need output from other commands, but shall deal with that when I
 see output from the above.  Thanks.

No problem, see below...

To make a long story short, the machine dumps core perfectly
(tested that a while ago), but not when dealing with _this_
issue...

I dump on da1s1b and savecore fetches it from there and puts
it on /var (sitting on da0), that's faster.

rc.conf (beware, rc.conf.local exists):
---
rcshutdown_timeout="180"
tmpmfs="YES"
tmpsize="$(( `/sbin/sysctl -n hw.usermem` / 300 ))m"
tmpmfs_flags="$tmpmfs_flags -v 1 -n"

background_fsck="NO"

nisdomainname="ofw.tld"
pflog_flags="-S"

syslogd_flags="-svv"
inetd_enable="YES"
inetd_flags="-l"
named_flags="-S 1000"
named_chrootdir=""
rwhod_enable="YES"
sshd_enable="YES"
amd_enable="YES"
amd_flags="-F /etc/amd.conf"
nfs_client_enable="YES"
nfs_access_cache="2"
mountd_flags="-n"
rpcbind_enable="YES"

ntpdate_enable="YES"
ntpdate_hosts="ntp"
ntpd_enable="YES"
ntpd_flags="-p /var/run/ntpd.pid"

nis_client_enable="YES"
nis_client_flags="-s -S ofw.tld,nis-16-1,nis-16-2"
nis_server_flags="-n"
nis_yppasswdd_flags="-t /var/yp/src/master.passwd -f -v"

defaultrouter="192.168.16.2"

keyrate="fast"

sendmail_flags="-bd -q5m"
sendmail_submit_flags="$sendmail_flags -ODaemonPortOptions=Addr=localhost"
sendmail_msp_queue_flags="-Ac -q30m"
sendmail_rebuild_aliases="NO"

lpd_enable="YES"
lpd_flags="-s"
chkprintcap_enable="YES"
dumpdev="AUTO"
clear_tmp_X="NO"
ldconfig_paths="/usr/local/lib"
ldconfig_paths_aout=""
entropy_file="/boot/entropy-file"


rc.conf.local:
--
hostname="typhon.ofw.tld"
ifconfig_msk0="inet 192.168.24.1/21"
ifconfig_msk0_alias0="inet 192.168.24.10/32"

named_enable="YES"
nfs_server_enable="YES"

nis_client_flags="-s -S ofw.tld,nis-24-1,nis-24-2"
nis_server_enable="YES"

defaultrouter="192.168.24.2"

lpd_flags="-l"
dumpdev="/dev/da1s1b"
quota_enable="YES"


fstab:
--
/dev/da0s1a   /        ufs     noatime,rw                             0 1
/dev/da0s1b   none     swap    sw                                     0 0
proc          /proc    procfs  rw                                     0 0
/dev/da0s1d   /usr     ufs     noatime,rw                             0 2
/dev/da0s1e   /var     ufs     noatime,nosuid,rw                      0 2

/dev/da10p1   /share2  ufs     suiddir,groupquota,noatime,nosuid,rw   0 2
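
Because dumpdev is set in both files above, the value that counts is the
one read last. rc(8) reads /etc/rc.conf first and /etc/rc.conf.local
after it, so the later assignment wins. A small sketch to check which
value is in effect; effective_value is a hypothetical helper written for
this example and is not part of FreeBSD:

```sh
#!/bin/sh
# Illustration only: mimics the order in which rc(8) reads its config files.
# effective_value is a hypothetical helper, not a FreeBSD tool.

# Usage: effective_value <variable> <file>...
# Sources the files in the given order inside a subshell (so the caller's
# environment is left untouched) and prints the variable's final value.
effective_value() {
    (
        _ev_var=$1; shift
        for _ev_file in "$@"; do
            # Give paths containing a slash, or "." would search PATH.
            if [ -r "$_ev_file" ]; then . "$_ev_file"; fi
        done
        eval "printf '%s\n' \"\${$_ev_var-}\""
    )
}
```

With the two files shown above, "effective_value dumpdev /etc/rc.conf
/etc/rc.conf.local" should print /dev/da1s1b, i.e. the dump goes to the
partition on da1 and not to whatever AUTO would pick.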

Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-06-16 Thread Jeremy Chadwick
On Sun, Jun 16, 2013 at 10:02:39AM +0200, Andre Albsmeier wrote:

Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-06-16 Thread Andre Albsmeier
On Sun, 16-Jun-2013 at 10:49:37 +0200, Jeremy Chadwick wrote:

Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-06-16 Thread Jeremy Chadwick
On Sun, Jun 16, 2013 at 11:55:38AM +0200, Andre Albsmeier wrote:

system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Michiel Boland
Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X server
with the Intel driver has some issues that make it unusable for me.

The new X server and Intel driver work extremely well, so kudos to whoever made
this possible.

Unfortunately, I am now experiencing random hangs on shutdown. On shutdown the
system randomly freezes after

[...] syslogd: exiting on signal 15

I would then expect to see "Waiting (max 60 seconds) for system process 'XXX' to
stop" messages, but these never arrive.

I panicked the machine in ddb, so I have a crash dump if someone wants to look at
it. The crashinfo is at http://barrytown.boland.org/core.txt (I would have
pasted it here but it is a bit verbose.)

The machine has an Intel G41 chipset, with a SAMSUNG SSD 830 Series HD, running
9.1-STABLE r251803. Serial console. GENERIC kernel, except for options DDB and
ALT_BREAK_TO_DEBUGGER.

Who knows what's going on here?

Cheers
Michiel


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Konstantin Belousov
On Sun, Jun 16, 2013 at 05:11:15PM +0200, Michiel Boland wrote:
I do not see anything related to i915 in the core.txt you provided.

Next time the machine hangs, start with the output of ps command from
ddb and 'show allpcpu', together with 'alltrace'.




Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Michiel Boland

On 06/16/2013 17:37, Konstantin Belousov wrote:

I do not see anything related to i915 in the core.txt you provided.

Next time the machine hangs, start with the output of ps command from
ddb and 'show allpcpu', together with 'alltrace'.



Ok.

I appended 'thread apply all bt' from kgdb to the core.txt, maybe there is
something interesting in there.

I did notice the following

Thread 17 (Thread 17):
#0  cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1392
#1  0x80cbebbd in ipi_nmi_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1374
#2  0x80ccc159 in trap (frame=0x81424890) at /usr/src/sys/amd64/amd64/trap.c:211
#3  0x80cb55af in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:501
#4  0x80d0c029 in vga_txtmouse (scp=0xfe0005586600, x=320, y=200, on=<value optimized out>) at cpufunc.h:186

Previous frame inner to this frame (corrupt stack?)

Maybe the hang is caused by the removal of the text mouse cursor? (Just guessing
here.)


Cheers
Michiel



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Jeremy Chadwick
On Sun, Jun 16, 2013 at 05:48:52PM +0200, Michiel Boland wrote:
 On 06/16/2013 17:37, Konstantin Belousov wrote:
 On Sun, Jun 16, 2013 at 05:11:15PM +0200, Michiel Boland wrote:
 Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X
 server with Intel driver has some issues that make it unusable for me.
 
 The new X server and Intel driver work extremely well, so kudos to
 whoever made this possible.
 
 Unfortunately, I am now experiencing random hangs on shutdown. On
 shutdown the system randomly freezes after
 
 [...] syslogd: exiting on signal 15
 
 I would then expect to see 'Waiting (max 60 seconds) for system process
 'XXX' to stop' messages, but these never arrive.
 
 I panicked the machine in ddb, so I have a crash dump if someone wants
 to look at it. The crashinfo is at http://barrytown.boland.org/core.txt
 (I would have pasted it here but it is a bit verbose.)
 
 Machine has an Intel G41 chipset, with a SAMSUNG SSD 830 Series HD,
 running 9.1-STABLE r251803. Serial console. GENERIC kernel, except for
 options DDB and ALT_BREAK_TO_DEBUGGER.
 
 Who knows what's going on here?
 
 I do not see anything related to i915 in the core.txt you provided.
 
 Next time the machine hangs, start with the output of ps command from
 ddb and 'show allpcpu', together with 'alltrace'.
 
 
 Ok.
 
 I appended 'thread apply all bt' from kgdb to the core.txt, maybe
 there is something interesting in there.
 
 I did notice the following
 
 Thread 17 (Thread 17):
 #0  cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1392
 #1  0x80cbebbd in ipi_nmi_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1374
 #2  0x80ccc159 in trap (frame=0x81424890) at /usr/src/sys/amd64/amd64/trap.c:211
 #3  0x80cb55af in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:501
 #4  0x80d0c029 in vga_txtmouse (scp=0xfe0005586600, x=320, y=200, on=<value optimized out>) at cpufunc.h:186
 Previous frame inner to this frame (corrupt stack?)
 
 Maybe the hang is caused by the removal of the text mouse cursor?
 (Just guessing here.)

vga_txtmouse comes from syscons(4).

Are you making use of vidcontrol(1) in any way to set the system console
(outside of X) to something that uses the VGA framebuffer?  There are
probably some loader.conf or rc.conf variables that control this (I do
not know).

Are you running moused(8)?  Actually, I can see quite clearly that you
are in your core.txt:

Starting ums0 moused.

Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
might be involved.

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Michiel Boland

On 06/16/2013 17:55, Jeremy Chadwick wrote:
[...]


Are you running moused(8)?  Actually, I can see quite clearly that you
are in your core.txt:

Starting ums0 moused.

Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
might be involved.



The moused is started by devd - I don't see a quick way of turning that off.

As a workaround I'm trying to run a kernel with

 options SC_NO_SYSMOUSE

to see if the hangs go away.
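
For anyone wanting to try the same workaround, a custom kernel configuration
along these lines should do it. This is only a sketch under assumptions: the
file name and ident NOSYSMOUSE are made up for illustration, and the file is
assumed to live in /usr/src/sys/amd64/conf/ next to GENERIC.

```
# /usr/src/sys/amd64/conf/NOSYSMOUSE  (hypothetical name)
include GENERIC
ident   NOSYSMOUSE

options DDB                    # in-kernel debugger, as already used here
options ALT_BREAK_TO_DEBUGGER  # enter ddb from the serial console
options SC_NO_SYSMOUSE         # leave mouse support out of syscons(4)
```

It would then be built and installed from /usr/src with the usual
'make buildkernel KERNCONF=NOSYSMOUSE' followed by
'make installkernel KERNCONF=NOSYSMOUSE'.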

Cheers
Michiel



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Jeremy Chadwick
On Sun, Jun 16, 2013 at 06:01:49PM +0200, Michiel Boland wrote:
 On 06/16/2013 17:55, Jeremy Chadwick wrote:
 [...]
 
 Are you running moused(8)?  Actually, I can see quite clearly that you
 are in your core.txt:
 
 Starting ums0 moused.
 
 Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
 might be involved.
 
 
 The moused is started by devd - I don't see a quick way of turning that off.

Comment out the relevant crap in devd.conf(5).  Search for ums
and comment out the two notify sections.

 As a workaround I'm trying to run a kernel with
 
  options SC_NO_SYSMOUSE
 
 to see if the hangs go away.

That's one way to do it, I guess.

Be aware that I do not use X, however I have repeatedly seen mentioned
on these lists problems/complexities from where people rely on moused(8)
to drive their mouse while inside of X (or possibly that X and
moused(8) are both simultaneously polling the mouse).  There's
apparently a very specific kind of X configuration you're supposed to
use to get proper mouse/keyboard/HAL/HID/whatever support, and tons of
people have it wrong.  Warren Block I think has some insights into
this, or could maybe help shed some light on what I'm remembering.

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Steven Hartland


- Original Message - 
From: Michiel Boland bolan...@xs4all.nl

To: FreeBSD Stable freebsd-stable@freebsd.org
Sent: Sunday, June 16, 2013 4:11 PM
Subject: system sporadically hangs on shutdown after switching to WITH_NEW_XORG


Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X server
with Intel driver has some issues that make it unusable for me.

The new X server and Intel driver work extremely well, so kudos to whoever made
this possible.

Unfortunately, I am now experiencing random hangs on shutdown. On shutdown the
system randomly freezes after

[...] syslogd: exiting on signal 15

I would then expect to see 'Waiting (max 60 seconds) for system process 'XXX' to
stop' messages, but these never arrive.

I panicked the machine in ddb, so I have a crash dump if someone wants to look at
it. The crashinfo is at http://barrytown.boland.org/core.txt (I would have
pasted it here but it is a bit verbose.)

Machine has an Intel G41 chipset, with a SAMSUNG SSD 830 Series HD, running
9.1-STABLE r251803. Serial console. GENERIC kernel, except for options DDB and
ALT_BREAK_TO_DEBUGGER.

Who knows what's going on here?


Does setting the sysctl: hw.usb.no_shutdown_wait=1 help?
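
To try this, the value can be changed on the running system as root with
'sysctl hw.usb.no_shutdown_wait=1' before the next shutdown. To keep it across
reboots, a line like the one below could go into /etc/sysctl.conf. This is a
minimal sketch; whether it has any effect on this particular hang is untested.

```
# /etc/sysctl.conf
# Do not wait for USB devices at system shutdown.
hw.usb.no_shutdown_wait=1
```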

   Regards
   steve





FreeBSD history

2013-06-16 Thread Andy Farkas

On 16/06/13 20:30, Jeremy Chadwick wrote:
 * Output from: strings /boot/kernel/kernel | egrep ^option Thanks.

I stumbled across this one about a week ago:

 strings /boot/kernel/kernel | head -1

and was wondering about the history of where it came from / what it means.

I can see it was added to Makefile.i386 in September 1998 but the commit 
comment mentions the defunct alpha port and searching SVN for things in 
the Attic is a PITA.


Also, according to
http://svnweb.freebsd.org/base?view=revision&revision=1 FreeBSD is 20
years old!


Is not a celebration / announcement warranted?

-andyf



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Michiel Boland

On 06/16/2013 17:37, Konstantin Belousov wrote:
[...]

I do not see anything related to i915 in the core.txt you provided.

Next time the machine hangs, start with the output of ps command from
ddb and 'show allpcpu', together with 'alltrace'.



Ok, I captured 'ps', 'show allpcpu' and 'alltrace' from a stuck shutdown. I've 
appended it to my core.txt. (See previous e-mail.) (Note that the ddb commands 
are from a different session - so the ddb output may not match with the kgdb 
output.)





Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Konstantin Belousov
On Sun, Jun 16, 2013 at 07:12:33PM +0200, Michiel Boland wrote:
 On 06/16/2013 17:37, Konstantin Belousov wrote:
 [...]
  I do not see anything related to i915 in the core.txt you provided.
 
  Next time the machine hangs, start with the output of ps command from
  ddb and 'show allpcpu', together with 'alltrace'.
 
 
 Ok, I captured 'ps', 'show allpcpu' and 'alltrace' from a stuck shutdown.
 I've appended it to my core.txt. (See previous e-mail.) (Note that the ddb
 commands are from a different session - so the ddb output may not match
 with the kgdb output.)
 

Hm, how do you initiate the shutdown? Show the exact command.
Also, from the same moment of the hung system, enter the ddb and
again do ps, alltrace and 'x rebooting'.
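
For reference, the requested capture could look like the session below, typed
at the ddb prompt on the serial console while the system is hung. This is a
sketch, not a transcript: it assumes the kernel has DDB and
ALT_BREAK_TO_DEBUGGER as described earlier in the thread and that the debugger
is entered with the alternate break sequence (Enter, ~, Ctrl-B). Only the
commands are shown, no output. The point is that everything comes from the
same hang, so the process list, the per-CPU state, the traces and the value of
'rebooting' are consistent with each other.

```
db> ps
db> show allpcpu
db> alltrace
db> x rebooting
```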




Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Michiel Boland

On 06/16/2013 19:46, Konstantin Belousov wrote:

On Sun, Jun 16, 2013 at 07:12:33PM +0200, Michiel Boland wrote:

On 06/16/2013 17:37, Konstantin Belousov wrote:
[...]

I do not see anything related to i915 in the core.txt you provided.

Next time the machine hangs, start with the output of ps command from
ddb and 'show allpcpu', together with 'alltrace'.



Ok, I captured 'ps', 'show allpcpu' and 'alltrace' from a stuck shutdown. I've
appended it to my core.txt. (See previous e-mail.) (Note that the ddb commands
are from a different session - so the ddb output may not match with the kgdb
output.)



Hm, how do you initiate the shutdown ? Show the exact command.
Also, from the same moment of the hung system, enter the ddb and
again do ps, alltrace and 'x rebooting'.



The exact command to generate the hangs from which I created the reports was

'shutdown -r now'

FWIW - the saved core from the ddb-induced panic has

(kgdb) print rebooting
$1 = 1



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Michiel Boland

On 06/16/2013 20:06, Michiel Boland wrote:

FWIW - the saved core from the ddb-induced panic has

(kgdb) print rebooting
$1 = 1



I realised instantly after I sent my message that this is meaningless - so 
please ignore that.




Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Ian Lepore
On Sun, 2013-06-16 at 09:07 -0700, Jeremy Chadwick wrote:
 On Sun, Jun 16, 2013 at 06:01:49PM +0200, Michiel Boland wrote:
  On 06/16/2013 17:55, Jeremy Chadwick wrote:
  [...]
  
  Are you running moused(8)?  Actually, I can see quite clearly that you
  are in your core.txt:
  
  Starting ums0 moused.
  
  Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
  might be involved.
  
  
  The moused is started by devd - I don't see a quick way of turning that off.
 
 Comment out the relevant crap in devd.conf(5).  Search for ums
 and comment out the two notify sections.

I don't understand why people treat devd as if it's some sort of evil
virus that they're forced to live with (using phrases like "crap in
devd.conf").  In general, the standard devd rules tend to fall into 3
categories:
  * use logger(1) to record some anomaly
  * kldload a module
  * invoke a standard /etc/rc.d script

For moused, the devd rules invoke /etc/rc.d/moused, which implies that
setting moused_enable=NO in rc.conf would be all that's needed to
disable it.
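
As a concrete sketch, the rc.conf entries might look like this. The first line
is the setting named above. The second is an assumption on my part: the stock
/etc/defaults/rc.conf also carries moused_nondefault_enable, and the rc.d
script may consult that one for devices handed to it by devd (such as ums0),
so it is worth checking /etc/rc.d/moused on the machine in question to see
which variable actually applies.

```shell
# /etc/rc.conf (sh syntax, sourced by the rc.d scripts)
moused_enable="NO"              # do not start moused at boot
moused_nondefault_enable="NO"   # assumption: also covers devd-attached mice like ums0
```

An already running daemon could then be stopped with
'/etc/rc.d/moused forcestop ums0' or simply left until the next reboot.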

-- Ian




Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Konstantin Belousov
On Sun, Jun 16, 2013 at 08:06:21PM +0200, Michiel Boland wrote:
 On 06/16/2013 19:46, Konstantin Belousov wrote:
  On Sun, Jun 16, 2013 at 07:12:33PM +0200, Michiel Boland wrote:
  On 06/16/2013 17:37, Konstantin Belousov wrote:
  [...]
  I do not see anything related to i915 in the core.txt you provided.
 
  Next time the machine hangs, start with the output of ps command from
  ddb and 'show allpcpu', together with 'alltrace'.
 
 
  Ok, I captured 'ps', 'show allpcpu' and 'alltrace' from a stuck shutdown.
  I've appended it to my core.txt. (See previous e-mail.) (Note that the ddb
  commands are from a different session - so the ddb output may not match
  with the kgdb output.)
 
 
  Hm, how do you initiate the shutdown ? Show the exact command.
  Also, from the same moment of the hung system, enter the ddb and
  again do ps, alltrace and 'x rebooting'.
 
 
 The exact command to generate the hangs from which I created the reports was
 
 'shutdown -r now'
 
 FWIW - the saved core from the ddb-induced panic has
 
 (kgdb) print rebooting
 $1 = 1

I explicitly asked you to provide me with the consistent ps/alltrace
and 'x rebooting' output.  What you did is useless.

In the ddb trace you appended, there is no thread which executes
the reboot(2) system call.




Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Michiel Boland

So apparently the value of 'rebooting' is 0 at the time of the hang...

db> x rebooting
rebooting:  0
