RE: Kernel Panic on 9.0 and 9.1 with carp on BCE network interface

2012-09-10 Thread Jean-Luc Dupont
 Date: Fri, 7 Sep 2012 17:14:41 +0400
 From: gleb...@freebsd.org
 To: jl.dup...@outlook.com
 CC: freebsd-stable@FreeBSD.org
 Subject: Re: Kernel Panic on 9.0 and 9.1 with carp on BCE network interface
 
 On Thu, Aug 30, 2012 at 02:39:10PM +, Jean-Luc Dupont wrote:
 J Sorry, it seems that I didn't put the right backtrace :
 J 
 J #0  doadump (textdump=Variable textdump is not available.
 J ) at /usr/src/sys/kern/kern_shutdown.c:271
 J 271 dumpsys(dumper);
 J (kgdb) #0  doadump (textdump=Variable textdump is not available.
 J ) at /usr/src/sys/kern/kern_shutdown.c:271
 J #1  0x807fdf02 in kern_reboot (howto=260)
 J at /usr/src/sys/kern/kern_shutdown.c:448
 J #2  0x807fe3e3 in panic (fmt=0x104 Address 0x104 out of bounds)
 J at /usr/src/sys/kern/kern_shutdown.c:636
 J #3  0x80ad2700 in trap_fatal (frame=0xc, eva=Variable eva is not available.
 J )
 J at /usr/src/sys/amd64/amd64/trap.c:857
 J #4  0x80ad2a3d in trap_pfault (frame=0xff82e97a3500, usermode=0)
 J at /usr/src/sys/amd64/amd64/trap.c:773
 J #5  0x80ad305e in trap (frame=0xff82e97a3500)
 J at /usr/src/sys/amd64/amd64/trap.c:456
 J #6  0x80abd67f in calltrap ()
 J at /usr/src/sys/amd64/amd64/exception.S:228
 J #7  0x8085f597 in m_copym (m=0x0, off0=1500, len=1480, wait=1)
 J at /usr/src/sys/kern/uipc_mbuf.c:542
 J #8  0x8092f2c8 in ip_fragment (ip=0xfe00970e0580, 
 J m_frag=0xff82e97a3728, mtu=Variable mtu is not available.
 J ) at /usr/src/sys/netinet/ip_output.c:822
 J #9  0x8092fc17 in ip_output (m=0xfe00970e0500, opt=Variable opt is not available.
 J )
 J at /usr/src/sys/netinet/ip_output.c:653
 J #10 0x80928713 in ip_forward (m=0xfe00970e0500, srcrt=Variable srcrt is not available.
 J )
 J at /usr/src/sys/netinet/ip_input.c:1494
 J #11 0x80929dc8 in ip_input (m=0xfe00970e0500)
 J at /usr/src/sys/netinet/ip_input.c:702
 
 I don't see that this is CARP related. Do you use any firewall: pf or ipfw?
 
 Can you please run the following gdb session against the core file we discussed and show the output:
 
 gdb> fr 9
 gdb> p mtu
 gdb> fr 7
 gdb> p off
 gdb> fr 8
 gdb> p m0
 gdb> p *m0
 
 -- 
 Totus tuus, Glebius.
 ___
 freebsd-stable@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-stable
 To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Hi,  

  Thank you very much for your reply. We are using IPFW with several VLANs and 
several CARP interfaces on Intel igb and bce network cards in Dell PowerEdge servers.
Since we stopped using the bce cards and switched to the igb cards only (with more 
VLANs per interface), we have not had any more panics.

Here is the debugger output you asked for:

(kgdb) fr 9
#9  0x8092fc17 in ip_output (m=0xfe00941c8300, opt=Variable opt is not available.
) at /usr/src/sys/netinet/ip_output.c:653
653 error = ip_fragment(ip, &m, mtu, ifp->if_hwassist, sw_csum);
(kgdb) p mtu
$1 = 1500
(kgdb) fr 7
#7  0x8085f597 in m_copym (m=0x0, off0=1500, len=1317, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:542
542 if (off < m->m_len)
(kgdb) p off
$2 = 1233
(kgdb) fr 8
#8  0x8092f2c8 in ip_fragment (ip=0xfe00941c8380, m_frag=0xff834869e7f8, mtu=Variable mtu is not available.
) at /usr/src/sys/netinet/ip_output.c:822
822 m->m_next = m_copym(m0, off, len, M_DONTWAIT);
(kgdb) p m0
$3 = (struct mbuf *) 0xfe00941c8300
(kgdb) p *m0
$4 = {m_hdr = {mh_next = 0xfe0081d51800, mh_nextpkt = 0x0,
    mh_data = 0xfe00941c8380 "E", mh_len = 40, mh_flags = 2, mh_type = 1,
    pad = "\000\000\000\000\000"},
  M_dat = {MH = {MH_pkthdr = {rcvif = 0xfe0003b53800, header = 0x0, len = 267,
        flowid = 0, csum_flags = 0, csum_data = 65535, tso_segsz = 0,
        PH_vt = {vt_vtag = 0, vt_nrecs = 0}, tags = {slh_first = 0x0}},
      MH_dat = {MH_ext = {
          ext_buf = 0x400092ae00400045 <Address 0x400092ae00400045 out of bounds>,
          ext_free = 0x16207, ext_arg1 = 0x42011d, ext_arg2 = 0x601005e,
          ext_size = 2660147200, ref_cnt = 0x40f7e20b010045,
          ext_type = -843971023},
        MH_databuf = 
E\000@\000�\222\000@\ab\001\000\000\000\000\000\000\000\035\001��B\000\000\000\000\000^\000\001\006\000�\216\236�\200\b\000E\000\001\v��@\0001\006��H\025T�\n\n\vK\000\025��^h���\223R\200\030\000r\213�\000\000\001\001\b\n�$*\200:��\a,
 '\0' repeats 75 times}}, 
M_databuf = 
\0008�\003\000���\000\000\000\000\000\000\000\000\v\001\000\000\000\000\000\000\000\000\000\000��,
 '\0' repeats 18 times, 
E\000@\000�\222\000@\ab\001\000\000\000\000\000\000\000\035\001��B\000\000\000\000\000^\000\001\006\000�\216\236�\200\b\000E\000\001\v��@\0001\006��H\025T�\n\n\vK\000\025��^h���\223R\200\030\000r\213�\000\000\001\001\b\n�$*\200:��\a,
 '\0' repeats 75 times}}
(kgdb) 
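
The numbers in frames 7 and 8 are consistent with m_copym() being asked to skip more bytes than the chain holds: off0 is 1500, the packet header reports len = 267, and 1500 - 267 = 1233, which is exactly the off printed in frame 7 with m already NULL. What follows is a minimal user-space sketch of that skip loop, not the kernel code. struct toy_mbuf and toy_skip() are hypothetical names, and the 40 + 227 split of the 267 bytes across two mbufs is an assumption (only mh_len = 40 for the first mbuf and the total length appear in the dump).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf: only the fields the skip loop reads. */
struct toy_mbuf {
    struct toy_mbuf *next; /* models m_next */
    int len;               /* models m_len */
};

/*
 * Walk the chain to the mbuf that contains byte offset `off`.
 * Returns that mbuf and stores the offset within it in *rem.
 * Returns NULL if the chain holds `off` bytes or fewer; *rem then
 * holds the number of bytes that were still left to skip.
 */
struct toy_mbuf *
toy_skip(struct toy_mbuf *m, int off, int *rem)
{
    assert(off >= 0);
    while (m != NULL) {
        if (off < m->len)
            break;          /* the offset falls inside this mbuf */
        off -= m->len;      /* skip this mbuf entirely */
        m = m->next;
    }
    *rem = off;
    return m;
}
```

With a chain of 40 + 227 bytes, toy_skip(&first, 1500, &rem) returns NULL and leaves rem at 1233. The sketch checks for NULL before reading m->len; line 542 as shown in frame 7 reads m->m_len with m=0x0, which matches the page fault in the backtrace.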

  

usb port issue in 9.1-Prerelease (Possibly Cam related)

2012-09-10 Thread Benjamin Close

Hi Folks,
I'm facing an intermittent hang with a USB port which seems CAM 
related:


The events that happen are:

o USB modem (HUAWEI E220) plugged into PC

ugen3.2: HUA WEI at usbus3
u3g0: 3G Modem on usbus3
u3g0: Found 3 ports.
umass0: USB MASS STORAGE on usbus3
umass0:  SCSI over Bulk-Only; quirks = 0x
umass0:6:0:-1: Attached to scbus6
umass1: USB MASS STORAGE on usbus3
umass1:  SCSI over Bulk-Only; quirks = 0x
umass1:7:1:-1: Attached to scbus7
cd1 at umass-sim0 bus 0 scbus6 target 0 lun 0
cd1: HUAWEI Mass Storage 2.31 Removable CD-ROM SCSI-2 device
cd1: 1.000MB/s transfers
cd1: Attempt to query device size failed: NOT READY, Medium not present
da0 at umass-sim1 bus 1 scbus7 target 0 lun 0
da0: HUAWEI SD Storage 2.31 Removable Direct Access SCSI-2 device
da0: 1.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present


o Time elapses; many packets passed, no da0 or cd1 used.


o USB Modem drops off the bus
   (It does this occasionally as it resets itself)

o Causes USB bus to lose devices

ugen3.2: HUA WEI at usbus3 (disconnected)
u3g0: at uhub3, port 1, addr 2 (disconnected)
(cd1:umass-sim0:0:0:0): lost device, 1 refs
(cd1:umass-sim0:0:0:0): removing device entry
(pass4:umass-sim0:0:0:0): passdevgonecb: devfs entry is gone
(da0:umass-sim1:1:0:0): lost device - 0 outstanding, 1 refs
(da0:umass-sim1:1:0:0): removing device entry
(pass5:umass-sim1:1:0:0): passdevgonecb: devfs entry is gone
umass0: at uhub3, port 1, addr 2 (disconnected)


At this point that particular USB port is effectively useless. Plugging 
anything into the port results in no device showing up.


Running usbconfig hangs with:

  PID    TID COMM             TDNAME           KSTACK
48562 101874 usbconfig        -                mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 usbd_enum_lock+0xac usb_ref_device+0x21c usb_open+0xc7 devfs_open+0x197 vn_open_cred+0x2ff kern_openat+0x20a amd64_syscall+0x540 Xfast_syscall+0xf7


Controller is:

uhci0@pci0:0:26:0:  class=0x0c0300 card=0x02091028 chip=0x28348086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI Controller'
    class      = serial bus
    subclass   = USB
uhci1@pci0:0:26:1:  class=0x0c0300 card=0x02091028 chip=0x28358086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI Controller'
    class      = serial bus
    subclass   = USB
ehci0@pci0:0:26:7:  class=0x0c0320 card=0x02091028 chip=0x283a8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB2 EHCI Controller'
    class      = serial bus
    subclass   = USB

It does, however, seem related to CAM: looking at the various threads 
for the USB hub, I find:


(kgdb) bt
#0  sched_switch (td=0xfe000265, newtd=0xfe000227f000, flags=Variable flags is not available.
) at /usr/src/sys/kern/sched_ule.c:1927
#1  0x808f34c6 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:485
#2  0x8092bfd2 in sleepq_wait (wchan=0xfe001ec2a900, pri=92) at /usr/src/sys/kern/subr_sleepqueue.c:623
#3  0x808f3c69 in _sleep (ident=0xfe001ec2a900, lock=0xfe00371e9210, priority=Variable priority is not available.
) at /usr/src/sys/kern/kern_synch.c:250
#4  0x802bea02 in cam_sim_free (sim=0xfe001ec2a900, free_devq=1) at /usr/src/sys/cam/cam_sim.c:112
#5  0x8074f8ba in umass_detach (dev=Variable dev is not available.
) at /usr/src/sys/dev/usb/storage/umass.c:2183
#6  0x8091a054 in device_detach (dev=0xfe001ec2e900) at device_if.h:214
#7  0x8075c458 in usb_detach_device (udev=0xfe0007ce8800, iface_index=32 ' ', flag=Variable flag is not available.
) at /usr/src/sys/dev/usb/usb_device.c:1065
#8  0x8075c5f4 in usb_unconfigure (udev=0xfe0007ce8800, flag=Variable flag is not available.
) at /usr/src/sys/dev/usb/usb_device.c:455
#9  0x8075c88e in usb_free_device (udev=0xfe0007ce8800, flag=Variable flag is not available.
) at /usr/src/sys/dev/usb/usb_device.c:2093
#10 0x80764e5e in uhub_explore (udev=0xfe0007353800) at /usr/src/sys/dev/usb/usb_hub.c:358
#11 0x8074f536 in usb_bus_explore (pm=Variable pm is not available.
) at /usr/src/sys/dev/usb/controller/usb_controller.c:359
#12 0x80769173 in usb_process (arg=Variable arg is not available.
) at /usr/src/sys/dev/usb/usb_process.c:170
#13 0x808bc2df in fork_exit (callout=0x807690a0 <usb_process>, arg=0xff80007c0e88, frame=0xff804743cc40) at /usr/src/sys/kern/kern_fork.c:992
#14 0x80bc216e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:602



From:   cam_sim_free(struct cam_sim *sim, int free_devq)

(kgdb) l
107 {
108 int error;
109
110 sim->refcount--;
111 if (sim->refcount > 0) {
112 error = msleep(sim, 
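
The listing above is cut off at the msleep() call, but the visible part already shows the pattern: cam_sim_free() drops its own reference and sleeps if others remain. Below is a minimal user-space sketch of that accounting, assuming the sleeper is only woken once the last remaining reference is released. struct toy_sim, toy_sim_free_must_wait() and toy_sim_release() are hypothetical names, not CAM functions, and no real sleeping or locking is modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct cam_sim: only the reference count. */
struct toy_sim {
    int refcount;
};

/*
 * Models the visible part of cam_sim_free(): drop the caller's reference
 * and report whether the caller would have to sleep because other
 * references are still outstanding.
 */
bool
toy_sim_free_must_wait(struct toy_sim *sim)
{
    assert(sim->refcount >= 1);
    sim->refcount--;
    return sim->refcount > 0;
}

/*
 * Models another holder dropping its reference (assumed behaviour):
 * returns true when this was the last reference, i.e. when a sleeper
 * in the free path could be woken.
 */
bool
toy_sim_release(struct toy_sim *sim)
{
    assert(sim->refcount >= 1);
    sim->refcount--;
    return sim->refcount == 0;
}
```

Starting from a count of 2, toy_sim_free_must_wait() returns true, and only a later toy_sim_release() brings the count to 0. If that release never happens, the detaching thread would stay asleep, which is one possible reading of the backtrace above: usb_process blocked in _sleep() under cam_sim_free(), while usbconfig waits in usbd_enum_lock().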

Re: bsnmpd always died on HDD detach

2012-09-10 Thread Miroslav Lachman

Mikolaj Golub wrote:

On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:

I am running bsnmpd with basic snmpd.config (only community and location
changed).

When there is a problem with an HDD and the disk disappears from the ATA
channel (e.g. the disk is physically removed), bsnmpd always dumps core:

kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)

I have seen this for a long time on all releases of the 7.x and 8.x
branches (i386 and amd64). I have not tested 9.x.

Is it a known bug, or should I file a PR?


Do you happen to run bsnmp-ucd too? If you do, what version is it?
In bsnmp-ucd-0.3.5 I introduced a bug that led to a bsnmpd crash on
disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.


No, I have never installed bsnmp-ucd. We are using plain bsnmpd from base 
without any modules.

It is used by MRTG for network traffic only. Nothing else.

Miroslav Lachman


Re: bsnmpd always died on HDD detach

2012-09-10 Thread Mikolaj Golub
On Mon, Sep 10, 2012 at 04:46:15PM +0200, Miroslav Lachman wrote:
 Mikolaj Golub wrote:
  On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
  I am running bsnmpd with basic snmpd.config (only community and location
  changed).
 
  When there is a problem with an HDD and the disk disappears from the ATA
  channel (e.g. the disk is physically removed), bsnmpd always dumps core:
 
  kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
 
  I have seen this for a long time on all releases of the 7.x and 8.x
  branches (i386 and amd64). I have not tested 9.x.
 
  Is it a known bug, or should I file a PR?
 
  Do you happen to run bsnmp-ucd too? If you do, what version is it?
  In bsnmp-ucd-0.3.5 I introduced a bug that led to a bsnmpd crash on
  disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.
 
 No, I have never installed bsnmp-ucd. We are using plain bsnmpd from base 
 without any modules.
 It is used by MRTG for network traffic only. Nothing else.

Then the backtrace might be useful.

gdb /usr/sbin/bsnmpd /path/to/bsnmpd.core
bt

-- 
Mikolaj Golub