FYI, in case anyone else encounters this issue: the card I could reproduce this with was hardware revision B4. I RMAed it through Digium support, received a newer revision C card, and the issue is gone.
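For anyone wanting to check whether a box is showing the same symptom before chasing an RMA, here is a minimal sketch that counts the "Version Synchronization Error!" lines in syslog text. The function name `count_sync_errors` is made up for illustration, and the pattern is only an assumption based on the two spellings that appear in the quoted output below; it is not taken from the driver source.

```python
import re

# Pattern assumed from the quoted output: the console text spells it
# "Syncronization" while the syslog line spells it "Synchronization".
SYNC_ERROR = re.compile(r"TE\dXXP: Version Synch?ronization\s+Error!")


def count_sync_errors(lines):
    """Return how many log lines contain the version sync error message."""
    return sum(1 for line in lines if SYNC_ERROR.search(line))


# Sample lines copied from the quoted report; a real check would iterate
# over an open log file instead of this list.
sample = [
    "Oct 20 07:11:54 localhost kernel: TE4XXP: Version Synchronization Error!",
    "TE4XXP: Version Syncronization Error!",
    "Oct 20 15:20:54 redbox-ast16 kernel: dahdi: Version: 2.2.0.2",
]
print(count_sync_errors(sample))
```

To scan a real log, pass an open file object, e.g. `count_sync_errors(open("/var/log/messages", errors="replace"))`; the path is the usual CentOS default and may differ on other systems.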
On 20 Oct, 2009, at 3:25 PM, Chris Brentano wrote:

> I've seen this consistently on three systems, with three different
> cards, and multiple versions of DAHDI. At first I thought the issue
> only occurred on newer, Nehalem-based, systems, but I reproduced it on
> a Core 2 Duo box as well. I've tested with dahdi-linux 2.2.0.2,
> dahdi-linux-complete 2.0.0+2.0.0, 2.1.0.2+2.1.0.2, and 2.2.0.2+2.2.0.
> The card is a Digium TE220B which uses the wct4xxp module. This does
> not happen, on the same systems and kernel version, with a TE121 using
> the wcte12xp module nor does it happen with a T100P using wct1xxp. OS
> is CentOS 5.3, and happens with kernel versions 2.6.18-164.el5 and
> 2.6.18-128.el5. I'm posting this wondering if anyone else has seen
> similar behavior.
>
> /etc/dahdi/system.conf:
> span=1,1,0,esf,b8zs
> bchan=1-23
> dchan=24
> loadzone=us
> defaultzone=us
>
> /etc/dahdi/modules:
> wct4xxp
> wcte12xp
> wct1xxp
>
> ---
>
> When I start dahdi, I see the following:
>
> # /etc/init.d/dahdi start
> Loading DAHDI hardware modules:
> wct4xxp: [ OK ]
> wcte12xp: [ OK ]
> wct1xxp: [ OK ]
>
> Running dahdi_cfg: VPM400: Not Present
> VPM450: Not Present
> [ OK ]
>
> Syslog output:
>
> Oct 20 15:20:54 redbox-ast16 kernel: dahdi: Telephony Interface Registered on major 196
> Oct 20 15:20:54 redbox-ast16 kernel: dahdi: Version: 2.2.0.2
> Oct 20 15:20:54 redbox-ast16 kernel: ACPI: PCI Interrupt 0000:03:08.0[A] -> GSI 16 (level, low) -> IRQ 169
> Oct 20 15:20:54 redbox-ast16 kernel: Found TE2XXP at base address dfbfff80, remapped to ffffc20000022f80
> Oct 20 15:20:54 redbox-ast16 kernel: TE2XXP version c01a016c, burst ON
> Oct 20 15:20:54 redbox-ast16 kernel: Octasic optimized!
> Oct 20 15:20:54 redbox-ast16 kernel: FALC version: 00000005, Board ID: 00
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 0: 0x056af400
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 1: 0x056af000
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 2: 0x00000000
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 3: 0x00000000
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 4: 0x0000ff01
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 5: 0x00000000
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 6: 0xc01a016c
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 7: 0x00001000
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 8: 0x00000000
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 9: 0x00ff00ff
> Oct 20 15:20:54 redbox-ast16 kernel: Reg 10: 0x0000004a
> Oct 20 15:20:54 redbox-ast16 kernel: Found a Wildcard: Wildcard TE220 (4th Gen)
> Oct 20 15:20:54 redbox-ast16 kernel: TE2XXP: Launching card: 0
> Oct 20 15:20:54 redbox-ast16 kernel: TE2XXP: Setting up global serial parameters
> Oct 20 15:20:55 redbox-ast16 kernel: About to enter spanconfig!
> Oct 20 15:20:55 redbox-ast16 kernel: Done with spanconfig!
> Oct 20 15:20:55 redbox-ast16 kernel: dahdi: Registered tone zone 0 (United States / North America)
> Oct 20 15:20:55 redbox-ast16 kernel: About to enter startup!
> Oct 20 15:20:55 redbox-ast16 kernel: TE2XXP: Span 1 configured for ESF/B8ZS
> Oct 20 15:20:55 redbox-ast16 kernel: wct2xxp: Setting yellow alarm on span 1
> Oct 20 15:20:55 redbox-ast16 kernel: timing source auto card 0!
> Oct 20 15:20:55 redbox-ast16 kernel: SPAN 1: Primary Sync Source
> Oct 20 15:20:55 redbox-ast16 kernel: VPM400: Not Present
> Oct 20 15:20:55 redbox-ast16 kernel: VPM450: Not Present
> Oct 20 15:20:55 redbox-ast16 kernel: Completed startup!
>
> ---
>
> Now if I either start asterisk, or if I stop dahdi, it will panic:
>
> # /etc/init.d/dahdi stop
> Unloading DAHDI hardware modules: TE4XXP: Version Syncronization Error!
> TE4XXP: Version Syncronization Error!
> TE4XXP: Version Syncronization Error!
> TE4XXP: Version Syncronization Error!
>
>
>
> HARDWARE ERROR
> CPU 1: Machine Check Exception: 4 Bank 8: 00000000000000
> TSC 0
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> Kernel panic - not syncing: Uncorrected machine check
>
>
> Syslog output (not much before restart):
>
> Oct 20 07:11:54 localhost kernel: TE4XXP: Version Synchronization Error!
> Oct 20 07:14:24 localhost syslogd 1.4.1: restart.
> ...
>
> ---
>
> I only see the machine check exception on the two Nehalem boxes (HP
> ProLiant ML350 G6, Z800 Workstation); on a Core 2 Duo (Dell Optiplex
> 745) it just hard freezes after the "Version Syncronization Error!"
> messages. If there's any further details I can provide I'm happy to do
> so. Would like to figure out what's happening here if anyone can help
> shed any light as this is completely holding up migration to Asterisk
> 1.6 and DAHDI. Thanks.
>
> - Chris
>
>
> _______________________________________________
> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
>
> asterisk-users mailing list
> To UNSUBSCRIBE or update options visit:
> http://lists.digium.com/mailman/listinfo/asterisk-users