Without looking at the dump itself, it's always going to be tricky.

Based only on what we have, I'd say no. I expect it's something dud 
in the Sun Ray software.

Being that I don't have access to Sunsolve Internal any more (doh!), I 
can only search what is externally available and get NO hits at all on 
any of the interesting parts of the stack.

Based on the stack, and the complete lack of good information (like how 
long after boot, whether it happens after some specific action, etc. You 
know, all the stuff we talked about last night that is required to 
diagnose a panic! ;) ), I don't have much for you.

Worse, only Sun will have access to the Sun Ray source, which will tell 
you interesting things, like what the argument types to our function are.

utstk_set_disk_major+0x1c(0, 7bb6bad0, 10, 0, 1039ba4, 0)

is interesting in that it's got lots of 0's in it. If only we knew what 
the types were supposed to be.
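
To make the "lots of 0's" observation concrete, here is a minimal Python sketch that splits one frame line from the $c output into function, offset and arguments and lists which argument slots are zero. The helper name parse_frame is made up for illustration, and the sketch assumes the frame sits on a single line (some frames in the quoted output are wrapped by the mailer). Without the types, a zero could equally be a NULL pointer, a flag or an unused register, so this only tells you where to look.

```python
import re

# One frame as printed by mdb's $c / ::stack: name+0xoff(arg, arg, ...)
# The offset is optional because the top frame (vpanic) is printed without one.
FRAME_RE = re.compile(
    r"^(?P<func>[\w.`]+)(?:\+(?:0x)?(?P<off>[0-9a-f]+))?\((?P<args>[^)]*)\)$"
)

def parse_frame(line):
    """Return (function, offset, [args]) for a single-line stack frame."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        raise ValueError("not a single-line stack frame: %r" % line)
    offset = int(m.group("off") or "0", 16)
    # mdb prints the values in hex without a 0x prefix
    args = [int(a.strip(), 16) for a in m.group("args").split(",") if a.strip()]
    return m.group("func"), offset, args

func, offset, args = parse_frame(
    "utstk_set_disk_major+0x1c(0, 7bb6bad0, 10, 0, 1039ba4, 0)"
)
zero_slots = [i for i, a in enumerate(args) if a == 0]
print(func, hex(offset), zero_slots)
```

For the frame above this reports slots 0, 3 and 5 as zero.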

But, being that it's a bad mutex on a mutex_enter that we took the 
panic on, there is a good chance it's someone having done something dumb 
in software, OR hardware. If someone could look at the dump, they could 
determine some interesting things about the mutex, and why it was 
considered bad.
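
Even without the dump, the panic string itself carries three addresses worth lining up. The sketch below (Python, with a made-up helper name parse_bad_mutex) just pulls them out and compares them. In your output, owner and thread are the same value, and lp is the address that $r resolves to utdc_mstore+0x60. What that implies about the state of the lock is an assumption until someone inspects the mutex in the dump.

```python
import re

# Matches the panic string as it appears in the ::msgbuf output.
PANIC_RE = re.compile(
    r"mutex_enter: bad mutex, "
    r"lp=(?P<lp>[0-9a-f]+) owner=(?P<owner>[0-9a-f]+) thread=(?P<thread>[0-9a-f]+)"
)

def parse_bad_mutex(msg):
    """Return {'lp', 'owner', 'thread'} as integers, or None if no match."""
    m = PANIC_RE.search(msg)
    if m is None:
        return None
    return {name: int(value, 16) for name, value in m.groupdict().items()}

info = parse_bad_mutex(
    "mutex_enter: bad mutex, lp=704ce770 owner=2a100a23ca0 thread=2a100a23ca0"
)
# True here: the panicking thread is also reported as the owner.
owner_is_panicking_thread = info["owner"] == info["thread"]
print(hex(info["lp"]), owner_is_panicking_thread)
```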

Ultimately, this is a call for Sun to resolve. Without them, you will 
have little chance of doing much more than rolling back your changes as 
a fallback position...

That being said, if it's always reproducible, and you know when it 
started (i.e. was it after the SRSS upgrade, or after the Solaris 
upgrade?), there is a good chance you'll be able to isolate where it's 
going wrong.

My wild a$$ guess is a bug in SRSS, or corrupted binaries used for the 
install/disk based corruption.

But - That's based on very very little.

A pkgchk might be of value to ensure all the driver binaries actually 
look like they are supposed to.

Additionally, it would be worth checking all your other drivers as 
well... Remember our Emulex example? ;)
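
If you want to run pkgchk over a handful of packages in one go, something like the following Python sketch would do. It assumes Python 3.7 or later is available on the box, and it assumes pkgchk signals problems through a non-zero exit status or messages on stderr (check the man page on your release). The function name check_packages is made up, and the package names in the usage comment are placeholders: take the real Sun Ray and driver package names from pkginfo on your system.

```python
import subprocess

def check_packages(pkgs, run=subprocess.run):
    """Run `pkgchk <pkg>` for each package and collect the ones that complain.

    Returns {package: combined output} for every package whose check
    exits non-zero or writes anything to stderr.
    """
    failures = {}
    for pkg in pkgs:
        proc = run(["pkgchk", pkg], capture_output=True, text=True)
        if proc.returncode != 0 or proc.stderr.strip():
            failures[pkg] = (proc.stdout + proc.stderr).strip()
    return failures

# Usage (placeholder package names, not real ones):
#   for pkg, output in check_packages(["PKGone", "PKGtwo"]).items():
#       print(pkg, output, sep="\n")
```

The run parameter is only there so the function can be exercised without pkgchk installed.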

You have, however, the unfortunate distinction where in starting any 
sort of SGR/KT analysis you'll be starting with:

What has changed?
        Freaking Everything!

When did the problem first start happening?
        When I fooled with the system

Where is it not happening that it could be happening?
        The systems I have not fooled with

heh heh. Just kidding - But it's worth asking those questions, as if 
there are a few servers, and you have updated them all, and only one is 
failing, then that's information that's helpful.

Oh - and don't forget - If the panics all have different stacks, there 
is a real chance that it's hardware... Just a thought.
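
A quick way to check that, if you end up with several dumps: reduce each $c listing to just its function names and see how many distinct stacks you get. The Python sketch below does that. The helper names are made up, and it only recognises lines that start with a function name followed by an opening parenthesis, so the wrapped continuation lines in mailer-mangled output are simply skipped.

```python
import re

# Start of a frame: name, optional +offset, then "(".
FUNC_RE = re.compile(r"^([\w.`]+)(?:\+(?:0x)?[0-9a-f]+)?\(")

def stack_signature(stack_text):
    """Reduce a $c / ::stack listing to a tuple of function names, top first."""
    names = []
    for line in stack_text.splitlines():
        m = FUNC_RE.match(line.strip())
        if m:
            names.append(m.group(1))
    return tuple(names)

def distinct_signatures(stacks):
    """Return the set of distinct call paths across several panics."""
    return {stack_signature(s) for s in stacks}

this_panic = """vpanic(10a0c88, 181de60, 704ce770, 2a100a23ca0, 2a100a23ca0, 0)
utstk_set_disk_major+0x1c(0, 7bb6bad0, 10, 0, 1039ba4, 0)
utdisk_detach+0x38(30002663cd8, 0, 2a100a23560, 0, 10, 0)"""

print(stack_signature(this_panic))
```

One signature across all the panics points at a repeatable code path; a different signature every time is the pattern that makes hardware more likely.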

Cheers!

Nathan.

Zia-ul-Hassan Zia wrote:
> Hi Nathan/Boyd,
> 
> I had a core dump on a Sun Blade W/S which is running Solaris 10 (05/09) 
> and SRS-3.0 software. Following is what I could remember from last 
> night's presentation from Gary and Nathan, although I forgot the syntax 
> of a few more funky dcmds like getwork*** and printing from a particular 
> address.
> 
> Can you please advise me if the pseudo device pm0 is the culprit of this 
> system panic and what a possible solution could be?
> 
> 
> 
> bash-3.00# mdb -k 5
> Loading modules: [ unix genunix specfs dtrace ufs pcisch ip hook neti 
> sctp arp usba s1394 fcp fctl qlc emlxs lofs audiosup md sd crypto fcip 
> random zfs logindmux ptm nfs ]
>  > $c
> vpanic(10a0c88, 181de60, 704ce770, 2a100a23ca0, 2a100a23ca0, 0)
> utstk_set_disk_major+0x1c(0, 7bb6bad0, 10, 0, 1039ba4, 0)
> utdisk_detach+0x38(30002663cd8, 0, 2a100a23560, 0, 10, 0)
> devi_detach+0xa4(30002663cd8, 0, 109, 0, 7bb6a608, 0)
> detach_node+0x64(30002663cd8, 2000, 0, 2000, 0, 0)
> i_ndi_unconfig_node+0x144(30002663cd8, 12c, 2000, 10d4010, 14, 18688a8)
> i_ddi_detachchild+0x14(30002663cd8, 2000, 18381a0, 18381a0, 180c000,
> ffffffffffffffff)
> devi_detach_node+0x64(30002663cd8, 2000, 67, 5a2, 80000, 2000)
> unconfig_immediate_children+0x180(30000d1d610, 0, ffffffffffffffff, 
> 30002663cd8
> , 0, 2000)
> devi_unconfig_common+0x1a8(30000d1d610, 0, 6, 0, 0, 2000)
> mt_config_thread+0xac(30004626008, 0, 18381a0, 18381a0, 30005021040, 
> 30000d1d610
> )
> thread_start+4(30004626008, 0, 0, 0, 0, 0)
>  > ::msgbuf
> MESSAGE
> PCI-device: usb at a, ohci0
> ohci0 is /pci at 1e,600000/usb at a
> PCI-device: usb at b, ohci1
> ohci1 is /pci at 1e,600000/usb at b
> PCI-device: usb at 8, ohci2
> ohci2 is /pci at 1e,600000/pci at 2/usb at 8
> PCI-device: usb at 8,1, ohci3
> ohci3 is /pci at 1e,600000/pci at 2/usb at 8,1
> cpu0: UltraSPARC-IIIi (portid 0 impl 0x16 ver 0x24 clock 1062 MHz)
> PCI-device: firewire at b, hci13940
> hci13940 is /pci at 1e,600000/pci at 2/firewire at b
> iscsi0 at root
> iscsi0 is /iscsi
> NOTICE: bge0 registered
> NOTICE: bge0: link down (initialized)
> PCI-device: pci at 4, pci_pci1
> pci_pci1 is /pci at 1e,600000/pci at 4
> SUNW,qfe0: found CheerIO 2.0 (rev = C1)
> PCI-device: IntraServer,qfe at 0,1, qfe0
> qfe0 is /pci at 1e,600000/pci at 4/IntraServer,qfe at 0,1
> SUNW,qfe1: found CheerIO 2.0 (rev = C1)
> PCI-device: IntraServer,qfe at 1,1, qfe1
> qfe1 is /pci at 1e,600000/pci at 4/IntraServer,qfe at 1,1
> SUNW,qfe2: found CheerIO 2.0 (rev = C1)
> PCI-device: IntraServer,qfe at 2,1, qfe2
> qfe2 is /pci at 1e,600000/pci at 4/IntraServer,qfe at 2,1
> SUNW,qfe3: found CheerIO 2.0 (rev = C1)
> PCI-device: IntraServer,qfe at 3,1, qfe3
> qfe3 is /pci at 1e,600000/pci at 4/IntraServer,qfe at 3,1
> dad0 at uata0
>  target 0 lun 0
> dad0 is /pci at 1e,600000/ide at d/dad at 0,0
> NOTICE: bge0: link up 100Mbps Full-Duplex (initialized)
> dump on /dev/dsk/c0t0d0s1 size 1025 MB
> SUNW,qfe0: 10 Mbps half duplex link up - internal  transceiver
> PCI-device: sound at 8, audiots0
> audiots0 is /pci at 1e,600000/sound at 8
> isadma0 at ebus0: offset 0,0
> ecpp0 at ebus0: offset 0,378
> ecpp0 is /pci at 1e,600000/isa at 7/dma at 0,0/parallel at 0,378
> scmi2c0 at smbus0: addr 0x40
> scmi2c0 is /pci at 1e,600000/pmu at 6/i2c at 0,0/card-reader at 40
> pseudo-device: devinfo0
> devinfo0 is /pseudo/devinfo at 0
> pseudo-device: pseudo1
> pseudo1 is /pseudo/zconsnex at 1
> su1 at ebus0: offset 0,2e8
> su1 is /pci at 1e,600000/isa at 7/serial at 0,2e8
> sd0 at uata0: target 2 lun 0
> sd0 is /pci at 1e,600000/ide at d/sd at 2,0
> pseudo-device: profile0
> profile0 is /pseudo/profile at 0
> pseudo-device: ramdisk1024
> ramdisk1024 is /pseudo/ramdisk at 1024
> pseudo-device: dtrace0
> dtrace0 is /pseudo/dtrace at 0
> pseudo-device: lockstat0
> lockstat0 is /pseudo/lockstat at 0
> pseudo-device: fbt0
> fbt0 is /pseudo/fbt at 0
> pseudo-device: systrace0
> systrace0 is /pseudo/systrace at 0
> pseudo-device: sdt0
> sdt0 is /pseudo/sdt at 0
> pseudo-device: fasttrap0
> fasttrap0 is /pseudo/fasttrap at 0
> pseudo-device: llc10
> llc10 is /pseudo/llc1 at 0
> pseudo-device: tod0
> tod0 is /pseudo/tod at 0
> pseudo-device: lofi0
> lofi0 is /pseudo/lofi at 0
> pseudo-device: fcp0
> fcp0 is /pseudo/fcp at 0
> pseudo-device: trapstat0
> trapstat0 is /pseudo/trapstat at 0
> pseudo-device: zfs0
> zfs0 is /pseudo/zfs at 0
> pseudo-device: mem_cache0
> mem_cache0 is /pseudo/mem_cache at 0
> pseudo-device: fcsm0
> fcsm0 is /pseudo/fcsm at 0
> pseudo-device: fssnap0
> fssnap0 is /pseudo/fssnap at 0
> pseudo-device: winlock0
> winlock0 is /pseudo/winlock at 0
> pseudo-device: vol0
> vol0 is /pseudo/vol at 0
> pseudo-device: utdiskctl0
> utdiskctl0 is /pseudo/utdiskctl at 0
> pseudo-device: pool0
> pool0 is /pseudo/pool at 0
> pseudo-device: utdisk0
> utdisk0 is /pseudo/utdisk at 0
> IP Filter: v4.1.9, running.
> pseudo-device: pm0
> pm0 is /pseudo/pm at 0
> 
> panic[cpu0]/thread=2a100a23ca0:
> mutex_enter: bad mutex, lp=704ce770 owner=2a100a23ca0 thread=2a100a23ca0
> 
> 
> 000002a100a23350 utdiskctl:_init+3b04 (0, 7bb6bad0, 10, 0, 1039ba4, 0)
>   %l0-3: 00000000704ce770 0000000000000000 0000000000000002 00000000011a79e4
>   %l4-7: 0000000000000002 00000000012c47e0 00000000012c4400 0000000000000017
> 000002a100a23400 utdisk:_init+230 (30002663cd8, 0, 2a100a23560, 0, 10, 0)
>   %l0-3: 000002a100a23490 0000000000000000 0000000000000000 000002a100a234a8
>   %l4-7: 0000000000000000 0000000000000000 000000000186d800 0000000000000109
> 000002a100a234b0 genunix:devi_detach+a4 (30002663cd8, 0, 109, 0, 
> 7bb6a608, 0)
>   %l0-3: 0000000000000000 0000000001868400 0000030000108700 0000030000106000
>   %l4-7: 0000000000002700 00000000018cdc00 00000000000004e0 00000000011a50cc
> 000002a100a23580 genunix:detach_node+64 (30002663cd8, 2000, 0, 2000, 0, 0)
>   %l0-3: 00000300001089d8 0000030000106000 00000000000029d8 00000000018cdc00
>   %l4-7: 000000000000053b 00000000000005a2 00000000000002d1 0000000000000338
> 000002a100a23630 genunix:i_ndi_unconfig_node+144 (30002663cd8, 12c, 
> 2000, 10d401
> 0, 14, 18688a8)
>   %l0-3: 00000000fffdffff 0000000000020000 0000000000000000 0000030002663d40
>   %l4-7: 0000000000000004 00000000fffdfc00 0000000001868800 0000000000000005
> 000002a100a236e0 genunix:i_ddi_detachchild+14 (30002663cd8, 2000, 
> 18381a0, 18381
> a0, 180c000, ffffffffffffffff)
>   %l0-3: 0000000080001606 0000000000000016 0000000000006962 0000000070008dc0
>   %l4-7: 000002a100a23ca0 0000000000000000 0000000000000000 000002a100a23700
> 000002a100a23790 genunix:devi_detach_node+64 (30002663cd8, 2000, 67, 
> 5a2, 80000,
>  2000)
>   %l0-3: 0000030000d1d678 0000000000000000 0000000000000006 0000000000000200
>   %l4-7: 0000000000000201 00000300001089d8 0000030000106000 00000000000029d8
> 000002a100a23850 genunix:unconfig_immediate_children+180 (30000d1d610, 
> 0, ffffff
> ffffffffff, 30002663cd8, 0, 2000)
>   %l0-3: 0000000000002000 0000000000002000 ffffffffffffffff 0000030002663aa8
>   %l4-7: 000000000180c000 00000000ffffffff ffffffffffffffff 0000000000002000
> 000002a100a23910 genunix:devi_unconfig_common+1a8 (30000d1d610, 0, 6, 0, 
> 0, 2000
>  > $r
> %g0 = 0x0000000000000000                 %l0 = 0x0000000000002000
> %g1 = 0x00000000010a0c00                 %l1 = 0x0000000000002000
> %g2 = 0x0000000000000000                 %l2 = 0xffffffffffffffff
> %g3 = 0xffffffffffffffff                 %l3 = 0x0000030002663aa8
> %g4 = 0x0000000000000000                 %l4 = 0x0000000000001f00
> %g5 = 0x00000000704ce400 corona_cb_ops+0x38 %l5 = 0x000000000181de60
> %g6 = 0x0000000000000716                 %l6 = 0x0000000001834c00
> lgrp_kstat_data+0x258
> %g7 = 0x000002a100a23ca0                 %l7 = 0x0000000001834c00
> lgrp_kstat_data+0x258
> 
> %o0 = 0x00000000010a0c88                 %i0 = 0x00000000010a0c88
> %o1 = 0x000002a100a233d8                 %i1 = 0x000000000181de60
> %o2 = 0x0000000000000000                 %i2 = 0x00000000704ce770
> utdc_mstore+0x60
> %o3 = 0x0000000000000064                 %i3 = 0x000002a100a23ca0
> %o4 = 0x0000000000000069                 %i4 = 0x000002a100a23ca0
> %o5 = 0x0000000000000005                 %i5 = 0x0000000000000000
> %o6 = 0x000002a100a22aa1                 %i6 = 0x000002a100a22b51
> %o7 = 0x000000000106b304      panic+0x1c %i7 = 0x000000007b7d9b04
> utstk_set_disk_major+0x1c
> 
>  %ccr = 0x44 xcc=nZvc icc=nZvc
> %fprs = 0x00 fef=0 du=0 dl=0
>  %asi = 0x80
>    %y = 0x0000000000000000
>   %pc = 0x00000000010498ac vpanic
>  %npc = 0x00000000010498b0 vpanic+4
>   %sp = 0x000002a100a22aa1 unbiased=0x000002a100a232a0
>   %fp = 0x000002a100a22b51
> 
>   %tick = 0x0000000000000000
>    %tba = 0x0000000000000000
>     %tt = 0x0
>     %tl = 0x0
>    %pil = 0x0
> %pstate = 0x016 cle=0 tle=0 mm=TSO red=0 pef=1 am=0 priv=1 ie=1 ag=0
> 
>        %cwp = 0x05  %cansave = 0x00
> %canrestore = 0x00 %otherwin = 0x00
>     %wstate = 0x00 %cleanwin = 0x00
>  > ::cpuinfo -v
>  ID ADDR        FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD      PROC
>   0 0000183a668  1b    5    0  60  yes    no t-4    2a100a23ca0 sched
>                   |    |
>        RUNNING <--+    +-->  PRI THREAD      PROC
>          READY                99 2a10010fca0 sched
>         EXISTS                60 30004650820 tar
>         ENABLE                60 2a1038d9ca0 sched
>                               59 300024eda40 svc.startd
>                               53 3000456b400 tar
> 
>  > ::ps
> S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
> R      0      0      0      0      0 0x00000001 00000000018381a0 sched
> R      3      0      0      0      0 0x00020001 00000300003bf848 fsflush
> R      2      0      0      0      0 0x00020001 00000300003c0468 pageout
> R      1      0      0      0      0 0x4a004000 00000300003c1088 init
> R    730      1    730    730      0 0x4a004000 0000030002b9ec48 jfbdaemon
> R    638      1    638    638      0 0x42000000 0000030002b83880 fpsd
> R    537      1    537    537      0 0x42000000 0000030002b810c8 fmd
> R    624      1    624    624      0 0x42020000 0000030002b82040 snmpXdmid
> R    568      1    563    563      0 0x42000000 00000300045164b0 snmpd
> R    551      1    551    551      0 0x42000000 0000030002b844a0 dmispd
> R    539      1    539    539      0 0x42000000 0000030002b7f888 devfsadm
> R    532      1    532    532      0 0x42010000 0000030002b82c60 snmpdx
> R    510      1    510    510      0 0x4a014000 0000030002b8ec58 vold
> R    490      1    490    490      0 0x42000000 0000030002b8e038 syslogd
> R    482      1    482    482      0 0x42000000 0000030002b8f878 sshd
> R    461      1    461    461      0 0x42000000 0000030002b950b0 automountd
> R    462    461    461    461      0 0x42000000 0000030002ba10a8 automountd
> R    342      1    342    342      0 0x42000000 0000030002b94490 inetd
> R    337      1    337    337      0 0x42000000 0000030002b92c50 utmpd
> R    308      1    308    308      1 0x42000000 0000030002b9e028 lockd
> R    298      1    298    298      1 0x42000000 0000030002ba2020 statd
> R    297      1    297    297      1 0x52000000 00000300003b7858 nfsmapid
> R    296      1    294    294      1 0x42000000 00000300003b6018 nfs4cbd
> R    291      1    291    291      1 0x42000000 00000300003bb850 rpcbind
> R    280      1    280    280      0 0x42010000 00000300003b8478 cron
> R    232      1      7      7      0 0x42000000 00000300003b6c38 iscsid
> R    197      1    197    197      0 0x42000000 0000030002ba3860 powerd
> R    196      1    196    196      0 0x42000000 0000030002ba0488 nscd
> R    193      1    193    193      1 0x42000000 00000300003ba010 kcfd
> R    192      1    192    192      0 0x42000000 00000300003bec28 picld
> R    148      1    148    148      0 0x42000000 00000300003bac30 syseventd
> R     81      1     80     80      0 0x42020000 00000300003bc470 dhcpagent
> R      9      1      9      9      0 0x42000000 00000300003bd090 svc.configd
> R      7      1      7      7      0 0x42000000 00000300003be008 svc.startd
> R    496      7      7      7      0 0x4a004000 0000030002ba2c40 rc2
> R    738    496      7      7      0 0x4a004000 0000030002ba4480 lsvcrun
> R    739    738      7      7      0 0x4a004000 0000030002b850c0 sh
> R    922    739      7      7      0 0x4a004000 0000030002b92030 tar
> R    923    922      7      7      0 0x4a004000 0000030002b7ec68 tar
> R    323      7    323    323      0 0x4a004000 0000030002b9f868 ttymon
> R    305      7    305    305      0 0x4a014000 00000300003b9098 sac
> R    336    305    305    305      0 0x4a014000 0000030002b93870 ttymon
>  >
>  >
>  > ::stack
> vpanic(10a0c88, 181de60, 704ce770, 2a100a23ca0, 2a100a23ca0, 0)
> utstk_set_disk_major+0x1c(0, 7bb6bad0, 10, 0, 1039ba4, 0)
> utdisk_detach+0x38(30002663cd8, 0, 2a100a23560, 0, 10, 0)
> devi_detach+0xa4(30002663cd8, 0, 109, 0, 7bb6a608, 0)
> detach_node+0x64(30002663cd8, 2000, 0, 2000, 0, 0)
> i_ndi_unconfig_node+0x144(30002663cd8, 12c, 2000, 10d4010, 14, 18688a8)
> i_ddi_detachchild+0x14(30002663cd8, 2000, 18381a0, 18381a0, 180c000,
> ffffffffffffffff)
> devi_detach_node+0x64(30002663cd8, 2000, 67, 5a2, 80000, 2000)
> unconfig_immediate_children+0x180(30000d1d610, 0, ffffffffffffffff, 
> 30002663cd8
> , 0, 2000)
> devi_unconfig_common+0x1a8(30000d1d610, 0, 6, 0, 0, 2000)
> mt_config_thread+0xac(30004626008, 0, 18381a0, 18381a0, 30005021040, 
> 30000d1d610
> )
> thread_start+4(30004626008, 0, 0, 0, 0, 0)
>  > ::memstat
> Page Summary                Pages                MB  %Tot
> ------------     ----------------  ----------------  ----
> Kernel                      11209                87    9%
> Anon                         6557                51    5%
> Exec and libs                1591                12    1%
> Page cache                  13143               102   10%
> Free (cachelist)            22451               175   18%
> Free (freelist)             73235               572   57%
> 
> Total                      128186              1001
> Physical                   127297               994
> 
> 
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> Kind Regards,
> Zia-ul-Hassan
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> ug-msosug mailing list
> ug-msosug at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/ug-msosug
