Hi all,

Recently one of my company's servers, a Dell R710 attached to two Sun J4400
JBODs, started to crash quite often.
I finally got a message in /var/adm/messages that might point to something
useful, but I don't have the expertise to troubleshoot this problem, so any
help would be highly valuable.

Thanks in advance,

Bruno Sousa



The significant messages are:

Apr 13 11:12:04 san01 savecore: [ID 570001 auth.error] reboot after panic: Freeing a free IOMMU page: paddr=0xccca2000
Apr 13 11:12:04 san01 savecore: [ID 385089 auth.error] Saving compressed system crash dump in /var/crash/san01/vmdump.0
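
In case it helps anyone check their own logs for the same panic, here is a minimal Python sketch that picks out the savecore "reboot after panic" lines. It assumes the line format shown above, with each message on a single unwrapped line. The function name find_panics is made up for illustration.

```python
import re

# Matches the savecore "reboot after panic" line format shown above.
# Assumption: each message sits on one line (no mail-client wrapping).
PANIC_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+savecore:\s+"
    r"\[ID \d+ \S+\]\s+reboot after panic:\s*(?P<reason>.*)$"
)


def find_panics(lines):
    """Return (timestamp, host, reason) for every panic line found."""
    found = []
    for line in lines:
        m = PANIC_RE.match(line.strip())
        if m:
            found.append((m.group("stamp"), m.group("host"), m.group("reason")))
    return found
```

It can be fed any iterable of lines, for example an open file handle on /var/adm/messages.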

I also noticed other "interesting" messages, such as:

Apr 13 11:11:10 san01 unix: [ID 378719 kern.info] NOTICE: cpu_acpi: _PSS package evaluation failed for with status 5 for CPU 0.
Apr 13 11:11:10 san01 unix: [ID 388705 kern.info] NOTICE: cpu_acpi: error parsing _PSS for CPU 0
Apr 13 11:11:10 san01 unix: [ID 928200 kern.info] NOTICE: SpeedStep support is being disabled due to errors parsing ACPI P-state objects exported by BIOS

Apr 13 11:10:50 san01 scsi: [ID 243001 kern.info] /p...@0,0/pci8086,3...@4/pci1028,1...@0 (mpt0):
Apr 13 11:10:50 san01       DMA restricted below 4GB boundary due to errata
Apr 13 11:11:32 san01 scsi: [ID 243001 kern.info] /p...@0,0/pci8086,3...@9/pci1000,3...@0 (mpt2):
Apr 13 11:11:32 san01       DMA restricted below 4GB boundary due to errata
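
I don't know whether these mpt messages are related to the panic. As a simple arithmetic check, the sketch below shows that the physical address from the panic message lies below the 4GB boundary and is page-aligned. The 4 KiB page size is my assumption (the base x86 page size), and none of this proves a connection between the two.

```python
PADDR = 0xCCCA2000   # paddr from the panic message
FOUR_GIB = 1 << 32   # the "4GB boundary" named in the mpt messages
PAGE_SIZE = 0x1000   # assumption: 4 KiB base x86 page size

below_4g = PADDR < FOUR_GIB            # is the address under the boundary?
page_aligned = PADDR % PAGE_SIZE == 0  # does it sit on a page boundary?
gib = PADDR / (1 << 30)                # address expressed in GiB

print(f"paddr={PADDR:#x} below_4g={below_4g} "
      f"page_aligned={page_aligned} (~{gib:.2f} GiB)")
```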



Relevant specs of the machine:

SunOS san01 5.11 snv_134 i86pc i386 i86pc Solaris

rpool boot drives attached to a Dell SAS6/iR Integrated RAID Controller
(mpt0, firmware version v0.25.47.0 (IR))
2 LSI 1068E HBAs, each connected to a J4400 JBOD (mpt1, firmware version
v1.26.0.0 (IT))

Multipathing is enabled and working

2 quad-core CPUs, 16 GB RAM

Detailed info:


mdb -k unix.0 vmcore.0

mdb: warning: dump is from SunOS 5.11 snv_132; dcmds and macros may not
match kernel implementation
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc
pcplusmp scsi_vhci zfs mpt sd sockfs ip hook neti sctp arp usba uhci
fctl stmf md lofs idm nfs random sppp fcip cpc crypto logindmux ptm
nsctl ufs ipc ]

::status

debugging crash dump vmcore.0 (64-bit) from san01
operating system: 5.11 snv_132 (i86pc)
panic message: Freeing a free IOMMU page: paddr=0xccca2000
dump content: kernel pages only

::stack

vpanic()
iommu_page_free+0xcb(ffffff04e3da5000, ccca2000)
iommu_free_page+0x15(ffffff04e3da5000, ccca2000)
iommu_setup_level_table+0xa0(ffffff054406d000, ffffff0543b99000, 8)
iommu_setup_page_table+0xa0(ffffff054406d000, 100c000)
iommu_map_page_range+0x6a(ffffff054406d000, 100c000, 3c2329000, 3c2329000, 2)
iommu_map_dvma+0x50(ffffff054406d000, 100c000, 3c2329000, 1000, ffffff001f7f31d0)
intel_iommu_map_sgl+0x22f(ffffff0553b43e00, ffffff001f7f31d0, 41)
rootnex_coredma_bindhdl+0x11e(ffffff04e3ef5cb0, ffffff04e607f540, ffffff0553b43e00, ffffff001f7f31d0, ffffff0553efdc50, ffffff0553efdbf8)
rootnex_dma_bindhdl+0x36(ffffff04e3ef5cb0, ffffff04e607f540, ffffff0553b43e00, ffffff001f7f31d0, ffffff0553efdc50, ffffff0553efdbf8)
ddi_dma_buf_bind_handle+0x117(ffffff0553b43e00, ffffff055860cd00, a, 0, 0, ffffff0553efdc50)
scsi_dma_buf_bind_attr+0x48(ffffff0553efdb90, ffffff055860cd00, a, 0, 0)
scsi_init_cache_pkt+0x2d0(ffffff05456302e0, 0, ffffff055860cd00, a, 20, 0)
scsi_init_pkt+0x5c(ffffff05456302e0, 0, ffffff055860cd00, a, 20, 0)
vhci_bind_transport+0x54d(ffffff0543191c58, ffffff055d2f8968, 40000, 0)
vhci_scsi_init_pkt+0x160(ffffff0543191c58, 0, ffffff055860cd00, a, 20, 0)
scsi_init_pkt+0x5c(ffffff0543191c58, 0, ffffff055860cd00, a, 20, 0)
sd_setup_rw_pkt+0x12a(ffffff0543b9d080, ffffff001f7f3688, ffffff055860cd00, 40000, fffffffff7a91b80, ffffff0543b9d080)
sd_initpkt_for_buf+0xad(ffffff055860cd00, ffffff001f7f36f8)
sd_start_cmds+0x197(ffffff0543b9d080, 0)
sd_core_iostart+0x186(4, ffffff0543b9d080, ffffff055860cd00)
sd_mapblockaddr_iostart+0x306(3, ffffff0543b9d080, ffffff055860cd00)
sd_xbuf_strategy+0x50(ffffff055860cd00, ffffff0544cf0a00, ffffff0543b9d080)
xbuf_iostart+0x1e5(ffffff04f21cce80)
ddi_xbuf_qstrategy+0xd3(ffffff055860cd00, ffffff04f21cce80)
sdstrategy+0x101(ffffff055860cd00)
bdev_strategy+0x75(ffffff055860cd00)
ldi_strategy+0x59(ffffff04f29a4df8, ffffff055860cd00)
vdev_disk_io_start+0xd0(ffffff055c2379a0)
zio_vdev_io_start+0x17d(ffffff055c2379a0)
zio_execute+0x8d(ffffff055c2379a0)
vdev_queue_io_done+0x92(ffffff055c2fe680)
zio_vdev_io_done+0x62(ffffff055c2fe680)
zio_execute+0x8d(ffffff055c2fe680)
taskq_thread+0x248(ffffff0543a086a0)
thread_start+8()


::msgbuf

panic[cpu4]/thread=ffffff001f7f3c60:
Freeing a free IOMMU page: paddr=0xccca2000


ffffff001f7f2e90 rootnex:iommu_page_free+cb ()
ffffff001f7f2eb0 rootnex:iommu_free_page+15 ()
ffffff001f7f2f10 rootnex:iommu_setup_level_table+a0 ()
ffffff001f7f2f50 rootnex:iommu_setup_page_table+a0 ()
ffffff001f7f2fd0 rootnex:iommu_map_page_range+6a ()
ffffff001f7f3020 rootnex:iommu_map_dvma+50 ()
ffffff001f7f30e0 rootnex:intel_iommu_map_sgl+22f ()
ffffff001f7f3180 rootnex:rootnex_coredma_bindhdl+11e ()
ffffff001f7f31c0 rootnex:rootnex_dma_bindhdl+36 ()
ffffff001f7f3260 genunix:ddi_dma_buf_bind_handle+117 ()
ffffff001f7f32c0 scsi:scsi_dma_buf_bind_attr+48 ()
ffffff001f7f3350 scsi:scsi_init_cache_pkt+2d0 ()
ffffff001f7f33d0 scsi:scsi_init_pkt+5c ()
ffffff001f7f3480 scsi_vhci:vhci_bind_transport+54d ()
ffffff001f7f3500 scsi_vhci:vhci_scsi_init_pkt+160 ()
ffffff001f7f3580 scsi:scsi_init_pkt+5c ()
ffffff001f7f3660 sd:sd_setup_rw_pkt+12a ()
ffffff001f7f36d0 sd:sd_initpkt_for_buf+ad ()
ffffff001f7f3740 sd:sd_start_cmds+197 ()
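
To make the path through the kernel modules easier to see, here is a small Python sketch that reduces ::msgbuf frames like the ones above to the sequence of modules they pass through. It assumes the "address module:function+offset ()" layout shown above. The name modules_in_order is hypothetical.

```python
import re

# One ::msgbuf frame: 16 hex digits, then module:function+offset and "()".
FRAME_RE = re.compile(
    r"^[0-9a-f]{16}\s+(?P<module>\w+):(?P<func>\w+)\+(?P<off>[0-9a-f]+)\s+\(\)$"
)


def modules_in_order(frames):
    """Return module names in stack order, collapsing consecutive repeats."""
    order = []
    for line in frames:
        m = FRAME_RE.match(line.strip())
        if m and (not order or order[-1] != m.group("module")):
            order.append(m.group("module"))
    return order
```

Lines that do not look like a frame are skipped, so the whole ::msgbuf output can be passed in as is.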


::panicinfo

 cpu                4
          thread ffffff001f7f3c60
         message Freeing a free IOMMU page: paddr=0xccca2000
             rdi fffffffff78ede80
             rsi ffffff001f7f2e10
             rdx         ccca2000
             rcx                1
              r8 ffffff001f7f2d60
              r9 ffffff001f7f2e60
             rax                0
             rbx                3
             rbp ffffff001f7f2e50
             r10 ffffff0561edd000
             r11 ffffff0000003000
             r12 fffffffff78ede80
             r13 ffffff04e3da5000
             r14                0
             r15         ccca2000
          fsbase                0
          gsbase ffffff04f32e0000
              ds               4b
              es               4b
              fs                0
              gs              1c3
          trapno                0
             err                0
             rip fffffffffb862550
              cs               30
          rflags              246
             rsp ffffff001f7f2d58
              ss               38
          gdt_hi                0
          gdt_lo         b00001ef
          idt_hi                0
          idt_lo         20000fff
             ldt                0
            task               70
             cr0         8005003b
             cr2         fe6e971b
             cr3          4000000
             cr4              6f8


::cpuinfo -v

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  0 fffffffffbc2f9e0  1f    1    0  -1   no    no t-0    ffffff001e805c60 (idle)
                       |    |
            RUNNING <--+    +-->  PRI THREAD           PROC
              READY                60 ffffff00202a2c60 sched
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  1 ffffff04f32e8040  1f    0    0  99   no    no t-0    ffffff001fbadc60 zpool-TEST
                       |
            RUNNING <--+
              READY
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  2 ffffff04f32e6b00  1f    0    0  99   no    no t-0    ffffff001fbc5c60 zpool-TEST
                       |
            RUNNING <--+
              READY
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  3 ffffff04f32e1500  1f    1    0  -1   no    no t-0    ffffff001f0e3c60 (idle)
                       |    |
            RUNNING <--+    +-->  PRI THREAD           PROC
              READY                60 ffffff001e985c60 sched
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  4 fffffffffbc3a000  1b    0    0  99   no    no t-0    ffffff001f7f3c60 zpool-TEST
                       |
            RUNNING <--+
              READY
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  5 ffffff04f32dcac0  1f    0    0  99   no    no t-0    ffffff001f7d5c60 zpool-TEST
                       |
            RUNNING <--+
              READY
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  6 ffffff04f3897b00  1f    0    0 104   no    no t-0    ffffff001f413c60 sched
                       |         |
            RUNNING <--+         +--> PIL THREAD
              READY                     5 ffffff001f413c60
           QUIESCED                     - ffffff001ff99c60 sched
             EXISTS

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  7 ffffff04f3894500  1f    0    0  99   no    no t-0    ffffff001f7e1c60 zpool-TEST
                       |
            RUNNING <--+
              READY
           QUIESCED
             EXISTS
             ENABLE