I am getting a panic in sd.c:sd_retry_command because sd.c is not 
setting xb_pktp in the sd_xbuf reached through pkt->pkt_private. Since 
xb_pktp is private to sd.c, I am confident that my HBA driver is not 
breaking it. Just in case, I added instrumentation in tran_start and 
found that xb_pktp sometimes comes down as NULL.

The panic occurs while I am testing my driver's reset path by forcing a 
timeout. This causes the command to be completed with pkt->pkt_reason == 
CMD_TIMEOUT. sd.c:sd_retry_command then retries the command, and that is 
where the panic occurs. I found xb_pktp == NULL by examining the dump 
with mdb. Full analysis at the end...

I am also using the newer tran_setup_pkt(.. ) interface instead of the 
tran_init_pkt(.. ) and friends interface that many of the other SCSI HBA 
drivers use. I suspect that since this interface is not used as often, 
there may be some lurking bugs.

Anybody run into this already? I'd hate to start debugging sd.c when 
someone else already has a clue what's going on.

Thanks, carlos


I have a variable called $MDB that automatically runs mdb on the last 
dump in /var/crashes.

Here is my stack trace:

ffffff0007ad1aa0 sdintr+0x2b1(ffffff01ded15a80)
ffffff0007ad1b00 tw_abort_requests+0x260()
ffffff0007ad1b20 tw_initiate_reset+0x47()
ffffff0007ad1b80 tw_timeout+0x26b()
ffffff0007ad1bd0 callout_execute+0xb1(ffffff01cb3d3000)

sdintr is called to complete the commands. Its argument is a struct scsi_pkt *:

# echo "0xffffff01ded15a80::print struct scsi_pkt" | $MDB
{
    pkt_ha_private = 0xffffff01ded15b50
    pkt_address = {
        a_hba_tran = 0xffffff01ddc75380
        a_target = 0
        a_lun = 0
        a_sublun = 0
    }
    pkt_private = 0xffffff01df03de88    < buf pointer.
    pkt_comp = sdintr
...

Given pkt_private above, we can find the buf:

# echo "0xffffff01df03de88::print struct buf" | $MDB
{
    b_flags = 0x502
    b_forw = 0xffffff01df03de00
    b_back = 0
    av_forw = 0
    av_back = 0
    b_dev = 0xffff
    b_bcount = 0x2000
    b_un = {
        b_addr = 0x8075cb0
        b_fs = 0x8075cb0
        b_cg = 0x8075cb0
        b_dino = 0x8075cb0
        b_daddr = 0x8075cb0
    }
    _b_blkno = {
        _f = 0x5460220
        _p = {
            _l = 0x5460220
            _u = 0
        }
    }
    b_obs1 = '\0'
    b_resid = 0
    b_start = 0
    b_proc = 0xffffff01de8881c0
    b_pages = 0
    b_obs2 = 0
    b_bufsize = 0
    b_iodone = aio_done
    b_vp = 0
    b_chain = 0
    b_obs3 = 0
    b_error = 0
    b_private = 0xffffff01d6144340   < struct sd_xbuf *
    b_edev = 0x1c00000242
    b_sem = {
        _opaque = [ 0, 0 ]
    }
...

Given b_private above, we can find the sd_xbuf:

# echo "0xffffff01d6144340::print struct sd_xbuf" | $MDB
{
    xb_un = 0xffffff01cccd2200
    xb_pktp = 0                       < Oooops...
    xb_pktinfo = 0
    xb_private = 0xffffff01d6144340
    xb_blkno = 0x54640e1
    xb_chain_iostart = 0x3
    xb_chain_iodone = 0x4
    xb_pkt_flags = 0
    xb_dma_resid = 0
    xb_retry_count = 0
    xb_victim_retry_count = 0
    xb_ua_retry_count = 0
...

As you can see, xb_pktp above is NULL.
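
For anyone following the chain without a dump at hand, here is a minimal user-space sketch of the three hops walked above (scsi_pkt -> buf -> sd_xbuf -> xb_pktp). The structs are cut-down stand-ins that model only the fields involved, not the real kernel definitions, and check_xb_pktp is a hypothetical helper name, not something that exists in sd.c or the SCSA headers.

```c
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel structures. Only the fields
 * used in the walk are modeled; the real definitions are much larger.
 */
struct scsi_pkt;

struct sd_xbuf {
	struct scsi_pkt *xb_pktp;	/* the pointer found NULL in the dump */
};

struct buf {
	void *b_private;		/* holds the struct sd_xbuf * */
};

struct scsi_pkt {
	void *pkt_private;		/* holds the struct buf * */
};

/*
 * Hypothetical helper: walk pkt -> buf -> sd_xbuf and report which
 * hop is broken. Returns 0 if xb_pktp is set, otherwise the number
 * of the first hop (1, 2 or 3) that turned out to be NULL.
 */
static int
check_xb_pktp(const struct scsi_pkt *pkt)
{
	const struct buf *bp;
	const struct sd_xbuf *xp;

	if (pkt == NULL || (bp = pkt->pkt_private) == NULL)
		return (1);	/* no buf behind pkt_private */
	if ((xp = bp->b_private) == NULL)
		return (2);	/* no sd_xbuf behind b_private */
	if (xp->xb_pktp == NULL)
		return (3);	/* the case seen in this dump */
	return (0);
}
```

A check along these lines in tran_start, logging the result with cmn_err, is roughly what the instrumentation mentioned earlier amounts to. For the dump above it would return 3.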

_______________________________________________
driver-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/driver-discuss