Forwarding for a colleague,

Patrick

-------- Original Message --------
Subject: Crash dump on HP Proliant G6 broken as of V8.0
Date: Wed, 15 Sep 2010 11:53:16 -0700
From: Paul Heyman <phey...@adaranet.com>
To: freebsd-hackers@freebsd.org <freebsd-hackers@freebsd.org>
CC: Patrick Mahan <pma...@adaranet.com>
References: <32ab5c9615cc494997d9abb1db12783c024c8c5...@sj-exch-1.adaranet.com>,<32ab5c9615cc494997d9abb1db12783c024c8de...@sj-exch-1.adaranet.com>,<32ab5c9615cc494997d9abb1db12783c024c8c5...@sj-exch-1.adaranet.com>

ALL,

The crash dump worked fine in V7.3.

I am debugging a crash dump problem on an HP ProLiant G6,
which uses a SATA drive connected to a CISS RAID controller.

I have tried this on an x86 box using a non-RAID ATA/SATA disk controller
and it works well.

I noticed that V8.0 adds a new transport method to the CISS driver. In V7.3
there was only CISS_TRANSPORT_METHOD_SIMPLE, but V8.0 adds
CISS_TRANSPORT_METHOD_PERF. The two methods use different function calls in
ciss_poll_request.

The dump command starts with a call to dadump.
This function sets up a struct ccb_scsiio by calling scsi_read_write.
The meat of the dump then happens when it calls xpt_polled_action, which
manages the request and simulates interrupt functionality by polling.
Normal, interrupt-driven disk operations work fine; the failure occurs only
during a crash dump.

I have turned debug on for CISS and CAMDEBUG to debug this problem.

In xpt_polled_action (cam_xpt.c) we get past the first polling loop at line
3013, as both devq->send_openings and dev->ccbq.dev_openings are > 0
(256 and 254).

But we do get stuck in the second one at line 3025. We eventually time out,
setting start_ccb->ccb_h.status to CAM_CMD_TIMEOUT. The timeout comes from
DA_DEFAULT_TIMEOUT (scsi_da.c), which is set to 60 (seconds) and is passed in
the call to scsi_read_write.
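
To make the second loop easier to reason about, here is a minimal user-space
sketch of a polled wait with a timeout budget. It is not the cam_xpt.c code:
every model_* name, the status enum, and the poll hooks are assumptions made
up for illustration, and the per-iteration delay of the real loop is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in status codes for this model only; the real ones are in cam.h. */
enum model_status { MODEL_REQ_INPROG, MODEL_REQ_CMP, MODEL_CMD_TIMEOUT };

struct model_ccb {
	enum model_status status;
};

/* Hypothetical poll hook: may or may not complete the CCB when called. */
typedef void (*model_poll_fn)(struct model_ccb *ccb, void *arg);

/*
 * Poll until the CCB leaves the in-progress state or the budget runs out.
 * Returns the remaining budget, or 0 after marking the CCB as timed out.
 */
static uint32_t
model_polled_wait(struct model_ccb *ccb, model_poll_fn poll, void *arg,
    uint32_t timeout_ticks)
{
	uint32_t remaining = timeout_ticks;

	while (remaining > 0) {
		poll(ccb, arg);
		if (ccb->status != MODEL_REQ_INPROG)
			return (remaining);
		remaining--;
	}
	ccb->status = MODEL_CMD_TIMEOUT;
	return (0);
}

/* Poll hook that never updates the CCB, as seen during the crash dump. */
static void
model_poll_never(struct model_ccb *ccb, void *arg)
{
	(void)ccb;
	(void)arg;
}

/* Poll hook that completes the CCB on the Nth call, where N is *arg. */
static void
model_poll_after(struct model_ccb *ccb, void *arg)
{
	unsigned *calls_left = arg;

	if (*calls_left > 0 && --(*calls_left) == 0)
		ccb->status = MODEL_REQ_CMP;
}
```

With model_poll_never the wait burns the whole budget and ends in the timeout
status, which matches what we see: the poll routine runs, but the CCB status
that the loop is watching never changes.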

Here is the debug trace:

Dumping 1240 MB:
ciss_cam_action_io: XPT_SCSI_IO 0:0:0
ciss_get_request: called
ciss_start: post command 150 tag 600
ciss_map_request: called
ciss_request_map_helper: called
ciss_cam_poll: called
ciss_perf_done: completed command 150
ciss_perf_done: completed command 150

ciss_complete: called
ciss_unmap_request: called
ciss_cam_complete: called
_ciss_report_request: called
ciss_cam_complete: SCSI_STATUS_OK
ciss_release_request: called
ciss_complete: called
ciss_unmap_request: called
ciss0: WARNING: completing non-busy request
ciss_cam_complete: called
_ciss_report_request: called
ciss_cam_complete: SCSI_STATUS_OK
 .
 .
 .
 .
after about 60 seconds
ciss0: WARNING: completing non-busy request
ciss0: WARNING: completed command with no submitter
ciss_unmap_request: called
.
.
.
This goes on forever
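
One possible reading of the trace, and only a guess on my part, is that the
same command (150) is handed to the completion path twice. The sketch below
models that with a busy flag. The model_* names and the flag value are
hypothetical and are not taken from ciss.c.

```c
#include <stdio.h>

#define MODEL_REQ_BUSY 0x1	/* hypothetical flag, for this sketch only */

struct model_request {
	int tag;
	unsigned flags;
	int notified;	/* how many times the submitter was notified */
	int warnings;	/* how many non-busy completions were seen */
};

/*
 * Complete a request. A second completion of the same request finds the
 * busy flag already cleared, warns, and still notifies the submitter,
 * mirroring the order of messages in the trace.
 */
static void
model_complete(struct model_request *rq)
{
	if (!(rq->flags & MODEL_REQ_BUSY)) {
		rq->warnings++;
		printf("WARNING: completing non-busy request %d\n", rq->tag);
	}
	rq->flags &= ~MODEL_REQ_BUSY;
	rq->notified++;
}
```

Calling model_complete twice on one busy request gives one warning and two
notifications, the same pattern as the two ciss_cam_complete calls above.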

Thanks
Paul


Paul Heyman
phey...@adaranetworks.com