Re: [PATCH scsi-misc-2.6 02/07] scsi: make scsi_send_eh_cmnd use its own timer instead of scmd-eh_timeout
02_scsi_timer_eh_timer_fix.patch scmd-eh_timeout is used to resolve the race between command completion and timeout. However, during error handling, scsi_send_eh_cmnd uses scmd-eh_timeout. This creates a race condition between eh and normal completion for a request which has timed out and in the process of error handling. If the request completes while scmd-eh_timeout is being used by eh, eh timeout is lost and the command will be handled by both eh and completion path. This patch fixes the race by making scsi_send_eh_cmnd() use its own timer. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi_error.c | 64 ++- scsi_priv.h |1 2 files changed, 20 insertions(+), 45 deletions(-) Index: scsi-reqfn-export/drivers/scsi/scsi_error.c === --- scsi-reqfn-export.orig/drivers/scsi/scsi_error.c2005-04-11 03:42:11.0 +0900 +++ scsi-reqfn-export/drivers/scsi/scsi_error.c 2005-04-11 03:42:11.0 +0900 @@ -420,46 +420,12 @@ static int scsi_eh_completed_normally(st } /** - * scsi_eh_times_out - timeout function for error handling. - * @scmd: Cmd that is timing out. - * - * Notes: - *During error handling, the kernel thread will be sleeping waiting - *for some action to complete on the device. our only job is to - *record that it timed out, and to wake up the thread. - **/ -static void scsi_eh_times_out(struct scsi_cmnd *scmd) -{ - scsi_eh_eflags_set(scmd, SCSI_EH_REC_TIMEOUT); - SCSI_LOG_ERROR_RECOVERY(3, printk(%s: scmd:%p\n, __FUNCTION__, - scmd)); - - if (scmd-device-host-eh_action) - up(scmd-device-host-eh_action); -} - -/** * scsi_eh_done - Completion function for error handling. * @scmd: Cmd that is done. **/ static void scsi_eh_done(struct scsi_cmnd *scmd) { - /* -* if the timeout handler is already running, then just set the -* flag which says we finished late, and return. we have no -* way of stopping the timeout handler from running, so we must -* always defer to it. -*/ - if (del_timer(scmd-eh_timeout)) { - scmd-request-rq_status = RQ_SCSI_DONE; - scmd-owner = SCSI_OWNER_ERROR_HANDLER; - - SCSI_LOG_ERROR_RECOVERY(3, printk(%s scmd: %p result: %x\n, - __FUNCTION__, scmd, scmd-result)); - - if (scmd-device-host-eh_action) - up(scmd-device-host-eh_action); - } + up(scmd-device-host-eh_action); } /** @@ -478,6 +444,7 @@ static int scsi_send_eh_cmnd(struct scsi { struct scsi_device *sdev = scmd-device; struct Scsi_Host *shost = sdev-host; + struct timer_list timer; DECLARE_MUTEX_LOCKED(sem); unsigned long flags; int rtn = SUCCESS; @@ -492,7 +459,11 @@ static int scsi_send_eh_cmnd(struct scsi scmd-cmnd[1] = (scmd-cmnd[1] 0x1f) | (sdev-lun 5 0xe0); - scsi_add_timer(scmd, timeout, scsi_eh_times_out); + init_timer(timer); + timer.expires = jiffies + timeout; + timer.function = (void (*)(unsigned long))scsi_eh_done; + timer.data = (unsigned long)scmd; + add_timer(timer); /* * set up the semaphore so we wait for the command to complete. @@ -508,14 +479,19 @@ static int scsi_send_eh_cmnd(struct scsi down(sem); scsi_log_completion(scmd, SUCCESS); - shost-eh_action = NULL; - - /* -* see if timeout. if so, tell the host to forget about it. -* in other words, we don't want a callback any more. -*/ - if (scsi_eh_eflags_chk(scmd, SCSI_EH_REC_TIMEOUT)) { - scsi_eh_eflags_clr(scmd, SCSI_EH_REC_TIMEOUT); + if (del_timer(timer)) { + SCSI_LOG_ERROR_RECOVERY(3, + printk(scsi_eh_done scmd: %p result: %x\n, + scmd, scmd-result)); + scmd-request-rq_status = RQ_SCSI_DONE; + scmd-owner = SCSI_OWNER_ERROR_HANDLER; + } else { + /* +* Timed out. Tell the host to forget about it. In +* other words, we don't want a callback any more. +*/ + SCSI_LOG_ERROR_RECOVERY(3, + printk(scsi_eh_times_out scmd: %p\n, scmd)); scmd-owner = SCSI_OWNER_LOWLEVEL; /* Index: scsi-reqfn-export/drivers/scsi/scsi_priv.h === --- scsi-reqfn-export.orig/drivers/scsi/scsi_priv.h 2005-04-11 03:42:10.0 +0900 +++ scsi-reqfn-export/drivers/scsi/scsi_priv.h 2005-04-11 03:42:11.0 +0900 @@ -42,7 +42,6 @@ struct Scsi_Host
Re: [PATCH scsi-misc-2.6 05/07] scsi: unexport scsi_{add|delete}_timer()
05_scsi_timer_unexport_timer_functions.patch SCSI cmd timer has specific synchronization/semantic requirements and shouldn't be directly used outside SCSI midlayer. With aic7xxx driver updated, there's no user left. This patch unexports scsi_{add|delete}_timer() routines and also removes @complete argument from scsi_add_timer(). The change makes the use of scsi_times_out() confined in scsi_error.c, so move it upward such that no prototype is needed and make it static. Signed-off-by: Tejun Heo [EMAIL PROTECTED] drivers/scsi/scsi.c |2 - drivers/scsi/scsi_error.c | 80 +- drivers/scsi/scsi_priv.h |3 + include/scsi/scsi_eh.h|3 - 4 files changed, 41 insertions(+), 47 deletions(-) Index: scsi-reqfn-export/drivers/scsi/scsi.c === --- scsi-reqfn-export.orig/drivers/scsi/scsi.c 2005-04-11 03:42:12.0 +0900 +++ scsi-reqfn-export/drivers/scsi/scsi.c 2005-04-11 03:42:12.0 +0900 @@ -600,7 +600,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd * * AK: unlikely race here: for some reason the timer could * expire before the serial number is set up below. */ - scsi_add_timer(cmd, cmd-timeout_per_command, scsi_times_out); + scsi_add_timer(cmd, cmd-timeout_per_command); scsi_log_send(cmd); Index: scsi-reqfn-export/drivers/scsi/scsi_error.c === --- scsi-reqfn-export.orig/drivers/scsi/scsi_error.c2005-04-11 03:42:12.0 +0900 +++ scsi-reqfn-export/drivers/scsi/scsi_error.c 2005-04-11 03:42:12.0 +0900 @@ -88,6 +88,42 @@ int scsi_eh_scmd_add(struct scsi_cmnd *s } /** + * scsi_times_out - Timeout function for normal scsi commands. + * @scmd: Cmd that is timing out. + * + * Notes: + * We do not need to lock this. There is the potential for a race + * only in that the normal completion handling might run, but if the + * normal completion function determines that the timer has already + * fired, then it mustn't do anything. + **/ +static void scsi_times_out(struct scsi_cmnd *scmd) +{ + scsi_log_completion(scmd, TIMEOUT_ERROR); + + if (scmd-device-host-hostt-eh_timed_out) + switch (scmd-device-host-hostt-eh_timed_out(scmd)) { + case EH_HANDLED: + __scsi_done(scmd); + return; + case EH_RESET_TIMER: + /* This allows a single retry even of a command +* with allowed == 0 */ + if (scmd-retries++ scmd-allowed) + break; + scsi_add_timer(scmd, scmd-timeout_per_command); + return; + case EH_NOT_HANDLED: + break; + } + + if (unlikely(!scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD))) { + panic(Error handler thread not present at %p %p %s %d, + scmd, scmd-device-host, __FILE__, __LINE__); + } +} + +/** * scsi_add_timer - Start timeout timer for a single scsi command. * @scmd: scsi command that is about to start running. * @timeout: amount of time to allow this command to run. @@ -98,8 +134,7 @@ int scsi_eh_scmd_add(struct scsi_cmnd *s *has its own timer, and as it is added to the queue, we set up the *timer. When the command completes, we cancel the timer. **/ -void scsi_add_timer(struct scsi_cmnd *scmd, int timeout, - void (*complete)(struct scsi_cmnd *)) +void scsi_add_timer(struct scsi_cmnd *scmd, int timeout) { /* @@ -112,7 +147,7 @@ void scsi_add_timer(struct scsi_cmnd *sc scmd-eh_timeout.data = (unsigned long)scmd; scmd-eh_timeout.expires = jiffies + timeout; - scmd-eh_timeout.function = (void (*)(unsigned long)) complete; + scmd-eh_timeout.function = (void (*)(unsigned long))scsi_times_out; SCSI_LOG_ERROR_RECOVERY(5, printk(%s: scmd: %p, time: %d, (%p)\n, __FUNCTION__, @@ -120,7 +155,6 @@ void scsi_add_timer(struct scsi_cmnd *sc add_timer(scmd-eh_timeout); } -EXPORT_SYMBOL(scsi_add_timer); /** * scsi_delete_timer - Delete/cancel timer for a given function. @@ -148,44 +182,6 @@ int scsi_delete_timer(struct scsi_cmnd * return rtn; } -EXPORT_SYMBOL(scsi_delete_timer); - -/** - * scsi_times_out - Timeout function for normal scsi commands. - * @scmd: Cmd that is timing out. - * - * Notes: - * We do not need to lock this. There is the potential for a race - * only in that the normal completion handling might run, but if the - * normal completion function determines that the timer has already - * fired, then it mustn't do anything. - **/ -void
Re: [PATCH scsi-misc-2.6 01/04] scsi: replace REQ_SPECIAL with REQ_SOFTBARRIER in scsi_init_io()
01_scsi_REQ_SPECIAL_semantic_scsi_init_io.patch scsi_init_io() used to set REQ_SPECIAL when it fails sg allocation before requeueing the request by returning BLKPREP_DEFER. REQ_SPECIAL is being updated to mean special requests and we need to set REQ_SOFTBARRIER for half-prepp'ed requests. So, replace REQ_SPECIAL with REQ_SOFTBARRIER. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi_lib.c |2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: scsi-reqfn-export/drivers/scsi/scsi_lib.c === --- scsi-reqfn-export.orig/drivers/scsi/scsi_lib.c 2005-04-11 12:27:07.0 +0900 +++ scsi-reqfn-export/drivers/scsi/scsi_lib.c 2005-04-11 12:27:07.0 +0900 @@ -936,7 +936,7 @@ static int scsi_init_io(struct scsi_cmnd */ sgpnt = scsi_alloc_sgtable(cmd, GFP_ATOMIC); if (unlikely(!sgpnt)) { - req-flags |= REQ_SPECIAL; + req-flags |= REQ_SOFTBARRIER; return BLKPREP_DEFER; } - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 03/04] scsi: make scsi_requeue_request() use blk_requeue_request()
03_scsi_REQ_SPECIAL_semantic_scsi_requeue_command.patch scsi_requeue_request() used to use blk_insert_request() for requeueing requests. This behavior depends on the unobvious behavior of blk_insert_request() setting REQ_SPECIAL and REQ_SOFTBARRIER when requeueing. This patch makes scsi_requeue_request() use blk_requeue_request() and explicitly set REQ_SOFTBARRIER. As REQ_SPECIAL now means special requests, the flag is not set on requeue. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi_lib.c |8 +++- 1 files changed, 7 insertions(+), 1 deletion(-) Index: scsi-reqfn-export/drivers/scsi/scsi_lib.c === --- scsi-reqfn-export.orig/drivers/scsi/scsi_lib.c 2005-04-11 12:27:07.0 +0900 +++ scsi-reqfn-export/drivers/scsi/scsi_lib.c 2005-04-11 12:27:08.0 +0900 @@ -483,8 +483,14 @@ static void scsi_run_queue(struct reques */ static void scsi_requeue_command(struct request_queue *q, struct scsi_cmnd *cmd) { + unsigned long flags; + cmd-request-flags = ~REQ_DONTPREP; - blk_insert_request(q, cmd-request, 1, cmd, 1); + cmd-request-flags |= REQ_SOFTBARRIER; + + spin_lock_irqsave(q-queue_lock, flags); + blk_requeue_request(q, cmd-request); + spin_unlock_irqrestore(q-queue_lock, flags); scsi_run_queue(q); } - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 04/04] scsi: remove requeue feature from blk_insert_request()
04_scsi_blk_insert_request_no_requeue.patch blk_insert_request() has a unobivous feature of requeuing a request setting REQ_SPECIAL|REQ_SOFTBARRIER. SCSI midlayer was the only user and as previous patches removed the usage, remove the feature from blk_insert_request(). Only special requests should be queued with blk_insert_request(). All requeueing should go through blk_requeue_request(). Signed-off-by: Tejun Heo [EMAIL PROTECTED] drivers/block/ll_rw_blk.c | 20 ++-- drivers/block/paride/pd.c |2 +- drivers/block/sx8.c |4 ++-- drivers/scsi/scsi_lib.c |2 +- include/linux/blkdev.h|2 +- 5 files changed, 11 insertions(+), 19 deletions(-) Index: scsi-reqfn-export/drivers/block/ll_rw_blk.c === --- scsi-reqfn-export.orig/drivers/block/ll_rw_blk.c2005-04-11 12:25:15.0 +0900 +++ scsi-reqfn-export/drivers/block/ll_rw_blk.c 2005-04-11 12:27:08.0 +0900 @@ -2028,7 +2028,6 @@ EXPORT_SYMBOL(blk_requeue_request); * @rq:request to be inserted * @at_head: insert request at head or tail of queue * @data: private data - * @reinsert: true if request it a reinsertion of previously processed one * * Description: *Many block devices need to execute commands asynchronously, so they don't @@ -2043,8 +2042,9 @@ EXPORT_SYMBOL(blk_requeue_request); *host that is unable to accept a particular command. */ void blk_insert_request(request_queue_t *q, struct request *rq, - int at_head, void *data, int reinsert) + int at_head, void *data) { + int where = at_head ? ELEVATOR_INSERT_FRONT : ELEVATOR_INSERT_BACK; unsigned long flags; /* @@ -2061,20 +2061,12 @@ void blk_insert_request(request_queue_t /* * If command is tagged, release the tag */ - if (reinsert) - blk_requeue_request(q, rq); - else { - int where = ELEVATOR_INSERT_BACK; + if (blk_rq_tagged(rq)) + blk_queue_end_tag(q, rq); - if (at_head) - where = ELEVATOR_INSERT_FRONT; + drive_stat_acct(rq, rq-nr_sectors, 1); + __elv_add_request(q, rq, where, 0); - if (blk_rq_tagged(rq)) - blk_queue_end_tag(q, rq); - - drive_stat_acct(rq, rq-nr_sectors, 1); - __elv_add_request(q, rq, where, 0); - } if (blk_queue_plugged(q)) __generic_unplug_device(q); else Index: scsi-reqfn-export/drivers/block/paride/pd.c === --- scsi-reqfn-export.orig/drivers/block/paride/pd.c2005-04-11 12:25:15.0 +0900 +++ scsi-reqfn-export/drivers/block/paride/pd.c 2005-04-11 12:27:08.0 +0900 @@ -723,7 +723,7 @@ static int pd_special_command(struct pd_ rq.ref_count = 1; rq.waiting = wait; rq.end_io = blk_end_sync_rq; - blk_insert_request(disk-gd-queue, rq, 0, func, 0); + blk_insert_request(disk-gd-queue, rq, 0, func); wait_for_completion(wait); rq.waiting = NULL; if (rq.errors) Index: scsi-reqfn-export/drivers/block/sx8.c === --- scsi-reqfn-export.orig/drivers/block/sx8.c 2005-04-11 12:25:15.0 +0900 +++ scsi-reqfn-export/drivers/block/sx8.c 2005-04-11 12:27:08.0 +0900 @@ -614,7 +614,7 @@ static int carm_array_info (struct carm_ spin_unlock_irq(host-lock); DPRINTK(blk_insert_request, tag == %u\n, idx); - blk_insert_request(host-oob_q, crq-rq, 1, crq, 0); + blk_insert_request(host-oob_q, crq-rq, 1, crq); return 0; @@ -653,7 +653,7 @@ static int carm_send_special (struct car crq-msg_bucket = (u32) rc; DPRINTK(blk_insert_request, tag == %u\n, idx); - blk_insert_request(host-oob_q, crq-rq, 1, crq, 0); + blk_insert_request(host-oob_q, crq-rq, 1, crq); return 0; } Index: scsi-reqfn-export/drivers/scsi/scsi_lib.c === --- scsi-reqfn-export.orig/drivers/scsi/scsi_lib.c 2005-04-11 12:27:08.0 +0900 +++ scsi-reqfn-export/drivers/scsi/scsi_lib.c 2005-04-11 12:27:08.0 +0900 @@ -92,7 +92,7 @@ int scsi_insert_special_req(struct scsi_ */ sreq-sr_request-flags = ~REQ_DONTPREP; blk_insert_request(sreq-sr_device-request_queue, sreq-sr_request, - at_head, sreq, 0); + at_head, sreq); return 0; } Index: scsi-reqfn-export/include/linux/blkdev.h === --- scsi-reqfn-export.orig/include/linux/blkdev.h 2005-04-11 12:25:15.0 +0900 +++ scsi-reqfn-export
Re: [PATCH scsi-misc-2.6 02/04] scsi: make scsi_queue_insert() use blk_requeue_request()
02_scsi_REQ_SPECIAL_semantic_scsi_queue_insert.patch scsi_queue_insert() used to use blk_insert_request() for requeueing requests. This behavior depends on the unobvious behavior of blk_insert_request() setting REQ_SPECIAL and REQ_SOFTBARRIER when requeueing. This patch makes scsi_queue_insert() use blk_requeue_request() and explicitly set REQ_SOFTBARRIER. As REQ_SPECIAL now means special requests, the flag is not set on requeue. Note that scsi_queue_insert() now calls scsi_run_queue() itself, and the prototype of the function is added right above scsi_queue_insert(). This is temporary, as later requeue path consolidation patchset removes scsi_queue_insert(). By adding temporary prototype, we can do away with unnecessary changes. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi_lib.c | 24 ++-- 1 files changed, 14 insertions(+), 10 deletions(-) Index: scsi-reqfn-export/drivers/scsi/scsi_lib.c === --- scsi-reqfn-export.orig/drivers/scsi/scsi_lib.c 2005-04-11 12:27:07.0 +0900 +++ scsi-reqfn-export/drivers/scsi/scsi_lib.c 2005-04-11 12:27:07.0 +0900 @@ -96,6 +96,8 @@ int scsi_insert_special_req(struct scsi_ return 0; } +static void scsi_run_queue(struct request_queue *q); + /* * Function:scsi_queue_insert() * @@ -119,6 +121,8 @@ int scsi_queue_insert(struct scsi_cmnd * { struct Scsi_Host *host = cmd-device-host; struct scsi_device *device = cmd-device; + struct request_queue *q = device-request_queue; + unsigned long flags; SCSI_LOG_MLQUEUE(1, printk(Inserting command %p into mlqueue\n, cmd)); @@ -154,17 +158,17 @@ int scsi_queue_insert(struct scsi_cmnd * scsi_device_unbusy(device); /* -* Insert this command at the head of the queue for it's device. -* It will go before all other commands that are already in the queue. -* -* NOTE: there is magic here about the way the queue is plugged if -* we have no outstanding commands. -* -* Although this *doesn't* plug the queue, it does call the request -* function. The SCSI request function detects the blocked condition -* and plugs the queue appropriately. +* Requeue the command. Turn on REQ_SOFTBARRIER to prevent +* other requests from passing this request. */ - blk_insert_request(device-request_queue, cmd-request, 1, cmd, 1); + cmd-request-flags |= REQ_SOFTBARRIER; + + spin_lock_irqsave(q-queue_lock, flags); + blk_requeue_request(q, cmd-request); + spin_unlock_irqrestore(q-queue_lock, flags); + + scsi_run_queue(q); + return 0; } - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH linux-misc-2.6] scsi: remove volatile from scsi data structures
Hello, James. Hello, Christoph. This patch removes volatile qualifier from scsi_device-device_busy, Scsi_Host-host_busy and -host_failed as the volatile qualifiers don't serve any purpose now. While at it, convert those fields from unsigned short to unsigned int as suggested by Christoph. Signed-off-by: Tejun Heo [EMAIL PROTECTED] Index: scsi-reqfn-export/include/scsi/scsi_device.h === --- scsi-reqfn-export.orig/include/scsi/scsi_device.h 2005-04-10 13:03:14.0 +0900 +++ scsi-reqfn-export/include/scsi/scsi_device.h2005-04-10 13:04:42.0 +0900 @@ -43,7 +43,8 @@ struct scsi_device { struct list_headsiblings; /* list of all devices on this host */ struct list_headsame_target_siblings; /* just the devices sharing same target id */ - volatile unsigned short device_busy;/* commands actually active on low-level */ + unsigned int device_busy; /* commands actually active on +* low-level. protected by queue_lock. */ spinlock_t sdev_lock; /* also the request queue_lock */ spinlock_t list_lock; struct list_head cmd_list; /* queue of in use SCSI Command structures */ Index: scsi-reqfn-export/include/scsi/scsi_host.h === --- scsi-reqfn-export.orig/include/scsi/scsi_host.h 2005-04-10 13:03:14.0 +0900 +++ scsi-reqfn-export/include/scsi/scsi_host.h 2005-04-10 13:04:42.0 +0900 @@ -448,8 +448,14 @@ struct Scsi_Host { wait_queue_head_t host_wait; struct scsi_host_template *hostt; struct scsi_transport_template *transportt; - volatile unsigned short host_busy; /* commands actually active on low-level */ - volatile unsigned short host_failed; /* commands that failed. */ + + /* +* The following two fields are protected with host_lock; +* however, eh routines can safely access during eh processing +* without acquiring the lock. +*/ + unsigned int host_busy;/* commands actually active on low-level */ + unsigned int host_failed; /* commands that failed. */ unsigned short host_no; /* Used for IOCTL_GET_IDLUN, /proc/scsi et al. */ int resetting; /* if set, it means that last_reset is a valid value */ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
Jens Axboe wrote: On Wed, Apr 06 2005, Arjan van de Ven wrote: @@ -324,6 +334,7 @@ issue_flush_fn *issue_flush_fn; prepare_flush_fn*prepare_flush_fn; end_flush_fn*end_flush_fn; + release_queue_data_fn *release_queue_data_fn; /* * Auto-unplugging state where does this function method actually get called? I missed the hunk in ll_rw_blk.c, rmk pointed the same thing out not 5 minutes ago :-) The patch would not work anyways, as scsi_sysfs.c clears queuedata unconditionally. This is a better work-around, it just makes the queue hold a reference to the device as well only killing it when the queue is torn down. Still not super happy with it, but I don't see how to solve the circular dependency problem otherwise. Hello, Jens. I've been thinking about it for a while. The problem is that we're reference counting two different objects to track lifetime of one entity. This happens in both SCSI upper and mid layers. In the upper layer, genhd and scsi_disk (or scsi_cd, ...) are ref'ed separately while they share their destiny together (not really different entity) and in the middle layer scsi_device and request_queue does the same thing. Circular dependency is occuring because we separate one entity into two and reference counting them separately. Two are actually one and necessarily want each other. (until death aparts. Wow, serious. :-) IMHO, what we need to do is consolidate ref counting such that in each layer only one object is reference counted, and the other object is freed when the ref counted object is released. The object of choice would be genhd in upper layer and request_queue in mid layer. All ref-counting should be updated to only ref those objects. We'll need to add a release callback to genhd and make request_queue properly reference counted. Conceptually, scsi_disk extends genhd and scsi_device extends request_queue. So, to go one step further, as what UL represents is genhd (disk device) and ML request_queue (request-based device), embedding scsi_disk into genhd and scsi_device into request_queue will make the architecture clearer. To do this, we'll need something like alloc_disk_with_udata(int minors, size_t udata_len) and the equivalent for request_queue. I've done this half-way and then doing it without fixing the SCSI model seemed silly so got into working on the state model. (BTW, the state model is almost done, I'm about to run tests.) What do you think? Jens? -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 01/13] scsi: don't use blk_insert_request() for requeueing
Greetings, James. On Fri, Apr 01, 2005 at 12:09:48PM -0600, James Bottomley wrote: On Fri, 2005-04-01 at 14:01 +0900, Tejun Heo wrote: Well, REQ_SPECIAL is the signal to the mid-layer that we've allocated the resources necessary to process the command, so in practice it will be turned on for every requeue request (because we set it when the command is prepared), Sorry, but where? AFAICT, only blk_insert_request() and scsi_init_io() on sgtable allocation failure set REQ_SPECIAL during scsi midlayer processing. This patch replaces blk_insert_request() with blk_requeue_request() and the next patch removes REQ_SPECIAL setting in scsi_init_io(). REQ_SPECIAL is currently overloaded to mean two different things. * The request is a special request. * The request has been requeued using scsi_queue_insert(). i.e. It has been prepped. But its true meaning is defined by the block layer (since it's a block flag). It's supposed to mean that the -special field of the request is in use to carry data meaningful to the underlying driver. SCSI sets it on that basis. So, if I understand correctly, based on the fact that the current block code in fact never bothers with REQ_SPECIAL, but only checks req- special, you're proposing that we need never actually set REQ_SPECIAL when making use of the -special field? Thus you want to use REQ_SPECIAL to distinguish between internally generated commands and external commands? That sounds fine as long as the block API gets updated to reflect this (comments in linux/blkdev.h shoudl be fine). Yeap, That's what I'm proposing. Block API never cares about REQ_SPECIAL flag or -special field except for when determining if requests can be merged - both fields are independently checked in this case. And IDE driver already uses the flag regardless of -special field. Do you want me to clear up the comment? The other reason I don't like this is that we've been trying hard to sweep excess block knowledge out of the mid-layer. I don't think REQ_SOFTBARRIER is anything we really have to know about. We currently requeue using two different block functions. * blk_insert_request(): this function does two things 1. enqueue special requests 2. turn on REQ_SPECIAL|REQ_SOFTBARRIER and call blk_requeue_request(). This is used only by scsi midlayer. * blk_requeue_request() REQ_SOFTBARRIER tells the request queue that other requests shouldn't pass this one. We need this to be set for prepped requests; otherwise, it may, theoretically, deadlock (unprepped request waiting for the prepped request back in the queue). So, the current code * depends on blk_insert_request() sets REQ_SOFTBARRIER when requeueing. It works but nothing in the interface or semantics is clear about it. Why expect a request reinserted using blk_insert_request() gets REQ_SOFTBARRIER turned on while a request reinserted with blk_requeue_request() doesn't? or why are there two different paths doing mostly the same thing with slightly different semantics? * requeueing using blk_requeue_request() in scsi_request_fn() doesn't turn on either REQ_SPECIAL or REQ_SOFTBARRIER. Missing REQ_SPECIAL is okay, as we have the extra path in prep_fn (if it ever gets requeued due to medium failure, so gets re-prepped), but missing REQ_SOFTBARRIER can *theoretically* cause problem. So, it's more likely becoming less dependent on unobvious behavior of block layer. As we need the request to stay on top, we tell the block layer to do so, instead of depending on blk_insert_request() unobviously doing it for us. But that's the point. This is non obvious behaviour in the block layer, so SCSI doesn't want to know about it (and the block maintainer won't want to trawl through the SCSI and other block drivers altering it if it changes). Yes, it's non-obvious behavior of blk_insert_request() when used for requeueing and, SCSI is the *only* user. This patch removes the unobvious path. If the bug is that the block layer isn't setting it on all our requeues where it should, then either we're using the wrong API or the block layer API is buggy. But block drivers in general don't have to inhibit reordering requeued requests. SCSI midlayer caches resources over requeueing, so it's SCSI midlayer's requirement that requeued requests shouldn't be passed. IOW, it's SCSI midlayer's responsibility to tell the block layer not to reschedule the request. REQ_SOFTBARRIER has well-defined meaning to tell just that. It's not a block-layer internal thing. It's a public interface. IOW, this patch uses well-defined flag to tell the block layer what the SCSI midlayer wants. I don't see any problem there. If you're uncomfortable with using REQ_* flag directly, writing a wrapper shouldn't be a problem, but, IMHO, current form seems fine. Thanks. -- tejun
Question about scsi_device_online() usage in mptscsih
Hello, Eric. Hello, James. I've been working on new SCSI state model and was checking on scsi_device_online() users. As the state model is going to change, I need to audit device state usages in lldd's and I'm having difficult time understanding why scsi_device_online() is used in mptscsih. In mptscsih.c, mptscsih_flush_running_cmds() uses scsi_device_online() to determine whether or not dma-unmap data area of active commands. This was added in the changeset 1.1371.776.1 by Eric Moore with the comment MPT Fusion add back FC909 support. Can you please explain me why and how scsi_device_online() condition is used here? Thanks a lot. :-) -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 01/13] scsi: don't use blk_insert_request() for requeueing
01_scsi_no_REQ_SPECIAL_on_requeue.patch blk_insert_request() has 'reinsert' argument, which, when set, turns on REQ_SPECIAL and REQ_SOFTBARRIER and requeues the request. SCSI midlayer was the only user of this feature and all requeued requests become special requests defeating quiesce state. This patch makes scsi midlayer use blk_requeue_request() for requeueing and removes 'reinsert' feature from blk_insert_request(). Note: In drivers/scsi/scsi_lib.c, scsi_single_lun_run() and scsi_run_queue() are moved upward unchanged. Signed-off-by: Tejun Heo [EMAIL PROTECTED] drivers/block/ll_rw_blk.c | 20 +-- drivers/block/paride/pd.c |2 drivers/block/sx8.c |4 drivers/scsi/scsi_lib.c | 238 +++--- include/linux/blkdev.h|2 5 files changed, 133 insertions(+), 133 deletions(-) Index: scsi-export/drivers/block/ll_rw_blk.c === --- scsi-export.orig/drivers/block/ll_rw_blk.c 2005-03-31 17:42:05.0 +0900 +++ scsi-export/drivers/block/ll_rw_blk.c 2005-03-31 18:06:19.0 +0900 @@ -2028,7 +2028,6 @@ EXPORT_SYMBOL(blk_requeue_request); * @rq:request to be inserted * @at_head: insert request at head or tail of queue * @data: private data - * @reinsert: true if request it a reinsertion of previously processed one * * Description: *Many block devices need to execute commands asynchronously, so they don't @@ -2043,8 +2042,9 @@ EXPORT_SYMBOL(blk_requeue_request); *host that is unable to accept a particular command. */ void blk_insert_request(request_queue_t *q, struct request *rq, - int at_head, void *data, int reinsert) + int at_head, void *data) { + int where = at_head ? ELEVATOR_INSERT_FRONT : ELEVATOR_INSERT_BACK; unsigned long flags; /* @@ -2061,20 +2061,12 @@ void blk_insert_request(request_queue_t /* * If command is tagged, release the tag */ - if (reinsert) - blk_requeue_request(q, rq); - else { - int where = ELEVATOR_INSERT_BACK; + if (blk_rq_tagged(rq)) + blk_queue_end_tag(q, rq); - if (at_head) - where = ELEVATOR_INSERT_FRONT; + drive_stat_acct(rq, rq-nr_sectors, 1); + __elv_add_request(q, rq, where, 0); - if (blk_rq_tagged(rq)) - blk_queue_end_tag(q, rq); - - drive_stat_acct(rq, rq-nr_sectors, 1); - __elv_add_request(q, rq, where, 0); - } if (blk_queue_plugged(q)) __generic_unplug_device(q); else Index: scsi-export/drivers/block/paride/pd.c === --- scsi-export.orig/drivers/block/paride/pd.c 2005-03-31 17:42:05.0 +0900 +++ scsi-export/drivers/block/paride/pd.c 2005-03-31 18:06:19.0 +0900 @@ -723,7 +723,7 @@ static int pd_special_command(struct pd_ rq.ref_count = 1; rq.waiting = wait; rq.end_io = blk_end_sync_rq; - blk_insert_request(disk-gd-queue, rq, 0, func, 0); + blk_insert_request(disk-gd-queue, rq, 0, func); wait_for_completion(wait); rq.waiting = NULL; if (rq.errors) Index: scsi-export/drivers/block/sx8.c === --- scsi-export.orig/drivers/block/sx8.c2005-03-31 17:42:05.0 +0900 +++ scsi-export/drivers/block/sx8.c 2005-03-31 18:06:19.0 +0900 @@ -614,7 +614,7 @@ static int carm_array_info (struct carm_ spin_unlock_irq(host-lock); DPRINTK(blk_insert_request, tag == %u\n, idx); - blk_insert_request(host-oob_q, crq-rq, 1, crq, 0); + blk_insert_request(host-oob_q, crq-rq, 1, crq); return 0; @@ -653,7 +653,7 @@ static int carm_send_special (struct car crq-msg_bucket = (u32) rc; DPRINTK(blk_insert_request, tag == %u\n, idx); - blk_insert_request(host-oob_q, crq-rq, 1, crq, 0); + blk_insert_request(host-oob_q, crq-rq, 1, crq); return 0; } Index: scsi-export/drivers/scsi/scsi_lib.c === --- scsi-export.orig/drivers/scsi/scsi_lib.c2005-03-31 17:42:05.0 +0900 +++ scsi-export/drivers/scsi/scsi_lib.c 2005-03-31 18:06:19.0 +0900 @@ -65,6 +65,109 @@ struct scsi_host_sg_pool scsi_sg_pools[] /* + * Called for single_lun devices on IO completion. Clear starget_sdev_user, + * and call blk_run_queue for all the scsi_devices on the target - + * including current_sdev first. + * + * Called with *no* scsi locks held. + */ +static void scsi_single_lun_run(struct scsi_device *current_sdev) +{ + struct Scsi_Host *shost = current_sdev-host
Re: [PATCH scsi-misc-2.6 04/13] scsi: remove meaningless volatile qualifiers from structure definitions
04_scsi_remove_volatile.patch scsi_device-device_busy, Scsi_Host-host_busy and -host_failed have volatile qualifiers, but the qualifiers don't serve any purpose. Kill them. While at it, protect -host_failed update in scsi_error for consistency and clarity. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi_device.h |3 ++- scsi_host.h | 10 -- 2 files changed, 10 insertions(+), 3 deletions(-) Index: scsi-export/include/scsi/scsi_device.h === --- scsi-export.orig/include/scsi/scsi_device.h 2005-03-31 17:42:05.0 +0900 +++ scsi-export/include/scsi/scsi_device.h 2005-03-31 18:06:20.0 +0900 @@ -43,7 +43,8 @@ struct scsi_device { struct list_headsiblings; /* list of all devices on this host */ struct list_headsame_target_siblings; /* just the devices sharing same target id */ - volatile unsigned short device_busy;/* commands actually active on low-level */ + unsigned short device_busy; /* commands actually active on +* low-level. protected by sdev_lock. */ spinlock_t sdev_lock; /* also the request queue_lock */ spinlock_t list_lock; struct list_head cmd_list; /* queue of in use SCSI Command structures */ Index: scsi-export/include/scsi/scsi_host.h === --- scsi-export.orig/include/scsi/scsi_host.h 2005-03-31 17:42:05.0 +0900 +++ scsi-export/include/scsi/scsi_host.h2005-03-31 18:06:20.0 +0900 @@ -448,8 +448,14 @@ struct Scsi_Host { wait_queue_head_t host_wait; struct scsi_host_template *hostt; struct scsi_transport_template *transportt; - volatile unsigned short host_busy; /* commands actually active on low-level */ - volatile unsigned short host_failed; /* commands that failed. */ + + /* +* The following two fields are protected with host_lock; +* however, eh routines can safely access during eh processing +* without acquiring the lock. +*/ + unsigned short host_busy; /* commands actually active on low-level */ + unsigned short host_failed;/* commands that failed. */ unsigned short host_no; /* Used for IOCTL_GET_IDLUN, /proc/scsi et al. */ int resetting; /* if set, it means that last_reset is a valid value */ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 11/13] scsi: add reprep arg to scsi_requeue_command() and make it public
11_scsi_make_requeue_command_public.patch Add reprep argument to scsi_requeue_command(), remove redundant q argument, add code to set cmd-state/owner, and make the function public. This patch is preparation for consolidating requeue paths. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi_lib.c | 23 ++- scsi_priv.h |1 + 2 files changed, 15 insertions(+), 9 deletions(-) Index: scsi-export/drivers/scsi/scsi_lib.c === --- scsi-export.orig/drivers/scsi/scsi_lib.c2005-03-31 18:06:22.0 +0900 +++ scsi-export/drivers/scsi/scsi_lib.c 2005-03-31 18:06:22.0 +0900 @@ -466,8 +466,8 @@ void scsi_device_unbusy(struct scsi_devi * * Purpose:Handle post-processing of completed commands. * - * Arguments: q - queue to operate on - * cmd - command that may need to be requeued. + * Arguments: cmd - command that may need to be requeued. + * reprep - needs to prep the command again? * * Returns:Nothing * @@ -478,11 +478,16 @@ void scsi_device_unbusy(struct scsi_devi * we need to request the blocks that come after the bad * sector. */ -static void scsi_requeue_command(struct request_queue *q, struct scsi_cmnd *cmd) +void scsi_requeue_command(struct scsi_cmnd *cmd, int reprep) { + struct request_queue *q = cmd-device-request_queue; unsigned long flags; - cmd-request-flags = ~REQ_DONTPREP; + cmd-state = SCSI_STATE_MLQUEUE; + cmd-owner = SCSI_OWNER_MIDLEVEL; + + if (reprep) + cmd-request-flags = ~REQ_DONTPREP; cmd-request-flags |= REQ_SOFTBARRIER; spin_lock_irqsave(q-queue_lock, flags); @@ -556,7 +561,7 @@ static struct scsi_cmnd *scsi_end_reques * leftovers in the front of the * queue, and goose the queue again. */ - scsi_requeue_command(q, cmd); + scsi_requeue_command(cmd, 1); return cmd; } @@ -818,7 +823,7 @@ void scsi_io_completion(struct scsi_cmnd * media change, so we just retry the * request and see what happens. */ - scsi_requeue_command(q, cmd); + scsi_requeue_command(cmd, 1); return; } break; @@ -839,7 +844,7 @@ void scsi_io_completion(struct scsi_cmnd * This will cause a retry with a 6-byte * command. */ - scsi_requeue_command(q, cmd); + scsi_requeue_command(cmd, 1); result = 0; } else { cmd = scsi_end_request(cmd, 0, this_count, 1); @@ -852,7 +857,7 @@ void scsi_io_completion(struct scsi_cmnd * retry. */ if (sshdr.asc == 0x04 sshdr.ascq == 0x01) { - scsi_requeue_command(q, cmd); + scsi_requeue_command(cmd, 1); return; } printk(KERN_INFO Device %s not ready.\n, @@ -878,7 +883,7 @@ void scsi_io_completion(struct scsi_cmnd * recovery reasons. Just retry the request * and see what happens. */ - scsi_requeue_command(q, cmd); + scsi_requeue_command(cmd, 1); return; } if (result) { Index: scsi-export/drivers/scsi/scsi_priv.h === --- scsi-export.orig/drivers/scsi/scsi_priv.h 2005-03-31 18:06:20.0 +0900 +++ scsi-export/drivers/scsi/scsi_priv.h2005-03-31 18:06:22.0 +0900 @@ -96,6 +96,7 @@ extern int scsi_maybe_unblock_host(struc extern void scsi_setup_cmd_retry(struct scsi_cmnd *cmd); extern void scsi_device_unbusy(struct scsi_device *sdev); extern int scsi_queue_insert(struct scsi_cmnd *cmd, int reason); +extern void scsi_requeue_command(struct scsi_cmnd *cmd, int reprep); extern void scsi_next_command(struct scsi_cmnd *cmd); extern void scsi_run_host_queues(struct Scsi_Host *shost); extern struct request_queue *scsi_alloc_queue(struct scsi_device *sdev); - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 12/13] scsi: replace scsi_queue_insert() with scsi_requeue_command()
12_scsi_consolidate_requeue_paths.patch Add scsi_device_unbusy() call to scsi_retry_command(), replace scsi_queue_insert() with scsi_requeue_command(), make scsi_softirq() use scsi_retry_command() for ADD_TO_MLQUEUE case too (with explicit device blocking), and make scsi_eh_flush_done_q() use scsi_retry_command(). While at it, remove leading and tailing empty comment lines from trivial comments. As scsi_queue_insert() has no users now, kill it. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi.c | 25 ++- scsi_error.c |2 - scsi_lib.c | 74 --- scsi_priv.h |3 -- 4 files changed, 15 insertions(+), 89 deletions(-) Index: scsi-export/drivers/scsi/scsi.c === --- scsi-export.orig/drivers/scsi/scsi.c2005-03-31 18:06:22.0 +0900 +++ scsi-export/drivers/scsi/scsi.c 2005-03-31 18:06:22.0 +0900 @@ -638,6 +638,7 @@ static void scsi_softirq(struct softirq_ while (!list_empty(local_q)) { struct scsi_cmnd *cmd = list_entry(local_q.next, struct scsi_cmnd, eh_entry); + struct scsi_device *sdev = cmd-device; list_del_init(cmd-eh_entry); disposition = scsi_decide_disposition(cmd); @@ -646,12 +647,12 @@ static void scsi_softirq(struct softirq_ case SUCCESS: scsi_finish_command(cmd); break; + case ADD_TO_MLQUEUE: + sdev-device_blocked = sdev-max_device_blocked; + /* fall thru */ case NEEDS_RETRY: scsi_retry_command(cmd); break; - case ADD_TO_MLQUEUE: - scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY); - break; default: if (!scsi_eh_scmd_add(cmd, 0)) scsi_finish_command(cmd); @@ -669,20 +670,20 @@ static void scsi_softirq(struct softirq_ * level drivers should not become re-entrant as a result of * this. */ -int scsi_retry_command(struct scsi_cmnd *cmd) +void scsi_retry_command(struct scsi_cmnd *cmd) { - /* -* Restore the SCSI command state. -*/ + SCSI_LOG_MLQUEUE(1, printk(Retrying command %p\n, cmd)); + + scsi_device_unbusy(cmd-device); + + /* Restore the SCSI command state. */ scsi_setup_cmd_retry(cmd); -/* - * Zero the sense information from the last time we tried - * this command. - */ +/* Zero the sense information from the last time we tried + * this command. */ memset(cmd-sense_buffer, 0, sizeof(cmd-sense_buffer)); - return scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY); + scsi_requeue_command(cmd, 0); } /* Index: scsi-export/drivers/scsi/scsi_error.c === --- scsi-export.orig/drivers/scsi/scsi_error.c 2005-03-31 18:06:21.0 +0900 +++ scsi-export/drivers/scsi/scsi_error.c 2005-03-31 18:06:22.0 +0900 @@ -1551,7 +1551,7 @@ static void scsi_eh_flush_done_q(struct retry cmd: %p\n, current-comm, scmd)); - scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY); + scsi_retry_command(scmd); } else { if (!scmd-result) scmd-result |= (DRIVER_TIMEOUT 24); Index: scsi-export/drivers/scsi/scsi_lib.c === --- scsi-export.orig/drivers/scsi/scsi_lib.c2005-03-31 18:06:22.0 +0900 +++ scsi-export/drivers/scsi/scsi_lib.c 2005-03-31 18:06:22.0 +0900 @@ -210,80 +210,6 @@ int scsi_insert_special_req(struct scsi_ } /* - * Function:scsi_queue_insert() - * - * Purpose: Insert a command in the midlevel queue. - * - * Arguments: cmd- command that we are adding to queue. - * reason - why we are inserting command to queue. - * - * Lock status: Assumed that lock is not held upon entry. - * - * Returns: Nothing. - * - * Notes: We do this for one of two cases. Either the host is busy - * and it cannot accept any more commands for the time being, - * or the device returned QUEUE_FULL and can accept no more - * commands. - * Notes: This could be called either from an interrupt context or a - * normal process context. - */ -int scsi_queue_insert(struct
Re: [PATCH scsi-misc-2.6 01/13] scsi: don't use blk_insert_request() for requeueing
Hello, James. On Thu, Mar 31, 2005 at 11:53:20AM -0600, James Bottomley wrote: On Thu, 2005-03-31 at 18:07 +0900, Tejun Heo wrote: 01_scsi_no_REQ_SPECIAL_on_requeue.patch blk_insert_request() has 'reinsert' argument, which, when set, turns on REQ_SPECIAL and REQ_SOFTBARRIER and requeues the request. SCSI midlayer was the only user of this feature and all requeued requests become special requests defeating quiesce state. This patch makes scsi midlayer use blk_requeue_request() for requeueing and removes 'reinsert' feature from blk_insert_request(). Well, REQ_SPECIAL is the signal to the mid-layer that we've allocated the resources necessary to process the command, so in practice it will be turned on for every requeue request (because we set it when the command is prepared), Sorry, but where? AFAICT, only blk_insert_request() and scsi_init_io() on sgtable allocation failure set REQ_SPECIAL during scsi midlayer processing. This patch replaces blk_insert_request() with blk_requeue_request() and the next patch removes REQ_SPECIAL setting in scsi_init_io(). REQ_SPECIAL is currently overloaded to mean two different things. * The request is a special request. * The request has been requeued using scsi_queue_insert(). i.e. It has been prepped. However, #2 can be tested by rq-special != NULL condition, and, in fact, the prep_fn already has the code. This is why this and the next patch don't break the requeue path. IMHO, this proves the subtlety of the REQ_SPECIAL semantics. Which path is taken on which case gets kind of confusing by meaning two different things with REQ_SPECIAL. so this patch would have no effect on your stated quiesce problem. However, if you think about how requests work with head insertion and one command following another, there's really not a huge problem here either. I agree that it's not a huge problem, but it's subtle and has the potential of causing probably non-destructive but confusing behavior on rare cases. The other reason I don't like this is that we've been trying hard to sweep excess block knowledge out of the mid-layer. I don't think REQ_SOFTBARRIER is anything we really have to know about. We currently requeue using two different block functions. * blk_insert_request(): this function does two things 1. enqueue special requests 2. turn on REQ_SPECIAL|REQ_SOFTBARRIER and call blk_requeue_request(). This is used only by scsi midlayer. * blk_requeue_request() REQ_SOFTBARRIER tells the request queue that other requests shouldn't pass this one. We need this to be set for prepped requests; otherwise, it may, theoretically, deadlock (unprepped request waiting for the prepped request back in the queue). So, the current code * depends on blk_insert_request() sets REQ_SOFTBARRIER when requeueing. It works but nothing in the interface or semantics is clear about it. Why expect a request reinserted using blk_insert_request() gets REQ_SOFTBARRIER turned on while a request reinserted with blk_requeue_request() doesn't? or why are there two different paths doing mostly the same thing with slightly different semantics? * requeueing using blk_requeue_request() in scsi_request_fn() doesn't turn on either REQ_SPECIAL or REQ_SOFTBARRIER. Missing REQ_SPECIAL is okay, as we have the extra path in prep_fn (if it ever gets requeued due to medium failure, so gets re-prepped), but missing REQ_SOFTBARRIER can *theoretically* cause problem. So, it's more likely becoming less dependent on unobvious behavior of block layer. As we need the request to stay on top, we tell the block layer to do so, instead of depending on blk_insert_request() unobviously doing it for us. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 02/13] scsi: don't turn on REQ_SPECIAL on sgtable allocation failure.
Hello, James. On Thu, Mar 31, 2005 at 11:53:45AM -0600, James Bottomley wrote: On Thu, 2005-03-31 at 18:08 +0900, Tejun Heo wrote: Don't turn on REQ_SPECIAL on sgtable allocation failure. This was the last place where REQ_SPECIAL is turned on for normal requests. If you do this, you'll leak a command every time the sgtable allocation fails. AFAICT, not really. We don't allocate another scsi_cmnd for normal requests if req-special != NULL. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 04/13] scsi: remove meaningless volatile qualifiers from structure definitions
Hello, Chritoph. On Thu, Mar 31, 2005 at 11:11:45AM +0100, Christoph Hellwig wrote: On Thu, Mar 31, 2005 at 06:08:10PM +0900, Tejun Heo wrote: struct list_headsiblings; /* list of all devices on this host */ struct list_headsame_target_siblings; /* just the devices sharing same target id */ - volatile unsigned short device_busy;/* commands actually active on low-level */ + unsigned short device_busy; /* commands actually active on +* low-level. protected by sdev_lock. */ You should probably switch it to just unsigned. The other 16bit are wasted due to alignment anyway, and some architectures produce better code for 32bit accesses. - volatile unsigned short host_busy; /* commands actually active on low-level */ - volatile unsigned short host_failed; /* commands that failed. */ + + /* +* The following two fields are protected with host_lock; +* however, eh routines can safely access during eh processing +* without acquiring the lock. +*/ + unsigned short host_busy; /* commands actually active on low-level */ + unsigned short host_failed;/* commands that failed. */ Here it would actually increase the struct size but might make sense anyway. Sure, I'll make them unsigned. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 05/13] scsi: remove a timer race from scsi_queue_insert() and cleanup timer
Hello, Chritoph. On Thu, Mar 31, 2005 at 11:13:53AM +0100, Christoph Hellwig wrote: /* Queue the command and wait for it to complete */ /* Abuse eh_timeout in the scsi_cmnd struct for our purposes */ init_timer(cmd-eh_timeout); + cmd-eh_timeout.function = NULL; I'd rather not see aic7xxx poke even deeper into this internal code. Can you please switch it to use a timer of it's own first? Yes, I'll. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 08/13] scsi: move request preps in other places into prep_fn()
Hello, Christoph. On Thu, Mar 31, 2005 at 11:20:40AM +0100, Christoph Hellwig wrote: +/* + * Macro to determine the size of SCSI command. This macro takes vendor + * unique commands into account. SCSI commands in groups 6 and 7 are + * vendor unique and we will depend upon the command length being + * supplied correctly in cmd_len. + */ +#define CDB_SIZE(cmd) (cmd)-cmnd[0] 5) 7) 6) ? \ + COMMAND_SIZE((cmd)-cmnd[0]) : (cmd)-cmd_len) should probably go to scsi.h as it's generally usefull. I don't know. Currently it's used only in one place. Actually, I was thinking about moving it into the function where it's used. But if it's useful, renaming it to something like SCSI_CMD_CDB_SIZE() (maybe make it inline function?) and moving to scsi.h shouldn't be any problem. I think we need to hear other people's opinions. Some inputs please. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 09/13] scsi: in scsi_prep_fn(), remove bogus comments clean up
Hello, James. On Thu, Mar 31, 2005 at 12:02:20PM -0600, James Bottomley wrote: On Thu, 2005-03-31 at 18:08 +0900, Tejun Heo wrote: -* come up when there is a medium error. We have to treat -* these two cases differently. We differentiate by looking -* at request-cmd, as this tells us the real story. +* come up when there is a medium error. This comment isn't wrong. That's exactly what this piece of code: if (sreq-sr_magic == SCSI_REQ_MAGIC) { is all about ... that's how it distinguishes between the two cases. The comment is misleading --- what it actually should say is that req- special has different contents depending upon the two cases, so rephrasing it to be more accurate would be helpful. Yes, it was misleading, even more so with previous REQ_SPECIAL patches. I'll rewrite the comment once we resolve the REQ_SPECIAL issue. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 11/13] scsi: add reprep arg to scsi_requeue_command() and make it public
Hello, Christoph. On Thu, Mar 31, 2005 at 11:32:03AM +0100, Christoph Hellwig wrote: - * Arguments: q - queue to operate on - * cmd - command that may need to be requeued. + * Arguments: cmd - command that may need to be requeued. + * reprep - needs to prep the command again? * * Returns:Nothing * @@ -478,11 +478,16 @@ void scsi_device_unbusy(struct scsi_devi * we need to request the blocks that come after the bad * sector. */ -static void scsi_requeue_command(struct request_queue *q, struct scsi_cmnd *cmd) +void scsi_requeue_command(struct scsi_cmnd *cmd, int reprep) { + struct request_queue *q = cmd-device-request_queue; unsigned long flags; - cmd-request-flags = ~REQ_DONTPREP; + cmd-state = SCSI_STATE_MLQUEUE; + cmd-owner = SCSI_OWNER_MIDLEVEL; + + if (reprep) + cmd-request-flags = ~REQ_DONTPREP; the flag is not needed, you can move the clearing of the flag to the caller. And given that there's lots of callers rename the scsi_requeue_command without it to __scsi_requeue_command and make scsi_requeue_command a tiny inline wrapper around it that clears it. I opt for scsi_requeue_command() and scsi_requeue_command_reprep() for clarity (the latter being static inline). Thanks a lot for all your inputs. :-) -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 10/13] scsi: rewrite scsi_request_fn()
Hello, Christoph. On Thu, Mar 31, 2005 at 12:14:16PM +0100, Christoph Hellwig wrote: the changes look good to me (although I haven't tested any of your patches yet), but the code flow is rather confusing. What do you think about the not even compile version of scsi_request_fn() below that should be functionally identical to yours: Yes, it's rather confusing. I was a bit concerned about forward gotos which are not error handling/exit and possible needs for unlikely()'s by putting error handling on hotter path, IOW, untaken forward branch. For the records, I think the likely/unlikely thingies are ugly over-optimization in many cases. However, your code is aesthetically much better. If there are no opjections regarding the forward non-error-exit jumps, I'll reorganize the code mostly as you suggested in the next take of this patchset. Thanks a lot. :-) -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 08/08] scsi: fix hot unplug sequence
Hello, James. James Bottomley wrote: On Fri, 2005-03-25 at 14:38 +0900, Tejun Heo wrote: We have users of scsi_do_req() other than scsi_wait_req() and they use different done() functions to do different things. I've checked other done functions and none uses contents inside the passed scsi_cmnd, so using a dummy command should be okay with them. Am I missing something here? Well ... the other users are supposed to be going away. They're actually all coded wrongly in some way or other ... perhaps I should speed up the process. Sounds great. :-) Oh, and I would really appreciate if you can fill me in / give a pointer about the scsi_request/scsi_cmnd distinction. The block layer speaks in terms of requests and the scsi layers in terms of commands. The scsi_request_fn() actually associates a request with a command. However, since SCSI uses the block layer for queueing, all the internal scsi command submit paths have to use requests. This is what a scsi_request is. The reason for the special casing is that we can't use the normal REQ_CMD or REQ_BLOCK_PC paths because they need ULD initialisation and back end processing. What I meant was we could just use scsi_cmnd instead of scsi_request for commands. Currently, we do the following for special commands. 1. Allocate scsi_request and request (two are linked) 2. Initialize scsi_request as needed 3. queue the request 4. the request is dispatched 5. scsi_cmnd is initialized from scsi_request 6. scsi_cmnd is executed 7. result code and sense copied back to scsi_request 8. request is completed Instead, we can 1. Allocate scsi_cmnd and request (two are linked) 2. Initialize scsi_cmnd as needed 3. queue the request 4. the request is dispatched 5. scsi_cmnd is executed 6. request is completed As the latter seemed more straight-forward to me, I was wondering if there were reasons that I wasn't aware of. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 08/08] scsi: fix hot unplug sequence
Hi, On Thu, Mar 24, 2005 at 11:02:45PM -0600, James Bottomley wrote: On Fri, 2005-03-25 at 12:15 +0900, Tejun Heo wrote: I think I found the cause. Special requests submitted using scsi_do_req() never initializes -end_io(). Normally, SCSI midlayer terminates special requests inside the SCSI midlayer without passing through the blkdev layer. However, if a device is going away or taken offline, blkdev layer gets to terminate special requests and, as -end_io() is never set-up, nothing happens and the completion gets lost. The analysis is exactly correct, well done! I think your patch is a bit overly complex, though. We can achieve the same effect simply by executing the completion without changing the rq_status like the patch below. Jens, To go back to the original problem, except when I hit the usb- storage error handling oops, I can plug and unplug to my hearts content and everything works. We have users of scsi_do_req() other than scsi_wait_req() and they use different done() functions to do different things. I've checked other done functions and none uses contents inside the passed scsi_cmnd, so using a dummy command should be okay with them. Am I missing something here? Oh, and I would really appreciate if you can fill me in / give a pointer about the scsi_request/scsi_cmnd distinction. Thanks a lot. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 07/08] scsi: remove bogus {get|put}_device() calls
Hi, James Bottomley wrote: On Wed, 2005-03-23 at 11:14 +0900, Tejun Heo wrote: So, basically, SCSI high-level object (scsi_disk) and mid-level object (scsi_device) are reference counted by users, not the requests they submit. Reference count cannot go zero with active users and users cannot access the object once the reference count reaches zero. Actually, no. Unfortunately we still have some fire and forget APIs, so the contention that we always have an open refcounted descriptor isn't always true. Yeap, you're right. So, what we have is * All high-level users have open access to the scsi high-level object on issueing requests, but may close it before its requests complete. * All mid-layer users do get_device() before submitting requests, but may put_device() before its requests complete. Thanks for pointing that out. :-) -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH scsi-misc-2.6 00/08] scsi: small fixes cleanups
Hello, James. Hello, Jens. These are series of small fixes cleanups. The last two patches deal with reference counting and hot unplugging oops. Patches are against scsi-misc-2.6 tree (this is the devel tree, right?). Jens, please try #08 and tell me if you still get oops. AFAICT, reference counting isn't the issue here. We're over-counting not under-counting (see description of #07). All compile cleanly and I haven't found any problem yet. USB hot-unplugging doesn't create oops or offline device anymore for me. [ Start of patch descriptions ] 01_scsi_remove_scsi_release_buffers.patch : remove unused bounce-buffer release path Buffer bouncing hasn't been done inside the scsi midlayer for quite sometime now, but bounce-buffer release paths are still around. This patch removes these unused paths. 02_scsi_no_special_on_requeue.patch : don't use blk_insert_request() for requeueing blk_insert_request() has 'reinsert' argument, which, when set, turns on REQ_SPECIAL and REQ_SOFTBARRIER and requeues the request. SCSI midlayer was the only user of this feature and all requeued requests become special requests defeating quiesce state. This patch makes scsi midlayer use blk_requeue_request() for requeueing and removes 'reinsert' feature from blk_insert_request(). Note: In drivers/scsi/scsi_lib.c, scsi_single_lun_run() and scsi_run_queue() are moved upward unchanged. 03_scsi_remove_internal_timeout.patch : remove unused scsi_cmnd-internal_timeout field scsi_cmnd-internal_timeout field doesn't have any meaning anymore. Kill the field. 04_scsi_remove_volatile.patch : remove meaningless volatile qualifiers from structure definitions scsi_device-device_busy, Scsi_Host-host_busy and -host_failed have volatile qualifiers, but the qualifiers don't serve any purpose. Kill them. While at it, protect -host_failed update in scsi_error for consistency and clarity. 05_scsi_timer_cleanup.patch : remove a timer race from scsi_queue_insert() and cleanup timer scsi_queue_insert() has four callers. Three callers call with timer disabled and one (the second invocation in scsi_dispatch_cmd()) calls with timer activated. scsi_queue_insert() used to always call scsi_delete_timer() and ignore the return value. This results in race with timer expiration. Remove scsi_delete_timer() call from scsi_queue_insert() and make the caller delete timer and check the return value. While at it, as, once a scsi timer is added, it should expire or be deleted before reused, make scsi_add_timer() strict about timer reuses. Now timer expiration function clears -function and all timer deletion should go through scsi_delete_timer(). Also, remove bogus -eh_action tests from scsi_eh_{done|times_out} functions. The condition is always true and the test is somewhat misleading. 06_scsi_remove_serial_number_at_timeout.patch : remove meaningless scsi_cmnd-serial_number_at_timeout field scsi_cmnd-serial_number_at_timeout doesn't serve any purpose anymore. All serial_number == serial_number_at_timeout tests are always true in abort callbacks. Kill the field. Also, as -pid always equals -serial_number and -serial_number doesn't have any special meaning anymore, update comments above -serial_number accordingly. Once we remove all uses of this field from all lldd's, this field should go. 07_scsi_refcnt_cleanup.patch : remove bogus {get|put}_device() calls SCSI request submission paths can be categorized like the following. * through high-level driver (sd, st, sg...) + requests (fs / pc) + ioctls + flushes (issue_flush / barrier rqs) + backing dev (unplug fn / field referencing) + high-level specific (init / revalidation...) * through scsi-midlayer + midlevel specific (init...) + transport specific (domain validations...) All accesses either * open high-level driver before submitting requests and closes with no request left. * get_device() before submitting requests and put_device() with no request left. So, basically, SCSI high-level object (scsi_disk) and mid-level object (scsi_device) are reference counted by users, not the requests they submit. Reference count cannot go zero with active users and users cannot access the object once the reference count reaches zero. So, the {get/put}_device() calls in scsi_get_command() and scsi_request_fn() are bogus and misleading. In addition, get_device() cannot
Re: [PATCH scsi-misc-2.6 01/08] scsi: remove unused bounce-buffer release path
01_scsi_remove_scsi_release_buffers.patch Buffer bouncing hasn't been done inside the scsi midlayer for quite sometime now, but bounce-buffer release paths are still around. This patch removes these unused paths. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi_lib.c | 60 +--- 1 files changed, 5 insertions(+), 55 deletions(-) Index: scsi-export/drivers/scsi/scsi_lib.c === --- scsi-export.orig/drivers/scsi/scsi_lib.c2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/scsi/scsi_lib.c 2005-03-23 09:40:09.0 +0900 @@ -622,45 +622,6 @@ static void scsi_free_sgtable(struct sca } /* - * Function:scsi_release_buffers() - * - * Purpose: Completion processing for block device I/O requests. - * - * Arguments: cmd- command that we are bailing. - * - * Lock status: Assumed that no lock is held upon entry. - * - * Returns: Nothing - * - * Notes: In the event that an upper level driver rejects a - * command, we must release resources allocated during - * the __init_io() function. Primarily this would involve - * the scatter-gather table, and potentially any bounce - * buffers. - */ -static void scsi_release_buffers(struct scsi_cmnd *cmd) -{ - struct request *req = cmd-request; - - /* -* Free up any indirection buffers we allocated for DMA purposes. -*/ - if (cmd-use_sg) - scsi_free_sgtable(cmd-request_buffer, cmd-sglist_len); - else if (cmd-request_buffer != req-buffer) - kfree(cmd-request_buffer); - - /* -* Zero these out. They now point to freed memory, and it is -* dangerous to hang onto the pointers. -*/ - cmd-buffer = NULL; - cmd-bufflen = 0; - cmd-request_buffer = NULL; - cmd-request_bufflen = 0; -} - -/* * Function:scsi_io_completion() * * Purpose: Completion processing for block device I/O requests. @@ -703,22 +664,8 @@ void scsi_io_completion(struct scsi_cmnd if (blk_complete_barrier_rq(q, req, good_bytes 9)) return; - /* -* Free up any indirection buffers we allocated for DMA purposes. -* For the case of a READ, we need to copy the data out of the -* bounce buffer and into the real buffer. -*/ if (cmd-use_sg) scsi_free_sgtable(cmd-buffer, cmd-sglist_len); - else if (cmd-buffer != req-buffer) { - if (rq_data_dir(req) == READ) { - unsigned long flags; - char *to = bio_kmap_irq(req-bio, flags); - memcpy(to, cmd-buffer, cmd-bufflen); - bio_kunmap_irq(to, flags); - } - kfree(cmd-buffer); - } if (result) { sense_valid = scsi_command_normalize_sense(cmd, sshdr); @@ -963,7 +910,8 @@ static int scsi_init_io(struct scsi_cmnd req-current_nr_sectors); /* release the command and kill it */ - scsi_release_buffers(cmd); + if (cmd-use_sg) + scsi_free_sgtable(cmd-request_buffer, cmd-sglist_len); scsi_put_command(cmd); return BLKPREP_KILL; } @@ -1140,7 +1088,9 @@ static int scsi_prep_fn(struct request_q */ drv = *(struct scsi_driver **)req-rq_disk-private_data; if (unlikely(!drv-init_command(cmd))) { - scsi_release_buffers(cmd); + if (cmd-use_sg) + scsi_free_sgtable(cmd-request_buffer, + cmd-sglist_len); scsi_put_command(cmd); return BLKPREP_KILL; } - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 08/08] scsi: fix hot unplug sequence
08_scsi_hot_unplug_fix.patch When hot-unplugging using scsi_remove_host() function (as usb does), scsi_forget_host() used to be called before scsi_host_cancel(). So, the device gets removed first without request cleanup and scsi_host_cancel() never gets to call scsi_device_cancel() on the removed devices. This results in premature completion of hot-unplugging process with active requests left in queue, eventually leading to hang/offlined device or oops when the active command times out. This patch makes scsi_remove_host() call scsi_host_cancel() first such that the host is first transited into cancel state and all requests of all devices are killed, and then, the devices are removed. This patch fixes the oops in eh after hot-unplugging bug. Signed-off-by: Tejun Heo [EMAIL PROTECTED] hosts.c |2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: scsi-export/drivers/scsi/hosts.c === --- scsi-export.orig/drivers/scsi/hosts.c 2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/scsi/hosts.c2005-03-23 09:40:11.0 +0900 @@ -74,8 +74,8 @@ void scsi_host_cancel(struct Scsi_Host * **/ void scsi_remove_host(struct Scsi_Host *shost) { - scsi_forget_host(shost); scsi_host_cancel(shost, 0); + scsi_forget_host(shost); scsi_proc_host_rm(shost); set_bit(SHOST_DEL, shost-shost_state); - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 06/08] scsi: remove meaningless scsi_cmnd-serial_number_at_timeout field
06_scsi_remove_serial_number_at_timeout.patch scsi_cmnd-serial_number_at_timeout doesn't serve any purpose anymore. All serial_number == serial_number_at_timeout tests are always true in abort callbacks. Kill the field. Also, as -pid always equals -serial_number and -serial_number doesn't have any special meaning anymore, update comments above -serial_number accordingly. Once we remove all uses of this field from all lldd's, this field should go. Signed-off-by: Tejun Heo [EMAIL PROTECTED] drivers/scsi/BusLogic.c |7 --- drivers/scsi/advansys.c |5 ++--- drivers/scsi/ips.c |7 --- drivers/scsi/ncr53c8xx.c| 14 ++ drivers/scsi/qla2xxx/qla_dbg.c |6 ++ drivers/scsi/scsi.c |2 -- drivers/scsi/scsi_error.c |7 --- drivers/scsi/scsi_lib.c |1 - drivers/scsi/sym53c8xx_2/sym_glue.c |6 -- include/scsi/scsi_cmnd.h| 22 +- 10 files changed, 15 insertions(+), 62 deletions(-) Index: scsi-export/drivers/scsi/BusLogic.c === --- scsi-export.orig/drivers/scsi/BusLogic.c2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/scsi/BusLogic.c 2005-03-23 09:40:11.0 +0900 @@ -2958,13 +2958,6 @@ static int BusLogic_AbortCommand(struct struct BusLogic_CCB *CCB; BusLogic_IncrementErrorCounter(HostAdapter-TargetStatistics[TargetID].CommandAbortsRequested); /* - If this Command has already completed, then no Abort is necessary. -*/ - if (Command-serial_number != Command-serial_number_at_timeout) { - BusLogic_Warning(Unable to Abort Command to Target %d - Already Completed\n, HostAdapter, TargetID); - return SUCCESS; - } - /* Attempt to find an Active CCB for this Command. If no Active CCB for this Command is found, then no Abort is necessary. */ Index: scsi-export/drivers/scsi/advansys.c === --- scsi-export.orig/drivers/scsi/advansys.c2005-03-23 09:40:09.0 +0900 +++ scsi-export/drivers/scsi/advansys.c 2005-03-23 09:40:11.0 +0900 @@ -9198,9 +9198,8 @@ asc_prt_scsi_cmnd(struct scsi_cmnd *s) s-use_sg, s-sglist_len, s-abort_reason); printk( - serial_number 0x%x, serial_number_at_timeout 0x%x, retries %d, allowed %d\n, -(unsigned) s-serial_number, (unsigned) s-serial_number_at_timeout, - s-retries, s-allowed); + serial_number 0x%x, retries %d, allowed %d\n, +(unsigned) s-serial_number, s-retries, s-allowed); printk( timeout_per_command %d, timeout_total %d, timeout %d\n, Index: scsi-export/drivers/scsi/ips.c === --- scsi-export.orig/drivers/scsi/ips.c 2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/scsi/ips.c 2005-03-23 09:40:11.0 +0900 @@ -833,13 +833,6 @@ ips_eh_abort(Scsi_Cmnd * SC) if (!ha-active) return (FAILED); - if (SC-serial_number != SC-serial_number_at_timeout) { - /* HMM, looks like a bogus command */ - DEBUG(1, Abort called with bogus scsi command); - - return (FAILED); - } - /* See if the command is on the copp queue */ item = ha-copp_waitlist.head; while ((item) (item-scsi_cmd != SC)) Index: scsi-export/drivers/scsi/ncr53c8xx.c === --- scsi-export.orig/drivers/scsi/ncr53c8xx.c 2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/scsi/ncr53c8xx.c2005-03-23 09:40:11.0 +0900 @@ -7601,24 +7601,14 @@ static int ncr53c8xx_abort(struct scsi_c struct scsi_cmnd *done_list; #if defined SCSI_RESET_SYNCHRONOUS defined SCSI_RESET_ASYNCHRONOUS - printk(ncr53c8xx_abort: pid=%lu serial_number=%ld serial_number_at_timeout=%ld\n, - cmd-pid, cmd-serial_number, cmd-serial_number_at_timeout); + printk(ncr53c8xx_abort: pid=%lu serial_number=%ld\n, + cmd-pid, cmd-serial_number); #else printk(ncr53c8xx_abort: command pid %lu\n, cmd-pid); #endif NCR_LOCK_NCB(np, flags); -#if defined SCSI_RESET_SYNCHRONOUS defined SCSI_RESET_ASYNCHRONOUS - /* -* We have to just ignore abort requests in some situations. -*/ - if (cmd-serial_number != cmd-serial_number_at_timeout) { - sts = SCSI_ABORT_NOT_RUNNING; - goto out; - } -#endif - sts = ncr_abort_command(np, cmd); out: done_list = np-done_list; Index: scsi-export/drivers/scsi/qla2xxx/qla_dbg.c === --- scsi-export.orig
Re: [PATCH scsi-misc-2.6 04/08] scsi: remove meaningless volatile qualifiers from structure definitions
04_scsi_remove_volatile.patch scsi_device-device_busy, Scsi_Host-host_busy and -host_failed have volatile qualifiers, but the qualifiers don't serve any purpose. Kill them. While at it, protect -host_failed update in scsi_error for consistency and clarity. Signed-off-by: Tejun Heo [EMAIL PROTECTED] drivers/scsi/scsi_error.c |6 +- include/scsi/scsi_device.h |2 +- include/scsi/scsi_host.h |4 ++-- 3 files changed, 8 insertions(+), 4 deletions(-) Index: scsi-export/drivers/scsi/scsi_error.c === --- scsi-export.orig/drivers/scsi/scsi_error.c 2005-03-23 09:40:09.0 +0900 +++ scsi-export/drivers/scsi/scsi_error.c 2005-03-23 09:40:10.0 +0900 @@ -652,9 +652,13 @@ static int scsi_request_sense(struct scs static void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q) { + unsigned long flags; + + spin_lock_irqsave(scmd-device-host-host_lock, flags); scmd-device-host-host_failed--; - scmd-state = SCSI_STATE_BHQUEUE; + spin_unlock_irqrestore(scmd-device-host-host_lock, flags); + scmd-state = SCSI_STATE_BHQUEUE; scsi_eh_eflags_clr_all(scmd); /* Index: scsi-export/include/scsi/scsi_device.h === --- scsi-export.orig/include/scsi/scsi_device.h 2005-03-23 09:39:36.0 +0900 +++ scsi-export/include/scsi/scsi_device.h 2005-03-23 09:40:10.0 +0900 @@ -43,7 +43,7 @@ struct scsi_device { struct list_headsiblings; /* list of all devices on this host */ struct list_headsame_target_siblings; /* just the devices sharing same target id */ - volatile unsigned short device_busy;/* commands actually active on low-level */ + unsigned short device_busy; /* commands actually active on low-level */ spinlock_t sdev_lock; /* also the request queue_lock */ spinlock_t list_lock; struct list_head cmd_list; /* queue of in use SCSI Command structures */ Index: scsi-export/include/scsi/scsi_host.h === --- scsi-export.orig/include/scsi/scsi_host.h 2005-03-23 09:39:36.0 +0900 +++ scsi-export/include/scsi/scsi_host.h2005-03-23 09:40:10.0 +0900 @@ -448,8 +448,8 @@ struct Scsi_Host { wait_queue_head_t host_wait; struct scsi_host_template *hostt; struct scsi_transport_template *transportt; - volatile unsigned short host_busy; /* commands actually active on low-level */ - volatile unsigned short host_failed; /* commands that failed. */ + unsigned short host_busy; /* commands actually active on low-level */ + unsigned short host_failed;/* commands that failed. */ unsigned short host_no; /* Used for IOCTL_GET_IDLUN, /proc/scsi et al. */ int resetting; /* if set, it means that last_reset is a valid value */ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 05/08] scsi: remove a timer race from scsi_queue_insert() and cleanup timer
05_scsi_timer_cleanup.patch scsi_queue_insert() has four callers. Three callers call with timer disabled and one (the second invocation in scsi_dispatch_cmd()) calls with timer activated. scsi_queue_insert() used to always call scsi_delete_timer() and ignore the return value. This results in race with timer expiration. Remove scsi_delete_timer() call from scsi_queue_insert() and make the caller delete timer and check the return value. While at it, as, once a scsi timer is added, it should expire or be deleted before reused, make scsi_add_timer() strict about timer reuses. Now timer expiration function clears -function and all timer deletion should go through scsi_delete_timer(). Also, remove bogus -eh_action tests from scsi_eh_{done|times_out} functions. The condition is always true and the test is somewhat misleading. Signed-off-by: Tejun Heo [EMAIL PROTECTED] aic7xxx/aic79xx_osm.c |1 + aic7xxx/aic7xxx_osm.c |1 + scsi.c|7 --- scsi_error.c | 24 +++- scsi_lib.c|6 -- 5 files changed, 13 insertions(+), 26 deletions(-) Index: scsi-export/drivers/scsi/aic7xxx/aic79xx_osm.c === --- scsi-export.orig/drivers/scsi/aic7xxx/aic79xx_osm.c 2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/scsi/aic7xxx/aic79xx_osm.c 2005-03-23 09:40:10.0 +0900 @@ -2725,6 +2725,7 @@ ahd_linux_dv_target(struct ahd_softc *ah /* Queue the command and wait for it to complete */ /* Abuse eh_timeout in the scsi_cmnd struct for our purposes */ init_timer(cmd-eh_timeout); + cmd-eh_timeout.function = NULL; #ifdef AHD_DEBUG if ((ahd_debug AHD_SHOW_MESSAGES) != 0) /* Index: scsi-export/drivers/scsi/aic7xxx/aic7xxx_osm.c === --- scsi-export.orig/drivers/scsi/aic7xxx/aic7xxx_osm.c 2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/scsi/aic7xxx/aic7xxx_osm.c 2005-03-23 09:40:10.0 +0900 @@ -2409,6 +2409,7 @@ ahc_linux_dv_target(struct ahc_softc *ah /* Queue the command and wait for it to complete */ /* Abuse eh_timeout in the scsi_cmnd struct for our purposes */ init_timer(cmd-eh_timeout); + cmd-eh_timeout.function = NULL; #ifdef AHC_DEBUG if ((ahc_debug AHC_SHOW_MESSAGES) != 0) /* Index: scsi-export/drivers/scsi/scsi.c === --- scsi-export.orig/drivers/scsi/scsi.c2005-03-23 09:40:09.0 +0900 +++ scsi-export/drivers/scsi/scsi.c 2005-03-23 09:40:10.0 +0900 @@ -639,9 +639,10 @@ int scsi_dispatch_cmd(struct scsi_cmnd * spin_unlock_irqrestore(host-host_lock, flags); if (rtn) { atomic_inc(cmd-device-iodone_cnt); - scsi_queue_insert(cmd, - (rtn == SCSI_MLQUEUE_DEVICE_BUSY) ? -rtn : SCSI_MLQUEUE_HOST_BUSY); + if (scsi_delete_timer(cmd)) + scsi_queue_insert(cmd, + (rtn == SCSI_MLQUEUE_DEVICE_BUSY) ? + rtn : SCSI_MLQUEUE_HOST_BUSY); SCSI_LOG_MLQUEUE(3, printk(queuecommand : request rejected\n)); } Index: scsi-export/drivers/scsi/scsi_error.c === --- scsi-export.orig/drivers/scsi/scsi_error.c 2005-03-23 09:40:10.0 +0900 +++ scsi-export/drivers/scsi/scsi_error.c 2005-03-23 09:40:10.0 +0900 @@ -107,14 +107,7 @@ int scsi_eh_scmd_add(struct scsi_cmnd *s void scsi_add_timer(struct scsi_cmnd *scmd, int timeout, void (*complete)(struct scsi_cmnd *)) { - - /* -* If the clock was already running for this command, then -* first delete the timer. The timer handling code gets rather -* confused if we don't do this. -*/ - if (scmd-eh_timeout.function) - del_timer(scmd-eh_timeout); + BUG_ON(scmd-eh_timeout.function); scmd-eh_timeout.data = (unsigned long)scmd; scmd-eh_timeout.expires = jiffies + timeout; @@ -170,6 +163,9 @@ void scsi_times_out(struct scsi_cmnd *sc { scsi_log_completion(scmd, TIMEOUT_ERROR); + scmd-eh_timeout.data = (unsigned long)NULL; + scmd-eh_timeout.function = NULL; + if (scmd-device-host-hostt-eh_timed_out) switch (scmd-device-host-hostt-eh_timed_out(scmd)) { case EH_HANDLED: @@ -442,9 +438,7 @@ static void scsi_eh_times_out(struct scs
Re: [PATCH scsi-misc-2.6 07/08] scsi: remove bogus {get|put}_device() calls
07_scsi_refcnt_cleanup.patch SCSI request submission paths can be categorized like the following. * through high-level driver (sd, st, sg...) + requests (fs / pc) + ioctls + flushes (issue_flush / barrier rqs) + backing dev (unplug fn / field referencing) + high-level specific (init / revalidation...) * through scsi-midlayer + midlevel specific (init...) + transport specific (domain validations...) All accesses either * open high-level driver before submitting requests and closes with no request left. * get_device() before submitting requests and put_device() with no request left. So, basically, SCSI high-level object (scsi_disk) and mid-level object (scsi_device) are reference counted by users, not the requests they submit. Reference count cannot go zero with active users and users cannot access the object once the reference count reaches zero. So, the {get/put}_device() calls in scsi_get_command() and scsi_request_fn() are bogus and misleading. In addition, get_device() cannot synchronize 1-0 and 0-1 transitions and always returns the device pointer given as the argument. The == NULL tests are just misleading. Signed-off-by: Tejun Heo [EMAIL PROTECTED] scsi.c |9 + scsi_lib.c | 12 +--- 2 files changed, 2 insertions(+), 19 deletions(-) Index: scsi-export/drivers/scsi/scsi.c === --- scsi-export.orig/drivers/scsi/scsi.c2005-03-23 09:40:11.0 +0900 +++ scsi-export/drivers/scsi/scsi.c 2005-03-23 09:40:11.0 +0900 @@ -246,10 +246,6 @@ struct scsi_cmnd *scsi_get_command(struc { struct scsi_cmnd *cmd; - /* Bail if we can't get a reference to the device */ - if (!get_device(dev-sdev_gendev)) - return NULL; - cmd = __scsi_get_command(dev-host, gfp_mask); if (likely(cmd != NULL)) { @@ -264,8 +260,7 @@ struct scsi_cmnd *scsi_get_command(struc spin_lock_irqsave(dev-list_lock, flags); list_add_tail(cmd-list, dev-cmd_list); spin_unlock_irqrestore(dev-list_lock, flags); - } else - put_device(dev-sdev_gendev); + } return cmd; } @@ -303,8 +298,6 @@ void scsi_put_command(struct scsi_cmnd * if (likely(cmd != NULL)) kmem_cache_free(shost-cmd_pool-slab, cmd); - - put_device(sdev-sdev_gendev); } EXPORT_SYMBOL(scsi_put_command); Index: scsi-export/drivers/scsi/scsi_lib.c === --- scsi-export.orig/drivers/scsi/scsi_lib.c2005-03-23 09:40:11.0 +0900 +++ scsi-export/drivers/scsi/scsi_lib.c 2005-03-23 09:40:11.0 +0900 @@ -1200,10 +1200,6 @@ static void scsi_request_fn(struct reque struct scsi_cmnd *cmd; struct request *req; - if(!get_device(sdev-sdev_gendev)) - /* We must be tearing the block queue down already */ - return; - /* * To start with, we keep looping until the queue is empty, or until * the host is no longer able to accept any more requests. @@ -1288,7 +1284,7 @@ static void scsi_request_fn(struct reque } } - goto out; + return; not_ready: spin_unlock_irq(shost-host_lock); @@ -1306,12 +1302,6 @@ static void scsi_request_fn(struct reque sdev-device_busy--; if(sdev-device_busy == 0) blk_plug_device(q); - out: - /* must be careful here...if we trigger the -remove() function -* we cannot be holding the q lock */ - spin_unlock_irq(q-queue_lock); - put_device(sdev-sdev_gendev); - spin_lock_irq(q-queue_lock); } u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost) - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 02/08] scsi: don't use blk_insert_request() for requeueing
02_scsi_no_special_on_requeue.patch blk_insert_request() has 'reinsert' argument, which, when set, turns on REQ_SPECIAL and REQ_SOFTBARRIER and requeues the request. SCSI midlayer was the only user of this feature and all requeued requests become special requests defeating quiesce state. This patch makes scsi midlayer use blk_requeue_request() for requeueing and removes 'reinsert' feature from blk_insert_request(). Note: In drivers/scsi/scsi_lib.c, scsi_single_lun_run() and scsi_run_queue() are moved upward unchanged. Signed-off-by: Tejun Heo [EMAIL PROTECTED] drivers/block/ll_rw_blk.c | 20 +-- drivers/block/paride/pd.c |2 drivers/block/sx8.c |4 drivers/scsi/scsi_lib.c | 238 +++--- include/linux/blkdev.h|2 5 files changed, 133 insertions(+), 133 deletions(-) Index: scsi-export/drivers/block/ll_rw_blk.c === --- scsi-export.orig/drivers/block/ll_rw_blk.c 2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/block/ll_rw_blk.c 2005-03-23 09:40:09.0 +0900 @@ -2028,7 +2028,6 @@ EXPORT_SYMBOL(blk_requeue_request); * @rq:request to be inserted * @at_head: insert request at head or tail of queue * @data: private data - * @reinsert: true if request it a reinsertion of previously processed one * * Description: *Many block devices need to execute commands asynchronously, so they don't @@ -2043,8 +2042,9 @@ EXPORT_SYMBOL(blk_requeue_request); *host that is unable to accept a particular command. */ void blk_insert_request(request_queue_t *q, struct request *rq, - int at_head, void *data, int reinsert) + int at_head, void *data) { + int where = at_head ? ELEVATOR_INSERT_FRONT : ELEVATOR_INSERT_BACK; unsigned long flags; /* @@ -2061,20 +2061,12 @@ void blk_insert_request(request_queue_t /* * If command is tagged, release the tag */ - if (reinsert) - blk_requeue_request(q, rq); - else { - int where = ELEVATOR_INSERT_BACK; + if (blk_rq_tagged(rq)) + blk_queue_end_tag(q, rq); - if (at_head) - where = ELEVATOR_INSERT_FRONT; + drive_stat_acct(rq, rq-nr_sectors, 1); + __elv_add_request(q, rq, where, 0); - if (blk_rq_tagged(rq)) - blk_queue_end_tag(q, rq); - - drive_stat_acct(rq, rq-nr_sectors, 1); - __elv_add_request(q, rq, where, 0); - } if (blk_queue_plugged(q)) __generic_unplug_device(q); else Index: scsi-export/drivers/block/paride/pd.c === --- scsi-export.orig/drivers/block/paride/pd.c 2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/block/paride/pd.c 2005-03-23 09:40:09.0 +0900 @@ -723,7 +723,7 @@ static int pd_special_command(struct pd_ rq.ref_count = 1; rq.waiting = wait; rq.end_io = blk_end_sync_rq; - blk_insert_request(disk-gd-queue, rq, 0, func, 0); + blk_insert_request(disk-gd-queue, rq, 0, func); wait_for_completion(wait); rq.waiting = NULL; if (rq.errors) Index: scsi-export/drivers/block/sx8.c === --- scsi-export.orig/drivers/block/sx8.c2005-03-23 09:39:36.0 +0900 +++ scsi-export/drivers/block/sx8.c 2005-03-23 09:40:09.0 +0900 @@ -614,7 +614,7 @@ static int carm_array_info (struct carm_ spin_unlock_irq(host-lock); DPRINTK(blk_insert_request, tag == %u\n, idx); - blk_insert_request(host-oob_q, crq-rq, 1, crq, 0); + blk_insert_request(host-oob_q, crq-rq, 1, crq); return 0; @@ -653,7 +653,7 @@ static int carm_send_special (struct car crq-msg_bucket = (u32) rc; DPRINTK(blk_insert_request, tag == %u\n, idx); - blk_insert_request(host-oob_q, crq-rq, 1, crq, 0); + blk_insert_request(host-oob_q, crq-rq, 1, crq); return 0; } Index: scsi-export/drivers/scsi/scsi_lib.c === --- scsi-export.orig/drivers/scsi/scsi_lib.c2005-03-23 09:40:09.0 +0900 +++ scsi-export/drivers/scsi/scsi_lib.c 2005-03-23 09:40:09.0 +0900 @@ -65,6 +65,109 @@ struct scsi_host_sg_pool scsi_sg_pools[] /* + * Called for single_lun devices on IO completion. Clear starget_sdev_user, + * and call blk_run_queue for all the scsi_devices on the target - + * including current_sdev first. + * + * Called with *no* scsi locks held. + */ +static void scsi_single_lun_run(struct scsi_device *current_sdev) +{ + struct Scsi_Host *shost = current_sdev-host
Re: [PATCH scsi-misc-2.6 04/08] scsi: remove meaningless volatile qualifiers from structure definitions
Hello, guys. On Tue, Mar 22, 2005 at 11:22:23PM -0500, Jeff Garzik wrote: James Bottomley wrote: On Wed, 2005-03-23 at 11:14 +0900, Tejun Heo wrote: scsi_device-device_busy, Scsi_Host-host_busy and -host_failed have volatile qualifiers, but the qualifiers don't serve any purpose. Kill them. While at it, protect -host_failed update in scsi_error for consistency and clarity. Well ... the data here is volatile so what you're advocating is a move away from a volatile variable model to a protected variable one ... did you audit all users of both of these to make sure we have protection on all of them? It looks like the sata strategy handlers would still rely on the volatile data. Yes, I did (well, tried :-). Adding locking/unlocking was just for clarity. We have synchronization w/ implied memory barrier before and after eh processing, so we don't really need to acquire the lock there. And while adding it, I forgot about libata strategy function. Sorry about that. I removed the locking from scsi_error and added comments to data structure definition that those fields can be accessed without acquiring the host lock. I think it's clearer this way. volatile is almost always (a) buggy, or (b) hiding bugs. At the very least, barriers are usually needed. Almost every case really wants to be inside a spinlock, or atomic_t, or similarly protected. Specifically for SATA, I am making the presumption that SCSI is smart enough not to mess with host_failed until my error handler completes. Yes, volatile only instructs the compiler to not cache the variable in the register and not move around accesses to the variable. The only valid usage would be raw synchronization with variables (busy checking, two flag variable synchronization, etc...), but even those usages are better off with explicit barriers and cpu relaxations to clarify what's going on. Currently all accesses outside eh are properly protected with locks except for the following two cases. * sg_proc_seq_show_dev(): read access, informational. doesn't matter. * check looping in scsi_device_quiesce(): we have proper barrier. Signed-off-by: Tejun Heo [EMAIL PROTECTED] Index: scsi-export/include/scsi/scsi_device.h === --- scsi-export.orig/include/scsi/scsi_device.h 2005-03-23 09:40:12.0 +0900 +++ scsi-export/include/scsi/scsi_device.h 2005-03-23 14:04:59.0 +0900 @@ -43,7 +43,8 @@ struct scsi_device { struct list_headsiblings; /* list of all devices on this host */ struct list_headsame_target_siblings; /* just the devices sharing same target id */ - volatile unsigned short device_busy;/* commands actually active on low-level */ + unsigned short device_busy; /* commands actually active on +* low-level. protected by sdev_lock. */ spinlock_t sdev_lock; /* also the request queue_lock */ spinlock_t list_lock; struct list_head cmd_list; /* queue of in use SCSI Command structures */ Index: scsi-export/include/scsi/scsi_host.h === --- scsi-export.orig/include/scsi/scsi_host.h 2005-03-23 09:40:12.0 +0900 +++ scsi-export/include/scsi/scsi_host.h2005-03-23 14:04:59.0 +0900 @@ -448,8 +448,14 @@ struct Scsi_Host { wait_queue_head_t host_wait; struct scsi_host_template *hostt; struct scsi_transport_template *transportt; - volatile unsigned short host_busy; /* commands actually active on low-level */ - volatile unsigned short host_failed; /* commands that failed. */ + + /* +* The following two fields are protected with host_lock; +* however, eh routines can safely access during eh processing +* without acquiring the lock. +*/ + unsigned short host_busy; /* commands actually active on low-level */ + unsigned short host_failed;/* commands that failed. */ unsigned short host_no; /* Used for IOCTL_GET_IDLUN, /proc/scsi et al. */ int resetting; /* if set, it means that last_reset is a valid value */ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH scsi-misc-2.6 01/08] scsi: remove unused bounce-buffer release path
Hello, James. James Bottomley wrote: On Wed, 2005-03-23 at 11:14 +0900, Tejun Heo wrote: 01_scsi_remove_scsi_release_buffers.patch Buffer bouncing hasn't been done inside the scsi midlayer for quite sometime now, but bounce-buffer release paths are still around. This patch removes these unused paths. Yes, but scsi_release_buffers isn't referring to bounce buffers anymore, it's simply releasing the sg buffers. That's what I did. Replacing scsi_release_buffers() calls with calls to scsi_free_sgtable(). The only logic removed is bounce-buffer release/copy-back. [...] - else if (cmd-buffer != req-buffer) { - if (rq_data_dir(req) == READ) { - unsigned long flags; - char *to = bio_kmap_irq(req-bio, flags); - memcpy(to, cmd-buffer, cmd-bufflen); - bio_kunmap_irq(to, flags); - } - kfree(cmd-buffer); - } I'll defer to Jens here, but I don't thing you can just remove this ... sg_io with a misaligned buffer will fail without this. AFAIK, those are done by blk_rq_map_user() and blk_rq_unmap_user(), both of which are invoked directly by sg_io(). That rather nasty code freeing cmd-buffer needs to be in there as well ... so it does make sense to keep this API That code is invoked only for REQ_BLOCK_PC requests without bio, and I digged pretty hard but, in those cases, AFAICT, the callers are responsible for supplying dma-able buffers and nothing seems to alter cmd-buffer after the cmd gets initialized, but I might be missing things here. If so, please point out. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] driver model/scsi: synchronize pm calls with probe/remove
Hello, Dmitry, Mochel and James. I've been looking at sd code and found seemingly bogus 'if (!sdkp)' tests with /* this can happen */ comment. I've digged changelog and found out that this was to prevent oops which occurs if some driver gets stuck inside -probe and the machine goes down and calls back -remove. IMHO, we should avoid this problem by fixing driver -probe or -remove callbacks instead of detecting and bypassing half-initialized/destroyed devices in pm callbacks. This patch read-locks a device's bus using device_pm_down_read_bus() before invoking any pm callback. The function also periodically prints out warning messages while waiting for the bus subsys rwsem. This patch makes the machine wait indefinitely if any driver is stuck inside -probe or -remove. In device_shutdown(), devices_subsys.kset.list is walked while holding devices_subsys.rwsem. This patch nests each bus's subsys rwsem inside. Signed-off-by: Tejun Heo [EMAIL PROTECTED] # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2005/03/21 17:22:41+09:00 [EMAIL PROTECTED] # device_pm_down_read_bus() implemented. # # drivers/scsi/sd.c # 2005/03/21 17:22:33+09:00 [EMAIL PROTECTED] +0 -6 # device_pm_down_read_bus() implemented. # # drivers/base/power/suspend.c # 2005/03/21 17:22:33+09:00 [EMAIL PROTECTED] +4 -1 # device_pm_down_read_bus() implemented. # # drivers/base/power/shutdown.c # 2005/03/21 17:22:33+09:00 [EMAIL PROTECTED] +13 -5 # device_pm_down_read_bus() implemented. # # drivers/base/power/resume.c # 2005/03/21 17:22:33+09:00 [EMAIL PROTECTED] +7 -2 # device_pm_down_read_bus() implemented. # # drivers/base/power/power.h # 2005/03/21 17:22:33+09:00 [EMAIL PROTECTED] +2 -0 # device_pm_down_read_bus() implemented. # # drivers/base/power/main.c # 2005/03/21 17:22:33+09:00 [EMAIL PROTECTED] +23 -1 # device_pm_down_read_bus() implemented. # diff -Nru a/drivers/base/power/main.c b/drivers/base/power/main.c --- a/drivers/base/power/main.c 2005-03-21 17:25:50 +09:00 +++ b/drivers/base/power/main.c 2005-03-21 17:25:50 +09:00 @@ -21,6 +21,7 @@ #include linux/config.h #include linux/device.h +#include linux/delay.h #include power.h LIST_HEAD(dpm_active); @@ -96,4 +97,25 @@ up(dpm_list_sem); } - +/** + * device_pm_down_read_bus - Read-lock dev-bus's rwsem. + * @dev: The bus of this dev is read locked on return. + * @opstr: Error message prefix. + * + * Read locks the subsys rwsem of the device's bus. Generally pm + * calls are made with processes frozen, so there shouldn't be + * any contention; however, the shutdown path is invoked when + * halting the machine and it's possible to have some drivers + * stuck inside -probe or -remove method. In such cases, we + * retry while printing a warning message every 10s. + */ +void device_pm_down_read_bus(struct device * dev, const char * opstr) +{ + int cnt = 0; + while (!down_read_trylock(dev-bus-subsys.rwsem)) { + if (cnt++ % 100 == 0) + printk(KERN_WARNING %s %s%s: waiting on bus subsys + rwsem\n, opstr, dev-bus-name, dev-bus_id); + msleep(100); + } +} diff -Nru a/drivers/base/power/power.h b/drivers/base/power/power.h --- a/drivers/base/power/power.h2005-03-21 17:25:50 +09:00 +++ b/drivers/base/power/power.h2005-03-21 17:25:50 +09:00 @@ -53,6 +53,8 @@ extern int device_pm_add(struct device *); extern void device_pm_remove(struct device *); +extern void device_pm_down_read_bus(struct device * dev, const char *opstr); + /* * sysfs.c */ diff -Nru a/drivers/base/power/resume.c b/drivers/base/power/resume.c --- a/drivers/base/power/resume.c 2005-03-21 17:25:50 +09:00 +++ b/drivers/base/power/resume.c 2005-03-21 17:25:50 +09:00 @@ -22,8 +22,13 @@ int resume_device(struct device * dev) { - if (dev-bus dev-bus-resume) - return dev-bus-resume(dev); + if (dev-bus dev-bus-resume) { + int ret; + device_pm_down_read_bus(dev, Resuming); + ret = dev-bus-resume(dev); + up_read(dev-bus-subsys.rwsem); + return ret; + } return 0; } diff -Nru a/drivers/base/power/shutdown.c b/drivers/base/power/shutdown.c --- a/drivers/base/power/shutdown.c 2005-03-21 17:25:50 +09:00 +++ b/drivers/base/power/shutdown.c 2005-03-21 17:25:50 +09:00 @@ -25,6 +25,8 @@ return 0; if (dev-detach_state == DEVICE_PM_OFF) { + /* We have bus rwsem write-locked on entry to this +* function. No need to mess with the bus rwsem. */ if (dev-driver dev-driver-shutdown) dev-driver-shutdown(dev); return 0; @@ -54,11 +56,17 @@ down_write(devices_subsys.rwsem); list_for_each_entry_reverse(dev, devices_subsys.kset.list
Re: [PATCH] driver model/scsi: synchronize pm calls with probe/remove
Hi, Dmitry. Dmitry Torokhov wrote: On Mon, 21 Mar 2005 18:18:46 +0900, Tejun Heo [EMAIL PROTECTED] wrote: Hello, Dmitry, Mochel and James. I've been looking at sd code and found seemingly bogus 'if (!sdkp)' tests with /* this can happen */ comment. I've digged changelog and found out that this was to prevent oops which occurs if some driver gets stuck inside -probe and the machine goes down and calls back -remove. IMHO, we should avoid this problem by fixing driver -probe or -remove callbacks instead of detecting and bypassing half-initialized/destroyed devices in pm callbacks. This patch read-locks a device's bus using device_pm_down_read_bus() before invoking any pm callback. Hi Tejun, There are talks about getting rid of bus's rwsem and replacing it with a per-device semaphore to serialize probe, remove, suspend and resume. This should resolve entire host of problems including this one, if I unrerstand it correctly. Please take a look here: http://seclists.org/lists/linux-kernel/2005/Mar/5847.html Yeap, sounds great. Hmmm.. as the final result will (and should) be the same for inidividual drivers (no overlapping callback invocations), how about incorporating my patch before implementing the proposed fix such that we can get rid of the awkward semantic first? The proposed change should change the same part of code anyway, so I don't think this would be a hassle. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html