Re: [PATCH] blk request timeout minor fixes...

2007-12-05 Thread malahal
Jens Axboe [EMAIL PROTECTED] wrote:
  This one fixes the 4-byte hole. Thank you very much.
 
 That's all fine, but now we have two fields doing essentially the same
 thing. Care to cleanup the -timeout usage so we can get by with using
 just that field?

I added another field because I couldn't find a way to use just one
field without breaking something! I will gladly clean it up if you
find a way to fix it.

Thanks, Malahal.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] blk request timeout minor fixes...

2007-12-05 Thread malahal
Jens Axboe [EMAIL PROTECTED] wrote:
 -timeout isn't used very much, so should not be too much work to
 console the two.
 
 If you see any specific obstacles, do list them!

I believe it is used in blk_add_timer() which may be called:
1) First time sending the command to LLDs
   No problem. It would be set to correct value at that time.
2) called from blk_rq_timed_out() in response to BLK_EH_RESET_TIMER
   Without the actual timeout value, there is no way to res-set timer
   correctly.
3) Retry
   Same #2.

The other way is to remove the deadline field. That poses timing out
issues.

Thanks, Malahal.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] blk request timeout minor fixes...

2007-12-03 Thread malahal
Matthew Wilcox [EMAIL PROTECTED] wrote:
 On Sun, Dec 02, 2007 at 05:53:30PM -0800, [EMAIL PROTECTED] wrote:
 Can I suggest running 'pahole' over this when compiled on 64-bit?
 You've just introduced a 4-byte hole.

This one fixes the 4-byte hole. Thank you very much.


diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 79ed268..5e95fa8 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -259,6 +259,7 @@ static void rq_init(struct request_queue *q, struct request 
*rq)
INIT_LIST_HEAD(rq-timeout_list);
 
rq-timeout = 0;
+   rq-deadline = 0;
rq-errors = 0;
rq-bio = rq-biotail = NULL;
INIT_HLIST_NODE(rq-hash);
@@ -3707,21 +3708,24 @@ static void blk_rq_timed_out(struct request *req)
 static void blk_rq_timed_out_timer(unsigned long data)
 {
struct request_queue *q = (struct request_queue *) data;
-   unsigned long flags, next = 0;
+   unsigned long flags, uninitialized_var(next), next_set = 0;
struct request *rq, *tmp;
 
spin_lock_irqsave(q-queue_lock, flags);
 
list_for_each_entry_safe(rq, tmp, q-timeout_list, timeout_list) {
-   if (!next || time_before(next, rq-timeout))
-   next = rq-timeout;
-   if (time_after_eq(jiffies, rq-timeout)) {
+   if (time_after_eq(jiffies, rq-deadline)) {
list_del_init(rq-timeout_list);
blk_rq_timed_out(rq);
}
+   if (!next_set) {
+   next = rq-deadline;
+   next_set = 1;
+   } else if (time_after(next, rq-deadline))
+   next = rq-deadline;
}
 
-   if (next)
+   if (next_set)
mod_timer(q-timeout, round_jiffies(next));
 
spin_unlock_irqrestore(q-queue_lock, flags);
@@ -3760,9 +3764,9 @@ void blk_add_timer(struct request *req)
BUG_ON(!list_empty(req-timeout_list));
 
if (req-timeout)
-   req-timeout += jiffies;
+   req-deadline = jiffies + req-timeout;
else
-   req-timeout = jiffies + q-rq_timeout;
+   req-deadline = jiffies + q-rq_timeout;
 
list_add_tail(req-timeout_list, q-timeout_list);
 
@@ -3771,7 +3775,7 @@ void blk_add_timer(struct request *req)
 * than an existing one, modify the timer. Round to next nearest
 * second.
 */
-   expiry = round_jiffies(req-timeout);
+   expiry = round_jiffies(req-deadline);
if (!timer_pending(q-timeout) ||
time_before(expiry, q-timeout.expires))
mod_timer(q-timeout, expiry);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 917fe86..834d097 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -295,8 +295,9 @@ struct request {
void *data;
void *sense;
 
-   unsigned long timeout;
+   unsigned long deadline;
struct list_head timeout_list;
+   unsigned int timeout;
int retries;
 
/*
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] blk request timeout minor fixes...

2007-12-02 Thread malahal
Adds deadline field to preserve the timeout on retries.  Calculates next
timeout for mod_timer correctly and uses an extra symbol, next_set, as
zero could be a valid 'next' value. Based off of timeout branch from Jens
git repo.

Signed-off-by: Malahal Naineni [EMAIL PROTECTED]

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 79ed268..6bf31f0 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -259,6 +259,7 @@ static void rq_init(struct request_queue *q, struct request 
*rq)
INIT_LIST_HEAD(rq-timeout_list);
 
rq-timeout = 0;
+   rq-deadline = 0;
rq-errors = 0;
rq-bio = rq-biotail = NULL;
INIT_HLIST_NODE(rq-hash);
@@ -3707,21 +3708,24 @@ static void blk_rq_timed_out(struct request *req)
 static void blk_rq_timed_out_timer(unsigned long data)
 {
struct request_queue *q = (struct request_queue *) data;
-   unsigned long flags, next = 0;
+   unsigned long flags, next, next_set = 0;
struct request *rq, *tmp;
 
spin_lock_irqsave(q-queue_lock, flags);
 
list_for_each_entry_safe(rq, tmp, q-timeout_list, timeout_list) {
-   if (!next || time_before(next, rq-timeout))
-   next = rq-timeout;
-   if (time_after_eq(jiffies, rq-timeout)) {
+   if (time_after_eq(jiffies, rq-deadline)) {
list_del_init(rq-timeout_list);
blk_rq_timed_out(rq);
}
+   if (!next_set) {
+   next = rq-deadline;
+   next_set = 1;
+   } else if (time_after(next, rq-deadline))
+   next = rq-deadline;
}
 
-   if (next)
+   if (next_set)
mod_timer(q-timeout, round_jiffies(next));
 
spin_unlock_irqrestore(q-queue_lock, flags);
@@ -3760,9 +3764,9 @@ void blk_add_timer(struct request *req)
BUG_ON(!list_empty(req-timeout_list));
 
if (req-timeout)
-   req-timeout += jiffies;
+   req-deadline = jiffies + req-timeout;
else
-   req-timeout = jiffies + q-rq_timeout;
+   req-deadline = jiffies + q-rq_timeout;
 
list_add_tail(req-timeout_list, q-timeout_list);
 
@@ -3771,7 +3775,7 @@ void blk_add_timer(struct request *req)
 * than an existing one, modify the timer. Round to next nearest
 * second.
 */
-   expiry = round_jiffies(req-timeout);
+   expiry = round_jiffies(req-deadline);
if (!timer_pending(q-timeout) ||
time_before(expiry, q-timeout.expires))
mod_timer(q-timeout, expiry);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 917fe86..e495d61 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -295,7 +295,8 @@ struct request {
void *data;
void *sense;
 
-   unsigned long timeout;
+   unsigned long deadline;
+   unsigned int timeout;
struct list_head timeout_list;
int retries;
 
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] gdth: scp timeout clean up

2007-11-06 Thread malahal
Boaz Harrosh [EMAIL PROTECTED] wrote:
  +static enum scsi_eh_timer_return gdth_timed_out(struct scsi_cmnd *scp)
  +{
  +   gdth_ha_str *ha = shost_priv(scp-device-host);
  +   struct gdth_cmndinfo *cmndinfo = gdth_get_cmndinfo(ha);
 
 a gdth_get_cmndinfo(ha) is for allocating a new cmndinfo out of free
 cmndinfo list, and must be paired by a call to gdth_put_cmndinfo(ha).
 And usually you want to put it on scp-host_scribble, other wise it will
 be lost. 
 To get the cmndinfo associated with a scsi_cmnd use gdth_cmnd_priv(scp) 

Thank you very much. It should be gdth_cmnd_priv(scp). I will fix it.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] gdth: scp timeout clean up

2007-11-05 Thread malahal
gdth driver is modified NOT to use scp-eh_timeout. Now, it has
eh_timed_out (gdth_timed_out) to handle command timeouts for locked
I/O's. Have not tested as I don't have needed hardware! Patch is
against 2.6.23-mm1.

Thank you Matthew Wilcox for your input on the IRC channel.

Signed-off-by: Malahal Naineni [EMAIL PROTECTED]

diff -r dbb45a1edd2a drivers/scsi/gdth.c
--- a/drivers/scsi/gdth.c   Mon Nov 05 21:32:26 2007 -0800
+++ b/drivers/scsi/gdth.c   Mon Nov 05 21:54:27 2007 -0800
@@ -2056,22 +2056,12 @@ static void gdth_putq(gdth_ha_str *ha, S
 register Scsi_Cmnd *pscp;
 register Scsi_Cmnd *nscp;
 ulong flags;
-unchar b, t;
 
 TRACE((gdth_putq() priority %d\n,priority));
 spin_lock_irqsave(ha-smp_lock, flags);
 
 if (!cmndinfo-internal_command) {
 cmndinfo-priority = priority;
-b = scp-device-channel;
-t = scp-device-id;
-if (priority = DEFAULT_PRI) {
-if ((b != ha-virt_bus  ha-raw[BUS_L2P(ha,b)].lock) ||
-(b==ha-virt_bus  tMAX_HDRIVES  ha-hdr[t].lock)) {
-TRACE2((gdth_putq(): locked IO -update_timeout()\n));
-cmndinfo-timeout = gdth_update_timeout(scp, 0);
-}
-}
 }
 
 if (ha-req_first==NULL) {
@@ -2359,7 +2349,7 @@ static void gdth_copy_internal_data(gdth
 {
 ushort cpcount,i, max_sg = gdth_sg_count(scp);
 ushort cpsum,cpnow;
-struct scatterlist *sl, *sg;
+struct scatterlist *sl;
 char *address;
 
 cpcount = min_t(ushort, count, gdth_bufflen(scp));
@@ -3938,6 +3928,38 @@ static const char *gdth_info(struct Scsi
 return ((const char *)ha-binfo.type_string);
 }
 
+static enum scsi_eh_timer_return gdth_timed_out(struct scsi_cmnd *scp)
+{
+   gdth_ha_str *ha = shost_priv(scp-device-host);
+   struct gdth_cmndinfo *cmndinfo = gdth_get_cmndinfo(ha);
+   unchar b, t;
+   ulong flags;
+   enum scsi_eh_timer_return retval = EH_NOT_HANDLED;
+
+   TRACE((%s() cmd 0x%x\n, scp-cmnd[0], __FUNCTION__));
+   b = scp-device-channel;
+   t = scp-device-id;
+
+   /*
+* We don't really honor the command timeout, but we try to
+* honor 6 times of the actual command timeout! So reset the
+* timer if this is less than 6th timeout on this command!
+*/
+   if (++cmndinfo-timeout_count  6)
+   retval = EH_RESET_TIMER;
+
+   /* Reset the timeout if it is locked IO */
+   spin_lock_irqsave(ha-smp_lock, flags);
+   if ((b != ha-virt_bus  ha-raw[BUS_L2P(ha, b)].lock) ||
+   (b == ha-virt_bus  t  MAX_HDRIVES  ha-hdr[t].lock)) {
+   TRACE2((%s(): locked IO, reset timeout\n, __FUNCTION__));
+   retval = EH_RESET_TIMER;
+   }
+   spin_unlock_irqrestore(ha-smp_lock, flags);
+
+   return retval;
+}
+
 static int gdth_eh_bus_reset(Scsi_Cmnd *scp)
 {
 gdth_ha_str *ha = shost_priv(scp-device-host);
@@ -4031,7 +4053,7 @@ static int gdth_queuecommand(struct scsi
 BUG_ON(!cmndinfo);
 
 scp-scsi_done = done;
-gdth_update_timeout(scp, scp-timeout_per_command * 6);
+cmndinfo-timeout_count = 0;
 cmndinfo-priority = DEFAULT_PRI;
 
 gdth_set_bufflen(scp, scsi_bufflen(scp));
@@ -4137,12 +4159,10 @@ static int ioc_lockdrv(void __user *arg)
 ha-hdr[j].lock = 1;
 spin_unlock_irqrestore(ha-smp_lock, flags);
 gdth_wait_completion(ha, ha-bus_cnt, j);
-gdth_stop_timeout(ha, ha-bus_cnt, j);
 } else {
 spin_lock_irqsave(ha-smp_lock, flags);
 ha-hdr[j].lock = 0;
 spin_unlock_irqrestore(ha-smp_lock, flags);
-gdth_start_timeout(ha, ha-bus_cnt, j);
 gdth_next(ha);
 }
 } 
@@ -4582,14 +4602,12 @@ static int gdth_ioctl(struct inode *inod
 spin_unlock_irqrestore(ha-smp_lock, flags);
 for (j = 0; j  ha-tid_cnt; ++j) {
 gdth_wait_completion(ha, i, j);
-gdth_stop_timeout(ha, i, j);
 }
 } else {
 spin_lock_irqsave(ha-smp_lock, flags);
 ha-raw[i].lock = 0;
 spin_unlock_irqrestore(ha-smp_lock, flags);
 for (j = 0; j  ha-tid_cnt; ++j) {
-gdth_start_timeout(ha, i, j);
 gdth_next(ha);
 }
 }
@@ -4724,6 +4742,7 @@ static struct scsi_host_template gdth_te
 .slave_configure= gdth_slave_configure,
 .bios_param = gdth_bios_param,
 .proc_info  = gdth_proc_info,
+   .eh_timed_out   = gdth_timed_out,
 .proc_name  = gdth,
 .can_queue  = GDTH_MAXCMDS,
 .this_id= -1,
diff -r dbb45a1edd2a drivers/scsi/gdth.h
--- a/drivers/scsi/gdth.h   Mon Nov 05 21:32:26 2007 -0800
+++ b/drivers/scsi/gdth.h   Mon Nov 05 21:55:15 2007 -0800
@@ -917,7 +917,7 @@ typedef

Re: [RFC] [PATCH 1/1] blk request timeout handler patches

2007-10-23 Thread malahal
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Jens Axboe [EMAIL PROTECTED] wrote:
  Current code is below, btw. Not a lot of changes, iirc it's just the
  delete-always, a missing export, delete timer on empty list.
  
  +void blk_add_timer(struct request *req)
  +{
  +   struct request_queue *q = req-q;
  +   unsigned long expiry;
  +
  +   BUG_ON(!list_empty(req-timeout_list));
  +
  +   if (req-timeout)
  +   req-timeout += jiffies;
  +   else
  +   req-timeout = jiffies + q-rq_timeout;
  +
 
 The meaning of req-timeout is changed here. It was a timeout value, now
 it became an expiry time. If we happen to requeue the request, its
 timeout would be incorrect. Found using scsi_debug! Maybe, make it a
 bitfield to avoid another field...

Sorry, bitfield won't work!
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] [PATCH 1/1] blk request timeout handler patches

2007-10-22 Thread malahal
Jens Axboe [EMAIL PROTECTED] wrote:
 Current code is below, btw. Not a lot of changes, iirc it's just the
 delete-always, a missing export, delete timer on empty list.
 
 +void blk_add_timer(struct request *req)
 +{
 + struct request_queue *q = req-q;
 + unsigned long expiry;
 +
 + BUG_ON(!list_empty(req-timeout_list));
 +
 + if (req-timeout)
 + req-timeout += jiffies;
 + else
 + req-timeout = jiffies + q-rq_timeout;
 +

The meaning of req-timeout is changed here. It was a timeout value, now
it became an expiry time. If we happen to requeue the request, its
timeout would be incorrect. Found using scsi_debug! Maybe, make it a
bitfield to avoid another field...

 + list_add_tail(req-timeout_list, q-timeout_list);
 +
 + /*
 +  * If the timer isn't already pending or this timeout is earlier
 +  * than an existing one, modify the timer. Round to next nearest
 +  * second.
 +  */
 + expiry = round_jiffies(req-timeout);
 + if (!timer_pending(q-timeout) ||
 + time_before(expiry, q-timeout.expires))
 + mod_timer(q-timeout, expiry);

How about changing q-timeout to q-timer?

 diff --git a/drivers/scsi/gdth_proc.c b/drivers/scsi/gdth_proc.c
 index 32982eb..22a9013 100644
 --- a/drivers/scsi/gdth_proc.c
 +++ b/drivers/scsi/gdth_proc.c
 @@ -846,19 +846,19 @@ static int gdth_update_timeout(int hanum, Scsi_Cmnd 
 *scp, int timeout)
  {
  int oldto;
 
 -oldto = scp-timeout_per_command;
 -scp-timeout_per_command = timeout;
 +oldto = scp-request-timeout;
 +scp-request-timeout = timeout;
 
  if (timeout == 0) {
 -del_timer(scp-eh_timeout);
 -scp-eh_timeout.data = (unsigned long) NULL;
 -scp-eh_timeout.expires = 0;
 +del_timer(scp-request-timer);
 +scp-request-timer.data = (unsigned long) NULL;
 +scp-request-timer.expires = 0;

request doesn't have timer field, code compilation fails for this
file.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] [PATCH 1/1] blk request timeout handler patches

2007-10-11 Thread malahal
Jens Axboe [EMAIL PROTECTED] wrote:
 On Tue, Oct 09 2007, James Bottomley wrote:
  On Tue, 2007-10-09 at 14:15 +0200, Jens Axboe wrote:
   On Tue, Oct 09 2007, Matthew Wilcox wrote:
On Mon, Oct 08, 2007 at 10:36:10PM -0700, [EMAIL PROTECTED] wrote:
 Thank you Randy, Jens for your suggestions. I folded the second patch 
 as
 it is just a clean up. Here is the fixed one patch version.

I was thinking about this (in the context of shrinking scsi_cmnd --
obviously, things are not improved if we simply move the timer to 
request
instead of scsi_cmnd).  Why do we need a timer _per request_?  We don't
need one per network packet.  I appreciate we had one per scsi_cmnd and
this patch is just moving it upwards in the hierarchy, but perhaps we
can do better.

What if we have one timer per request queue instead?  It needs to expire
as soon as the earliest request timer would expire, then needs to be
reset to the next earliest one.  We might walk the request queue more
frequently, but we'd save 48 bytes in the struct request.
   
   I agree, adding a full timer to each request is not nice. You jump over
   the actual implementation details of having just one timer in the queue
   though, it's pretty cheap to just say it can be done :-). You need to
   track each request anyways. If all drivers used the block layer tagging
   it would be easy since we are tracking each pending request in that
   case, but right now they don't. So pending requests may very well be
   outside of block layer knowledge.
  
  Can't we handle this a bit like the Linux timer infrastructure?  Instead
  of a timer per cmnd we have one per queue that's reset by commands
  returning?  If we retained a linked list of commands in timer order and
  expiry times, that's still going to save us an unsigned long and two
  pointers over struct timer_list.
 
 Here's an approach that uses a single timer. I purposely do not sort
 commands in expiry times, as that would introduce an O(N) operation far
 out weighing the IO scheduling cost, pretty silly for a timeout
 mechanism that supposedly should never trigger. Or we could waste more
 memory and fewer cycles (but still far more cycles than we need) by
 sorting in some sort of tree.
 
 So I don't sort the list, instead I push the cost of locating expired
 request to the timeout handler. It should really only run, when a
 request times out. Timeout is then reset to next command timeout
 (rounded), if further commands exist. It also doesn't fiddle with the
 timer unless an incoming command has a lower timeout than the current
 one.
 
 Totally untested...
 
 diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
 index ed39313..cb3210a 100644
 --- a/block/ll_rw_blk.c
 +++ b/block/ll_rw_blk.c
 @@ -42,6 +42,7 @@ static void drive_stat_acct(struct request *rq, int 
 nr_sectors, int new_io);
  static void init_request_from_bio(struct request *req, struct bio *bio);
  static int __make_request(struct request_queue *q, struct bio *bio);
  static struct io_context *current_io_context(gfp_t gfp_flags, int node);
 +static void blk_rq_timed_out_timer(unsigned long);
 
  /*
   * For the allocated request tables
 @@ -177,6 +178,18 @@ void blk_queue_softirq_done(struct request_queue *q, 
 softirq_done_fn *fn)
 
  EXPORT_SYMBOL(blk_queue_softirq_done);
 
 +void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
 +{
 + q-rq_timeout = timeout;
 +}
 +EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
 +
 +void blk_queue_rq_timed_out(struct request_queue *q, rq_timed_out_fn *fn)
 +{
 + q-rq_timed_out_fn = fn;
 +}
 +EXPORT_SYMBOL_GPL(blk_queue_rq_timed_out);
 +
  /**
   * blk_queue_make_request - define an alternate make_request function for a 
 device
   * @q:  the request queue for the device to be affected
 @@ -239,7 +252,9 @@ static void rq_init(struct request_queue *q, struct 
 request *rq)
  {
   INIT_LIST_HEAD(rq-queuelist);
   INIT_LIST_HEAD(rq-donelist);
 + INIT_LIST_HEAD(rq-timeout_list);
 
 + rq-timeout = 0;
   rq-errors = 0;
   rq-bio = rq-biotail = NULL;
   INIT_HLIST_NODE(rq-hash);
 @@ -1851,6 +1866,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t 
 gfp_mask, int node_id)
   return NULL;
 
   init_timer(q-unplug_timer);
 + setup_timer(q-timeout, blk_rq_timed_out_timer, (unsigned long) q);
 + INIT_LIST_HEAD(q-timeout_list);
 
   snprintf(q-kobj.name, KOBJ_NAME_LEN, %s, queue);
   q-kobj.ktype = queue_ktype;
 @@ -2271,6 +2288,7 @@ EXPORT_SYMBOL(blk_start_queueing);
   */
  void blk_requeue_request(struct request_queue *q, struct request *rq)
  {
 + blk_delete_timer(rq);
   blk_add_trace_rq(q, rq, BLK_TA_REQUEUE);
 
   if (blk_rq_tagged(rq))
 @@ -3600,24 +3618,132 @@ static struct notifier_block __devinitdata 
 blk_cpu_notifier = {
  };
 
  /**
 - * blk_complete_request - end I/O on a request
 - * @req:  the request being processed
 + * blk_delete_timer - 

Re: [RFC] [PATCH 1/1] blk request timeout handler patches

2007-10-10 Thread malahal
I don't see blk_delete_timer() actually calling mod_timer/del_timer at
all.  Doesn't that mean, your timer would eventually expire for no
reason and walk through the list unnecessarily?

Thanks, Malahal.

Jens Axboe [EMAIL PROTECTED] wrote:
 On Tue, Oct 09 2007, James Bottomley wrote:
  On Tue, 2007-10-09 at 14:15 +0200, Jens Axboe wrote:
   On Tue, Oct 09 2007, Matthew Wilcox wrote:
On Mon, Oct 08, 2007 at 10:36:10PM -0700, [EMAIL PROTECTED] wrote:
 Thank you Randy, Jens for your suggestions. I folded the second patch 
 as
 it is just a clean up. Here is the fixed one patch version.

I was thinking about this (in the context of shrinking scsi_cmnd --
obviously, things are not improved if we simply move the timer to 
request
instead of scsi_cmnd).  Why do we need a timer _per request_?  We don't
need one per network packet.  I appreciate we had one per scsi_cmnd and
this patch is just moving it upwards in the hierarchy, but perhaps we
can do better.

What if we have one timer per request queue instead?  It needs to expire
as soon as the earliest request timer would expire, then needs to be
reset to the next earliest one.  We might walk the request queue more
frequently, but we'd save 48 bytes in the struct request.
   
   I agree, adding a full timer to each request is not nice. You jump over
   the actual implementation details of having just one timer in the queue
   though, it's pretty cheap to just say it can be done :-). You need to
   track each request anyways. If all drivers used the block layer tagging
   it would be easy since we are tracking each pending request in that
   case, but right now they don't. So pending requests may very well be
   outside of block layer knowledge.
  
  Can't we handle this a bit like the Linux timer infrastructure?  Instead
  of a timer per cmnd we have one per queue that's reset by commands
  returning?  If we retained a linked list of commands in timer order and
  expiry times, that's still going to save us an unsigned long and two
  pointers over struct timer_list.
 
 Here's an approach that uses a single timer. I purposely do not sort
 commands in expiry times, as that would introduce an O(N) operation far
 out weighing the IO scheduling cost, pretty silly for a timeout
 mechanism that supposedly should never trigger. Or we could waste more
 memory and fewer cycles (but still far more cycles than we need) by
 sorting in some sort of tree.
 
 So I don't sort the list, instead I push the cost of locating expired
 request to the timeout handler. It should really only run, when a
 request times out. Timeout is then reset to next command timeout
 (rounded), if further commands exist. It also doesn't fiddle with the
 timer unless an incoming command has a lower timeout than the current
 one.
 
 Totally untested...
 
 diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
 index ed39313..cb3210a 100644
 --- a/block/ll_rw_blk.c
 +++ b/block/ll_rw_blk.c
 @@ -42,6 +42,7 @@ static void drive_stat_acct(struct request *rq, int 
 nr_sectors, int new_io);
  static void init_request_from_bio(struct request *req, struct bio *bio);
  static int __make_request(struct request_queue *q, struct bio *bio);
  static struct io_context *current_io_context(gfp_t gfp_flags, int node);
 +static void blk_rq_timed_out_timer(unsigned long);
 
  /*
   * For the allocated request tables
 @@ -177,6 +178,18 @@ void blk_queue_softirq_done(struct request_queue *q, 
 softirq_done_fn *fn)
 
  EXPORT_SYMBOL(blk_queue_softirq_done);
 
 +void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
 +{
 + q-rq_timeout = timeout;
 +}
 +EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
 +
 +void blk_queue_rq_timed_out(struct request_queue *q, rq_timed_out_fn *fn)
 +{
 + q-rq_timed_out_fn = fn;
 +}
 +EXPORT_SYMBOL_GPL(blk_queue_rq_timed_out);
 +
  /**
   * blk_queue_make_request - define an alternate make_request function for a 
 device
   * @q:  the request queue for the device to be affected
 @@ -239,7 +252,9 @@ static void rq_init(struct request_queue *q, struct 
 request *rq)
  {
   INIT_LIST_HEAD(rq-queuelist);
   INIT_LIST_HEAD(rq-donelist);
 + INIT_LIST_HEAD(rq-timeout_list);
 
 + rq-timeout = 0;
   rq-errors = 0;
   rq-bio = rq-biotail = NULL;
   INIT_HLIST_NODE(rq-hash);
 @@ -1851,6 +1866,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t 
 gfp_mask, int node_id)
   return NULL;
 
   init_timer(q-unplug_timer);
 + setup_timer(q-timeout, blk_rq_timed_out_timer, (unsigned long) q);
 + INIT_LIST_HEAD(q-timeout_list);
 
   snprintf(q-kobj.name, KOBJ_NAME_LEN, %s, queue);
   q-kobj.ktype = queue_ktype;
 @@ -2271,6 +2288,7 @@ EXPORT_SYMBOL(blk_start_queueing);
   */
  void blk_requeue_request(struct request_queue *q, struct request *rq)
  {
 + blk_delete_timer(rq);
   blk_add_trace_rq(q, rq, BLK_TA_REQUEUE);
 
   if (blk_rq_tagged(rq))
 @@ -3600,24

Re: [RFC] [PATCH 1/1] blk request timeout handler patches

2007-10-09 Thread malahal
Here is another attempt! Thank you.


diff -r 1b51503899a0 block/ll_rw_blk.c
--- a/block/ll_rw_blk.c Thu Sep 27 00:25:15 2007 -0700
+++ b/block/ll_rw_blk.c Tue Oct 09 07:00:51 2007 -0700
@@ -181,6 +181,18 @@ void blk_queue_softirq_done(struct reque
 
 EXPORT_SYMBOL(blk_queue_softirq_done);
 
+void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
+{
+   q-rq_timeout = timeout;
+}
+EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
+
+void blk_queue_rq_timed_out(struct request_queue *q, rq_timed_out_fn *fn)
+{
+   q-rq_timed_out_fn = fn;
+}
+EXPORT_SYMBOL_GPL(blk_queue_rq_timed_out);
+
 /**
  * blk_queue_make_request - define an alternate make_request function for a 
device
  * @q:  the request queue for the device to be affected
@@ -243,7 +255,9 @@ static void rq_init(struct request_queue
 {
INIT_LIST_HEAD(rq-queuelist);
INIT_LIST_HEAD(rq-donelist);
-
+   init_timer(rq-timer);
+
+   rq-timeout = 0;
rq-errors = 0;
rq-bio = rq-biotail = NULL;
INIT_HLIST_NODE(rq-hash);
@@ -2305,6 +2319,7 @@ EXPORT_SYMBOL(blk_start_queueing);
  */
 void blk_requeue_request(struct request_queue *q, struct request *rq)
 {
+   blk_delete_timer(rq);
blk_add_trace_rq(q, rq, BLK_TA_REQUEUE);
 
if (blk_rq_tagged(rq))
@@ -3647,8 +3662,109 @@ static struct notifier_block blk_cpu_not
 };
 
 /**
+ * blk_delete_timer - Delete/cancel timer for a given function.
+ * @req:   request that we are canceling timer for
+ *
+ * Return value:
+ * 1 if we were able to detach the timer.  0 if we blew it, and the
+ * timer function has already started to run.
+ */
+int blk_delete_timer(struct request *req)
+{
+   if (!req-q-rq_timed_out_fn)
+   return 1;
+
+   return del_timer(req-timer);
+}
+EXPORT_SYMBOL_GPL(blk_delete_timer);
+
+static void blk_rq_timed_out(struct request *req)
+{
+   struct request_queue *q = req-q;
+
+   switch (q-rq_timed_out_fn(req)) {
+   case BLK_EH_HANDLED:
+   __blk_complete_request(req);
+   return;
+   case BLK_EH_RESET_TIMER:
+   blk_add_timer(req);
+   return;
+   case BLK_EH_NOT_HANDLED:
+   /*
+* LLD handles this for now but in the future
+* we can send a request msg to abort the command
+* and we can move more of the generic scsi eh code to
+* the blk layer.
+*/
+   return;
+   }
+}
+
+static void blk_rq_timed_out_timer(unsigned long data)
+{
+   struct request *req = (struct request *)data;
+
+   blk_rq_timed_out(req);
+}
+
+/**
+ * blk_abort_req -- Request request recovery for the specified command
+ * @req:   pointer to the request of interest
+ *
+ * This function requests that the block layer start recovery for the
+ * request by deleting the timer and calling the q's timeout function.
+ * LLDDs who implement their own error recovery MAY ignore the timeout
+ * event if they generated blk_abort_req.
+ */
+void blk_abort_req(struct request *req)
+{
+if (!blk_delete_timer(req))
+return;
+blk_rq_timed_out(req);
+}
+EXPORT_SYMBOL_GPL(blk_abort_req);
+
+/**
+ * blk_add_timer - Start timeout timer for a single request
+ * @req:   request that is about to start running.
+ *
+ * Notes:
+ *Each request has its own timer, and as it is added to the queue, we
+ *set up the timer.  When the request completes, we cancel the timer.
+ */
+void blk_add_timer(struct request *req)
+{
+   struct request_queue *q = req-q;
+
+   req-timer.data = (unsigned long)req;
+   if (req-timeout)
+   req-timer.expires = jiffies + req-timeout;
+   else
+   req-timer.expires = jiffies + q-rq_timeout;
+   req-timer.function = blk_rq_timed_out_timer;
+   add_timer(req-timer);
+}
+EXPORT_SYMBOL_GPL(blk_add_timer);
+
+void __blk_complete_request(struct request *req)
+{
+   struct list_head *cpu_list;
+   unsigned long flags;
+
+   BUG_ON(!req-q-softirq_done_fn);
+
+   local_irq_save(flags);
+
+   cpu_list = __get_cpu_var(blk_cpu_done);
+   list_add_tail(req-donelist, cpu_list);
+   raise_softirq_irqoff(BLOCK_SOFTIRQ);
+
+   local_irq_restore(flags);
+}
+
+/**
  * blk_complete_request - end I/O on a request
- * @req:  the request being processed
+ * @req:   the request being processed
  *
  * Description:
  * Ends all I/O on a request. It does not handle partial completions,
@@ -3656,26 +3772,24 @@ static struct notifier_block blk_cpu_not
  * through requeueing. The actual completion happens out-of-order,
  * through a softirq handler. The user must have registered a completion
  * callback through blk_queue_softirq_done().
- **/
-
+ */
 void blk_complete_request(struct request *req)
 {
-   struct list_head *cpu_list;
-   unsigned long flags;
-
-   BUG_ON(!req-q-softirq_done_fn);
-   

Re: [RFC] [PATCH 1/1] blk request timeout handler patches

2007-10-09 Thread malahal
If we don't need precise timeout, we can have one single timer running
at regular intervals, say every second. We also record the exact number
of requests that are supposed to be done in a particular interval. This
is done by computing the 'interval index' of a request, incrementing it
when we send the request and decrementing the same when the request is
completed. Timer can detect if all requests are done for that interval
by looking at the 'interval number'

In this method, timer can detect a timeout but doesn't know which
command has timed out though. Aborting all pending requests is one
option. As we don't start timer for each request, there is no need to
stop any timer when the request is done.

Thanks, Malahal.

James Bottomley [EMAIL PROTECTED] wrote:
 On Tue, 2007-10-09 at 14:15 +0200, Jens Axboe wrote:
  On Tue, Oct 09 2007, Matthew Wilcox wrote:
   On Mon, Oct 08, 2007 at 10:36:10PM -0700, [EMAIL PROTECTED] wrote:
Thank you Randy, Jens for your suggestions. I folded the second patch as
it is just a clean up. Here is the fixed one patch version.
   
   I was thinking about this (in the context of shrinking scsi_cmnd --
   obviously, things are not improved if we simply move the timer to request
   instead of scsi_cmnd).  Why do we need a timer _per request_?  We don't
   need one per network packet.  I appreciate we had one per scsi_cmnd and
   this patch is just moving it upwards in the hierarchy, but perhaps we
   can do better.
   
   What if we have one timer per request queue instead?  It needs to expire
   as soon as the earliest request timer would expire, then needs to be
   reset to the next earliest one.  We might walk the request queue more
   frequently, but we'd save 48 bytes in the struct request.
  
  I agree, adding a full timer to each request is not nice. You jump over
  the actual implementation details of having just one timer in the queue
  though, it's pretty cheap to just say it can be done :-). You need to
  track each request anyways. If all drivers used the block layer tagging
  it would be easy since we are tracking each pending request in that
  case, but right now they don't. So pending requests may very well be
  outside of block layer knowledge.
 
 Can't we handle this a bit like the Linux timer infrastructure?  Instead
 of a timer per cmnd we have one per queue that's reset by commands
 returning?  If we retained a linked list of commands in timer order and
 expiry times, that's still going to save us an unsigned long and two
 pointers over struct timer_list.
 
  You'd also end up using lots of extra cycles to find and move the timer
  when it expires (not likely) or is deleted (most likely). I'd greatly
  prefer the space overhead in this case, even if it is quite costly.
 
 If there's another command on the linked list, you mod_timer otherwise
 you del_timer_sync.  Returning commands obviously delete from this list.
 
  There's also the issue of fiddling with just one vs many timers,
  potentially more cache bouncy.
 
 Yes, I agree it's more fiddly.
 
 The beauty is that I think it can all be done at the block layer via
 helpers ... so it becomes Somebody Else's Problem from the SCSI point of
 view ...
 
 James
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] [PATCH 1/2] blk request timeout handler patches

2007-10-08 Thread malahal
Jens Axboe [EMAIL PROTECTED] wrote:
 On Thu, Oct 04 2007, [EMAIL PROTECTED] wrote:
  Mike Christie's patches refreshed to 2.6.23-rc8-mm1.
  
  Signed-off-by: Mike Christie [EMAIL PROTECTED]
  Signed-off-by: Malahal Naineni [EMAIL PROTECTED]
  
  
  diff -r 3697367c6e4d block/ll_rw_blk.c
  --- a/block/ll_rw_blk.c Thu Sep 27 00:12:13 2007 -0700
  +++ b/block/ll_rw_blk.c Thu Sep 27 00:13:07 2007 -0700
  @@ -181,6 +181,19 @@ void blk_queue_softirq_done(struct reque
   
   EXPORT_SYMBOL(blk_queue_softirq_done);
   
  +void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
  +{
  +   q-rq_timeout = timeout;
  +}
  +EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
  +
  +void blk_queue_rq_timed_out(struct request_queue *q, rq_timed_out_fn *fn)
  +{
  +   q-rq_timed_out_fn = fn;
  +}
  +
  +EXPORT_SYMBOL_GPL(blk_queue_rq_timed_out);
  +
   /**
* blk_queue_make_request - define an alternate make_request function for 
  a device
* @q:  the request queue for the device to be affected
  @@ -243,7 +256,9 @@ static void rq_init(struct request_queue
   {
  INIT_LIST_HEAD(rq-queuelist);
  INIT_LIST_HEAD(rq-donelist);
  -
  +   init_timer(rq-timer);
  +
  +   rq-timeout = 0;
  rq-errors = 0;
  rq-bio = rq-biotail = NULL;
  INIT_HLIST_NODE(rq-hash);
  @@ -2305,6 +2320,7 @@ EXPORT_SYMBOL(blk_start_queueing);
*/
   void blk_requeue_request(struct request_queue *q, struct request *rq)
   {
  +   blk_delete_timer(rq);
  blk_add_trace_rq(q, rq, BLK_TA_REQUEUE);
   
  if (blk_rq_tagged(rq))
  @@ -3647,8 +3663,121 @@ static struct notifier_block blk_cpu_not
   };
   
   /**
  + * blk_delete_timer - Delete/cancel timer for a given function.
  + * @req:   request that we are canceling timer for
  + *
  + * Return value:
  + * 1 if we were able to detach the timer.  0 if we blew it, and the
  + * timer function has already started to run.
  + **/
  +int blk_delete_timer(struct request *req)
  +{
  +   int rtn;
  +
  +   if (!req-q-rq_timed_out_fn)
  +   return 1;
  +
  +   rtn = del_timer(req-timer);
  +   req-timer.data = (unsigned long)NULL;
  +   req-timer.function = NULL;
  +
  +   return rtn;
  +}
 
 Hmm, that looks fishy. What if the timer _just_ fired, yet it hasn't
 retrieved -data yet?
 

Looks like it is just copied from the SCSI code. More over _run_timers()
copies data and function on to stack variables before calling the timer
function. So, I don't think there is a problem here but assuming that
implementation detail is bad! I would like modify the above function to
reinitialize data and function fields only if del_timer() returns TRUE.
Please let me know if that is OK.

  +
  +EXPORT_SYMBOL_GPL(blk_delete_timer);
  +
  +static void blk_rq_timed_out(struct request *req)
  +{
  +   struct request_queue *q = req-q;
  +
  +   switch (q-rq_timed_out_fn(req)) {
  +   case BLK_EH_HANDLED:
  +   __blk_complete_request(req);
  +   return;
  +   case BLK_EH_RESET_TIMER:
  +   blk_add_timer(req);
  +   return;
  +   case BLK_EH_NOT_HANDLED:
  +   /*
  +* LLD handles this for now but in the future
  +* we can send a request msg to abort the command
  +* and we can move more of the generic scsi eh code to
  +* the blk layer.
  +*/
  +   return;
  +   }
  +}
  +
  +/**
  + * blk_abort_req -- Request request recovery for the specified command
  + * req:pointer to the request of interest
  + *
  + * This function requests that the block layer start recovery for the
  + * request by deleting the timer and calling the q's timeout function.
  + * LLDDs who implement their own error recovery MAY ignore the timeout
  + * event if they generated blk_abort_req.
  + */
  +void blk_abort_req(struct request *req)
  +{
  +if (!blk_delete_timer(req))
  +return;
  +blk_rq_timed_out(req);
  +}
  +
  +EXPORT_SYMBOL_GPL(blk_abort_req);
  +
  +/**
  + * blk_add_timer - Start timeout timer for a single request
  + * @req:   request that is about to start running.
  + *
  + * Notes:
  + *Each request has its own timer, and as it is added to the queue, we
  + *set up the timer.  When the request completes, we cancel the timer.
  + **/
  +void blk_add_timer(struct request *req)
  +{
  +   struct request_queue *q = req-q;
  +
  +   /*
  +* If the clock was already running for this command, then
  +* first delete the timer.  The timer handling code gets rather
  +* confused if we don't do this.
  +*/
  +   if (req-timer.function)
  +   del_timer(req-timer);
  +
  +   req-timer.data = (unsigned long)req;
  +   if (req-timeout)
  +   req-timer.expires = jiffies + req-timeout;
  +   else
  +   req-timer.expires = jiffies + q-rq_timeout;
  +   req-timer.function = (void (*)(unsigned long))blk_rq_timed_out;
 
 Why the cast?
 

Because blk_rq_timed_out() doesn't take unsigned long as its input

[RFC] [PATCH 0/2] blk request timeout handler patches

2007-10-04 Thread malahal
I refreshed the blk layer timeout patch from Mike Christie to 2.6.23-rc8-mm1.
Fixed couple of places where scsi_dispatch_cmd() wasn't stopping timers.

Thanks, Malahal.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC] [PATCH 2/2] blk request timeout handler patches

2007-10-04 Thread malahal
Fix scsi_dispatch_cmd() to stop timers.

Signed-off-by: Malahal Naineni [EMAIL PROTECTED]


diff -r 870bb598c977 drivers/scsi/scsi.c
--- a/drivers/scsi/scsi.c   Thu Sep 27 00:25:38 2007 -0700
+++ b/drivers/scsi/scsi.c   Thu Sep 27 01:04:10 2007 -0700
@@ -471,14 +471,19 @@ int scsi_dispatch_cmd(struct scsi_cmnd *
unsigned long timeout;
int rtn = 0;
 
+   /*
+* We will use a queued command if possible, otherwise we will
+* emulate the queuing and calling of completion function ourselves.
+*/
+   atomic_inc(cmd-device-iorequest_cnt);
+
/* check if the device is still usable */
if (unlikely(cmd-device-sdev_state == SDEV_DEL)) {
/* in SDEV_DEL we error all commands. DID_NO_CONNECT
 * returns an immediate error upwards, and signals
 * that the device is no longer present */
cmd-result = DID_NO_CONNECT  16;
-   atomic_inc(cmd-device-iorequest_cnt);
-   __blk_complete_request(cmd-request);
+   scsi_done(cmd);
/* return 0 (because the command has been processed) */
goto out;
}
@@ -491,7 +496,8 @@ int scsi_dispatch_cmd(struct scsi_cmnd *
 * future requests should not occur until the device 
 * transitions out of the suspend state.
 */
-   scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
+
+   scsi_queue_retry(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
 
SCSI_LOG_MLQUEUE(3, printk(queuecommand : device blocked \n));
 
@@ -536,12 +542,6 @@ int scsi_dispatch_cmd(struct scsi_cmnd *
scsi_log_send(cmd);
 
/*
-* We will use a queued command if possible, otherwise we will
-* emulate the queuing and calling of completion function ourselves.
-*/
-   atomic_inc(cmd-device-iorequest_cnt);
-
-   /*
 * Before we queue this command, check if the command
 * length exceeds what the host adapter can handle.
 */
@@ -571,12 +571,8 @@ int scsi_dispatch_cmd(struct scsi_cmnd *
}
spin_unlock_irqrestore(host-host_lock, flags);
if (rtn) {
-   if (blk_delete_timer(cmd-request)) {
-   atomic_inc(cmd-device-iodone_cnt);
-   scsi_queue_insert(cmd,
- (rtn == SCSI_MLQUEUE_DEVICE_BUSY) ?
- rtn : SCSI_MLQUEUE_HOST_BUSY);
-   }
+   scsi_queue_retry(cmd, (rtn == SCSI_MLQUEUE_DEVICE_BUSY) ?
+   rtn : SCSI_MLQUEUE_HOST_BUSY);
SCSI_LOG_MLQUEUE(3,
printk(queuecommand : request rejected\n));
}
diff -r 870bb598c977 drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c   Thu Sep 27 00:25:38 2007 -0700
+++ b/drivers/scsi/scsi_lib.c   Thu Sep 27 01:16:28 2007 -0700
@@ -160,6 +160,36 @@ int scsi_queue_insert(struct scsi_cmnd *
 
return 0;
 }
+
+/*
+ * Function:scsi_queue_retry()
+ *
+ * Purpose: Try inserting a command in the midlevel queue.
+ *
+ * Arguments:   cmd- command that we are adding to queue.
+ *  reason - why we are inserting command to queue.
+ *
+ * Lock status: Assumed that lock is not held upon entry.
+ *
+ * Returns: Nothing.
+ *
+ * Notes:   This is very similar to scsi_queue_insert except that we
+ *  call this function when we don't know if the blk layer timer
+ *  is active or not. We could implement this either by calling
+ *  blk_delete_timer and inserting in the midlevel queue if we
+ *  successfully delete the timer OR setting appropriate result
+ *  field in the cmd and letting it go through the normal done
+ *  routines which will retry the command. For now, We call
+ *  blk_delete_timer!
+ */
+void scsi_queue_retry(struct scsi_cmnd *cmd, int reason)
+{
+   if (blk_delete_timer(cmd-request)) {
+   atomic_inc(cmd-device-iodone_cnt);
+   scsi_queue_insert(cmd, reason);
+   }
+}
+
 
 /**
  * scsi_execute - insert request and wait for the result
diff -r 870bb598c977 drivers/scsi/scsi_priv.h
--- a/drivers/scsi/scsi_priv.h  Thu Sep 27 00:25:38 2007 -0700
+++ b/drivers/scsi/scsi_priv.h  Thu Sep 27 01:03:39 2007 -0700
@@ -64,6 +64,7 @@ extern int scsi_maybe_unblock_host(struc
 extern int scsi_maybe_unblock_host(struct scsi_device *sdev);
 extern void scsi_device_unbusy(struct scsi_device *sdev);
 extern int scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
+extern void scsi_queue_retry(struct scsi_cmnd *cmd, int reason);
 extern void scsi_next_command(struct scsi_cmnd *cmd);
 extern void scsi_run_host_queues(struct Scsi_Host *shost);
 extern struct request_queue *scsi_alloc_queue(struct scsi_device *sdev);
-
To unsubscribe from this list: send the line unsubscribe

[RFC] [PATCH 1/2] blk request timeout handler patches

2007-10-04 Thread malahal
Mike Christie's patches refreshed to 2.6.23-rc8-mm1.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
Signed-off-by: Malahal Naineni [EMAIL PROTECTED]


diff -r 3697367c6e4d block/ll_rw_blk.c
--- a/block/ll_rw_blk.c Thu Sep 27 00:12:13 2007 -0700
+++ b/block/ll_rw_blk.c Thu Sep 27 00:13:07 2007 -0700
@@ -181,6 +181,19 @@ void blk_queue_softirq_done(struct reque
 
 EXPORT_SYMBOL(blk_queue_softirq_done);
 
+void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
+{
+   q-rq_timeout = timeout;
+}
+EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
+
+void blk_queue_rq_timed_out(struct request_queue *q, rq_timed_out_fn *fn)
+{
+   q-rq_timed_out_fn = fn;
+}
+
+EXPORT_SYMBOL_GPL(blk_queue_rq_timed_out);
+
 /**
  * blk_queue_make_request - define an alternate make_request function for a 
device
  * @q:  the request queue for the device to be affected
@@ -243,7 +256,9 @@ static void rq_init(struct request_queue
 {
INIT_LIST_HEAD(rq-queuelist);
INIT_LIST_HEAD(rq-donelist);
-
+   init_timer(rq-timer);
+
+   rq-timeout = 0;
rq-errors = 0;
rq-bio = rq-biotail = NULL;
INIT_HLIST_NODE(rq-hash);
@@ -2305,6 +2320,7 @@ EXPORT_SYMBOL(blk_start_queueing);
  */
 void blk_requeue_request(struct request_queue *q, struct request *rq)
 {
+   blk_delete_timer(rq);
blk_add_trace_rq(q, rq, BLK_TA_REQUEUE);
 
if (blk_rq_tagged(rq))
@@ -3647,8 +3663,121 @@ static struct notifier_block blk_cpu_not
 };
 
 /**
+ * blk_delete_timer - Delete/cancel timer for a given function.
+ * @req:   request that we are canceling timer for
+ *
+ * Return value:
+ * 1 if we were able to detach the timer.  0 if we blew it, and the
+ * timer function has already started to run.
+ **/
+int blk_delete_timer(struct request *req)
+{
+   int rtn;
+
+   if (!req-q-rq_timed_out_fn)
+   return 1;
+
+   rtn = del_timer(req-timer);
+   req-timer.data = (unsigned long)NULL;
+   req-timer.function = NULL;
+
+   return rtn;
+}
+
+EXPORT_SYMBOL_GPL(blk_delete_timer);
+
+static void blk_rq_timed_out(struct request *req)
+{
+   struct request_queue *q = req-q;
+
+   switch (q-rq_timed_out_fn(req)) {
+   case BLK_EH_HANDLED:
+   __blk_complete_request(req);
+   return;
+   case BLK_EH_RESET_TIMER:
+   blk_add_timer(req);
+   return;
+   case BLK_EH_NOT_HANDLED:
+   /*
+* LLD handles this for now but in the future
+* we can send a request msg to abort the command
+* and we can move more of the generic scsi eh code to
+* the blk layer.
+*/
+   return;
+   }
+}
+
+/**
+ * blk_abort_req -- Request request recovery for the specified command
+ * req:pointer to the request of interest
+ *
+ * This function requests that the block layer start recovery for the
+ * request by deleting the timer and calling the q's timeout function.
+ * LLDDs who implement their own error recovery MAY ignore the timeout
+ * event if they generated blk_abort_req.
+ */
+void blk_abort_req(struct request *req)
+{
+if (!blk_delete_timer(req))
+return;
+blk_rq_timed_out(req);
+}
+
+EXPORT_SYMBOL_GPL(blk_abort_req);
+
+/**
+ * blk_add_timer - Start timeout timer for a single request
+ * @req:   request that is about to start running.
+ *
+ * Notes:
+ *Each request has its own timer, and as it is added to the queue, we
+ *set up the timer.  When the request completes, we cancel the timer.
+ **/
+void blk_add_timer(struct request *req)
+{
+   struct request_queue *q = req-q;
+
+   /*
+* If the clock was already running for this command, then
+* first delete the timer.  The timer handling code gets rather
+* confused if we don't do this.
+*/
+   if (req-timer.function)
+   del_timer(req-timer);
+
+   req-timer.data = (unsigned long)req;
+   if (req-timeout)
+   req-timer.expires = jiffies + req-timeout;
+   else
+   req-timer.expires = jiffies + q-rq_timeout;
+   req-timer.function = (void (*)(unsigned long))blk_rq_timed_out;
+add_timer(req-timer);
+}
+
+EXPORT_SYMBOL_GPL(blk_add_timer);
+
+void __blk_complete_request(struct request *req)
+{
+   struct list_head *cpu_list;
+   unsigned long flags;
+
+   BUG_ON(!req-q-softirq_done_fn);
+
+   local_irq_save(flags);
+
+   cpu_list = __get_cpu_var(blk_cpu_done);
+   list_add_tail(req-donelist, cpu_list);
+   raise_softirq_irqoff(BLOCK_SOFTIRQ);
+
+   local_irq_restore(flags);
+}
+
+EXPORT_SYMBOL_GPL(__blk_complete_request);
+
+/**
  * blk_complete_request - end I/O on a request
- * @req:  the request being processed
+ * @req:   the request being processed
  *
  * Description:
  * Ends all I/O on a request. It does not handle partial completions

Re: [PATCH] qla2xxx: fix RSCN handling on big-endian systems

2007-02-22 Thread malahal
Seokmann Ju [EMAIL PROTECTED] wrote:
 On Wednesday, February 21, 2007 6:54 PM, Malahal Naineni wrote:
  I have JBOD in FL-port and if I unplug the cable or disable 
  the switch port, the qla2xxx driver doesn't fail the I/O 
  soon. The remote port status is 'online' all the time. The 
  I/O's usually timeout after the usual scsi timeout.
 Yes, that is because qla2xxx doesn't want to trigger unnecessary error
 handling mechanism, which is expensive.
 As long as State Change recovered within SCSI timeout period, driver
 won't let mid-layer know about it.

I removed the cable and didn't put it back at all for more than 1 hour.
There is no recovery here.  When I disable an F-port, after few seconds
the remote port state is set to something other than 'online'. Why not
same behaviour for remote ports on FL-port. With my patch, the behaviour
is same whether the remote port is attached to an FL-port or an F-port.

  Based on 
  the existing structure, al_pa is at the lowest addressed 
  byte, so the b24 field's interpretation as an integer is 
  *incorrect* on big endinan systems. You should be able to print the
  b24 as an integer and see what you get is incorrect on big 
  endian systems.
 I agree on your assessment. However, the patch won't fix the problem as
 the problem itself is regardless byte ordering.
 A patch which includes your evaluation will be submitted.
 Thank you for findings.

Well, my patch fixed the problem! qla2x00_device_resync() fails to call
qla2x00_mark_device_lost() for RSCN's affecting area or domain because
the following evaluates to true always due to b24 field being incorrect:

(fcport-d_id.b24  mask) != d_id.b24

Thanks, Malahal.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] qla2xxx: fix RSCN handling on big-endian systems

2007-02-21 Thread malahal
I have JBOD in FL-port and if I unplug the cable or disable the switch
port, the qla2xxx driver doesn't fail the I/O soon. The remote port
status is 'online' all the time. The I/O's usually timeout after the
usual scsi timeout. Based on the existing structure, al_pa is at the
lowest addressed byte, so the b24 field's interpretation as an integer
is *incorrect* on big endinan systems. You should be able to print the
b24 as an integer and see what you get is incorrect on big endian
systems.

Thanks, Malahal.

Seokmann Ju [EMAIL PROTECTED] wrote:
 On Monday, February 19, 2007 10:19 AM, Malahal Naineni wrote:
  qla2xxx driver fails to handle RSCN events affecting area or 
  domain due to an endian issue on big endian systems.  This 
  fixes the port_id_t structure on big endian systems.
 NEED MORE INFORMATION
 Can you provide more details on the fails you are getting?
 In my opinion, those fields in the structure should not get affected by
 byte ordering.
 
 Thank you,
 
 Seokmann
 
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
  Sent: Monday, February 19, 2007 10:19 AM
  To: linux-scsi@vger.kernel.org; Linux Driver
  Subject: [PATCH] qla2xxx: fix RSCN handling on big-endian systems
  
  qla2xxx driver fails to handle RSCN events affecting area or 
  domain due to an endian issue on big endian systems.  This 
  fixes the port_id_t structure on big endian systems.
  
  Signed-off-by: Malahal Naineni [EMAIL PROTECTED]
  
  diff -r c860739bb0f4 drivers/scsi/qla2xxx/qla_def.h
  --- a/drivers/scsi/qla2xxx/qla_def.hFri Feb 16 14:19:34 2007 -0800
  +++ b/drivers/scsi/qla2xxx/qla_def.hFri Feb 16 14:21:29 2007 -0800
  @@ -1478,14 +1478,17 @@ typedef union {
  uint32_t b24 : 24;
   
  struct {
  -   uint8_t d_id[3];
  -   uint8_t rsvd_1;
  -   } r;
  -
  -   struct {
  +#ifdef __BIG_ENDIAN
  +   uint8_t domain;
  +   uint8_t area;
  +   uint8_t al_pa;
  +#elif __LITTLE_ENDIAN
  uint8_t al_pa;
  uint8_t area;
  uint8_t domain;
  +#else
  +#error __BIG_ENDIAN or __LITTLE_ENDIAN must be defined!
  +#endif
  uint8_t rsvd_1;
  } b;
   } port_id_t;
  
 -
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] qla2xxx: fix RSCN handling on big-endian systems

2007-02-19 Thread malahal
qla2xxx driver fails to handle RSCN events affecting area or domain due
to an endian issue on big endian systems.  This fixes the port_id_t
structure on big endian systems.

Signed-off-by: Malahal Naineni [EMAIL PROTECTED]

diff -r c860739bb0f4 drivers/scsi/qla2xxx/qla_def.h
--- a/drivers/scsi/qla2xxx/qla_def.hFri Feb 16 14:19:34 2007 -0800
+++ b/drivers/scsi/qla2xxx/qla_def.hFri Feb 16 14:21:29 2007 -0800
@@ -1478,14 +1478,17 @@ typedef union {
uint32_t b24 : 24;
 
struct {
-   uint8_t d_id[3];
-   uint8_t rsvd_1;
-   } r;
-
-   struct {
+#ifdef __BIG_ENDIAN
+   uint8_t domain;
+   uint8_t area;
+   uint8_t al_pa;
+#elif __LITTLE_ENDIAN
uint8_t al_pa;
uint8_t area;
uint8_t domain;
+#else
+#error __BIG_ENDIAN or __LITTLE_ENDIAN must be defined!
+#endif
uint8_t rsvd_1;
} b;
 } port_id_t;
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Linux not adhering to BIOS Drive boot order?

2001-01-16 Thread Malahal Rao Naineni

Venkatesh Ramamurthy wrote:
 
 Hi,
 I have one issue which requires fix from the linux kernel. 
 Initially i put a SCSI controller and install the OS on the drive connected
 to it. After installing the OS (on sda), the customer puts another SCSI
 controller. The BIOS for the first controller has BIOS enabled and for the
 second controller does not have the BIOS enabled. 
 
 The linux loads the driver for the second controller first and assigns sda
 to it first , and the actual boot drive gets some sdX device node. 
 From the lilo prompt we can override it with root=/dev/sdX to boot to the
 correct drive and controller, but for a end -user using these cards, this is
 no fun.
 
 
 Can the linux kernel be changed in such a way that kernel will look for the
 actual boot drive and re-order the drives so that mounting can go on in the
 right order.

Name slippage is a horrible thing. That should be fixed first. The O/S
should get the device names right for every boot.

Thanks, Malahal.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]