[RFC] blktrace interface for sg devices

2007-12-13 Thread Christof Schmitt
I am referring to the discussion of introducing statistics in the SCSI
layer and the conclusion that blktrace already provides the data:
http://lkml.org/lkml/2006/10/21/72
http://lkml.org/lkml/2006/11/2/141

While blktrace works fine for disk devices, it currently does not
provide data for non-disk devices like tape drives. To close this gap,
I am looking for a way to get the same trace data from other SCSI
devices as well.

Since the SCSI layer internally uses the same request queues for all
devices and the queues already use the blktrace interface, the main
missing part is the interface to enable the tracing for all SCSI
devices.

Attached is a patch that adds the ioctl interface for blktrace to the
sg generic scsi interface. This already makes it possible to get some
trace data for SCSI tape drives, although I have to do more testing.

For testing, any sg device file can be passed to blktrace, e.g.:
# blktrace -d /dev/sg1 -o - | blkparse -i -
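
For readers who want to see what the blktrace tool does underneath, the ioctl sequence can be sketched from user space as below. This is a minimal sketch, not the blktrace implementation: the helper names (trace_setup_init, trace_begin, trace_end) are hypothetical, the ioctl numbers and struct blk_user_trace_setup come from the kernel's exported headers, and whether the calls succeed on an sg node depends on a kernel carrying the patch proposed here.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/blktrace_api.h>	/* struct blk_user_trace_setup */
#include <linux/fs.h>		/* BLKTRACESETUP, BLKTRACESTART, ... */

/* Fill the setup request; the kernel rejects zero buffer sizes/counts. */
static int trace_setup_init(struct blk_user_trace_setup *buts,
			    unsigned int buf_size, unsigned int buf_nr)
{
	if (!buf_size || !buf_nr)
		return -1;
	memset(buts, 0, sizeof(*buts));
	buts->act_mask = 0xffff;	/* trace all action categories */
	buts->buf_size = buf_size;
	buts->buf_nr = buf_nr;
	return 0;
}

/* Open a device node and start tracing; returns the fd, or -1 on error. */
static int trace_begin(const char *path, struct blk_user_trace_setup *buts)
{
	int fd = open(path, O_RDONLY | O_NONBLOCK);

	if (fd < 0)
		return -1;
	if (ioctl(fd, BLKTRACESETUP, buts) < 0 ||
	    ioctl(fd, BLKTRACESTART) < 0) {
		ioctl(fd, BLKTRACETEARDOWN);	/* best effort cleanup */
		close(fd);
		return -1;
	}
	return fd;
}

/* Stop tracing and release the kernel-side trace state. */
static void trace_end(int fd)
{
	ioctl(fd, BLKTRACESTOP);
	ioctl(fd, BLKTRACETEARDOWN);
	close(fd);
}
```

A caller would fill the request with trace_setup_init(), pass a path such as /dev/sg1 to trace_begin(), read the relay files under debugfs while tracing runs, and finish with trace_end().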

I am seeking input on this approach: Is it worth pursuing
to enable blktrace to trace SCSI tape drives? Would there be a better
way to get this trace data?

Christof Schmitt

---
 block/blktrace.c   |   19 +++
 drivers/scsi/sg.c  |   12 
 include/linux/blkdev.h |   10 ++
 3 files changed, 33 insertions(+), 8 deletions(-)

--- a/block/blktrace.c  2007-12-13 08:48:23.0 +0100
+++ b/block/blktrace.c  2007-12-13 08:48:25.0 +0100
@@ -231,7 +231,7 @@ static void blk_trace_cleanup(struct blk
kfree(bt);
 }
 
-static int blk_trace_remove(struct request_queue *q)
+int blk_trace_remove(struct request_queue *q)
 {
struct blk_trace *bt;
 
@@ -245,6 +245,7 @@ static int blk_trace_remove(struct reque
 
return 0;
 }
+EXPORT_SYMBOL_GPL(blk_trace_remove);
 
 static int blk_dropped_open(struct inode *inode, struct file *filp)
 {
@@ -312,13 +313,11 @@ static struct rchan_callbacks blk_relay_
 /*
  * Setup everything required to start tracing
  */
-static int blk_trace_setup(struct request_queue *q, struct block_device *bdev,
-  char __user *arg)
+int blk_trace_setup(struct request_queue *q, char *name, dev_t dev, char __user *arg)
 {
 	struct blk_user_trace_setup buts;
 	struct blk_trace *old_bt, *bt = NULL;
 	struct dentry *dir = NULL;
-	char b[BDEVNAME_SIZE];
 	int ret, i;
 
 	if (copy_from_user(&buts, arg, sizeof(buts)))
@@ -327,7 +326,7 @@ static int blk_trace_setup(struct reques
if (!buts.buf_size || !buts.buf_nr)
return -EINVAL;
 
-   strcpy(buts.name, bdevname(bdev, b));
+   strcpy(buts.name, name);
 
/*
 * some device names have larger paths - convert the slashes
@@ -355,7 +354,7 @@ static int blk_trace_setup(struct reques
 		goto err;
 
 	bt->dir = dir;
-	bt->dev = bdev->bd_dev;
+	bt->dev = dev;
 	atomic_set(&bt->dropped, 0);
 
 	ret = -EIO;
@@ -400,8 +399,9 @@ err:
}
return ret;
 }
+EXPORT_SYMBOL_GPL(blk_trace_setup);
 
-static int blk_trace_startstop(struct request_queue *q, int start)
+int blk_trace_startstop(struct request_queue *q, int start)
 {
struct blk_trace *bt;
int ret;
@@ -434,6 +434,7 @@ static int blk_trace_startstop(struct re
 
return ret;
 }
+EXPORT_SYMBOL_GPL(blk_trace_startstop);
 
 /**
  * blk_trace_ioctl: - handle the ioctls associated with tracing
@@ -446,6 +447,7 @@ int blk_trace_ioctl(struct block_device 
 {
 	struct request_queue *q;
 	int ret, start = 0;
+	char b[BDEVNAME_SIZE];
 
 	q = bdev_get_queue(bdev);
 	if (!q)
@@ -455,7 +457,8 @@ int blk_trace_ioctl(struct block_device 
 
 	switch (cmd) {
 	case BLKTRACESETUP:
-		ret = blk_trace_setup(q, bdev, arg);
+		strcpy(b, bdevname(bdev, b));
+		ret = blk_trace_setup(q, b, bdev->bd_dev, arg);
 		break;
 	case BLKTRACESTART:
 		start = 1;
--- a/drivers/scsi/sg.c 2007-12-13 08:48:23.0 +0100
+++ b/drivers/scsi/sg.c 2007-12-13 08:48:25.0 +0100
@@ -55,6 +55,8 @@ static int sg_version_num = 30534;/* 2 
 #include <scsi/scsi_ioctl.h>
 #include <scsi/sg.h>
 
+#include <linux/blktrace_api.h>
+
 #include "scsi_logging.h"
 
 #ifdef CONFIG_SCSI_PROC_FS
@@ -1066,6 +1068,16 @@ sg_ioctl(struct inode *inode, struct fil
 	case BLKSECTGET:
 		return put_user(sdp->device->request_queue->max_sectors * 512,
 				ip);
+	case BLKTRACESETUP:
+	{
+		return blk_trace_setup(sdp->device->request_queue , sdp->device->sdev_gendev.bus_id, sdp->device->sdev_gendev, arg);
+	}
+	case BLKTRACESTART:
+		return blk_trace_startstop(sdp->device->request_queue, 1);
+	case BLKTRACESTOP:
+		return blk_trace_startstop(sdp->device->request_queue, 0);
+	case BLKTRACETEARDOWN:
+		return blk_trace_remove(sdp->device->request_queue);

[PATCH] dpt_i2o: don't set DMA_64BIT_MASK [was: Re: [stable] broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)]

2007-12-13 Thread Miquel van Smoorenburg
According to Greg KH:
> So, what should be added to 2.6.23-stable then?  And, can I get a real
> changelog entry for it?

This is suitable for both 2.6.23.x and 2.6.24-rc5 :

linux-2.6-dpt_i2o-no-dma64.patch

The dpt_i2o driver can't handle 64 bit DMA addresses, so do not
let it set pci_set_dma_mask(pDev, DMA_64BIT_MASK) .
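
To make the constraint concrete, here is a small user-space illustration of what a DMA mask expresses. It is a sketch only and not driver code: dma_range_fits() and the DEMO_ macro names are hypothetical, while the two mask values match the kernel's DMA_32BIT_MASK and DMA_64BIT_MASK definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as the kernel's DMA_32BIT_MASK / DMA_64BIT_MASK. */
#define DEMO_DMA_32BIT_MASK 0x00000000ffffffffULL
#define DEMO_DMA_64BIT_MASK 0xffffffffffffffffULL

/* True if the last byte of [addr, addr + len) is reachable under mask. */
static bool dma_range_fits(uint64_t addr, uint64_t len, uint64_t mask)
{
	if (len == 0 || addr + len - 1 < addr)	/* empty or wrapping range */
		return false;
	return (addr + len - 1) <= mask;
}
```

With the 32-bit mask, a page ending at 0xffffffff is acceptable while a page starting at 4 GiB is not; a device that advertises the 64-bit mask without being able to use such addresses will be handed buffers it cannot reach.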

Signed-off-by: Miquel van Smoorenburg [EMAIL PROTECTED]

diff -ruN linux-2.6.23.9.orig/drivers/scsi/dpt_i2o.c linux-2.6.23.9/drivers/scsi/dpt_i2o.c
--- linux-2.6.23.9.orig/drivers/scsi/dpt_i2o.c	2007-11-26 18:51:43.0 +0100
+++ linux-2.6.23.9/drivers/scsi/dpt_i2o.c	2007-12-12 13:21:05.0 +0100
@@ -905,8 +905,7 @@
 	}
 
 	pci_set_master(pDev);
-	if (pci_set_dma_mask(pDev, DMA_64BIT_MASK) &&
-	    pci_set_dma_mask(pDev, DMA_32BIT_MASK))
+	if (pci_set_dma_mask(pDev, DMA_32BIT_MASK))
 		return -EINVAL;
 
 	base_addr0_phys = pci_resource_start(pDev,0);

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[0/3 ver2] Last 3 patches for bidi support

2007-12-13 Thread Boaz Harrosh
James hi.

Bidi patches just broke again, by a patch that fixes
some to-be-dead code. (scsi: BUG_ON() impossible condition)

Could it not just be accepted into the tree now?
It sat in the -mm tree with no reports of breakage or
complaints. What are we waiting for? The way I
see it, there is nothing holding it back; it's
not even dangerous anymore.

You need Arm's accessors patch from scsi-pending
Russell King [EMAIL PROTECTED]
Please send an Acked-by for this patch

and the patch that removes the old esp drivers
(http://www.spinics.net/lists/linux-scsi/msg20914.html)

Christoph Hellwig [EMAIL PROTECTED]
David S. Miller [EMAIL PROTECTED]
Maciej W. Rozycki [EMAIL PROTECTED]

Please send an Acked-by or Recommended-by to the removal
of these old esp drivers.

And the 3 patches (based on scsi-misc)
[1] tgt: Use scsi_init_io instead of scsi_alloc_sgtable
  Was Acked-by the maintainer of tgt. Please accept independent
  of the other 2.

[2] scsi: scsi_data_buffer
  The move to scsi_data_buffer. From here on any
  unconverted driver will not compile.

[3] scsi: bidi support
  Actually very simple, really.

All parties involved, send your reservations if any NOW.
Else James please put it in.

Andrew could they be included back into -mm tree?

Boaz


[PATCH] tgt: Use scsi_init_io instead of scsi_alloc_sgtable

2007-12-13 Thread Boaz Harrosh

  - If we export scsi_init_io()/scsi_release_buffers() instead of
    scsi_{alloc,free}_sgtable() from scsi_lib then tgt code is
    much more insulated from scsi_lib changes. As a bonus it will
    also gain bidi capability when it comes.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Acked-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 drivers/scsi/scsi_lib.c |   21 ++---
 drivers/scsi/scsi_tgt_lib.c |   29 +
 include/scsi/scsi_cmnd.h|4 ++--
 3 files changed, 17 insertions(+), 37 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index e273e4b..d1a4671 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -739,7 +739,8 @@ static inline unsigned int scsi_sgtable_index(unsigned short nents)
 	return index;
 }
 
-struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
+static struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd,
+					      gfp_t gfp_mask)
 {
 	struct scsi_host_sg_pool *sgp;
 	struct scatterlist *sgl, *prev, *ret;
@@ -825,9 +826,7 @@ enomem:
 	return NULL;
 }
 
-EXPORT_SYMBOL(scsi_alloc_sgtable);
-
-void scsi_free_sgtable(struct scsi_cmnd *cmd)
+static void scsi_free_sgtable(struct scsi_cmnd *cmd)
 {
 	struct scatterlist *sgl = cmd->request_buffer;
 	struct scsi_host_sg_pool *sgp;
@@ -873,8 +872,6 @@ void scsi_free_sgtable(struct scsi_cmnd *cmd)
 	mempool_free(sgl, sgp->pool);
 }
 
-EXPORT_SYMBOL(scsi_free_sgtable);
-
 /*
  * Function:scsi_release_buffers()
  *
@@ -892,7 +889,7 @@ EXPORT_SYMBOL(scsi_free_sgtable);
  * the scatter-gather table, and potentially any bounce
  * buffers.
  */
-static void scsi_release_buffers(struct scsi_cmnd *cmd)
+void scsi_release_buffers(struct scsi_cmnd *cmd)
 {
 	if (cmd->use_sg)
 		scsi_free_sgtable(cmd);
@@ -904,6 +901,7 @@ static void scsi_release_buffers(struct scsi_cmnd *cmd)
 	cmd->request_buffer = NULL;
 	cmd->request_bufflen = 0;
 }
+EXPORT_SYMBOL(scsi_release_buffers);
 
 /*
  * Function:scsi_io_completion()
@@ -1105,7 +1103,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
  * Returns: 0 on success
  * BLKPREP_DEFER if the failure is retryable
  */
-static int scsi_init_io(struct scsi_cmnd *cmd)
+int scsi_init_io(struct scsi_cmnd *cmd, gfp_t gfp_mask)
 {
 	struct request *req = cmd->request;
 	int count;
@@ -1120,7 +1118,7 @@ static int scsi_init_io(struct scsi_cmnd *cmd)
 	/*
 	 * If sg table allocation fails, requeue request later.
 	 */
-	cmd->request_buffer = scsi_alloc_sgtable(cmd, GFP_ATOMIC);
+	cmd->request_buffer = scsi_alloc_sgtable(cmd, gfp_mask);
 	if (unlikely(!cmd->request_buffer)) {
 		scsi_unprep_request(req);
 		return BLKPREP_DEFER;
@@ -1141,6 +1139,7 @@ static int scsi_init_io(struct scsi_cmnd *cmd)
 	cmd->use_sg = count;
 	return BLKPREP_OK;
 }
+EXPORT_SYMBOL(scsi_init_io);
 
 static struct scsi_cmnd *scsi_get_cmd_from_req(struct scsi_device *sdev,
struct request *req)
@@ -1186,7 +1185,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
 
 		BUG_ON(!req->nr_phys_segments);
 
-		ret = scsi_init_io(cmd);
+		ret = scsi_init_io(cmd, GFP_ATOMIC);
 		if (unlikely(ret))
 			return ret;
 	} else {
@@ -1237,7 +1236,7 @@ int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req)
 	if (unlikely(!cmd))
 		return BLKPREP_DEFER;
 
-	return scsi_init_io(cmd);
+	return scsi_init_io(cmd, GFP_ATOMIC);
 }
 EXPORT_SYMBOL(scsi_setup_fs_cmnd);
 
diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c
index 93ece8f..91630ba 100644
--- a/drivers/scsi/scsi_tgt_lib.c
+++ b/drivers/scsi/scsi_tgt_lib.c
@@ -331,8 +331,7 @@ static void scsi_tgt_cmd_done(struct scsi_cmnd *cmd)
 
 	scsi_tgt_uspace_send_status(cmd, tcmd->itn_id, tcmd->tag);
 
-	if (scsi_sglist(cmd))
-		scsi_free_sgtable(cmd);
+	scsi_release_buffers(cmd);
 
 	queue_work(scsi_tgtd, &tcmd->work);
 }
@@ -353,26 +352,6 @@ static int scsi_tgt_transfer_response(struct scsi_cmnd *cmd)
 	return 0;
 }
 
-static int scsi_tgt_init_cmd(struct scsi_cmnd *cmd, gfp_t gfp_mask)
-{
-	struct request *rq = cmd->request;
-	int count;
-
-	cmd->use_sg = rq->nr_phys_segments;
-	cmd->request_buffer = scsi_alloc_sgtable(cmd, gfp_mask);
-	if (!cmd->request_buffer)
-		return -ENOMEM;
-
-	cmd->request_bufflen = rq->data_len;
-
-	dprintk("cmd %p cnt %d %lu\n", cmd, scsi_sg_count(cmd),
-		rq_data_dir(rq));
-	count = blk_rq_map_sg(rq->q, rq, scsi_sglist(cmd));
-	BUG_ON(count > cmd->use_sg);
-	cmd->use_sg = count;
-   

[PATCH] scsi: scsi_data_buffer

2007-12-13 Thread Boaz Harrosh

  In preparation for bidi we abstract all IO members of scsi_cmnd,
  that will need to duplicate, into a substructure.

  - Group all IO members of scsi_cmnd into a scsi_data_buffer
structure.
  - Adjust accessors to new members.
  - scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of
scsi_cmnd. And work on it.
  - Adjust scsi_init_io() and  scsi_release_buffers() for above
change.
  - Fix other parts of scsi_lib/scsi.c to members migration. Use
accessors where appropriate.

  - fix Documentation about scsi_cmnd in scsi_host.h

  - scsi_error.c
* Changed needed members of struct scsi_eh_save.
* Careful considerations in scsi_eh_prep/restore_cmnd.

  - sd.c and sr.c
* sd and sr would adjust IO size to align on device's block
  size so code needs to change once we move to scsi_data_buff
  implementation.
* Convert code to use scsi_for_each_sg
* Use data accessors where appropriate.

  - tgt: convert libsrp to use scsi_data_buffer
  - isd200: This driver still bangs on scsi_cmnd IO members,
so need changing
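
The grouping described in the list above can be illustrated with a small user-space mock. This is a sketch under assumptions, not the kernel definition: the mock_ types and helpers are hypothetical, the field names sglist, length and sg_count follow the accesses visible in the diff below, and placing resid in the same structure is an assumption based on the removal of ses->resid.

```c
#include <string.h>

/* Hypothetical user-space mock of the grouped I/O members. */
struct mock_data_buffer {
	void *sglist;			/* scatter-gather list (opaque here) */
	unsigned int length;		/* total transfer length in bytes */
	unsigned short sg_count;	/* entries in sglist */
	int resid;			/* bytes not transferred (assumed member) */
};

struct mock_cmnd {
	struct mock_data_buffer sdb;
	int result;
};

/* Accessor in the style of scsi_bufflen(): callers never touch sdb. */
static unsigned int mock_bufflen(const struct mock_cmnd *cmd)
{
	return cmd->sdb.length;
}

/* Save the I/O state with one struct copy, then clear it for reuse. */
static void mock_eh_prep(struct mock_cmnd *cmd, struct mock_data_buffer *save)
{
	*save = cmd->sdb;
	memset(&cmd->sdb, 0, sizeof(cmd->sdb));
}

/* Restore the saved I/O state after error handling. */
static void mock_eh_restore(struct mock_cmnd *cmd,
			    const struct mock_data_buffer *save)
{
	cmd->sdb = *save;
}
```

The point of the single sub-structure is visible in the prep/restore pair: one assignment saves or restores every I/O member, so a later second buffer for bidi needs no per-field bookkeeping.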

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 drivers/scsi/libsrp.c|4 +-
 drivers/scsi/scsi.c  |2 +-
 drivers/scsi/scsi_error.c|   28 +--
 drivers/scsi/scsi_lib.c  |   77 --
 drivers/scsi/sd.c|4 +-
 drivers/scsi/sr.c|   25 +++--
 drivers/usb/storage/isd200.c |8 ++--
 include/scsi/scsi_cmnd.h |   39 +
 include/scsi/scsi_eh.h   |8 ++---
 include/scsi/scsi_host.h |4 +-
 10 files changed, 91 insertions(+), 108 deletions(-)

diff --git a/drivers/scsi/libsrp.c b/drivers/scsi/libsrp.c
index 5cff020..8a8562a 100644
--- a/drivers/scsi/libsrp.c
+++ b/drivers/scsi/libsrp.c
@@ -426,8 +426,8 @@ int srp_cmd_queue(struct Scsi_Host *shost, struct srp_cmd *cmd, void *info,
 
 	sc->SCp.ptr = info;
 	memcpy(sc->cmnd, cmd->cdb, MAX_COMMAND_SIZE);
-	sc->request_bufflen = len;
-	sc->request_buffer = (void *) (unsigned long) addr;
+	sc->sdb.length = len;
+	sc->sdb.sglist = (void *) (unsigned long) addr;
 	sc->tag = tag;
 	err = scsi_tgt_queue_command(sc, itn_id, (struct scsi_lun *)&cmd->lun,
 				     cmd->tag);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index ebc0193..a0fd785 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -712,7 +712,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 				"Notifying upper driver of completion "
 				"(result %x)\n", cmd->result));
 
-	good_bytes = cmd->request_bufflen;
+	good_bytes = scsi_bufflen(cmd);
 	if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
 		drv = scsi_cmd_to_driver(cmd);
 		if (drv->done)
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 169bc59..241ab48 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -617,29 +617,25 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses,
 	ses->cmd_len = scmd->cmd_len;
 	memcpy(ses->cmnd, scmd->cmnd, sizeof(scmd->cmnd));
 	ses->data_direction = scmd->sc_data_direction;
-	ses->bufflen = scmd->request_bufflen;
-	ses->buffer = scmd->request_buffer;
-	ses->use_sg = scmd->use_sg;
-	ses->resid = scmd->resid;
+	ses->sdb = scmd->sdb;
 	ses->result = scmd->result;
 
+	memset(&scmd->sdb, 0, sizeof(scmd->sdb));
+
 	if (sense_bytes) {
-		scmd->request_bufflen = min_t(unsigned,
+		scmd->sdb.length = min_t(unsigned,
 				sizeof(scmd->sense_buffer), sense_bytes);
 		sg_init_one(&ses->sense_sgl, scmd->sense_buffer,
-				scmd->request_bufflen);
-		scmd->request_buffer = &ses->sense_sgl;
+				scmd->sdb.length);
+		scmd->sdb.sglist = &ses->sense_sgl;
 		scmd->sc_data_direction = DMA_FROM_DEVICE;
-		scmd->use_sg = 1;
+		scmd->sdb.sg_count = 1;
 		memset(scmd->cmnd, 0, sizeof(scmd->cmnd));
 		scmd->cmnd[0] = REQUEST_SENSE;
-		scmd->cmnd[4] = scmd->request_bufflen;
+		scmd->cmnd[4] = scmd->sdb.length;
 		scmd->cmd_len = COMMAND_SIZE(scmd->cmnd[0]);
 	} else {
-		scmd->request_buffer = NULL;
-		scmd->request_bufflen = 0;
 		scmd->sc_data_direction = DMA_NONE;
-		scmd->use_sg = 0;
 		if (cmnd) {
 			memset(scmd->cmnd, 0, sizeof(scmd->cmnd));
 			memcpy(scmd->cmnd, cmnd, cmnd_size);
@@ -676,10 +672,7 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses)
 	scmd->cmd_len = ses->cmd_len;
  

[PATCH] scsi: bidi support

2007-12-13 Thread Boaz Harrosh

  At the block level bidi request uses req->next_rq pointer for a second
  bidi_read request.
  At Scsi-midlayer a second scsi_data_buffer structure is used for the
  bidi_read part. This bidi scsi_data_buffer is put on
  request->next_rq->special. Struct scsi_cmnd is not changed.

  - Define scsi_bidi_cmnd() to return true if it is a bidi request and a
second sgtable was allocated.

  - Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer
    from this command. This API is to isolate users from the mechanics of
    bidi.

  - Define scsi_end_bidi_request() to do what scsi_end_request() does but
for a bidi request. This is necessary because bidi commands are a bit
tricky here. (See comments in body)

  - scsi_release_buffers() will also release the bidi_read scsi_data_buffer

  - scsi_io_completion() on bidi commands will now call
scsi_end_bidi_request() and return.

  - The previous work done in scsi_init_io() is now done in a new
scsi_init_sgtable() (which is 99% identical to old scsi_init_io())
The new scsi_init_io() will call the above twice if needed also for
the bidi_read command. Only at this point is a command bidi.

  - In scsi_error.c at scsi_eh_prep/restore_cmnd() make sure bidi-lld is not
    confused by a get-sense command that looks like bidi. This is done
    by putting NULL at request->next_rq, and restoring.
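
The buffer selection described in the list above can be sketched with user-space mocks. Everything prefixed demo_ is hypothetical and stands in for the kernel types; the logic only mirrors what the description states: a command is bidi when a second request exists and a buffer was attached to its special pointer, data-out uses the command's own buffer, and data-in uses the second buffer when there is one.

```c
#include <stddef.h>

struct demo_sdb {
	unsigned int length;
};

struct demo_request {
	struct demo_request *next_rq;	/* second (bidi_read) request, or NULL */
	void *special;			/* per-request private data */
};

struct demo_cmnd {
	struct demo_request *request;
	struct demo_sdb sdb;		/* the only buffer for non-bidi I/O */
};

/* Bidi only if a second request exists and its buffer was allocated. */
static int demo_bidi_cmnd(const struct demo_cmnd *cmd)
{
	return cmd->request->next_rq != NULL &&
	       cmd->request->next_rq->special != NULL;
}

/* Data-out always lives in the command's own buffer. */
static struct demo_sdb *demo_out(struct demo_cmnd *cmd)
{
	return &cmd->sdb;
}

/* Data-in is the second buffer for bidi, else the command's own buffer. */
static struct demo_sdb *demo_in(struct demo_cmnd *cmd)
{
	if (demo_bidi_cmnd(cmd))
		return (struct demo_sdb *)cmd->request->next_rq->special;
	return &cmd->sdb;
}
```

A driver written against such accessors does not need to know where the second buffer is stored, which is the isolation the description asks for.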

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/scsi_error.c |3 +
 drivers/scsi/scsi_lib.c   |  144 -
 include/scsi/scsi_cmnd.h  |   23 +++-
 include/scsi/scsi_eh.h|1 +
 4 files changed, 141 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 241ab48..5c8ba6a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -618,9 +618,11 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses,
 	memcpy(ses->cmnd, scmd->cmnd, sizeof(scmd->cmnd));
 	ses->data_direction = scmd->sc_data_direction;
 	ses->sdb = scmd->sdb;
+	ses->next_rq = scmd->request->next_rq;
 	ses->result = scmd->result;
 
 	memset(&scmd->sdb, 0, sizeof(scmd->sdb));
+	scmd->request->next_rq = NULL;
 
 	if (sense_bytes) {
 		scmd->sdb.length = min_t(unsigned,
@@ -673,6 +675,7 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses)
 	memcpy(scmd->cmnd, ses->cmnd, sizeof(scmd->cmnd));
 	scmd->sc_data_direction = ses->data_direction;
 	scmd->sdb = ses->sdb;
+	scmd->request->next_rq = ses->next_rq;
 	scmd->result = ses->result;
 }
 EXPORT_SYMBOL(scsi_eh_restore_cmnd);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7ac36fe..a6aae56 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -64,6 +64,8 @@ static struct scsi_host_sg_pool scsi_sg_pools[] = {
 };
 #undef SP
 
+static struct kmem_cache *scsi_bidi_sdb_cache;
+
 static void scsi_run_queue(struct request_queue *q);
 
 /*
@@ -627,6 +629,28 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
 		scsi_run_queue(sdev->request_queue);
 }
 
+static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate)
+{
+	struct request_queue *q = cmd->device->request_queue;
+	struct request *req = cmd->request;
+	unsigned long flags;
+
+	add_disk_randomness(req->rq_disk);
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	if (blk_rq_tagged(req))
+		blk_queue_end_tag(q, req);
+
+	end_that_request_last(req, uptodate);
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	/*
+	 * This will goose the queue request function at the end, so we don't
+	 * need to worry about launching another command.
+	 */
+	scsi_next_command(cmd);
+}
+
 /*
  * Function:scsi_end_request()
  *
@@ -654,7 +678,6 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate,
 {
 	struct request_queue *q = cmd->device->request_queue;
 	struct request *req = cmd->request;
-	unsigned long flags;
 
 	/*
 	 * If there are blocks left over at the end, set up the command
@@ -683,19 +706,7 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate,
 		}
 	}
 
-	add_disk_randomness(req->rq_disk);
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	if (blk_rq_tagged(req))
-		blk_queue_end_tag(q, req);
-	end_that_request_last(req, uptodate);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-
-	/*
-	 * This will goose the queue request function at the end, so we don't
-	 * need to worry about launching another command.
-	 */
-	scsi_next_command(cmd);
+	scsi_finalize_request(cmd, uptodate);
 	return NULL;
 }
 
@@ -894,10 +905,39 @@ void scsi_release_buffers(struct scsi_cmnd *cmd)
 		scsi_free_sgtable(&cmd->sdb);
 
 	memset(&cmd->sdb, 0, 

[PATCH] sr/sd: Remove simple dead code

2007-12-13 Thread Boaz Harrosh

  The if (rq_data_dir() == WRITE) ... else if () ... else chain had an
  extra else, since the if () tests a value of 1 bit.

  Also with a bidi request rq_data_dir() == WRITE
  and blk_bidi_rq() == true.
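
Why the third branch can never run is easy to show with a mock. The sketch below assumes, as in kernels of this era, that the data direction is the lowest bit of the request flags; the demo_ names are hypothetical and the opcode strings are placeholders for the real command bytes.

```c
/* Hypothetical mock: the data direction is the lowest flag bit. */
#define DEMO_READ  0
#define DEMO_WRITE 1
#define demo_data_dir(flags) ((flags) & 1)

/* Two branches cover every value a one-bit direction can take. */
static const char *demo_opcode(unsigned int flags)
{
	if (demo_data_dir(flags) == DEMO_WRITE)
		return "WRITE_6";
	else
		return "READ_6";
}
```

Whatever other flag bits are set, the masked value is either 0 or 1, so a trailing "unknown command" else after both comparisons is unreachable.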

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/sd.c |5 +
 drivers/scsi/sr.c |5 +
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 212f6bc..e6d85b0 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -445,12 +445,9 @@ static int sd_prep_fn(struct request_queue *q, struct request *rq)
 		}
 		SCpnt->cmnd[0] = WRITE_6;
 		SCpnt->sc_data_direction = DMA_TO_DEVICE;
-	} else if (rq_data_dir(rq) == READ) {
+	} else {
 		SCpnt->cmnd[0] = READ_6;
 		SCpnt->sc_data_direction = DMA_FROM_DEVICE;
-	} else {
-		scmd_printk(KERN_ERR, SCpnt, "Unknown command %x\n", rq->cmd_flags);
-		goto out;
 	}
 
 	SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt,
diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 896be4a..7128d15 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -372,12 +372,9 @@ static int sr_prep_fn(struct request_queue *q, struct request *rq)
 		SCpnt->cmnd[0] = WRITE_10;
 		SCpnt->sc_data_direction = DMA_TO_DEVICE;
 		cd->cdi.media_written = 1;
-	} else if (rq_data_dir(rq) == READ) {
+	} else {
 		SCpnt->cmnd[0] = READ_10;
 		SCpnt->sc_data_direction = DMA_FROM_DEVICE;
-	} else {
-		blk_dump_rq_flags(rq, "Unknown sr command");
-		goto out;
 	}
 
 	{
-- 
1.5.3.3



Re: [PATCH] dpt_i2o: don't set DMA_64BIT_MASK [was: Re: [stable] broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)]

2007-12-13 Thread James Bottomley

On Thu, 2007-12-13 at 11:11 +0100, Miquel van Smoorenburg wrote:
> According to Greg KH:
> > So, what should be added to 2.6.23-stable then?  And, can I get a real
> > changelog entry for it?
>
> This is suitable for both 2.6.23.x and 2.6.24-rc5 :
>
> linux-2.6-dpt_i2o-no-dma64.patch

Actually, this one's already queued:

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-rc-fixes-2.6.git;a=commit;h=a066b307861238c1970310579c0bc2fe8c8dca51

James




Re: [PATCH] scsi device recovery

2007-12-13 Thread James Bottomley

On Wed, 2007-12-12 at 18:54 +0100, Bernd Schubert wrote:
> [Hmm, resending since mail after more than 30min still not on the ML, maybe
> the attachment was too large? I have uploaded the log to
> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/scsi/kern.log.1]
>
> On Wednesday 12 December 2007 16:59:36 James Bottomley wrote:
> > On Wed, 2007-12-12 at 15:36 +0100, Bernd Schubert wrote:
> > > On Wednesday 12 December 2007 14:39:27 Matthew Wilcox wrote:
> > > > On Wed, Dec 12, 2007 at 01:54:14PM +0100, Bernd Schubert wrote:
> > > > > below is a patch introducing device recovery, trying to prevent i/o
> > > > > errors when a DID_NO_CONNECT or SOFT_ERROR does happen.
> > > >
> > > > Why doesn't the regular scsi_eh do what you need?
> > >
> > > First of all, it is presently simply not called when the two errors above
> > > do happen. This could be changed, of course.
> >
> > Erm, I think you'll find the error handler does activate on
> > DID_SOFT_ERROR.  It causes a retry via the eh.  DID_NO_CONNECT is an
>
> Dec  7 23:48:45 beo-96 kernel: [94605.297924] sd 2:0:5:0: [sdd] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK,SUGGEST_OK
> Dec  7 23:48:45 beo-96 kernel: [94605.297932] end_request: I/O error, dev sdd, sector 7706802052
> Dec  7 23:48:45 beo-96 kernel: [94605.297937] raid5:md5: read error not correctable (sector 871932472 on sdd3).

This is some type of ioc internal error.  What we do on DID_SOFT_ERROR
is retry for the usual number of times up to the timeout limit.
Unfortunately, the retries are fixed at SD_MAX_RETRIES in sd.c.  Without
diagnosing what's going wrong in the fusion, it's impossible to say if
this is reasonable, but your fusion is signalling ioc errors (firmware
errors).

> Full log attached.
>
> > immediate error with no eh intervention because it means that the target
> > went away.  Handling this as a retryable error isn't an option because
> > it will interfere with hotplug.
>
> Then we need a sysfs flag one can set to manually enable eh for these devices
> on DID_NO_CONNECT.

No, because that will seriously damage a lot of other systems.

The DID_NO_CONNECT looks to be a genuine reselection issue caused by a
device out of spec on the bus.  The SPI standard says a device should
respond in 250ms, which is what most HBA's take as the default selection
timeout.  I'd say for the device you have, you need to increase this.
Unfortunately doing this for the fusion is some type of mode page
setting, I think, but I don't have the doc in front of me.  I'd be
amenable to putting the selection timeout as a parameter in the spi
transport class, since others might find it valuable occasionally to
control.

 
> > > Secondly, I think scsi_eh is in most cases doing too much. We are
> > > fighting with flaky Infortrend boxes here, and scsi_eh sometimes manages
> > > to crash their scsi channels. In most cases it is sufficient to stall any
> > > io to the device and then to resume.
>
> > But that's basically the default behaviour of the error handler (stall
> > then resume).
>
> > > For most scsi devices one probably doesn't need a suspend time or it can
> > > be very small, this still needs to become configurable via sysfs.
>
> > You mean a wait time beyond what the error handler currently does
> > (basically it waits for the quiesce, begins error handling and then
> > sends a test unit ready when it finishes before restarting).
>
> In deh just waits on the first error and then only does a DV. For
> these infortrend devices, thats mostly sufficient.

> > > Thirdly, scsi_eh doesn't give up, in most cases, when the scsi channel of
> > > a Infortrend box crashed, it tried forever to recover.
> > > To improve this is still on my todo list.
>
> > Could you send traces for this.  I thought the error handler had been
> > fixed over the last few years always to terminate.  If there's a case
> > where it doesn't, this needs fixing.
>
> I'm attaching the syslog, this is 2.6.22 + additional printks, dump_stack()'s
> and msleep()'s.
> At 03:59:36 the system finally went into wait_for_completion(), similar
> to the everything in wait_for_completion, what is my system doing? thread.

This looks like a genuine bug.  I missed the thread, since my email
system went off line while I was on holiday for two weeks.  The symptoms
look to be lost commands, but I can't see why from the traces.  There's
a known bug where we can hang in domain validation because of a resource
starvation issue, but I know of none where everything hangs just after
error recovery completes.

James




Re: [RFC] blktrace interface for sg devices

2007-12-13 Thread Christof Schmitt
On Thu, Dec 13, 2007 at 10:19:42AM +0100, Jens Axboe wrote:
[...]
> I think this approach is the simplest and right way to do it. Tracing is
> really just tied to the transport (transport here meaning how we
> transport commands to the device), and even character scsi devices use
> the block layer queue for this operation, as you note.
>
> Let me know when you are happy with the patch, and I'll queue it up for
> 2.6.25.
> > @@ -1066,6 +1068,16 @@ sg_ioctl(struct inode *inode, struct fil
> >  	case BLKSECTGET:
> >  		return put_user(sdp->device->request_queue->max_sectors * 512,
> >  				ip);
> > +	case BLKTRACESETUP:
> > +	{
> > +		return blk_trace_setup(sdp->device->request_queue , sdp->device->sdev_gendev.bus_id, sdp->device->sdev_gendev, arg);
> > +	}
>
> Don't need those braces, some other space and long line style issues as
> well.
>
> > --- a/include/linux/blkdev.h	2007-12-13 08:48:23.0 +0100
> > +++ b/include/linux/blkdev.h	2007-12-13 08:48:25.0 +0100
> > @@ -747,6 +747,16 @@ static inline void blkdev_dequeue_reques
> >  	elv_dequeue_request(req->q, req);
> >  }
> >  
> > +#ifdef CONFIG_BLK_DEV_IO_TRACE
> > +extern int blk_trace_setup(request_queue_t *q,  char * name, dev_t dev, char __user *arg);
> > +extern int blk_trace_startstop(request_queue_t *q, int start);
> > +extern int blk_trace_remove(request_queue_t *q);
> > +#else
> > +#define blk_trace_setup(q, name, dev, arg) do { } while(0)
> > +#define blk_trace_startstop(q, start) do { } while(0)
> > +#define blk_trace_remove(q) do { } while(0)
> > +#endif
> > +
>
> Put these in the blktrace include file.

Thanks for your input. I will prepare and send an updated version of
the patch. I also want to do some more testing, especially to see how
I can get the sizes of read and write requests and latencies for SCSI
tape drives.
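
One detail to watch in the updated version: the do { } while(0) stubs quoted above are statements, so a caller written as "return blk_trace_setup(...)" would not compile with tracing configured out. A minimal sketch of expression-valued stubs follows; the demo_ names and the DEMO_TRACE_ENABLED switch are hypothetical, and returning -ENOTTY from the stub is an assumption chosen to match the usual "ioctl not supported" convention.

```c
#include <errno.h>
#include <stddef.h>

/* DEMO_TRACE_ENABLED is a hypothetical stand-in for the config option. */
#ifdef DEMO_TRACE_ENABLED
int demo_trace_startstop(void *q, int start);
#else
/* An expression, so it can be returned or assigned by the caller. */
#define demo_trace_startstop(q, start) ((void)(q), (void)(start), -ENOTTY)
#endif

/* Mirrors the shape of the sg_ioctl() cases: the result is returned as is. */
static int demo_sg_ioctl(void *q, int start)
{
	return demo_trace_startstop(q, start);
}
```

With the stub in place, a caller on a kernel built without tracing simply gets an error code back instead of a build failure.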

Christof Schmitt


QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens,

I'm experimenting here with trying to generate large I/O through libata,
and not having much luck.

The limit seems to be the number of hardware PRD (SG) entries permitted
by the driver (libata:ata_piix), which is 128 by default.

The problem is, the block layer *never* sends an SG entry larger than 8192 bytes,
and even that size is exceptionally rare.  Nearly all I/O segments are 4096 bytes,
so I never see a single I/O larger than 512KB (128 * 4096).
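
The arithmetic behind that ceiling, and what clustering would change, can be written down directly. This is only a worked calculation with a hypothetical helper name; the 128-entry limit and 4096-byte pages are the figures given above.

```c
#include <stddef.h>

/* Largest single I/O when each SG entry covers pages_per_entry pages. */
static size_t max_io_bytes(size_t sg_entries, size_t pages_per_entry,
			   size_t page_size)
{
	return sg_entries * pages_per_entry * page_size;
}
```

With one page per entry, 128 entries give 524288 bytes (512KB); if adjacent pages were merged so that each entry covered 16 pages, the same 128 entries would allow 8MB.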

If I patch various parts of block and SCSI, this limit doesn't budge,
but when I change the hardware PRD limit in libata, it scales by exactly
whatever I set the new value to.  This tells me that adjacent I/O segments
are not being combined.

I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should
result in adjacent single pages being combined into larger physical segments?

This is x86-32 with latest 2.6.24-rc*.
I'll re-test on older kernels next.

???


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Matthew Wilcox
On Thu, Dec 13, 2007 at 01:37:59PM -0500, Mark Lord wrote:
> The problem is, the block layer *never* sends an SG entry larger than 8192
> bytes, and even that size is exceptionally rare.  Nearly all I/O segments
> are 4096 bytes, so I never see a single I/O larger than 512KB (128 * 4096).
>
> If I patch various parts of block and SCSI, this limit doesn't budge,
> but when I change the hardware PRD limit in libata, it scales by exactly
> whatever I set the new value to.  This tells me that adjacent I/O segments
> are not being combined.
>
> I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should
> result in adjacent single pages being combined into larger physical
> segments?

I was recently debugging a driver and noticed that consecutive pages in
an sg list are in the reverse order.  ie first you get page 918, then
917, 916, 915, 914, etc.  I vaguely remember James having patches to
correct this, but maybe they weren't merged?
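
To illustrate why page order matters for clustering, here is a minimal
user-space sketch. It is not the block layer's merging code: the function
name count_segments, the 4096-byte page size and the segment size cap passed
in by the caller are assumptions made for illustration. It counts how many
SG segments a list of page frame numbers needs when a page may only be merged
into the current segment if it directly follows the previous page in
physical memory.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size (x86-32) */

/*
 * Count the SG segments needed for n pages, given their page frame
 * numbers in submission order.  A page joins the current segment only
 * if it is physically adjacent to the previous page (pfn + 1) and the
 * segment stays within max_seg_bytes.  Simplified model, not kernel code.
 */
static size_t count_segments(const unsigned long *pfn, size_t n,
			     unsigned int max_seg_bytes)
{
	size_t segments = 0;
	unsigned int seg_bytes = 0;
	size_t i;

	assert(max_seg_bytes >= SKETCH_PAGE_SIZE);

	for (i = 0; i < n; i++) {
		int contiguous = i > 0 && pfn[i] == pfn[i - 1] + 1;

		if (contiguous && seg_bytes + SKETCH_PAGE_SIZE <= max_seg_bytes) {
			seg_bytes += SKETCH_PAGE_SIZE;	/* extend current segment */
		} else {
			segments++;			/* start a new segment */
			seg_bytes = SKETCH_PAGE_SIZE;
		}
	}
	return segments;
}
```

Under these assumptions, pages 914..918 in ascending order collapse into a
single 20KB segment, while the same pages in descending order need one
segment each, which would match the 128 * 4096 = 512KB ceiling Mark reports.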

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.


[PATCH 08/24] iscsi class: Use our own workq instead of common system one.

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

There is just too much going on through the common workq and
something like a scsi device removal through sysfs affects
how long it will take to recover the transport, mark it as
failed, or shut it down gracefully.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/scsi_transport_iscsi.c |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 75d3069..9cc2cc8 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -50,6 +50,7 @@ struct iscsi_internal {
 };
 
 static atomic_t iscsi_session_nr; /* sysfs session id for next new session */
+static struct workqueue_struct *iscsi_eh_timer_workq;
 
 /*
  * list of registered transports and lock that must
@@ -252,7 +253,7 @@ static void session_recovery_timedout(struct work_struct *work)
 void iscsi_unblock_session(struct iscsi_cls_session *session)
 {
 	if (!cancel_delayed_work(&session->recovery_work))
-		flush_scheduled_work();
+		flush_workqueue(iscsi_eh_timer_workq);
 	scsi_target_unblock(&session->dev);
 }
 EXPORT_SYMBOL_GPL(iscsi_unblock_session);
@@ -260,8 +261,8 @@ EXPORT_SYMBOL_GPL(iscsi_unblock_session);
 void iscsi_block_session(struct iscsi_cls_session *session)
 {
 	scsi_target_block(&session->dev);
-	schedule_delayed_work(&session->recovery_work,
-			     session->recovery_tmo * HZ);
+	queue_delayed_work(iscsi_eh_timer_workq, &session->recovery_work,
+			   session->recovery_tmo * HZ);
 }
 EXPORT_SYMBOL_GPL(iscsi_block_session);
 
@@ -357,7 +358,7 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
 	struct iscsi_host *ihost = shost->shost_data;
 
 	if (!cancel_delayed_work(&session->recovery_work))
-		flush_scheduled_work();
+		flush_workqueue(iscsi_eh_timer_workq);
 
 	mutex_lock(&ihost->mutex);
 	list_del(&session->host_list);
@@ -1521,8 +1522,14 @@ static __init int iscsi_transport_init(void)
 		goto unregister_session_class;
 	}
 
+	iscsi_eh_timer_workq = create_singlethread_workqueue("iscsi_eh");
+	if (!iscsi_eh_timer_workq)
+		goto release_nls;
+
 	return 0;
 
+release_nls:
+	sock_release(nls->sk_socket);
 unregister_session_class:
 	transport_class_unregister(&iscsi_session_class);
 unregister_conn_class:
@@ -1536,6 +1543,7 @@ unregister_transport_class:
 
 static void __exit iscsi_transport_exit(void)
 {
+	destroy_workqueue(iscsi_eh_timer_workq);
 	sock_release(nls->sk_socket);
 	transport_class_unregister(&iscsi_connection_class);
 	transport_class_unregister(&iscsi_session_class);
-- 
1.5.1.2



[PATCH 10/24] libiscsi: fix shutdown

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

We were using the device delete sysfs file to remove each device
and then logging out. As of 2.6.21 this no longer works because
the sysfs delete file returns immediately and does not wait for
the device removal to complete. This causes a hang if a cache sync
is needed during shutdown. Before .21, that approach had other
problems, so this patch fixes the shutdown code so that we remove the target
and unbind the session before logging out and shutting down the session.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c |4 +-
 drivers/scsi/qla4xxx/ql4_init.c |4 +-
 drivers/scsi/qla4xxx/ql4_os.c   |7 +-
 drivers/scsi/scsi_transport_iscsi.c |  289 +++
 include/scsi/iscsi_if.h |7 +
 include/scsi/iscsi_proto.h  |2 +
 include/scsi/scsi_transport_iscsi.h |7 +-
 7 files changed, 176 insertions(+), 144 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 441e351..5205ef2 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1662,7 +1662,7 @@ void iscsi_session_teardown(struct iscsi_cls_session *cls_session)
 	struct iscsi_session *session = iscsi_hostdata(shost->hostdata);
 	struct module *owner = cls_session->transport->owner;
 
-	iscsi_unblock_session(cls_session);
+	iscsi_remove_session(cls_session);
 	scsi_remove_host(shost);
 
 	iscsi_pool_free(&session->mgmtpool);
@@ -1677,7 +1677,7 @@ void iscsi_session_teardown(struct iscsi_cls_session *cls_session)
 	kfree(session->hwaddress);
 	kfree(session->initiatorname);
 
-	iscsi_destroy_session(cls_session);
+	iscsi_free_session(cls_session);
 	scsi_host_put(shost);
 	module_put(owner);
 }
diff --git a/drivers/scsi/qla4xxx/ql4_init.c b/drivers/scsi/qla4xxx/ql4_init.c
index d692c71..cbe0a17 100644
--- a/drivers/scsi/qla4xxx/ql4_init.c
+++ b/drivers/scsi/qla4xxx/ql4_init.c
@@ -5,6 +5,7 @@
  * See LICENSE.qla4xxx for copyright and licensing details.
  */
 
+#include <scsi/iscsi_if.h>
 #include "ql4_def.h"
 #include "ql4_glbl.h"
 #include "ql4_dbg.h"
@@ -1305,7 +1306,8 @@ int qla4xxx_process_ddb_changed(struct scsi_qla_host *ha,
 		atomic_set(&ddb_entry->relogin_timer, 0);
 		clear_bit(DF_RELOGIN, &ddb_entry->flags);
 		clear_bit(DF_NO_RELOGIN, &ddb_entry->flags);
-		iscsi_if_create_session_done(ddb_entry->conn);
+		iscsi_session_event(ddb_entry->sess,
+				    ISCSI_KEVENT_CREATE_SESSION);
 		/*
 		 * Change the lun state to READY in case the lun TIMEOUT before
 		 * the device came back.
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 89460d2..f55b9f7 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -298,8 +298,7 @@ void qla4xxx_destroy_sess(struct ddb_entry *ddb_entry)
 		return;
 
 	if (ddb_entry->conn) {
-		iscsi_if_destroy_session_done(ddb_entry->conn);
-		iscsi_destroy_conn(ddb_entry->conn);
+		atomic_set(&ddb_entry->state, DDB_STATE_DEAD);
 		iscsi_remove_session(ddb_entry->sess);
 	}
 	iscsi_free_session(ddb_entry->sess);
@@ -309,6 +308,7 @@ int qla4xxx_add_sess(struct ddb_entry *ddb_entry)
 {
 	int err;
 
+	ddb_entry->sess->recovery_tmo = ddb_entry->ha->port_down_retry_count;
 	err = iscsi_add_session(ddb_entry->sess, ddb_entry->fw_ddb_index);
 	if (err) {
 		DEBUG2(printk(KERN_ERR "Could not add session.\n"));
@@ -321,9 +321,6 @@ int qla4xxx_add_sess(struct ddb_entry *ddb_entry)
 		DEBUG2(printk(KERN_ERR "Could not add connection.\n"));
 		return -ENOMEM;
 	}
-
-	ddb_entry->sess->recovery_tmo = ddb_entry->ha->port_down_retry_count;
-	iscsi_if_create_session_done(ddb_entry->conn);
 	return 0;
 }
 
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 9cc2cc8..b82139d 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -116,6 +116,8 @@ static struct attribute_group iscsi_transport_group = {
 	.attrs = iscsi_transport_attrs,
 };
 
+
+
 static int iscsi_setup_host(struct transport_container *tc, struct device *dev,
 			    struct class_device *cdev)
 {
@@ -125,13 +127,30 @@ static int iscsi_setup_host(struct transport_container *tc, struct device *dev,
 	memset(ihost, 0, sizeof(*ihost));
 	INIT_LIST_HEAD(&ihost->sessions);
 	mutex_init(&ihost->mutex);
+
+	snprintf(ihost->unbind_workq_name, KOBJ_NAME_LEN, "iscsi_unbind_%d",
+		shost->host_no);
+	ihost->unbind_workq = create_singlethread_workqueue(
+						ihost->unbind_workq_name);
+	if (!ihost->unbind_workq)
+		return -ENOMEM;
+   

[PATCH 13/24] Do not fail commands immediately during logout

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

If the target requests a logout, then we do not want
to fail commands to scsi-ml right away. This patch just
fails pending commands for an immediate requeue, and then lets
iscsid handle running commands as in normal recovery.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c |   14 ++++++--------
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 9688361..b17081b 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -917,7 +917,7 @@ check_mgmt:
 		conn->ctask = list_entry(conn->xmitqueue.next,
 					 struct iscsi_cmd_task, running);
 		if (conn->session->state == ISCSI_STATE_LOGGING_OUT) {
-			fail_command(conn, conn->ctask, DID_NO_CONNECT << 16);
+			fail_command(conn, conn->ctask, DID_IMM_RETRY << 16);
 			continue;
 		}
 		if (iscsi_prep_scsi_cmd_pdu(conn->ctask)) {
@@ -1024,21 +1024,19 @@ int iscsi_queuecommand(struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *))
 		 * be entering our queuecommand while a block is starting
 		 * up because the block code is not locked)
 		 */
-		if (session->state == ISCSI_STATE_IN_RECOVERY) {
+		switch (session->state) {
+		case ISCSI_STATE_IN_RECOVERY:
 			reason = FAILURE_SESSION_IN_RECOVERY;
 			goto reject;
-		}
-
-		switch (session->state) {
+		case ISCSI_STATE_LOGGING_OUT:
+			reason = FAILURE_SESSION_LOGGING_OUT;
+			goto reject;
 		case ISCSI_STATE_RECOVERY_FAILED:
 			reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
 			break;
 		case ISCSI_STATE_TERMINATE:
 			reason = FAILURE_SESSION_TERMINATE;
 			break;
-		case ISCSI_STATE_LOGGING_OUT:
-			reason = FAILURE_SESSION_LOGGING_OUT;
-			break;
 		default:
 			reason = FAILURE_SESSION_FREED;
 		}
-- 
1.5.1.2



[PATCH 14/24] clear conn->ctask when task is completed early

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

If the current ctask is failed early, we left the conn->ctask pointer
pointing to an invalid task. When the xmit thread would send data for
it, we would then oops.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index b17081b..4461317 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -248,13 +248,16 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task *ctask)
  */
 static void iscsi_complete_command(struct iscsi_cmd_task *ctask)
 {
-	struct iscsi_session *session = ctask->conn->session;
+	struct iscsi_conn *conn = ctask->conn;
+	struct iscsi_session *session = conn->session;
 	struct scsi_cmnd *sc = ctask->sc;
 
 	ctask->state = ISCSI_TASK_COMPLETED;
 	ctask->sc = NULL;
 	/* SCSI eh reuses commands to verify us */
 	sc->SCp.ptr = NULL;
+	if (conn->ctask == ctask)
+		conn->ctask = NULL;
 	list_del_init(&ctask->running);
 	__kfifo_put(session->cmdpool.queue, (void*)&ctask, sizeof(void*));
 	sc->scsi_done(sc);
-- 
1.5.1.2



[PATCH 18/24] iscsi_tcp: drop session when itt does not match any command

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

A target should never send us an itt that does not match a running
task. If it does, we do not really know what is coming down after the header,
unless we evaluate the hdr and sometimes do some guessing. However,
even if we know what is coming we probably do not have buffers for it or we
cannot respond (if it is an r2t for example), so just drop the session.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index ecba606..65df908 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -755,11 +755,7 @@ iscsi_tcp_hdr_dissect(struct iscsi_conn *conn, struct iscsi_hdr *hdr)
 	opcode = hdr->opcode & ISCSI_OPCODE_MASK;
 	/* verify itt (itt encoding: age+cid+itt) */
 	rc = iscsi_verify_itt(conn, hdr, &itt);
-	if (rc == ISCSI_ERR_NO_SCSI_CMD) {
-		/* XXX: what does this do? */
-		tcp_conn->in.datalen = 0; /* force drop */
-		return 0;
-	} else if (rc)
+	if (rc)
 		return rc;
 
 	debug_tcp("opcode 0x%x ahslen %d datalen %d\n",
-- 
1.5.1.2



[PATCH 19/24] libiscsi, iscsi class: set tmf to a safe default and export in sysfs

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

Older tools will not be setting the tmf timeouts since they
did not exist, so set them to a safe default.

And export abort and lu reset timeout values in sysfs.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c             |    2 ++
 drivers/scsi/scsi_transport_iscsi.c |    8 ++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index f15df8d..6573223 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1732,6 +1732,8 @@ iscsi_session_setup(struct iscsi_transport *iscsit,
 	session->host = shost;
 	session->state = ISCSI_STATE_FREE;
 	session->fast_abort = 1;
+	session->lu_reset_timeout = 15;
+	session->abort_timeout = 10;
 	session->mgmtpool_max = ISCSI_MGMT_CMDS_MAX;
 	session->cmds_max = cmds_max;
 	session->queued_cmdsn = session->cmdsn = initial_cmdsn;
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 36aa50e..3585599 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -30,7 +30,7 @@
 #include <scsi/scsi_transport_iscsi.h>
 #include <scsi/iscsi_if.h>
 
-#define ISCSI_SESSION_ATTRS 16
+#define ISCSI_SESSION_ATTRS 18
 #define ISCSI_CONN_ATTRS 11
 #define ISCSI_HOST_ATTRS 4
 #define ISCSI_TRANSPORT_VERSION "2.0-724"
@@ -1242,7 +1242,9 @@ iscsi_session_attr(username, ISCSI_PARAM_USERNAME, 1);
 iscsi_session_attr(username_in, ISCSI_PARAM_USERNAME_IN, 1);
 iscsi_session_attr(password, ISCSI_PARAM_PASSWORD, 1);
 iscsi_session_attr(password_in, ISCSI_PARAM_PASSWORD_IN, 1);
-iscsi_session_attr(fast_abort, ISCSI_PARAM_FAST_ABORT, 1);
+iscsi_session_attr(fast_abort, ISCSI_PARAM_FAST_ABORT, 0);
+iscsi_session_attr(abort_tmo, ISCSI_PARAM_ABORT_TMO, 0);
+iscsi_session_attr(lu_reset_tmo, ISCSI_PARAM_LU_RESET_TMO, 0);
 
 #define iscsi_priv_session_attr_show(field, format)			\
 static ssize_t								\
@@ -1467,6 +1469,8 @@ iscsi_register_transport(struct iscsi_transport *tt)
 	SETUP_SESSION_RD_ATTR(username, ISCSI_PASSWORD);
 	SETUP_SESSION_RD_ATTR(username_in, ISCSI_PASSWORD_IN);
 	SETUP_SESSION_RD_ATTR(fast_abort, ISCSI_FAST_ABORT);
+	SETUP_SESSION_RD_ATTR(abort_tmo, ISCSI_ABORT_TMO);
+	SETUP_SESSION_RD_ATTR(lu_reset_tmo,ISCSI_LU_RESET_TMO);
 	SETUP_PRIV_SESSION_RD_ATTR(recovery_tmo);
 
 	BUG_ON(count > ISCSI_SESSION_ATTRS);
-- 
1.5.1.2



[PATCH 21/24] iscsi_tcp: hold lock during data rsp processing

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

iscsi_data_rsp needs to hold the session lock when it calls
iscsi_update_cmdsn.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |   14 ++++++--------
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 84c4a50..edebdf2 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -641,13 +641,11 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 	}
 
 	/* fill-in new R2T associated with the task */
-	spin_lock(&session->lock);
 	iscsi_update_cmdsn(session, (struct iscsi_nopin*)rhdr);
 
 	if (!ctask->sc || session->state != ISCSI_STATE_LOGGED_IN) {
 		printk(KERN_INFO "iscsi_tcp: dropping R2T itt %d in "
 		       "recovery...\n", ctask->itt);
-		spin_unlock(&session->lock);
 		return 0;
 	}
 
@@ -660,7 +658,6 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 		printk(KERN_ERR "iscsi_tcp: invalid R2T with zero data len\n");
 		__kfifo_put(tcp_ctask->r2tpool.queue, (void*)&r2t,
 			    sizeof(void*));
-		spin_unlock(&session->lock);
 		return ISCSI_ERR_DATALEN;
 	}
 
@@ -676,7 +673,6 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 		       r2t->data_offset, scsi_bufflen(ctask->sc));
 		__kfifo_put(tcp_ctask->r2tpool.queue, (void*)&r2t,
 			    sizeof(void*));
-		spin_unlock(&session->lock);
 		return ISCSI_ERR_DATALEN;
 	}
 
@@ -690,8 +686,6 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 	conn->r2t_pdus_cnt++;
 
 	iscsi_requeue_ctask(ctask);
-	spin_unlock(&session->lock);
-
 	return 0;
 }
 
@@ -764,7 +758,9 @@ iscsi_tcp_hdr_dissect(struct iscsi_conn *conn, struct iscsi_hdr *hdr)
 	switch(opcode) {
 	case ISCSI_OP_SCSI_DATA_IN:
 		ctask = session->cmds[itt];
+		spin_lock(&conn->session->lock);
 		rc = iscsi_data_rsp(conn, ctask);
+		spin_unlock(&conn->session->lock);
 		if (rc)
 			return rc;
 		if (tcp_conn->in.datalen) {
@@ -806,9 +802,11 @@ iscsi_tcp_hdr_dissect(struct iscsi_conn *conn, struct iscsi_hdr *hdr)
 		ctask = session->cmds[itt];
 		if (ahslen)
 			rc = ISCSI_ERR_AHSLEN;
-		else if (ctask->sc->sc_data_direction == DMA_TO_DEVICE)
+		else if (ctask->sc->sc_data_direction == DMA_TO_DEVICE) {
+			spin_lock(&session->lock);
 			rc = iscsi_r2t_rsp(conn, ctask);
-		else
+			spin_unlock(&session->lock);
+		} else
 			rc = ISCSI_ERR_PROTO;
 		break;
 	case ISCSI_OP_LOGIN_RSP:
-- 
1.5.1.2



[PATCH 22/24] libiscsi: use is_power_of_2

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

Patch from vignesh babu [EMAIL PROTECTED]:

Replacing n & (n - 1) for power of 2 check by is_power_of_2(n)

Signed-off-by: vignesh babu [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 6573223..553168a 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -24,6 +24,7 @@
 #include <linux/types.h>
 #include <linux/kfifo.h>
 #include <linux/delay.h>
+#include <linux/log2.h>
 #include <asm/unaligned.h>
 #include <net/tcp.h>
 #include <scsi/scsi_cmnd.h>
@@ -1700,7 +1701,7 @@ iscsi_session_setup(struct iscsi_transport *iscsit,
 		qdepth = ISCSI_DEF_CMD_PER_LUN;
 	}
 
-	if (cmds_max < 2 || (cmds_max & (cmds_max - 1)) ||
+	if (!is_power_of_2(cmds_max) ||
 	    cmds_max >= ISCSI_MGMT_ITT_OFFSET) {
 		if (cmds_max != 0)
 			printk(KERN_ERR "iscsi: invalid can_queue of %d. "
-- 
1.5.1.2
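
For reference, a minimal user-space sketch of the two conditions involved.
The names old_check_rejects and new_check_rejects are hypothetical, the
ISCSI_MGMT_ITT_OFFSET bound is left out, and sketch_is_power_of_2 assumes
the kernel helper is true only for non-zero values with exactly one bit set.
Under that assumption the two conditions agree everywhere except at 1, which
the old "cmds_max < 2" test rejected and the new test accepts.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed equivalent of the kernel helper: non-zero and exactly one bit set. */
static bool sketch_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Condition before the patch (upper bound check left out). */
static bool old_check_rejects(int cmds_max)
{
	return cmds_max < 2 || (cmds_max & (cmds_max - 1)) != 0;
}

/* Condition after the patch (upper bound check left out). */
static bool new_check_rejects(int cmds_max)
{
	return !sketch_is_power_of_2((unsigned long)cmds_max);
}

/* A few sample values; 1 is where the two conditions disagree. */
void power_of_2_examples(void)
{
	assert(old_check_rejects(0) && new_check_rejects(0));
	assert(old_check_rejects(1) && !new_check_rejects(1));
	assert(old_check_rejects(96) && new_check_rejects(96));
	assert(!old_check_rejects(128) && !new_check_rejects(128));
}
```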



[PATCH 23/24] iscsi_tcp: fix setting of r2t

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

If we negotiate for X r2ts we have to use only X r2ts. We cannot
round up (we could send less though). It is ok to fail if it
is not something the driver can handle, so this patch just does
that.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index edebdf2..e5be5fd 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -1774,12 +1774,12 @@ iscsi_conn_set_param(struct iscsi_cls_conn *cls_conn, enum iscsi_param param,
 		break;
 	case ISCSI_PARAM_MAX_R2T:
 		sscanf(buf, "%d", &value);
-		if (session->max_r2t == roundup_pow_of_two(value))
+		if (value <= 0 || !is_power_of_2(value))
+			return -EINVAL;
+		if (session->max_r2t == value)
 			break;
 		iscsi_r2tpool_free(session);
 		iscsi_set_param(cls_conn, param, buf, buflen);
-		if (session->max_r2t & (session->max_r2t - 1))
-			session->max_r2t = roundup_pow_of_two(session->max_r2t);
 		if (iscsi_r2tpool_alloc(session))
 			return -ENOMEM;
 		break;
-- 
1.5.1.2



[PATCH 17/24] iscsi_tcp: stop leaking r2t_info's when the incoming R2T is bad

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

from [EMAIL PROTECTED]:

iscsi_r2t_rsp checks the incoming R2T for sanity, and if it
thinks it's fishy, it will drop it silently. In this case, we
leaked an r2t_info object. If we do this often enough, we run
into a BUG_ON some time later.

Removed r2t wrappers and update patch by Mike Christie

Signed-off-by: Olaf Kirch [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 7212fe9..ecba606 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -658,6 +658,8 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 	r2t->data_length = be32_to_cpu(rhdr->data_length);
 	if (r2t->data_length == 0) {
 		printk(KERN_ERR "iscsi_tcp: invalid R2T with zero data len\n");
+		__kfifo_put(tcp_ctask->r2tpool.queue, (void*)&r2t,
+			    sizeof(void*));
 		spin_unlock(&session->lock);
 		return ISCSI_ERR_DATALEN;
 	}
@@ -669,10 +671,12 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 
 	r2t->data_offset = be32_to_cpu(rhdr->data_offset);
 	if (r2t->data_offset + r2t->data_length > scsi_bufflen(ctask->sc)) {
-		spin_unlock(&session->lock);
 		printk(KERN_ERR "iscsi_tcp: invalid R2T with data len %u at "
 		       "offset %u and total length %d\n", r2t->data_length,
 		       r2t->data_offset, scsi_bufflen(ctask->sc));
+		__kfifo_put(tcp_ctask->r2tpool.queue, (void*)&r2t,
+			    sizeof(void*));
+		spin_unlock(&session->lock);
 		return ISCSI_ERR_DATALEN;
 	}
 
 
-- 
1.5.1.2



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread James Bottomley

On Thu, 2007-12-13 at 11:42 -0700, Matthew Wilcox wrote:
> On Thu, Dec 13, 2007 at 01:37:59PM -0500, Mark Lord wrote:
> > The problem is, the block layer *never* sends an SG entry larger than 8192
> > bytes, and even that size is exceptionally rare.  Nearly all I/O segments
> > are 4096 bytes, so I never see a single I/O larger than 512KB (128 * 4096).
> >
> > If I patch various parts of block and SCSI, this limit doesn't budge,
> > but when I change the hardware PRD limit in libata, it scales by exactly
> > whatever I set the new value to.  This tells me that adjacent I/O segments
> > are not being combined.
> >
> > I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should
> > result in adjacent single pages being combined into larger physical
> > segments?
>
> I was recently debugging a driver and noticed that consecutive pages in
> an sg list are in the reverse order.  ie first you get page 918, then
> 917, 916, 915, 914, etc.  I vaguely remember James having patches to
> correct this, but maybe they weren't merged?

Yes, they were ... it was actually Bill Irwin's patch.  The old problem
was that we fault allocations in reverse order (because we were taking
from the end of the zone list).  I can't remember when his patches went
in, but it was several years ago.  After they did, I was getting a 33%
chance of physical merging (as opposed to zero before).  Probably
someone redid the vm or the zones without understanding this and we've
gone back to the original position.

James




[PATCH 05/24] iser patching for AHS support

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

from Boaz Harrosh [EMAIL PROTECTED]

  - The default initialization of hdr_max is the minimum,
    sizeof(struct iscsi_cmd). Once this patch goes into iser the default
    initialization at libiscsi can be removed.
  - This is not yet full support for AHSs at the iser end, but it should be
    easy. Just allocate more space in iser_desc right after iscsi_hdr. Then
    at transmission time use ctask->hdr_len to retrieve the total
    size of all iscsi pdu headers. See the previous patch to iscsi_tcp.[ch].

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/iser/iscsi_iser.c |    1 +
 drivers/scsi/libiscsi.c                  |    1 -
 2 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 2eadb6d..a2622f4 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -400,6 +400,7 @@ iscsi_iser_session_create(struct iscsi_transport *iscsit,
 		ctask      = session->cmds[i];
 		iser_ctask = ctask->dd_data;
 		ctask->hdr = (struct iscsi_cmd *)&iser_ctask->desc.iscsi_header;
+		ctask->hdr_max = sizeof(iser_ctask->desc.iscsi_header);
 	}
 
 	for (i = 0; i < session->mgmtpool_max; i++) {
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 0d7914f..5936586 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1570,7 +1570,6 @@ iscsi_session_setup(struct iscsi_transport *iscsit,
 		if (cmd_task_size)
 			ctask->dd_data = &ctask[1];
 		ctask->itt = cmd_i;
-		ctask->hdr_max = sizeof(struct iscsi_cmd);
 		INIT_LIST_HEAD(&ctask->running);
 	}
 
 
-- 
1.5.1.2



[PATCH 04/24] iscsi_tcp, libiscsi: initial AHS Support

2007-12-13 Thread michaelc
From: Mike Christie [EMAIL PROTECTED]

  at libiscsi generic code
  - Currently the code assumes that storage space for the pdu header is
    allocated at the LLD's ctask and is pointed to by iscsi_cmd_task->hdr.
    Here I add a hdr_max field pertaining to that storage, and an hdr_len
    that accumulates the current use of the pdu header.

  - Add an iscsi_next_hdr() inline which returns the next free space
    to write a new header at. iscsi_next_hdr() is also used to retrieve
    the address at which to write the header digest.

  - Add iscsi_add_hdr(length). The user calls iscsi_next_hdr() for the
    address of the new header, then calls iscsi_add_hdr(length) with
    the size of the new header. iscsi_add_hdr() will check if space is
    available and update to the new size. length must be padded according
    to the standard.

  - Add 2 padding inline helpers, thanks to Olaf. The current patch does not
    use them but following patches will.
    Also moved the definition of ISCSI_PAD_LEN to iscsi_proto.h, which had
    a PAD_WORD_LEN that was never used anywhere.

  - Let iscsi_prep_scsi_cmd_pdu() signal an error return, since it is now
    possible that it will fail.

  - I was tired of yet again writing a "this is a digest" comment next to
    sizeof(__u32), so I defined a new ISCSI_DIGEST_SIZE. Now I don't need
    any comments. Changed all places that used sizeof(__u32) or "4" in
    connection with a digest.

  iscsi_tcp specific code
  - In struct iscsi_tcp_cmd_task, allocate the maximum space allowed by the
    standard for all headers following the iscsi_cmd header, and mark
    it so in iscsi_tcp_session_create().
  - In iscsi_send_cmd_hdr(), retrieve the correct header size and
    write the header digest at iscsi_next_hdr().
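
The header bookkeeping described above can be sketched in user space as
follows. This is an illustration under stated assumptions, not the code from
the patch: struct sketch_task and the sketch_* functions are hypothetical
stand-ins for iscsi_cmd_task, iscsi_padding(), iscsi_next_hdr() and
iscsi_add_hdr(), and rejecting an unpadded length with -EINVAL is a choice
made for the sketch (the real helper may only warn).

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_PAD_LEN 4u	/* iSCSI pads to 4-byte words */

/* Hypothetical stand-in for the header fields of iscsi_cmd_task. */
struct sketch_task {
	unsigned char *hdr;	/* start of the LLD-provided header storage */
	unsigned int hdr_max;	/* bytes of storage available for headers */
	unsigned int hdr_len;	/* bytes of headers written so far */
};

/* Number of pad bytes needed to bring len up to a 4-byte boundary. */
static unsigned int sketch_padding(unsigned int len)
{
	len &= SKETCH_PAD_LEN - 1;
	return len ? SKETCH_PAD_LEN - len : 0;
}

/* Address at which the next header (or the header digest) is written. */
static void *sketch_next_hdr(const struct sketch_task *task)
{
	return task->hdr + task->hdr_len;
}

/* Account for len more header bytes; fails if unpadded or out of space. */
static int sketch_add_hdr(struct sketch_task *task, unsigned int len)
{
	assert(task->hdr_len <= task->hdr_max);

	if (sketch_padding(len) != 0)
		return -EINVAL;
	if (len > task->hdr_max - task->hdr_len)
		return -EINVAL;
	task->hdr_len += len;
	return 0;
}
```

A caller would first take the address from sketch_next_hdr(), fill in the new
header there, and then call sketch_add_hdr() with its padded size, so that a
later sketch_next_hdr() points just past it.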

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Acked-by: Olaf Kirch [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c   |   16 
 drivers/scsi/iscsi_tcp.h   |   13 +++--
 drivers/scsi/libiscsi.c|   41 +++--
 include/scsi/iscsi_proto.h |   10 +-
 include/scsi/libiscsi.h|   33 +++--
 5 files changed, 94 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index fd88777..491845f 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -113,7 +113,7 @@ iscsi_hdr_digest(struct iscsi_conn *conn, struct iscsi_buf *buf,
 	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
 
 	crypto_hash_digest(&tcp_conn->tx_hash, &buf->sg, buf->sg.length, crc);
-	buf->sg.length += sizeof(u32);
+	buf->sg.length += ISCSI_DIGEST_SIZE;
 }
 
 /*
@@ -220,6 +220,7 @@ static inline int
 iscsi_tcp_chunk_done(struct iscsi_chunk *chunk)
 {
 	static unsigned char padbuf[ISCSI_PAD_LEN];
+	unsigned int pad;
 
 	if (chunk->copied < chunk->size) {
 		iscsi_tcp_chunk_map(chunk);
@@ -243,10 +244,8 @@ iscsi_tcp_chunk_done(struct iscsi_chunk *chunk)
 	}
 
 	/* Do we need to handle padding? */
-	if (chunk->total_copied & (ISCSI_PAD_LEN-1)) {
-		unsigned int pad;
-
-		pad = ISCSI_PAD_LEN - (chunk->total_copied & (ISCSI_PAD_LEN-1));
+	pad = iscsi_padding(chunk->total_copied);
+	if (pad != 0) {
 		debug_tcp("consume %d pad bytes\n", pad);
 		chunk->total_size += pad;
 		chunk->size = pad;
@@ -1385,11 +1384,11 @@ iscsi_send_cmd_hdr(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 		}
 
 		iscsi_buf_init_iov(&tcp_ctask->headbuf, (char*)ctask->hdr,
-				  sizeof(struct iscsi_hdr));
+				  ctask->hdr_len);
 
 		if (conn->hdrdgst_en)
 			iscsi_hdr_digest(conn, &tcp_ctask->headbuf,
-					 (u8*)tcp_ctask->hdrext);
+					 iscsi_next_hdr(ctask));
 		tcp_ctask->xmstate &= ~XMSTATE_CMD_HDR_INIT;
 		tcp_ctask->xmstate |= XMSTATE_CMD_HDR_XMIT;
 	}
@@ -2176,7 +2175,8 @@ iscsi_tcp_session_create(struct iscsi_transport *iscsit,
 		struct iscsi_cmd_task *ctask = session->cmds[cmd_i];
 		struct iscsi_tcp_cmd_task *tcp_ctask = ctask->dd_data;
 
-		ctask->hdr = &tcp_ctask->hdr;
+		ctask->hdr = &tcp_ctask->hdr.cmd_hdr;
+		ctask->hdr_max = sizeof(tcp_ctask->hdr) - ISCSI_DIGEST_SIZE;
 	}
 
 	for (cmd_i = 0; cmd_i < session->mgmtpool_max; cmd_i++) {
diff --git a/drivers/scsi/iscsi_tcp.h b/drivers/scsi/iscsi_tcp.h
index f1c5411..eb3784f 100644
--- a/drivers/scsi/iscsi_tcp.h
+++ b/drivers/scsi/iscsi_tcp.h
@@ -41,7 +41,6 @@
 #define XMSTATE_IMM_HDR_INIT   0x1000
 #define XMSTATE_SOL_HDR_INIT   0x2000
 
-#define ISCSI_PAD_LEN  4
 #define ISCSI_SG_TABLESIZE SG_ALL
 #define ISCSI_TCP_MAX_CMD_LEN  16
 
@@ -130,14 +129,14 @@ struct iscsi_buf {

Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Mark Lord wrote:

(resending with corrected email address for Jens)

Jens,

I'm experimenting here with trying to generate large I/O through libata,
and not having much luck.

The limit seems to be the number of hardware PRD (SG) entries permitted
by the driver (libata:ata_piix), which is 128 by default.

The problem is, the block layer *never* sends an SG entry larger than 
8192 bytes,
and even that size is exceptionally rare.  Nearly all I/O segments are 
4096 bytes,

so I never see a single I/O larger than 512KB (128 * 4096).

If I patch various parts of block and SCSI, this limit doesn't budge,
but when I change the hardware PRD limit in libata, it scales by exactly
whatever I set the new value to.  This tells me that adjacent I/O segments
are not being combined.

I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should
result in adjacent single pages being combined into larger physical segments?


This is x86-32 with latest 2.6.24-rc*.
I'll re-test on older kernels next.

...

Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for libata,
but 2.6.24 uses only 4KB segments and a *few* 8KB segments.

???
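
The 512KB ceiling reported above follows from the segment arithmetic: a request can carry at most (number of SG entries) times (bytes per entry). A minimal sketch in C, assuming two hypothetical helpers, max_request_bytes() and segments_needed(), which are illustrations only and not kernel functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a kernel API: largest payload one request can
 * carry when the driver accepts at most max_segs scatter-gather entries
 * and each entry covers at most seg_bytes. */
static size_t max_request_bytes(size_t max_segs, size_t seg_bytes)
{
	assert(seg_bytes > 0);
	return max_segs * seg_bytes;
}

/* Hypothetical helper: number of entries needed for io_bytes, rounding up. */
static size_t segments_needed(size_t io_bytes, size_t seg_bytes)
{
	assert(seg_bytes > 0);
	return (io_bytes + seg_bytes - 1) / seg_bytes;
}
```

With 128 entries of 4096 bytes, max_request_bytes() gives 524288 bytes, the 512KB observed on 2.6.24. With the 64KB segments seen on 2.6.23.8, the same 128 entries could in principle cover 8MB, although other queue limits may cap a request before that. Put the other way round, a 4MB transfer needs 1024 entries at 4KB each but only 64 at 64KB each.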
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Matthew Wilcox
On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
 Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for 
 libata,
 but 2.6.24 uses only 4KB segments and a *few* 8KB segments.

Just a suspicion ... could this be slab vs slub?  ie check your configs
are the same / similar between the two kernels.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Matthew Wilcox wrote:

On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for 
libata,

but 2.6.24 uses only 4KB segments and a *few* 8KB segments.


Just a suspicion ... could this be slab vs slub?  ie check your configs
are the same / similar between the two kernels.

..

Mmmm.. a good thought, that one.
But I just rechecked, and both have CONFIG_SLAB=y

My guess is that something got changed around when Jens
reworked the block layer for 2.6.24.
I'm going to dig around in there now.



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Jens Axboe
On Thu, Dec 13 2007, Mark Lord wrote:
 Matthew Wilcox wrote:
 On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
 Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for 
 libata,
 but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
 
 Just a suspicion ... could this be slab vs slub?  ie check your configs
 are the same / similar between the two kernels.
 ..
 
 Mmmm.. a good thought, that one.
 But I just rechecked, and both have CONFIG_SLAB=y
 
 My guess is that something got changed around when Jens
 reworked the block layer for 2.6.24.
 I'm going to dig around in there now.

I didn't rework the block layer for 2.6.24 :-). The core block layer
changes since 2.6.23 are:

- Support for empty barriers. Not a likely candidate.
- Shared tag queue fixes. Totally unlikely.
- sg chaining support. Not likely.
- The bio changes from Neil. Of the bunch, the most likely suspects in
  this area, since it changes some of the code involved with merges and
  blk_rq_map_sg().
- Lots of simple stuff, again very unlikely.

Anyway, it sounds odd for this to be a block layer problem if you do see
occasional segments being merged. So it sounds more like the input data
having changed.

Why not just bisect it?

-- 
Jens Axboe



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Matthew Wilcox wrote:

On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for 
libata,

but 2.6.24 uses only 4KB segments and a *few* 8KB segments.

Just a suspicion ... could this be slab vs slub?  ie check your configs
are the same / similar between the two kernels.

..

Mmmm.. a good thought, that one.
But I just rechecked, and both have CONFIG_SLAB=y

My guess is that something got changed around when Jens
reworked the block layer for 2.6.24.
I'm going to dig around in there now.


I didn't rework the block layer for 2.6.24 :-). The core block layer
changes since 2.6.23 are:

- Support for empty barriers. Not a likely candidate.
- Shared tag queue fixes. Totally unlikely.
- sg chaining support. Not likely.
- The bio changes from Neil. Of the bunch, the most likely suspects in
  this area, since it changes some of the code involved with merges and
  blk_rq_map_sg().
- Lots of simple stuff, again very unlikely.

Anyway, it sounds odd for this to be a block layer problem if you do see
occasional segments being merged. So it sounds more like the input data
having changed.

Why not just bisect it?

..

Because the early 2.6.24 series failed to boot on this machine
due to bugs in the block layer -- so the code that caused this regression
is probably in the stuff from before the kernels became usable here.

Cheers



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Mark Lord wrote:

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Matthew Wilcox wrote:

On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
Problem confirmed.  2.6.23.8 regularly generates segments up to 
64KB for libata,

but 2.6.24 uses only 4KB segments and a *few* 8KB segments.

Just a suspicion ... could this be slab vs slub?  ie check your configs
are the same / similar between the two kernels.

..

Mmmm.. a good thought, that one.
But I just rechecked, and both have CONFIG_SLAB=y

My guess is that something got changed around when Jens
reworked the block layer for 2.6.24.
I'm going to dig around in there now.


I didn't rework the block layer for 2.6.24 :-). The core block layer
changes since 2.6.23 are:

- Support for empty barriers. Not a likely candidate.
- Shared tag queue fixes. Totally unlikely.
- sg chaining support. Not likely.
- The bio changes from Neil. Of the bunch, the most likely suspects in
  this area, since it changes some of the code involved with merges and
  blk_rq_map_sg().
- Lots of simple stuff, again very unlikely.

Anyway, it sounds odd for this to be a block layer problem if you do see
occasional segments being merged. So it sounds more like the input data
having changed.

Why not just bisect it?

..

Because the early 2.6.24 series failed to boot on this machine
due to bugs in the block layer -- so the code that caused this regression
is probably in the stuff from before the kernels became usable here.

..

That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to
the first couple of -rc* ones) failed here because of incompatibilities
between the block/bio changes and libata.

That's better, I think! 


Cheers


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Jens Axboe
On Thu, Dec 13 2007, Mark Lord wrote:
 Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Matthew Wilcox wrote:
 On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
 Problem confirmed.  2.6.23.8 regularly generates segments up to 
 64KB for libata,
 but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
 Just a suspicion ... could this be slab vs slub?  ie check your configs
 are the same / similar between the two kernels.
 ..
 
 Mmmm.. a good thought, that one.
 But I just rechecked, and both have CONFIG_SLAB=y
 
 My guess is that something got changed around when Jens
 reworked the block layer for 2.6.24.
 I'm going to dig around in there now.
 
 I didn't rework the block layer for 2.6.24 :-). The core block layer
 changes since 2.6.23 are:
 
 - Support for empty barriers. Not a likely candidate.
 - Shared tag queue fixes. Totally unlikely.
 - sg chaining support. Not likely.
 - The bio changes from Neil. Of the bunch, the most likely suspects in
   this area, since it changes some of the code involved with merges and
   blk_rq_map_sg().
 - Lots of simple stuff, again very unlikely.
 
 Anyway, it sounds odd for this to be a block layer problem if you do see
 occasional segments being merged. So it sounds more like the input data
 having changed.
 
 Why not just bisect it?
 ..
 
 Because the early 2.6.24 series failed to boot on this machine
 due to bugs in the block layer -- so the code that caused this regression
 is probably in the stuff from before the kernels became usable here.
 ..
 
 That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to
 the first couple of -rc* ones failed here because of incompatibilities
 between the block/bio changes and libata.
 
 That's better, I think! 

No worries, I didn't pick it up as harsh, just as an odd conclusion :-)

If I were you, I'd just start from the first -rc that booted for you. If
THAT has the bug, then we'll think of something else. If you don't get
anywhere, I can run some tests tomorrow and see if I can reproduce it
here.

-- 
Jens Axboe



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Mark Lord wrote:

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Matthew Wilcox wrote:

On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
Problem confirmed.  2.6.23.8 regularly generates segments up to 
64KB for libata,

but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
Just a suspicion ... could this be slab vs slub?  ie check your 
configs

are the same / similar between the two kernels.

..

Mmmm.. a good thought, that one.
But I just rechecked, and both have CONFIG_SLAB=y

My guess is that something got changed around when Jens
reworked the block layer for 2.6.24.
I'm going to dig around in there now.

I didn't rework the block layer for 2.6.24 :-). The core block layer
changes since 2.6.23 are:

- Support for empty barriers. Not a likely candidate.
- Shared tag queue fixes. Totally unlikely.
- sg chaining support. Not likely.
- The bio changes from Neil. Of the bunch, the most likely suspects in
this area, since it changes some of the code involved with merges and
blk_rq_map_sg().
- Lots of simple stuff, again very unlikely.

Anyway, it sounds odd for this to be a block layer problem if you do see
occasional segments being merged. So it sounds more like the input data
having changed.

Why not just bisect it?

..

Because the early 2.6.24 series failed to boot on this machine
due to bugs in the block layer -- so the code that caused this regression
is probably in the stuff from before the kernels became usable here.

..

That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to
the first couple of -rc* ones failed here because of incompatibilities
between the block/bio changes and libata.

That's better, I think! 

No worries, I didn't pick it up as harsh just as an odd conclusion :-)

If I were you, I'd just start from the first -rc that booted for you. If
THAT has the bug, then we'll think of something else. If you don't get
anywhere, I can run some tests tomorrow and see if I can reproduce it
here.

..

I believe that *anyone* can reproduce it, since it's broken long before
the requests ever get to SCSI or libata.  Which also means that *anyone*
who wants to can bisect it, as well.

I don't do bisects.


It was just a suggestion on how to narrow it down, do as you see fit.


But I will dig a bit more and see if I can find the culprit.


Sure, I'll dig around as well.

..

I wonder if it's 9dfa52831e96194b8649613e3131baa2c109f7dc:
Merge blk_recount_segments into blk_recalc_rq_segments ?

That particular commit does some rather innocent code-shuffling,
but also introduces a couple of new "if (nr_hw_segs == 1" conditions
that were not there before.

Okay git experts:  how do I pull out a kernel at the point of this exact commit?

Thanks!






Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Jens Axboe
On Thu, Dec 13 2007, Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
  Jens Axboe wrote:
  On Thu, Dec 13 2007, Mark Lord wrote:
  Mark Lord wrote:
  Jens Axboe wrote:
  On Thu, Dec 13 2007, Mark Lord wrote:
  Matthew Wilcox wrote:
  On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
  Problem confirmed.  2.6.23.8 regularly generates segments up to 
  64KB for libata,
  but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
  Just a suspicion ... could this be slab vs slub?  ie check your 
  configs
  are the same / similar between the two kernels.
  ..
  
  Mmmm.. a good thought, that one.
  But I just rechecked, and both have CONFIG_SLAB=y
  
  My guess is that something got changed around when Jens
  reworked the block layer for 2.6.24.
  I'm going to dig around in there now.
  I didn't rework the block layer for 2.6.24 :-). The core block layer
  changes since 2.6.23 are:
  
  - Support for empty barriers. Not a likely candidate.
  - Shared tag queue fixes. Totally unlikely.
  - sg chaining support. Not likely.
  - The bio changes from Neil. Of the bunch, the most likely suspects in
   this area, since it changes some of the code involved with merges and
   blk_rq_map_sg().
  - Lots of simple stuff, again very unlikely.
  
  Anyway, it sounds odd for this to be a block layer problem if you do see
  occasional segments being merged. So it sounds more like the input data
  having changed.
  
  Why not just bisect it?
  ..
  
  Because the early 2.6.24 series failed to boot on this machine
  due to bugs in the block layer -- so the code that caused this regression
  is probably in the stuff from before the kernels became usable here.
  ..
  
  That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to
  the first couple of -rc* ones failed here because of incompatibilities
  between the block/bio changes and libata.
  
  That's better, I think! 
  
  No worries, I didn't pick it up as harsh just as an odd conclusion :-)
  
  If I were you, I'd just start from the first -rc that booted for you. If
  THAT has the bug, then we'll think of something else. If you don't get
  anywhere, I can run some tests tomorrow and see if I can reproduce it
  here.
  ..
  
  I believe that *anyone* can reproduce it, since it's broken long before
  the requests ever get to SCSI or libata.  Which also means that *anyone*
  who wants to can bisect it, as well.
  
  I don't do bisects.
 
 It was just a suggestion on how to narrow it down, do as you see fit.
 
  But I will dig a bit more and see if I can find the culprit.
 
 Sure, I'll dig around as well.

Just tried something simple. I only see one 12kb segment so far, so not
a lot by any stretch. I also DON'T see any "missed merge" messages, so it
would appear that the pages in the request are simply not physically
contiguous.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index e30b1a4..1e34b6f 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 				goto new_segment;
 
 			sg->length += nbytes;
+			if (sg->length > 8192)
+				printk("sg_len=%d\n", sg->length);
 		} else {
 new_segment:
 			if (!sg)
@@ -1349,6 +1351,8 @@ new_segment:
 				sg = sg_next(sg);
 			}
 
+			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
+				printk("missed merge\n");
 			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
 			nsegs++;
 		}
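
The second hunk of the debug patch above tests whether the previous bio_vec ends exactly where the current page begins even though a new segment is being started. A userspace sketch of that predicate, assuming a hypothetical struct model_bvec that holds plain addresses in place of struct page pointers (neither the struct nor the function name exists in the kernel):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bio_vec: start address of the page and
 * number of bytes used in it. */
struct model_bvec {
	uintptr_t page_addr;
	unsigned int len;
};

/* True when cur starts at the byte immediately after prev ends, i.e. the
 * two could have shared one segment. prev may be NULL for the first
 * vector of a request. */
static bool starts_where_prev_ends(const struct model_bvec *prev,
				   const struct model_bvec *cur)
{
	assert(cur != NULL);
	return prev != NULL && prev->page_addr + prev->len == cur->page_addr;
}
```

The predicate is forward-only: a page located immediately before its predecessor is adjacent in memory but does not satisfy it, so such pages produce no "missed merge" output.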

-- 
Jens Axboe



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Jens Axboe
On Thu, Dec 13 2007, Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Matthew Wilcox wrote:
 On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
 Problem confirmed.  2.6.23.8 regularly generates segments up to 
 64KB for libata,
 but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
 Just a suspicion ... could this be slab vs slub?  ie check your 
 configs
 are the same / similar between the two kernels.
 ..
 
 Mmmm.. a good thought, that one.
 But I just rechecked, and both have CONFIG_SLAB=y
 
 My guess is that something got changed around when Jens
 reworked the block layer for 2.6.24.
 I'm going to dig around in there now.
 I didn't rework the block layer for 2.6.24 :-). The core block layer
 changes since 2.6.23 are:
 
 - Support for empty barriers. Not a likely candidate.
 - Shared tag queue fixes. Totally unlikely.
 - sg chaining support. Not likely.
 - The bio changes from Neil. Of the bunch, the most likely suspects in
 this area, since it changes some of the code involved with merges and
 blk_rq_map_sg().
 - Lots of simple stuff, again very unlikely.
 
 Anyway, it sounds odd for this to be a block layer problem if you do 
 see
 occasional segments being merged. So it sounds more like the input 
 data
 having changed.
 
 Why not just bisect it?
 ..
 
 Because the early 2.6.24 series failed to boot on this machine
 due to bugs in the block layer -- so the code that caused this 
 regression
 is probably in the stuff from before the kernels became usable here.
 ..
 
 That sounds more harsh than intended -- the earlier 2.6.24 kernels (up 
 to
 the first couple of -rc* ones failed here because of incompatibilities
 between the block/bio changes and libata.
 
 That's better, I think! 
 No worries, I didn't pick it up as harsh just as an odd conclusion :-)
 
 If I were you, I'd just start from the first -rc that booted for you. If
 THAT has the bug, then we'll think of something else. If you don't get
 anywhere, I can run some tests tomorrow and see if I can reproduce it
 here.
 ..
 
 I believe that *anyone* can reproduce it, since it's broken long before
 the requests ever get to SCSI or libata.  Which also means that *anyone*
 who wants to can bisect it, as well.
 
 I don't do bisects.
 
 It was just a suggestion on how to narrow it down, do as you see fit.
 
 But I will dig a bit more and see if I can find the culprit.
 
 Sure, I'll dig around as well.
 ..
 
 I wonder if it's 9dfa52831e96194b8649613e3131baa2c109f7dc:
 Merge blk_recount_segments into blk_recalc_rq_segments ?
 
 That particular commit does some rather innocent code-shuffling,
 but also introduces a couple of new if (nr_hw_segs == 1 conditions
 that were not there before.

You can try and revert it of course, but I think you are looking at the
wrong bits. If the segment counts were totally off, you'd never be
anywhere close to reaching the set limit. Your problem seems to be
missed contiguous segment merges.

 Okay git experts:  how do I pull out a kernel at the point of this exact 
 commit ?

Dummy approach - git log and grep for
9dfa52831e96194b8649613e3131baa2c109f7dc, then see what commit is before
that. Then do a git checkout commit.

-- 
Jens Axboe



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens Axboe wrote:

On Thu, Dec 13 2007, Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Mark Lord wrote:

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Matthew Wilcox wrote:

On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
Problem confirmed.  2.6.23.8 regularly generates segments up to 
64KB for libata,

but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
Just a suspicion ... could this be slab vs slub?  ie check your 
configs

are the same / similar between the two kernels.

..

Mmmm.. a good thought, that one.
But I just rechecked, and both have CONFIG_SLAB=y

My guess is that something got changed around when Jens
reworked the block layer for 2.6.24.
I'm going to dig around in there now.

I didn't rework the block layer for 2.6.24 :-). The core block layer
changes since 2.6.23 are:

- Support for empty barriers. Not a likely candidate.
- Shared tag queue fixes. Totally unlikely.
- sg chaining support. Not likely.
- The bio changes from Neil. Of the bunch, the most likely suspects in
this area, since it changes some of the code involved with merges and
blk_rq_map_sg().
- Lots of simple stuff, again very unlikely.

Anyway, it sounds odd for this to be a block layer problem if you do see
occasional segments being merged. So it sounds more like the input data
having changed.

Why not just bisect it?

..

Because the early 2.6.24 series failed to boot on this machine
due to bugs in the block layer -- so the code that caused this regression
is probably in the stuff from before the kernels became usable here.

..

That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to
the first couple of -rc* ones failed here because of incompatibilities
between the block/bio changes and libata.

That's better, I think! 

No worries, I didn't pick it up as harsh just as an odd conclusion :-)

If I were you, I'd just start from the first -rc that booted for you. If
THAT has the bug, then we'll think of something else. If you don't get
anywhere, I can run some tests tomorrow and see if I can reproduce it
here.

..

I believe that *anyone* can reproduce it, since it's broken long before
the requests ever get to SCSI or libata.  Which also means that *anyone*
who wants to can bisect it, as well.

I don't do bisects.

It was just a suggestion on how to narrow it down, do as you see fit.


But I will dig a bit more and see if I can find the culprit.

Sure, I'll dig around as well.


Just tried something simple. I only see one 12kb segment so far, so not
a lot by any stretch. I also DONT see any missed merges signs, so it
would appear that the pages in the request are simply not contigious
physically.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index e30b1a4..1e34b6f 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 				goto new_segment;
 
 			sg->length += nbytes;
+			if (sg->length > 8192)
+				printk("sg_len=%d\n", sg->length);
 		} else {
 new_segment:
 			if (!sg)
@@ -1349,6 +1351,8 @@ new_segment:
 				sg = sg_next(sg);
 			}
 
+			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
+				printk("missed merge\n");
 			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
 			nsegs++;
 		}


..

Yeah, the first part is similar to my own hack.

For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
That *really* should end up using contiguous pages on most systems.

I figured out the git thing, and am now building some in-between kernels to try.

Cheers


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Jens Axboe
On Thu, Dec 13 2007, Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Matthew Wilcox wrote:
 On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
 Problem confirmed.  2.6.23.8 regularly generates segments up to 
 64KB for libata,
 but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
 Just a suspicion ... could this be slab vs slub?  ie check your 
 configs
 are the same / similar between the two kernels.
 ..
 
 Mmmm.. a good thought, that one.
 But I just rechecked, and both have CONFIG_SLAB=y
 
 My guess is that something got changed around when Jens
 reworked the block layer for 2.6.24.
 I'm going to dig around in there now.
 I didn't rework the block layer for 2.6.24 :-). The core block layer
 changes since 2.6.23 are:
 
 - Support for empty barriers. Not a likely candidate.
 - Shared tag queue fixes. Totally unlikely.
 - sg chaining support. Not likely.
 - The bio changes from Neil. Of the bunch, the most likely suspects 
 in
 this area, since it changes some of the code involved with merges and
 blk_rq_map_sg().
 - Lots of simple stuff, again very unlikely.
 
 Anyway, it sounds odd for this to be a block layer problem if you do 
 see
 occasional segments being merged. So it sounds more like the input 
 data
 having changed.
 
 Why not just bisect it?
 ..
 
 Because the early 2.6.24 series failed to boot on this machine
 due to bugs in the block layer -- so the code that caused this 
 regression
 is probably in the stuff from before the kernels became usable here.
 ..
 
 That sounds more harsh than intended -- the earlier 2.6.24 kernels 
 (up to
 the first couple of -rc* ones failed here because of incompatibilities
 between the block/bio changes and libata.
 
 That's better, I think! 
 No worries, I didn't pick it up as harsh just as an odd conclusion :-)
 
 If I were you, I'd just start from the first -rc that booted for you. If
 THAT has the bug, then we'll think of something else. If you don't get
 anywhere, I can run some tests tomorrow and see if I can reproduce it
 here.
 ..
 
 I believe that *anyone* can reproduce it, since it's broken long before
 the requests ever get to SCSI or libata.  Which also means that *anyone*
 who wants to can bisect it, as well.
 
 I don't do bisects.
 It was just a suggestion on how to narrow it down, do as you see fit.
 
 But I will dig a bit more and see if I can find the culprit.
 Sure, I'll dig around as well.
 
 Just tried something simple. I only see one 12kb segment so far, so not
 a lot by any stretch. I also DONT see any missed merges signs, so it
 would appear that the pages in the request are simply not contigious
 physically.
 
 diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
 index e30b1a4..1e34b6f 100644
 --- a/block/ll_rw_blk.c
 +++ b/block/ll_rw_blk.c
 @@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
  				goto new_segment;
  
  			sg->length += nbytes;
 +			if (sg->length > 8192)
 +				printk("sg_len=%d\n", sg->length);
  		} else {
  new_segment:
  			if (!sg)
 @@ -1349,6 +1351,8 @@ new_segment:
  				sg = sg_next(sg);
  			}
  
 +			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
 +				printk("missed merge\n");
  			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
  			nsegs++;
  		}
 
 ..
 
 Yeah, the first part is similar to my own hack.
 
 For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
 That *really* should end up using contiguous pages on most systems.
 
 I figured out the git thing, and am now building some in-between kernels to 
 try.

OK, it's a vm issue, I have tens of thousands of backward pages after a
boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not the
reverse. So it looks like that bug got reintroduced.
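
Why backward pages defeat clustering can be shown with a small userspace model. This is a sketch under stated assumptions, not the kernel's blk_rq_map_sg(): count_segments() and MODEL_PAGE_SIZE are hypothetical names, pages are represented by their page frame numbers, and a page joins the current segment only if it directly follows the previous page and the segment stays within a size limit.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size, as on x86-32 */

/* Count scatter-gather segments for n pages given by page frame number.
 * The contiguity check is forward-only: pfn[i] must equal pfn[i-1] + 1. */
static size_t count_segments(const unsigned long *pfn, size_t n,
			     size_t max_seg_bytes)
{
	size_t nsegs = 0, seg_len = 0;

	assert(max_seg_bytes >= MODEL_PAGE_SIZE);
	for (size_t i = 0; i < n; i++) {
		int contiguous = i > 0 && pfn[i] == pfn[i - 1] + 1;

		if (contiguous && seg_len + MODEL_PAGE_SIZE <= max_seg_bytes) {
			seg_len += MODEL_PAGE_SIZE;	/* extend current segment */
		} else {
			nsegs++;			/* start a new segment */
			seg_len = MODEL_PAGE_SIZE;
		}
	}
	return nsegs;
}
```

In this model, four pages handed over in ascending order (frames 100, 101, 102, 103) collapse into one segment under a 64KB limit, while the same four pages in descending order need four segments, one per page. That matches the symptom in this thread: mostly 4KB segments and no "missed merge" messages.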

-- 
Jens Axboe



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Jens Axboe wrote:

On Thu, Dec 13 2007, Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Mark Lord wrote:

Jens Axboe wrote:

On Thu, Dec 13 2007, Mark Lord wrote:

Matthew Wilcox wrote:

On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
Problem confirmed.  2.6.23.8 regularly generates segments up to 
64KB for libata,

but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
Just a suspicion ... could this be slab vs slub?  ie check your 
configs

are the same / similar between the two kernels.

..

Mmmm.. a good thought, that one.
But I just rechecked, and both have CONFIG_SLAB=y

My guess is that something got changed around when Jens
reworked the block layer for 2.6.24.
I'm going to dig around in there now.

I didn't rework the block layer for 2.6.24 :-). The core block layer
changes since 2.6.23 are:

- Support for empty barriers. Not a likely candidate.
- Shared tag queue fixes. Totally unlikely.
- sg chaining support. Not likely.
- The bio changes from Neil. Of the bunch, the most likely suspects 
in

this area, since it changes some of the code involved with merges and
blk_rq_map_sg().
- Lots of simple stuff, again very unlikely.

Anyway, it sounds odd for this to be a block layer problem if you do 
see
occasional segments being merged. So it sounds more like the input 
data

having changed.

Why not just bisect it?

..

Because the early 2.6.24 series failed to boot on this machine
due to bugs in the block layer -- so the code that caused this 
regression

is probably in the stuff from before the kernels became usable here.

..

That sounds more harsh than intended -- the earlier 2.6.24 kernels 
(up to

the first couple of -rc* ones failed here because of incompatibilities
between the block/bio changes and libata.

That's better, I think! 

No worries, I didn't pick it up as harsh just as an odd conclusion :-)

If I were you, I'd just start from the first -rc that booted for you. If
THAT has the bug, then we'll think of something else. If you don't get
anywhere, I can run some tests tomorrow and see if I can reproduce it
here.

..

I believe that *anyone* can reproduce it, since it's broken long before
the requests ever get to SCSI or libata.  Which also means that *anyone*
who wants to can bisect it, as well.

I don't do bisects.

It was just a suggestion on how to narrow it down, do as you see fit.


But I will dig a bit more and see if I can find the culprit.

Sure, I'll dig around as well.

Just tried something simple. I only see one 12kb segment so far, so not
a lot by any stretch. I also DONT see any missed merges signs, so it
would appear that the pages in the request are simply not contigious
physically.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index e30b1a4..1e34b6f 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 				goto new_segment;
 
 			sg->length += nbytes;
+			if (sg->length > 8192)
+				printk("sg_len=%d\n", sg->length);
 		} else {
 new_segment:
 			if (!sg)
@@ -1349,6 +1351,8 @@ new_segment:
 				sg = sg_next(sg);
 			}
 
+			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
+				printk("missed merge\n");
 			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
 			nsegs++;
 		}


..

Yeah, the first part is similar to my own hack.

For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
That *really* should end up using contiguous pages on most systems.

I figured out the git thing, and am now building some in-between kernels to try.


OK, it's a vm issue, I have tens of thousand backward pages after a
boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
reverse. So it looks like that bug got reintroduced.

...

Mmm.. shouldn't one of the front- or back- merge logics work for either order?


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Mark Lord wrote:

Jens Axboe wrote:

..

OK, it's a vm issue, I have tens of thousand backward pages after a
boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
reverse. So it looks like that bug got reintroduced.

...

Mmm.. shouldn't one of the front- or back- merge logics work for either order?

..

Belay that thought.  I'm slowly remembering how this is supposed to work now.  :)


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Jens Axboe
On Thu, Dec 13 2007, Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Mark Lord wrote:
 Jens Axboe wrote:
 On Thu, Dec 13 2007, Mark Lord wrote:
 Matthew Wilcox wrote:
 On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
 Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for libata,
 but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
 Just a suspicion ... could this be slab vs slub?  ie check your configs
 are the same / similar between the two kernels.
 ..
 
 Mmmm.. a good thought, that one.
 But I just rechecked, and both have CONFIG_SLAB=y
 
 My guess is that something got changed around when Jens
 reworked the block layer for 2.6.24.
 I'm going to dig around in there now.
 I didn't rework the block layer for 2.6.24 :-). The core block layer
 changes since 2.6.23 are:
 
 - Support for empty barriers. Not a likely candidate.
 - Shared tag queue fixes. Totally unlikely.
 - sg chaining support. Not likely.
 - The bio changes from Neil. Of the bunch, the most likely suspects in
   this area, since it changes some of the code involved with merges and
   blk_rq_map_sg().
 - Lots of simple stuff, again very unlikely.
 
 Anyway, it sounds odd for this to be a block layer problem if you do see
 occasional segments being merged. So it sounds more like the input data
 having changed.
 
 Why not just bisect it?
 ..
 
 Because the early 2.6.24 series failed to boot on this machine
 due to bugs in the block layer -- so the code that caused this regression
 is probably in the stuff from before the kernels became usable here.
 ..
 
 That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to
 the first couple of -rc* ones) failed here because of incompatibilities
 between the block/bio changes and libata.
 
 That's better, I think! 
 No worries, I didn't pick it up as harsh just as an odd conclusion :-)
 
 If I were you, I'd just start from the first -rc that booted for you. If
 THAT has the bug, then we'll think of something else. If you don't get
 anywhere, I can run some tests tomorrow and see if I can reproduce it
 here.
 ..
 
 I believe that *anyone* can reproduce it, since it's broken long before
 the requests ever get to SCSI or libata.  Which also means that *anyone*
 who wants to can bisect it, as well.
 
 I don't do bisects.
 It was just a suggestion on how to narrow it down, do as you see fit.
 
 But I will dig a bit more and see if I can find the culprit.
 Sure, I'll dig around as well.
 Just tried something simple. I only see one 12kb segment so far, so not
 a lot by any stretch. I also DON'T see any missed merges signs, so it
 would appear that the pages in the request are simply not contiguous
 physically.
 
 diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
 index e30b1a4..1e34b6f 100644
 --- a/block/ll_rw_blk.c
 +++ b/block/ll_rw_blk.c
 @@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
  				goto new_segment;
  
  			sg->length += nbytes;
 +			if (sg->length > 8192)
 +				printk("sg_len=%d\n", sg->length);
  		} else {
  new_segment:
  			if (!sg)
 @@ -1349,6 +1351,8 @@ new_segment:
  				sg = sg_next(sg);
  			}
  
 +			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
 +				printk("missed merge\n");
  			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
  			nsegs++;
  		}
 
 ..
 
 Yeah, the first part is similar to my own hack.
 
 For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
 That *really* should end up using contiguous pages on most systems.
 
 I figured out the git thing, and am now building some in-between kernels
 to try.
 
 OK, it's a vm issue, I have tens of thousand backward pages after a
 boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
 reverse. So it looks like that bug got reintroduced.
 ...
 
 Mmm.. shouldn't one of the front- or back- merge logics work for either order?

I think you are misunderstanding the merging. The front/back bits are
for contig on disk, this is sg segment merging. We can only join pieces
that are contig in memory, otherwise the result would not be pretty :-)

-- 
Jens Axboe
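
The distinction can be illustrated with a small userspace sketch (not kernel
code). It assumes a hypothetical helper, count_sg_segments(), that walks the
pages of a request by page frame number and extends the current sg segment
only when the next page directly follows the previous one in physical memory.
The 4096-byte page size and the segment size cap are assumptions chosen for
illustration; the real logic lives in blk_rq_map_sg().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/*
 * Count how many scatter-gather segments are needed for npages pages whose
 * page frame numbers are given in pfn[]. A page joins the current segment
 * only if it is physically adjacent to the previous page (pfn + 1) and the
 * segment would stay within max_seg_bytes.
 */
size_t count_sg_segments(const unsigned long *pfn, size_t npages,
			 size_t max_seg_bytes)
{
	size_t nsegs = 0;
	size_t seg_bytes = 0;
	size_t i;

	assert(pfn != NULL || npages == 0);

	for (i = 0; i < npages; i++) {
		int contiguous = i > 0 && pfn[i] == pfn[i - 1] + 1;

		if (contiguous && seg_bytes + SKETCH_PAGE_SIZE <= max_seg_bytes) {
			seg_bytes += SKETCH_PAGE_SIZE;	/* merge into current segment */
		} else {
			nsegs++;			/* start a new segment */
			seg_bytes = SKETCH_PAGE_SIZE;
		}
	}
	return nsegs;
}
```

With pages handed out in ascending physical order, four pages collapse into a
single segment under a 64KB cap; the same four pages in descending order need
one segment each, which matches the 4KB-only segments reported above.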



Re: [PATCH 12/30] blk_end_request: changing ub (take 4)

2007-12-13 Thread Pete Zaitcev
On Wed, 12 Dec 2007 15:38:15 -0500 (EST), Kiyoshi Ueda [EMAIL PROTECTED] wrote:
> On Tue, 11 Dec 2007 15:48:03 -0800, Pete Zaitcev [EMAIL PROTECTED] wrote:
>
> > > - end_that_request_first(rq, uptodate, rq->hard_nr_sectors);
> > > - end_that_request_last(rq, uptodate);
> > > + if (__blk_end_request(rq, error, blk_rq_bytes(rq)))
> > > + BUG();
> >
> > My understanding was, blk_end_request() is the same thing, only
> > takes the queue lock. But then, should I refactor ub so that it
> > calls __blk_end_request if request function ends with an error
> > and blk_end_request if the end-of-IO event is processed?
>
> I'm using __blk_end_request() here and I think it's sufficient, because:
>   o end_that_request_last() must be called with the queue lock held
>   o ub_end_rq() calls end_that_request_last() without taking
>     the queue lock in itself.
> So the queue lock must have been taken outside ub_end_rq().
>
> But, if ub is calling end_that_request_last() without the queue lock,
> it is a bug in the original code and we should use blk_end_request()
> to fix that.

So, I have to rewrite ub to split the paths after all, right?
Let's do this then: I'll wait until your patch gets to Linus and
then update it with the split. The reason is, I need the whole
enchilada applied and I don't want to bother tracking iterations
and all the little segments (of which you already have 30).
Until then, ub will have a race by using your original small patch.

Best wishes,
-- Pete
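
The locked/unlocked pairing discussed above can be modelled in userspace as
follows. This is only a sketch of the calling convention, not the block layer
API: struct sketch_queue and all sketch_* functions are hypothetical, and a
boolean stands in for the queue spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue and its lock. */
struct sketch_queue {
	bool lock_held;		/* models the queue lock */
	int completed;		/* number of requests completed */
};

void sketch_lock(struct sketch_queue *q)
{
	assert(q != NULL && !q->lock_held);
	q->lock_held = true;
}

void sketch_unlock(struct sketch_queue *q)
{
	assert(q != NULL && q->lock_held);
	q->lock_held = false;
}

/* Analogue of __blk_end_request(): the caller must already hold the lock. */
int sketch_end_request_locked(struct sketch_queue *q)
{
	assert(q != NULL && q->lock_held);
	q->completed++;
	return 0;
}

/* Analogue of blk_end_request(): takes the lock around the locked variant. */
int sketch_end_request(struct sketch_queue *q)
{
	int rc;

	sketch_lock(q);
	rc = sketch_end_request_locked(q);
	sketch_unlock(q);
	return rc;
}
```

A path that already runs under the lock (such as a request function) would
call the locked variant; a completion path that runs without it would call the
wrapper, which is the split being proposed for ub.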


VM allocates pages in reverse order again

2007-12-13 Thread Matthew Wilcox
On Thu, Dec 13, 2007 at 09:09:59PM +0100, Jens Axboe wrote:
> > diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
> > index e30b1a4..1e34b6f 100644
> > --- a/block/ll_rw_blk.c
> > +++ b/block/ll_rw_blk.c
> > @@ -1349,6 +1351,8 @@ new_segment:
> >  				sg = sg_next(sg);
> >  			}
> >  
> > +			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
> > +				printk("missed merge\n");
> >  			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
> >  			nsegs++;
> >  		}
> > 
> > ..
> > 
> > Yeah, the first part is similar to my own hack.
> > 
> > For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
> > That *really* should end up using contiguous pages on most systems.
> > 
> > I figured out the git thing, and am now building some in-between kernels to try.
> 
> OK, it's a vm issue, I have tens of thousand backward pages after a
> boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
> reverse. So it looks like that bug got reintroduced.

Perhaps we should ask the -mm folks if they happen to have an idea what
caused it ...

Background: we're seeing pages allocated in reverse order after boot.
This causes IO performance problems on machines without IOMMUs as we
can't merge pages when they're allocated in the wrong order.  This is
something that went wrong between 2.6.23 and 2.6.24-rc5.

Bill Irwin had a patch that fixed this; it was merged months ago, but
the effects of it seem to have been undone.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Andrew Morton
On Thu, 13 Dec 2007 21:09:59 +0100
Jens Axboe [EMAIL PROTECTED] wrote:


> OK, it's a vm issue,

cc linux-mm and probable culprit.

>  I have tens of thousand backward pages after a
> boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
> reverse. So it looks like that bug got reintroduced.

Bill Irwin fixed this a couple of years back: changed the page allocator so
that it mostly hands out pages in ascending physical-address order.

I guess we broke that, quite possibly in Mel's page allocator rework.

It would help if you could provide us with a simple recipe for
demonstrating this problem, please.



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread James Bottomley

On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
> On Thu, 13 Dec 2007 21:09:59 +0100
> Jens Axboe [EMAIL PROTECTED] wrote:
> 
> 
> > OK, it's a vm issue,
> 
> cc linux-mm and probable culprit.
> 
> >  I have tens of thousand backward pages after a
> > boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
> > reverse. So it looks like that bug got reintroduced.
> 
> Bill Irwin fixed this a couple of years back: changed the page allocator so
> that it mostly hands out pages in ascending physical-address order.
> 
> I guess we broke that, quite possibly in Mel's page allocator rework.
> 
> It would help if you could provide us with a simple recipe for
> demonstrating this problem, please.

The simple way seems to be to malloc a large area, touch every page and
then look at the physical pages assigned ... they now mostly seem to be
descending in physical address.

James




Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Jens Axboe
On Thu, Dec 13 2007, Andrew Morton wrote:
> On Thu, 13 Dec 2007 21:09:59 +0100
> Jens Axboe [EMAIL PROTECTED] wrote:
> 
> 
> > OK, it's a vm issue,
> 
> cc linux-mm and probable culprit.
> 
> >  I have tens of thousand backward pages after a
> > boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
> > reverse. So it looks like that bug got reintroduced.
> 
> Bill Irwin fixed this a couple of years back: changed the page allocator so
> that it mostly hands out pages in ascending physical-address order.
> 
> I guess we broke that, quite possibly in Mel's page allocator rework.
> 
> It would help if you could provide us with a simple recipe for
> demonstrating this problem, please.

Basically anything involving IO :-). A boot here showed a handful of
good merges, and probably in the order of 100,000 descending
allocations. A kernel make is a fine test as well.

Something like the below should work fine - if you see oodles of these
basically doing any type of IO, then you are screwed.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index e30b1a4..8ce3fcc 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1349,6 +1349,10 @@ new_segment:
 				sg = sg_next(sg);
 			}
 
+			if (bvprv) {
+				if (page_address(bvec->bv_page) + PAGE_SIZE == page_address(bvprv->bv_page) && printk_ratelimit())
+					printk("page alloc order backwards\n");
+			}
 			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
 			nsegs++;
 		}

-- 
Jens Axboe



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Andrew Morton wrote:

On Thu, 13 Dec 2007 17:15:06 -0500
James Bottomley [EMAIL PROTECTED] wrote:


On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:

On Thu, 13 Dec 2007 21:09:59 +0100
Jens Axboe [EMAIL PROTECTED] wrote:


OK, it's a vm issue,

cc linux-mm and probable culprit.


 I have tens of thousand backward pages after a
boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
reverse. So it looks like that bug got reintroduced.

Bill Irwin fixed this a couple of years back: changed the page allocator so
that it mostly hands out pages in ascending physical-address order.

I guess we broke that, quite possibly in Mel's page allocator rework.

It would help if you could provide us with a simple recipe for
demonstrating this problem, please.

The simple way seems to be to malloc a large area, touch every page and
then look at the physical pages assigned ... they now mostly seem to be
descending in physical address.



OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...

..

I'm actually running the treadmill right now (have been for many hours, actually)
to bisect it to a specific commit.

Thought I was almost done, and then noticed that git-bisect doesn't keep
the Makefile VERSION lines the same, so I was actually running the wrong
kernel after the first few times.. duh.

Wrote a script to fix it now.

-ml


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Andrew Morton
On Thu, 13 Dec 2007 17:15:06 -0500
James Bottomley [EMAIL PROTECTED] wrote:

 
> On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
> > On Thu, 13 Dec 2007 21:09:59 +0100
> > Jens Axboe [EMAIL PROTECTED] wrote:
> > 
> > 
> > > OK, it's a vm issue,
> > 
> > cc linux-mm and probable culprit.
> > 
> > >  I have tens of thousand backward pages after a
> > > boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
> > > reverse. So it looks like that bug got reintroduced.
> > 
> > Bill Irwin fixed this a couple of years back: changed the page allocator so
> > that it mostly hands out pages in ascending physical-address order.
> > 
> > I guess we broke that, quite possibly in Mel's page allocator rework.
> > 
> > It would help if you could provide us with a simple recipe for
> > demonstrating this problem, please.
> 
> The simple way seems to be to malloc a large area, touch every page and
> then look at the physical pages assigned ... they now mostly seem to be
> descending in physical address.
 

OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...
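
The recipe above (allocate a large area, touch every page, inspect the
physical pages) could be sketched in userspace roughly as below. This is an
illustration under assumptions, not a tested reproducer: it assumes the
pagemap entry layout documented for current mainline kernels (one 64-bit entry
per virtual page, PFN in bits 0-54, "present" in bit 63), which may differ
from the -mm implementation of the time. The names classify_pfn_order(),
read_pfn() and sample_pfns() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

struct pfn_order {
	size_t ascending;	/* next page is pfn + 1 */
	size_t descending;	/* next page is pfn - 1 */
	size_t other;		/* anything else */
};

/* Classify each adjacent pair of page frame numbers. */
struct pfn_order classify_pfn_order(const uint64_t *pfn, size_t n)
{
	struct pfn_order r = {0, 0, 0};
	size_t i;

	assert(pfn != NULL || n == 0);
	for (i = 1; i < n; i++) {
		if (pfn[i] == pfn[i - 1] + 1)
			r.ascending++;
		else if (pfn[i] + 1 == pfn[i - 1])
			r.descending++;
		else
			r.other++;
	}
	return r;
}

/* Look up the PFN backing vaddr; returns 0 on success, -1 otherwise. */
int read_pfn(int pagemap_fd, const void *vaddr, long page_size, uint64_t *pfn)
{
	uint64_t entry;
	off_t off = (off_t)((uintptr_t)vaddr / (uintptr_t)page_size) *
		    (off_t)sizeof(entry);

	if (pread(pagemap_fd, &entry, sizeof(entry), off) != (ssize_t)sizeof(entry))
		return -1;
	if (!(entry >> 63))			/* bit 63: page present */
		return -1;
	*pfn = entry & ((UINT64_C(1) << 55) - 1);	/* bits 0-54: PFN */
	return 0;
}

/* Allocate npages, touch each one, store PFNs in out[]; returns count or -1. */
long sample_pfns(size_t npages, uint64_t *out)
{
	long page_size = sysconf(_SC_PAGESIZE);
	int fd = open("/proc/self/pagemap", O_RDONLY);
	void *buf = NULL;
	long got = 0;
	size_t i;

	assert(out != NULL);
	if (fd < 0)
		return -1;
	if (posix_memalign(&buf, (size_t)page_size, npages * (size_t)page_size)) {
		close(fd);
		return -1;
	}
	for (i = 0; i < npages; i++) {
		unsigned char *page = (unsigned char *)buf + i * (size_t)page_size;

		*(volatile unsigned char *)page = 1;	/* fault the page in */
		if (read_pfn(fd, page, page_size, &out[got]) == 0)
			got++;
	}
	free(buf);
	close(fd);
	return got;
}
```

A caller would pass the array filled by sample_pfns() to classify_pfn_order()
and compare the ascending and descending counts. On recent kernels the PFN
field reads as zero for unprivileged processes, so the sampling step needs to
run as root to be meaningful.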


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Mark Lord wrote:

Andrew Morton wrote:

On Thu, 13 Dec 2007 17:15:06 -0500
James Bottomley [EMAIL PROTECTED] wrote:


On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:

On Thu, 13 Dec 2007 21:09:59 +0100
Jens Axboe [EMAIL PROTECTED] wrote:


OK, it's a vm issue,

cc linux-mm and probable culprit.


 I have tens of thousand backward pages after a
boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
reverse. So it looks like that bug got reintroduced.

Bill Irwin fixed this a couple of years back: changed the page allocator so
that it mostly hands out pages in ascending physical-address order.

I guess we broke that, quite possibly in Mel's page allocator rework.

It would help if you could provide us with a simple recipe for
demonstrating this problem, please.

The simple way seems to be to malloc a large area, touch every page and
then look at the physical pages assigned ... they now mostly seem to be
descending in physical address.



OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...

..

I'm actually running the treadmill right now (have been for many hours, actually)
to bisect it to a specific commit.

Thought I was almost done, and then noticed that git-bisect doesn't keep
the Makefile VERSION lines the same, so I was actually running the wrong
kernel after the first few times.. duh.

Wrote a script to fix it now.

..

Well, that was a waste of three hours.

Somebody else can try it now.


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Mark Lord wrote:

Mark Lord wrote:

Andrew Morton wrote:

On Thu, 13 Dec 2007 17:15:06 -0500
James Bottomley [EMAIL PROTECTED] wrote:


On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:

On Thu, 13 Dec 2007 21:09:59 +0100
Jens Axboe [EMAIL PROTECTED] wrote:


OK, it's a vm issue,

cc linux-mm and probable culprit.


 I have tens of thousand backward pages after a
boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
reverse. So it looks like that bug got reintroduced.

Bill Irwin fixed this a couple of years back: changed the page allocator so
that it mostly hands out pages in ascending physical-address order.

I guess we broke that, quite possibly in Mel's page allocator rework.

It would help if you could provide us with a simple recipe for
demonstrating this problem, please.

The simple way seems to be to malloc a large area, touch every page and
then look at the physical pages assigned ... they now mostly seem to be
descending in physical address.



OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...

..

I'm actually running the treadmill right now (have been for many hours, actually)
to bisect it to a specific commit.

Thought I was almost done, and then noticed that git-bisect doesn't keep
the Makefile VERSION lines the same, so I was actually running the wrong
kernel after the first few times.. duh.

Wrote a script to fix it now.

..

Well, that was a waste of three hours.

..

Ahh.. it seems to be sensitive to one/both of these:

CONFIG_HIGHMEM64G=y with 4GB RAM:  not so bad, frequently does 20KB - 48KB segments.
CONFIG_HIGHMEM4G=y  with 2GB RAM:  very severe, rarely does more than 8KB segments.
CONFIG_HIGHMEM4G=y  with 3GB RAM:  very severe, rarely does more than 8KB segments.

So if you want to reproduce this on a large memory machine, use mem=2GB for starters.

Still testing..





[patch 01/30] git-scsi-misc gdth fix

2007-12-13 Thread akpm
From: James Bottomley [EMAIL PROTECTED]

On Sun, 2007-10-14 at 12:21 -0700, Andrew Morton wrote:
 On Sun, 14 Oct 2007 22:45:47 +0400 Dave Milter [EMAIL PROTECTED] wrote:

  I build linux-2.6.23-mm1 and try to boot it using qemu,
  and it crashed with trace like this:
  do_page_fault
  error_code
  lock_acquire
  _spin_lock_irqsave
  gdth_timeout
  run_timer_softirq
  __do_softirq
  do_softirq
 
  I have screenshot, but have no idea, is it legal to include it, if I
  sent copy to lkml.
  config of kernel in attachment,
  I apply all three patches from hot-fixes.
 

 The screenshot is here:  http://userweb.kernel.org/~akpm/crash.png

 It would appear that gdth_timeout() is passing a bad pointer into
 spin_lock_irqsave().

There's a bug in the gdth rework in that the instance can be deleted
from the list before the actual timer is stopped.  This can be worked
around I think by the following patch; although we really should be
stopping the timer from firing when the list goes empty.


James said:

This is almost certainly the wrong fix for real hardware.  Although it
kills the timer when the list goes empty, nothing will ever restart it
when the list fills again.

Boaz, since you touched all of this, you get to fix it.  The correct fix
will be to control the timer along with the actual list instead of at
entry/exit time.  If you're not going to add this empty check to the
timer routine, make sure you use del_timer_sync() before removing the
last element from the list.
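
One way to read "control the timer along with the actual list" is sketched
below as a userspace model. It is not the gdth code: struct sketch_state and
the sketch_* functions are hypothetical, a counter stands in for the instance
list, and a flag stands in for add_timer()/del_timer_sync().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's instance list plus its timer. */
struct sketch_state {
	size_t ninstances;	/* number of adapters on the list */
	bool timer_armed;	/* models whether the timer is running */
};

/* Add an instance; arm the timer when the list becomes non-empty. */
void sketch_add_instance(struct sketch_state *s)
{
	assert(s != NULL);
	s->ninstances++;
	if (s->ninstances == 1)
		s->timer_armed = true;	/* kernel code would arm the timer here */
}

/* Remove an instance; stop the timer before the last element leaves. */
void sketch_remove_instance(struct sketch_state *s)
{
	assert(s != NULL && s->ninstances > 0);
	if (s->ninstances == 1)
		s->timer_armed = false;	/* kernel code would use del_timer_sync() */
	s->ninstances--;
}

/* Timer callback model: true if it fired and found an instance to service. */
bool sketch_timeout(const struct sketch_state *s)
{
	assert(s != NULL);
	return s->timer_armed && s->ninstances > 0;
}
```

Because arming and stopping are tied to list transitions, the timer restarts
when the list fills again, which is the gap noted in the list_empty()
workaround.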


Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/gdth.c |3 +++
 1 file changed, 3 insertions(+)

diff -puN drivers/scsi/gdth.c~git-scsi-misc-gdth-fix drivers/scsi/gdth.c
--- a/drivers/scsi/gdth.c~git-scsi-misc-gdth-fix
+++ a/drivers/scsi/gdth.c
@@ -3793,6 +3793,9 @@ static void gdth_timeout(ulong data)
 	gdth_ha_str *ha;
 	ulong flags;
 
+	if (list_empty(&gdth_instances))
+		return;
+
 	ha = list_first_entry(&gdth_instances, gdth_ha_str, list);
 	spin_lock_irqsave(&ha->smp_lock, flags);
 
_


[patch 04/30] kill warnings in mptbase.h on parisc64

2007-12-13 Thread akpm
From: Kyle McMartin [EMAIL PROTECTED]

Verified all the arches necessary select the CONFIG_64BIT symbol.  This
also kills the warning (since it was using the 32-bit case) on parisc64 and
mips64.

Signed-off-by: Kyle McMartin [EMAIL PROTECTED]
Cc: Moore, Eric Dean [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/message/fusion/mptbase.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN drivers/message/fusion/mptbase.h~kill-warnings-in-mptbaseh-on-parisc64 drivers/message/fusion/mptbase.h
--- a/drivers/message/fusion/mptbase.h~kill-warnings-in-mptbaseh-on-parisc64
+++ a/drivers/message/fusion/mptbase.h
@@ -922,7 +922,7 @@ extern struct proc_dir_entry*mpt_proc_r
 /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/
 #endif /* } __KERNEL__ */
 
-#if defined(__alpha__) || defined(__sparc_v9__) || defined(__ia64__) || defined(__x86_64__) || defined(__powerpc__)
+#ifdef CONFIG_64BIT
 #define CAST_U32_TO_PTR(x) ((void *)(u64)x)
 #define CAST_PTR_TO_U32(x) ((u32)(u64)x)
 #else
_


[patch 02/30] mptbase: reset ioc initiator during PCI resume

2007-12-13 Thread akpm
From: Darrick J. Wong [EMAIL PROTECTED]

It appears that the LSI SAS 1064E chip needs to be reset after a
suspend/resume cycle before the driver attempts further communications with
the chip.  Without this patch, resuming the chip results in this error
message being printed repeatedly and no more disk I/O.

mptbase: ioc0: ERROR - Invalid IOC facts reply, msgLength=0 offsetof=6!

So far it seems to fix suspend/resume on all the MPT Fusion cards I have
(SAS and U320 SCSI) but since I don't know the internals of that chip I
can't say for sure if this is a proper fix.

Signed-off-by: Darrick J. Wong [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/message/fusion/mptbase.c |6 ++
 1 file changed, 6 insertions(+)

diff -puN drivers/message/fusion/mptbase.c~mptbase-reset-ioc-initiator-during-pci-resume drivers/message/fusion/mptbase.c
--- a/drivers/message/fusion/mptbase.c~mptbase-reset-ioc-initiator-during-pci-resume
+++ a/drivers/message/fusion/mptbase.c
@@ -1829,6 +1829,12 @@ mpt_resume(struct pci_dev *pdev)
 		(mpt_GetIocState(ioc, 1) >> MPI_IOC_STATE_SHIFT),
 		CHIPREG_READ32(&ioc->chip->Doorbell));
 
+	/* put ioc into READY_STATE */
+	if(SendIocReset(ioc, MPI_FUNCTION_IOC_MESSAGE_UNIT_RESET, CAN_SLEEP)) {
+		printk(MYIOC_s_ERR_FMT
+			"pci-resume:  IOC msg unit reset failed!\n", ioc->name);
+	}
+
 	/* bring ioc to operational state */
 	if ((recovery_state = mpt_do_ioc_recovery(ioc,
 	    MPT_HOSTEVENT_IOC_RECOVER, CAN_SLEEP)) != 0) {
_


[patch 03/30] initio: fix conflict when loading driver

2007-12-13 Thread akpm
From: Alan Cox [EMAIL PROTECTED]

 I have a scanner connected to a Initio INI-950 SCSI card and I recently
 upgraded from SuSE 10.2 to 10.3.  The new kernel doesn't see any of my
 devices.  I get the following in /var/log/messages:

 ACPI: PCI Interrupt :00:0a.0[A] -> GSI 17 (level, low) -> IRQ 16
 initio: I/O port range 0x0 is busy.
 ACPI: PCI interrupt for device :00:0a.0 disabled

Humm not a collision - that's a bug in the driver updating.  Looks like the
changes I made and combined with Christoph's lost a line somewhere when I
was merging it all.  Try the following

Signed-off-by: Alan Cox [EMAIL PROTECTED]
Cc: Scott Simpson [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/initio.c |1 +
 1 file changed, 1 insertion(+)

diff -puN drivers/scsi/initio.c~initio-fix-conflict-when-loading-driver drivers/scsi/initio.c
--- a/drivers/scsi/initio.c~initio-fix-conflict-when-loading-driver
+++ a/drivers/scsi/initio.c
@@ -2867,6 +2867,7 @@ static int initio_probe_one(struct pci_d
 	}
 	host = (struct initio_host *)shost->hostdata;
 	memset(host, 0, sizeof(struct initio_host));
+	host->addr = pci_resource_start(pdev, 0);
 
 	if (!request_region(host->addr, 256, "i91u")) {
 		printk(KERN_WARNING "initio: I/O port range 0x%x is busy.\n", host->addr);
_


[patch 14/30] aic94: fix section mismatches

2007-12-13 Thread akpm
From: Randy Dunlap [EMAIL PROTECTED]

Fix section mismatch warning:

WARNING: vmlinux.o(.init.text+0x23be6): Section mismatch: reference to .exit.text:asd_unmap_ha (between 'asd_pci_probe' and 'qla4xxx_module_init')
+
WARNING: vmlinux.o(.text+0x1ec8a8): Section mismatch: reference to .exit.text:asd_unmap_ioport (between 'asd_unmap_ha' and 'asd_remove_dev_attrs')
WARNING: vmlinux.o(.text+0x1ec8b1): Section mismatch: reference to .exit.text:asd_unmap_memio (between 'asd_unmap_ha' and 'asd_remove_dev_attrs')

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/aic94xx/aic94xx_init.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -puN drivers/scsi/aic94xx/aic94xx_init.c~aic94-fix-section-mismatches drivers/scsi/aic94xx/aic94xx_init.c
--- a/drivers/scsi/aic94xx/aic94xx_init.c~aic94-fix-section-mismatches
+++ a/drivers/scsi/aic94xx/aic94xx_init.c
@@ -136,7 +136,7 @@ Err:
 	return err;
 }
 
-static void __devexit asd_unmap_memio(struct asd_ha_struct *asd_ha)
+static void asd_unmap_memio(struct asd_ha_struct *asd_ha)
 {
 	struct asd_ha_addrspace *io_handle;
 
@@ -173,7 +173,7 @@ static int __devinit asd_map_ioport(stru
 	return err;
 }
 
-static void __devexit asd_unmap_ioport(struct asd_ha_struct *asd_ha)
+static void asd_unmap_ioport(struct asd_ha_struct *asd_ha)
 {
 	pci_release_region(asd_ha->pcidev, PCI_IOBAR_OFFSET);
 }
@@ -210,7 +210,7 @@ Err:
 	return err;
 }
 
-static void __devexit asd_unmap_ha(struct asd_ha_struct *asd_ha)
+static void asd_unmap_ha(struct asd_ha_struct *asd_ha)
 {
 	if (asd_ha->iospace)
 		asd_unmap_ioport(asd_ha);
_


[patch 07/30] ips: PCI API cleanups

2007-12-13 Thread akpm
From: Jeff Garzik [EMAIL PROTECTED]

* pass Scsi_Host to ips_remove_device() via pci_set_drvdata(),
  allowing us to eliminate the ips_ha[] search loop and call
  ips_release() directly.

* call pci_{request,release}_regions() and eliminate individual
  request/release_[mem_]region() calls

* call pci_disable_device(), paired with pci_enable_device()

* s/0/NULL/ in a few places

* check ioremap() return value
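
The paired acquire/release ordering described in the list above follows the
usual goto-based unwind pattern. The following is a userspace model of that
pattern only, not the ips driver: struct sketch_dev, sketch_probe() and
sketch_remove() are hypothetical, and the fail_step parameter exists purely to
simulate a failure at a chosen stage.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; each flag models one acquired resource. */
struct sketch_dev {
	bool enabled;		/* models pci_enable_device() */
	bool regions_held;	/* models pci_request_regions() */
	bool initialized;	/* models the driver's own init phases */
};

/*
 * Acquire resources in order and release them in reverse order on failure.
 * fail_step: 0 = no failure, 1 = enable fails, 2 = regions fail, 3 = init fails.
 */
int sketch_probe(struct sketch_dev *d, int fail_step)
{
	int rc;

	assert(d != NULL);

	rc = (fail_step == 1) ? -EIO : 0;
	if (rc)
		return rc;		/* nothing acquired yet */
	d->enabled = true;

	rc = (fail_step == 2) ? -EBUSY : 0;
	if (rc)
		goto err_out;
	d->regions_held = true;

	rc = (fail_step == 3) ? -ENODEV : 0;
	if (rc)
		goto err_out_regions;
	d->initialized = true;
	return 0;

err_out_regions:
	d->regions_held = false;	/* models pci_release_regions() */
err_out:
	d->enabled = false;		/* models pci_disable_device() */
	return rc;
}

/* Teardown mirrors a successful probe in reverse order. */
void sketch_remove(struct sketch_dev *d)
{
	assert(d != NULL);
	d->initialized = false;
	d->regions_held = false;
	d->enabled = false;
}
```

Each label undoes exactly the steps that succeeded before the failure, so no
exit path leaves a region held or the device enabled.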

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Acked-by: Salyzyn, Mark [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/ips.c |   72 ++-
 1 file changed, 31 insertions(+), 41 deletions(-)

diff -puN drivers/scsi/ips.c~ips-pci-api-cleanups drivers/scsi/ips.c
--- a/drivers/scsi/ips.c~ips-pci-api-cleanups
+++ a/drivers/scsi/ips.c
@@ -702,10 +702,6 @@ ips_release(struct Scsi_Host *sh)
/* free extra memory */
ips_free(ha);
 
-   /* Free I/O Region */
-   if (ha-io_addr)
-   release_region(ha-io_addr, ha-io_len);
-
/* free IRQ */
free_irq(ha-pcidev-irq, ha);
 
@@ -4394,8 +4390,6 @@ ips_free(ips_ha_t * ha)
ha-mem_ptr = NULL;
}
 
-   if (ha-mem_addr)
-   release_mem_region(ha-mem_addr, ha-mem_len);
ha-mem_addr = 0;
 
}
@@ -6880,20 +6874,14 @@ ips_register_scsi(int index)
 static void __devexit
 ips_remove_device(struct pci_dev *pci_dev)
 {
-   int i;
-   struct Scsi_Host *sh;
-   ips_ha_t *ha;
+   struct Scsi_Host *sh = pci_get_drvdata(pci_dev);
 
-   for (i = 0; i  IPS_MAX_ADAPTERS; i++) {
-   ha = ips_ha[i];
-   if (ha) {
-   if ((pci_dev-bus-number == ha-pcidev-bus-number) 
-   (pci_dev-devfn == ha-pcidev-devfn)) {
-   sh = ips_sh[i];
-   ips_release(sh);
-   }
-   }
-   }
+   pci_set_drvdata(pci_dev, NULL);
+
+   ips_release(sh);
+
+   pci_release_regions(pci_dev);
+   pci_disable_device(pci_dev);
 }
 
 //
@@ -6947,12 +6935,17 @@ module_exit(ips_module_exit);
 static int __devinit
 ips_insert_device(struct pci_dev *pci_dev, const struct pci_device_id *ent)
 {
-   int uninitialized_var(index);
+   int index = -1;
int rc;
 
METHOD_TRACE(ips_insert_device, 1);
-   if (pci_enable_device(pci_dev))
-   return -1;
+   rc = pci_enable_device(pci_dev);
+   if (rc)
+   return rc;
+
+   rc = pci_request_regions(pci_dev, ips);
+   if (rc)
+   goto err_out;
 
rc = ips_init_phase1(pci_dev, index);
if (rc == SUCCESS)
@@ -6968,6 +6961,19 @@ ips_insert_device(struct pci_dev *pci_de
ips_num_controllers++;
 
ips_next_controller = ips_num_controllers;
+
+   if (rc  0) {
+   rc = -ENODEV;
+   goto err_out_regions;
+   }
+
+   pci_set_drvdata(pci_dev, ips_sh[index]);
+   return 0;
+
+err_out_regions:
+   pci_release_regions(pci_dev);
+err_out:
+   pci_disable_device(pci_dev);
return rc;
 }
 
@@ -7000,7 +7006,7 @@ ips_init_phase1(struct pci_dev *pci_dev,
METHOD_TRACE(ips_init_phase1, 1);
index = IPS_MAX_ADAPTERS;
for (j = 0; j  IPS_MAX_ADAPTERS; j++) {
-   if (ips_ha[j] == 0) {
+   if (ips_ha[j] == NULL) {
index = j;
break;
}
@@ -7037,32 +7043,17 @@ ips_init_phase1(struct pci_dev *pci_dev,
uint32_t base;
uint32_t offs;
 
-   if (!request_mem_region(mem_addr, mem_len, ips)) {
-   IPS_PRINTK(KERN_WARNING, pci_dev,
-  Couldn't allocate IO Memory space %x len 
%d.\n,
-  mem_addr, mem_len);
-   return -1;
-   }
-
base = mem_addr  PAGE_MASK;
offs = mem_addr - base;
ioremap_ptr = ioremap(base, PAGE_SIZE);
+   if (!ioremap_ptr)
+   return -1;
mem_ptr = ioremap_ptr + offs;
} else {
ioremap_ptr = NULL;
mem_ptr = NULL;
}
 
-   /* setup I/O mapped area (if applicable) */
-   if (io_addr) {
-   if (!request_region(io_addr, io_len, "ips")) {
-   IPS_PRINTK(KERN_WARNING, pci_dev,
-  "Couldn't allocate IO space %x len %d.\n",
-  io_addr, io_len);
-   return -1;
-   }
-   }
-
/* found a controller */
ha = kzalloc(sizeof (ips_ha_t), GFP_KERNEL);
if (ha == NULL) {
@@ -7071,7 +7062,6 @@ 

[patch 09/30] MegaRAID driver management char device moved to misc

2007-12-13 Thread akpm
From: Thomas Horsten [EMAIL PROTECTED]

The MegaRAID driver's common management module (megaraid_mm.c) creates a
char device used by the management tool megarc from LSI Logic (and
possibly other management tools).

In 2.6 with udev, this device doesn't get created because it is not
registered in sysfs.

I first fixed this by registering a class megaraid_mm, but realized that
this should probably be moved to misc devices, instead of taking up a char
major.  This is because only 1 device is used, even if there are multiple
adapters - the minor is never used (the adapter info is in the ioctl block
sent to the driver, not detected based on the minor number as one might
think).  So it is a complete waste to have an entire major taken by this.

So it now uses a misc device which I named megadev0 (the name that megarc
expects), and has a dynamic minor (previously a dynamic major was used).

I have tested this on my own system with the megarc tool, and it works just
as well as before (only now the device gets created correctly by udev).
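
For readers who want to check which dynamic minor the device ended up with, the
following user-space sketch may help. It assumes the usual /proc/misc layout of
one "<minor> <name>" pair per line; the function names misc_minor_from_text()
and misc_minor_lookup() are made up for this illustration and are not part of
the patch or of megarc.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan text laid out like /proc/misc ("<minor> <name>" per line) and
 * return the minor registered under `name`, or -1 if it is not listed.
 */
int misc_minor_from_text(const char *text, const char *name)
{
	const char *line = text;

	while (line && *line) {
		int minor;
		char entry[64];

		if (sscanf(line, "%d %63s", &minor, entry) == 2 &&
		    strcmp(entry, name) == 0)
			return minor;

		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}

/* Read /proc/misc and look up `name`; returns -1 on any failure. */
int misc_minor_lookup(const char *name)
{
	char buf[4096];
	size_t len;
	FILE *fp = fopen("/proc/misc", "r");

	if (!fp)
		return -1;
	len = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[len] = '\0';
	return misc_minor_from_text(buf, name);
}
```

With the patch applied and the module loaded, misc_minor_lookup("megadev0")
would be expected to return a non-negative minor; udev normally creates the
node itself, so this is only a diagnostic aid.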

Cc: [EMAIL PROTECTED]
Cc: Neela Syam Kolli [EMAIL PROTECTED]
Cc: Ju, Seokmann [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/megaraid/megaraid_mm.c |   20 +---
 drivers/scsi/megaraid/megaraid_mm.h |1 +
 2 files changed, 14 insertions(+), 7 deletions(-)

diff -puN 
drivers/scsi/megaraid/megaraid_mm.c~megaraid-driver-management-char-device-moved-to-misc
 drivers/scsi/megaraid/megaraid_mm.c
--- 
a/drivers/scsi/megaraid/megaraid_mm.c~megaraid-driver-management-char-device-moved-to-misc
+++ a/drivers/scsi/megaraid/megaraid_mm.c
@@ -59,7 +59,6 @@ EXPORT_SYMBOL(mraid_mm_register_adp);
 EXPORT_SYMBOL(mraid_mm_unregister_adp);
 EXPORT_SYMBOL(mraid_mm_adapter_app_handle);
 
-static int majorno;
 static uint32_t drvr_ver   = 0x02200207;
 
 static int adapters_count_g;
@@ -76,6 +75,12 @@ static const struct file_operations lsi_
.owner  = THIS_MODULE,
 };
 
+static struct miscdevice megaraid_mm_dev = {
+   .minor  = MISC_DYNAMIC_MINOR,
+   .name   = "megadev0",
+   .fops   = &lsi_fops,
+};
+
 /**
  * mraid_mm_open - open routine for char node interface
  * @inode  : unused
@@ -1184,15 +1189,16 @@ mraid_mm_teardown_dma_pools(mraid_mmadp_
 static int __init
 mraid_mm_init(void)
 {
+   int err;
+
// Announce the driver version
con_log(CL_ANN, (KERN_INFO "megaraid cmm: %s %s\n",
LSI_COMMON_MOD_VERSION, LSI_COMMON_MOD_EXT_VERSION));
 
-   majorno = register_chrdev(0, "megadev", &lsi_fops);
-
-   if (majorno < 0) {
-   con_log(CL_ANN, ("megaraid cmm: cannot get major\n"));
-   return majorno;
+   err = misc_register(&megaraid_mm_dev);
+   if (err < 0) {
+   con_log(CL_ANN, ("megaraid cmm: cannot register misc device\n"));
+   return err;
}
 
init_waitqueue_head(&wait_q);
@@ -1230,7 +1236,7 @@ mraid_mm_exit(void)
 {
con_log(CL_DLEVEL1 , ("exiting common mod\n"));
 
-   unregister_chrdev(majorno, "megadev");
+   misc_deregister(&megaraid_mm_dev);
 }
 
 module_init(mraid_mm_init);
diff -puN 
drivers/scsi/megaraid/megaraid_mm.h~megaraid-driver-management-char-device-moved-to-misc
 drivers/scsi/megaraid/megaraid_mm.h
--- 
a/drivers/scsi/megaraid/megaraid_mm.h~megaraid-driver-management-char-device-moved-to-misc
+++ a/drivers/scsi/megaraid/megaraid_mm.h
@@ -22,6 +22,7 @@
 #include <linux/moduleparam.h>
 #include <linux/pci.h>
 #include <linux/list.h>
+#include <linux/miscdevice.h>
 
 #include "mbox_defs.h"
 #include "megaraid_ioctl.h"
_
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 13/30] advansys: fix section mismatch warning

2007-12-13 Thread akpm
From: Randy Dunlap [EMAIL PROTECTED]

Fix section mismatch warning:

WARNING: vmlinux.o(.exit.text+0x152a): Section mismatch: reference to 
.init.data:_asc_def_iop_base (between 'advansys_isa_remove' and 'advansys_exit')

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
Cc: Matthew Wilcox [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/advansys.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN drivers/scsi/advansys.c~advansys-fix-section-mismatch-warning 
drivers/scsi/advansys.c
--- a/drivers/scsi/advansys.c~advansys-fix-section-mismatch-warning
+++ a/drivers/scsi/advansys.c
@@ -13906,7 +13906,7 @@ static int advansys_release(struct Scsi_
 
 #define ASC_IOADR_TABLE_MAX_IX  11
 
-static PortAddr _asc_def_iop_base[ASC_IOADR_TABLE_MAX_IX] __devinitdata = {
+static PortAddr _asc_def_iop_base[ASC_IOADR_TABLE_MAX_IX] = {
0x100, 0x0110, 0x120, 0x0130, 0x140, 0x0150, 0x0190,
0x0210, 0x0230, 0x0250, 0x0330
 };
_


[patch 12/30] SCSI/NCR5380: minor irq handler cleanups

2007-12-13 Thread akpm
From: Jeff Garzik [EMAIL PROTECTED]

* remove unnecessary cast

* remove unnecessary use of 'irq' function arg

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/NCR5380.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff -puN drivers/scsi/NCR5380.c~scsi-ncr5380-minor-irq-handler-cleanups 
drivers/scsi/NCR5380.c
--- a/drivers/scsi/NCR5380.c~scsi-ncr5380-minor-irq-handler-cleanups
+++ a/drivers/scsi/NCR5380.c
@@ -1157,16 +1157,17 @@ static void NCR5380_main(struct work_str
  * Locks: takes the needed instance locks
  */
 
-static irqreturn_t NCR5380_intr(int irq, void *dev_id) 
+static irqreturn_t NCR5380_intr(int dummy, void *dev_id)
 {
NCR5380_local_declare();
-   struct Scsi_Host *instance = (struct Scsi_Host *)dev_id;
+   struct Scsi_Host *instance = dev_id;
struct NCR5380_hostdata *hostdata = (struct NCR5380_hostdata *) instance->hostdata;
int done;
unsigned char basr;
unsigned long flags;
 
-   dprintk(NDEBUG_INTR, ("scsi : NCR5380 irq %d triggered\n", irq));
+   dprintk(NDEBUG_INTR, ("scsi : NCR5380 irq %d triggered\n",
+   instance->irq));
 
do {
done = 1;
_


[patch 11/30] SCSI/sym53c416: kill pointless irq handler loop and test

2007-12-13 Thread akpm
From: Jeff Garzik [EMAIL PROTECTED]

- kill pointless irq handler loop to find base address, it is already
  passed to irq handler via Scsi_Host.

- kill now-pointless !base test.
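
The two points above can be sketched in user space as follows. This is only an
illustration: struct host, base_by_irq_scan() and base_from_dev_id() are
hypothetical stand-ins, not the driver's types or functions. The scan mimics
the removed loop, while the second helper mimics reading the base address
from the pointer that was registered with the handler.

```c
#include <stddef.h>

struct host {
	int irq;	/* interrupt line the adapter uses */
	int io_port;	/* base I/O address of the adapter */
};

/* Old style: scan a table for the first host registered on this irq. */
int base_by_irq_scan(const struct host *hosts, size_t n, int irq)
{
	for (size_t i = 0; i < n; i++)
		if (hosts[i].irq == irq)
			return hosts[i].io_port;
	return 0;	/* no adapter found */
}

/* New style: the registration cookie already identifies the host. */
int base_from_dev_id(void *dev_id)
{
	const struct host *dev = dev_id;

	return dev->io_port;
}
```

One case where the two differ, assuming two adapters ever shared a line: the
scan always returns the first match, whereas the cookie names the adapter the
handler was actually registered for. A missing adapter also cannot occur in
the second form, which is why the !base test becomes unnecessary.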

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Cc: Matthew Wilcox [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sym53c416.c |   16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff -puN 
drivers/scsi/sym53c416.c~scsi-sym53c416-kill-pointless-irq-handler-loop-and-test
 drivers/scsi/sym53c416.c
--- 
a/drivers/scsi/sym53c416.c~scsi-sym53c416-kill-pointless-irq-handler-loop-and-test
+++ a/drivers/scsi/sym53c416.c
@@ -328,27 +328,13 @@ static __inline__ unsigned int sym53c416
 static irqreturn_t sym53c416_intr_handle(int irq, void *dev_id)
 {
struct Scsi_Host *dev = dev_id;
-   int base = 0;
+   int base = dev->io_port;
int i;
unsigned long flags = 0;
unsigned char status_reg, pio_int_reg, int_reg;
struct scatterlist *sg;
unsigned int tot_trans = 0;
 
-   /* We search the base address of the host adapter which caused the interrupt */
-   /* FIXME: should pass dev_id sensibly as hosts[i] */
-   for(i = 0; i < host_index && !base; i++)
-   if(irq == hosts[i].irq)
-   base = hosts[i].base;
-   /* If no adapter found, we cannot handle the interrupt. Leave a message */
-   /* and continue. This should never happen... */
-   if(!base)
-   {
-   printk(KERN_ERR "sym53c416: No host adapter defined for interrupt %d\n", irq);
-   return IRQ_NONE;
-   }
-   /* Now we have the base address and we can start handling the interrupt */
-
spin_lock_irqsave(dev->host_lock,flags);
status_reg = inb(base + STATUS_REG);
pio_int_reg = inb(base + PIO_INT_REG);
_


[patch 15/30] sym2: fix section mismatch warning

2007-12-13 Thread akpm
From: Randy Dunlap [EMAIL PROTECTED]

Fix section mismatch warning:

WARNING: vmlinux.o(.text+0x1ff3a2): Section mismatch: reference to 
.exit.text:sym2_remove (between 'sym2_io_error_detected' and 'sym_xpt_done')

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Cc: Matthew Wilcox [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sym53c8xx_2/sym_glue.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN drivers/scsi/sym53c8xx_2/sym_glue.c~sym2-fix-section-mismatch-warning 
drivers/scsi/sym53c8xx_2/sym_glue.c
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c~sym2-fix-section-mismatch-warning
+++ a/drivers/scsi/sym53c8xx_2/sym_glue.c
@@ -1744,7 +1744,7 @@ static int __devinit sym2_probe(struct p
return -ENODEV;
 }
 
-static void __devexit sym2_remove(struct pci_dev *pdev)
+static void sym2_remove(struct pci_dev *pdev)
 {
struct Scsi_Host *shost = pci_get_drvdata(pdev);
 
@@ -2056,7 +2056,7 @@ static struct pci_driver sym2_driver = {
.name   = NAME53C8XX,
.id_table   = sym2_id_table,
.probe  = sym2_probe,
-   .remove = __devexit_p(sym2_remove),
+   .remove = sym2_remove,
.err_handler= sym2_err_handler,
 };
 
_


[patch 10/30] SCSI/gdth: kill unneeded 'irq' argument

2007-12-13 Thread akpm
From: Jeff Garzik [EMAIL PROTECTED]

Neither gdth_get_status() nor __gdth_interrupt() need their 'irq' argument,
so remove it.

[EMAIL PROTECTED]: coding style fixes]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Acked-by: Boaz Harrosh [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/gdth.c |   22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff -puN drivers/scsi/gdth.c~scsi-gdth-kill-unneeded-irq-argument 
drivers/scsi/gdth.c
--- a/drivers/scsi/gdth.c~scsi-gdth-kill-unneeded-irq-argument
+++ a/drivers/scsi/gdth.c
@@ -141,7 +141,7 @@
 static void gdth_delay(int milliseconds);
 static void gdth_eval_mapping(ulong32 size, ulong32 *cyls, int *heads, int *secs);
 static irqreturn_t gdth_interrupt(int irq, void *dev_id);
-static irqreturn_t __gdth_interrupt(gdth_ha_str *ha, int irq,
+static irqreturn_t __gdth_interrupt(gdth_ha_str *ha,
 int gdth_from_wait, int* pIndex);
 static int gdth_sync_event(gdth_ha_str *ha, int service, unchar index,
Scsi_Cmnd *scp);
@@ -165,7 +165,6 @@ static int gdth_internal_cache_cmd(gdth_
 static int gdth_fill_cache_cmd(gdth_ha_str *ha, Scsi_Cmnd *scp, ushort hdrive);
 
 static void gdth_enable_int(gdth_ha_str *ha);
-static unchar gdth_get_status(gdth_ha_str *ha, int irq);
 static int gdth_test_busy(gdth_ha_str *ha);
 static int gdth_get_cmd_index(gdth_ha_str *ha);
 static void gdth_release_event(gdth_ha_str *ha);
@@ -1334,14 +1333,12 @@ static void __init gdth_enable_int(gdth_
 }
 
 /* return IStatus if interrupt was from this card else 0 */
-static unchar gdth_get_status(gdth_ha_str *ha, int irq)
+static unchar gdth_get_status(gdth_ha_str *ha)
 {
 unchar IStatus = 0;
 
-TRACE(("gdth_get_status() irq %d ctr_count %d\n", irq, gdth_ctr_count));
+TRACE(("gdth_get_status() irq %d ctr_count %d\n", ha->irq, gdth_ctr_count));
 
-if (ha->irq != (unchar)irq) /* check IRQ */
-return false;
 if (ha->type == GDT_EISA)
 IStatus = inb((ushort)ha->bmic + EDOORREG);
 else if (ha->type == GDT_ISA)
@@ -1523,7 +1520,7 @@ static int gdth_wait(gdth_ha_str *ha, in
 return 1;   /* no wait required */
 
 do {
-__gdth_interrupt(ha, (int)ha->irq, true, &wait_index);
+   __gdth_interrupt(ha, true, &wait_index);
 if (wait_index == index) {
 answer_found = TRUE;
 break;
@@ -3036,7 +3033,7 @@ static void gdth_clear_events(void)
 
 /* SCSI interface functions */
 
-static irqreturn_t __gdth_interrupt(gdth_ha_str *ha, int irq,
+static irqreturn_t __gdth_interrupt(gdth_ha_str *ha,
 int gdth_from_wait, int* pIndex)
 {
 gdt6m_dpram_str __iomem *dp6m_ptr = NULL;
@@ -3054,7 +3051,7 @@ static irqreturn_t __gdth_interrupt(gdth
 int act_int_coal = 0;   
 #endif
 
-TRACE(("gdth_interrupt() IRQ %d\n",irq));
+TRACE(("gdth_interrupt() IRQ %d\n", ha->irq));
 
 /* if polling and not from gdth_wait() - return */
 if (gdth_polling) {
@@ -3067,7 +3064,8 @@ static irqreturn_t __gdth_interrupt(gdth
 spin_lock_irqsave(&ha->smp_lock, flags);
 
 /* search controller */
-if (0 == (IStatus = gdth_get_status(ha, irq))) {
+IStatus = gdth_get_status(ha);
+if (IStatus == 0) {
 /* spurious interrupt */
 if (!gdth_polling)
 spin_unlock_irqrestore(&ha->smp_lock, flags);
@@ -3294,9 +3292,9 @@ static irqreturn_t __gdth_interrupt(gdth
 
 static irqreturn_t gdth_interrupt(int irq, void *dev_id)
 {
-   gdth_ha_str *ha = (gdth_ha_str *)dev_id;
+   gdth_ha_str *ha = dev_id;
 
-   return __gdth_interrupt(ha, irq, false, NULL);
+   return __gdth_interrupt(ha, false, NULL);
 }
 
 static int gdth_sync_event(gdth_ha_str *ha, int service, unchar index,
_


[patch 06/30] ips: trim trailing whitespace

2007-12-13 Thread akpm
From: Jeff Garzik [EMAIL PROTECTED]

[EMAIL PROTECTED]: coding style fixes]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Acked-by: Salyzyn, Mark [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/ips.c |   49 +--
 drivers/scsi/ips.h |   12 +-
 2 files changed, 31 insertions(+), 30 deletions(-)

diff -puN drivers/scsi/ips.c~ips-trim-trailing-whitespace drivers/scsi/ips.c
--- a/drivers/scsi/ips.c~ips-trim-trailing-whitespace
+++ a/drivers/scsi/ips.c
@@ -389,17 +389,17 @@ static struct  pci_device_id  ips_pci_ta
 MODULE_DEVICE_TABLE( pci, ips_pci_table );
 
 static char ips_hot_plug_name[] = "ips";
-   
+
 static int __devinit  ips_insert_device(struct pci_dev *pci_dev, const struct 
pci_device_id *ent);
 static void __devexit ips_remove_device(struct pci_dev *pci_dev);
-   
+
 static struct pci_driver ips_pci_driver = {
.name   = ips_hot_plug_name,
.id_table   = ips_pci_table,
.probe  = ips_insert_device,
.remove = __devexit_p(ips_remove_device),
 };
-   
+
 
 /*
  * Necessary forward function protoypes
@@ -587,7 +587,7 @@ static void
 ips_setup_funclist(ips_ha_t * ha)
 {
 
-   /*
+   /*
 * Setup Functions
 */
if (IPS_IS_MORPHEUS(ha) || IPS_IS_MARCO(ha)) {
@@ -2081,7 +2081,7 @@ ips_host_info(ips_ha_t * ha, char *ptr, 
 /* That keeps everything happy for text operations on the proc file. */
 
if (le32_to_cpu(ha->nvram->signature) == IPS_NVRAM_P5_SIG) {
-if (ha->nvram->bios_low[3] == 0) { 
+   if (ha->nvram->bios_low[3] == 0) {
 copy_info(&info,
  "\tBIOS Version  : %c%c%c%c%c%c%c\n",
  ha->nvram->bios_high[0], ha->nvram->bios_high[1],
@@ -2780,10 +2780,11 @@ ips_next(ips_ha_t * ha, int intr)
scb->dcdb.cmd_attribute =
ips_command_direction[scb->scsi_cmd->cmnd[0]];
 
-/* Allow a WRITE BUFFER Command to Have no Data */
-/* This is Used by Tape Flash Utilites  */
-if ((scb->scsi_cmd->cmnd[0] == WRITE_BUFFER) && (scb->data_len == 0)) 
-scb->dcdb.cmd_attribute = 0;  
+   /* Allow a WRITE BUFFER Command to Have no Data */
+   /* This is Used by Tape Flash Utilites  */
+   if ((scb->scsi_cmd->cmnd[0] == WRITE_BUFFER) &&
+   (scb->data_len == 0))
+   scb->dcdb.cmd_attribute = 0;
 
if (!(scb->dcdb.cmd_attribute & 0x3))
scb->dcdb.transfer_length = 0;
@@ -3404,7 +3405,7 @@ ips_map_status(ips_ha_t * ha, ips_scb_t 
 
/* Restrict access to physical DASD */
if (scb->scsi_cmd->cmnd[0] == INQUIRY) {
-   ips_scmd_buf_read(scb->scsi_cmd, 
+   ips_scmd_buf_read(scb->scsi_cmd,
   &inquiryData, sizeof (inquiryData));
if ((inquiryData.DeviceType & 0x1f) == TYPE_DISK) {
errcode = DID_TIME_OUT;
@@ -4090,10 +4091,10 @@ ips_chkstatus(ips_ha_t * ha, IPS_STATUS 
scb->scsi_cmd->result = errcode << 16;
} else {/* bus == 0 */
/* restrict access to physical drives */
-   if (scb->scsi_cmd->cmnd[0] == INQUIRY) { 
-   ips_scmd_buf_read(scb->scsi_cmd, 
+   if (scb->scsi_cmd->cmnd[0] == INQUIRY) {
+   ips_scmd_buf_read(scb->scsi_cmd,
   &inquiryData, sizeof (inquiryData));
-   if ((inquiryData.DeviceType & 0x1f) == TYPE_DISK) 
+   if ((inquiryData.DeviceType & 0x1f) == TYPE_DISK)
scb->scsi_cmd->result = DID_TIME_OUT << 16;
}
}   /* else */
@@ -4661,8 +4662,8 @@ ips_isinit_morpheus(ips_ha_t * ha)
uint32_t bits;
 
METHOD_TRACE("ips_is_init_morpheus", 1);
-   
-   if (ips_isintr_morpheus(ha)) 
+
+   if (ips_isintr_morpheus(ha))
ips_flush_and_reset(ha);
 
post = readl(ha->mem_ptr + IPS_REG_I960_MSG0);
@@ -4686,7 +4687,7 @@ ips_isinit_morpheus(ips_ha_t * ha)
 /*   state ( was trying to INIT and an interrupt was already pending ) ...  */
 /*  */
 //
-static void 
+static void
 ips_flush_and_reset(ips_ha_t *ha)
 {
ips_scb_t *scb;
@@ -4718,9 +4719,9 @@ ips_flush_and_reset(ips_ha_t *ha)
if (ret == IPS_SUCCESS) {
time = 60 * 

[patch 08/30] ips: handle scsi_add_host() failure, and other err cleanups

2007-12-13 Thread akpm
From: Jeff Garzik [EMAIL PROTECTED]

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Acked-by: Salyzyn, Mark [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/ips.c |   18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff -puN 
drivers/scsi/ips.c~ips-handle-scsi_add_host-failure-and-other-err-cleanups 
drivers/scsi/ips.c
--- a/drivers/scsi/ips.c~ips-handle-scsi_add_host-failure-and-other-err-cleanups
+++ a/drivers/scsi/ips.c
@@ -6837,13 +6837,10 @@ ips_register_scsi(int index)
if (request_irq(ha->pcidev->irq, do_ipsintr, IRQF_SHARED, ips_name, ha)) {
IPS_PRINTK(KERN_WARNING, ha->pcidev,
   "Unable to install interrupt handler\n");
-   scsi_host_put(sh);
-   return -1;
+   goto err_out_sh;
}
 
kfree(oldha);
-   ips_sh[index] = sh;
-   ips_ha[index] = ha;
 
/* Store away needed values for later use */
sh->unique_id = (ha->io_addr) ? ha->io_addr : ha->mem_addr;
@@ -6859,10 +6856,21 @@ ips_register_scsi(int index)
sh->max_channel = ha->nbus - 1;
sh->can_queue = ha->max_cmds - 1;
 
-   scsi_add_host(sh, NULL);
+   if (scsi_add_host(sh, &ha->pcidev->dev))
+   goto err_out;
+
+   ips_sh[index] = sh;
+   ips_ha[index] = ha;
+
scsi_scan_host(sh);
 
return 0;
+
+err_out:
+   free_irq(ha->pcidev->irq, ha);
+err_out_sh:
+   scsi_host_put(sh);
+   return -1;
 }
 
 /*---*/
_


[patch 05/30] ips: remove ips_ha members that duplicate struct pci_dev members

2007-12-13 Thread akpm
From: Jeff Garzik [EMAIL PROTECTED]

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Acked-by: Salyzyn, Mark [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/ips.c |  178 ---
 drivers/scsi/ips.h |   20 +---
 2 files changed, 91 insertions(+), 107 deletions(-)

diff -puN 
drivers/scsi/ips.c~ips-remove-ips_ha-members-that-duplicate-struct-pci_dev-members
 drivers/scsi/ips.c
--- 
a/drivers/scsi/ips.c~ips-remove-ips_ha-members-that-duplicate-struct-pci_dev-members
+++ a/drivers/scsi/ips.c
@@ -707,7 +707,7 @@ ips_release(struct Scsi_Host *sh)
release_region(ha->io_addr, ha->io_len);
 
/* free IRQ */
-   free_irq(ha->irq, ha);
+   free_irq(ha->pcidev->irq, ha);
 
scsi_host_put(sh);
 
@@ -1637,7 +1637,7 @@ ips_make_passthru(ips_ha_t *ha, struct s
return (IPS_FAILURE);
}
 
-   if (ha->device_id == IPS_DEVICEID_COPPERHEAD &&
+   if (ha->pcidev->device == IPS_DEVICEID_COPPERHEAD &&
pt->CoppCP.cmd.flashfw.op_code ==
IPS_CMD_RW_BIOSFW) {
ret = ips_flash_copperhead(ha, pt, scb);
@@ -2021,7 +2021,7 @@ ips_cleanup_passthru(ips_ha_t * ha, ips_
pt->ExtendedStatus = scb->extended_status;
pt->AdapterType = ha->ad_type;
 
-   if (ha->device_id == IPS_DEVICEID_COPPERHEAD &&
+   if (ha->pcidev->device == IPS_DEVICEID_COPPERHEAD &&
(scb->cmd.flashfw.op_code == IPS_CMD_DOWNLOAD ||
 scb->cmd.flashfw.op_code == IPS_CMD_RW_BIOSFW))
ips_free_flash_copperhead(ha);
@@ -2075,7 +2075,7 @@ ips_host_info(ips_ha_t * ha, char *ptr, 
  ha->mem_ptr);
}
 
-   copy_info(&info, "\tIRQ number: %d\n", ha->irq);
+   copy_info(&info, "\tIRQ number: %d\n", ha->pcidev->irq);
 
 /* For the Next 3 lines Check for Binary 0 at the end and don't include it if it's there. */
 /* That keeps everything happy for text operations on the proc file. */
@@ -2232,31 +2232,31 @@ ips_identify_controller(ips_ha_t * ha)
 {
METHOD_TRACE("ips_identify_controller", 1);
 
-   switch (ha->device_id) {
+   switch (ha->pcidev->device) {
case IPS_DEVICEID_COPPERHEAD:
-   if (ha->revision_id <= IPS_REVID_SERVERAID) {
+   if (ha->pcidev->revision <= IPS_REVID_SERVERAID) {
ha->ad_type = IPS_ADTYPE_SERVERAID;
-   } else if (ha->revision_id == IPS_REVID_SERVERAID2) {
+   } else if (ha->pcidev->revision == IPS_REVID_SERVERAID2) {
ha->ad_type = IPS_ADTYPE_SERVERAID2;
-   } else if (ha->revision_id == IPS_REVID_NAVAJO) {
+   } else if (ha->pcidev->revision == IPS_REVID_NAVAJO) {
ha->ad_type = IPS_ADTYPE_NAVAJO;
-   } else if ((ha->revision_id == IPS_REVID_SERVERAID2)
+   } else if ((ha->pcidev->revision == IPS_REVID_SERVERAID2)
&& (ha->slot_num == 0)) {
ha->ad_type = IPS_ADTYPE_KIOWA;
-   } else if ((ha->revision_id >= IPS_REVID_CLARINETP1) &&
-  (ha->revision_id <= IPS_REVID_CLARINETP3)) {
+   } else if ((ha->pcidev->revision >= IPS_REVID_CLARINETP1) &&
+  (ha->pcidev->revision <= IPS_REVID_CLARINETP3)) {
if (ha->enq->ucMaxPhysicalDevices == 15)
ha->ad_type = IPS_ADTYPE_SERVERAID3L;
else
ha->ad_type = IPS_ADTYPE_SERVERAID3;
-   } else if ((ha->revision_id >= IPS_REVID_TROMBONE32) &&
-  (ha->revision_id <= IPS_REVID_TROMBONE64)) {
+   } else if ((ha->pcidev->revision >= IPS_REVID_TROMBONE32) &&
+  (ha->pcidev->revision <= IPS_REVID_TROMBONE64)) {
ha->ad_type = IPS_ADTYPE_SERVERAID4H;
}
break;
 
case IPS_DEVICEID_MORPHEUS:
-   switch (ha->subdevice_id) {
+   switch (ha->pcidev->subsystem_device) {
case IPS_SUBDEVICEID_4L:
ha->ad_type = IPS_ADTYPE_SERVERAID4L;
break;
@@ -2285,7 +2285,7 @@ ips_identify_controller(ips_ha_t * ha)
break;
 
case IPS_DEVICEID_MARCO:
-   switch (ha->subdevice_id) {
+   switch (ha->pcidev->subsystem_device) {
case IPS_SUBDEVICEID_6M:
ha->ad_type = IPS_ADTYPE_SERVERAID6M;
break;
@@ -2332,20 +2332,20 @@ ips_get_bios_version(ips_ha_t * ha, int 
 
strncpy(ha-bios_version,?, 8);
 
-   if (ha->device_id == IPS_DEVICEID_COPPERHEAD) {
+   if (ha->pcidev->device == IPS_DEVICEID_COPPERHEAD) {
 

[patch 20/30] drivers/scsi/sgiwd93.c: export sgiwd93_reset()

2007-12-13 Thread akpm
From: Andrew Morton [EMAIL PROTECTED]

mips allmodconfig:

ERROR: sgiwd93_reset [drivers/scsi/wd33c93.ko] undefined!

Cc: Ralf Baechle [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sgiwd93.c |1 +
 1 file changed, 1 insertion(+)

diff -puN drivers/scsi/sgiwd93.c~drivers-scsi-sgiwd93c-export-sgiwd93_reset 
drivers/scsi/sgiwd93.c
--- a/drivers/scsi/sgiwd93.c~drivers-scsi-sgiwd93c-export-sgiwd93_reset
+++ a/drivers/scsi/sgiwd93.c
@@ -159,6 +159,7 @@ void sgiwd93_reset(unsigned long base)
udelay(50);
hregs->ctrl = 0;
 }
+EXPORT_SYMBOL_GPL(sgiwd93_reset);
 
 static inline void init_hpc_chain(struct hpc_data *hd)
 {
_


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Andrew Morton wrote:

On Thu, 13 Dec 2007 19:30:00 -0500
Mark Lord [EMAIL PROTECTED] wrote:


Here's the commit that causes the regression:

...

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -760,7 +760,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
struct page *page = __rmqueue(zone, order, migratetype);
if (unlikely(page == NULL))
break;
-   list_add_tail(&page->lru, list);
+   list_add(&page->lru, list);


well that looks fishy.

..

Yeah.  I missed that, and instead just posted a patch
to search the list in reverse order, which seems to work for me.

I'll try just reversing that line above here now.. gimme 5 minutes or so.

Cheers


[PATCH] fix page_alloc for larger I/O segments

2007-12-13 Thread Mark Lord

Mark Lord wrote:

Mark Lord wrote:

Mark Lord wrote:

Mark Lord wrote:

Andrew Morton wrote:

On Thu, 13 Dec 2007 17:15:06 -0500
James Bottomley [EMAIL PROTECTED] wrote:


On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:

On Thu, 13 Dec 2007 21:09:59 +0100
Jens Axboe [EMAIL PROTECTED] wrote:


OK, it's a vm issue,

cc linux-mm and probable culprit.


I have tens of thousands of backward pages after a
boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
reverse. So it looks like that bug got reintroduced.

Bill Irwin fixed this a couple of years back: changed the page allocator so
that it mostly hands out pages in ascending physical-address order.

I guess we broke that, quite possibly in Mel's page allocator rework.


It would help if you could provide us with a simple recipe for
demonstrating this problem, please.

The simple way seems to be to malloc a large area, touch every page and
then look at the physical pages assigned ... they now mostly seem to be
descending in physical address.



OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...
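
A minimal sketch of that recipe follows, under stated assumptions. It uses the
/proc/self/pagemap entry layout documented for current mainline kernels
(64-bit entries, bit 63 = present, bits 0-54 = PFN); the -mm interface at the
time of this thread may have differed, and recent kernels report a PFN of 0 to
unprivileged readers. The names sample_pfns() and count_backward_pairs() are
hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define PM_PRESENT  (1ULL << 63)	/* page is in RAM (assumed layout) */
#define PM_PFN_MASK ((1ULL << 55) - 1)	/* bits 0-54 carry the PFN */

/* Count neighbours whose frame number is exactly one below the previous. */
size_t count_backward_pairs(const uint64_t *pfn, size_t n)
{
	size_t backward = 0;

	for (size_t i = 1; i < n; i++)
		if (pfn[i] != 0 && pfn[i] + 1 == pfn[i - 1])
			backward++;
	return backward;
}

/*
 * Allocate and touch n pages, then store each page's PFN in pfn[].
 * Entries stay 0 when the kernel hides or lacks the PFN.
 * Returns 0 on success, -1 on failure.
 */
int sample_pfns(uint64_t *pfn, size_t n)
{
	long psz = sysconf(_SC_PAGESIZE);
	FILE *fp = fopen("/proc/self/pagemap", "rb");
	char *raw, *buf;
	int rc = 0;

	if (!fp || psz <= 0) {
		if (fp)
			fclose(fp);
		return -1;
	}
	raw = malloc((n + 1) * (size_t)psz);
	if (!raw) {
		fclose(fp);
		return -1;
	}
	/* round up to a page boundary so every step lands on a new page */
	buf = (char *)(((uintptr_t)raw + psz - 1) & ~((uintptr_t)psz - 1));

	for (size_t i = 0; i < n; i++)
		((volatile char *)buf)[i * psz] = 1;	/* fault the page in */

	for (size_t i = 0; i < n && rc == 0; i++) {
		uint64_t entry = 0;
		uintptr_t vpn = ((uintptr_t)buf + i * psz) / psz;

		if (fseek(fp, (long)(vpn * sizeof(entry)), SEEK_SET) != 0 ||
		    fread(&entry, sizeof(entry), 1, fp) != 1)
			rc = -1;
		pfn[i] = (entry & PM_PRESENT) ? (entry & PM_PFN_MASK) : 0;
	}
	free(raw);
	fclose(fp);
	return rc;
}
```

Calling sample_pfns() for a few thousand pages as root and feeding the result
to count_backward_pairs() gives a rough count of how many neighbouring pages
came back in descending physical order.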

..

I'm actually running the treadmill right now (have been for many hours, actually)
to bisect it to a specific commit.

Thought I was almost done, and then noticed that git-bisect doesn't keep
the Makefile VERSION lines the same, so I was actually running the wrong
kernel after the first few times.. duh.

Wrote a script to fix it now.

..

Well, that was a waste of three hours.

..

Ahh.. it seems to be sensitive to one/both of these:

CONFIG_HIGHMEM64G=y with 4GB RAM:  not so bad, frequently does 20KB - 48KB segments.
CONFIG_HIGHMEM4G=y  with 2GB RAM:  very severe, rarely does more than 8KB segments.
CONFIG_HIGHMEM4G=y  with 3GB RAM:  very severe, rarely does more than 8KB segments.


So if you want to reproduce this on a large memory machine, use mem=2GB for starters.

..

Here's the commit that causes the regression:

535131e6925b4a95f321148ad7293f496e0e58d7 Choose pages from the per-cpu list based on migration type




And here is a patch that seems to fix it for me here:

* * * *

Fix page allocator to give better chance of larger contiguous segments (again).

Signed-off-by: Mark Lord [EMAIL PROTECTED]
---


--- old/mm/page_alloc.c.orig2007-12-13 19:25:15.0 -0500
+++ linux-2.6/mm/page_alloc.c   2007-12-13 19:35:50.0 -0500
@@ -954,7 +954,7 @@
goto failed;
}
/* Find a page of the appropriate migrate type */
-   list_for_each_entry(page, &pcp->list, lru) {
+   list_for_each_entry_reverse(page, &pcp->list, lru) {
if (page_private(page) == migratetype) {
list_del(&page->lru);
pcp->count--;
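
To illustrate why walking the list in reverse undoes the effect of the
list_add() change, here is a small user-space sketch. It only mimics the
kernel's circular doubly-linked list; struct node, add_head(), add_tail() and
walk() are stand-ins written for this note, and the ascending "pfn" values are
an assumption made for the example.

```c
#include <stddef.h>

struct node {
	int pfn;
	struct node *prev, *next;
};

void list_init(struct node *head)
{
	head->prev = head->next = head;
}

/* Insert right after the head (the behaviour of list_add). */
void add_head(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Insert right before the head, i.e. at the tail (list_add_tail). */
void add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Copy up to max values, walking forward (reverse == 0) or backward. */
size_t walk(const struct node *head, int reverse, int *out, size_t max)
{
	size_t n = 0;
	const struct node *p = reverse ? head->prev : head->next;

	while (p != head && n < max) {
		out[n++] = p->pfn;
		p = reverse ? p->prev : p->next;
	}
	return n;
}
```

If pages 100, 101, 102 are inserted with add_head(), a forward walk yields
102, 101, 100, while a reverse walk yields 100, 101, 102 again, which is the
order a forward walk would have seen with add_tail().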


[patch 16/30] aacraid driver fails with Dell PowerEdge Expandable RAID Controller 3/Di

2007-12-13 Thread akpm
From: Salyzyn, Mark [EMAIL PROTECTED]

As reported in http://bugzilla.kernel.org/show_bug.cgi?id=9133 it was
discovered that the PERC line of controllers lacked a key 64 bit
ScatterGather capable SCSI pass-through function. The adapters are still
capable of 64 bit ScatterGather I/O commands, but these two can not be
mixed. This problem was exacerbated by the introduction of the SCSI
Generic access to the DASD physical devices.

The fix for users before this patch is applied is aacraid.dacmode=0 on
the kernel command line to disable 64 bit I/O.

The enclosed patch introduces a new adapter quirk and tries to limp
along by enabling pass-through in situations where memory is 32 bit
addressable on 64 bit machines, or by disabling the pass-through functions
altogether. I expect the check for 32 bit addressable memory to be
controversial in that it can be incorrect in non-Dell non-Intel systems
that PERC would never be installed under; the alternative is to disable
pass-through in all cases, which could be reported as another regression.

Pass-through is used for SCSI Generic access to the physical devices, or
for the management applications to properly function.

In systems where this patch has disabled pass-through because it is
unsupportable in combination with I/O performance, the user can choose
to enable pass-through by turning off dacmode (aacraid.dacmode=0) or
limiting the discovered kernel memory (mem=4G) with an associated loss
in runtime performance. If we chose instead to turn off 64 bit dacmode
for the adapters with this quirk, then this would be reported as another
regression.
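
The decision described above can be sketched in user space roughly as follows.
This is my reading of the check in the patch, not the driver code itself:
passthrough_blocked() and its parameters are hypothetical names, with the
caller supplying what the kernel would take from sizeof(dma_addr_t), the
physical page count, PAGE_SHIFT and the adapter's 64 bit scatter-gather flag.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true when SCSI pass-through would have to be refused on an
 * adapter carrying the quirk: DMA addresses are wider than 32 bits,
 * there is more memory than 32 bits can address, and the adapter is
 * running with 64 bit scatter-gather enabled.
 */
bool passthrough_blocked(size_t dma_addr_size, unsigned long long num_pages,
			 unsigned int page_shift, bool host64_sg)
{
	return dma_addr_size > 4 &&
	       num_pages > (0xFFFFFFFFULL >> page_shift) &&
	       host64_sg;
}
```

With 4 KB pages (page_shift 12) the threshold is 0xFFFFF pages, so a machine
with 8 GB (2097152 pages) would be refused while one with 2 GB (524288 pages)
would still get the 32 bit pass-through path.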

Signed-off-by: Mark Salyzyn [EMAIL PROTECTED]
Cc: Marcin Krol [EMAIL PROTECTED]
Cc: Matt Domsch [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/aacraid/aachba.c  |   15 ++-
 drivers/scsi/aacraid/aacraid.h |6 
 drivers/scsi/aacraid/commsup.c |6 ++--
 drivers/scsi/aacraid/linit.c   |   42 +--
 4 files changed, 47 insertions(+), 22 deletions(-)

diff -puN 
drivers/scsi/aacraid/aachba.c~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di
 drivers/scsi/aacraid/aachba.c
--- 
a/drivers/scsi/aacraid/aachba.c~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di
+++ a/drivers/scsi/aacraid/aachba.c
@@ -1190,6 +1190,15 @@ static int aac_scsi_32(struct fib * fib,
  (fib_callback) aac_srb_callback, (void *) cmd);
 }
 
+static int aac_scsi_32_64(struct fib * fib, struct scsi_cmnd * cmd)
+{
+   if ((sizeof(dma_addr_t) > 4) &&
+(num_physpages > (0xFFFFFFFFULL >> PAGE_SHIFT)) &&
+(fib->dev->adapter_info.options & AAC_OPT_SGMAP_HOST64))
+   return FAILED;
+   return aac_scsi_32(fib, cmd);
+}
+
 int aac_get_adapter_info(struct aac_dev* dev)
 {
struct fib* fibptr;
@@ -1267,6 +1276,8 @@ int aac_get_adapter_info(struct aac_dev*
 1, 1,
 NULL, NULL);
 
+   /* reasoned default */
+   dev->maximum_num_physicals = 16;
if (rcode >= 0 && le32_to_cpu(bus_info->Status) == ST_OK) {
dev->maximum_num_physicals = le32_to_cpu(bus_info->TargetsPerBus);
dev->maximum_num_channels = le32_to_cpu(bus_info->BusCount);
@@ -1376,7 +1387,9 @@ int aac_get_adapter_info(struct aac_dev*
 * interface.
 */
dev->a_ops.adapter_scsi = (dev->dac_support)
-   ? aac_scsi_64
+ ? ((aac_get_driver_ident(dev->cardtype)->quirks & AAC_QUIRK_SCSI_32)
+   ? aac_scsi_32_64
+   : aac_scsi_64)
: aac_scsi_32;
if (dev->raw_io_interface) {
dev->a_ops.adapter_bounds = (dev->raw_io_64)
diff -puN 
drivers/scsi/aacraid/aacraid.h~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di
 drivers/scsi/aacraid/aacraid.h
--- 
a/drivers/scsi/aacraid/aacraid.h~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di
+++ a/drivers/scsi/aacraid/aacraid.h
@@ -521,6 +521,12 @@ struct aac_driver_ident
 #define AAC_QUIRK_17SG 0x0010
 
 /*
+ * Some adapter firmware does not support 64 bit scsi passthrough
+ * commands.
+ */
+#define AAC_QUIRK_SCSI_32  0x0020
+
+/*
  * The adapter interface specs all queues to be located in the same
  * physically contigous block. The host structure that defines the
  * commuication queues will assume they are each a separate physically
diff -puN 
drivers/scsi/aacraid/commsup.c~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di
 drivers/scsi/aacraid/commsup.c
--- 
a/drivers/scsi/aacraid/commsup.c~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di
+++ a/drivers/scsi/aacraid/commsup.c
@@ -1099,7 +1099,8 @@ static int _aac_reset_adapter(struct aac
free_irq(aac->pdev->irq, aac);
kfree(aac->fsa_dev);
 

[patch 28/30] scsi: arm: convert to accessors and !use_sg cleanup

2007-12-13 Thread akpm
From: Boaz Harrosh [EMAIL PROTECTED]

 - convert to accessors and !use_sg cleanup

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Cc: Russell King [EMAIL PROTECTED]
Signed-off-by: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/arm/acornscsi.c |   14 ++---
 drivers/scsi/arm/scsi.h  |   34 ++---
 2 files changed, 18 insertions(+), 30 deletions(-)

diff -puN drivers/scsi/arm/acornscsi.c~scsi-pending-arm-convert-to-accessors 
drivers/scsi/arm/acornscsi.c
--- a/drivers/scsi/arm/acornscsi.c~scsi-pending-arm-convert-to-accessors
+++ a/drivers/scsi/arm/acornscsi.c
@@ -1790,7 +1790,7 @@ int acornscsi_starttransfer(AS_Host *hos
return 0;
 }
 
-residual = host->SCpnt->request_bufflen - host->scsi.SCp.scsi_xferred;
+residual = scsi_bufflen(host->SCpnt) - host->scsi.SCp.scsi_xferred;
 
 sbic_arm_write(host->scsi.io_port, SBIC_SYNCHTRANSFER, 
host->device[host->SCpnt->device->id].sync_xfer);
 sbic_arm_writenext(host->scsi.io_port, residual >> 16);
@@ -2270,7 +2270,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h
case 0x4b:  /* - PHASE_STATUSIN
*/
case 0x8b:  /* - PHASE_STATUSIN
*/
/* DATA IN - STATUS */
-   host->scsi.SCp.scsi_xferred = host->SCpnt->request_bufflen -
+   host->scsi.SCp.scsi_xferred = scsi_bufflen(host->SCpnt) -
  acornscsi_sbic_xfcount(host);
acornscsi_dma_stop(host);
acornscsi_readstatusbyte(host);
@@ -2281,7 +2281,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h
case 0x4e:  /* - PHASE_MSGOUT  
*/
case 0x8e:  /* - PHASE_MSGOUT  
*/
/* DATA IN - MESSAGE OUT */
-   host->scsi.SCp.scsi_xferred = host->SCpnt->request_bufflen -
+   host->scsi.SCp.scsi_xferred = scsi_bufflen(host->SCpnt) -
  acornscsi_sbic_xfcount(host);
acornscsi_dma_stop(host);
acornscsi_sendmessage(host);
@@ -2291,7 +2291,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h
case 0x4f:  /* message in   
*/
case 0x8f:  /* message in   
*/
/* DATA IN - MESSAGE IN */
-   host->scsi.SCp.scsi_xferred = host->SCpnt->request_bufflen -
+   host->scsi.SCp.scsi_xferred = scsi_bufflen(host->SCpnt) -
  acornscsi_sbic_xfcount(host);
acornscsi_dma_stop(host);
acornscsi_message(host);/* - PHASE_MSGIN, PHASE_DISCONNECT 
*/
@@ -2319,7 +2319,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h
case 0x4b:  /* - PHASE_STATUSIN
*/
case 0x8b:  /* - PHASE_STATUSIN
*/
/* DATA OUT - STATUS */
-   host->scsi.SCp.scsi_xferred = host->SCpnt->request_bufflen -
+   host->scsi.SCp.scsi_xferred = scsi_bufflen(host->SCpnt) -
  acornscsi_sbic_xfcount(host);
acornscsi_dma_stop(host);
acornscsi_dma_adjust(host);
@@ -2331,7 +2331,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h
case 0x4e:  /* - PHASE_MSGOUT  
*/
case 0x8e:  /* - PHASE_MSGOUT  
*/
/* DATA OUT - MESSAGE OUT */
-   host->scsi.SCp.scsi_xferred = host->SCpnt->request_bufflen -
+   host->scsi.SCp.scsi_xferred = scsi_bufflen(host->SCpnt) -
  acornscsi_sbic_xfcount(host);
acornscsi_dma_stop(host);
acornscsi_dma_adjust(host);
@@ -2342,7 +2342,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h
case 0x4f:  /* message in   
*/
case 0x8f:  /* message in   
*/
/* DATA OUT - MESSAGE IN */
-   host->scsi.SCp.scsi_xferred = host->SCpnt->request_bufflen -
+   host->scsi.SCp.scsi_xferred = scsi_bufflen(host->SCpnt) -
  acornscsi_sbic_xfcount(host);
acornscsi_dma_stop(host);
acornscsi_dma_adjust(host);
diff -puN drivers/scsi/arm/scsi.h~scsi-pending-arm-convert-to-accessors 
drivers/scsi/arm/scsi.h
--- a/drivers/scsi/arm/scsi.h~scsi-pending-arm-convert-to-accessors
+++ a/drivers/scsi/arm/scsi.h
@@ -68,46 +68,34 @@ static inline void init_SCp(struct scsi_
 {
memset(&SCpnt->SCp, 0, sizeof(struct scsi_pointer));
 
-   if (SCpnt->use_sg) {
+   if (scsi_bufflen(SCpnt)) {
unsigned long len = 0;
int buf;
 
- 

[patch 30/30] libsas: convert ATA bridge to use new EH

2007-12-13 Thread akpm
From: Darrick J. Wong [EMAIL PROTECTED]

Migrate the sas_ata bridge to use the new libata EH strategy, and
finally implement correct software reset.

WARNING WARNING WARNING!  This patch is for experimental use only; it is
nowhere near complete!  Especially the sas_ata_freeze() function.  This
patch may eat your data and kill your trees.

jgarzik: If an ATA command was in progress at the time of a port freeze,
can it complete after thawing?  (Does that even make sense?)

[EMAIL PROTECTED]: coding-style fixes]
Comments-requested-by: Darrick J. Wong [EMAIL PROTECTED]
Cc: Jeff Garzik [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/libsas/sas_ata.c |   86 ++--
 1 file changed, 71 insertions(+), 15 deletions(-)

diff -puN drivers/scsi/libsas/sas_ata.c~libsas-convert-ata-bridge-to-use-new-eh 
drivers/scsi/libsas/sas_ata.c
--- a/drivers/scsi/libsas/sas_ata.c~libsas-convert-ata-bridge-to-use-new-eh
+++ a/drivers/scsi/libsas/sas_ata.c
@@ -35,6 +35,8 @@
 #include "../scsi_transport_api.h"
 #include <scsi/scsi_eh.h>
 
+static int sas_issue_ata_srst(struct domain_device *dev);
+
 static enum ata_completion_errors sas_to_ata_err(struct task_status_struct *ts)
 {
/* Cheesy attempt to translate SAS errors into ATA.  Hah! */
@@ -233,37 +235,58 @@ static u8 sas_ata_check_status(struct at
return dev->sata_dev.tf.command;
 }
 
-static void sas_ata_phy_reset(struct ata_port *ap)
+static void sas_ata_freeze(struct ata_port *ap)
 {
-   struct domain_device *dev = ap->private_data;
-   struct sas_internal *i =
-   to_sas_internal(dev->port->ha->core.shost->transportt);
-   int res = 0;
+   /* reroute qc_done for all qc's on this port to a dumb free func */
+   /* i wonder if we can get away with throwing out anything that
+* completes in this time frame, or if we must find the commands
+* that are in progress and cancel only those? */
+   printk(KERN_ERR "%s: STUB\n", __FUNCTION__);
+}
 
-   if (i->dft->lldd_I_T_nexus_reset)
-   res = i->dft->lldd_I_T_nexus_reset(dev);
+static void sas_ata_thaw(struct ata_port *ap)
+{
+   /* empty */
+   printk(KERN_ERR "%s: STUB\n", __FUNCTION__);
+}
 
-   if (res)
-   SAS_DPRINTK("%s: Unable to reset I T nexus?\n", __FUNCTION__);
+static int sas_ata_soft_reset(struct ata_link *link, unsigned int *classes,
+  unsigned long deadline)
+{
+   struct ata_port *ap = link->ap;
+   struct domain_device *dev = ap->private_data;
+   int res;
 
+   /* Send SRST to device */
+   res = sas_issue_ata_srst(dev);
+   printk(KERN_ERR "srst 0 returns %d\n", res);
+
+   /* Set new device type */
switch (dev->sata_dev.command_set) {
case ATA_COMMAND_SET:
SAS_DPRINTK("%s: Found ATA device.\n", __FUNCTION__);
-   ap->link.device[0].class = ATA_DEV_ATA;
+   *classes = ATA_DEV_ATA;
break;
case ATAPI_COMMAND_SET:
SAS_DPRINTK("%s: Found ATAPI device.\n", __FUNCTION__);
-   ap->link.device[0].class = ATA_DEV_ATAPI;
+   *classes = ATA_DEV_ATAPI;
break;
default:
SAS_DPRINTK("%s: Unknown SATA command set: %d.\n",
__FUNCTION__,
dev->sata_dev.command_set);
-   ap->link.device[0].class = ATA_DEV_UNKNOWN;
-   break;
+   *classes = ATA_DEV_UNKNOWN;
+   break;
}
 
-   ap->cbl = ATA_CBL_SATA;
+   /* FIXME: What if SRST fails? */
+   return 0;
+}
+
+static void sas_ata_error_handler(struct ata_port *ap)
+{
+   ata_do_eh(ap, NULL, sas_ata_soft_reset, NULL, NULL);
+   /* uh... hopefully there's no commands left in here? */
 }
 
 static void sas_ata_post_internal(struct ata_queued_cmd *qc)
@@ -353,7 +376,9 @@ static struct ata_port_operations sas_sa
.check_status   = sas_ata_check_status,
.check_altstatus= sas_ata_check_status,
.dev_select = ata_noop_dev_select,
-   .phy_reset  = sas_ata_phy_reset,
+   .error_handler  = sas_ata_error_handler,
+   .freeze = sas_ata_freeze,
+   .thaw   = sas_ata_thaw,
.post_internal_cmd  = sas_ata_post_internal,
.tf_read= sas_ata_tf_read,
.qc_prep= ata_noop_qc_prep,
@@ -658,6 +683,37 @@ out:
return res;
 }
 
+static int sas_issue_ata_srst(struct domain_device *dev)
+{
+   int res = 0;
+   struct sas_task *task;
+   struct dev_to_host_fis *d2h_fis = (struct dev_to_host_fis *)
+   dev-frame_rcvd[0];
+
+   res = -ENOMEM;
+   task = sas_alloc_task(GFP_KERNEL);
+   if (!task)
+ 

[patch 19/30] Dell CERC support for megaraid_mbox

2007-12-13 Thread akpm
From: Hannes Reinecke [EMAIL PROTECTED]

Newer Dell CERC firmware (>= 6.62) implements random deletion handling
compatible with the legacy megaraid driver.  The legacy handling shifted
the target ID by 0x80 only for I/O commands (READ/WRITE/etc), whereas
megaraid_mbox always shifts the target ID if random deletion is supported.
This resulted in megaraid_mbox sending an INQUIRY to the wrong channel and,
consequently, not finding any devices.

So we disable the random deletion support if the offending firmware is
found.
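
The firmware test in this patch amounts to a character-by-character
comparison of the version string against "6.62".  A minimal sketch in
plain C, assuming the string has the form "M.mm" with a single-digit major
version; the helper name cerc_fw_disables_random_del() is hypothetical and
does not exist in the driver:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: returns true when a
 * firmware version string such as "6.62" is 6.62 or newer.
 * Only characters 0, 2 and 3 are inspected, mirroring the indices
 * the patch reads from adapter->fw_version[].
 */
static bool cerc_fw_disables_random_del(const char *fw_version)
{
	if (strlen(fw_version) < 4)
		return false;	/* too short to compare; keep default behaviour */

	if (fw_version[0] != '6')
		return fw_version[0] > '6';	/* major version decides */
	if (fw_version[2] != '6')
		return fw_version[2] > '6';	/* first minor digit decides */
	return fw_version[3] > '1';		/* 6.62 and up */
}
```

With this reading, "6.61" keeps random deletion enabled while "6.62",
"6.70" and "7.00" disable it.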

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=6695

Signed-off-by: Hannes Reinecke [EMAIL PROTECTED]
Cc: Patro, Sumant [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/megaraid/megaraid_mbox.c |   17 +
 drivers/scsi/megaraid/megaraid_mbox.h |1 +
 2 files changed, 18 insertions(+)

diff -puN 
drivers/scsi/megaraid/megaraid_mbox.c~dell-cerc-support-for-megaraid_mbox 
drivers/scsi/megaraid/megaraid_mbox.c
--- a/drivers/scsi/megaraid/megaraid_mbox.c~dell-cerc-support-for-megaraid_mbox
+++ a/drivers/scsi/megaraid/megaraid_mbox.c
@@ -3169,6 +3169,23 @@ megaraid_mbox_support_random_del(adapter
uint8_t raw_mbox[sizeof(mbox_t)];
int rval;
 
+   /*
+* Newer firmware on Dell CERC expect a different
+* random deletion handling, so disable it.
+*/
+   if (adapter->pdev->vendor == PCI_VENDOR_ID_AMI &&
+   adapter->pdev->device == PCI_DEVICE_ID_AMI_MEGARAID3 &&
+   adapter->pdev->subsystem_vendor == PCI_VENDOR_ID_DELL &&
+   adapter->pdev->subsystem_device == PCI_SUBSYS_ID_CERC_ATA100_4CH &&
+   (adapter->fw_version[0] > '6' ||
+(adapter->fw_version[0] == '6' &&
+ adapter->fw_version[2] > '6') ||
+(adapter->fw_version[0] == '6'
+ && adapter->fw_version[2] == '6'
+ && adapter->fw_version[3] > '1'))) {
+   con_log(CL_DLEVEL1, ("megaraid: disable random deletion\n"));
+   return 0;
+   }
 
mbox = (mbox_t *)raw_mbox;
 
diff -puN 
drivers/scsi/megaraid/megaraid_mbox.h~dell-cerc-support-for-megaraid_mbox 
drivers/scsi/megaraid/megaraid_mbox.h
--- a/drivers/scsi/megaraid/megaraid_mbox.h~dell-cerc-support-for-megaraid_mbox
+++ a/drivers/scsi/megaraid/megaraid_mbox.h
@@ -88,6 +88,7 @@
 #define PCI_SUBSYS_ID_PERC3_QC 0x0471
 #define PCI_SUBSYS_ID_PERC3_DC 0x0493
 #define PCI_SUBSYS_ID_PERC3_SC 0x0475
+#define PCI_SUBSYS_ID_CERC_ATA100_4CH  0x0511
 
 
 #define MBOX_MAX_SCSI_CMDS 128 // number of cmds reserved for kernel
_
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 22/30] sg: nopage

2007-12-13 Thread akpm
From: Nick Piggin [EMAIL PROTECTED]

Convert SG from nopage to fault.
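
The fault handler receives a page offset (pgoff) instead of a faulting
address, converts it to a byte offset, and then walks the reserve buffer's
segments until it finds the one containing that offset.  A userspace
sketch of that walk, assuming the shift operator lost in this archive copy
is a left shift by PAGE_SHIFT; find_segment() and pgoff_to_bytes() are
hypothetical names, not kernel functions:

```c
#include <stddef.h>

/* pgoff counts pages; the segment walk below needs a byte offset. */
static unsigned long pgoff_to_bytes(unsigned long pgoff, unsigned int page_shift)
{
	return pgoff << page_shift;
}

/*
 * Hypothetical helper: locate the segment that contains byte `offset`.
 * Returns the segment index and stores the offset inside that segment
 * in *seg_off, or returns -1 when the offset lies beyond the buffer
 * (the case where the handler reports VM_FAULT_SIGBUS).
 */
static int find_segment(const unsigned long *seg_len, int nsegs,
			unsigned long offset, unsigned long *seg_off)
{
	int k;

	for (k = 0; k < nsegs; k++) {
		if (offset < seg_len[k]) {
			*seg_off = offset;
			return k;
		}
		offset -= seg_len[k];	/* skip this segment */
	}
	return -1;
}
```

The real handler additionally clamps each segment length to the end of
the VMA and takes a page reference before storing the page in vmf.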

Signed-off-by: Nick Piggin [EMAIL PROTECTED]
Cc: Douglas Gilbert [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sg.c |   23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff -puN drivers/scsi/sg.c~sg-nopage drivers/scsi/sg.c
--- a/drivers/scsi/sg.c~sg-nopage
+++ a/drivers/scsi/sg.c
@@ -1144,23 +1144,22 @@ sg_fasync(int fd, struct file *filp, int
return (retval < 0) ? retval : 0;
 }
 
-static struct page *
-sg_vma_nopage(struct vm_area_struct *vma, unsigned long addr, int *type)
+static int
+sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
Sg_fd *sfp;
-   struct page *page = NOPAGE_SIGBUS;
unsigned long offset, len, sa;
Sg_scatter_hold *rsv_schp;
struct scatterlist *sg;
int k;
 
if ((NULL == vma) || (!(sfp = (Sg_fd *) vma->vm_private_data)))
-   return page;
+   return VM_FAULT_SIGBUS;
rsv_schp = &sfp->reserve;
-   offset = addr - vma->vm_start;
+   offset = vmf->pgoff << PAGE_SHIFT;
if (offset >= rsv_schp->bufflen)
-   return page;
-   SCSI_LOG_TIMEOUT(3, printk("sg_vma_nopage: offset=%lu, scatg=%d\n",
+   return VM_FAULT_SIGBUS;
+   SCSI_LOG_TIMEOUT(3, printk("sg_vma_fault: offset=%lu, scatg=%d\n",
   offset, rsv_schp->k_use_sg));
sg = rsv_schp->buffer;
sa = vma->vm_start;
@@ -1169,21 +1168,21 @@ sg_vma_nopage(struct vm_area_struct *vma
len = vma->vm_end - sa;
len = (len < sg->length) ? len : sg->length;
if (offset < len) {
+   struct page *page;
page = virt_to_page(page_address(sg_page(sg)) + offset);
get_page(page); /* increment page count */
-   break;
+   vmf->page = page;
+   return 0; /* success */
}
sa += len;
offset -= len;
}
 
-   if (type)
-   *type = VM_FAULT_MINOR;
-   return page;
+   return VM_FAULT_SIGBUS;
 }
 
 static struct vm_operations_struct sg_mmap_vm_ops = {
-   .nopage = sg_vma_nopage,
+   .fault = sg_vma_fault,
 };
 
 static int
_


[patch 29/30] scsi: bidi support

2007-12-13 Thread akpm
From: Boaz Harrosh [EMAIL PROTECTED]

  At the block level bidi request uses req->next_rq pointer for a second
  bidi_read request.
  At Scsi-midlayer a second scsi_data_buffer structure is used for the
  bidi_read part. This bidi scsi_data_buffer is put on
  request->next_rq->special. Struct scsi_cmnd is not changed.

  - Define scsi_bidi_cmnd() to return true if it is a bidi request and a
second sgtable was allocated.

  - Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer
from this command. This API is to isolate users from the mechanics of
bidi.

  - Define scsi_end_bidi_request() to do what scsi_end_request() does but
for a bidi request. This is necessary because bidi commands are a bit
tricky here. (See comments in body)

  - scsi_release_buffers() will also release the bidi_read scsi_data_buffer

  - scsi_io_completion() on bidi commands will now call
scsi_end_bidi_request() and return.

  - The previous work done in scsi_init_io() is now done in a new
scsi_init_sgtable() (which is 99% identical to old scsi_init_io())
The new scsi_init_io() will call the above twice if needed also for
the bidi_read command. Only at this point is a command bidi.

  - In scsi_error.c at scsi_eh_prep/restore_cmnd() make sure bidi-lld is not
confused by a get-sense command that looks like bidi. This is done
by putting NULL at request->next_rq, and restoring.
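
The accessors described above can be sketched in userspace C.  The
structures below are simplified, hypothetical stand-ins for the kernel
ones and carry only the fields needed to show the idea: a command is bidi
when its request has a second request attached and that second request
carries a data buffer in ->special.

```c
#include <stddef.h>

/* Simplified stand-ins, not the kernel definitions. */
struct scsi_data_buffer {
	unsigned int length;
};

struct request {
	struct request *next_rq;	/* second (bidi_read) request, or NULL */
	void *special;			/* bidi scsi_data_buffer on next_rq */
};

struct scsi_cmnd {
	struct request *request;
	struct scsi_data_buffer sdb;	/* the only buffer for non-bidi commands */
};

/* True when a second request exists and its buffer was allocated. */
static int scsi_bidi_cmnd(const struct scsi_cmnd *cmd)
{
	return cmd->request->next_rq != NULL &&
	       cmd->request->next_rq->special != NULL;
}

/* Data-in buffer: the bidi buffer if present, else the command's own. */
static struct scsi_data_buffer *scsi_in(struct scsi_cmnd *cmd)
{
	return scsi_bidi_cmnd(cmd) ? cmd->request->next_rq->special : &cmd->sdb;
}

/* Data-out buffer: always the command's own buffer. */
static struct scsi_data_buffer *scsi_out(struct scsi_cmnd *cmd)
{
	return &cmd->sdb;
}
```

Callers that only ever use scsi_in()/scsi_out() do not need to know
whether a command is bidi, which is the isolation the description aims at.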

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/scsi_error.c |3 
 drivers/scsi/scsi_lib.c   |  144 
 include/scsi/scsi_cmnd.h  |   23 +
 include/scsi/scsi_eh.h|1 
 4 files changed, 141 insertions(+), 30 deletions(-)

diff -puN drivers/scsi/scsi_error.c~scsi-bidi-support drivers/scsi/scsi_error.c
--- a/drivers/scsi/scsi_error.c~scsi-bidi-support
+++ a/drivers/scsi/scsi_error.c
@@ -618,9 +618,11 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd 
memcpy(ses->cmnd, scmd->cmnd, sizeof(scmd->cmnd));
ses->data_direction = scmd->sc_data_direction;
ses->sdb = scmd->sdb;
+   ses->next_rq = scmd->request->next_rq;
ses->result = scmd->result;
 
memset(&scmd->sdb, 0, sizeof(scmd->sdb));
+   scmd->request->next_rq = NULL;
 
if (sense_bytes) {
scmd-sdb.length = min_t(unsigned,
@@ -673,6 +675,7 @@ void scsi_eh_restore_cmnd(struct scsi_cm
memcpy(scmd->cmnd, ses->cmnd, sizeof(scmd->cmnd));
scmd->sc_data_direction = ses->data_direction;
scmd->sdb = ses->sdb;
+   scmd->request->next_rq = ses->next_rq;
scmd->result = ses->result;
 }
 EXPORT_SYMBOL(scsi_eh_restore_cmnd);
diff -puN drivers/scsi/scsi_lib.c~scsi-bidi-support drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c~scsi-bidi-support
+++ a/drivers/scsi/scsi_lib.c
@@ -64,6 +64,8 @@ static struct scsi_host_sg_pool scsi_sg_
 };
 #undef SP
 
+static struct kmem_cache *scsi_bidi_sdb_cache;
+
 static void scsi_run_queue(struct request_queue *q);
 
 /*
@@ -627,6 +629,28 @@ void scsi_run_host_queues(struct Scsi_Ho
scsi_run_queue(sdev->request_queue);
 }
 
+static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate)
+{
+   struct request_queue *q = cmd->device->request_queue;
+   struct request *req = cmd->request;
+   unsigned long flags;
+
+   add_disk_randomness(req->rq_disk);
+
+   spin_lock_irqsave(q->queue_lock, flags);
+   if (blk_rq_tagged(req))
+   blk_queue_end_tag(q, req);
+
+   end_that_request_last(req, uptodate);
+   spin_unlock_irqrestore(q->queue_lock, flags);
+
+   /*
+* This will goose the queue request function at the end, so we don't
+* need to worry about launching another command.
+*/
+   scsi_next_command(cmd);
+}
+
 /*
  * Function:scsi_end_request()
  *
@@ -654,7 +678,6 @@ static struct scsi_cmnd *scsi_end_reques
 {
struct request_queue *q = cmd->device->request_queue;
struct request *req = cmd->request;
-   unsigned long flags;
 
/*
 * If there are blocks left over at the end, set up the command
@@ -683,19 +706,7 @@ static struct scsi_cmnd *scsi_end_reques
}
}
 
-   add_disk_randomness(req->rq_disk);
-
-   spin_lock_irqsave(q->queue_lock, flags);
-   if (blk_rq_tagged(req))
-   blk_queue_end_tag(q, req);
-   end_that_request_last(req, uptodate);
-   spin_unlock_irqrestore(q->queue_lock, flags);
-
-   /*
-* This will goose the queue request function at the end, so we don't
-* need to worry about launching another command.
-*/
-   scsi_next_command(cmd);
+   scsi_finalize_request(cmd, uptodate);
return NULL;
 }
 
@@ -894,10 +905,39 @@ void scsi_release_buffers(struct scsi_cm
scsi_free_sgtable(cmd-sdb);
 
memset(cmd-sdb, 0, sizeof(cmd-sdb));
+
+   if (scsi_bidi_cmnd(cmd)) {
+ 

[patch 25/30] drivers/scsi/ipr.c: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-13 Thread akpm
From: Denis Cheng [EMAIL PROTECTED]

Signed-off-by: Denis Cheng [EMAIL PROTECTED]
Acked-by: Brian King [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/ipr.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN 
drivers/scsi/ipr.c~drivers-scsi-iprc-use-list_head-instead-of-list_head_init 
drivers/scsi/ipr.c
--- 
a/drivers/scsi/ipr.c~drivers-scsi-iprc-use-list_head-instead-of-list_head_init
+++ a/drivers/scsi/ipr.c
@@ -84,7 +84,7 @@
 /*
  *   Global Data
  */
-static struct list_head ipr_ioa_head = LIST_HEAD_INIT(ipr_ioa_head);
+static LIST_HEAD(ipr_ioa_head);
 static unsigned int ipr_log_level = IPR_DEFAULT_LOG_LEVEL;
 static unsigned int ipr_max_speed = 1;
 static int ipr_testmode = 0;
_


[patch 21/30] scsi/qla2xxx/qla_os.c section fix

2007-12-13 Thread akpm
From: Adrian Bunk [EMAIL PROTECTED]

WARNING: vmlinux.o(.text+0x2a4462): Section mismatch: reference to 
.exit.text:qla2x00_remove_one (between 'qla2xxx_pci_error_detected' and 
'qla2x00_stop_timer')

qla2x00_remove_one() mustn't be __devexit since it's called from
qla2xxx_pci_error_detected().

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
Acked-by: Seokmann Ju [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/qla2xxx/qla_os.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN drivers/scsi/qla2xxx/qla_os.c~scsi-qla2xxx-qla_osc-section-fix 
drivers/scsi/qla2xxx/qla_os.c
--- a/drivers/scsi/qla2xxx/qla_os.c~scsi-qla2xxx-qla_osc-section-fix
+++ a/drivers/scsi/qla2xxx/qla_os.c
@@ -1823,7 +1823,7 @@ probe_out:
return ret;
 }
 
-static void __devexit
+static void
 qla2x00_remove_one(struct pci_dev *pdev)
 {
scsi_qla_host_t *ha;
@@ -2957,7 +2957,7 @@ static struct pci_driver qla2xxx_pci_dri
},
.id_table   = qla2xxx_pci_tbl,
.probe  = qla2x00_probe_one,
-   .remove = __devexit_p(qla2x00_remove_one),
+   .remove = qla2x00_remove_one,
.err_handler= qla2xxx_err_handler,
 };
 
_


[patch 23/30] 3W RAID drivers: memset not needed in probe

2007-12-13 Thread akpm
From: Denis Cheng [EMAIL PROTECTED]

The memory returned from scsi_host_alloc is allocated by kzalloc, which
already zero-initializes it, so the memset is not needed.
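
The same reasoning can be shown in userspace, with calloc() standing in
for kzalloc().  This is only an analogy under that assumption; struct
device_extension is a hypothetical stand-in for TW_Device_Extension:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the driver's device extension. */
struct device_extension {
	int host_no;
	char name[16];
};

/* calloc() returns zero-filled memory, like kzalloc() in the kernel. */
static struct device_extension *alloc_extension(void)
{
	return calloc(1, sizeof(struct device_extension));
}

/* Returns 1 if every byte of the extension is zero. */
static int extension_is_zeroed(const struct device_extension *ext)
{
	static const struct device_extension zero;	/* static storage is zeroed */

	return memcmp(ext, &zero, sizeof(zero)) == 0;
}
```

Because the freshly allocated object already compares equal to an
all-zero object, a following memset(ext, 0, sizeof(*ext)) changes nothing.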

Signed-off-by: Denis Cheng [EMAIL PROTECTED]
Cc: Adam Radford [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/3w-9xxx.c |2 --
 drivers/scsi/3w-.c |2 --
 2 files changed, 4 deletions(-)

diff -puN drivers/scsi/3w-9xxx.c~3w-raid-drivers-memset-not-needed-in-probe 
drivers/scsi/3w-9xxx.c
--- a/drivers/scsi/3w-9xxx.c~3w-raid-drivers-memset-not-needed-in-probe
+++ a/drivers/scsi/3w-9xxx.c
@@ -2029,8 +2029,6 @@ static int __devinit twa_probe(struct pc
}
tw_dev = (TW_Device_Extension *)host->hostdata;
 
-   memset(tw_dev, 0, sizeof(TW_Device_Extension));
-
/* Save values to device extension */
tw_dev->host = host;
tw_dev->tw_pci_dev = pdev;
diff -puN drivers/scsi/3w-.c~3w-raid-drivers-memset-not-needed-in-probe 
drivers/scsi/3w-.c
--- a/drivers/scsi/3w-.c~3w-raid-drivers-memset-not-needed-in-probe
+++ a/drivers/scsi/3w-.c
@@ -2295,8 +2295,6 @@ static int __devinit tw_probe(struct pci
}
tw_dev = (TW_Device_Extension *)host->hostdata;
 
-   memset(tw_dev, 0, sizeof(TW_Device_Extension));
-
/* Save values to device extension */
tw_dev->host = host;
tw_dev->tw_pci_dev = pdev;
_


[patch 18/30] scsi/qla2xxx/: possible cleanups

2007-12-13 Thread akpm
From: Adrian Bunk [EMAIL PROTECTED]

- make the following needlessly global code static:
  - qla_attr.c: qla24xx_vport_delete()
  - qla_attr.c: qla24xx_vport_disable()
  - qla_mid.c: qla24xx_allocate_vp_id()
  - qla_mid.c: qla24xx_find_vhost_by_name()
  - qla_mid.c: qla2x00_do_dpc_vp()
  - qla_os.c: struct qla2x00_driver_template
  - qla_os.c: qla2x00_stop_timer()
  - qla_os.c: qla2x00_mem_alloc()
  - qla_os.c: qla2x00_mem_free()
  - qla_sup.c: qla2x00_lock_nvram_access()
  - qla_sup.c: qla2x00_unlock_nvram_access()
  - qla_sup.c: qla2x00_get_nvram_word()
  - qla_sup.c: qla2x00_write_nvram_word()
- #if 0 the following unused global functions:
  - qla_dbg.c: qla2x00_dump_pkt()
  - qla_mbx.c: qla2x00_system_error()
  - qla_mbx.c: qla2x00_get_serdes_params()
  - qla_mbx.c: qla2x00_get_idma_speed()
  - qla_mbx.c: qla24xx_get_vp_database()
  - qla_mbx.c: qla24xx_get_vp_entry()
- qla_os.c: remove some unneeded function prototypes

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
Cc: Andrew Vasquez [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/qla2xxx/qla_attr.c |6 +++---
 drivers/scsi/qla2xxx/qla_dbg.c  |2 ++
 drivers/scsi/qla2xxx/qla_gbl.h  |   25 -
 drivers/scsi/qla2xxx/qla_mbx.c  |   10 ++
 drivers/scsi/qla2xxx/qla_mid.c  |6 +++---
 drivers/scsi/qla2xxx/qla_os.c   |   20 ++--
 drivers/scsi/qla2xxx/qla_sup.c  |8 
 7 files changed, 28 insertions(+), 49 deletions(-)

diff -puN drivers/scsi/qla2xxx/qla_attr.c~scsi-qla2xxx-possible-cleanups 
drivers/scsi/qla2xxx/qla_attr.c
--- a/drivers/scsi/qla2xxx/qla_attr.c~scsi-qla2xxx-possible-cleanups
+++ a/drivers/scsi/qla2xxx/qla_attr.c
@@ -9,7 +9,7 @@
 #include <linux/kthread.h>
 #include <linux/vmalloc.h>
 
-int qla24xx_vport_disable(struct fc_vport *, bool);
+static int qla24xx_vport_disable(struct fc_vport *, bool);
 
 /* SYSFS attributes - 
*/
 
@@ -1113,7 +1113,7 @@ vport_create_failed_2:
return FC_VPORT_FAILED;
 }
 
-int
+static int
 qla24xx_vport_delete(struct fc_vport *fc_vport)
 {
scsi_qla_host_t *ha = shost_priv(fc_vport->shost);
@@ -1146,7 +1146,7 @@ qla24xx_vport_delete(struct fc_vport *fc
return 0;
 }
 
-int
+static int
 qla24xx_vport_disable(struct fc_vport *fc_vport, bool disable)
 {
scsi_qla_host_t *vha = fc_vport->dd_data;
diff -puN drivers/scsi/qla2xxx/qla_dbg.c~scsi-qla2xxx-possible-cleanups 
drivers/scsi/qla2xxx/qla_dbg.c
--- a/drivers/scsi/qla2xxx/qla_dbg.c~scsi-qla2xxx-possible-cleanups
+++ a/drivers/scsi/qla2xxx/qla_dbg.c
@@ -1428,6 +1428,7 @@ qla2x00_print_scsi_cmd(struct scsi_cmnd 
printk("  sp flags=0x%x\n", sp->flags);
 }
 
+#if 0
 void
 qla2x00_dump_pkt(void *pkt)
 {
@@ -1442,6 +1443,7 @@ qla2x00_dump_pkt(void *pkt)
}
printk("\n");
 }
+#endif  /*  0  */
 
 #if defined(QL_DEBUG_ROUTINES)
 /*
diff -puN drivers/scsi/qla2xxx/qla_gbl.h~scsi-qla2xxx-possible-cleanups 
drivers/scsi/qla2xxx/qla_gbl.h
--- a/drivers/scsi/qla2xxx/qla_gbl.h~scsi-qla2xxx-possible-cleanups
+++ a/drivers/scsi/qla2xxx/qla_gbl.h
@@ -68,30 +68,20 @@ extern int num_hosts;
 /*
  * Global Functions in qla_mid.c source file.
  */
-extern struct scsi_host_template qla2x00_driver_template;
 extern struct scsi_host_template qla24xx_driver_template;
 extern struct scsi_transport_template *qla2xxx_transport_vport_template;
-extern uint8_t qla2x00_mem_alloc(scsi_qla_host_t *);
 extern void qla2x00_timer(scsi_qla_host_t *);
 extern void qla2x00_start_timer(scsi_qla_host_t *, void *, unsigned long);
-extern void qla2x00_stop_timer(scsi_qla_host_t *);
-extern uint32_t qla24xx_allocate_vp_id(scsi_qla_host_t *);
 extern void qla24xx_deallocate_vp_id(scsi_qla_host_t *);
 extern int qla24xx_disable_vp (scsi_qla_host_t *);
 extern int qla24xx_enable_vp (scsi_qla_host_t *);
-extern void qla2x00_mem_free(scsi_qla_host_t *);
 extern int qla24xx_control_vp(scsi_qla_host_t *, int );
 extern int qla24xx_modify_vp_config(scsi_qla_host_t *);
 extern int qla2x00_send_change_request(scsi_qla_host_t *, uint16_t, uint16_t);
 extern void qla2x00_vp_stop_timer(scsi_qla_host_t *);
 extern int qla24xx_configure_vhba (scsi_qla_host_t *);
-extern int qla24xx_get_vp_entry(scsi_qla_host_t *, uint16_t, int);
-extern int qla24xx_get_vp_database(scsi_qla_host_t *, uint16_t);
-extern int qla2x00_do_dpc_vp(scsi_qla_host_t *);
 extern void qla24xx_report_id_acquisition(scsi_qla_host_t *,
 struct vp_rpt_id_entry_24xx *);
-extern scsi_qla_host_t * qla24xx_find_vhost_by_name(scsi_qla_host_t *,
-uint8_t *);
 extern void qla2x00_do_dpc_all_vps(scsi_qla_host_t *);
 extern int qla24xx_vport_create_req_sanity_check(struct fc_vport *);
 extern scsi_qla_host_t * qla24xx_create_vhost(struct fc_vport *);
@@ -113,7 +103,6 @@ extern void qla2xxx_wake_dpc(scsi_qla_ho
 extern void qla2x00_alert_all_vps(scsi_qla_host_t *, uint16_t *);
 extern void qla2x00_async_event(scsi_qla_host_t *, uint16_t *);
 extern void 

[patch 26/30] tgt: use scsi_init_io instead of scsi_alloc_sgtable

2007-12-13 Thread akpm
From: Boaz Harrosh [EMAIL PROTECTED]

  - If we export scsi_init_io()/scsi_release_buffers() instead of
scsi_{alloc,free}_sgtable() from scsi_lib, then tgt code is
much more insulated from scsi_lib changes. As a bonus it will
also gain bidi capability when it comes.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Acked-by: FUJITA Tomonori [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/scsi_lib.c |   21 ++---
 drivers/scsi/scsi_tgt_lib.c |   29 +
 include/scsi/scsi_cmnd.h|4 ++--
 3 files changed, 17 insertions(+), 37 deletions(-)

diff -puN 
drivers/scsi/scsi_lib.c~tgt-use-scsi_init_io-instead-of-scsi_alloc_sgtable 
drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c~tgt-use-scsi_init_io-instead-of-scsi_alloc_sgtable
+++ a/drivers/scsi/scsi_lib.c
@@ -739,7 +739,8 @@ static inline unsigned int scsi_sgtable_
return index;
 }
 
-struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
+static struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd,
+   gfp_t gfp_mask)
 {
struct scsi_host_sg_pool *sgp;
struct scatterlist *sgl, *prev, *ret;
@@ -825,9 +826,7 @@ enomem:
return NULL;
 }
 
-EXPORT_SYMBOL(scsi_alloc_sgtable);
-
-void scsi_free_sgtable(struct scsi_cmnd *cmd)
+static void scsi_free_sgtable(struct scsi_cmnd *cmd)
 {
struct scatterlist *sgl = cmd->request_buffer;
struct scsi_host_sg_pool *sgp;
@@ -873,8 +872,6 @@ void scsi_free_sgtable(struct scsi_cmnd 
mempool_free(sgl, sgp->pool);
 }
 
-EXPORT_SYMBOL(scsi_free_sgtable);
-
 /*
  * Function:scsi_release_buffers()
  *
@@ -892,7 +889,7 @@ EXPORT_SYMBOL(scsi_free_sgtable);
  * the scatter-gather table, and potentially any bounce
  * buffers.
  */
-static void scsi_release_buffers(struct scsi_cmnd *cmd)
+void scsi_release_buffers(struct scsi_cmnd *cmd)
 {
if (cmd->use_sg)
scsi_free_sgtable(cmd);
@@ -904,6 +901,7 @@ static void scsi_release_buffers(struct 
cmd->request_buffer = NULL;
cmd->request_bufflen = 0;
 }
+EXPORT_SYMBOL(scsi_release_buffers);
 
 /*
  * Function:scsi_io_completion()
@@ -1105,7 +1103,7 @@ void scsi_io_completion(struct scsi_cmnd
  * Returns: 0 on success
  * BLKPREP_DEFER if the failure is retryable
  */
-static int scsi_init_io(struct scsi_cmnd *cmd)
+int scsi_init_io(struct scsi_cmnd *cmd, gfp_t gfp_mask)
 {
struct request *req = cmd->request;
intcount;
@@ -1120,7 +1118,7 @@ static int scsi_init_io(struct scsi_cmnd
/*
 * If sg table allocation fails, requeue request later.
 */
-   cmd->request_buffer = scsi_alloc_sgtable(cmd, GFP_ATOMIC);
+   cmd->request_buffer = scsi_alloc_sgtable(cmd, gfp_mask);
if (unlikely(!cmd->request_buffer)) {
scsi_unprep_request(req);
return BLKPREP_DEFER;
@@ -1141,6 +1139,7 @@ static int scsi_init_io(struct scsi_cmnd
cmd->use_sg = count;
return BLKPREP_OK;
 }
+EXPORT_SYMBOL(scsi_init_io);
 
 static struct scsi_cmnd *scsi_get_cmd_from_req(struct scsi_device *sdev,
struct request *req)
@@ -1186,7 +1185,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_d
 
BUG_ON(!req->nr_phys_segments);
 
-   ret = scsi_init_io(cmd);
+   ret = scsi_init_io(cmd, GFP_ATOMIC);
if (unlikely(ret))
return ret;
} else {
@@ -1237,7 +1236,7 @@ int scsi_setup_fs_cmnd(struct scsi_devic
if (unlikely(!cmd))
return BLKPREP_DEFER;
 
-   return scsi_init_io(cmd);
+   return scsi_init_io(cmd, GFP_ATOMIC);
 }
 EXPORT_SYMBOL(scsi_setup_fs_cmnd);
 
diff -puN 
drivers/scsi/scsi_tgt_lib.c~tgt-use-scsi_init_io-instead-of-scsi_alloc_sgtable 
drivers/scsi/scsi_tgt_lib.c
--- 
a/drivers/scsi/scsi_tgt_lib.c~tgt-use-scsi_init_io-instead-of-scsi_alloc_sgtable
+++ a/drivers/scsi/scsi_tgt_lib.c
@@ -331,8 +331,7 @@ static void scsi_tgt_cmd_done(struct scs
 
scsi_tgt_uspace_send_status(cmd, tcmd->itn_id, tcmd->tag);
 
-   if (scsi_sglist(cmd))
-   scsi_free_sgtable(cmd);
+   scsi_release_buffers(cmd);
 
queue_work(scsi_tgtd, &tcmd->work);
 }
@@ -353,26 +352,6 @@ static int scsi_tgt_transfer_response(st
return 0;
 }
 
-static int scsi_tgt_init_cmd(struct scsi_cmnd *cmd, gfp_t gfp_mask)
-{
-   struct request *rq = cmd->request;
-   int count;
-
-   cmd->use_sg = rq->nr_phys_segments;
-   cmd->request_buffer = scsi_alloc_sgtable(cmd, gfp_mask);
-   if (!cmd->request_buffer)
-   return -ENOMEM;
-
-   cmd->request_bufflen = rq->data_len;
-
-   dprintk("cmd %p cnt %d %lu\n", cmd, scsi_sg_count(cmd),
-   rq_data_dir(rq));
-   count = blk_rq_map_sg(rq->q, rq, scsi_sglist(cmd));

Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Andrew Morton
On Thu, 13 Dec 2007 19:30:00 -0500
Mark Lord [EMAIL PROTECTED] wrote:

> Here's the commit that causes the regression:
> 
> ...

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -760,7 +760,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
>   struct page *page = __rmqueue(zone, order, migratetype);
>   if (unlikely(page == NULL))
>   break;
> - list_add_tail(&page->lru, list);
> + list_add(&page->lru, list);

well that looks fishy.


[PATCH] fix page_alloc for larger I/O segments (improved)

2007-12-13 Thread Mark Lord


Improved version, more similar to the 2.6.23 code:

Fix page allocator to give better chance of larger contiguous segments (again).

Signed-off-by: Mark Lord [EMAIL PROTECTED]
---

--- old/mm/page_alloc.c 2007-12-13 19:25:15.0 -0500
+++ linux-2.6/mm/page_alloc.c   2007-12-13 19:43:07.0 -0500
@@ -760,7 +760,7 @@
struct page *page = __rmqueue(zone, order, migratetype);
if (unlikely(page == NULL))
break;
-   list_add(&page->lru, list);
+   list_add_tail(&page->lru, list);
set_page_private(page, migratetype);
}
spin_unlock(&zone->lock);


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Mark Lord wrote:

Andrew Morton wrote:

On Thu, 13 Dec 2007 19:30:00 -0500
Mark Lord [EMAIL PROTECTED] wrote:


Here's the commit that causes the regression:

...

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -760,7 +760,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,

 struct page *page = __rmqueue(zone, order, migratetype);
 if (unlikely(page == NULL))
 break;
-list_add_tail(&page->lru, list);
+list_add(&page->lru, list);


well that looks fishy.

..

Yeah.  I missed that, and instead just posted a patch
to search the list in reverse order, which seems to work for me.

I'll try just reversing that line above here now.. gimme 5 minutes or so.

..

Yep, that works too.  Alternative improved patch now posted.


Re: [PATCH] fix page_alloc for larger I/O segments

2007-12-13 Thread Andrew Morton
On Thu, 13 Dec 2007 19:40:09 -0500
Mark Lord [EMAIL PROTECTED] wrote:

> And here is a patch that seems to fix it for me here:
>
> * * * *
>
> Fix page allocator to give better chance of larger contiguous segments (again).
>
> Signed-off-by: Mark Lord [EMAIL PROTECTED]
> ---
>
>
> --- old/mm/page_alloc.c.orig  2007-12-13 19:25:15.0 -0500
> +++ linux-2.6/mm/page_alloc.c 2007-12-13 19:35:50.0 -0500
> @@ -954,7 +954,7 @@
>   goto failed;
>   }
>   /* Find a page of the appropriate migrate type */
> - list_for_each_entry(page, &pcp->list, lru) {
> + list_for_each_entry_reverse(page, &pcp->list, lru) {
>   if (page_private(page) == migratetype) {
>   list_del(&page->lru);
>   pcp->count--;

- needs help to make it apply to mainline

- needs a comment, methinks...


--- 
a/mm/page_alloc.c~fix-page-allocator-to-give-better-chance-of-larger-contiguous-segments-again
+++ a/mm/page_alloc.c
@@ -1060,8 +1060,12 @@ again:
goto failed;
}
 
-   /* Find a page of the appropriate migrate type */
-   list_for_each_entry(page, &pcp->list, lru)
+   /*
+* Find a page of the appropriate migrate type.  Doing a
+* reverse-order search here helps us to hand out pages in
+* ascending physical-address order.
+*/
+   list_for_each_entry_reverse(page, &pcp->list, lru)
if (page_private(page) == migratetype)
break;
 
_
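
The effect of the reverse-order search can be sketched in userspace. The following is only an illustration, not kernel code: struct page_stub and find_page() are hypothetical names, and the array is an assumed stand-in for pcp->list in which index 0 is the entry nearest the list head. It assumes the list was filled by head insertion, so physical addresses descend from the head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what the search needs. */
struct page_stub {
    unsigned long pfn;  /* physical frame number */
    int migratetype;    /* the value page_private() holds in the patch */
};

/*
 * The array models pcp->list, with index 0 nearest the list head.
 * Returns the index of the first page of the wanted type, searching
 * from the head (reverse == 0) or from the tail (reverse != 0),
 * or -1 if no page of that type is present.
 */
static int find_page(const struct page_stub *list, int n, int type,
                     int reverse)
{
    assert(list != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        int idx = reverse ? n - 1 - i : i;

        if (list[idx].migratetype == type)
            return idx;
    }
    return -1;
}
```

With pfns 103, 102, 101, 100 stored head to tail, a forward search for a given type returns the highest matching pfn, while the reverse search returns the lowest, so successive allocations come out in ascending order.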



Re: [PATCH] fix page_alloc for larger I/O segments (improved)

2007-12-13 Thread Andrew Morton
On Thu, 13 Dec 2007 19:57:29 -0500
James Bottomley [EMAIL PROTECTED] wrote:

 
> On Thu, 2007-12-13 at 19:46 -0500, Mark Lord wrote:
> > Improved version, more similar to the 2.6.23 code:
> >
> > Fix page allocator to give better chance of larger contiguous segments (again).
> >
> > Signed-off-by: Mark Lord [EMAIL PROTECTED]
> > ---
> >
> > --- old/mm/page_alloc.c 2007-12-13 19:25:15.0 -0500
> > +++ linux-2.6/mm/page_alloc.c   2007-12-13 19:43:07.0 -0500
> > @@ -760,7 +760,7 @@
> > struct page *page = __rmqueue(zone, order, migratetype);
> > if (unlikely(page == NULL))
> > break;
> > -   list_add(&page->lru, list);
> > +   list_add_tail(&page->lru, list);
>
> Could we put a big comment above this explaining to the would be vm
> tweakers why this has to be a list_add_tail, so we don't end up back in
> this position after another two years?
 

Already done ;)

--- a/mm/page_alloc.c~fix-page_alloc-for-larger-i-o-segments-fix
+++ a/mm/page_alloc.c
@@ -847,6 +847,10 @@ static int rmqueue_bulk(struct zone *zon
struct page *page = __rmqueue(zone, order, migratetype);
if (unlikely(page == NULL))
break;
+   /*
+* Doing a list_add_tail() here helps us to hand out pages in
+* ascending physical-address order.
+*/
list_add_tail(&page->lru, list);
set_page_private(page, migratetype);
}
_
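
Why the insertion end matters for I/O segment size can be shown with a small userspace sketch. This is not the kernel's list implementation: struct node, add_head(), add_tail(), count_segments() and fill_and_count() are hypothetical names, and the sketch assumes the allocator returns pages in ascending pfn order and that only ascending neighbours can be merged into one segment.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node {
    struct node *prev, *next;
    unsigned long pfn;  /* hypothetical page frame number */
};

static void insert_between(struct node *n, struct node *prev,
                           struct node *next)
{
    next->prev = n;
    n->next = next;
    n->prev = prev;
    prev->next = n;
}

/* Same idea as list_add(): new entry goes right after the head. */
static void add_head(struct node *n, struct node *head)
{
    insert_between(n, head, head->next);
}

/* Same idea as list_add_tail(): new entry goes right before the head. */
static void add_tail(struct node *n, struct node *head)
{
    insert_between(n, head->prev, head);
}

/* Walk from the head and count runs of consecutive ascending pfns. */
static int count_segments(const struct node *head)
{
    int segs = 0;

    for (const struct node *n = head->next; n != head; n = n->next)
        if (n->prev == head || n->pfn != n->prev->pfn + 1)
            segs++;
    return segs;
}

/* Queue `count` ascending pfns one way or the other; return the run count. */
static int fill_and_count(int use_tail, struct node *pool, int count)
{
    struct node head;

    assert(pool != NULL && count > 0);
    head.prev = head.next = &head;
    for (int i = 0; i < count; i++) {
        pool[i].pfn = 1000 + i;  /* assumed ascending allocation order */
        if (use_tail)
            add_tail(&pool[i], &head);
        else
            add_head(&pool[i], &head);
    }
    return count_segments(&head);
}
```

Under these assumptions, eight pages queued with tail insertion form a single mergeable run, while head insertion reverses them and leaves eight one-page runs.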



Re: [PATCH] fix page_alloc for larger I/O segments (improved)

2007-12-13 Thread Mark Lord

Andrew Morton wrote:

On Thu, 13 Dec 2007 19:57:29 -0500
James Bottomley [EMAIL PROTECTED] wrote:


On Thu, 2007-12-13 at 19:46 -0500, Mark Lord wrote:

Improved version, more similar to the 2.6.23 code:

Fix page allocator to give better chance of larger contiguous segments (again).

Signed-off-by: Mark Lord [EMAIL PROTECTED]
---

--- old/mm/page_alloc.c 2007-12-13 19:25:15.0 -0500
+++ linux-2.6/mm/page_alloc.c   2007-12-13 19:43:07.0 -0500
@@ -760,7 +760,7 @@
struct page *page = __rmqueue(zone, order, migratetype);
if (unlikely(page == NULL))
break;
-   list_add(&page->lru, list);
+   list_add_tail(&page->lru, list);

Could we put a big comment above this explaining to the would be vm
tweakers why this has to be a list_add_tail, so we don't end up back in
this position after another two years?



Already done ;)

..

I thought of the comment as I rushed off for dinner.
Thanks, Andrew!


--- a/mm/page_alloc.c~fix-page_alloc-for-larger-i-o-segments-fix
+++ a/mm/page_alloc.c
@@ -847,6 +847,10 @@ static int rmqueue_bulk(struct zone *zon
struct page *page = __rmqueue(zone, order, migratetype);
if (unlikely(page == NULL))
break;
+   /*
+* Doing a list_add_tail() here helps us to hand out pages in
+* ascending physical-address order.
+*/
list_add_tail(&page->lru, list);
set_page_private(page, migratetype);
}
_

