[RFC] blktrace interface for sg devices
I am referring to the discussion of introducing statistics in the SCSI layer and the conclusion that blktrace already provides the data:

http://lkml.org/lkml/2006/10/21/72
http://lkml.org/lkml/2006/11/2/141

While blktrace works fine for disk devices, it currently does not provide data for non-disk devices like tape drives. To close this gap, I am looking for a way to get the same trace data from other SCSI devices as well.

Since the SCSI layer internally uses the same request queues for all devices and the queues already use the blktrace interface, the main missing part is the interface to enable tracing for all SCSI devices.

Attached is a patch that adds the ioctl interface for blktrace to the sg generic SCSI interface. This already makes it possible to get some trace data for SCSI tape drives, although I have to do more testing. For testing, any sg device file can be passed to blktrace, e.g.:

  # blktrace -d /dev/sg1 -o - | blkparse -i -

I am seeking input on this approach: Is it worth pursuing to enable blktrace to trace SCSI tape drives? Would there be a better approach to get this trace data?

Christof Schmitt
---
 block/blktrace.c       | 19 +++
 drivers/scsi/sg.c      | 12
 include/linux/blkdev.h | 10 ++
 3 files changed, 33 insertions(+), 8 deletions(-)

--- a/block/blktrace.c 2007-12-13 08:48:23.0 +0100
+++ b/block/blktrace.c 2007-12-13 08:48:25.0 +0100
@@ -231,7 +231,7 @@ static void blk_trace_cleanup(struct blk
 	kfree(bt);
 }

-static int blk_trace_remove(struct request_queue *q)
+int blk_trace_remove(struct request_queue *q)
 {
 	struct blk_trace *bt;

@@ -245,6 +245,7 @@ static int blk_trace_remove(struct reque

 	return 0;
 }
+EXPORT_SYMBOL_GPL(blk_trace_remove);

 static int blk_dropped_open(struct inode *inode, struct file *filp)
 {
@@ -312,13 +313,11 @@ static struct rchan_callbacks blk_relay_
 /*
  * Setup everything required to start tracing
  */
-static int blk_trace_setup(struct request_queue *q, struct block_device *bdev,
-			   char __user *arg)
+int blk_trace_setup(struct request_queue *q, char *name, dev_t dev, char __user *arg)
 {
 	struct blk_user_trace_setup buts;
 	struct blk_trace *old_bt, *bt = NULL;
 	struct dentry *dir = NULL;
-	char b[BDEVNAME_SIZE];
 	int ret, i;

 	if (copy_from_user(&buts, arg, sizeof(buts)))
@@ -327,7 +326,7 @@ static int blk_trace_setup(struct reques
 	if (!buts.buf_size || !buts.buf_nr)
 		return -EINVAL;

-	strcpy(buts.name, bdevname(bdev, b));
+	strcpy(buts.name, name);

 	/*
 	 * some device names have larger paths - convert the slashes
@@ -355,7 +354,7 @@ static int blk_trace_setup(struct reques
 		goto err;

 	bt->dir = dir;
-	bt->dev = bdev->bd_dev;
+	bt->dev = dev;
 	atomic_set(&bt->dropped, 0);

 	ret = -EIO;
@@ -400,8 +399,9 @@ err:
 	}
 	return ret;
 }
+EXPORT_SYMBOL_GPL(blk_trace_setup);

-static int blk_trace_startstop(struct request_queue *q, int start)
+int blk_trace_startstop(struct request_queue *q, int start)
 {
 	struct blk_trace *bt;
 	int ret;
@@ -434,6 +434,7 @@ static int blk_trace_startstop(struct re

 	return ret;
 }
+EXPORT_SYMBOL_GPL(blk_trace_startstop);

 /**
  * blk_trace_ioctl: - handle the ioctls associated with tracing
@@ -446,6 +447,7 @@ int blk_trace_ioctl(struct block_device
 {
 	struct request_queue *q;
 	int ret, start = 0;
+	char b[BDEVNAME_SIZE];

 	q = bdev_get_queue(bdev);
 	if (!q)
@@ -455,7 +457,8 @@ int blk_trace_ioctl(struct block_device

 	switch (cmd) {
 	case BLKTRACESETUP:
-		ret = blk_trace_setup(q, bdev, arg);
+		strcpy(b, bdevname(bdev, b));
+		ret = blk_trace_setup(q, b, bdev->bd_dev, arg);
 		break;
 	case BLKTRACESTART:
 		start = 1;
--- a/drivers/scsi/sg.c 2007-12-13 08:48:23.0 +0100
+++ b/drivers/scsi/sg.c 2007-12-13 08:48:25.0 +0100
@@ -55,6 +55,8 @@ static int sg_version_num = 30534;/* 2
 #include <scsi/scsi_ioctl.h>
 #include <scsi/sg.h>

+#include <linux/blktrace_api.h>
+
 #include "scsi_logging.h"

 #ifdef CONFIG_SCSI_PROC_FS
@@ -1066,6 +1068,16 @@ sg_ioctl(struct inode *inode, struct fil
 	case BLKSECTGET:
 		return put_user(sdp->device->request_queue->max_sectors * 512,
 				ip);
+	case BLKTRACESETUP:
+	{
+		return blk_trace_setup(sdp->device->request_queue , sdp->device->sdev_gendev.bus_id, sdp->device->sdev_gendev, arg);
+	}
+	case BLKTRACESTART:
+		return blk_trace_startstop(sdp->device->request_queue, 1);
+	case BLKTRACESTOP:
+		return blk_trace_startstop(sdp->device->request_queue, 0);
+	case BLKTRACETEARDOWN:
+		return blk_trace_remove(sdp->device->request_queue);
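
For readers who have not looked at what `blktrace -d /dev/sg1` does against the device node, the ioctl sequence that the patch routes through sg can be sketched from user space roughly as follows. This is a minimal sketch and not code from the patch: the helper names `trace_begin` and `trace_end` are hypothetical, and the buffer size and count are assumed values chosen for illustration. The ioctl numbers and `struct blk_user_trace_setup` come from the kernel headers `<linux/fs.h>` and `<linux/blktrace_api.h>`.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>            /* BLKTRACESETUP, BLKTRACESTART, ... */
#include <linux/blktrace_api.h>  /* struct blk_user_trace_setup */

/*
 * Hypothetical helper: set up and start tracing on an already-open
 * device node (a block device, or an sg node with the patch applied).
 * Returns 0 on success, -1 on failure with errno set by ioctl().
 */
int trace_begin(int fd)
{
	struct blk_user_trace_setup buts;

	memset(&buts, 0, sizeof(buts));
	buts.buf_size = 512 * 1024;	/* assumed relay sub-buffer size */
	buts.buf_nr = 4;		/* assumed number of sub-buffers */
	buts.act_mask = 0xffff;		/* trace every action category */

	/* blk_trace_setup() rejects a zero buf_size or buf_nr with -EINVAL. */
	if (ioctl(fd, BLKTRACESETUP, &buts) < 0)
		return -1;

	if (ioctl(fd, BLKTRACESTART) < 0) {
		/* Undo the setup so the queue is not left with a stale trace. */
		ioctl(fd, BLKTRACETEARDOWN);
		return -1;
	}
	return 0;
}

/* Hypothetical helper: stop tracing and release the trace resources. */
int trace_end(int fd)
{
	int ret = 0;

	if (ioctl(fd, BLKTRACESTOP) < 0)
		ret = -1;
	if (ioctl(fd, BLKTRACETEARDOWN) < 0)
		ret = -1;
	return ret;
}
```

The trace records themselves are not returned by these ioctls; blktrace reads them from the relay files that the setup step creates under debugfs.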
[PATCH] dpt_i2o: don't set DMA_64BIT_MASK [was: Re: [stable] broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)]
According to Greg KH:
> So, what should be added to 2.6.23-stable then? And, can I get a real
> changelog entry for it?

This is suitable for both 2.6.23.x and 2.6.24-rc5:

linux-2.6-dpt_i2o-no-dma64.patch

The dpt_i2o driver can't handle 64 bit DMA addresses, so do not let it set pci_set_dma_mask(pDev, DMA_64BIT_MASK).

Signed-off-by: Miquel van Smoorenburg [EMAIL PROTECTED]

diff -ruN linux-2.6.23.9.orig/drivers/scsi/dpt_i2o.c linux-2.6.23.9/drivers/scsi/dpt_i2o.c
--- linux-2.6.23.9.orig/drivers/scsi/dpt_i2o.c 2007-11-26 18:51:43.0 +0100
+++ linux-2.6.23.9/drivers/scsi/dpt_i2o.c 2007-12-12 13:21:05.0 +0100
@@ -905,8 +905,7 @@
 	}

 	pci_set_master(pDev);
-	if (pci_set_dma_mask(pDev, DMA_64BIT_MASK) &&
-	    pci_set_dma_mask(pDev, DMA_32BIT_MASK))
+	if (pci_set_dma_mask(pDev, DMA_32BIT_MASK))
 		return -EINVAL;

 	base_addr0_phys = pci_resource_start(pDev,0);
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
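
Why the 64-bit attempt has to go can be illustrated with the mask arithmetic alone. The sketch below is a user-space model, not driver code: `addr_fits_mask` is a hypothetical helper, and the two constants mirror the values of the kernel's `DMA_64BIT_MASK` and `DMA_32BIT_MASK` macros.

```c
#include <stdbool.h>
#include <stdint.h>

#define DMA_64BIT_MASK 0xffffffffffffffffULL
#define DMA_32BIT_MASK 0x00000000ffffffffULL

/*
 * Hypothetical helper: true if every set bit of the bus address lies
 * inside the mask, i.e. the device can reach the address.
 */
bool addr_fits_mask(uint64_t bus_addr, uint64_t mask)
{
	return (bus_addr & ~mask) == 0;
}
```

With the 64-bit mask accepted, a buffer just above 4 GiB (for example 0x100000000) is a legal DMA target, so a driver that only handles 32-bit addresses can be handed one. With the 32-bit mask the same address is out of range and the kernel has to provide a reachable buffer instead.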
[0/3 ver2] Last 3 patches for bidi support
James hi.

Bidi patches just broke again, by a patch that fixes some to-be-dead code.
(scsi: BUG_ON() impossible condition)

Could it not just be accepted into the tree now? It sat in the -mm tree with no reports of breakage or complaints. What are we waiting for? The way I see it there is nothing holding it back, it's not even dangerous anymore.

You need Arm's accessors patch from scsi-pending.

Russell King [EMAIL PROTECTED]
  Please send an Acked-by for this patch and the patch that removes the old esp drivers
  (http://www.spinics.net/lists/linux-scsi/msg20914.html)

Christoph Hellwig [EMAIL PROTECTED]
David S. Miller [EMAIL PROTECTED]
Maciej W. Rozycki [EMAIL PROTECTED]
  Please send an Acked-by or Recommended-by to the removal of these old esp drivers.

And the 3 patches (based on scsi-misc):

[1] tgt: Use scsi_init_io instead of scsi_alloc_sgtable
    Was Acked-by the maintainer of tgt. Please accept independent of the other 2.

[2] scsi: scsi_data_buffer
    The move to scsi_data_buffer. From here on any unconverted driver will not compile.

[3] scsi: bidi support
    Actually very simple really.

All parties involved, send your reservations if any NOW. Else James please put it in.

Andrew, could they be included back into the -mm tree?

Boaz
[PATCH] tgt: Use scsi_init_io instead of scsi_alloc_sgtable
- If we export scsi_init_io()/scsi_release_buffers() instead of scsi_{alloc,free}_sgtable() from scsi_lib, then the tgt code is much more insulated from scsi_lib changes. As a bonus it will also gain bidi capability when it comes.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Acked-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 drivers/scsi/scsi_lib.c     | 21 ++---
 drivers/scsi/scsi_tgt_lib.c | 29 +
 include/scsi/scsi_cmnd.h    |  4 ++--
 3 files changed, 17 insertions(+), 37 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index e273e4b..d1a4671 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -739,7 +739,8 @@ static inline unsigned int scsi_sgtable_index(unsigned short nents)
 	return index;
 }

-struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
+static struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd,
+					      gfp_t gfp_mask)
 {
 	struct scsi_host_sg_pool *sgp;
 	struct scatterlist *sgl, *prev, *ret;
@@ -825,9 +826,7 @@ enomem:
 	return NULL;
 }

-EXPORT_SYMBOL(scsi_alloc_sgtable);
-
-void scsi_free_sgtable(struct scsi_cmnd *cmd)
+static void scsi_free_sgtable(struct scsi_cmnd *cmd)
 {
 	struct scatterlist *sgl = cmd->request_buffer;
 	struct scsi_host_sg_pool *sgp;
@@ -873,8 +872,6 @@ void scsi_free_sgtable(struct scsi_cmnd *cmd)
 	mempool_free(sgl, sgp->pool);
 }

-EXPORT_SYMBOL(scsi_free_sgtable);
-
 /*
  * Function:    scsi_release_buffers()
  *
@@ -892,7 +889,7 @@ EXPORT_SYMBOL(scsi_free_sgtable);
  *		the scatter-gather table, and potentially any bounce
  *		buffers.
  */
-static void scsi_release_buffers(struct scsi_cmnd *cmd)
+void scsi_release_buffers(struct scsi_cmnd *cmd)
 {
 	if (cmd->use_sg)
 		scsi_free_sgtable(cmd);
@@ -904,6 +901,7 @@ static void scsi_release_buffers(struct scsi_cmnd *cmd)
 	cmd->request_buffer = NULL;
 	cmd->request_bufflen = 0;
 }
+EXPORT_SYMBOL(scsi_release_buffers);

 /*
  * Function:    scsi_io_completion()
@@ -1105,7 +1103,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
  * Returns:     0 on success
  *		BLKPREP_DEFER if the failure is retryable
  */
-static int scsi_init_io(struct scsi_cmnd *cmd)
+int scsi_init_io(struct scsi_cmnd *cmd, gfp_t gfp_mask)
 {
 	struct request     *req = cmd->request;
 	int		   count;
@@ -1120,7 +1118,7 @@ static int scsi_init_io(struct scsi_cmnd *cmd)
 	/*
 	 * If sg table allocation fails, requeue request later.
 	 */
-	cmd->request_buffer = scsi_alloc_sgtable(cmd, GFP_ATOMIC);
+	cmd->request_buffer = scsi_alloc_sgtable(cmd, gfp_mask);
 	if (unlikely(!cmd->request_buffer)) {
 		scsi_unprep_request(req);
 		return BLKPREP_DEFER;
@@ -1141,6 +1139,7 @@ static int scsi_init_io(struct scsi_cmnd *cmd)
 	cmd->use_sg = count;
 	return BLKPREP_OK;
 }
+EXPORT_SYMBOL(scsi_init_io);

 static struct scsi_cmnd *scsi_get_cmd_from_req(struct scsi_device *sdev,
 		struct request *req)
@@ -1186,7 +1185,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)

 		BUG_ON(!req->nr_phys_segments);

-		ret = scsi_init_io(cmd);
+		ret = scsi_init_io(cmd, GFP_ATOMIC);
 		if (unlikely(ret))
 			return ret;
 	} else {
@@ -1237,7 +1236,7 @@ int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req)
 	if (unlikely(!cmd))
 		return BLKPREP_DEFER;

-	return scsi_init_io(cmd);
+	return scsi_init_io(cmd, GFP_ATOMIC);
 }
 EXPORT_SYMBOL(scsi_setup_fs_cmnd);

diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c
index 93ece8f..91630ba 100644
--- a/drivers/scsi/scsi_tgt_lib.c
+++ b/drivers/scsi/scsi_tgt_lib.c
@@ -331,8 +331,7 @@ static void scsi_tgt_cmd_done(struct scsi_cmnd *cmd)

 	scsi_tgt_uspace_send_status(cmd, tcmd->itn_id, tcmd->tag);

-	if (scsi_sglist(cmd))
-		scsi_free_sgtable(cmd);
+	scsi_release_buffers(cmd);

 	queue_work(scsi_tgtd, &tcmd->work);
 }
@@ -353,26 +352,6 @@ static int scsi_tgt_transfer_response(struct scsi_cmnd *cmd)
 	return 0;
 }

-static int scsi_tgt_init_cmd(struct scsi_cmnd *cmd, gfp_t gfp_mask)
-{
-	struct request *rq = cmd->request;
-	int count;
-
-	cmd->use_sg = rq->nr_phys_segments;
-	cmd->request_buffer = scsi_alloc_sgtable(cmd, gfp_mask);
-	if (!cmd->request_buffer)
-		return -ENOMEM;
-
-	cmd->request_bufflen = rq->data_len;
-
-	dprintk("cmd %p cnt %d %lu\n", cmd, scsi_sg_count(cmd),
-		rq_data_dir(rq));
-	count = blk_rq_map_sg(rq->q, rq, scsi_sglist(cmd));
-	BUG_ON(count > cmd->use_sg);
-	cmd->use_sg = count;
-
[PATCH] scsi: scsi_data_buffer
In preparation for bidi we abstract all IO members of scsi_cmnd, that will need to duplicate, into a substructure.

- Group all IO members of scsi_cmnd into a scsi_data_buffer structure.
- Adjust accessors to new members.
- scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of scsi_cmnd. And work on it.
- Adjust scsi_init_io() and scsi_release_buffers() for above change.
- Fix other parts of scsi_lib/scsi.c to members migration. Use accessors where appropriate.
- fix Documentation about scsi_cmnd in scsi_host.h
- scsi_error.c
  * Changed needed members of struct scsi_eh_save.
  * Careful considerations in scsi_eh_prep/restore_cmnd.
- sd.c and sr.c
  * sd and sr would adjust IO size to align on device's block size so code needs to change once we move to scsi_data_buff implementation.
  * Convert code to use scsi_for_each_sg
  * Use data accessors where appropriate.
- tgt: convert libsrp to use scsi_data_buffer
- isd200: This driver still bangs on scsi_cmnd IO members, so need changing

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 drivers/scsi/libsrp.c        |  4 +-
 drivers/scsi/scsi.c          |  2 +-
 drivers/scsi/scsi_error.c    | 28 +--
 drivers/scsi/scsi_lib.c      | 77 --
 drivers/scsi/sd.c            |  4 +-
 drivers/scsi/sr.c            | 25 +++--
 drivers/usb/storage/isd200.c |  8 ++--
 include/scsi/scsi_cmnd.h     | 39 +
 include/scsi/scsi_eh.h       |  8 ++---
 include/scsi/scsi_host.h     |  4 +-
 10 files changed, 91 insertions(+), 108 deletions(-)

diff --git a/drivers/scsi/libsrp.c b/drivers/scsi/libsrp.c
index 5cff020..8a8562a 100644
--- a/drivers/scsi/libsrp.c
+++ b/drivers/scsi/libsrp.c
@@ -426,8 +426,8 @@ int srp_cmd_queue(struct Scsi_Host *shost, struct srp_cmd *cmd, void *info,

 	sc->SCp.ptr = info;
 	memcpy(sc->cmnd, cmd->cdb, MAX_COMMAND_SIZE);
-	sc->request_bufflen = len;
-	sc->request_buffer = (void *) (unsigned long) addr;
+	sc->sdb.length = len;
+	sc->sdb.sglist = (void *) (unsigned long) addr;
 	sc->tag = tag;
 	err = scsi_tgt_queue_command(sc, itn_id, (struct scsi_lun *)&cmd->lun,
 				     cmd->tag);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index ebc0193..a0fd785 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -712,7 +712,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 				"Notifying upper driver of completion "
 				"(result %x)\n", cmd->result));

-	good_bytes = cmd->request_bufflen;
+	good_bytes = scsi_bufflen(cmd);
 	if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
 		drv = scsi_cmd_to_driver(cmd);
 		if (drv->done)
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 169bc59..241ab48 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -617,29 +617,25 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses,
 	ses->cmd_len = scmd->cmd_len;
 	memcpy(ses->cmnd, scmd->cmnd, sizeof(scmd->cmnd));
 	ses->data_direction = scmd->sc_data_direction;
-	ses->bufflen = scmd->request_bufflen;
-	ses->buffer = scmd->request_buffer;
-	ses->use_sg = scmd->use_sg;
-	ses->resid = scmd->resid;
+	ses->sdb = scmd->sdb;
 	ses->result = scmd->result;

+	memset(&scmd->sdb, 0, sizeof(scmd->sdb));
+
 	if (sense_bytes) {
-		scmd->request_bufflen = min_t(unsigned,
+		scmd->sdb.length = min_t(unsigned,
 				       sizeof(scmd->sense_buffer), sense_bytes);
 		sg_init_one(&ses->sense_sgl, scmd->sense_buffer,
-						       scmd->request_bufflen);
-		scmd->request_buffer = &ses->sense_sgl;
+			    scmd->sdb.length);
+		scmd->sdb.sglist = &ses->sense_sgl;
 		scmd->sc_data_direction = DMA_FROM_DEVICE;
-		scmd->use_sg = 1;
+		scmd->sdb.sg_count = 1;
 		memset(scmd->cmnd, 0, sizeof(scmd->cmnd));
 		scmd->cmnd[0] = REQUEST_SENSE;
-		scmd->cmnd[4] = scmd->request_bufflen;
+		scmd->cmnd[4] = scmd->sdb.length;
 		scmd->cmd_len = COMMAND_SIZE(scmd->cmnd[0]);
 	} else {
-		scmd->request_buffer = NULL;
-		scmd->request_bufflen = 0;
 		scmd->sc_data_direction = DMA_NONE;
-		scmd->use_sg = 0;
 		if (cmnd) {
 			memset(scmd->cmnd, 0, sizeof(scmd->cmnd));
 			memcpy(scmd->cmnd, cmnd, cmnd_size);
@@ -676,10 +672,7 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses)
 	scmd->cmd_len = ses->cmd_len;
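
The effect of grouping the IO members can be seen in a small user-space model. This is a sketch only, not the kernel's definitions: the field names `sglist`, `length` and `sg_count` are the ones visible in the hunks above, `resid` is assumed to move along with them, the structures are cut down to the IO members, and `eh_save`/`eh_restore` are hypothetical stand-ins for the save and restore steps in scsi_eh_prep_cmnd() and scsi_eh_restore_cmnd().

```c
#include <string.h>

struct scatterlist;		/* opaque in this model */

/* All per-direction IO state lives in one substructure. */
struct scsi_data_buffer {
	struct scatterlist *sglist;
	unsigned length;
	int resid;		/* assumed to be part of the group */
	unsigned short sg_count;
};

struct scsi_cmnd {
	struct scsi_data_buffer sdb;
};

struct scsi_eh_save {
	struct scsi_data_buffer sdb;
};

/* Accessors hide where the members actually live. */
static inline struct scatterlist *scsi_sglist(struct scsi_cmnd *cmd)
{
	return cmd->sdb.sglist;
}

static inline unsigned scsi_bufflen(struct scsi_cmnd *cmd)
{
	return cmd->sdb.length;
}

static inline unsigned scsi_sg_count(struct scsi_cmnd *cmd)
{
	return cmd->sdb.sg_count;
}

/* Hypothetical: save the IO state with one assignment, then clear it. */
static void eh_save(struct scsi_cmnd *cmd, struct scsi_eh_save *ses)
{
	ses->sdb = cmd->sdb;
	memset(&cmd->sdb, 0, sizeof(cmd->sdb));
}

/* Hypothetical: put the saved IO state back. */
static void eh_restore(struct scsi_cmnd *cmd, struct scsi_eh_save *ses)
{
	cmd->sdb = ses->sdb;
}
```

Four separate member copies in the error handler collapse into a single structure assignment, and a second scsi_data_buffer can later describe the other direction of a bidi command without touching scsi_cmnd again.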
[PATCH] scsi: bidi support
At the block level a bidi request uses the req->next_rq pointer for a second bidi_read request. At the SCSI midlayer a second scsi_data_buffer structure is used for the bidi_read part. This bidi scsi_data_buffer is put on request->next_rq->special. Struct scsi_cmnd is not changed.

- Define scsi_bidi_cmnd() to return true if it is a bidi request and a second sgtable was allocated.
- Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer from this command. This API is to isolate users from the mechanics of bidi.
- Define scsi_end_bidi_request() to do what scsi_end_request() does but for a bidi request. This is necessary because bidi commands are a bit tricky here. (See comments in body)
- scsi_release_buffers() will also release the bidi_read scsi_data_buffer.
- scsi_io_completion() on bidi commands will now call scsi_end_bidi_request() and return.
- The previous work done in scsi_init_io() is now done in a new scsi_init_sgtable() (which is 99% identical to the old scsi_init_io()). The new scsi_init_io() will call the above twice if needed, also for the bidi_read command. Only at this point is a command bidi.
- In scsi_error.c at scsi_eh_prep/restore_cmnd() make sure a bidi LLD is not confused by a get-sense command that looks like bidi. This is done by putting NULL at request->next_rq, and restoring.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/scsi_error.c |   3 +
 drivers/scsi/scsi_lib.c   | 144 -
 include/scsi/scsi_cmnd.h  |  23 +++-
 include/scsi/scsi_eh.h    |   1 +
 4 files changed, 141 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 241ab48..5c8ba6a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -618,9 +618,11 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses,
 	memcpy(ses->cmnd, scmd->cmnd, sizeof(scmd->cmnd));
 	ses->data_direction = scmd->sc_data_direction;
 	ses->sdb = scmd->sdb;
+	ses->next_rq = scmd->request->next_rq;
 	ses->result = scmd->result;

 	memset(&scmd->sdb, 0, sizeof(scmd->sdb));
+	scmd->request->next_rq = NULL;

 	if (sense_bytes) {
 		scmd->sdb.length = min_t(unsigned,
@@ -673,6 +675,7 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses)
 	memcpy(scmd->cmnd, ses->cmnd, sizeof(scmd->cmnd));
 	scmd->sc_data_direction = ses->data_direction;
 	scmd->sdb = ses->sdb;
+	scmd->request->next_rq = ses->next_rq;
 	scmd->result = ses->result;
 }
 EXPORT_SYMBOL(scsi_eh_restore_cmnd);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7ac36fe..a6aae56 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -64,6 +64,8 @@ static struct scsi_host_sg_pool scsi_sg_pools[] = {
 };
 #undef SP

+static struct kmem_cache *scsi_bidi_sdb_cache;
+
 static void scsi_run_queue(struct request_queue *q);

 /*
@@ -627,6 +629,28 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
 		scsi_run_queue(sdev->request_queue);
 }

+static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate)
+{
+	struct request_queue *q = cmd->device->request_queue;
+	struct request *req = cmd->request;
+	unsigned long flags;
+
+	add_disk_randomness(req->rq_disk);
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	if (blk_rq_tagged(req))
+		blk_queue_end_tag(q, req);
+
+	end_that_request_last(req, uptodate);
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	/*
+	 * This will goose the queue request function at the end, so we don't
+	 * need to worry about launching another command.
+	 */
+	scsi_next_command(cmd);
+}
+
 /*
  * Function:    scsi_end_request()
  *
@@ -654,7 +678,6 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate,
 {
 	struct request_queue *q = cmd->device->request_queue;
 	struct request *req = cmd->request;
-	unsigned long flags;

 	/*
 	 * If there are blocks left over at the end, set up the command
@@ -683,19 +706,7 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate,
 		}
 	}

-	add_disk_randomness(req->rq_disk);
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	if (blk_rq_tagged(req))
-		blk_queue_end_tag(q, req);
-	end_that_request_last(req, uptodate);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-
-	/*
-	 * This will goose the queue request function at the end, so we don't
-	 * need to worry about launching another command.
-	 */
-	scsi_next_command(cmd);
+	scsi_finalize_request(cmd, uptodate);
 	return NULL;
 }

@@ -894,10 +905,39 @@ void scsi_release_buffers(struct scsi_cmnd *cmd)
 		scsi_free_sgtable(&cmd->sdb);

 	memset(&cmd->sdb, 0,
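
The accessor scheme described in this patch's changelog can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the patch's header code: the structures are cut down to the members the description mentions (next_rq, special, and the command's own scsi_data_buffer), and the function bodies are written from the prose description rather than copied from scsi_cmnd.h.

```c
#include <stddef.h>

struct scsi_data_buffer {
	unsigned length;
};

/* Reduced model of a block request: only the bidi-relevant members. */
struct request {
	struct request *next_rq;	/* second (bidi_read) request, or NULL */
	void *special;			/* on next_rq: the bidi scsi_data_buffer */
};

struct scsi_cmnd {
	struct request *request;
	struct scsi_data_buffer sdb;	/* the only buffer of a uni-directional command */
};

/* True if this is a bidi request and its second buffer was set up. */
static inline int scsi_bidi_cmnd(struct scsi_cmnd *cmd)
{
	return cmd->request->next_rq != NULL &&
	       cmd->request->next_rq->special != NULL;
}

/* Data flowing from the device: the second buffer when bidi. */
static inline struct scsi_data_buffer *scsi_in(struct scsi_cmnd *cmd)
{
	return scsi_bidi_cmnd(cmd) ?
		(struct scsi_data_buffer *)cmd->request->next_rq->special :
		&cmd->sdb;
}

/* Data flowing to the device: always the command's own buffer. */
static inline struct scsi_data_buffer *scsi_out(struct scsi_cmnd *cmd)
{
	return &cmd->sdb;
}
```

In this model a uni-directional command returns the same buffer from scsi_in() and scsi_out(), so callers need no special case. Setting request->next_rq to NULL, as the error handler does around a request-sense command, is enough to make the command look uni-directional again.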
[PATCH] sr/sd: Remove simple dead code
The if (rq_data_dir() == WRITE) ... else if () ... else chain had an extra else, since the if () tests a value of 1 bit. Also, with a bidi request, rq_data_dir() == WRITE and blk_bidi_rq() == true.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/sd.c | 5 +
 drivers/scsi/sr.c | 5 +
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 212f6bc..e6d85b0 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -445,12 +445,9 @@ static int sd_prep_fn(struct request_queue *q, struct request *rq)
 		}
 		SCpnt->cmnd[0] = WRITE_6;
 		SCpnt->sc_data_direction = DMA_TO_DEVICE;
-	} else if (rq_data_dir(rq) == READ) {
+	} else {
 		SCpnt->cmnd[0] = READ_6;
 		SCpnt->sc_data_direction = DMA_FROM_DEVICE;
-	} else {
-		scmd_printk(KERN_ERR, SCpnt, "Unknown command %x\n", rq->cmd_flags);
-		goto out;
 	}

 	SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt,
diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 896be4a..7128d15 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -372,12 +372,9 @@ static int sr_prep_fn(struct request_queue *q, struct request *rq)
 		SCpnt->cmnd[0] = WRITE_10;
 		SCpnt->sc_data_direction = DMA_TO_DEVICE;
 		cd->cdi.media_written = 1;
-	} else if (rq_data_dir(rq) == READ) {
+	} else {
 		SCpnt->cmnd[0] = READ_10;
 		SCpnt->sc_data_direction = DMA_FROM_DEVICE;
-	} else {
-		blk_dump_rq_flags(rq, "Unknown sr command");
-		goto out;
 	}

 	{
--
1.5.3.3
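
The reasoning in the changelog can be checked with a small user-space model. This is a sketch, not the driver code: `rq_data_dir` is modelled here as a function on the flags word that keeps only bit 0 (the kernel macro takes a request pointer), `sd_opcode_for` is a hypothetical helper, and `READ_6`/`WRITE_6` carry their standard SCSI opcode values.

```c
#define READ	0
#define WRITE	1

#define READ_6	0x08
#define WRITE_6	0x0a

/* Model: the data direction is a single bit of the flags word. */
static inline int rq_data_dir(unsigned long cmd_flags)
{
	return cmd_flags & 1;
}

/*
 * Hypothetical helper shaped like the patched branch: a one-bit value
 * that is not WRITE can only be READ, so two arms cover every case.
 */
unsigned char sd_opcode_for(unsigned long cmd_flags)
{
	if (rq_data_dir(cmd_flags) == WRITE)
		return WRITE_6;
	else
		return READ_6;
}
```

Whatever other flag bits are set, the result is one of the two opcodes, which is why the removed "Unknown command" arm could never run.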
Re: [PATCH] dpt_i2o: don't set DMA_64BIT_MASK [was: Re: [stable] broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)]
On Thu, 2007-12-13 at 11:11 +0100, Miquel van Smoorenburg wrote:
> According to Greg KH:
> > So, what should be added to 2.6.23-stable then? And, can I get a real
> > changelog entry for it?
>
> This is suitable for both 2.6.23.x and 2.6.24-rc5:
>
> linux-2.6-dpt_i2o-no-dma64.patch

Actually, this one's already queued:

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-rc-fixes-2.6.git;a=commit;h=a066b307861238c1970310579c0bc2fe8c8dca51

James
Re: [PATCH] scsi device recovery
On Wed, 2007-12-12 at 18:54 +0100, Bernd Schubert wrote:
> [Hmm, resending since mail after more than 30min still not on the ML, maybe the attachment was too large? I have uploaded the log to http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/scsi/kern.log.1]
>
> On Wednesday 12 December 2007 16:59:36 James Bottomley wrote:
> > On Wed, 2007-12-12 at 15:36 +0100, Bernd Schubert wrote:
> > > On Wednesday 12 December 2007 14:39:27 Matthew Wilcox wrote:
> > > > On Wed, Dec 12, 2007 at 01:54:14PM +0100, Bernd Schubert wrote:
> > > > > below is a patch introducing device recovery, trying to prevent i/o errors when a DID_NO_CONNECT or SOFT_ERROR does happen.
> > > >
> > > > Why doesn't the regular scsi_eh do what you need?
> > >
> > > First of all, it is presently simply not called when the two errors above do happen. This could be changed, of course.
> >
> > Erm, I think you'll find the error handler does activate on DID_SOFT_ERROR. It causes a retry via the eh. DID_NO_CONNECT is an
>
> Dec 7 23:48:45 beo-96 kernel: [94605.297924] sd 2:0:5:0: [sdd] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK,SUGGEST_OK
> Dec 7 23:48:45 beo-96 kernel: [94605.297932] end_request: I/O error, dev sdd, sector 7706802052
> Dec 7 23:48:45 beo-96 kernel: [94605.297937] raid5:md5: read error not correctable (sector 871932472 on sdd3).

This is some type of ioc internal error. What we do on DID_SOFT_ERROR is retry for the usual number of times up to the timeout limit. Unfortunately, the retries are fixed at SD_MAX_RETRIES in sd.c. Without diagnosing what's going wrong in the fusion, it's impossible to say if this is reasonable, but your fusion is signalling ioc errors (firmware errors).

> Full log attached.
>
> > immediate error with no eh intervention because it means that the target went away. Handling this as a retryable error isn't an option because it will interfere with hotplug.
>
> Then we need a sysfs flag one can set to manually enable eh for these devices on DID_NO_CONNECT.

No, because that will seriously damage a lot of other systems. The DID_NO_CONNECT looks to be a genuine reselection issue caused by a device out of spec on the bus. The SPI standard says a device should respond in 250ms, which is what most HBA's take as the default selection timeout. I'd say for the device you have, you need to increase this. Unfortunately doing this for the fusion is some type of mode page setting, I think, but I don't have the doc in front of me. I'd be amenable to putting the selection timeout as a parameter in the spi transport class, since others might find it valuable occasionally to control.

> > > Secondly, I think scsi_eh is in most cases doing too much. We are fighting with flaky Infortrend boxes here, and scsi_eh sometimes manages to crash their scsi channels. In most cases it is sufficient to stall any io to the device and then to resume.
> >
> > But that's basically the default behaviour of the error handler (stall then resume).
> >
> > > For most scsi devices one probably doesn't need a suspend time or it can be very small, this still needs to become configurable via sysfs.
> >
> > You mean a wait time beyond what the error handler currently does (basically it waits for the quiesce, begins error handling and then sends a test unit ready when it finishes before restarting).
>
> In deh just waits on the first error and then only does a DV. For these infortrend devices, thats mostly sufficient.
>
> > > Thirdly, scsi_eh doesn't give up, in most cases, when the scsi channel of a Infortrend box crashed, it tried forever to recover. To improve this is still on my todo list.
> >
> > Could you send traces for this. I thought the error handler had been fixed over the last few years always to terminate. If there's a case where it doesn't, this needs fixing.
>
> I'm attaching the syslog, this is 2.6.22 + additional printks, dump_stack()'s and msleep()'s. At 03:59:36 the system finally went into wait_for_completion(), similar to the "everything in wait_for_completion, what is my system doing?" thread.

This looks like a genuine bug. I missed the thread, since my email system went off line while I was on holiday for two weeks. The symptoms look to be lost commands, but I can't see why from the traces. There's a known bug where we can hang in domain validation because of a resource starvation issue, but I know of none where everything hangs just after error recovery completes.

James
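
The two behaviours James describes (bounded retries for DID_SOFT_ERROR, immediate failure for DID_NO_CONNECT) can be written down as a simplified decision function. This is a user-space model for illustration only and not the midlayer's actual disposition code: `classify` and the `DISP_*` names are hypothetical, `MAX_RETRIES` stands in for sd.c's SD_MAX_RETRIES with an assumed value of 5, and the host byte values and `host_byte()` extraction follow the kernel's SCSI headers.

```c
#define DID_OK		0x00
#define DID_NO_CONNECT	0x01
#define DID_SOFT_ERROR	0x0b

/* The host byte lives in bits 16..23 of the command result word. */
#define host_byte(result)	(((result) >> 16) & 0xff)

#define MAX_RETRIES	5	/* assumed stand-in for SD_MAX_RETRIES */

enum disposition { DISP_DONE, DISP_RETRY, DISP_FAIL };

/* Hypothetical helper: decide what happens to a completed command. */
enum disposition classify(int result, int retries_so_far)
{
	switch (host_byte(result)) {
	case DID_OK:
		return DISP_DONE;
	case DID_NO_CONNECT:
		/* Target went away: fail at once so hotplug is not held up. */
		return DISP_FAIL;
	case DID_SOFT_ERROR:
		/* Retry, but only up to the fixed retry budget. */
		return retries_so_far < MAX_RETRIES ? DISP_RETRY : DISP_FAIL;
	default:
		return DISP_FAIL;
	}
}
```

In this model the I/O error in the quoted log corresponds to the point where the retry budget for DID_SOFT_ERROR is used up, which is consistent with the remark that the retry count is fixed.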
Re: [RFC] blktrace interface for sg devices
On Thu, Dec 13, 2007 at 10:19:42AM +0100, Jens Axboe wrote:
[...]
> I think this approach is the simplest and right way to do it. Tracing is really just tied to the transport (transport here meaning how we transport commands to the device), and even character scsi devices use the block layer queue for this operation, as you note.
>
> Let me know when you are happy with the patch, and I'll queue it up for 2.6.25.
>
> > @@ -1066,6 +1068,16 @@ sg_ioctl(struct inode *inode, struct fil
> >  	case BLKSECTGET:
> >  		return put_user(sdp->device->request_queue->max_sectors * 512,
> >  				ip);
> > +	case BLKTRACESETUP:
> > +	{
> > +		return blk_trace_setup(sdp->device->request_queue , sdp->device->sdev_gendev.bus_id, sdp->device->sdev_gendev, arg);
> > +	}
>
> Don't need those braces, some other space and long line style issues as well.
>
> > --- a/include/linux/blkdev.h 2007-12-13 08:48:23.0 +0100
> > +++ b/include/linux/blkdev.h 2007-12-13 08:48:25.0 +0100
> > @@ -747,6 +747,16 @@ static inline void blkdev_dequeue_reques
> >  	elv_dequeue_request(req->q, req);
> >  }
> >
> > +#ifdef CONFIG_BLK_DEV_IO_TRACE
> > +extern int blk_trace_setup(request_queue_t *q, char * name, dev_t dev, char __user *arg);
> > +extern int blk_trace_startstop(request_queue_t *q, int start);
> > +extern int blk_trace_remove(request_queue_t *q);
> > +#else
> > +#define blk_trace_setup(q, name, dev, arg) do { } while(0)
> > +#define blk_trace_startstop(q, start) do { } while(0)
> > +#define blk_trace_remove(q) do { } while(0)
> > +#endif
> > +
>
> Put these in the blktrace include file.

Thanks for your input. I will prepare and send an updated version of the patch. I also want to do some more testing, especially to see how I can get the sizes of read and write requests and latencies for SCSI tape drives.

Christof Schmitt
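
One detail that may be worth checking when the declarations move: the sg.c hunk uses the return value (`return blk_trace_setup(...)`), and a `do { } while(0)` macro is a statement, so it cannot appear in that position when CONFIG_BLK_DEV_IO_TRACE is off. A minimal sketch of one alternative follows, with stubs that evaluate to an error code. The choice of -ENOTTY is an assumption, `struct request_queue` is left opaque, and `sg_trace_stop_model` is a hypothetical caller shaped like the sg_ioctl() cases; none of this is taken from a later version of the patch.

```c
#include <errno.h>

struct request_queue;	/* opaque in this sketch */

#ifdef CONFIG_BLK_DEV_IO_TRACE
extern int blk_trace_startstop(struct request_queue *q, int start);
extern int blk_trace_remove(struct request_queue *q);
#else
/* Stubs that are expressions with a value, so callers may return them. */
static inline int blk_trace_startstop(struct request_queue *q, int start)
{
	(void)q;
	(void)start;
	return -ENOTTY;		/* assumed error code for "not supported" */
}

static inline int blk_trace_remove(struct request_queue *q)
{
	(void)q;
	return -ENOTTY;
}
#endif

/* Hypothetical caller shaped like the BLKTRACESTOP case in sg_ioctl(). */
int sg_trace_stop_model(struct request_queue *q)
{
	return blk_trace_startstop(q, 0);
}
```

Inline functions also keep type checking on the arguments in both configurations, which macro stubs do not.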
QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Jens,

I'm experimenting here with trying to generate large I/O through libata, and not having much luck.

The limit seems to be the number of hardware PRD (SG) entries permitted by the driver (libata:ata_piix), which is 128 by default.

The problem is, the block layer *never* sends an SG entry larger than 8192 bytes, and even that size is exceptionally rare. Nearly all I/O segments are 4096 bytes, so I never see a single I/O larger than 512KB (128 * 4096).

If I patch various parts of block and SCSI, this limit doesn't budge, but when I change the hardware PRD limit in libata, it scales by exactly whatever I set the new value to. This tells me that adjacent I/O segments are not being combined.

I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should result in adjacent single pages being combined into larger physical segments?

This is x86-32 with latest 2.6.24-rc*. I'll re-test on older kernels next.

???
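
The ceiling reported here is simply the product of the two limits. As a minimal sketch of that arithmetic (the helper name `max_io_bytes` is hypothetical and only restates the figures from the message):

```c
#include <stddef.h>

/* Largest single I/O when every SG/PRD entry carries one segment. */
size_t max_io_bytes(size_t prd_entries, size_t bytes_per_segment)
{
	return prd_entries * bytes_per_segment;
}
```

With 128 entries of 4096 bytes this gives 524288 bytes (512KB). Raising the entry count scales the result linearly, which matches the observation, while merging adjacent pages into larger segments would raise it without touching the entry count.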
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13, 2007 at 01:37:59PM -0500, Mark Lord wrote:
> The problem is, the block layer *never* sends an SG entry larger than 8192 bytes, and even that size is exceptionally rare. Nearly all I/O segments are 4096 bytes, so I never see a single I/O larger than 512KB (128 * 4096).
>
> If I patch various parts of block and SCSI, this limit doesn't budge, but when I change the hardware PRD limit in libata, it scales by exactly whatever I set the new value to. This tells me that adjacent I/O segments are not being combined.
>
> I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should result in adjacent single pages being combined into larger physical segments?

I was recently debugging a driver and noticed that consecutive pages in an sg list are in the reverse order. ie first you get page 918, then 917, 916, 915, 914, etc. I vaguely remember James having patches to correct this, but maybe they weren't merged?

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step."
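
Why reverse-ordered pages would defeat clustering can be shown with a simplified model of the merge rule. This is a sketch under assumptions, not the block layer's code: `count_segments` is a hypothetical helper that merges a page into the previous segment only when its frame number is exactly one higher than the preceding page, and it ignores the maximum segment size and boundary limits a real queue also enforces.

```c
#include <stddef.h>

/*
 * Count the physical segments needed for a list of page frame numbers,
 * taken in list order, when physically contiguous neighbours merge.
 */
size_t count_segments(const unsigned long *pfn, size_t n)
{
	size_t segs = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* A new segment starts unless this page directly follows the last. */
		if (i == 0 || pfn[i] != pfn[i - 1] + 1)
			segs++;
	}
	return segs;
}
```

Pages 914, 915, 916, 917, 918 collapse into one segment in this model, while the same pages in the order 918, 917, 916, 915, 914 need five, one 4096-byte segment each. That would be consistent with the behaviour Mark reports.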
[PATCH 08/24] iscsi class: Use our own workq instead of common system one.
From: Mike Christie [EMAIL PROTECTED]

There is just too much going on through the common workq and something
like a scsi device removal through sysfs affects how long it will take
to recover the transport, mark it as failed, or shut it down gracefully.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/scsi_transport_iscsi.c |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 75d3069..9cc2cc8 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -50,6 +50,7 @@ struct iscsi_internal {
 };

 static atomic_t iscsi_session_nr; /* sysfs session id for next new session */
+static struct workqueue_struct *iscsi_eh_timer_workq;

 /*
  * list of registered transports and lock that must
@@ -252,7 +253,7 @@ static void session_recovery_timedout(struct work_struct *work)
 void iscsi_unblock_session(struct iscsi_cls_session *session)
 {
 	if (!cancel_delayed_work(&session->recovery_work))
-		flush_scheduled_work();
+		flush_workqueue(iscsi_eh_timer_workq);
 	scsi_target_unblock(&session->dev);
 }
 EXPORT_SYMBOL_GPL(iscsi_unblock_session);
@@ -260,8 +261,8 @@ EXPORT_SYMBOL_GPL(iscsi_unblock_session);
 void iscsi_block_session(struct iscsi_cls_session *session)
 {
 	scsi_target_block(&session->dev);
-	schedule_delayed_work(&session->recovery_work,
-			      session->recovery_tmo * HZ);
+	queue_delayed_work(iscsi_eh_timer_workq, &session->recovery_work,
+			   session->recovery_tmo * HZ);
 }
 EXPORT_SYMBOL_GPL(iscsi_block_session);

@@ -357,7 +358,7 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
 	struct iscsi_host *ihost = shost->shost_data;

 	if (!cancel_delayed_work(&session->recovery_work))
-		flush_scheduled_work();
+		flush_workqueue(iscsi_eh_timer_workq);

 	mutex_lock(&ihost->mutex);
 	list_del(&session->host_list);
@@ -1521,8 +1522,14 @@ static __init int iscsi_transport_init(void)
 		goto unregister_session_class;
 	}

+	iscsi_eh_timer_workq = create_singlethread_workqueue("iscsi_eh");
+	if (!iscsi_eh_timer_workq)
+		goto release_nls;
+
 	return 0;

+release_nls:
+	sock_release(nls->sk_socket);
 unregister_session_class:
 	transport_class_unregister(&iscsi_session_class);
 unregister_conn_class:
@@ -1536,6 +1543,7 @@ unregister_transport_class:

 static void __exit iscsi_transport_exit(void)
 {
+	destroy_workqueue(iscsi_eh_timer_workq);
 	sock_release(nls->sk_socket);
 	transport_class_unregister(&iscsi_connection_class);
 	transport_class_unregister(&iscsi_session_class);
-- 
1.5.1.2
[PATCH 10/24] libiscsi: fix shutdown
From: Mike Christie [EMAIL PROTECTED] We were using the device delete sysfs file to remove each device then logout. Now in 2.6.21 this will not work because the sysfs delete file returns immediately and does not wait for the device removal to complete. This causes a hang if a cache sync is needed during shutdown. Before .21, that approach had other problems, so this patch fixes the shutdown code so that we remove the target and unbind the session before logging out and shut down the session Signed-off-by: Mike Christie [EMAIL PROTECTED] --- drivers/scsi/libiscsi.c |4 +- drivers/scsi/qla4xxx/ql4_init.c |4 +- drivers/scsi/qla4xxx/ql4_os.c |7 +- drivers/scsi/scsi_transport_iscsi.c | 289 +++ include/scsi/iscsi_if.h |7 + include/scsi/iscsi_proto.h |2 + include/scsi/scsi_transport_iscsi.h |7 +- 7 files changed, 176 insertions(+), 144 deletions(-) diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 441e351..5205ef2 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -1662,7 +1662,7 @@ void iscsi_session_teardown(struct iscsi_cls_session *cls_session) struct iscsi_session *session = iscsi_hostdata(shost-hostdata); struct module *owner = cls_session-transport-owner; - iscsi_unblock_session(cls_session); + iscsi_remove_session(cls_session); scsi_remove_host(shost); iscsi_pool_free(session-mgmtpool); @@ -1677,7 +1677,7 @@ void iscsi_session_teardown(struct iscsi_cls_session *cls_session) kfree(session-hwaddress); kfree(session-initiatorname); - iscsi_destroy_session(cls_session); + iscsi_free_session(cls_session); scsi_host_put(shost); module_put(owner); } diff --git a/drivers/scsi/qla4xxx/ql4_init.c b/drivers/scsi/qla4xxx/ql4_init.c index d692c71..cbe0a17 100644 --- a/drivers/scsi/qla4xxx/ql4_init.c +++ b/drivers/scsi/qla4xxx/ql4_init.c @@ -5,6 +5,7 @@ * See LICENSE.qla4xxx for copyright and licensing details. 
*/ +#include scsi/iscsi_if.h #include ql4_def.h #include ql4_glbl.h #include ql4_dbg.h @@ -1305,7 +1306,8 @@ int qla4xxx_process_ddb_changed(struct scsi_qla_host *ha, atomic_set(ddb_entry-relogin_timer, 0); clear_bit(DF_RELOGIN, ddb_entry-flags); clear_bit(DF_NO_RELOGIN, ddb_entry-flags); - iscsi_if_create_session_done(ddb_entry-conn); + iscsi_session_event(ddb_entry-sess, + ISCSI_KEVENT_CREATE_SESSION); /* * Change the lun state to READY in case the lun TIMEOUT before * the device came back. diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c index 89460d2..f55b9f7 100644 --- a/drivers/scsi/qla4xxx/ql4_os.c +++ b/drivers/scsi/qla4xxx/ql4_os.c @@ -298,8 +298,7 @@ void qla4xxx_destroy_sess(struct ddb_entry *ddb_entry) return; if (ddb_entry-conn) { - iscsi_if_destroy_session_done(ddb_entry-conn); - iscsi_destroy_conn(ddb_entry-conn); + atomic_set(ddb_entry-state, DDB_STATE_DEAD); iscsi_remove_session(ddb_entry-sess); } iscsi_free_session(ddb_entry-sess); @@ -309,6 +308,7 @@ int qla4xxx_add_sess(struct ddb_entry *ddb_entry) { int err; + ddb_entry-sess-recovery_tmo = ddb_entry-ha-port_down_retry_count; err = iscsi_add_session(ddb_entry-sess, ddb_entry-fw_ddb_index); if (err) { DEBUG2(printk(KERN_ERR Could not add session.\n)); @@ -321,9 +321,6 @@ int qla4xxx_add_sess(struct ddb_entry *ddb_entry) DEBUG2(printk(KERN_ERR Could not add connection.\n)); return -ENOMEM; } - - ddb_entry-sess-recovery_tmo = ddb_entry-ha-port_down_retry_count; - iscsi_if_create_session_done(ddb_entry-conn); return 0; } diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 9cc2cc8..b82139d 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -116,6 +116,8 @@ static struct attribute_group iscsi_transport_group = { .attrs = iscsi_transport_attrs, }; + + static int iscsi_setup_host(struct transport_container *tc, struct device *dev, struct class_device *cdev) { @@ -125,13 +127,30 @@ static int 
iscsi_setup_host(struct transport_container *tc, struct device *dev, memset(ihost, 0, sizeof(*ihost)); INIT_LIST_HEAD(ihost-sessions); mutex_init(ihost-mutex); + + snprintf(ihost-unbind_workq_name, KOBJ_NAME_LEN, iscsi_unbind_%d, + shost-host_no); + ihost-unbind_workq = create_singlethread_workqueue( + ihost-unbind_workq_name); + if (!ihost-unbind_workq) + return -ENOMEM; +
[PATCH 13/24] Do not fail commands immediately during logout
From: Mike Christie [EMAIL PROTECTED]

If the target requests a logout, then we do not want to fail
commands to scsi-ml right away. This patch fails only the pending
commands, so that they are requeued immediately, and then lets iscsid
handle running commands as in normal recovery.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c |   14 ++++++--------
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 9688361..b17081b 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -917,7 +917,7 @@ check_mgmt:
 		conn->ctask = list_entry(conn->xmitqueue.next,
 					 struct iscsi_cmd_task, running);
 		if (conn->session->state == ISCSI_STATE_LOGGING_OUT) {
-			fail_command(conn, conn->ctask, DID_NO_CONNECT << 16);
+			fail_command(conn, conn->ctask, DID_IMM_RETRY << 16);
 			continue;
 		}
 		if (iscsi_prep_scsi_cmd_pdu(conn->ctask)) {
@@ -1024,21 +1024,19 @@ int iscsi_queuecommand(struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *))
 	 * be entering our queuecommand while a block is starting
 	 * up because the block code is not locked)
 	 */
-	if (session->state == ISCSI_STATE_IN_RECOVERY) {
+	switch (session->state) {
+	case ISCSI_STATE_IN_RECOVERY:
 		reason = FAILURE_SESSION_IN_RECOVERY;
 		goto reject;
-	}
-
-	switch (session->state) {
+	case ISCSI_STATE_LOGGING_OUT:
+		reason = FAILURE_SESSION_LOGGING_OUT;
+		goto reject;
 	case ISCSI_STATE_RECOVERY_FAILED:
 		reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
 		break;
 	case ISCSI_STATE_TERMINATE:
 		reason = FAILURE_SESSION_TERMINATE;
 		break;
-	case ISCSI_STATE_LOGGING_OUT:
-		reason = FAILURE_SESSION_LOGGING_OUT;
-		break;
 	default:
 		reason = FAILURE_SESSION_FREED;
 	}
-- 
1.5.1.2
[PATCH 14/24] clear conn->ctask when task is completed early
From: Mike Christie [EMAIL PROTECTED]

If the current ctask is failed early, we left the conn->ctask pointer
pointing to an invalid task. When the xmit thread would send data for it,
we would then oops.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index b17081b..4461317 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -248,13 +248,16 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task *ctask)
  */
 static void iscsi_complete_command(struct iscsi_cmd_task *ctask)
 {
-	struct iscsi_session *session = ctask->conn->session;
+	struct iscsi_conn *conn = ctask->conn;
+	struct iscsi_session *session = conn->session;
 	struct scsi_cmnd *sc = ctask->sc;

 	ctask->state = ISCSI_TASK_COMPLETED;
 	ctask->sc = NULL;
 	/* SCSI eh reuses commands to verify us */
 	sc->SCp.ptr = NULL;
+	if (conn->ctask == ctask)
+		conn->ctask = NULL;
 	list_del_init(&ctask->running);
 	__kfifo_put(session->cmdpool.queue, (void*)&ctask, sizeof(void*));
 	sc->scsi_done(sc);
-- 
1.5.1.2
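The fix in patch 14/24 follows a common pattern: when an object is completed, any cached "current" pointer to it has to be cleared in the same step. The following is a stripped-down user-space sketch of that idea; struct task, struct conn and complete_task are simplified stand-ins invented here, not the libiscsi structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; not the libiscsi structures. */
struct task {
	int completed;
};

struct conn {
	struct task *ctask;	/* task the xmit path is currently working on */
};

/*
 * Complete a task and, if the connection still caches it as the
 * current task, drop that reference so the xmit path cannot touch it.
 */
static void complete_task(struct conn *conn, struct task *task)
{
	task->completed = 1;
	if (conn->ctask == task)
		conn->ctask = NULL;
}
```

Completing an unrelated task leaves the cached pointer alone; completing the cached task resets it to NULL, so a later transmit pass has nothing stale to dereference.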
[PATCH 18/24] iscsi_tcp: drop session when itt does not match any command
From: Mike Christie [EMAIL PROTECTED]

A target should never send us an itt that does not match a running task.
If it does, we do not really know what is coming down after the header,
unless we evaluate the hdr and sometimes do some guessing. However,
even if we know what is coming, we probably do not have buffers for it or
we cannot respond (if it is an r2t, for example), so just drop the session.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index ecba606..65df908 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -755,11 +755,7 @@ iscsi_tcp_hdr_dissect(struct iscsi_conn *conn, struct iscsi_hdr *hdr)
 	opcode = hdr->opcode & ISCSI_OPCODE_MASK;
 	/* verify itt (itt encoding: age+cid+itt) */
 	rc = iscsi_verify_itt(conn, hdr, &itt);
-	if (rc == ISCSI_ERR_NO_SCSI_CMD) {
-		/* XXX: what does this do? */
-		tcp_conn->in.datalen = 0; /* force drop */
-		return 0;
-	} else if (rc)
+	if (rc)
 		return rc;

 	debug_tcp("opcode 0x%x ahslen %d datalen %d\n",
-- 
1.5.1.2
[PATCH 19/24] libiscsi, iscsi class: set tmf to a safe default and export in sysfs
From: Mike Christie [EMAIL PROTECTED]

Older tools will not be setting the tmf timeouts since they did not
exist, so set them to a safe default. Also export the abort and lu reset
timeout values in sysfs.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c             |    2 ++
 drivers/scsi/scsi_transport_iscsi.c |    8 ++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index f15df8d..6573223 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1732,6 +1732,8 @@ iscsi_session_setup(struct iscsi_transport *iscsit,
 	session->host = shost;
 	session->state = ISCSI_STATE_FREE;
 	session->fast_abort = 1;
+	session->lu_reset_timeout = 15;
+	session->abort_timeout = 10;
 	session->mgmtpool_max = ISCSI_MGMT_CMDS_MAX;
 	session->cmds_max = cmds_max;
 	session->queued_cmdsn = session->cmdsn = initial_cmdsn;
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 36aa50e..3585599 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -30,7 +30,7 @@
 #include <scsi/scsi_transport_iscsi.h>
 #include <scsi/iscsi_if.h>

-#define ISCSI_SESSION_ATTRS 16
+#define ISCSI_SESSION_ATTRS 18
 #define ISCSI_CONN_ATTRS 11
 #define ISCSI_HOST_ATTRS 4
 #define ISCSI_TRANSPORT_VERSION "2.0-724"
@@ -1242,7 +1242,9 @@ iscsi_session_attr(username, ISCSI_PARAM_USERNAME, 1);
 iscsi_session_attr(username_in, ISCSI_PARAM_USERNAME_IN, 1);
 iscsi_session_attr(password, ISCSI_PARAM_PASSWORD, 1);
 iscsi_session_attr(password_in, ISCSI_PARAM_PASSWORD_IN, 1);
-iscsi_session_attr(fast_abort, ISCSI_PARAM_FAST_ABORT, 1);
+iscsi_session_attr(fast_abort, ISCSI_PARAM_FAST_ABORT, 0);
+iscsi_session_attr(abort_tmo, ISCSI_PARAM_ABORT_TMO, 0);
+iscsi_session_attr(lu_reset_tmo, ISCSI_PARAM_LU_RESET_TMO, 0);

 #define iscsi_priv_session_attr_show(field, format)			\
 static ssize_t								\
@@ -1467,6 +1469,8 @@ iscsi_register_transport(struct iscsi_transport *tt)
 	SETUP_SESSION_RD_ATTR(username, ISCSI_PASSWORD);
 	SETUP_SESSION_RD_ATTR(username_in, ISCSI_PASSWORD_IN);
 	SETUP_SESSION_RD_ATTR(fast_abort, ISCSI_FAST_ABORT);
+	SETUP_SESSION_RD_ATTR(abort_tmo, ISCSI_ABORT_TMO);
+	SETUP_SESSION_RD_ATTR(lu_reset_tmo,ISCSI_LU_RESET_TMO);
 	SETUP_PRIV_SESSION_RD_ATTR(recovery_tmo);

 	BUG_ON(count > ISCSI_SESSION_ATTRS);
-- 
1.5.1.2
[PATCH 21/24] iscsi_tcp: hold lock during data rsp processing
From: Mike Christie [EMAIL PROTECTED]

iscsi_data_rsp needs to hold the session lock when it calls
iscsi_update_cmdsn.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |   14 ++++++--------
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 84c4a50..edebdf2 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -641,13 +641,11 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 	}

 	/* fill-in new R2T associated with the task */
-	spin_lock(&session->lock);
 	iscsi_update_cmdsn(session, (struct iscsi_nopin*)rhdr);

 	if (!ctask->sc || session->state != ISCSI_STATE_LOGGED_IN) {
 		printk(KERN_INFO "iscsi_tcp: dropping R2T itt %d in "
 		       "recovery...\n", ctask->itt);
-		spin_unlock(&session->lock);
 		return 0;
 	}

@@ -660,7 +658,6 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 		printk(KERN_ERR "iscsi_tcp: invalid R2T with zero data len\n");
 		__kfifo_put(tcp_ctask->r2tpool.queue, (void*)&r2t,
 			    sizeof(void*));
-		spin_unlock(&session->lock);
 		return ISCSI_ERR_DATALEN;
 	}

@@ -676,7 +673,6 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 		       r2t->data_offset, scsi_bufflen(ctask->sc));
 		__kfifo_put(tcp_ctask->r2tpool.queue, (void*)&r2t,
 			    sizeof(void*));
-		spin_unlock(&session->lock);
 		return ISCSI_ERR_DATALEN;
 	}

@@ -690,8 +686,6 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 	conn->r2t_pdus_cnt++;

 	iscsi_requeue_ctask(ctask);
-	spin_unlock(&session->lock);
-
 	return 0;
 }

@@ -764,7 +758,9 @@ iscsi_tcp_hdr_dissect(struct iscsi_conn *conn, struct iscsi_hdr *hdr)
 	switch(opcode) {
 	case ISCSI_OP_SCSI_DATA_IN:
 		ctask = session->cmds[itt];
+		spin_lock(&conn->session->lock);
 		rc = iscsi_data_rsp(conn, ctask);
+		spin_unlock(&conn->session->lock);
 		if (rc)
 			return rc;
 		if (tcp_conn->in.datalen) {
@@ -806,9 +802,11 @@ iscsi_tcp_hdr_dissect(struct iscsi_conn *conn, struct iscsi_hdr *hdr)
 		ctask = session->cmds[itt];
 		if (ahslen)
 			rc = ISCSI_ERR_AHSLEN;
-		else if (ctask->sc->sc_data_direction == DMA_TO_DEVICE)
+		else if (ctask->sc->sc_data_direction == DMA_TO_DEVICE) {
+			spin_lock(&session->lock);
 			rc = iscsi_r2t_rsp(conn, ctask);
-		else
+			spin_unlock(&session->lock);
+		} else
 			rc = ISCSI_ERR_PROTO;
 		break;
 	case ISCSI_OP_LOGIN_RSP:
-- 
1.5.1.2
[PATCH 22/24] libiscsi: use is_power_of_2
From: Mike Christie [EMAIL PROTECTED]

Patch from vignesh babu [EMAIL PROTECTED]:

Replacing n & (n - 1) for power of 2 check by is_power_of_2(n)

Signed-off-by: vignesh babu [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 6573223..553168a 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -24,6 +24,7 @@
 #include <linux/types.h>
 #include <linux/kfifo.h>
 #include <linux/delay.h>
+#include <linux/log2.h>
 #include <asm/unaligned.h>
 #include <net/tcp.h>
 #include <scsi/scsi_cmnd.h>
@@ -1700,7 +1701,7 @@ iscsi_session_setup(struct iscsi_transport *iscsit,
 		qdepth = ISCSI_DEF_CMD_PER_LUN;
 	}

-	if (cmds_max < 2 || (cmds_max & (cmds_max - 1)) ||
+	if (!is_power_of_2(cmds_max) ||
 	    cmds_max >= ISCSI_MGMT_ITT_OFFSET) {
 		if (cmds_max != 0)
 			printk(KERN_ERR "iscsi: invalid can_queue of %d. "
-- 
1.5.1.2
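For readers comparing the two forms of the check in patch 22/24, here is a user-space sketch. is_pow2 mirrors what the kernel's is_power_of_2() computes; old_check_rejects and new_check_rejects are names invented for this example and model only the power-of-two part of the condition, leaving out the ISCSI_MGMT_ITT_OFFSET comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space equivalent of the kernel's is_power_of_2() from <linux/log2.h>. */
static bool is_pow2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* The open-coded test the patch removes: true means "reject cmds_max". */
static bool old_check_rejects(int cmds_max)
{
	return cmds_max < 2 || (cmds_max & (cmds_max - 1));
}

/* The replacement test: true means "reject cmds_max". */
static bool new_check_rejects(int cmds_max)
{
	return !is_pow2((unsigned long)cmds_max);
}
```

In this model the two checks agree for 0, for negative values and for every value from 2 upwards. They differ only for cmds_max == 1, which the old test rejected (it is below 2) and the new one accepts, since 1 is a power of two. Whether that difference matters to libiscsi is not something this sketch can settle.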
[PATCH 23/24] iscsi_tcp: fix setting of r2t
From: Mike Christie [EMAIL PROTECTED]

If we negotiate for X r2ts we have to use only X r2ts. We cannot round up
(we could send less though). It is ok to fail if it is not something the
driver can handle, so this patch just does that.

Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index edebdf2..e5be5fd 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -1774,12 +1774,12 @@ iscsi_conn_set_param(struct iscsi_cls_conn *cls_conn, enum iscsi_param param,
 		break;
 	case ISCSI_PARAM_MAX_R2T:
 		sscanf(buf, "%d", &value);
-		if (session->max_r2t == roundup_pow_of_two(value))
+		if (value <= 0 || !is_power_of_2(value))
+			return -EINVAL;
+		if (session->max_r2t == value)
 			break;
 		iscsi_r2tpool_free(session);
 		iscsi_set_param(cls_conn, param, buf, buflen);
-		if (session->max_r2t & (session->max_r2t - 1))
-			session->max_r2t = roundup_pow_of_two(session->max_r2t);
 		if (iscsi_r2tpool_alloc(session))
 			return -ENOMEM;
 		break;
-- 
1.5.1.2
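The behavioural change in patch 23/24 can be summarised as "validate instead of round up". A minimal sketch of the validation step alone, assuming the same condition as the patch; check_max_r2t is a hypothetical helper name used only here:

```c
#include <assert.h>
#include <errno.h>

/*
 * Sketch of the validation step only (hypothetical helper name):
 * accept a negotiated max_r2t value as-is or fail, never round it up.
 */
static int check_max_r2t(int value)
{
	if (value <= 0 || (value & (value - 1)) != 0)
		return -EINVAL;
	return 0;
}
```

A value of 3 is now refused with -EINVAL, where the removed roundup_pow_of_two() call would have turned it into 4, one more r2t than was negotiated.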
[PATCH 17/24] iscsi_tcp: stop leaking r2t_info's when the incoming R2T is bad
From: Mike Christie [EMAIL PROTECTED]

from [EMAIL PROTECTED]:

iscsi_r2t_rsp checks the incoming R2T for sanity, and if it thinks
it's fishy, it will drop it silently. In this case, we leaked an
r2t_info object. If we do this often enough, we run into a BUG_ON
some time later.

Removed r2t wrappers and update patch by Mike Christie

Signed-off-by: Olaf Kirch [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 7212fe9..ecba606 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -658,6 +658,8 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
 	r2t->data_length = be32_to_cpu(rhdr->data_length);
 	if (r2t->data_length == 0) {
 		printk(KERN_ERR "iscsi_tcp: invalid R2T with zero data len\n");
+		__kfifo_put(tcp_ctask->r2tpool.queue, (void*)&r2t,
+			    sizeof(void*));
 		spin_unlock(&session->lock);
 		return ISCSI_ERR_DATALEN;
 	}
@@ -669,10 +671,12 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)

 	r2t->data_offset = be32_to_cpu(rhdr->data_offset);
 	if (r2t->data_offset + r2t->data_length > scsi_bufflen(ctask->sc)) {
-		spin_unlock(&session->lock);
 		printk(KERN_ERR "iscsi_tcp: invalid R2T with data len %u at "
 		       "offset %u and total length %d\n", r2t->data_length,
 		       r2t->data_offset, scsi_bufflen(ctask->sc));
+		__kfifo_put(tcp_ctask->r2tpool.queue, (void*)&r2t,
+			    sizeof(void*));
+		spin_unlock(&session->lock);
 		return ISCSI_ERR_DATALEN;
 	}

-- 
1.5.1.2
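The leak fixed by patch 17/24 is the classic "took an entry from a pool, bailed out on a validation error, never gave it back" bug. A toy user-space model of the corrected flow follows. The pool here is a plain array-based free list invented for this example (struct pool, pool_get, pool_put and handle_request are all hypothetical names) and does not use the kfifo API.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Toy free-list standing in for the r2t pool; not the kfifo API. */
struct pool {
	int free_ids[POOL_SIZE];
	size_t nfree;
};

static void pool_init(struct pool *p)
{
	size_t i;

	for (i = 0; i < POOL_SIZE; i++)
		p->free_ids[i] = (int)i;
	p->nfree = POOL_SIZE;
}

static int pool_get(struct pool *p)
{
	return p->nfree ? p->free_ids[--p->nfree] : -1;
}

static void pool_put(struct pool *p, int id)
{
	assert(p->nfree < POOL_SIZE);
	p->free_ids[p->nfree++] = id;
}

/*
 * Take an entry, validate the request, and hand the entry back on the
 * error path. Returns the entry id, or -1 if the request is rejected
 * or the pool is empty.
 */
static int handle_request(struct pool *p, unsigned int data_length)
{
	int id = pool_get(p);

	if (id < 0)
		return -1;
	if (data_length == 0) {
		pool_put(p, id);	/* without this the entry leaks */
		return -1;
	}
	return id;
}
```

With the pool_put() on the error path, any number of rejected requests leaves the pool full. Remove that call and the fifth bad request already finds the four-entry pool empty, which is the user-space analogue of eventually hitting the BUG_ON described in the changelog.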
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, 2007-12-13 at 11:42 -0700, Matthew Wilcox wrote:
> On Thu, Dec 13, 2007 at 01:37:59PM -0500, Mark Lord wrote:
> > The problem is, the block layer *never* sends an SG entry larger than
> > 8192 bytes, and even that size is exceptionally rare. Nearly all I/O
> > segments are 4096 bytes, so I never see a single I/O larger than 512KB
> > (128 * 4096).
> >
> > If I patch various parts of block and SCSI, this limit doesn't budge,
> > but when I change the hardware PRD limit in libata, it scales by exactly
> > whatever I set the new value to.
> >
> > This tells me that adjacent I/O segments are not being combined.
> >
> > I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1)
> > should result in adjacent single pages being combined into larger
> > physical segments?
>
> I was recently debugging a driver and noticed that consecutive pages in
> an sg list are in the reverse order. ie first you get page 918, then
> 917, 916, 915, 914, etc. I vaguely remember James having patches to
> correct this, but maybe they weren't merged?

Yes, they were ... it was actually Bill Irwin's patch. The old problem
was that we fault allocations in reverse order (because we were taking
from the end of the zone list). I can't remember when his patches went
in, but it was several years ago. After they did, I was getting a 33%
chance of physical merging (as opposed to zero before). Probably
someone redid the vm or the zones without understanding this and we've
gone back to the original position.

James
[PATCH 05/24] iser patching for AHS support
From: Mike Christie [EMAIL PROTECTED]

from Boaz Harrosh [EMAIL PROTECTED]

 - The default initialization of hdr_max is the minimum,
   sizeof(struct iscsi_cmd).
 - Once this patch goes into iser the default initialization
   at libiscsi can be removed.
 - This is not yet full support for AHSs at the iser end, but it
   should be easy. Just allocate more space at iser_desc right
   after iscsi_hdr. Then at transmission time use ctask->hdr_len
   to retrieve the total size of all iscsi pdu headers.
   See previous patch at iscsi_tcp.[ch]

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/iser/iscsi_iser.c |    1 +
 drivers/scsi/libiscsi.c                  |    1 -
 2 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 2eadb6d..a2622f4 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -400,6 +400,7 @@ iscsi_iser_session_create(struct iscsi_transport *iscsit,
 		ctask = session->cmds[i];
 		iser_ctask = ctask->dd_data;
 		ctask->hdr = (struct iscsi_cmd *)&iser_ctask->desc.iscsi_header;
+		ctask->hdr_max = sizeof(iser_ctask->desc.iscsi_header);
 	}

 	for (i = 0; i < session->mgmtpool_max; i++) {
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 0d7914f..5936586 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1570,7 +1570,6 @@ iscsi_session_setup(struct iscsi_transport *iscsit,
 		if (cmd_task_size)
 			ctask->dd_data = &ctask[1];
 		ctask->itt = cmd_i;
-		ctask->hdr_max = sizeof(struct iscsi_cmd);
 		INIT_LIST_HEAD(&ctask->running);
 	}

-- 
1.5.1.2
[PATCH 04/24] iscsi_tcp, libiscsi: initial AHS Support
From: Mike Christie [EMAIL PROTECTED]

at libiscsi generic code
 - Currently the code assumes that storage space for the pdu header is
   allocated at the LLD's ctask and is pointed to by
   iscsi_cmd_task->hdr. Here I add a hdr_max field pertaining to that
   storage, and an hdr_len that accumulates the current use of the
   pdu-header.

 - Add an iscsi_next_hdr() inline which returns the next free space to
   write a new header at. iscsi_next_hdr() is also used to retrieve the
   address at which to write the header-digest.

 - Add iscsi_add_hdr(length). The user calls iscsi_next_hdr() for the
   address of the new header, then calls iscsi_add_hdr(length) with the
   size of the new header. iscsi_add_hdr() will check if space is
   available and update to the new size. length must be padded
   according to the standard.

 - Add 2 padding inline helpers, thanks to Olaf. The current patch does
   not use them but following patches will. Also moved the definition
   of ISCSI_PAD_LEN to iscsi_proto.h, which had a PAD_WORD_LEN that was
   never used anywhere.

 - Let iscsi_prep_scsi_cmd_pdu() signal an error return since now it is
   possible that it will fail.

 - I was tired of yet again writing a "this is a digest" comment next
   to sizeof(__u32), so I defined a new ISCSI_DIGEST_SIZE. Now I don't
   need any comments. Changed all places that used sizeof(__u32) or "4"
   in connection to a digest.

iscsi_tcp specific code
 - At struct iscsi_tcp_cmd_task allocate the maximum space allowed in
   the standard for all headers following the iscsi_cmd header, and
   mark it so in iscsi_tcp_session_create().

 - At iscsi_send_cmd_hdr() retrieve the correct headers size and write
   the header digest at iscsi_next_hdr().

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] Acked-by: Olaf Kirch [EMAIL PROTECTED] Signed-off-by: Mike Christie [EMAIL PROTECTED] --- drivers/scsi/iscsi_tcp.c | 16 drivers/scsi/iscsi_tcp.h | 13 +++-- drivers/scsi/libiscsi.c| 41 +++-- include/scsi/iscsi_proto.h | 10 +- include/scsi/libiscsi.h| 33 +++-- 5 files changed, 94 insertions(+), 19 deletions(-) diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index fd88777..491845f 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -113,7 +113,7 @@ iscsi_hdr_digest(struct iscsi_conn *conn, struct iscsi_buf *buf, struct iscsi_tcp_conn *tcp_conn = conn-dd_data; crypto_hash_digest(tcp_conn-tx_hash, buf-sg, buf-sg.length, crc); - buf-sg.length += sizeof(u32); + buf-sg.length += ISCSI_DIGEST_SIZE; } /* @@ -220,6 +220,7 @@ static inline int iscsi_tcp_chunk_done(struct iscsi_chunk *chunk) { static unsigned char padbuf[ISCSI_PAD_LEN]; + unsigned int pad; if (chunk-copied chunk-size) { iscsi_tcp_chunk_map(chunk); @@ -243,10 +244,8 @@ iscsi_tcp_chunk_done(struct iscsi_chunk *chunk) } /* Do we need to handle padding? 
*/ - if (chunk-total_copied (ISCSI_PAD_LEN-1)) { - unsigned int pad; - - pad = ISCSI_PAD_LEN - (chunk-total_copied (ISCSI_PAD_LEN-1)); + pad = iscsi_padding(chunk-total_copied); + if (pad != 0) { debug_tcp(consume %d pad bytes\n, pad); chunk-total_size += pad; chunk-size = pad; @@ -1385,11 +1384,11 @@ iscsi_send_cmd_hdr(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask) } iscsi_buf_init_iov(tcp_ctask-headbuf, (char*)ctask-hdr, - sizeof(struct iscsi_hdr)); + ctask-hdr_len); if (conn-hdrdgst_en) iscsi_hdr_digest(conn, tcp_ctask-headbuf, -(u8*)tcp_ctask-hdrext); +iscsi_next_hdr(ctask)); tcp_ctask-xmstate = ~XMSTATE_CMD_HDR_INIT; tcp_ctask-xmstate |= XMSTATE_CMD_HDR_XMIT; } @@ -2176,7 +2175,8 @@ iscsi_tcp_session_create(struct iscsi_transport *iscsit, struct iscsi_cmd_task *ctask = session-cmds[cmd_i]; struct iscsi_tcp_cmd_task *tcp_ctask = ctask-dd_data; - ctask-hdr = tcp_ctask-hdr; + ctask-hdr = tcp_ctask-hdr.cmd_hdr; + ctask-hdr_max = sizeof(tcp_ctask-hdr) - ISCSI_DIGEST_SIZE; } for (cmd_i = 0; cmd_i session-mgmtpool_max; cmd_i++) { diff --git a/drivers/scsi/iscsi_tcp.h b/drivers/scsi/iscsi_tcp.h index f1c5411..eb3784f 100644 --- a/drivers/scsi/iscsi_tcp.h +++ b/drivers/scsi/iscsi_tcp.h @@ -41,7 +41,6 @@ #define XMSTATE_IMM_HDR_INIT 0x1000 #define XMSTATE_SOL_HDR_INIT 0x2000 -#define ISCSI_PAD_LEN 4 #define ISCSI_SG_TABLESIZE SG_ALL #define ISCSI_TCP_MAX_CMD_LEN 16 @@ -130,14 +129,14 @@ struct iscsi_buf {
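The changelog for patch 04/24 describes a small protocol for building up PDU headers: ask for the next free position, write the header there, then account for its (padded) length against a fixed maximum. The sketch below models that protocol in user space. The names padding, next_hdr, add_hdr and struct hdr_space are stand-ins chosen for this example, and the specific error codes returned are assumptions; only the overall shape follows the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PAD_LEN 4u	/* iSCSI pads to 4-byte words (ISCSI_PAD_LEN) */

/* Bytes of padding needed to bring len up to a PAD_LEN multiple. */
static unsigned int padding(unsigned int len)
{
	len &= PAD_LEN - 1;
	return len ? PAD_LEN - len : 0;
}

/* Simplified stand-in for the task's header bookkeeping. */
struct hdr_space {
	unsigned char *hdr;	/* start of header storage */
	unsigned int hdr_max;	/* bytes of storage available */
	unsigned int hdr_len;	/* bytes already used */
};

/* Next free byte in the header storage. */
static void *next_hdr(struct hdr_space *s)
{
	return s->hdr + s->hdr_len;
}

/*
 * Reserve len more bytes; len must already be padded. The error codes
 * are this sketch's choice, not taken from the patch.
 */
static int add_hdr(struct hdr_space *s, unsigned int len)
{
	if (padding(len) != 0)
		return -EINVAL;
	if (s->hdr_len + len > s->hdr_max)
		return -ENOMEM;
	s->hdr_len += len;
	return 0;
}
```

A caller would write the new header at next_hdr(), then call add_hdr() with its padded size. Once all headers are in place, next_hdr() also gives the address for the header digest, which is why the patch reserves ISCSI_DIGEST_SIZE bytes beyond hdr_max in the iscsi_tcp storage.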
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Mark Lord wrote:
> (resending with corrected email address for Jens)
>
> Jens,
>
> I'm experimenting here with trying to generate large I/O through libata,
> and not having much luck.
>
> The limit seems to be the number of hardware PRD (SG) entries permitted
> by the driver (libata:ata_piix), which is 128 by default.
>
> The problem is, the block layer *never* sends an SG entry larger than
> 8192 bytes, and even that size is exceptionally rare. Nearly all I/O
> segments are 4096 bytes, so I never see a single I/O larger than 512KB
> (128 * 4096).
>
> If I patch various parts of block and SCSI, this limit doesn't budge,
> but when I change the hardware PRD limit in libata, it scales by exactly
> whatever I set the new value to.
>
> This tells me that adjacent I/O segments are not being combined.
>
> I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1)
> should result in adjacent single pages being combined into larger
> physical segments?
>
> This is x86-32 with latest 2.6.24-rc*.
> I'll re-test on older kernels next.
...

Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for
libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments.

???
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
> Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for
> libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments.

Just a suspicion ... could this be slab vs slub? ie check your configs
are the same / similar between the two kernels.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Matthew Wilcox wrote:
> On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
> > Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for
> > libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
>
> Just a suspicion ... could this be slab vs slub? ie check your configs
> are the same / similar between the two kernels.
..

Mmmm.. a good thought, that one. But I just rechecked, and both have
CONFIG_SLAB=y

My guess is that something got changed around when Jens reworked the
block layer for 2.6.24. I'm going to dig around in there now.
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13 2007, Mark Lord wrote:
> Matthew Wilcox wrote:
> > On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
> > > Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for
> > > libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
> >
> > Just a suspicion ... could this be slab vs slub? ie check your configs
> > are the same / similar between the two kernels.
> ..
>
> Mmmm.. a good thought, that one. But I just rechecked, and both have
> CONFIG_SLAB=y
>
> My guess is that something got changed around when Jens reworked the
> block layer for 2.6.24. I'm going to dig around in there now.

I didn't rework the block layer for 2.6.24 :-). The core block layer
changes since 2.6.23 are:

- Support for empty barriers. Not a likely candidate.
- Shared tag queue fixes. Totally unlikely.
- sg chaining support. Not likely.
- The bio changes from Neil. Of the bunch, the most likely suspects in
  this area, since it changes some of the code involved with merges and
  blk_rq_map_sg().
- Lots of simple stuff, again very unlikely.

Anyway, it sounds odd for this to be a block layer problem if you do see
occasional segments being merged. So it sounds more like the input data
having changed.

Why not just bisect it?

-- 
Jens Axboe
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Jens Axboe wrote:
> On Thu, Dec 13 2007, Mark Lord wrote:
>> Matthew Wilcox wrote:
>>> On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
>>>> Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata,
>>>> but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
>>>
>>> Just a suspicion ... could this be slab vs slub? ie check your configs
>>> are the same / similar between the two kernels.
>> ..
>>
>> Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y
>>
>> My guess is that something got changed around when Jens
>> reworked the block layer for 2.6.24.
>> I'm going to dig around in there now.
>
> I didn't rework the block layer for 2.6.24 :-). The core block layer
> changes since 2.6.23 are:
>
> - Support for empty barriers. Not a likely candidate.
> - Shared tag queue fixes. Totally unlikely.
> - sg chaining support. Not likely.
> - The bio changes from Neil. Of the bunch, the most likely suspects in
>   this area, since it changes some of the code involved with merges and
>   blk_rq_map_sg().
> - Lots of simple stuff, again very unlikely.
>
> Anyway, it sounds odd for this to be a block layer problem if you do see
> occasional segments being merged. So it sounds more like the input data
> having changed.
>
> Why not just bisect it?
..

Because the early 2.6.24 series failed to boot on this machine
due to bugs in the block layer -- so the code that caused this regression
is probably in the stuff from before the kernels became usable here.

Cheers
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Mark Lord wrote:
> Jens Axboe wrote:
>> On Thu, Dec 13 2007, Mark Lord wrote:
>>> Matthew Wilcox wrote:
>>>> On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
>>>>> Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata,
>>>>> but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
>>>>
>>>> Just a suspicion ... could this be slab vs slub? ie check your configs
>>>> are the same / similar between the two kernels.
>>> ..
>>>
>>> Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y
>>>
>>> My guess is that something got changed around when Jens
>>> reworked the block layer for 2.6.24.
>>> I'm going to dig around in there now.
>>
>> I didn't rework the block layer for 2.6.24 :-). The core block layer
>> changes since 2.6.23 are:
>>
>> - Support for empty barriers. Not a likely candidate.
>> - Shared tag queue fixes. Totally unlikely.
>> - sg chaining support. Not likely.
>> - The bio changes from Neil. Of the bunch, the most likely suspects in
>>   this area, since it changes some of the code involved with merges and
>>   blk_rq_map_sg().
>> - Lots of simple stuff, again very unlikely.
>>
>> Anyway, it sounds odd for this to be a block layer problem if you do see
>> occasional segments being merged. So it sounds more like the input data
>> having changed.
>>
>> Why not just bisect it?
> ..
>
> Because the early 2.6.24 series failed to boot on this machine
> due to bugs in the block layer -- so the code that caused this regression
> is probably in the stuff from before the kernels became usable here.
..

That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to
the first couple of -rc* ones) failed here because of incompatibilities
between the block/bio changes and libata.

That's better, I think!

Cheers
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13 2007, Mark Lord wrote: Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Matthew Wilcox wrote: On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote: Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments. Just a suspicion ... could this be slab vs slub? ie check your configs are the same / similar between the two kernels. .. Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y My guess is that something got changed around when Jens reworked the block layer for 2.6.24. I'm going to dig around in there now. I didn't rework the block layer for 2.6.24 :-). The core block layer changes since 2.6.23 are: - Support for empty barriers. Not a likely candidate. - Shared tag queue fixes. Totally unlikely. - sg chaining support. Not likely. - The bio changes from Neil. Of the bunch, the most likely suspects in this area, since it changes some of the code involved with merges and blk_rq_map_sg(). - Lots of simple stuff, again very unlikely. Anyway, it sounds odd for this to be a block layer problem if you do see occasional segments being merged. So it sounds more like the input data having changed. Why not just bisect it? .. Because the early 2.6.24 series failed to boot on this machine due to bugs in the block layer -- so the code that caused this regression is probably in the stuff from before the kernels became usable here. .. That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to the first couple of -rc* ones failed here because of incompatibilities between the block/bio changes and libata. That's better, I think! No worries, I didn't pick it up as harsh just as an odd conclusion :-) If I were you, I'd just start from the first -rc that booted for you. If THAT has the bug, then we'll think of something else. If you don't get anywhere, I can run some tests tomorrow and see if I can reproduce it here. 
-- Jens Axboe
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Matthew Wilcox wrote: On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote: Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments. Just a suspicion ... could this be slab vs slub? ie check your configs are the same / similar between the two kernels. .. Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y My guess is that something got changed around when Jens reworked the block layer for 2.6.24. I'm going to dig around in there now. I didn't rework the block layer for 2.6.24 :-). The core block layer changes since 2.6.23 are: - Support for empty barriers. Not a likely candidate. - Shared tag queue fixes. Totally unlikely. - sg chaining support. Not likely. - The bio changes from Neil. Of the bunch, the most likely suspects in this area, since it changes some of the code involved with merges and blk_rq_map_sg(). - Lots of simple stuff, again very unlikely. Anyway, it sounds odd for this to be a block layer problem if you do see occasional segments being merged. So it sounds more like the input data having changed. Why not just bisect it? .. Because the early 2.6.24 series failed to boot on this machine due to bugs in the block layer -- so the code that caused this regression is probably in the stuff from before the kernels became usable here. .. That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to the first couple of -rc* ones failed here because of incompatibilities between the block/bio changes and libata. That's better, I think! No worries, I didn't pick it up as harsh just as an odd conclusion :-) If I were you, I'd just start from the first -rc that booted for you. If THAT has the bug, then we'll think of something else. 
>>> If you don't get anywhere, I can run some tests tomorrow and see if I
>>> can reproduce it here.
>> ..
>>
>> I believe that *anyone* can reproduce it, since it's broken long before
>> the requests ever get to SCSI or libata. Which also means that *anyone*
>> who wants to can bisect it, as well.
>>
>> I don't do bisects.
>
> It was just a suggestion on how to narrow it down, do as you see fit.
>
>> But I will dig a bit more and see if I can find the culprit.
>
> Sure, I'll dig around as well.
..

I wonder if it's 9dfa52831e96194b8649613e3131baa2c109f7dc:
Merge blk_recount_segments into blk_recalc_rq_segments ?

That particular commit does some rather innocent code-shuffling,
but also introduces a couple of new "if (nr_hw_segs == 1" conditions
that were not there before.

Okay git experts: how do I pull out a kernel at the point of this exact commit ?

Thanks!
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13 2007, Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Matthew Wilcox wrote: On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote: Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments. Just a suspicion ... could this be slab vs slub? ie check your configs are the same / similar between the two kernels. .. Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y My guess is that something got changed around when Jens reworked the block layer for 2.6.24. I'm going to dig around in there now. I didn't rework the block layer for 2.6.24 :-). The core block layer changes since 2.6.23 are: - Support for empty barriers. Not a likely candidate. - Shared tag queue fixes. Totally unlikely. - sg chaining support. Not likely. - The bio changes from Neil. Of the bunch, the most likely suspects in this area, since it changes some of the code involved with merges and blk_rq_map_sg(). - Lots of simple stuff, again very unlikely. Anyway, it sounds odd for this to be a block layer problem if you do see occasional segments being merged. So it sounds more like the input data having changed. Why not just bisect it? .. Because the early 2.6.24 series failed to boot on this machine due to bugs in the block layer -- so the code that caused this regression is probably in the stuff from before the kernels became usable here. .. That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to the first couple of -rc* ones failed here because of incompatibilities between the block/bio changes and libata. That's better, I think! No worries, I didn't pick it up as harsh just as an odd conclusion :-) If I were you, I'd just start from the first -rc that booted for you. If THAT has the bug, then we'll think of something else. 
>>> If you don't get anywhere, I can run some tests tomorrow and see if I
>>> can reproduce it here.
>> ..
>>
>> I believe that *anyone* can reproduce it, since it's broken long before
>> the requests ever get to SCSI or libata. Which also means that *anyone*
>> who wants to can bisect it, as well.
>>
>> I don't do bisects.
>
> It was just a suggestion on how to narrow it down, do as you see fit.
>
>> But I will dig a bit more and see if I can find the culprit.
>
> Sure, I'll dig around as well.

Just tried something simple. I only see one 12kb segment so far, so not
a lot by any stretch. I also DONT see any missed merges signs, so it
would appear that the pages in the request are simply not contiguous
physically.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index e30b1a4..1e34b6f 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 				goto new_segment;
 
 			sg->length += nbytes;
+			if (sg->length > 8192)
+				printk("sg_len=%d\n", sg->length);
 		} else {
 new_segment:
 			if (!sg)
@@ -1349,6 +1351,8 @@ new_segment:
 				sg = sg_next(sg);
 			}
 
+			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
+				printk("missed merge\n");
 			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
 			nsegs++;
 		}

--
Jens Axboe
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Matthew Wilcox wrote: On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote: Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments. Just a suspicion ... could this be slab vs slub? ie check your configs are the same / similar between the two kernels. .. Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y My guess is that something got changed around when Jens reworked the block layer for 2.6.24. I'm going to dig around in there now. I didn't rework the block layer for 2.6.24 :-). The core block layer changes since 2.6.23 are: - Support for empty barriers. Not a likely candidate. - Shared tag queue fixes. Totally unlikely. - sg chaining support. Not likely. - The bio changes from Neil. Of the bunch, the most likely suspects in this area, since it changes some of the code involved with merges and blk_rq_map_sg(). - Lots of simple stuff, again very unlikely. Anyway, it sounds odd for this to be a block layer problem if you do see occasional segments being merged. So it sounds more like the input data having changed. Why not just bisect it? .. Because the early 2.6.24 series failed to boot on this machine due to bugs in the block layer -- so the code that caused this regression is probably in the stuff from before the kernels became usable here. .. That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to the first couple of -rc* ones failed here because of incompatibilities between the block/bio changes and libata. That's better, I think! No worries, I didn't pick it up as harsh just as an odd conclusion :-) If I were you, I'd just start from the first -rc that booted for you. 
>>>> If THAT has the bug, then we'll think of something else. If you don't
>>>> get anywhere, I can run some tests tomorrow and see if I can reproduce
>>>> it here.
>>> ..
>>>
>>> I believe that *anyone* can reproduce it, since it's broken long before
>>> the requests ever get to SCSI or libata. Which also means that *anyone*
>>> who wants to can bisect it, as well.
>>>
>>> I don't do bisects.
>>
>> It was just a suggestion on how to narrow it down, do as you see fit.
>>
>>> But I will dig a bit more and see if I can find the culprit.
>>
>> Sure, I'll dig around as well.
> ..
>
> I wonder if it's 9dfa52831e96194b8649613e3131baa2c109f7dc:
> Merge blk_recount_segments into blk_recalc_rq_segments ?
>
> That particular commit does some rather innocent code-shuffling,
> but also introduces a couple of new "if (nr_hw_segs == 1" conditions
> that were not there before.

You can try and revert it of course, but I think you are looking at the
wrong bits. If the segment counts were totally off, you'd never be
anywhere close to reaching the set limit. Your problem seems to be
missed contig segment merges.

> Okay git experts: how do I pull out a kernel at the point of this exact commit ?

Dummy approach - git log and grep for
9dfa52831e96194b8649613e3131baa2c109f7dc, then see what commit is before
that. Then do a git checkout <commit>.

--
Jens Axboe
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Jens Axboe wrote: On Thu, Dec 13 2007, Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Matthew Wilcox wrote: On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote: Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments. Just a suspicion ... could this be slab vs slub? ie check your configs are the same / similar between the two kernels. .. Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y My guess is that something got changed around when Jens reworked the block layer for 2.6.24. I'm going to dig around in there now. I didn't rework the block layer for 2.6.24 :-). The core block layer changes since 2.6.23 are: - Support for empty barriers. Not a likely candidate. - Shared tag queue fixes. Totally unlikely. - sg chaining support. Not likely. - The bio changes from Neil. Of the bunch, the most likely suspects in this area, since it changes some of the code involved with merges and blk_rq_map_sg(). - Lots of simple stuff, again very unlikely. Anyway, it sounds odd for this to be a block layer problem if you do see occasional segments being merged. So it sounds more like the input data having changed. Why not just bisect it? .. Because the early 2.6.24 series failed to boot on this machine due to bugs in the block layer -- so the code that caused this regression is probably in the stuff from before the kernels became usable here. .. That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to the first couple of -rc* ones failed here because of incompatibilities between the block/bio changes and libata. That's better, I think! No worries, I didn't pick it up as harsh just as an odd conclusion :-) If I were you, I'd just start from the first -rc that booted for you. 
>>>> If THAT has the bug, then we'll think of something else. If you don't
>>>> get anywhere, I can run some tests tomorrow and see if I can reproduce
>>>> it here.
>>> ..
>>>
>>> I believe that *anyone* can reproduce it, since it's broken long before
>>> the requests ever get to SCSI or libata. Which also means that *anyone*
>>> who wants to can bisect it, as well.
>>>
>>> I don't do bisects.
>>
>> It was just a suggestion on how to narrow it down, do as you see fit.
>>
>>> But I will dig a bit more and see if I can find the culprit.
>>
>> Sure, I'll dig around as well.
>
> Just tried something simple. I only see one 12kb segment so far, so not
> a lot by any stretch. I also DONT see any missed merges signs, so it
> would appear that the pages in the request are simply not contiguous
> physically.
>
> diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
> index e30b1a4..1e34b6f 100644
> --- a/block/ll_rw_blk.c
> +++ b/block/ll_rw_blk.c
> @@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
>  				goto new_segment;
>  
>  			sg->length += nbytes;
> +			if (sg->length > 8192)
> +				printk("sg_len=%d\n", sg->length);
>  		} else {
>  new_segment:
>  			if (!sg)
> @@ -1349,6 +1351,8 @@ new_segment:
>  				sg = sg_next(sg);
>  			}
>  
> +			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
> +				printk("missed merge\n");
>  			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
>  			nsegs++;
>  		}
..

Yeah, the first part is similar to my own hack.

For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
That *really* should end up using contiguous pages on most systems.

I figured out the git thing, and am now building some in-between kernels to try.

Cheers
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Matthew Wilcox wrote: On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote: Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments. Just a suspicion ... could this be slab vs slub? ie check your configs are the same / similar between the two kernels. .. Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y My guess is that something got changed around when Jens reworked the block layer for 2.6.24. I'm going to dig around in there now. I didn't rework the block layer for 2.6.24 :-). The core block layer changes since 2.6.23 are: - Support for empty barriers. Not a likely candidate. - Shared tag queue fixes. Totally unlikely. - sg chaining support. Not likely. - The bio changes from Neil. Of the bunch, the most likely suspects in this area, since it changes some of the code involved with merges and blk_rq_map_sg(). - Lots of simple stuff, again very unlikely. Anyway, it sounds odd for this to be a block layer problem if you do see occasional segments being merged. So it sounds more like the input data having changed. Why not just bisect it? .. Because the early 2.6.24 series failed to boot on this machine due to bugs in the block layer -- so the code that caused this regression is probably in the stuff from before the kernels became usable here. .. That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to the first couple of -rc* ones failed here because of incompatibilities between the block/bio changes and libata. That's better, I think! No worries, I didn't pick it up as harsh just as an odd conclusion :-) If I were you, I'd just start from the first -rc that booted for you. 
>>>>> If THAT has the bug, then we'll think of something else. If you don't
>>>>> get anywhere, I can run some tests tomorrow and see if I can reproduce
>>>>> it here.
>>>> ..
>>>>
>>>> I believe that *anyone* can reproduce it, since it's broken long before
>>>> the requests ever get to SCSI or libata. Which also means that *anyone*
>>>> who wants to can bisect it, as well.
>>>>
>>>> I don't do bisects.
>>>
>>> It was just a suggestion on how to narrow it down, do as you see fit.
>>>
>>>> But I will dig a bit more and see if I can find the culprit.
>>>
>>> Sure, I'll dig around as well.
>>
>> Just tried something simple. I only see one 12kb segment so far, so not
>> a lot by any stretch. I also DONT see any missed merges signs, so it
>> would appear that the pages in the request are simply not contiguous
>> physically.
>>
>> diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
>> index e30b1a4..1e34b6f 100644
>> --- a/block/ll_rw_blk.c
>> +++ b/block/ll_rw_blk.c
>> @@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
>>  				goto new_segment;
>>  
>>  			sg->length += nbytes;
>> +			if (sg->length > 8192)
>> +				printk("sg_len=%d\n", sg->length);
>>  		} else {
>>  new_segment:
>>  			if (!sg)
>> @@ -1349,6 +1351,8 @@ new_segment:
>>  				sg = sg_next(sg);
>>  			}
>>  
>> +			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
>> +				printk("missed merge\n");
>>  			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
>>  			nsegs++;
>>  		}
> ..
>
> Yeah, the first part is similar to my own hack.
>
> For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
> That *really* should end up using contiguous pages on most systems.
>
> I figured out the git thing, and am now building some in-between kernels to try.

OK, it's a vm issue, I have tens of thousands of backward pages after a
boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
reverse. So it looks like that bug got reintroduced.

--
Jens Axboe
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Matthew Wilcox wrote: On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote: Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments. Just a suspicion ... could this be slab vs slub? ie check your configs are the same / similar between the two kernels. .. Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y My guess is that something got changed around when Jens reworked the block layer for 2.6.24. I'm going to dig around in there now. I didn't rework the block layer for 2.6.24 :-). The core block layer changes since 2.6.23 are: - Support for empty barriers. Not a likely candidate. - Shared tag queue fixes. Totally unlikely. - sg chaining support. Not likely. - The bio changes from Neil. Of the bunch, the most likely suspects in this area, since it changes some of the code involved with merges and blk_rq_map_sg(). - Lots of simple stuff, again very unlikely. Anyway, it sounds odd for this to be a block layer problem if you do see occasional segments being merged. So it sounds more like the input data having changed. Why not just bisect it? .. Because the early 2.6.24 series failed to boot on this machine due to bugs in the block layer -- so the code that caused this regression is probably in the stuff from before the kernels became usable here. .. That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to the first couple of -rc* ones failed here because of incompatibilities between the block/bio changes and libata. That's better, I think! 
>>>>>> No worries, I didn't pick it up as harsh just as an odd conclusion :-)
>>>>>>
>>>>>> If I were you, I'd just start from the first -rc that booted for you.
>>>>>> If THAT has the bug, then we'll think of something else. If you don't
>>>>>> get anywhere, I can run some tests tomorrow and see if I can reproduce
>>>>>> it here.
>>>>> ..
>>>>>
>>>>> I believe that *anyone* can reproduce it, since it's broken long before
>>>>> the requests ever get to SCSI or libata. Which also means that *anyone*
>>>>> who wants to can bisect it, as well.
>>>>>
>>>>> I don't do bisects.
>>>>
>>>> It was just a suggestion on how to narrow it down, do as you see fit.
>>>>
>>>>> But I will dig a bit more and see if I can find the culprit.
>>>>
>>>> Sure, I'll dig around as well.
>>>
>>> Just tried something simple. I only see one 12kb segment so far, so not
>>> a lot by any stretch. I also DONT see any missed merges signs, so it
>>> would appear that the pages in the request are simply not contiguous
>>> physically.
>>>
>>> diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
>>> index e30b1a4..1e34b6f 100644
>>> --- a/block/ll_rw_blk.c
>>> +++ b/block/ll_rw_blk.c
>>> @@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
>>>  				goto new_segment;
>>>  
>>>  			sg->length += nbytes;
>>> +			if (sg->length > 8192)
>>> +				printk("sg_len=%d\n", sg->length);
>>>  		} else {
>>>  new_segment:
>>>  			if (!sg)
>>> @@ -1349,6 +1351,8 @@ new_segment:
>>>  				sg = sg_next(sg);
>>>  			}
>>>  
>>> +			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
>>> +				printk("missed merge\n");
>>>  			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
>>>  			nsegs++;
>>>  		}
>> ..
>>
>> Yeah, the first part is similar to my own hack.
>>
>> For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
>> That *really* should end up using contiguous pages on most systems.
>>
>> I figured out the git thing, and am now building some in-between kernels to try.
>
> OK, it's a vm issue, I have tens of thousands of backward pages after a
> boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
> reverse. So it looks like that bug got reintroduced.
...

Mmm.. shouldn't one of the front- or back- merge logics work for either order?
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Mark Lord wrote:
> Jens Axboe wrote:
> ..
>> OK, it's a vm issue, I have tens of thousands of backward pages after a
>> boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
>> reverse. So it looks like that bug got reintroduced.
> ...
>
> Mmm.. shouldn't one of the front- or back- merge logics work for either order?
..

Belay that thought. I'm slowly remembering how this is supposed to work now. :)
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Mark Lord wrote: Jens Axboe wrote: On Thu, Dec 13 2007, Mark Lord wrote: Matthew Wilcox wrote: On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote: Problem confirmed. 2.6.23.8 regularly generates segments up to 64KB for libata, but 2.6.24 uses only 4KB segments and a *few* 8KB segments. Just a suspicion ... could this be slab vs slub? ie check your configs are the same / similar between the two kernels. .. Mmmm.. a good thought, that one. But I just rechecked, and both have CONFIG_SLAB=y My guess is that something got changed around when Jens reworked the block layer for 2.6.24. I'm going to dig around in there now. I didn't rework the block layer for 2.6.24 :-). The core block layer changes since 2.6.23 are: - Support for empty barriers. Not a likely candidate. - Shared tag queue fixes. Totally unlikely. - sg chaining support. Not likely. - The bio changes from Neil. Of the bunch, the most likely suspects in this area, since it changes some of the code involved with merges and blk_rq_map_sg(). - Lots of simple stuff, again very unlikely. Anyway, it sounds odd for this to be a block layer problem if you do see occasional segments being merged. So it sounds more like the input data having changed. Why not just bisect it? .. Because the early 2.6.24 series failed to boot on this machine due to bugs in the block layer -- so the code that caused this regression is probably in the stuff from before the kernels became usable here. .. That sounds more harsh than intended -- the earlier 2.6.24 kernels (up to the first couple of -rc* ones failed here because of incompatibilities between the block/bio changes and libata. That's better, I think! 
>>>>>>> No worries, I didn't pick it up as harsh just as an odd conclusion :-)
>>>>>>>
>>>>>>> If I were you, I'd just start from the first -rc that booted for you.
>>>>>>> If THAT has the bug, then we'll think of something else. If you don't
>>>>>>> get anywhere, I can run some tests tomorrow and see if I can reproduce
>>>>>>> it here.
>>>>>> ..
>>>>>>
>>>>>> I believe that *anyone* can reproduce it, since it's broken long before
>>>>>> the requests ever get to SCSI or libata. Which also means that *anyone*
>>>>>> who wants to can bisect it, as well.
>>>>>>
>>>>>> I don't do bisects.
>>>>>
>>>>> It was just a suggestion on how to narrow it down, do as you see fit.
>>>>>
>>>>>> But I will dig a bit more and see if I can find the culprit.
>>>>>
>>>>> Sure, I'll dig around as well.
>>>>
>>>> Just tried something simple. I only see one 12kb segment so far, so not
>>>> a lot by any stretch. I also DONT see any missed merges signs, so it
>>>> would appear that the pages in the request are simply not contiguous
>>>> physically.
>>>>
>>>> diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
>>>> index e30b1a4..1e34b6f 100644
>>>> --- a/block/ll_rw_blk.c
>>>> +++ b/block/ll_rw_blk.c
>>>> @@ -1330,6 +1330,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
>>>>  				goto new_segment;
>>>>  
>>>>  			sg->length += nbytes;
>>>> +			if (sg->length > 8192)
>>>> +				printk("sg_len=%d\n", sg->length);
>>>>  		} else {
>>>>  new_segment:
>>>>  			if (!sg)
>>>> @@ -1349,6 +1351,8 @@ new_segment:
>>>>  				sg = sg_next(sg);
>>>>  			}
>>>>  
>>>> +			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
>>>> +				printk("missed merge\n");
>>>>  			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
>>>>  			nsegs++;
>>>>  		}
>>> ..
>>>
>>> Yeah, the first part is similar to my own hack.
>>>
>>> For testing, try dd if=/dev/sda of=/dev/null bs=4096k.
>>> That *really* should end up using contiguous pages on most systems.
>>>
>>> I figured out the git thing, and am now building some in-between kernels to try.
>>
>> OK, it's a vm issue, I have tens of thousands of backward pages after a
>> boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
>> reverse. So it looks like that bug got reintroduced.
> ...
>
> Mmm.. shouldn't one of the front- or back- merge logics work for either order?
I think you are misunderstanding the merging. The front/back bits are for contig on disk, this is sg segment merging. We can only join pieces that are contig in memory, otherwise the result would not be pretty :-) -- Jens Axboe - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
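The distinction drawn above (sg segment merging depends on memory contiguity, not on-disk contiguity) can be sketched in user space. This is only an illustration, not the kernel's blk_rq_map_sg(): the helper name count_sg_segments and the max_pages limit are assumptions made for the example, and pages are represented by bare page frame numbers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not kernel code): count the scatter-gather
 * segments needed for a list of page frame numbers. A page joins the
 * current segment only if it directly follows the previous page in
 * physical memory (pfn == prev + 1) and the segment has not yet
 * reached max_pages; otherwise a new segment is started.
 */
static size_t count_sg_segments(const unsigned long *pfns, size_t n,
                                size_t max_pages)
{
    size_t nsegs = 0, seg_pages = 0, i;

    assert(max_pages > 0);
    for (i = 0; i < n; i++) {
        if (i > 0 && pfns[i] == pfns[i - 1] + 1 && seg_pages < max_pages) {
            seg_pages++;        /* physically adjacent: extend segment */
        } else {
            nsegs++;            /* not adjacent (or segment full): new one */
            seg_pages = 1;
        }
    }
    return nsegs;
}
```

With ascending frames 100..103 this yields one segment; with the same frames handed out in descending order it yields four, which matches the 4KB-only segments reported earlier in the thread.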
Re: [PATCH 12/30] blk_end_request: changing ub (take 4)
On Wed, 12 Dec 2007 15:38:15 -0500 (EST), Kiyoshi Ueda [EMAIL PROTECTED] wrote:
> On Tue, 11 Dec 2007 15:48:03 -0800, Pete Zaitcev [EMAIL PROTECTED] wrote:
> > > -	end_that_request_first(rq, uptodate, rq->hard_nr_sectors);
> > > -	end_that_request_last(rq, uptodate);
> > > +	if (__blk_end_request(rq, error, blk_rq_bytes(rq)))
> > > +		BUG();
> >
> > My understanding was, blk_end_request() is the same thing, only takes the queue lock. But then, should I refactor ub so that it calls __blk_end_request if request function ends with an error and blk_end_request if the end-of-IO event is processed?
>
> I'm using __blk_end_request() here and I think it's sufficient, because:
>   o end_that_request_last() must be called with the queue lock held
>   o ub_end_rq() calls end_that_request_last() without taking the queue lock in itself. So the queue lock must have been taken outside ub_end_rq().
>
> But, if ub is calling end_that_request_last() without the queue lock, it is a bug in the original code and we should use blk_end_request() to fix that.

So, I have to rewrite ub to split the paths after all, right?

Let's do this then: I'll wait until your patch gets to Linus and then update it with the split. The reason is, I need the whole enchilada applied and I don't want to bother tracking iterations and all the little segments (of which you already have 30). Until then, ub will have a race by using your original small patch.

Best wishes,
-- Pete
VM allocates pages in reverse order again
On Thu, Dec 13, 2007 at 09:09:59PM +0100, Jens Axboe wrote:
> > > diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
> > > index e30b1a4..1e34b6f 100644
> > > --- a/block/ll_rw_blk.c
> > > +++ b/block/ll_rw_blk.c
> > > @@ -1349,6 +1351,8 @@ new_segment:
> > >  				sg = sg_next(sg);
> > >  			}
> > >  
> > > +			if (bvprv && (page_address(bvprv->bv_page) + bvprv->bv_len == page_address(bvec->bv_page)))
> > > +				printk("missed merge\n");
> > >  			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
> > >  			nsegs++;
> > >  		}
> > ..
> > Yeah, the first part is similar to my own hack.
> >
> > For testing, try dd if=/dev/sda of=/dev/null bs=4096k. That *really* should end up using contiguous pages on most systems.
> >
> > I figured out the git thing, and am now building some in-between kernels to try.
>
> OK, it's a vm issue, I have tens of thousand backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.

Perhaps we should ask the -mm folks if they happen to have an idea what caused it ...

Background: we're seeing pages allocated in reverse order after boot. This causes IO performance problems on machines without IOMMUs as we can't merge pages when they're allocated in the wrong order. This is something that went wrong between 2.6.23 and 2.6.24-rc5.

Bill Irwin had a patch that fixed this; it was merged months ago, but the effects of it seem to have been undone.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step."
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:

> OK, it's a vm issue,

cc linux-mm and probable culprit.

> I have tens of thousand backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.

Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.

I guess we broke that, quite possibly in Mel's page allocator rework.

It would help if you could provide us with a simple recipe for demonstrating this problem, please.
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
> On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:
>
> > OK, it's a vm issue,
>
> cc linux-mm and probable culprit.
>
> > I have tens of thousand backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.
>
> Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.
>
> I guess we broke that, quite possibly in Mel's page allocator rework.
>
> It would help if you could provide us with a simple recipe for demonstrating this problem, please.

The simple way seems to be to malloc a large area, touch every page and then look at the physical pages assigned ... they now mostly seem to be descending in physical address.

James
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, Dec 13 2007, Andrew Morton wrote:
> On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:
>
> > OK, it's a vm issue,
>
> cc linux-mm and probable culprit.
>
> > I have tens of thousand backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.
>
> Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.
>
> I guess we broke that, quite possibly in Mel's page allocator rework.
>
> It would help if you could provide us with a simple recipe for demonstrating this problem, please.

Basically anything involving IO :-). A boot here showed a handful of good merges, and probably in the order of 100,000 descending allocations. A kernel make is a fine test as well.

Something like the below should work fine - if you see oodles of these while basically doing any type of IO, then you are screwed.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index e30b1a4..8ce3fcc 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1349,6 +1349,10 @@ new_segment:
 				sg = sg_next(sg);
 			}
 
+			if (bvprv) {
+				if (page_address(bvec->bv_page) + PAGE_SIZE == page_address(bvprv->bv_page) && printk_ratelimit())
+					printk("page alloc order backwards\n");
+			}
 			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
 			nsegs++;
 		}

-- 
Jens Axboe
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Andrew Morton wrote:
> On Thu, 13 Dec 2007 17:15:06 -0500 James Bottomley [EMAIL PROTECTED] wrote:
>
> > On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
> > > On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:
> > >
> > > > OK, it's a vm issue,
> > >
> > > cc linux-mm and probable culprit.
> > >
> > > > I have tens of thousand backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.
> > >
> > > Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.
> > >
> > > I guess we broke that, quite possibly in Mel's page allocator rework.
> > >
> > > It would help if you could provide us with a simple recipe for demonstrating this problem, please.
> >
> > The simple way seems to be to malloc a large area, touch every page and then look at the physical pages assigned ... they now mostly seem to be descending in physical address.
>
> OIC. -mm's /proc/pid/pagemap can be used to get the pfn's...
..

I'm actually running the treadmill right now (have been for many hours, actually) to bisect it to a specific commit.

Thought I was almost done, and then noticed that git-bisect doesn't keep the Makefile VERSION lines the same, so I was actually running the wrong kernel after the first few times.. duh.

Wrote a script to fix it now.

-ml
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, 13 Dec 2007 17:15:06 -0500 James Bottomley [EMAIL PROTECTED] wrote:

> On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
> > On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:
> >
> > > OK, it's a vm issue,
> >
> > cc linux-mm and probable culprit.
> >
> > > I have tens of thousand backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.
> >
> > Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.
> >
> > I guess we broke that, quite possibly in Mel's page allocator rework.
> >
> > It would help if you could provide us with a simple recipe for demonstrating this problem, please.
>
> The simple way seems to be to malloc a large area, touch every page and then look at the physical pages assigned ... they now mostly seem to be descending in physical address.

OIC. -mm's /proc/pid/pagemap can be used to get the pfn's...
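The recipe discussed above (touch every page of a large allocation, then look at the frame numbers) could be checked along these lines. This is a hedged sketch: it assumes the pagemap entry layout documented for current mainline kernels (one 64-bit entry per virtual page, bit 63 = page present, bits 0-54 = PFN), which may differ from the -mm interface of the time, and the helper names are invented for the example. Reading the entries from /proc/self/pagemap is left out; current kernels hide PFNs from unprivileged readers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of one pagemap entry (current mainline documentation). */
#define PM_PRESENT  (UINT64_C(1) << 63)          /* page is in RAM */
#define PM_PFN_MASK ((UINT64_C(1) << 55) - 1)    /* bits 0-54: PFN */

/* Nonzero if the entry describes a page that is present in RAM. */
static int pagemap_present(uint64_t entry)
{
    return (entry & PM_PRESENT) != 0;
}

/* Page frame number stored in the entry (meaningful only if present). */
static uint64_t pagemap_pfn(uint64_t entry)
{
    return entry & PM_PFN_MASK;
}

/*
 * Hypothetical helper: given the pagemap entries of consecutive virtual
 * pages, count how often a page sits exactly one frame *below* its
 * predecessor, i.e. the "backwards" allocation pattern from the thread.
 */
static size_t count_descending(const uint64_t *entries, size_t n)
{
    size_t count = 0, i;

    for (i = 1; i < n; i++) {
        if (pagemap_present(entries[i - 1]) && pagemap_present(entries[i]) &&
            pagemap_pfn(entries[i]) + 1 == pagemap_pfn(entries[i - 1]))
            count++;
    }
    return count;
}
```

A high ratio of descending pairs to total pages in a freshly touched buffer would be consistent with the behaviour reported here.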
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Mark Lord wrote:
> Andrew Morton wrote:
> > On Thu, 13 Dec 2007 17:15:06 -0500 James Bottomley [EMAIL PROTECTED] wrote:
> >
> > > On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
> > > > On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:
> > > >
> > > > > OK, it's a vm issue,
> > > >
> > > > cc linux-mm and probable culprit.
> > > >
> > > > > I have tens of thousand backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.
> > > >
> > > > Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.
> > > >
> > > > I guess we broke that, quite possibly in Mel's page allocator rework.
> > > >
> > > > It would help if you could provide us with a simple recipe for demonstrating this problem, please.
> > >
> > > The simple way seems to be to malloc a large area, touch every page and then look at the physical pages assigned ... they now mostly seem to be descending in physical address.
> >
> > OIC. -mm's /proc/pid/pagemap can be used to get the pfn's...
> ..
>
> I'm actually running the treadmill right now (have been for many hours, actually) to bisect it to a specific commit.
>
> Thought I was almost done, and then noticed that git-bisect doesn't keep the Makefile VERSION lines the same, so I was actually running the wrong kernel after the first few times.. duh.
>
> Wrote a script to fix it now.
..

Well, that was a waste of three hours.

Somebody else can try it now.
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Mark Lord wrote:
> Mark Lord wrote:
> > Andrew Morton wrote:
> > > On Thu, 13 Dec 2007 17:15:06 -0500 James Bottomley [EMAIL PROTECTED] wrote:
> > >
> > > > On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
> > > > > On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:
> > > > >
> > > > > > OK, it's a vm issue,
> > > > >
> > > > > cc linux-mm and probable culprit.
> > > > >
> > > > > > I have tens of thousand backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.
> > > > >
> > > > > Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.
> > > > >
> > > > > I guess we broke that, quite possibly in Mel's page allocator rework.
> > > > >
> > > > > It would help if you could provide us with a simple recipe for demonstrating this problem, please.
> > > >
> > > > The simple way seems to be to malloc a large area, touch every page and then look at the physical pages assigned ... they now mostly seem to be descending in physical address.
> > >
> > > OIC. -mm's /proc/pid/pagemap can be used to get the pfn's...
> > ..
> >
> > I'm actually running the treadmill right now (have been for many hours, actually) to bisect it to a specific commit.
> >
> > Thought I was almost done, and then noticed that git-bisect doesn't keep the Makefile VERSION lines the same, so I was actually running the wrong kernel after the first few times.. duh.
> >
> > Wrote a script to fix it now.
> ..
>
> Well, that was a waste of three hours.
..

Ahh.. it seems to be sensitive to one/both of these:

CONFIG_HIGHMEM64G=y with 4GB RAM: not so bad, frequently does 20KB - 48KB segments.
CONFIG_HIGHMEM4G=y with 2GB RAM: very severe, rarely does more than 8KB segments.
CONFIG_HIGHMEM4G=y with 3GB RAM: very severe, rarely does more than 8KB segments.

So if you want to reproduce this on a large memory machine, use mem=2GB for starters.

Still testing..
[patch 01/30] git-scsi-misc gdth fix
From: James Bottomley [EMAIL PROTECTED]

On Sun, 2007-10-14 at 12:21 -0700, Andrew Morton wrote:
> On Sun, 14 Oct 2007 22:45:47 +0400 Dave Milter [EMAIL PROTECTED] wrote:
>
> > I build linux-2.6.23-mm1 and try to boot it using qemu, and it crashed with trace like this:
> >
> > do_page_fault
> > error_code
> > lock_acquire
> > _spin_lock_irqsave
> > gdth_timeout
> > run_timer_softirq
> > __do_softirq
> > do_softirq
> >
> > I have screenshot, but have no idea, is it legal to include it, if I sent copy to lkml.
> >
> > config of kernel in attachment, I apply all three patches from hot-fixes.
>
> The screenshot is here: http://userweb.kernel.org/~akpm/crash.png
>
> It would appear that gdth_timeout() is passing a bad pointer into spin_lock_irqsave().

There's a bug in the gdth rework in that the instance can be deleted from the list before the actual timer is stopped. This can be worked around I think by the following patch; although we really should be stopping the timer from firing when the list goes empty.

James said:

  This is almost certainly the wrong fix for real hardware. Although it kills the timer when the list goes empty, nothing will ever restart it when the list fills again. Boaz, since you touched all of this, you get to fix it. The correct fix will be to control the timer along with the actual list instead of at entry/exit time. If you're not going to add this empty check to the timer routine, make sure you use del_timer_sync() before removing the last element from the list.

Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/gdth.c |    3 +++
 1 file changed, 3 insertions(+)

diff -puN drivers/scsi/gdth.c~git-scsi-misc-gdth-fix drivers/scsi/gdth.c
--- a/drivers/scsi/gdth.c~git-scsi-misc-gdth-fix
+++ a/drivers/scsi/gdth.c
@@ -3793,6 +3793,9 @@ static void gdth_timeout(ulong data)
 	gdth_ha_str *ha;
 	ulong flags;
 
+	if (list_empty(&gdth_instances))
+		return;
+
 	ha = list_first_entry(&gdth_instances, gdth_ha_str, list);
 	spin_lock_irqsave(&ha->smp_lock, flags);
 
_
[patch 04/30] kill warnings in mptbase.h on parisc64
From: Kyle McMartin [EMAIL PROTECTED]

Verified that all the necessary arches select the CONFIG_64BIT symbol. This also kills the warning (since it was using the 32-bit case) on parisc64 and mips64.

Signed-off-by: Kyle McMartin [EMAIL PROTECTED]
Cc: Moore, Eric Dean [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/message/fusion/mptbase.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN drivers/message/fusion/mptbase.h~kill-warnings-in-mptbaseh-on-parisc64 drivers/message/fusion/mptbase.h
--- a/drivers/message/fusion/mptbase.h~kill-warnings-in-mptbaseh-on-parisc64
+++ a/drivers/message/fusion/mptbase.h
@@ -922,7 +922,7 @@ extern struct proc_dir_entry	*mpt_proc_r
 /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/
 #endif		/* } __KERNEL__ */
 
-#if defined(__alpha__) || defined(__sparc_v9__) || defined(__ia64__) || defined(__x86_64__) || defined(__powerpc__)
+#ifdef CONFIG_64BIT
 #define CAST_U32_TO_PTR(x)	((void *)(u64)x)
 #define CAST_PTR_TO_U32(x)	((u32)(u64)x)
 #else
_
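Why the 64-bit branch matters can be shown with a small user-space analogue. This is a sketch under assumptions: the UINTPTR_MAX test stands in for CONFIG_64BIT (which only exists in a kernel build), the fixed-width types replace the kernel's u32/u64, and the 32-bit branch is a guess at what the #else side looks like, since the patch does not show it.

```c
#include <stdint.h>

/*
 * User-space stand-in for the kernel's CONFIG_64BIT: pointers wider
 * than 32 bits. Going through a 64-bit integer avoids the compiler's
 * "cast to pointer from integer of different size" warning there.
 */
#if UINTPTR_MAX > 0xffffffffu
#define CAST_U32_TO_PTR(x)  ((void *)(uint64_t)(x))
#define CAST_PTR_TO_U32(x)  ((uint32_t)(uint64_t)(x))
#else
/* Assumed 32-bit variant: pointer and u32 already have the same width. */
#define CAST_U32_TO_PTR(x)  ((void *)(x))
#define CAST_PTR_TO_U32(x)  ((uint32_t)(x))
#endif

/* Round-trip a 32-bit cookie through a pointer-sized slot and back. */
static uint32_t roundtrip_u32(uint32_t value)
{
    void *slot = CAST_U32_TO_PTR(value);

    return CAST_PTR_TO_U32(slot);
}
```

If the 32-bit pair were selected on a 64-bit build, as the commit message says happened on parisc64 and mips64, the direct casts would trigger size-mismatch warnings.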
[patch 02/30] mptbase: reset ioc initiator during PCI resume
From: Darrick J. Wong [EMAIL PROTECTED]

It appears that the LSI SAS 1064E chip needs to be reset after a suspend/resume cycle before the driver attempts further communications with the chip. Without this patch, resuming the chip results in this error message being printed repeatedly and no more disk I/O.

mptbase: ioc0: ERROR - Invalid IOC facts reply, msgLength=0 offsetof=6!

So far it seems to fix suspend/resume on all the MPT Fusion cards I have (SAS and U320 SCSI) but since I don't know the internals of that chip I can't say for sure if this is a proper fix.

Signed-off-by: Darrick J. Wong [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/message/fusion/mptbase.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff -puN drivers/message/fusion/mptbase.c~mptbase-reset-ioc-initiator-during-pci-resume drivers/message/fusion/mptbase.c
--- a/drivers/message/fusion/mptbase.c~mptbase-reset-ioc-initiator-during-pci-resume
+++ a/drivers/message/fusion/mptbase.c
@@ -1829,6 +1829,12 @@ mpt_resume(struct pci_dev *pdev)
 		(mpt_GetIocState(ioc, 1) >> MPI_IOC_STATE_SHIFT),
 		CHIPREG_READ32(&ioc->chip->Doorbell));
 
+	/* put ioc into READY_STATE */
+	if(SendIocReset(ioc, MPI_FUNCTION_IOC_MESSAGE_UNIT_RESET, CAN_SLEEP)) {
+		printk(MYIOC_s_ERR_FMT
+		"pci-resume: IOC msg unit reset failed!\n", ioc->name);
+	}
+
 	/* bring ioc to operational state */
 	if ((recovery_state = mpt_do_ioc_recovery(ioc, MPT_HOSTEVENT_IOC_RECOVER,
 	    CAN_SLEEP)) != 0) {
_
[patch 03/30] initio: fix conflict when loading driver
From: Alan Cox [EMAIL PROTECTED]

> I have a scanner connected to a Initio INI-950 SCSI card and I recently upgraded from SuSE 10.2 to 10.3. The new kernel doesn't see any of my devices. I get the following in /var/log/messages:
>
> ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 16
> initio: I/O port range 0x0 is busy.
> ACPI: PCI interrupt for device 0000:00:0a.0 disabled

Humm, not a collision - that's a bug in the driver updating. Looks like the changes I made and combined with Christoph's lost a line somewhere when I was merging it all. Try the following

Signed-off-by: Alan Cox [EMAIL PROTECTED]
Cc: Scott Simpson [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/initio.c |    1 +
 1 file changed, 1 insertion(+)

diff -puN drivers/scsi/initio.c~initio-fix-conflict-when-loading-driver drivers/scsi/initio.c
--- a/drivers/scsi/initio.c~initio-fix-conflict-when-loading-driver
+++ a/drivers/scsi/initio.c
@@ -2867,6 +2867,7 @@ static int initio_probe_one(struct pci_d
 	}
 	host = (struct initio_host *)shost->hostdata;
 	memset(host, 0, sizeof(struct initio_host));
+	host->addr = pci_resource_start(pdev, 0);
 
 	if (!request_region(host->addr, 256, "i91u")) {
 		printk(KERN_WARNING "initio: I/O port range 0x%x is busy.\n", host->addr);
_
[patch 14/30] aic94: fix section mismatches
From: Randy Dunlap [EMAIL PROTECTED]

Fix section mismatch warning:

WARNING: vmlinux.o(.init.text+0x23be6): Section mismatch: reference to .exit.text:asd_unmap_ha (between 'asd_pci_probe' and 'qla4xxx_module_init')
+
WARNING: vmlinux.o(.text+0x1ec8a8): Section mismatch: reference to .exit.text:asd_unmap_ioport (between 'asd_unmap_ha' and 'asd_remove_dev_attrs')
WARNING: vmlinux.o(.text+0x1ec8b1): Section mismatch: reference to .exit.text:asd_unmap_memio (between 'asd_unmap_ha' and 'asd_remove_dev_attrs')

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/aic94xx/aic94xx_init.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -puN drivers/scsi/aic94xx/aic94xx_init.c~aic94-fix-section-mismatches drivers/scsi/aic94xx/aic94xx_init.c
--- a/drivers/scsi/aic94xx/aic94xx_init.c~aic94-fix-section-mismatches
+++ a/drivers/scsi/aic94xx/aic94xx_init.c
@@ -136,7 +136,7 @@ Err:
 	return err;
 }
 
-static void __devexit asd_unmap_memio(struct asd_ha_struct *asd_ha)
+static void asd_unmap_memio(struct asd_ha_struct *asd_ha)
 {
 	struct asd_ha_addrspace *io_handle;
 
@@ -173,7 +173,7 @@ static int __devinit asd_map_ioport(stru
 	return err;
 }
 
-static void __devexit asd_unmap_ioport(struct asd_ha_struct *asd_ha)
+static void asd_unmap_ioport(struct asd_ha_struct *asd_ha)
 {
 	pci_release_region(asd_ha->pcidev, PCI_IOBAR_OFFSET);
 }
@@ -210,7 +210,7 @@ Err:
 	return err;
 }
 
-static void __devexit asd_unmap_ha(struct asd_ha_struct *asd_ha)
+static void asd_unmap_ha(struct asd_ha_struct *asd_ha)
 {
 	if (asd_ha->iospace)
 		asd_unmap_ioport(asd_ha);
_
[patch 07/30] ips: PCI API cleanups
From: Jeff Garzik [EMAIL PROTECTED] * pass Scsi_Host to ips_remove_device() via pci_set_drvdata(), allowing us to eliminate the ips_ha[] search loop and call ips_release() directly. * call pci_{request,release}_regions() and eliminate individual request/release_[mem_]region() calls * call pci_disable_device(), paired with pci_enable_device() * s/0/NULL/ in a few places * check ioremap() return value Signed-off-by: Jeff Garzik [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Acked-by: Salyzyn, Mark [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/ips.c | 72 ++- 1 file changed, 31 insertions(+), 41 deletions(-) diff -puN drivers/scsi/ips.c~ips-pci-api-cleanups drivers/scsi/ips.c --- a/drivers/scsi/ips.c~ips-pci-api-cleanups +++ a/drivers/scsi/ips.c @@ -702,10 +702,6 @@ ips_release(struct Scsi_Host *sh) /* free extra memory */ ips_free(ha); - /* Free I/O Region */ - if (ha-io_addr) - release_region(ha-io_addr, ha-io_len); - /* free IRQ */ free_irq(ha-pcidev-irq, ha); @@ -4394,8 +4390,6 @@ ips_free(ips_ha_t * ha) ha-mem_ptr = NULL; } - if (ha-mem_addr) - release_mem_region(ha-mem_addr, ha-mem_len); ha-mem_addr = 0; } @@ -6880,20 +6874,14 @@ ips_register_scsi(int index) static void __devexit ips_remove_device(struct pci_dev *pci_dev) { - int i; - struct Scsi_Host *sh; - ips_ha_t *ha; + struct Scsi_Host *sh = pci_get_drvdata(pci_dev); - for (i = 0; i IPS_MAX_ADAPTERS; i++) { - ha = ips_ha[i]; - if (ha) { - if ((pci_dev-bus-number == ha-pcidev-bus-number) - (pci_dev-devfn == ha-pcidev-devfn)) { - sh = ips_sh[i]; - ips_release(sh); - } - } - } + pci_set_drvdata(pci_dev, NULL); + + ips_release(sh); + + pci_release_regions(pci_dev); + pci_disable_device(pci_dev); } // @@ -6947,12 +6935,17 @@ module_exit(ips_module_exit); static int __devinit ips_insert_device(struct pci_dev *pci_dev, const struct pci_device_id *ent) { - int uninitialized_var(index); + int index = -1; int rc; METHOD_TRACE(ips_insert_device, 1); - if 
(pci_enable_device(pci_dev)) - return -1; + rc = pci_enable_device(pci_dev); + if (rc) + return rc; + + rc = pci_request_regions(pci_dev, ips); + if (rc) + goto err_out; rc = ips_init_phase1(pci_dev, index); if (rc == SUCCESS) @@ -6968,6 +6961,19 @@ ips_insert_device(struct pci_dev *pci_de ips_num_controllers++; ips_next_controller = ips_num_controllers; + + if (rc 0) { + rc = -ENODEV; + goto err_out_regions; + } + + pci_set_drvdata(pci_dev, ips_sh[index]); + return 0; + +err_out_regions: + pci_release_regions(pci_dev); +err_out: + pci_disable_device(pci_dev); return rc; } @@ -7000,7 +7006,7 @@ ips_init_phase1(struct pci_dev *pci_dev, METHOD_TRACE(ips_init_phase1, 1); index = IPS_MAX_ADAPTERS; for (j = 0; j IPS_MAX_ADAPTERS; j++) { - if (ips_ha[j] == 0) { + if (ips_ha[j] == NULL) { index = j; break; } @@ -7037,32 +7043,17 @@ ips_init_phase1(struct pci_dev *pci_dev, uint32_t base; uint32_t offs; - if (!request_mem_region(mem_addr, mem_len, ips)) { - IPS_PRINTK(KERN_WARNING, pci_dev, - Couldn't allocate IO Memory space %x len %d.\n, - mem_addr, mem_len); - return -1; - } - base = mem_addr PAGE_MASK; offs = mem_addr - base; ioremap_ptr = ioremap(base, PAGE_SIZE); + if (!ioremap_ptr) + return -1; mem_ptr = ioremap_ptr + offs; } else { ioremap_ptr = NULL; mem_ptr = NULL; } - /* setup I/O mapped area (if applicable) */ - if (io_addr) { - if (!request_region(io_addr, io_len, ips)) { - IPS_PRINTK(KERN_WARNING, pci_dev, - Couldn't allocate IO space %x len %d.\n, - io_addr, io_len); - return -1; - } - } - /* found a controller */ ha = kzalloc(sizeof (ips_ha_t), GFP_KERNEL); if (ha == NULL) { @@ -7071,7 +7062,6 @@
[patch 09/30] MegaRAID driver management char device moved to misc
From: Thomas Horsten [EMAIL PROTECTED]

The MegaRAID driver's common management module (megaraid_mm.c) creates a char device used by the management tool megarc from LSI Logic (and possibly other management tools).

In 2.6 with udev, this device doesn't get created because it is not registered in sysfs. I first fixed this by registering a class megaraid_mm, but realized that this should probably be moved to misc devices, instead of taking up a char major.

This is because only 1 device is used, even if there are multiple adapters - the minor is never used (the adapter info is in the ioctl block sent to the driver, not detected based on the minor number as one might think). So it is a complete waste to have an entire major taken by this.

So it now uses a misc device which I named megadev0 (the name that megarc expects), and has a dynamic minor (previously a dynamic major was used).

I have tested this on my own system with the megarc tool, and it works just as well as before (only now the device gets created correctly by udev).
Cc: [EMAIL PROTECTED]
Cc: Neela Syam Kolli [EMAIL PROTECTED]
Cc: Ju, Seokmann [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/megaraid/megaraid_mm.c |   20 +++++++++++++-------
 drivers/scsi/megaraid/megaraid_mm.h |    1 +
 2 files changed, 14 insertions(+), 7 deletions(-)

diff -puN drivers/scsi/megaraid/megaraid_mm.c~megaraid-driver-management-char-device-moved-to-misc drivers/scsi/megaraid/megaraid_mm.c
--- a/drivers/scsi/megaraid/megaraid_mm.c~megaraid-driver-management-char-device-moved-to-misc
+++ a/drivers/scsi/megaraid/megaraid_mm.c
@@ -59,7 +59,6 @@ EXPORT_SYMBOL(mraid_mm_register_adp);
 EXPORT_SYMBOL(mraid_mm_unregister_adp);
 EXPORT_SYMBOL(mraid_mm_adapter_app_handle);
 
-static int majorno;
 static uint32_t drvr_ver	= 0x02200207;
 
 static int adapters_count_g;
@@ -76,6 +75,12 @@ static const struct file_operations lsi_
 	.owner	= THIS_MODULE,
 };
 
+static struct miscdevice megaraid_mm_dev = {
+	.minor	= MISC_DYNAMIC_MINOR,
+	.name	= "megadev0",
+	.fops	= &lsi_fops,
+};
+
 /**
  * mraid_mm_open - open routine for char node interface
  * @inode	: unused
@@ -1184,15 +1189,16 @@ mraid_mm_teardown_dma_pools(mraid_mmadp_
 static int __init
 mraid_mm_init(void)
 {
+	int err;
+
 	// Announce the driver version
 	con_log(CL_ANN, (KERN_INFO "megaraid cmm: %s %s\n",
 		LSI_COMMON_MOD_VERSION, LSI_COMMON_MOD_EXT_VERSION));
 
-	majorno = register_chrdev(0, "megadev", &lsi_fops);
-
-	if (majorno < 0) {
-		con_log(CL_ANN, ("megaraid cmm: cannot get major\n"));
-		return majorno;
+	err = misc_register(&megaraid_mm_dev);
+	if (err < 0) {
+		con_log(CL_ANN, ("megaraid cmm: cannot register misc device\n"));
+		return err;
 	}
 
 	init_waitqueue_head(&wait_q);
@@ -1230,7 +1236,7 @@ mraid_mm_exit(void)
 {
 	con_log(CL_DLEVEL1 , ("exiting common mod\n"));
 
-	unregister_chrdev(majorno, "megadev");
+	misc_deregister(&megaraid_mm_dev);
 }
 
 module_init(mraid_mm_init);
diff -puN drivers/scsi/megaraid/megaraid_mm.h~megaraid-driver-management-char-device-moved-to-misc drivers/scsi/megaraid/megaraid_mm.h
--- a/drivers/scsi/megaraid/megaraid_mm.h~megaraid-driver-management-char-device-moved-to-misc
+++ a/drivers/scsi/megaraid/megaraid_mm.h
@@ -22,6 +22,7 @@
 #include <linux/moduleparam.h>
 #include <linux/pci.h>
 #include <linux/list.h>
+#include <linux/miscdevice.h>
 
 #include "mbox_defs.h"
 #include "megaraid_ioctl.h"
_
[patch 13/30] advansys: fix section mismatch warning
From: Randy Dunlap [EMAIL PROTECTED]

Fix section mismatch warning:

WARNING: vmlinux.o(.exit.text+0x152a): Section mismatch: reference to .init.data:_asc_def_iop_base (between 'advansys_isa_remove' and 'advansys_exit')

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
Cc: Matthew Wilcox [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/advansys.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN drivers/scsi/advansys.c~advansys-fix-section-mismatch-warning drivers/scsi/advansys.c
--- a/drivers/scsi/advansys.c~advansys-fix-section-mismatch-warning
+++ a/drivers/scsi/advansys.c
@@ -13906,7 +13906,7 @@ static int advansys_release(struct Scsi_
 
 #define ASC_IOADR_TABLE_MAX_IX  11
 
-static PortAddr _asc_def_iop_base[ASC_IOADR_TABLE_MAX_IX] __devinitdata = {
+static PortAddr _asc_def_iop_base[ASC_IOADR_TABLE_MAX_IX] = {
 	0x100, 0x0110, 0x120, 0x0130, 0x140, 0x0150, 0x0190,
 	0x0210, 0x0230, 0x0250, 0x0330
 };
_
[patch 12/30] SCSI/NCR5380: minor irq handler cleanups
From: Jeff Garzik [EMAIL PROTECTED]

* remove unnecessary cast
* remove unnecessary use of 'irq' function arg

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/NCR5380.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff -puN drivers/scsi/NCR5380.c~scsi-ncr5380-minor-irq-handler-cleanups drivers/scsi/NCR5380.c
--- a/drivers/scsi/NCR5380.c~scsi-ncr5380-minor-irq-handler-cleanups
+++ a/drivers/scsi/NCR5380.c
@@ -1157,16 +1157,17 @@ static void NCR5380_main(struct work_str
  *	Locks: takes the needed instance locks
  */
 
-static irqreturn_t NCR5380_intr(int irq, void *dev_id)
+static irqreturn_t NCR5380_intr(int dummy, void *dev_id)
 {
 	NCR5380_local_declare();
-	struct Scsi_Host *instance = (struct Scsi_Host *)dev_id;
+	struct Scsi_Host *instance = dev_id;
 	struct NCR5380_hostdata *hostdata = (struct NCR5380_hostdata *) instance->hostdata;
 	int done;
 	unsigned char basr;
 	unsigned long flags;
 
-	dprintk(NDEBUG_INTR, ("scsi : NCR5380 irq %d triggered\n", irq));
+	dprintk(NDEBUG_INTR, ("scsi : NCR5380 irq %d triggered\n",
+		instance->irq));
 
 	do {
 		done = 1;
_
[patch 11/30] SCSI/sym53c416: kill pointless irq handler loop and test
From: Jeff Garzik [EMAIL PROTECTED]

- kill pointless irq handler loop to find base address, it is already passed to irq handler via Scsi_Host.

- kill now-pointless !base test.

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Cc: Matthew Wilcox [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sym53c416.c |   16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff -puN drivers/scsi/sym53c416.c~scsi-sym53c416-kill-pointless-irq-handler-loop-and-test drivers/scsi/sym53c416.c
--- a/drivers/scsi/sym53c416.c~scsi-sym53c416-kill-pointless-irq-handler-loop-and-test
+++ a/drivers/scsi/sym53c416.c
@@ -328,27 +328,13 @@ static __inline__ unsigned int sym53c416
 static irqreturn_t sym53c416_intr_handle(int irq, void *dev_id)
 {
 	struct Scsi_Host *dev = dev_id;
-	int base = 0;
+	int base = dev->io_port;
 	int i;
 	unsigned long flags = 0;
 	unsigned char status_reg, pio_int_reg, int_reg;
 	struct scatterlist *sg;
 	unsigned int tot_trans = 0;
 
-	/* We search the base address of the host adapter which caused the interrupt */
-	/* FIXME: should pass dev_id sensibly as hosts[i] */
-	for(i = 0; i < host_index && !base; i++)
-		if(irq == hosts[i].irq)
-			base = hosts[i].base;
-	/* If no adapter found, we cannot handle the interrupt. Leave a message */
-	/* and continue. This should never happen... */
-	if(!base)
-	{
-		printk(KERN_ERR "sym53c416: No host adapter defined for interrupt %d\n", irq);
-		return IRQ_NONE;
-	}
-	/* Now we have the base address and we can start handling the interrupt */
-
 	spin_lock_irqsave(dev->host_lock,flags);
 	status_reg = inb(base + STATUS_REG);
 	pio_int_reg = inb(base + PIO_INT_REG);
_
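Editorial sketch of the idea behind the patch above: an interrupt handler that receives its per-device cookie (dev_id) does not need to search a table keyed by IRQ number. This is a userspace model only, not driver code. struct demo_host, base_by_search() and base_by_cookie() are hypothetical names made up for illustration; only the io_port/irq field names echo the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct Scsi_Host; only the fields used here. */
struct demo_host {
	int irq;
	unsigned int io_port;
};

/*
 * Old approach (modelled on the removed loop): scan a table for the first
 * host registered on this irq. Returns 0 when no host matches.
 */
unsigned int base_by_search(const struct demo_host *hosts, size_t n, int irq)
{
	unsigned int base = 0;
	size_t i;

	for (i = 0; i < n && !base; i++)
		if (hosts[i].irq == irq)
			base = hosts[i].io_port;
	return base;
}

/*
 * New approach: the cookie handed over at registration time already
 * identifies the host, so the base address is a plain field read.
 */
unsigned int base_by_cookie(void *dev_id)
{
	struct demo_host *host = dev_id;

	assert(host != NULL);
	return host->io_port;
}
```

In this model, two hosts sharing one IRQ line show the difference: the search always returns the first match, while the cookie returns the base of whichever host it points to.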
[patch 15/30] sym2: fix section mismatch warning
From: Randy Dunlap [EMAIL PROTECTED]

Fix section mismatch warning:

WARNING: vmlinux.o(.text+0x1ff3a2): Section mismatch: reference to .exit.text:sym2_remove (between 'sym2_io_error_detected' and 'sym_xpt_done')

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Cc: Matthew Wilcox [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sym53c8xx_2/sym_glue.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN drivers/scsi/sym53c8xx_2/sym_glue.c~sym2-fix-section-mismatch-warning drivers/scsi/sym53c8xx_2/sym_glue.c
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c~sym2-fix-section-mismatch-warning
+++ a/drivers/scsi/sym53c8xx_2/sym_glue.c
@@ -1744,7 +1744,7 @@ static int __devinit sym2_probe(struct p
 	return -ENODEV;
 }
 
-static void __devexit sym2_remove(struct pci_dev *pdev)
+static void sym2_remove(struct pci_dev *pdev)
 {
 	struct Scsi_Host *shost = pci_get_drvdata(pdev);
 
@@ -2056,7 +2056,7 @@ static struct pci_driver sym2_driver = {
 	.name		= NAME53C8XX,
 	.id_table	= sym2_id_table,
 	.probe		= sym2_probe,
-	.remove		= __devexit_p(sym2_remove),
+	.remove		= sym2_remove,
 	.err_handler	= &sym2_err_handler,
 };
 
_
[patch 10/30] SCSI/gdth: kill unneeded 'irq' argument
From: Jeff Garzik [EMAIL PROTECTED] Neither gdth_get_status() nor __gdth_interrupt() need their 'irq' argument, so remove it. [EMAIL PROTECTED]: coding style fixes] Signed-off-by: Jeff Garzik [EMAIL PROTECTED] Acked-by: Boaz Harrosh [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/gdth.c | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff -puN drivers/scsi/gdth.c~scsi-gdth-kill-unneeded-irq-argument drivers/scsi/gdth.c --- a/drivers/scsi/gdth.c~scsi-gdth-kill-unneeded-irq-argument +++ a/drivers/scsi/gdth.c @@ -141,7 +141,7 @@ static void gdth_delay(int milliseconds); static void gdth_eval_mapping(ulong32 size, ulong32 *cyls, int *heads, int *secs); static irqreturn_t gdth_interrupt(int irq, void *dev_id); -static irqreturn_t __gdth_interrupt(gdth_ha_str *ha, int irq, +static irqreturn_t __gdth_interrupt(gdth_ha_str *ha, int gdth_from_wait, int* pIndex); static int gdth_sync_event(gdth_ha_str *ha, int service, unchar index, Scsi_Cmnd *scp); @@ -165,7 +165,6 @@ static int gdth_internal_cache_cmd(gdth_ static int gdth_fill_cache_cmd(gdth_ha_str *ha, Scsi_Cmnd *scp, ushort hdrive); static void gdth_enable_int(gdth_ha_str *ha); -static unchar gdth_get_status(gdth_ha_str *ha, int irq); static int gdth_test_busy(gdth_ha_str *ha); static int gdth_get_cmd_index(gdth_ha_str *ha); static void gdth_release_event(gdth_ha_str *ha); @@ -1334,14 +1333,12 @@ static void __init gdth_enable_int(gdth_ } /* return IStatus if interrupt was from this card else 0 */ -static unchar gdth_get_status(gdth_ha_str *ha, int irq) +static unchar gdth_get_status(gdth_ha_str *ha) { unchar IStatus = 0; -TRACE((gdth_get_status() irq %d ctr_count %d\n, irq, gdth_ctr_count)); +TRACE((gdth_get_status() irq %d ctr_count %d\n, ha-irq, gdth_ctr_count)); -if (ha-irq != (unchar)irq) /* check IRQ */ -return false; if (ha-type == GDT_EISA) IStatus = inb((ushort)ha-bmic + EDOORREG); else if (ha-type == GDT_ISA) @@ -1523,7 +1520,7 
@@ static int gdth_wait(gdth_ha_str *ha, in return 1; /* no wait required */ do { -__gdth_interrupt(ha, (int)ha-irq, true, wait_index); + __gdth_interrupt(ha, true, wait_index); if (wait_index == index) { answer_found = TRUE; break; @@ -3036,7 +3033,7 @@ static void gdth_clear_events(void) /* SCSI interface functions */ -static irqreturn_t __gdth_interrupt(gdth_ha_str *ha, int irq, +static irqreturn_t __gdth_interrupt(gdth_ha_str *ha, int gdth_from_wait, int* pIndex) { gdt6m_dpram_str __iomem *dp6m_ptr = NULL; @@ -3054,7 +3051,7 @@ static irqreturn_t __gdth_interrupt(gdth int act_int_coal = 0; #endif -TRACE((gdth_interrupt() IRQ %d\n,irq)); +TRACE((gdth_interrupt() IRQ %d\n, ha-irq)); /* if polling and not from gdth_wait() - return */ if (gdth_polling) { @@ -3067,7 +3064,8 @@ static irqreturn_t __gdth_interrupt(gdth spin_lock_irqsave(ha-smp_lock, flags); /* search controller */ -if (0 == (IStatus = gdth_get_status(ha, irq))) { +IStatus = gdth_get_status(ha); +if (IStatus == 0) { /* spurious interrupt */ if (!gdth_polling) spin_unlock_irqrestore(ha-smp_lock, flags); @@ -3294,9 +3292,9 @@ static irqreturn_t __gdth_interrupt(gdth static irqreturn_t gdth_interrupt(int irq, void *dev_id) { - gdth_ha_str *ha = (gdth_ha_str *)dev_id; + gdth_ha_str *ha = dev_id; - return __gdth_interrupt(ha, irq, false, NULL); + return __gdth_interrupt(ha, false, NULL); } static int gdth_sync_event(gdth_ha_str *ha, int service, unchar index, _ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[patch 06/30] ips: trim trailing whitespace
From: Jeff Garzik [EMAIL PROTECTED] [EMAIL PROTECTED]: coding style fixes] Signed-off-by: Jeff Garzik [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Acked-by: Salyzyn, Mark [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/ips.c | 49 +-- drivers/scsi/ips.h | 12 +- 2 files changed, 31 insertions(+), 30 deletions(-) diff -puN drivers/scsi/ips.c~ips-trim-trailing-whitespace drivers/scsi/ips.c --- a/drivers/scsi/ips.c~ips-trim-trailing-whitespace +++ a/drivers/scsi/ips.c @@ -389,17 +389,17 @@ static struct pci_device_id ips_pci_ta MODULE_DEVICE_TABLE( pci, ips_pci_table ); static char ips_hot_plug_name[] = ips; - + static int __devinit ips_insert_device(struct pci_dev *pci_dev, const struct pci_device_id *ent); static void __devexit ips_remove_device(struct pci_dev *pci_dev); - + static struct pci_driver ips_pci_driver = { .name = ips_hot_plug_name, .id_table = ips_pci_table, .probe = ips_insert_device, .remove = __devexit_p(ips_remove_device), }; - + /* * Necessary forward function protoypes @@ -587,7 +587,7 @@ static void ips_setup_funclist(ips_ha_t * ha) { - /* + /* * Setup Functions */ if (IPS_IS_MORPHEUS(ha) || IPS_IS_MARCO(ha)) { @@ -2081,7 +2081,7 @@ ips_host_info(ips_ha_t * ha, char *ptr, /* That keeps everything happy for text operations on the proc file. 
*/ if (le32_to_cpu(ha-nvram-signature) == IPS_NVRAM_P5_SIG) { -if (ha-nvram-bios_low[3] == 0) { + if (ha-nvram-bios_low[3] == 0) { copy_info(info, \tBIOS Version : %c%c%c%c%c%c%c\n, ha-nvram-bios_high[0], ha-nvram-bios_high[1], @@ -2780,10 +2780,11 @@ ips_next(ips_ha_t * ha, int intr) scb-dcdb.cmd_attribute = ips_command_direction[scb-scsi_cmd-cmnd[0]]; -/* Allow a WRITE BUFFER Command to Have no Data */ -/* This is Used by Tape Flash Utilites */ -if ((scb-scsi_cmd-cmnd[0] == WRITE_BUFFER) (scb-data_len == 0)) -scb-dcdb.cmd_attribute = 0; + /* Allow a WRITE BUFFER Command to Have no Data */ + /* This is Used by Tape Flash Utilites */ + if ((scb-scsi_cmd-cmnd[0] == WRITE_BUFFER) + (scb-data_len == 0)) + scb-dcdb.cmd_attribute = 0; if (!(scb-dcdb.cmd_attribute 0x3)) scb-dcdb.transfer_length = 0; @@ -3404,7 +3405,7 @@ ips_map_status(ips_ha_t * ha, ips_scb_t /* Restrict access to physical DASD */ if (scb-scsi_cmd-cmnd[0] == INQUIRY) { - ips_scmd_buf_read(scb-scsi_cmd, + ips_scmd_buf_read(scb-scsi_cmd, inquiryData, sizeof (inquiryData)); if ((inquiryData.DeviceType 0x1f) == TYPE_DISK) { errcode = DID_TIME_OUT; @@ -4090,10 +4091,10 @@ ips_chkstatus(ips_ha_t * ha, IPS_STATUS scb-scsi_cmd-result = errcode 16; } else {/* bus == 0 */ /* restrict access to physical drives */ - if (scb-scsi_cmd-cmnd[0] == INQUIRY) { - ips_scmd_buf_read(scb-scsi_cmd, + if (scb-scsi_cmd-cmnd[0] == INQUIRY) { + ips_scmd_buf_read(scb-scsi_cmd, inquiryData, sizeof (inquiryData)); - if ((inquiryData.DeviceType 0x1f) == TYPE_DISK) + if ((inquiryData.DeviceType 0x1f) == TYPE_DISK) scb-scsi_cmd-result = DID_TIME_OUT 16; } } /* else */ @@ -4661,8 +4662,8 @@ ips_isinit_morpheus(ips_ha_t * ha) uint32_t bits; METHOD_TRACE(ips_is_init_morpheus, 1); - - if (ips_isintr_morpheus(ha)) + + if (ips_isintr_morpheus(ha)) ips_flush_and_reset(ha); post = readl(ha-mem_ptr + IPS_REG_I960_MSG0); @@ -4686,7 +4687,7 @@ ips_isinit_morpheus(ips_ha_t * ha) /* state ( was trying to INIT and an interrupt was already pending ) 
... */ /* */ // -static void +static void ips_flush_and_reset(ips_ha_t *ha) { ips_scb_t *scb; @@ -4718,9 +4719,9 @@ ips_flush_and_reset(ips_ha_t *ha) if (ret == IPS_SUCCESS) { time = 60 *
[patch 08/30] ips: handle scsi_add_host() failure, and other err cleanups
From: Jeff Garzik [EMAIL PROTECTED]

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Acked-by: Salyzyn, Mark [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/ips.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff -puN drivers/scsi/ips.c~ips-handle-scsi_add_host-failure-and-other-err-cleanups drivers/scsi/ips.c
--- a/drivers/scsi/ips.c~ips-handle-scsi_add_host-failure-and-other-err-cleanups
+++ a/drivers/scsi/ips.c
@@ -6837,13 +6837,10 @@ ips_register_scsi(int index)
 	if (request_irq(ha->pcidev->irq, do_ipsintr, IRQF_SHARED, ips_name, ha)) {
 		IPS_PRINTK(KERN_WARNING, ha->pcidev,
 			   "Unable to install interrupt handler\n");
-		scsi_host_put(sh);
-		return -1;
+		goto err_out_sh;
 	}
 
 	kfree(oldha);
-	ips_sh[index] = sh;
-	ips_ha[index] = ha;
 
 	/* Store away needed values for later use */
 	sh->unique_id = (ha->io_addr) ? ha->io_addr : ha->mem_addr;
@@ -6859,10 +6856,21 @@ ips_register_scsi(int index)
 	sh->max_channel = ha->nbus - 1;
 	sh->can_queue = ha->max_cmds - 1;
 
-	scsi_add_host(sh, NULL);
+	if (scsi_add_host(sh, &ha->pcidev->dev))
+		goto err_out;
+
+	ips_sh[index] = sh;
+	ips_ha[index] = ha;
+
 	scsi_scan_host(sh);
 
 	return 0;
+
+err_out:
+	free_irq(ha->pcidev->irq, ha);
+err_out_sh:
+	scsi_host_put(sh);
+	return -1;
 }
 
 /*---*/
_
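Editorial sketch of the error-unwinding pattern the patch above introduces: resources are released in reverse order of acquisition through stacked goto labels, and the global tables are only written once every step has succeeded. This is a userspace model under stated assumptions. struct demo_state, demo_register() and the fail_at parameter are hypothetical and exist only to make the control flow testable; they are not part of the ips driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping standing in for the real kernel resources. */
struct demo_state {
	int host_refs;	/* models the Scsi_Host reference (scsi_host_put) */
	int irq_held;	/* models request_irq() / free_irq() */
	int published;	/* models storing into ips_sh[] / ips_ha[] */
};

/* fail_at: 0 = success, 1 = IRQ request fails, 2 = host registration fails */
int demo_register(struct demo_state *st, int fail_at)
{
	assert(st != NULL);

	st->host_refs++;		/* step 1: host reference taken */

	if (fail_at == 1)
		goto err_out_sh;	/* nothing but the host to undo */
	st->irq_held = 1;		/* step 2: IRQ acquired */

	if (fail_at == 2)
		goto err_out;		/* undo IRQ, then host */

	st->published = 1;		/* publish only after all steps worked */
	return 0;

err_out:
	st->irq_held = 0;		/* undo step 2 */
err_out_sh:
	st->host_refs--;		/* undo step 1 */
	return -1;
}
```

Whichever step fails, the caller sees -1 and a state with no reference held, no IRQ held, and nothing published.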
[patch 05/30] ips: remove ips_ha members that duplicate struct pci_dev members
From: Jeff Garzik [EMAIL PROTECTED] Signed-off-by: Jeff Garzik [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Acked-by: Salyzyn, Mark [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/ips.c | 178 --- drivers/scsi/ips.h | 20 +--- 2 files changed, 91 insertions(+), 107 deletions(-) diff -puN drivers/scsi/ips.c~ips-remove-ips_ha-members-that-duplicate-struct-pci_dev-members drivers/scsi/ips.c --- a/drivers/scsi/ips.c~ips-remove-ips_ha-members-that-duplicate-struct-pci_dev-members +++ a/drivers/scsi/ips.c @@ -707,7 +707,7 @@ ips_release(struct Scsi_Host *sh) release_region(ha-io_addr, ha-io_len); /* free IRQ */ - free_irq(ha-irq, ha); + free_irq(ha-pcidev-irq, ha); scsi_host_put(sh); @@ -1637,7 +1637,7 @@ ips_make_passthru(ips_ha_t *ha, struct s return (IPS_FAILURE); } - if (ha-device_id == IPS_DEVICEID_COPPERHEAD + if (ha-pcidev-device == IPS_DEVICEID_COPPERHEAD pt-CoppCP.cmd.flashfw.op_code == IPS_CMD_RW_BIOSFW) { ret = ips_flash_copperhead(ha, pt, scb); @@ -2021,7 +2021,7 @@ ips_cleanup_passthru(ips_ha_t * ha, ips_ pt-ExtendedStatus = scb-extended_status; pt-AdapterType = ha-ad_type; - if (ha-device_id == IPS_DEVICEID_COPPERHEAD + if (ha-pcidev-device == IPS_DEVICEID_COPPERHEAD (scb-cmd.flashfw.op_code == IPS_CMD_DOWNLOAD || scb-cmd.flashfw.op_code == IPS_CMD_RW_BIOSFW)) ips_free_flash_copperhead(ha); @@ -2075,7 +2075,7 @@ ips_host_info(ips_ha_t * ha, char *ptr, ha-mem_ptr); } - copy_info(info, \tIRQ number: %d\n, ha-irq); + copy_info(info, \tIRQ number: %d\n, ha-pcidev-irq); /* For the Next 3 lines Check for Binary 0 at the end and don't include it if it's there. */ /* That keeps everything happy for text operations on the proc file. 
*/ @@ -2232,31 +2232,31 @@ ips_identify_controller(ips_ha_t * ha) { METHOD_TRACE(ips_identify_controller, 1); - switch (ha-device_id) { + switch (ha-pcidev-device) { case IPS_DEVICEID_COPPERHEAD: - if (ha-revision_id = IPS_REVID_SERVERAID) { + if (ha-pcidev-revision = IPS_REVID_SERVERAID) { ha-ad_type = IPS_ADTYPE_SERVERAID; - } else if (ha-revision_id == IPS_REVID_SERVERAID2) { + } else if (ha-pcidev-revision == IPS_REVID_SERVERAID2) { ha-ad_type = IPS_ADTYPE_SERVERAID2; - } else if (ha-revision_id == IPS_REVID_NAVAJO) { + } else if (ha-pcidev-revision == IPS_REVID_NAVAJO) { ha-ad_type = IPS_ADTYPE_NAVAJO; - } else if ((ha-revision_id == IPS_REVID_SERVERAID2) + } else if ((ha-pcidev-revision == IPS_REVID_SERVERAID2) (ha-slot_num == 0)) { ha-ad_type = IPS_ADTYPE_KIOWA; - } else if ((ha-revision_id = IPS_REVID_CLARINETP1) - (ha-revision_id = IPS_REVID_CLARINETP3)) { + } else if ((ha-pcidev-revision = IPS_REVID_CLARINETP1) + (ha-pcidev-revision = IPS_REVID_CLARINETP3)) { if (ha-enq-ucMaxPhysicalDevices == 15) ha-ad_type = IPS_ADTYPE_SERVERAID3L; else ha-ad_type = IPS_ADTYPE_SERVERAID3; - } else if ((ha-revision_id = IPS_REVID_TROMBONE32) - (ha-revision_id = IPS_REVID_TROMBONE64)) { + } else if ((ha-pcidev-revision = IPS_REVID_TROMBONE32) + (ha-pcidev-revision = IPS_REVID_TROMBONE64)) { ha-ad_type = IPS_ADTYPE_SERVERAID4H; } break; case IPS_DEVICEID_MORPHEUS: - switch (ha-subdevice_id) { + switch (ha-pcidev-subsystem_device) { case IPS_SUBDEVICEID_4L: ha-ad_type = IPS_ADTYPE_SERVERAID4L; break; @@ -2285,7 +2285,7 @@ ips_identify_controller(ips_ha_t * ha) break; case IPS_DEVICEID_MARCO: - switch (ha-subdevice_id) { + switch (ha-pcidev-subsystem_device) { case IPS_SUBDEVICEID_6M: ha-ad_type = IPS_ADTYPE_SERVERAID6M; break; @@ -2332,20 +2332,20 @@ ips_get_bios_version(ips_ha_t * ha, int strncpy(ha-bios_version,?, 8); - if (ha-device_id == IPS_DEVICEID_COPPERHEAD) { + if (ha-pcidev-device == IPS_DEVICEID_COPPERHEAD) {
[patch 20/30] drivers/scsi/sgiwd93.c: export sgiwd93_reset()
From: Andrew Morton [EMAIL PROTECTED]

mips allmodconfig:

ERROR: "sgiwd93_reset" [drivers/scsi/wd33c93.ko] undefined!

Cc: Ralf Baechle [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sgiwd93.c |    1 +
 1 file changed, 1 insertion(+)

diff -puN drivers/scsi/sgiwd93.c~drivers-scsi-sgiwd93c-export-sgiwd93_reset drivers/scsi/sgiwd93.c
--- a/drivers/scsi/sgiwd93.c~drivers-scsi-sgiwd93c-export-sgiwd93_reset
+++ a/drivers/scsi/sgiwd93.c
@@ -159,6 +159,7 @@ void sgiwd93_reset(unsigned long base)
 	udelay(50);
 	hregs->ctrl = 0;
 }
+EXPORT_SYMBOL_GPL(sgiwd93_reset);
 
 static inline void init_hpc_chain(struct hpc_data *hd)
 {
_
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Andrew Morton wrote:
> On Thu, 13 Dec 2007 19:30:00 -0500 Mark Lord [EMAIL PROTECTED] wrote:
>
>> Here's the commit that causes the regression:
>>
>> ...
>>
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -760,7 +760,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
>>  		struct page *page = __rmqueue(zone, order, migratetype);
>>  		if (unlikely(page == NULL))
>>  			break;
>> -		list_add_tail(&page->lru, list);
>> +		list_add(&page->lru, list);
>
> well that looks fishy.
..

Yeah. I missed that, and instead just posted a patch to search the list in reverse order, which seems to work for me.

I'll try just reversing that line above here now.. gimme 5 minutes or so.

Cheers
[PATCH] fix page_alloc for larger I/O segments
Mark Lord wrote:
> Mark Lord wrote:
>> Mark Lord wrote:
>>> Mark Lord wrote:
>>>> Andrew Morton wrote:
>>>>> On Thu, 13 Dec 2007 17:15:06 -0500
>>>>> James Bottomley [EMAIL PROTECTED] wrote:
>>>>>> On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
>>>>>>> On Thu, 13 Dec 2007 21:09:59 +0100
>>>>>>> Jens Axboe [EMAIL PROTECTED] wrote:
>>>>>>>
>>>>>>>> OK, it's a vm issue,
>>>>>>> cc linux-mm and probable culprit.
>>>>>>>
>>>>>>>> I have tens of thousand "backward" pages after a
>>>>>>>> boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
>>>>>>>> reverse. So it looks like that bug got reintroduced.
>>>>>>> Bill Irwin fixed this a couple of years back: changed the page allocator so
>>>>>>> that it mostly hands out pages in ascending physical-address order.
>>>>>>>
>>>>>>> I guess we broke that, quite possibly in Mel's page allocator rework.
>>>>>>>
>>>>>>> It would help if you could provide us with a simple recipe for
>>>>>>> demonstrating this problem, please.
>>>>>> The simple way seems to be to malloc a large area, touch every page and
>>>>>> then look at the physical pages assigned ... they now mostly seem to be
>>>>>> descending in physical address.
>>>>>
>>>>> OIC. -mm's /proc/pid/pagemap can be used to get the pfn's...
>>>> ..
>>>>
>>>> I'm actually running the treadmill right now (have been for many hours, actually,
>>>> to bisect it to a specific commit.
>>>>
>>>> Thought I was almost done, and then noticed that git-bisect doesn't keep
>>>> the Makefile VERSION lines the same, so I was actually running the wrong
>>>> kernel after the first few times.. duh.
>>>>
>>>> Wrote a script to fix it now.
>>> ..
>>>
>>> Well, that was a waste of three hours.
>> ..
>>
>> Ahh.. it seems to be sensitive to one/both of these:
>>
>> CONFIG_HIGHMEM64G=y with 4GB RAM: not so bad, frequently does 20KB - 48KB segments.
>> CONFIG_HIGHMEM4G=y with 2GB RAM: very severe, rarely does more than 8KB segments.
>> CONFIG_HIGHMEM4G=y with 3GB RAM: very severe, rarely does more than 8KB segments.
>>
>> So if you want to reproduce this on a large memory machine, use "mem=2GB" for starters.
> ..
>
> Here's the commit that causes the regression:
>
> 535131e6925b4a95f321148ad7293f496e0e58d7 Choose pages from the per-cpu list based on migration type

And here is a patch that seems to fix it for me here:

* * * *

Fix page allocator to give better chance of larger contiguous segments (again).

Signed-off-by: Mark Lord [EMAIL PROTECTED]
---

--- old/mm/page_alloc.c.orig	2007-12-13 19:25:15.0 -0500
+++ linux-2.6/mm/page_alloc.c	2007-12-13 19:35:50.0 -0500
@@ -954,7 +954,7 @@
 				goto failed;
 		}
 		/* Find a page of the appropriate migrate type */
-		list_for_each_entry(page, &pcp->list, lru) {
+		list_for_each_entry_reverse(page, &pcp->list, lru) {
 			if (page_private(page) == migratetype) {
 				list_del(&page->lru);
 				pcp->count--;
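Editorial sketch of why the hand-out order discussed in this thread matters for I/O segment size. It uses a deliberately simplified model, assumed for illustration only: consecutive pages of a buffer can share one segment only when each page frame number is exactly the previous one plus one. The function name longest_mergeable_run() is hypothetical and does not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given page frame numbers in the order the allocator handed them out,
 * return the length of the longest run in which every pfn equals the
 * previous pfn + 1. Returns 0 for an empty sequence.
 */
size_t longest_mergeable_run(const unsigned long *pfn, size_t n)
{
	size_t best = 0, cur = 0, i;

	assert(pfn != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (i > 0 && pfn[i] == pfn[i - 1] + 1)
			cur++;		/* physically follows the previous page */
		else
			cur = 1;	/* a new run starts here */
		if (cur > best)
			best = cur;
	}
	return best;
}
```

Under this model, pages handed out in ascending physical order form one long run, while the same pages handed out in descending order ("backward" pages in the thread's wording) never merge, which would match the small segments reported above.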
[patch 16/30] aacraid driver fails with Dell PowerEdge Expandable RAID Controller 3/Di
From: Salyzyn, Mark [EMAIL PROTECTED]

As reported in http://bugzilla.kernel.org/show_bug.cgi?id=9133 it was discovered that the PERC line of controllers lacked a key 64 bit ScatterGather capable SCSI pass-through function. The adapters are still capable of 64 bit ScatterGather I/O commands, but these two can not be mixed. This problem was exacerbated by the introduction of the SCSI Generic access to the DASD physical devices.

The fix for users before this patch is applied is aacraid.dacmode=0 on the kernel command line to disable 64 bit I/O.

The enclosed patch introduces a new adapter quirk and tries to limp along by enabling pass-through in situations where memory is 32 bit addressable on 64 bit machines, or disable the pass-through functions altogether. I expect that the check for 32 bit addressable memory to be controversial in that it can be incorrect in non-Dell non-Intel systems that PERC would never be installed under, the alternative is to disable pass-through in all cases which could be reported as another regression.

Pass-through is used for SCSI Generic access to the physical devices, or for the management applications to properly function. In systems where this patch has disabled pass-through because it is unsupportable in combination with I/O performance, the user can choose to enable pass-through by turning off dacmode (aacraid.dacmode=0) or limiting the discovered kernel memory (mem=4G) with an associated loss in runtime performance. If we chose instead to turn off 64 bit dacmode for the adapters with this quirk, then this would be reported as another regression.
Signed-off-by: Mark Salyzyn [EMAIL PROTECTED] Cc: Marcin Krol [EMAIL PROTECTED] Cc: Matt Domsch [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/aacraid/aachba.c | 15 ++- drivers/scsi/aacraid/aacraid.h |6 drivers/scsi/aacraid/commsup.c |6 ++-- drivers/scsi/aacraid/linit.c | 42 +-- 4 files changed, 47 insertions(+), 22 deletions(-) diff -puN drivers/scsi/aacraid/aachba.c~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di drivers/scsi/aacraid/aachba.c --- a/drivers/scsi/aacraid/aachba.c~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di +++ a/drivers/scsi/aacraid/aachba.c @@ -1190,6 +1190,15 @@ static int aac_scsi_32(struct fib * fib, (fib_callback) aac_srb_callback, (void *) cmd); } +static int aac_scsi_32_64(struct fib * fib, struct scsi_cmnd * cmd) +{ + if ((sizeof(dma_addr_t) 4) +(num_physpages (0xULL PAGE_SHIFT)) +(fib-dev-adapter_info.options AAC_OPT_SGMAP_HOST64)) + return FAILED; + return aac_scsi_32(fib, cmd); +} + int aac_get_adapter_info(struct aac_dev* dev) { struct fib* fibptr; @@ -1267,6 +1276,8 @@ int aac_get_adapter_info(struct aac_dev* 1, 1, NULL, NULL); + /* reasoned default */ + dev-maximum_num_physicals = 16; if (rcode = 0 le32_to_cpu(bus_info-Status) == ST_OK) { dev-maximum_num_physicals = le32_to_cpu(bus_info-TargetsPerBus); dev-maximum_num_channels = le32_to_cpu(bus_info-BusCount); @@ -1376,7 +1387,9 @@ int aac_get_adapter_info(struct aac_dev* * interface. */ dev-a_ops.adapter_scsi = (dev-dac_support) - ? aac_scsi_64 + ? ((aac_get_driver_ident(dev-cardtype)-quirks AAC_QUIRK_SCSI_32) + ? 
aac_scsi_32_64 + : aac_scsi_64) : aac_scsi_32; if (dev-raw_io_interface) { dev-a_ops.adapter_bounds = (dev-raw_io_64) diff -puN drivers/scsi/aacraid/aacraid.h~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di drivers/scsi/aacraid/aacraid.h --- a/drivers/scsi/aacraid/aacraid.h~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di +++ a/drivers/scsi/aacraid/aacraid.h @@ -521,6 +521,12 @@ struct aac_driver_ident #define AAC_QUIRK_17SG 0x0010 /* + * Some adapter firmware does not support 64 bit scsi passthrough + * commands. + */ +#define AAC_QUIRK_SCSI_32 0x0020 + +/* * The adapter interface specs all queues to be located in the same * physically contigous block. The host structure that defines the * commuication queues will assume they are each a separate physically diff -puN drivers/scsi/aacraid/commsup.c~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di drivers/scsi/aacraid/commsup.c --- a/drivers/scsi/aacraid/commsup.c~aacraid-driver-fails-with-dell-poweredge-expandable-raid-controller-3-di +++ a/drivers/scsi/aacraid/commsup.c @@ -1099,7 +1099,8 @@ static int _aac_reset_adapter(struct aac free_irq(aac-pdev-irq, aac); kfree(aac-fsa_dev);
[patch 28/30] scsi: arm: convert to accessors and !use_sg cleanup
From: Boaz Harrosh [EMAIL PROTECTED] - convert to accessors and !use_sg cleanup Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] Cc: Russell King [EMAIL PROTECTED] Signed-off-by: James Bottomley [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/arm/acornscsi.c | 14 ++--- drivers/scsi/arm/scsi.h | 34 ++--- 2 files changed, 18 insertions(+), 30 deletions(-) diff -puN drivers/scsi/arm/acornscsi.c~scsi-pending-arm-convert-to-accessors drivers/scsi/arm/acornscsi.c --- a/drivers/scsi/arm/acornscsi.c~scsi-pending-arm-convert-to-accessors +++ a/drivers/scsi/arm/acornscsi.c @@ -1790,7 +1790,7 @@ int acornscsi_starttransfer(AS_Host *hos return 0; } -residual = host-SCpnt-request_bufflen - host-scsi.SCp.scsi_xferred; +residual = scsi_bufflen(host-SCpnt) - host-scsi.SCp.scsi_xferred; sbic_arm_write(host-scsi.io_port, SBIC_SYNCHTRANSFER, host-device[host-SCpnt-device-id].sync_xfer); sbic_arm_writenext(host-scsi.io_port, residual 16); @@ -2270,7 +2270,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h case 0x4b: /* - PHASE_STATUSIN */ case 0x8b: /* - PHASE_STATUSIN */ /* DATA IN - STATUS */ - host-scsi.SCp.scsi_xferred = host-SCpnt-request_bufflen - + host-scsi.SCp.scsi_xferred = scsi_bufflen(host-SCpnt) - acornscsi_sbic_xfcount(host); acornscsi_dma_stop(host); acornscsi_readstatusbyte(host); @@ -2281,7 +2281,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h case 0x4e: /* - PHASE_MSGOUT */ case 0x8e: /* - PHASE_MSGOUT */ /* DATA IN - MESSAGE OUT */ - host-scsi.SCp.scsi_xferred = host-SCpnt-request_bufflen - + host-scsi.SCp.scsi_xferred = scsi_bufflen(host-SCpnt) - acornscsi_sbic_xfcount(host); acornscsi_dma_stop(host); acornscsi_sendmessage(host); @@ -2291,7 +2291,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h case 0x4f: /* message in */ case 0x8f: /* message in */ /* DATA IN - MESSAGE IN */ - host-scsi.SCp.scsi_xferred = host-SCpnt-request_bufflen - + host-scsi.SCp.scsi_xferred = scsi_bufflen(host-SCpnt) - acornscsi_sbic_xfcount(host); 
acornscsi_dma_stop(host); acornscsi_message(host);/* - PHASE_MSGIN, PHASE_DISCONNECT */ @@ -2319,7 +2319,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h case 0x4b: /* - PHASE_STATUSIN */ case 0x8b: /* - PHASE_STATUSIN */ /* DATA OUT - STATUS */ - host-scsi.SCp.scsi_xferred = host-SCpnt-request_bufflen - + host-scsi.SCp.scsi_xferred = scsi_bufflen(host-SCpnt) - acornscsi_sbic_xfcount(host); acornscsi_dma_stop(host); acornscsi_dma_adjust(host); @@ -2331,7 +2331,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h case 0x4e: /* - PHASE_MSGOUT */ case 0x8e: /* - PHASE_MSGOUT */ /* DATA OUT - MESSAGE OUT */ - host-scsi.SCp.scsi_xferred = host-SCpnt-request_bufflen - + host-scsi.SCp.scsi_xferred = scsi_bufflen(host-SCpnt) - acornscsi_sbic_xfcount(host); acornscsi_dma_stop(host); acornscsi_dma_adjust(host); @@ -2342,7 +2342,7 @@ intr_ret_t acornscsi_sbicintr(AS_Host *h case 0x4f: /* message in */ case 0x8f: /* message in */ /* DATA OUT - MESSAGE IN */ - host-scsi.SCp.scsi_xferred = host-SCpnt-request_bufflen - + host-scsi.SCp.scsi_xferred = scsi_bufflen(host-SCpnt) - acornscsi_sbic_xfcount(host); acornscsi_dma_stop(host); acornscsi_dma_adjust(host); diff -puN drivers/scsi/arm/scsi.h~scsi-pending-arm-convert-to-accessors drivers/scsi/arm/scsi.h --- a/drivers/scsi/arm/scsi.h~scsi-pending-arm-convert-to-accessors +++ a/drivers/scsi/arm/scsi.h @@ -68,46 +68,34 @@ static inline void init_SCp(struct scsi_ { memset(SCpnt-SCp, 0, sizeof(struct scsi_pointer)); - if (SCpnt-use_sg) { + if (scsi_bufflen(SCpnt)) { unsigned long len = 0; int buf; -
[patch 30/30] libsas: convert ATA bridge to use new EH
From: Darrick J. Wong [EMAIL PROTECTED] Migrate the sas_ata bridge to use the new libata EH strategy, and finally implement correct software reset. WARNING WARNING WARNING! This patch is for experimental use only; it is nowhere near complete! Especially the sas_ata_freeze() function. This patch may eat your data and kill your trees. jgarzik: If an ATA command was in-progress at the time of a port freeze, can complete after thawing? (Does that even make sense?) [EMAIL PROTECTED]: coding-style fixes] Comments-requested-by: Darrick J. Wong [EMAIL PROTECTED] Cc: Jeff Garzik [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/libsas/sas_ata.c | 86 ++-- 1 file changed, 71 insertions(+), 15 deletions(-) diff -puN drivers/scsi/libsas/sas_ata.c~libsas-convert-ata-bridge-to-use-new-eh drivers/scsi/libsas/sas_ata.c --- a/drivers/scsi/libsas/sas_ata.c~libsas-convert-ata-bridge-to-use-new-eh +++ a/drivers/scsi/libsas/sas_ata.c @@ -35,6 +35,8 @@ #include ../scsi_transport_api.h #include scsi/scsi_eh.h +static int sas_issue_ata_srst(struct domain_device *dev); + static enum ata_completion_errors sas_to_ata_err(struct task_status_struct *ts) { /* Cheesy attempt to translate SAS errors into ATA. Hah! */ @@ -233,37 +235,58 @@ static u8 sas_ata_check_status(struct at return dev-sata_dev.tf.command; } -static void sas_ata_phy_reset(struct ata_port *ap) +static void sas_ata_freeze(struct ata_port *ap) { - struct domain_device *dev = ap-private_data; - struct sas_internal *i = - to_sas_internal(dev-port-ha-core.shost-transportt); - int res = 0; + /* reroute qc_done for all qc's on this port to a dumb free func */ + /* i wonder if we can get away with throwing out anything that +* completes in this time frame, or if we must find the commands +* that are in progress and cancel only those? 
*/ + printk(KERN_ERR %s: STUB\n, __FUNCTION__); +} - if (i-dft-lldd_I_T_nexus_reset) - res = i-dft-lldd_I_T_nexus_reset(dev); +static void sas_ata_thaw(struct ata_port *ap) +{ + /* empty */ + printk(KERN_ERR %s: STUB\n, __FUNCTION__); +} - if (res) - SAS_DPRINTK(%s: Unable to reset I T nexus?\n, __FUNCTION__); +static int sas_ata_soft_reset(struct ata_link *link, unsigned int *classes, + unsigned long deadline) +{ + struct ata_port *ap = link-ap; + struct domain_device *dev = ap-private_data; + int res; + /* Send SRST to device */ + res = sas_issue_ata_srst(dev); + printk(KERN_ERR srst 0 returns %d\n, res); + + /* Set new device type */ switch (dev-sata_dev.command_set) { case ATA_COMMAND_SET: SAS_DPRINTK(%s: Found ATA device.\n, __FUNCTION__); - ap-link.device[0].class = ATA_DEV_ATA; + *classes = ATA_DEV_ATA; break; case ATAPI_COMMAND_SET: SAS_DPRINTK(%s: Found ATAPI device.\n, __FUNCTION__); - ap-link.device[0].class = ATA_DEV_ATAPI; + *classes = ATA_DEV_ATAPI; break; default: SAS_DPRINTK(%s: Unknown SATA command set: %d.\n, __FUNCTION__, dev-sata_dev.command_set); - ap-link.device[0].class = ATA_DEV_UNKNOWN; - break; + *classes = ATA_DEV_UNKNOWN; + break; } - ap-cbl = ATA_CBL_SATA; + /* FIXME: What if SRST fails? */ + return 0; +} + +static void sas_ata_error_handler(struct ata_port *ap) +{ + ata_do_eh(ap, NULL, sas_ata_soft_reset, NULL, NULL); + /* uh... hopefully there's no commands left in here? 
*/ } static void sas_ata_post_internal(struct ata_queued_cmd *qc) @@ -353,7 +376,9 @@ static struct ata_port_operations sas_sa .check_status = sas_ata_check_status, .check_altstatus= sas_ata_check_status, .dev_select = ata_noop_dev_select, - .phy_reset = sas_ata_phy_reset, + .error_handler = sas_ata_error_handler, + .freeze = sas_ata_freeze, + .thaw = sas_ata_thaw, .post_internal_cmd = sas_ata_post_internal, .tf_read= sas_ata_tf_read, .qc_prep= ata_noop_qc_prep, @@ -658,6 +683,37 @@ out: return res; } +static int sas_issue_ata_srst(struct domain_device *dev) +{ + int res = 0; + struct sas_task *task; + struct dev_to_host_fis *d2h_fis = (struct dev_to_host_fis *) + dev-frame_rcvd[0]; + + res = -ENOMEM; + task = sas_alloc_task(GFP_KERNEL); + if (!task) +
[patch 19/30] Dell CERC support for megaraid_mbox
From: Hannes Reinecke [EMAIL PROTECTED]

Newer Dell CERC firmware (>= 6.62) implements random deletion handling
compatible with the legacy megaraid driver. The legacy handling shifted
the target ID by 0x80 only for I/O commands (READ/WRITE/etc), whereas
megaraid_mbox always shifts the target ID if random deletion is
supported. This resulted in megaraid_mbox sending an INQUIRY to the wrong
channel and, consequently, not finding any devices.

So we disable the random deletion support if the offending firmware is
found.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=6695

Signed-off-by: Hannes Reinecke [EMAIL PROTECTED]
Cc: Patro, Sumant [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/megaraid/megaraid_mbox.c |   17 +++++++++++++++++
 drivers/scsi/megaraid/megaraid_mbox.h |    1 +
 2 files changed, 18 insertions(+)

diff -puN drivers/scsi/megaraid/megaraid_mbox.c~dell-cerc-support-for-megaraid_mbox drivers/scsi/megaraid/megaraid_mbox.c
--- a/drivers/scsi/megaraid/megaraid_mbox.c~dell-cerc-support-for-megaraid_mbox
+++ a/drivers/scsi/megaraid/megaraid_mbox.c
@@ -3169,6 +3169,23 @@ megaraid_mbox_support_random_del(adapter
 	uint8_t		raw_mbox[sizeof(mbox_t)];
 	int		rval;
 
+	/*
+	 * Newer firmware on Dell CERC expect a different
+	 * random deletion handling, so disable it.
+	 */
+	if (adapter->pdev->vendor == PCI_VENDOR_ID_AMI &&
+	    adapter->pdev->device == PCI_DEVICE_ID_AMI_MEGARAID3 &&
+	    adapter->pdev->subsystem_vendor == PCI_VENDOR_ID_DELL &&
+	    adapter->pdev->subsystem_device == PCI_SUBSYS_ID_CERC_ATA100_4CH &&
+	    (adapter->fw_version[0] > '6' ||
+	     (adapter->fw_version[0] == '6' &&
+	      adapter->fw_version[2] > '6') ||
+	     (adapter->fw_version[0] == '6' &&
+	      adapter->fw_version[2] == '6' &&
+	      adapter->fw_version[3] > '1'))) {
+		con_log(CL_DLEVEL1, ("megaraid: disable random deletion\n"));
+		return 0;
+	}
 
 	mbox = (mbox_t *)raw_mbox;
 
diff -puN drivers/scsi/megaraid/megaraid_mbox.h~dell-cerc-support-for-megaraid_mbox drivers/scsi/megaraid/megaraid_mbox.h
--- a/drivers/scsi/megaraid/megaraid_mbox.h~dell-cerc-support-for-megaraid_mbox
+++ a/drivers/scsi/megaraid/megaraid_mbox.h
@@ -88,6 +88,7 @@
 #define PCI_SUBSYS_ID_PERC3_QC				0x0471
 #define PCI_SUBSYS_ID_PERC3_DC				0x0493
 #define PCI_SUBSYS_ID_PERC3_SC				0x0475
+#define PCI_SUBSYS_ID_CERC_ATA100_4CH			0x0511
 
 
 #define MBOX_MAX_SCSI_CMDS	128	// number of cmds reserved for kernel
_

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
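The firmware test in the megaraid_mbox hunk compares single characters of the version string. A minimal user-space sketch of the same comparison follows, assuming the string has the form "M.mm" (as the 6.62 in the changelog suggests), with the major digit at index 0 and the minor digits at indices 2 and 3. `fw_needs_random_del_disabled()` is a hypothetical name for illustration, not a driver function.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of megaraid_mbox.
 *
 * Assumes 'fw' points to at least four characters of the form "M.mm".
 * Returns true when the version is newer than 6.61 (that is, 6.62 or
 * later), using the same character comparisons as the patch.
 */
static bool fw_needs_random_del_disabled(const char *fw)
{
	return fw[0] > '6' ||
	       (fw[0] == '6' && fw[2] > '6') ||
	       (fw[0] == '6' && fw[2] == '6' && fw[3] > '1');
}
```

Comparing characters works here only because each field is a single ASCII digit; a version such as "10.00" would need real parsing.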
[patch 22/30] sg: nopage
From: Nick Piggin [EMAIL PROTECTED]

Convert SG from nopage to fault.

Signed-off-by: Nick Piggin [EMAIL PROTECTED]
Cc: Douglas Gilbert [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sg.c |   23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff -puN drivers/scsi/sg.c~sg-nopage drivers/scsi/sg.c
--- a/drivers/scsi/sg.c~sg-nopage
+++ a/drivers/scsi/sg.c
@@ -1144,23 +1144,22 @@ sg_fasync(int fd, struct file *filp, int
 	return (retval < 0) ? retval : 0;
 }
 
-static struct page *
-sg_vma_nopage(struct vm_area_struct *vma, unsigned long addr, int *type)
+static int
+sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	Sg_fd *sfp;
-	struct page *page = NOPAGE_SIGBUS;
 	unsigned long offset, len, sa;
 	Sg_scatter_hold *rsv_schp;
 	struct scatterlist *sg;
 	int k;
 
 	if ((NULL == vma) || (!(sfp = (Sg_fd *) vma->vm_private_data)))
-		return page;
+		return VM_FAULT_SIGBUS;
 	rsv_schp = &sfp->reserve;
-	offset = addr - vma->vm_start;
+	offset = vmf->pgoff << PAGE_SHIFT;
 	if (offset >= rsv_schp->bufflen)
-		return page;
-	SCSI_LOG_TIMEOUT(3, printk("sg_vma_nopage: offset=%lu, scatg=%d\n",
+		return VM_FAULT_SIGBUS;
+	SCSI_LOG_TIMEOUT(3, printk("sg_vma_fault: offset=%lu, scatg=%d\n",
 				   offset, rsv_schp->k_use_sg));
 	sg = rsv_schp->buffer;
 	sa = vma->vm_start;
@@ -1169,21 +1168,21 @@ sg_vma_nopage(struct vm_area_struct *vma
 		len = vma->vm_end - sa;
 		len = (len < sg->length) ? len : sg->length;
 		if (offset < len) {
+			struct page *page;
 			page = virt_to_page(page_address(sg_page(sg)) + offset);
 			get_page(page);	/* increment page count */
-			break;
+			vmf->page = page;
+			return 0; /* success */
 		}
 		sa += len;
 		offset -= len;
 	}
 
-	if (type)
-		*type = VM_FAULT_MINOR;
-	return page;
+	return VM_FAULT_SIGBUS;
 }
 
 static struct vm_operations_struct sg_mmap_vm_ops = {
-	.nopage = sg_vma_nopage,
+	.fault = sg_vma_fault,
 };
 
 static int
_
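The loop in the sg fault handler maps a byte offset within the mapping to one scatter-gather segment by subtracting segment lengths until the offset fits. A minimal user-space sketch of that walk follows. `struct seg` and `find_segment()` are hypothetical stand-ins, and the sketch leaves out the driver's additional clamp against the end of the VMA.

```c
#include <stddef.h>

/* Hypothetical stand-in for one scatter-gather segment. */
struct seg {
	size_t length;	/* bytes covered by this segment */
};

/*
 * Walk the segments in order, subtracting each segment's length from
 * 'offset' until the offset falls inside a segment.
 *
 * Returns the index of that segment and stores the offset within it in
 * '*seg_off'. Returns -1 if the offset lies beyond the last segment,
 * which is the case the driver answers with VM_FAULT_SIGBUS.
 */
static int find_segment(const struct seg *segs, size_t nsegs,
			size_t offset, size_t *seg_off)
{
	size_t k;

	for (k = 0; k < nsegs; k++) {
		if (offset < segs[k].length) {
			*seg_off = offset;
			return (int)k;
		}
		offset -= segs[k].length;
	}
	return -1;
}
```

With segments of 4096, 8192 and 4096 bytes, an offset of 12287 lands on the last byte of the second segment and an offset of 16384 is out of range.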
[patch 29/30] scsi: bidi support
From: Boaz Harrosh [EMAIL PROTECTED]

At the block level, a bidi request uses the req->next_rq pointer for a
second bidi_read request. At the SCSI midlayer, a second scsi_data_buffer
structure is used for the bidi_read part. This bidi scsi_data_buffer is
put on request->next_rq->special. Struct scsi_cmnd is not changed.

- Define scsi_bidi_cmnd() to return true if it is a bidi request and a
  second sgtable was allocated.

- Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer
  from this command. This API is to isolate users from the mechanics of
  bidi.

- Define scsi_end_bidi_request() to do what scsi_end_request() does but
  for a bidi request. This is necessary because bidi commands are a bit
  tricky here. (See comments in body)

- scsi_release_buffers() will also release the bidi_read
  scsi_data_buffer.

- scsi_io_completion() on bidi commands will now call
  scsi_end_bidi_request() and return.

- The previous work done in scsi_init_io() is now done in a new
  scsi_init_sgtable() (which is 99% identical to the old scsi_init_io()).
  The new scsi_init_io() will call the above twice if needed, the second
  time for the bidi_read command. Only at this point is a command bidi.

- In scsi_error.c, at scsi_eh_prep/restore_cmnd(), make sure the bidi LLD
  is not confused by a get-sense command that looks like bidi. This is
  done by putting NULL at request->next_rq, and restoring it afterwards.
Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/scsi_error.c |3 drivers/scsi/scsi_lib.c | 144 include/scsi/scsi_cmnd.h | 23 + include/scsi/scsi_eh.h|1 4 files changed, 141 insertions(+), 30 deletions(-) diff -puN drivers/scsi/scsi_error.c~scsi-bidi-support drivers/scsi/scsi_error.c --- a/drivers/scsi/scsi_error.c~scsi-bidi-support +++ a/drivers/scsi/scsi_error.c @@ -618,9 +618,11 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd memcpy(ses-cmnd, scmd-cmnd, sizeof(scmd-cmnd)); ses-data_direction = scmd-sc_data_direction; ses-sdb = scmd-sdb; + ses-next_rq = scmd-request-next_rq; ses-result = scmd-result; memset(scmd-sdb, 0, sizeof(scmd-sdb)); + scmd-request-next_rq = NULL; if (sense_bytes) { scmd-sdb.length = min_t(unsigned, @@ -673,6 +675,7 @@ void scsi_eh_restore_cmnd(struct scsi_cm memcpy(scmd-cmnd, ses-cmnd, sizeof(scmd-cmnd)); scmd-sc_data_direction = ses-data_direction; scmd-sdb = ses-sdb; + scmd-request-next_rq = ses-next_rq; scmd-result = ses-result; } EXPORT_SYMBOL(scsi_eh_restore_cmnd); diff -puN drivers/scsi/scsi_lib.c~scsi-bidi-support drivers/scsi/scsi_lib.c --- a/drivers/scsi/scsi_lib.c~scsi-bidi-support +++ a/drivers/scsi/scsi_lib.c @@ -64,6 +64,8 @@ static struct scsi_host_sg_pool scsi_sg_ }; #undef SP +static struct kmem_cache *scsi_bidi_sdb_cache; + static void scsi_run_queue(struct request_queue *q); /* @@ -627,6 +629,28 @@ void scsi_run_host_queues(struct Scsi_Ho scsi_run_queue(sdev-request_queue); } +static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate) +{ + struct request_queue *q = cmd-device-request_queue; + struct request *req = cmd-request; + unsigned long flags; + + add_disk_randomness(req-rq_disk); + + spin_lock_irqsave(q-queue_lock, flags); + if (blk_rq_tagged(req)) + blk_queue_end_tag(q, req); + + end_that_request_last(req, uptodate); + spin_unlock_irqrestore(q-queue_lock, flags); + + /* +* This will goose the queue request 
function at the end, so we don't +* need to worry about launching another command. +*/ + scsi_next_command(cmd); +} + /* * Function:scsi_end_request() * @@ -654,7 +678,6 @@ static struct scsi_cmnd *scsi_end_reques { struct request_queue *q = cmd-device-request_queue; struct request *req = cmd-request; - unsigned long flags; /* * If there are blocks left over at the end, set up the command @@ -683,19 +706,7 @@ static struct scsi_cmnd *scsi_end_reques } } - add_disk_randomness(req-rq_disk); - - spin_lock_irqsave(q-queue_lock, flags); - if (blk_rq_tagged(req)) - blk_queue_end_tag(q, req); - end_that_request_last(req, uptodate); - spin_unlock_irqrestore(q-queue_lock, flags); - - /* -* This will goose the queue request function at the end, so we don't -* need to worry about launching another command. -*/ - scsi_next_command(cmd); + scsi_finalize_request(cmd, uptodate); return NULL; } @@ -894,10 +905,39 @@ void scsi_release_buffers(struct scsi_cm scsi_free_sgtable(cmd-sdb); memset(cmd-sdb, 0, sizeof(cmd-sdb)); + + if (scsi_bidi_cmnd(cmd)) { +
[patch 25/30] drivers/scsi/ipr.c: use LIST_HEAD instead of LIST_HEAD_INIT
From: Denis Cheng [EMAIL PROTECTED]

Signed-off-by: Denis Cheng [EMAIL PROTECTED]
Acked-by: Brian King [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/ipr.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN drivers/scsi/ipr.c~drivers-scsi-iprc-use-list_head-instead-of-list_head_init drivers/scsi/ipr.c
--- a/drivers/scsi/ipr.c~drivers-scsi-iprc-use-list_head-instead-of-list_head_init
+++ a/drivers/scsi/ipr.c
@@ -84,7 +84,7 @@
 /*
  *   Global Data
  */
-static struct list_head ipr_ioa_head = LIST_HEAD_INIT(ipr_ioa_head);
+static LIST_HEAD(ipr_ioa_head);
 static unsigned int ipr_log_level = IPR_DEFAULT_LOG_LEVEL;
 static unsigned int ipr_max_speed = 1;
 static int ipr_testmode = 0;
_
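The ipr change is purely cosmetic: LIST_HEAD(name) expands to the same definition that was previously written out with LIST_HEAD_INIT(name). A minimal user-space sketch follows, using simplified copies of the two macros from include/linux/list.h. `old_style_head`, `new_style_head` and `list_is_empty()` are hypothetical names used only for illustration.

```c
/* Simplified user-space copies of the kernel's list-head macros. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Old spelling: type, name and initializer written out by hand. */
static struct list_head old_style_head = LIST_HEAD_INIT(old_style_head);

/* New spelling: the macro produces exactly the same definition. */
static LIST_HEAD(new_style_head);

/* An empty list head points back at itself in both directions. */
static int list_is_empty(const struct list_head *head)
{
	return head->next == head && head->prev == head;
}
```

Both heads start out as empty, self-referencing lists, so the one-line form loses nothing.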
[patch 21/30] scsi/qla2xxx/qla_os.c section fix
From: Adrian Bunk [EMAIL PROTECTED]

WARNING: vmlinux.o(.text+0x2a4462): Section mismatch: reference to .exit.text:qla2x00_remove_one (between 'qla2xxx_pci_error_detected' and 'qla2x00_stop_timer')

qla2x00_remove_one() mustn't be __devexit since it's called from
qla2xxx_pci_error_detected().

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
Acked-by: Seokmann Ju [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/qla2xxx/qla_os.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN drivers/scsi/qla2xxx/qla_os.c~scsi-qla2xxx-qla_osc-section-fix drivers/scsi/qla2xxx/qla_os.c
--- a/drivers/scsi/qla2xxx/qla_os.c~scsi-qla2xxx-qla_osc-section-fix
+++ a/drivers/scsi/qla2xxx/qla_os.c
@@ -1823,7 +1823,7 @@ probe_out:
 	return ret;
 }
 
-static void __devexit
+static void
 qla2x00_remove_one(struct pci_dev *pdev)
 {
 	scsi_qla_host_t *ha;
@@ -2957,7 +2957,7 @@ static struct pci_driver qla2xxx_pci_dri
 	},
 	.id_table	= qla2xxx_pci_tbl,
 	.probe		= qla2x00_probe_one,
-	.remove		= __devexit_p(qla2x00_remove_one),
+	.remove		= qla2x00_remove_one,
 	.err_handler	= qla2xxx_err_handler,
 };
 
_
[patch 23/30] 3W RAID drivers: memset not needed in probe
From: Denis Cheng [EMAIL PROTECTED]

The memory returned by scsi_host_alloc() is allocated with kzalloc(), so
it is already zero-initialized and the memset is not needed.

Signed-off-by: Denis Cheng [EMAIL PROTECTED]
Cc: Adam Radford [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/3w-9xxx.c |    2 --
 drivers/scsi/3w-.c |    2 --
 2 files changed, 4 deletions(-)

diff -puN drivers/scsi/3w-9xxx.c~3w-raid-drivers-memset-not-needed-in-probe drivers/scsi/3w-9xxx.c
--- a/drivers/scsi/3w-9xxx.c~3w-raid-drivers-memset-not-needed-in-probe
+++ a/drivers/scsi/3w-9xxx.c
@@ -2029,8 +2029,6 @@ static int __devinit twa_probe(struct pc
 	}
 	tw_dev = (TW_Device_Extension *)host->hostdata;
 
-	memset(tw_dev, 0, sizeof(TW_Device_Extension));
-
 	/* Save values to device extension */
 	tw_dev->host = host;
 	tw_dev->tw_pci_dev = pdev;
diff -puN drivers/scsi/3w-.c~3w-raid-drivers-memset-not-needed-in-probe drivers/scsi/3w-.c
--- a/drivers/scsi/3w-.c~3w-raid-drivers-memset-not-needed-in-probe
+++ a/drivers/scsi/3w-.c
@@ -2295,8 +2295,6 @@ static int __devinit tw_probe(struct pci
 	}
 	tw_dev = (TW_Device_Extension *)host->hostdata;
 
-	memset(tw_dev, 0, sizeof(TW_Device_Extension));
-
 	/* Save values to device extension */
 	tw_dev->host = host;
 	tw_dev->tw_pci_dev = pdev;
_
[patch 18/30] scsi/qla2xxx/: possible cleanups
From: Adrian Bunk [EMAIL PROTECTED] - make the following needlessly global code static: - qla_attr.c: qla24xx_vport_delete() - qla_attr.c: qla24xx_vport_disable() - qla_mid.c: qla24xx_allocate_vp_id() - qla_mid.c: qla24xx_find_vhost_by_name() - qla_mid.c: qla2x00_do_dpc_vp() - qla_os.c: struct qla2x00_driver_template - qla_os.c: qla2x00_stop_timer() - qla_os.c: qla2x00_mem_alloc() - qla_os.c: qla2x00_mem_free() - qla_sup.c: qla2x00_lock_nvram_access() - qla_sup.c: qla2x00_unlock_nvram_access() - qla_sup.c: qla2x00_get_nvram_word() - qla_sup.c: qla2x00_write_nvram_word() - #if 0 the following unused global functions: - qla_dbg.c: qla2x00_dump_pkt() - qla_mbx.c: qla2x00_system_error() - qla_mbx.c: qla2x00_get_serdes_params() - qla_mbx.c: qla2x00_get_idma_speed() - qla_mbx.c: qla24xx_get_vp_database() - qla_mbx.c: qla24xx_get_vp_entry() - qla_os.c: remove some unneeded function prototypes Signed-off-by: Adrian Bunk [EMAIL PROTECTED] Cc: Andrew Vasquez [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/qla2xxx/qla_attr.c |6 +++--- drivers/scsi/qla2xxx/qla_dbg.c |2 ++ drivers/scsi/qla2xxx/qla_gbl.h | 25 - drivers/scsi/qla2xxx/qla_mbx.c | 10 ++ drivers/scsi/qla2xxx/qla_mid.c |6 +++--- drivers/scsi/qla2xxx/qla_os.c | 20 ++-- drivers/scsi/qla2xxx/qla_sup.c |8 7 files changed, 28 insertions(+), 49 deletions(-) diff -puN drivers/scsi/qla2xxx/qla_attr.c~scsi-qla2xxx-possible-cleanups drivers/scsi/qla2xxx/qla_attr.c --- a/drivers/scsi/qla2xxx/qla_attr.c~scsi-qla2xxx-possible-cleanups +++ a/drivers/scsi/qla2xxx/qla_attr.c @@ -9,7 +9,7 @@ #include linux/kthread.h #include linux/vmalloc.h -int qla24xx_vport_disable(struct fc_vport *, bool); +static int qla24xx_vport_disable(struct fc_vport *, bool); /* SYSFS attributes - */ @@ -1113,7 +1113,7 @@ vport_create_failed_2: return FC_VPORT_FAILED; } -int +static int qla24xx_vport_delete(struct fc_vport *fc_vport) { scsi_qla_host_t *ha = shost_priv(fc_vport-shost); @@ -1146,7 +1146,7 @@ 
qla24xx_vport_delete(struct fc_vport *fc return 0; } -int +static int qla24xx_vport_disable(struct fc_vport *fc_vport, bool disable) { scsi_qla_host_t *vha = fc_vport-dd_data; diff -puN drivers/scsi/qla2xxx/qla_dbg.c~scsi-qla2xxx-possible-cleanups drivers/scsi/qla2xxx/qla_dbg.c --- a/drivers/scsi/qla2xxx/qla_dbg.c~scsi-qla2xxx-possible-cleanups +++ a/drivers/scsi/qla2xxx/qla_dbg.c @@ -1428,6 +1428,7 @@ qla2x00_print_scsi_cmd(struct scsi_cmnd printk( sp flags=0x%x\n, sp-flags); } +#if 0 void qla2x00_dump_pkt(void *pkt) { @@ -1442,6 +1443,7 @@ qla2x00_dump_pkt(void *pkt) } printk(\n); } +#endif /* 0 */ #if defined(QL_DEBUG_ROUTINES) /* diff -puN drivers/scsi/qla2xxx/qla_gbl.h~scsi-qla2xxx-possible-cleanups drivers/scsi/qla2xxx/qla_gbl.h --- a/drivers/scsi/qla2xxx/qla_gbl.h~scsi-qla2xxx-possible-cleanups +++ a/drivers/scsi/qla2xxx/qla_gbl.h @@ -68,30 +68,20 @@ extern int num_hosts; /* * Global Functions in qla_mid.c source file. */ -extern struct scsi_host_template qla2x00_driver_template; extern struct scsi_host_template qla24xx_driver_template; extern struct scsi_transport_template *qla2xxx_transport_vport_template; -extern uint8_t qla2x00_mem_alloc(scsi_qla_host_t *); extern void qla2x00_timer(scsi_qla_host_t *); extern void qla2x00_start_timer(scsi_qla_host_t *, void *, unsigned long); -extern void qla2x00_stop_timer(scsi_qla_host_t *); -extern uint32_t qla24xx_allocate_vp_id(scsi_qla_host_t *); extern void qla24xx_deallocate_vp_id(scsi_qla_host_t *); extern int qla24xx_disable_vp (scsi_qla_host_t *); extern int qla24xx_enable_vp (scsi_qla_host_t *); -extern void qla2x00_mem_free(scsi_qla_host_t *); extern int qla24xx_control_vp(scsi_qla_host_t *, int ); extern int qla24xx_modify_vp_config(scsi_qla_host_t *); extern int qla2x00_send_change_request(scsi_qla_host_t *, uint16_t, uint16_t); extern void qla2x00_vp_stop_timer(scsi_qla_host_t *); extern int qla24xx_configure_vhba (scsi_qla_host_t *); -extern int qla24xx_get_vp_entry(scsi_qla_host_t *, uint16_t, int); 
-extern int qla24xx_get_vp_database(scsi_qla_host_t *, uint16_t); -extern int qla2x00_do_dpc_vp(scsi_qla_host_t *); extern void qla24xx_report_id_acquisition(scsi_qla_host_t *, struct vp_rpt_id_entry_24xx *); -extern scsi_qla_host_t * qla24xx_find_vhost_by_name(scsi_qla_host_t *, -uint8_t *); extern void qla2x00_do_dpc_all_vps(scsi_qla_host_t *); extern int qla24xx_vport_create_req_sanity_check(struct fc_vport *); extern scsi_qla_host_t * qla24xx_create_vhost(struct fc_vport *); @@ -113,7 +103,6 @@ extern void qla2xxx_wake_dpc(scsi_qla_ho extern void qla2x00_alert_all_vps(scsi_qla_host_t *, uint16_t *); extern void qla2x00_async_event(scsi_qla_host_t *, uint16_t *); extern void
[patch 26/30] tgt: use scsi_init_io instead of scsi_alloc_sgtable
From: Boaz Harrosh [EMAIL PROTECTED]

- If we export scsi_init_io()/scsi_release_buffers() instead of
  scsi_{alloc,free}_sgtable() from scsi_lib, then the tgt code is much
  more insulated from scsi_lib changes. As a bonus it will also gain
  bidi capability when it comes.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Acked-by: FUJITA Tomonori [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/scsi_lib.c     |   21 ++---
 drivers/scsi/scsi_tgt_lib.c |   29 +
 include/scsi/scsi_cmnd.h    |    4 ++--
 3 files changed, 17 insertions(+), 37 deletions(-)

diff -puN drivers/scsi/scsi_lib.c~tgt-use-scsi_init_io-instead-of-scsi_alloc_sgtable drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c~tgt-use-scsi_init_io-instead-of-scsi_alloc_sgtable
+++ a/drivers/scsi/scsi_lib.c
@@ -739,7 +739,8 @@ static inline unsigned int scsi_sgtable_
 	return index;
 }
 
-struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
+static struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd,
+					      gfp_t gfp_mask)
 {
 	struct scsi_host_sg_pool *sgp;
 	struct scatterlist *sgl, *prev, *ret;
@@ -825,9 +826,7 @@ enomem:
 	return NULL;
 }
 
-EXPORT_SYMBOL(scsi_alloc_sgtable);
-
-void scsi_free_sgtable(struct scsi_cmnd *cmd)
+static void scsi_free_sgtable(struct scsi_cmnd *cmd)
 {
 	struct scatterlist *sgl = cmd->request_buffer;
 	struct scsi_host_sg_pool *sgp;
@@ -873,8 +872,6 @@ void scsi_free_sgtable(struct scsi_cmnd
 	mempool_free(sgl, sgp->pool);
 }
 
-EXPORT_SYMBOL(scsi_free_sgtable);
-
 /*
  * Function:    scsi_release_buffers()
  *
@@ -892,7 +889,7 @@ EXPORT_SYMBOL(scsi_free_sgtable);
  *		the scatter-gather table, and potentially any bounce
  *		buffers.
  */
-static void scsi_release_buffers(struct scsi_cmnd *cmd)
+void scsi_release_buffers(struct scsi_cmnd *cmd)
 {
 	if (cmd->use_sg)
 		scsi_free_sgtable(cmd);
@@ -904,6 +901,7 @@ static void scsi_release_buffers(struct
 	cmd->request_buffer = NULL;
 	cmd->request_bufflen = 0;
 }
+EXPORT_SYMBOL(scsi_release_buffers);
 
 /*
  * Function:    scsi_io_completion()
@@ -1105,7 +1103,7 @@ void scsi_io_completion(struct scsi_cmnd
  * Returns:     0 on success
  *		BLKPREP_DEFER if the failure is retryable
  */
-static int scsi_init_io(struct scsi_cmnd *cmd)
+int scsi_init_io(struct scsi_cmnd *cmd, gfp_t gfp_mask)
 {
 	struct request     *req = cmd->request;
 	int		   count;
@@ -1120,7 +1118,7 @@ static int scsi_init_io(struct scsi_cmnd
 	/*
 	 * If sg table allocation fails, requeue request later.
 	 */
-	cmd->request_buffer = scsi_alloc_sgtable(cmd, GFP_ATOMIC);
+	cmd->request_buffer = scsi_alloc_sgtable(cmd, gfp_mask);
 	if (unlikely(!cmd->request_buffer)) {
 		scsi_unprep_request(req);
 		return BLKPREP_DEFER;
@@ -1141,6 +1139,7 @@ static int scsi_init_io(struct scsi_cmnd
 	cmd->use_sg = count;
 	return BLKPREP_OK;
 }
+EXPORT_SYMBOL(scsi_init_io);
 
 static struct scsi_cmnd *scsi_get_cmd_from_req(struct scsi_device *sdev,
 		struct request *req)
@@ -1186,7 +1185,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_d
 
 		BUG_ON(!req->nr_phys_segments);
 
-		ret = scsi_init_io(cmd);
+		ret = scsi_init_io(cmd, GFP_ATOMIC);
 		if (unlikely(ret))
 			return ret;
 	} else {
@@ -1237,7 +1236,7 @@ int scsi_setup_fs_cmnd(struct scsi_devic
 	if (unlikely(!cmd))
 		return BLKPREP_DEFER;
 
-	return scsi_init_io(cmd);
+	return scsi_init_io(cmd, GFP_ATOMIC);
 }
 EXPORT_SYMBOL(scsi_setup_fs_cmnd);
 
diff -puN drivers/scsi/scsi_tgt_lib.c~tgt-use-scsi_init_io-instead-of-scsi_alloc_sgtable drivers/scsi/scsi_tgt_lib.c
--- a/drivers/scsi/scsi_tgt_lib.c~tgt-use-scsi_init_io-instead-of-scsi_alloc_sgtable
+++ a/drivers/scsi/scsi_tgt_lib.c
@@ -331,8 +331,7 @@ static void scsi_tgt_cmd_done(struct scs
 
 	scsi_tgt_uspace_send_status(cmd, tcmd->itn_id, tcmd->tag);
 
-	if (scsi_sglist(cmd))
-		scsi_free_sgtable(cmd);
+	scsi_release_buffers(cmd);
 
 	queue_work(scsi_tgtd, &tcmd->work);
 }
@@ -353,26 +352,6 @@ static int scsi_tgt_transfer_response(st
 	return 0;
 }
 
-static int scsi_tgt_init_cmd(struct scsi_cmnd *cmd, gfp_t gfp_mask)
-{
-	struct request *rq = cmd->request;
-	int count;
-
-	cmd->use_sg = rq->nr_phys_segments;
-	cmd->request_buffer = scsi_alloc_sgtable(cmd, gfp_mask);
-	if (!cmd->request_buffer)
-		return -ENOMEM;
-
-	cmd->request_bufflen = rq->data_len;
-
-	dprintk("cmd %p cnt %d %lu\n", cmd, scsi_sg_count(cmd),
-		rq_data_dir(rq));
-	count = blk_rq_map_sg(rq->q, rq, scsi_sglist(cmd));
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, 13 Dec 2007 19:30:00 -0500 Mark Lord [EMAIL PROTECTED] wrote:

> Here's the commit that causes the regression:
>
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -760,7 +760,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
>  		struct page *page = __rmqueue(zone, order, migratetype);
>  		if (unlikely(page == NULL))
>  			break;
> -		list_add_tail(&page->lru, list);
> +		list_add(&page->lru, list);

well that looks fishy.
[PATCH] fix page_alloc for larger I/O segments (improved)
Improved version, more similar to the 2.6.23 code:

Fix page allocator to give better chance of larger contiguous segments (again).

Signed-off-by: Mark Lord [EMAIL PROTECTED]
---

--- old/mm/page_alloc.c	2007-12-13 19:25:15.0 -0500
+++ linux-2.6/mm/page_alloc.c	2007-12-13 19:43:07.0 -0500
@@ -760,7 +760,7 @@
 		struct page *page = __rmqueue(zone, order, migratetype);
 		if (unlikely(page == NULL))
 			break;
-		list_add(&page->lru, list);
+		list_add_tail(&page->lru, list);
 		set_page_private(page, migratetype);
 	}
 	spin_unlock(&zone->lock);
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
Mark Lord wrote:
> Andrew Morton wrote:
>> On Thu, 13 Dec 2007 19:30:00 -0500 Mark Lord [EMAIL PROTECTED] wrote:
>>
>>> Here's the commit that causes the regression:
>>>
>>> ...
>>>
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -760,7 +760,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
>>>  		struct page *page = __rmqueue(zone, order, migratetype);
>>>  		if (unlikely(page == NULL))
>>>  			break;
>>> -		list_add_tail(&page->lru, list);
>>> +		list_add(&page->lru, list);
>>
>> well that looks fishy.
> ..
>
> Yeah. I missed that, and instead just posted a patch
> to search the list in reverse order, which seems to work for me.
>
> I'll try just reversing that line above here now.. gimme 5 minutes or so.
..

Yep, that works too. Alternative improved patch now posted.
Re: [PATCH] fix page_alloc for larger I/O segments
On Thu, 13 Dec 2007 19:40:09 -0500 Mark Lord [EMAIL PROTECTED] wrote:

> And here is a patch that seems to fix it for me here:
>
> * * * *
>
> Fix page allocator to give better chance of larger contiguous segments (again).
>
> Signed-off-by: Mark Lord [EMAIL PROTECTED]
> ---
>
> --- old/mm/page_alloc.c.orig	2007-12-13 19:25:15.0 -0500
> +++ linux-2.6/mm/page_alloc.c	2007-12-13 19:35:50.0 -0500
> @@ -954,7 +954,7 @@
>  				goto failed;
>  		}
>  		/* Find a page of the appropriate migrate type */
> -		list_for_each_entry(page, &pcp->list, lru) {
> +		list_for_each_entry_reverse(page, &pcp->list, lru) {
>  			if (page_private(page) == migratetype) {
>  				list_del(&page->lru);
>  				pcp->count--;

- needs help to make it apply to mainline

- needs a comment, methinks...

--- a/mm/page_alloc.c~fix-page-allocator-to-give-better-chance-of-larger-contiguous-segments-again
+++ a/mm/page_alloc.c
@@ -1060,8 +1060,12 @@ again:
 				goto failed;
 		}
 
-		/* Find a page of the appropriate migrate type */
-		list_for_each_entry(page, &pcp->list, lru)
+		/*
+		 * Find a page of the appropriate migrate type.  Doing a
+		 * reverse-order search here helps us to hand out pages in
+		 * ascending physical-address order.
+		 */
+		list_for_each_entry_reverse(page, &pcp->list, lru)
 			if (page_private(page) == migratetype)
 				break;
 
_
Re: [PATCH] fix page_alloc for larger I/O segments (improved)
On Thu, 13 Dec 2007 19:57:29 -0500 James Bottomley [EMAIL PROTECTED] wrote:

> On Thu, 2007-12-13 at 19:46 -0500, Mark Lord wrote:
> > Improved version, more similar to the 2.6.23 code:
> >
> > Fix page allocator to give better chance of larger contiguous segments (again).
> >
> > Signed-off-by: Mark Lord [EMAIL PROTECTED]
> > ---
> >
> > --- old/mm/page_alloc.c	2007-12-13 19:25:15.0 -0500
> > +++ linux-2.6/mm/page_alloc.c	2007-12-13 19:43:07.0 -0500
> > @@ -760,7 +760,7 @@
> >  		struct page *page = __rmqueue(zone, order, migratetype);
> >  		if (unlikely(page == NULL))
> >  			break;
> > -		list_add(&page->lru, list);
> > +		list_add_tail(&page->lru, list);
>
> Could we put a big comment above this explaining to the would be vm
> tweakers why this has to be a list_add_tail, so we don't end up back in
> this position after another two years?

Already done ;)

--- a/mm/page_alloc.c~fix-page_alloc-for-larger-i-o-segments-fix
+++ a/mm/page_alloc.c
@@ -847,6 +847,10 @@ static int rmqueue_bulk(struct zone *zon
 		struct page *page = __rmqueue(zone, order, migratetype);
 		if (unlikely(page == NULL))
 			break;
+		/*
+		 * Doing a list_add_tail() here helps us to hand out pages in
+		 * ascending physical-address order.
+		 */
 		list_add_tail(&page->lru, list);
 		set_page_private(page, migratetype);
 	}
_
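The reason the one-word change matters can be shown outside the kernel. The sketch below uses a simplified user-space copy of the circular doubly linked list and assumes, as the thread implies but this code does not verify, that successive __rmqueue() calls return pages in ascending address order. `struct fake_page`, `collect_pfns()` and `fill_and_collect()` are hypothetical names for illustration.

```c
#include <stddef.h>

/* Simplified user-space copy of the kernel's circular list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link 'entry' between two neighbouring nodes. */
static void list_insert(struct list_head *entry,
			struct list_head *prev, struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

/* Insert right after the head: the newest entry comes first. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	list_insert(entry, head, head->next);
}

/* Insert right before the head: the newest entry comes last. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	list_insert(entry, head->prev, head);
}

/* Hypothetical stand-in for struct page, identified by a frame number. */
struct fake_page {
	unsigned long pfn;
	struct list_head lru;
};

/* Copy the pfns into 'out' in list order; returns how many were copied. */
static size_t collect_pfns(struct list_head *head, unsigned long *out,
			   size_t max)
{
	struct list_head *pos;
	size_t n = 0;

	for (pos = head->next; pos != head && n < max; pos = pos->next) {
		struct fake_page *pg = (struct fake_page *)
			((char *)pos - offsetof(struct fake_page, lru));
		out[n++] = pg->pfn;
	}
	return n;
}

/*
 * Queue n pages with ascending pfns (100, 101, ...) using either
 * list_add_tail() or list_add(), then report the resulting list order.
 */
static size_t fill_and_collect(int use_tail, struct fake_page *pages,
			       size_t n, unsigned long *out)
{
	struct list_head head;
	size_t i;

	init_list_head(&head);
	for (i = 0; i < n; i++) {
		pages[i].pfn = 100 + i;
		if (use_tail)
			list_add_tail(&pages[i].lru, &head);
		else
			list_add(&pages[i].lru, &head);
	}
	return collect_pfns(&head, out, n);
}
```

With list_add_tail() a consumer that takes pages from the front of the list sees ascending frame numbers; with list_add() it sees them in descending order, which is what breaks the merging of physically adjacent pages into larger I/O segments.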
Re: [PATCH] fix page_alloc for larger I/O segments (improved)
Andrew Morton wrote:
> On Thu, 13 Dec 2007 19:57:29 -0500 James Bottomley [EMAIL PROTECTED] wrote:
>
>> On Thu, 2007-12-13 at 19:46 -0500, Mark Lord wrote:
>>> Improved version, more similar to the 2.6.23 code:
>>>
>>> Fix page allocator to give better chance of larger contiguous segments (again).
>>>
>>> Signed-off-by: Mark Lord [EMAIL PROTECTED]
>>> ---
>>>
>>> --- old/mm/page_alloc.c	2007-12-13 19:25:15.0 -0500
>>> +++ linux-2.6/mm/page_alloc.c	2007-12-13 19:43:07.0 -0500
>>> @@ -760,7 +760,7 @@
>>>  		struct page *page = __rmqueue(zone, order, migratetype);
>>>  		if (unlikely(page == NULL))
>>>  			break;
>>> -		list_add(&page->lru, list);
>>> +		list_add_tail(&page->lru, list);
>>
>> Could we put a big comment above this explaining to the would be vm
>> tweakers why this has to be a list_add_tail, so we don't end up back in
>> this position after another two years?
>
> Already done ;)
..

I thought of the comment as I rushed off for dinner.
Thanks, Andrew!

> --- a/mm/page_alloc.c~fix-page_alloc-for-larger-i-o-segments-fix
> +++ a/mm/page_alloc.c
> @@ -847,6 +847,10 @@ static int rmqueue_bulk(struct zone *zon
>  		struct page *page = __rmqueue(zone, order, migratetype);
>  		if (unlikely(page == NULL))
>  			break;
> +		/*
> +		 * Doing a list_add_tail() here helps us to hand out pages in
> +		 * ascending physical-address order.
> +		 */
>  		list_add_tail(&page->lru, list);
>  		set_page_private(page, migratetype);
>  	}
> _