Re: [PATCH] gdth: convert to PCI hotplug API

2008-02-15 Thread Jan Engelhardt

On Feb 13 2008 13:07, Boaz Harrosh wrote:
>+static struct pci_device_id gdthtable[] __devinitdata = {
>+  { PCI_VDEVICE(VORTEX, PCI_ANY_ID) },
>+  { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_SRC) },
>+  { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_SRC_XSCALE) },
>+  { } /* terminate list */
>+};

As usual, +const and +__devinitconst :-)

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Integration of SCST in the mainstream Linux kernel

2008-02-15 Thread Bart Van Assche
On Thu, Feb 7, 2008 at 2:45 PM, Vladislav Bolkhovitin <[EMAIL PROTECTED]> wrote:
>
> Bart Van Assche wrote:
>  > Since the focus of this thread shifted somewhat in the last few
>  > messages, I'll try to summarize what has been discussed so far:
>  > - There was a number of participants who joined this discussion
>  > spontaneously. This suggests that there is considerable interest in
>  > networked storage and iSCSI.
>  > - It has been motivated why iSCSI makes sense as a storage protocol
>  > (compared to ATA over Ethernet and Fibre Channel over Ethernet).
>  > - The direct I/O performance results for block transfer sizes below 64
>  > KB are a meaningful benchmark for storage target implementations.
>  > - It has been discussed whether an iSCSI target should be implemented
>  > in user space or in kernel space. It is clear now that an
>  > implementation in the kernel can be made faster than a user space
>  > implementation 
> (http://kerneltrap.org/mailarchive/linux-kernel/2008/2/4/714804).
>  > Regarding existing implementations, measurements have, among other things, shown that
>  > SCST is faster than STGT (30% with the following setup: iSCSI via
>  > IPoIB and direct I/O block transfers with a size of 512 bytes).
>  > - It has been discussed which iSCSI target implementation should be in
>  > the mainstream Linux kernel. There is no agreement on this subject
>  > yet. The short-term options are as follows:
>  > 1) Do not integrate any new iSCSI target implementation in the
>  > mainstream Linux kernel.
>  > 2) Add one of the existing in-kernel iSCSI target implementations to
>  > the kernel, e.g. SCST or PyX/LIO.
>  > 3) Create a new in-kernel iSCSI target implementation that combines
>  > the advantages of the existing iSCSI kernel target implementations
>  > (iETD, STGT, SCST and PyX/LIO).
>  >
>  > As an iSCSI user, I prefer option (3). The big question is whether the
>  > various storage target authors agree with this?
>
>  I tend to agree, with some important notes:
>
>  1. IET should be excluded from this list; iSCSI-SCST is IET updated for
>  the SCST framework, with a lot of bugfixes and improvements.
>
>  2. I think everybody will agree that a Linux iSCSI target should work
>  over some standard SCSI target framework. Hence the choice gets
>  narrower: SCST vs STGT. I don't think there's a way for a dedicated
>  iSCSI target (i.e. PyX/LIO) in the mainline, because of a lot of code
>  duplication. Nicholas could decide to move to either existing framework
>  (although, frankly, I don't see how an in-kernel iSCSI target could sit
>  on a user-space SCSI target framework), and if he decides to
>  go with SCST, I'll be glad to offer my help and support and wouldn't
>  care if LIO-SCST eventually replaced iSCSI-SCST. The better one should win.

If I understood the above correctly, regarding a kernel space iSCSI
target implementation, only LIO-SE and SCST should be considered. What
I know today about these Linux iSCSI target implementations is as
follows:
* SCST performs slightly better than LIO-SE, and LIO-SE performs
slightly better than STGT (both with regard to latency and with regard
to bandwidth).
* The coding style of SCST is closer to the Linux kernel coding style
than the coding style of the LIO-SE project.
* The structure of SCST is closer to what Linus expects than the
structure of LIO-SE (i.e., authentication handled in userspace, data
transfer handled by the kernel -- LIO-SE handles both in kernel
space).
* Until now I did not encounter any strange behavior in SCST. The
issues I encountered with LIO-SE are being resolved via the LIO-SE
mailing list (http://groups.google.com/group/linux-iscsi-target-dev).

It would take too much effort to develop a new kernel space iSCSI
target from scratch -- we should start from either LIO-SE or SCST. My
opinion is that the best approach is to start with integrating SCST in
the mainstream kernel, and that the more advanced features from LIO-SE
that are not yet in SCST can be ported from LIO-SE to the SCST
framework.

Nicholas, do you think the structure of SCST is flexible enough to be
extended with LIO-SE's advanced features like ERL-2?

Bart Van Assche.


Re: aic94xx: failing on high load (another data point)

2008-02-15 Thread James Bottomley
On Fri, 2008-02-15 at 00:11 +0800, Keith Hopkins wrote:
> On 01/31/2008 03:29 AM, Darrick J. Wong wrote:
> > On Wed, Jan 30, 2008 at 06:59:34PM +0800, Keith Hopkins wrote:
> >> V28.  My controller functions well with a single drive (low-medium load).  
> >> Unfortunately, all attempts to get the mirrors in sync fail and usually 
> >> hang the whole box.
> > 
> > Adaptec posted a V30 sequencer on their website; does that fix the
> > problems?
> > 
> > http://www.adaptec.com/en-US/speed/scsi/linux/aic94xx-seq-30-1_tar_gz.htm
> > 
> 
> I lost connectivity to the drive again, and had to reboot to recover
> the drive, so it seemed a good time to try out the V30 firmware.
> Unfortunately, it didn't work any better.  Details are in the
> attachment.

Well, I can offer some hope.  The errors you report:

> aic94xx: escb_tasklet_complete: REQ_TASK_ABORT, reason=0x6
> aic94xx: escb_tasklet_complete: Can't find task (tc=6) to abort!

are requests by the sequencer to abort a task because of a protocol
error.  IBM did some extensive testing with Seagate drives and found
that the protocol errors were genuine and the result of drive firmware
problems.  IBM released a version of Seagate firmware (BA17) to correct
these.  Unfortunately, your drive identifies its firmware as S513, which
is likely OEM firmware from another vendor ... however, that vendor may
have an update which corrects the problem.

Of course, the other issue is this:

> aic94xx: escb_tasklet_complete: Can't find task (tc=6) to abort!

This is a bug in the driver.  It's not finding the task in the
outstanding list.  The problem seems to be that it's taking the task
from the escb which, by definition, is always NULL.  It should be taking
the task from the ascb it finds by looping over the pending queue.

If you're willing, could you try this patch which may correct the
problem?  It's sort of like falling off a cliff: if you never go near
the edge (i.e. you upgrade the drive fw) you never fall off;
alternatively, it would be nice if you could help me put up guard rails
just in case.

Thanks,

James

---
diff --git a/drivers/scsi/aic94xx/aic94xx_scb.c 
b/drivers/scsi/aic94xx/aic94xx_scb.c
index 0febad4..ab35050 100644
--- a/drivers/scsi/aic94xx/aic94xx_scb.c
+++ b/drivers/scsi/aic94xx/aic94xx_scb.c
@@ -458,13 +458,19 @@ static void escb_tasklet_complete(struct asd_ascb *ascb,
tc_abort = le16_to_cpu(tc_abort);
 
list_for_each_entry_safe(a, b, &asd_ha->seq.pend_q, list) {
-   struct sas_task *task = ascb->uldd_task;
+   struct sas_task *task = a->uldd_task;
+
+   if (a->tc_index != tc_abort)
+   continue;
 
-   if (task && a->tc_index == tc_abort) {
+   if (task) {
failed_dev = task->dev;
sas_task_abort(task);
-   break;
+   } else {
+   ASD_DPRINTK("R_T_A for non TASK scb 0x%x\n",
+   a->scb->header.opcode);
}
+   break;
}
 
if (!failed_dev) {
@@ -478,7 +484,7 @@ static void escb_tasklet_complete(struct asd_ascb *ascb,
 * that the EH will wake up and do something.
 */
list_for_each_entry_safe(a, b, &asd_ha->seq.pend_q, list) {
-   struct sas_task *task = ascb->uldd_task;
+   struct sas_task *task = a->uldd_task;
 
if (task &&
task->dev == failed_dev &&




Re: [PATCH] SCSI: fix data corruption caused by ses v2

2008-02-15 Thread James Bottomley
On Wed, 2008-02-13 at 16:25 -0800, Yinghai Lu wrote:
> one system: initrd gets corrupted:
> 
> RAMDISK: Compressed image found at block 0
> RAMDISK: incomplete write (-28 != 2048) 134217728
> crc error
> VFS: Mounted root (ext2 filesystem).
> Freeing unused kernel memory: 388k freed
> init_special_inode: bogus i_mode (17)
> Warning: unable to open an initial console.
> init_special_inode: bogus i_mode (17)
> init_special_inode: bogus i_mode (17)
> Kernel panic - not syncing: No init found.  Try passing init= option to 
> kernel.
> 
> bisected to
> commit 9927c68864e9c39cc317b4f559309ba29e642168
> Author: James Bottomley <[EMAIL PROTECTED]>
> Date:   Sun Feb 3 15:48:56 2008 -0600
> 
> [SCSI] ses: add new Enclosure ULD
> 
> changes:
> 1. change char to unsigned char to avoid a type change later.
> 2. preserve len for page1
> 3. move desc_ptr forward even when the entry is not an
> enclosure_component_device/raid, so that desc_ptr stays at the right position
> 4. record the page7 len, and double-check that desc_ptr is within bounds
> before touching it.
> 5. fix a typo in the subenclosure check: should use hdr_buf instead.
> 
> Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

OK, I added this with a fixup to eliminate the spurious goto

Thanks,

James

---

diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index cbba012..a6d9669 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -561,16 +561,15 @@ static int ses_intf_add(struct class_device *cdev,
if (desc_ptr) {
if (desc_ptr >= buf + page7_len) {
desc_ptr = NULL;
-   goto noname;
+   } else {
+   len = (desc_ptr[2] << 8) + desc_ptr[3];
+   desc_ptr += 4;
+   /* Add trailing zero - pushes into
+* reserved space */
+   desc_ptr[len] = '\0';
+   name = desc_ptr;
}
-   len = (desc_ptr[2] << 8) + desc_ptr[3];
-   desc_ptr += 4;
-   /* Add trailing zero - pushes into
-* reserved space */
-   desc_ptr[len] = '\0';
-   name = desc_ptr;
}
-   noname:
if (type_ptr[0] == ENCLOSURE_COMPONENT_DEVICE ||
type_ptr[0] == ENCLOSURE_COMPONENT_ARRAY_DEVICE) {
 




Re: [PATCH] gdth: convert to PCI hotplug API

2008-02-15 Thread Jeff Garzik

James Bottomley wrote:

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index c825239..1b53e92 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -595,85 +595,107 @@ static int __init gdth_search_isa(ulong32 bios_adr)
 #endif /* CONFIG_ISA */
 
 #ifdef CONFIG_PCI

-static void gdth_search_dev(gdth_pci_str *pcistr, ushort *cnt,
-ushort vendor, ushort dev);
+static gdth_pci_str gdth_pcistr[MAXHA];
+static int gdth_pci_cnt;
+static bool gdth_pci_registered;


Could we get rid of these static arrays and MAXHA entirely?  It should
be possible just to bung the parameters in pci_str into gdth_ha_str and
dump the arrays.



I kept those arrays for one reason: you need them to preserve the existing 
in-driver PCI device sort.


If we can eliminate the sorting, then the array can easily disappear.

I /think/ the sort can be eliminated now because we have pci=reverse, 
but I have not verified that guess.


Jeff





Re: [PATCH] gdth: convert to PCI hotplug API

2008-02-15 Thread Matthew Wilcox
On Fri, Feb 15, 2008 at 10:44:52AM -0500, Jeff Garzik wrote:
> I kept those arrays for one reason: you need them to preserve the existing 
> in-driver PCI device sort.

Just get rid of it.  I got rid of it for sym2 during 2.5 and very few
people have complained.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


Re: aic94xx: failing on high load (another data point)

2008-02-15 Thread Keith Hopkins
On 02/15/2008 11:28 PM, James Bottomley wrote:
> If you're willing, could you try this patch which may correct the
> problem?  It's sort of like falling off a cliff: if you never go near
> the edge (i.e. you upgrade the drive fw) you never fall off;
> alternatively, it would be nice if you could help me put up guard rails
> just in case.

Hi James,

  Thanks for your feedback & suggestions.  Yes, I'll give the patch a try.  It 
might take a few days to get onto the system.  The system/drive isn't IBM, but 
I'll also see if I can track down a firmware update for the protocol errors.

--Keith




Re: ips.c broken since 2.6.23 on x86_64?

2008-02-15 Thread FUJITA Tomonori
On Thu, 14 Feb 2008 17:16:36 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> On Fri 15 Feb at 09:13:16 +0900 [EMAIL PROTECTED] said:
> > 
> > Thanks. So we surely have a bug in the non-breakup part.
> > 
> > I've just found one bug. Can you try this patch against 2.6.24?
> 
> Tested and unfortunately no change.  Behaves same as the breakup-revert patch.

Thanks, and sorry about that.

I'm out of ideas now, so I've split the patch into pieces. Can you please
apply them one by one and tell me which patch breaks the driver?

There are nine patches against 2.6.24:

http://www.kernel.org/pub/linux/kernel/people/tomo/ips/

The first one just reverts the data buffer accessors
conversion. It would be nice if we could simply revert everything, but we
can't: these changes are necessary to compile the driver against
post-2.6.24 kernels.

Really sorry about the troubles,


Bad sector detection in sr.c and sd.c

2008-02-15 Thread Alan Stern
James:

With regard to your commit 366c246de9cec909c5eba4f784c92d1e75b4dc38
(sd: handle bad lba in sense information)...

Shouldn't analogous changes be made in sr_done()?  The code there looks 
rather old and somewhat bogus.  It doesn't need to be quite as careful 
as the code in sd_done() because you can assume that sr.c will never 
handle devices with more than 2 billion sectors (an assumption already 
present in sr_prep_fn()).

Also, do you have any objection to a patch for sd_done() which would
use the scsi_cmnd's residue if bad_lba is outside the interval from
start_lba to end_lba?

Alan Stern



Re: arcmsr + archttp64 calls dma_free_coherent() with irqs disabled - dmesg filled with warnings

2008-02-15 Thread James Bottomley
On Fri, 2008-02-15 at 10:56 -1000, Joshua Hoblitt wrote:
> Hi James,
> 
> Daniel took the time to patch up the 2.6.24 version.  I've tested it and
> the warning messages are gone.  Please take a look at:
> 

> diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c 
> b/drivers/scsi/arcmsr/arcmsr_hba.c
> index f4a202e..4f9ff32 100644
> --- a/drivers/scsi/arcmsr/arcmsr_hba.c
> +++ b/drivers/scsi/arcmsr/arcmsr_hba.c
> @@ -1380,12 +1388,13 @@ static int arcmsr_iop_message_xfer(struct 
> AdapterControlBlock *acb, \
>  
>   case ARCMSR_MESSAGE_READ_RQBUFFER: {
>   unsigned long *ver_addr;
> - dma_addr_t buf_handle;
>   uint8_t *pQbuffer, *ptmpQbuffer;
>   int32_t allxfer_len = 0;
> + void *tmp;
>  
> - ver_addr = pci_alloc_consistent(acb->pdev, 1032, &buf_handle);
> - if (!ver_addr) {
> + tmp = kmalloc(1032, GFP_KERNEL|GFP_DMA);

GFP_DMA is pretty pointless for a buffer which never actually gets anywhere 
near a DMA, isn't it?

> + ver_addr = (unsigned long *)tmp;

No cast needed from void *

> + if (!tmp) {
>   retvalue = ARCMSR_MESSAGE_FAIL;
>   goto message_out;
>   }
> @@ -1421,18 +1430,19 @@ static int arcmsr_iop_message_xfer(struct 
> AdapterControlBlock *acb, \
>   memcpy(pcmdmessagefld->messagedatabuffer, (uint8_t *)ver_addr, 
> allxfer_len);
>   pcmdmessagefld->cmdmessage.Length = allxfer_len;
>   pcmdmessagefld->cmdmessage.ReturnCode = 
> ARCMSR_MESSAGE_RETURNCODE_OK;
> - pci_free_consistent(acb->pdev, 1032, ver_addr, buf_handle);
> + kfree(tmp);
>   }
>   break;
>  
>   case ARCMSR_MESSAGE_WRITE_WQBUFFER: {
>   unsigned long *ver_addr;
> - dma_addr_t buf_handle;
>   int32_t my_empty_len, user_len, wqbuf_firstindex, 
> wqbuf_lastindex;
>   uint8_t *pQbuffer, *ptmpuserbuffer;
> + void *tmp;
>  
> - ver_addr = pci_alloc_consistent(acb->pdev, 1032, &buf_handle);
> - if (!ver_addr) {
> + tmp = kmalloc(1032, GFP_KERNEL|GFP_DMA);
> + ver_addr = (unsigned long *)tmp;

Actually, just do ver_addr = kmalloc(...) instead

> + if (!tmp) {
>   retvalue = ARCMSR_MESSAGE_FAIL;
>   goto message_out;
>   }
> @@ -1482,7 +1492,7 @@ static int arcmsr_iop_message_xfer(struct 
> AdapterControlBlock *acb, \
>   retvalue = ARCMSR_MESSAGE_FAIL;
>   }
>   }
> - pci_free_consistent(acb->pdev, 1032, ver_addr, 
> buf_handle);
> + kfree(tmp);
>   }
>   break;


I can't work out why this was pci_alloc_consistent in the first
place ... it's not used as consistent memory anywhere on the PCI bus


James




Re: arcmsr + archttp64 calls dma_free_coherent() with irqs disabled - dmesg filled with warnings

2008-02-15 Thread Joshua Hoblitt
Hi James,

Daniel took the time to patch up the 2.6.24 version.  I've tested it and
the warning messages are gone.  Please take a look at:

http://bugs.gentoo.org/attachment.cgi?id=143585

Thanks,

-J

--
On Tue, Feb 12, 2008 at 04:30:36PM -0600, James Bottomley wrote:
> 
> On Tue, 2008-02-12 at 12:21 -1000, Joshua Hoblitt wrote:
> > I've got a few compilation errors on head.  Before I try it, is it
> > possible to backport the 76d78300a6eb8b7f08e47703b7e68a659ffc2053 change
> > to 2.6.24.y?
> 
> I don't see any reason why not but I'd have to cherry pick it to see ...
> so you could save me the bother by trying it yourself ...
> 
> James
> 
> 


[APPENDIX PATCH 01/13] block: don't call __end_that_request_first() for clone

2008-02-15 Thread Kiyoshi Ueda
This patch adds a flag to indicate that a request is a clone and
avoids the __end_that_request_first() call for cloned requests
in blk_end_io(), so request-based dm can use blk_end_io() to complete
clones without also completing the data in them.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 block/blk-core.c   |2 +-
 include/linux/blkdev.h |3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

Index: 2.6.25-rc1/block/blk-core.c
===================================================================
--- 2.6.25-rc1.orig/block/blk-core.c
+++ 2.6.25-rc1/block/blk-core.c
@@ -1880,7 +1880,7 @@ static int blk_end_io(struct request *rq
struct request_queue *q = rq->q;
unsigned long flags = 0UL;
 
-   if (blk_fs_request(rq) || blk_pc_request(rq)) {
+   if ((blk_fs_request(rq) || blk_pc_request(rq)) && !blk_cloned_rq(rq)) {
if (__end_that_request_first(rq, error, nr_bytes))
return 1;
 
Index: 2.6.25-rc1/include/linux/blkdev.h
===================================================================
--- 2.6.25-rc1.orig/include/linux/blkdev.h
+++ 2.6.25-rc1/include/linux/blkdev.h
@@ -115,6 +115,7 @@ enum rq_flag_bits {
__REQ_RW_SYNC,  /* request is sync (O_DIRECT) */
__REQ_ALLOCED,  /* request came from our alloc pool */
__REQ_RW_META,  /* metadata io request */
+   __REQ_CLONED,   /* request is a clone of another request */
__REQ_NR_BITS,  /* stops here */
 };
 
@@ -136,6 +137,7 @@ enum rq_flag_bits {
 #define REQ_RW_SYNC(1 << __REQ_RW_SYNC)
 #define REQ_ALLOCED(1 << __REQ_ALLOCED)
 #define REQ_RW_META(1 << __REQ_RW_META)
+#define REQ_CLONED (1 << __REQ_CLONED)
 
 #define BLK_MAX_CDB16
 
@@ -500,6 +502,7 @@ enum {
 #define blk_sorted_rq(rq)  ((rq)->cmd_flags & REQ_SORTED)
 #define blk_barrier_rq(rq) ((rq)->cmd_flags & REQ_HARDBARRIER)
 #define blk_fua_rq(rq) ((rq)->cmd_flags & REQ_FUA)
+#define blk_cloned_rq(rq)  ((rq)->cmd_flags & REQ_CLONED)
 #define blk_bidi_rq(rq)((rq)->next_rq != NULL)
 #define blk_empty_barrier(rq)  (blk_barrier_rq(rq) && blk_fs_request(rq) && 
!(rq)->hard_nr_sectors)
 /* rq->queuelist of dequeued request must be list_empty() */


[APPENDIX PATCH 05/13] dm: remove dead codes

2008-02-15 Thread Kiyoshi Ueda
This patch removes dead code for the noflush suspend path.
No functional change.

This patch is just a cleanup and is not functionally
related to request-based dm, but it is included here due to a
textual dependency.

The dm_queue_flush(md, DM_WQ_FLUSH_ALL, NULL) in dm_suspend()
is never invoked because:
  - The 'goto flush_and_out' is the same as 'goto out', since
    'goto flush_and_out' is called only when '!noflush'
  - If 'r && noflush' is true, the interrupt check code above
    is invoked and we 'goto out'

The DM_WQ_FLUSH_ALL type is used only in dm_suspend(),
so it is no longer needed.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/md/dm.c |   14 +-
 1 files changed, 1 insertion(+), 13 deletions(-)

Index: 2.6.25-rc1/drivers/md/dm.c
===================================================================
--- 2.6.25-rc1.orig/drivers/md/dm.c
+++ 2.6.25-rc1/drivers/md/dm.c
@@ -76,7 +76,6 @@ union map_info *dm_get_mapinfo(struct bi
  */
 struct dm_wq_req {
enum {
-   DM_WQ_FLUSH_ALL,
DM_WQ_FLUSH_DEFERRED,
} type;
struct work_struct work;
@@ -1340,9 +1339,6 @@ static void dm_wq_work(struct work_struc
 
down_write(&md->io_lock);
switch (req->type) {
-   case DM_WQ_FLUSH_ALL:
-   __merge_pushback_list(md);
-   /* pass through */
case DM_WQ_FLUSH_DEFERRED:
__flush_deferred_io(md);
break;
@@ -1472,7 +1468,7 @@ int dm_suspend(struct mapped_device *md,
if (!md->suspended_bdev) {
DMWARN("bdget failed in dm_suspend");
r = -ENOMEM;
-   goto flush_and_out;
+   goto out;
}
 
/*
@@ -1523,14 +1519,6 @@ int dm_suspend(struct mapped_device *md,
 
set_bit(DMF_SUSPENDED, &md->flags);
 
-flush_and_out:
-   if (r && noflush)
-   /*
-* Because there may be already I/Os in the pushback list,
-* flush them before return.
-*/
-   dm_queue_flush(md, DM_WQ_FLUSH_ALL, NULL);
-
 out:
if (r && md->suspended_bdev) {
bdput(md->suspended_bdev);


[APPENDIX PATCH 02/13] block: add request submission interface

2008-02-15 Thread Kiyoshi Ueda
This patch adds a generic request submission interface for request
stacking drivers so that request-based dm can use it to submit
clones to underlying devices.

The request may have been built against the limits of another queue,
so generic limit checks against the submitting queue are needed.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 block/blk-core.c   |   65 +
 include/linux/blkdev.h |1 
 2 files changed, 66 insertions(+)

Index: 2.6.25-rc1/block/blk-core.c
===================================================================
--- 2.6.25-rc1.orig/block/blk-core.c
+++ 2.6.25-rc1/block/blk-core.c
@@ -1510,6 +1510,71 @@ void submit_bio(int rw, struct bio *bio)
 }
 EXPORT_SYMBOL(submit_bio);
 
+/*
+ * Check a request for queue limits
+ */
+static int check_queue_limit(struct request_queue *q, struct request *rq)
+{
+   if (rq->nr_sectors > q->max_sectors ||
+   rq->data_len >> 9 > q->max_hw_sectors) {
+   printk(KERN_ERR "%s: over max size limit.\n", __func__);
+   return 1;
+   }
+
+   /*
+* queue's settings related to segment counting like q->bounce_pfn
+* may differ from that of other stacking queues.
+* Recalculate it to check the request correctly on this queue's
+* limitation.
+*/
+   blk_recalc_rq_segments(rq);
+   if (rq->nr_phys_segments > q->max_phys_segments ||
+   rq->nr_hw_segments > q->max_hw_segments) {
+   printk(KERN_ERR "%s: over max segments limit.\n", __func__);
+   return 1;
+   }
+
+   return 0;
+}
+
+/**
+ * blk_submit_request - Helper for stacking drivers to submit the request
+ * @q:  the queue to submit the request
+ * @rq: the request being queued
+ **/
+void blk_submit_request(struct request_queue *q, struct request *rq)
+{
+   unsigned long flags;
+
+   if (check_queue_limit(q, rq))
+   goto end_io;
+
+#ifdef CONFIG_FAIL_MAKE_REQUEST
+   if (rq->rq_disk && rq->rq_disk->flags & GENHD_FL_FAIL &&
+   should_fail(&fail_make_request, blk_rq_bytes(rq)))
+   goto end_io;
+#endif
+
+   spin_lock_irqsave(q->queue_lock, flags);
+
+   /*
+* Submitting request must be dequeued before calling this function
+* because it will be linked to another request_queue
+*/
+   BUG_ON(blk_queued_rq(rq));
+
+   drive_stat_acct(rq, 1);
+   __elv_add_request(q, rq, ELEVATOR_INSERT_BACK, 0);
+
+   spin_unlock_irqrestore(q->queue_lock, flags);
+
+   return;
+
+end_io:
+   blk_end_request(rq, -EIO, blk_rq_bytes(rq));
+}
+EXPORT_SYMBOL_GPL(blk_submit_request);
+
 /**
  * __end_that_request_first - end I/O on a request
  * @req:  the request being processed
Index: 2.6.25-rc1/include/linux/blkdev.h
===================================================================
--- 2.6.25-rc1.orig/include/linux/blkdev.h
+++ 2.6.25-rc1/include/linux/blkdev.h
@@ -616,6 +616,7 @@ extern void blk_end_sync_rq(struct reque
 extern struct request *blk_get_request(struct request_queue *, int, gfp_t);
 extern void blk_insert_request(struct request_queue *, struct request *, int, 
void *);
 extern void blk_requeue_request(struct request_queue *, struct request *);
+extern void blk_submit_request(struct request_queue *q, struct request *rq);
 extern void blk_plug_device(struct request_queue *);
 extern int blk_remove_plug(struct request_queue *);
 extern void blk_recount_segments(struct request_queue *, struct bio *);


[APPENDIX PATCH 10/13] dm: enable request-based dm

2008-02-15 Thread Kiyoshi Ueda
This patch enables request-based dm.

Request-based dm and bio-based dm coexist.
There are some limitations on how they can be stacked:
  - OK: bio-based dm device on a bio-based dm device
  - OK: bio-based dm device on a request-based dm device
  - OK: request-based dm device on a request-based dm device
  - NG: request-based dm device on a bio-based dm device

The type of a dm device is decided at the first table load.
Until then, mempool creation and queue initialization are deferred.
Once the type of a dm device is decided, it can't be changed.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/md/dm-table.c |   54 +
 drivers/md/dm.c   |  154 ++
 drivers/md/dm.h   |6 +
 3 files changed, 179 insertions(+), 35 deletions(-)

Index: 2.6.25-rc1/drivers/md/dm-table.c
===================================================================
--- 2.6.25-rc1.orig/drivers/md/dm-table.c
+++ 2.6.25-rc1/drivers/md/dm-table.c
@@ -813,6 +813,55 @@ static int setup_indexes(struct dm_table
return 0;
 }
 
+#define DM_HOOK_AT_REQUEST 0
+#define DM_HOOK_AT_BIO 1
+
+/*
+ * Check the consistency of targets' hook type
+ *
+ * Returns
+ *DM_HOOK_AT_REQUEST: the table is for request-based dm
+ *DM_HOOK_AT_BIO: the table is for bio-based dm
+ *negative  : the table is not consistent
+ */
+static int check_table_hook_type(struct dm_table *t)
+{
+   unsigned int i;
+   unsigned int bio_based = 0, rq_based = 0;
+   struct dm_target *ti;
+
+   for (i = 0; i < t->num_targets; i++) {
+   ti = t->targets + i;
+
+   if (ti->type->map_rq)
+   rq_based = 1;
+   else
+   bio_based = 1;
+
+   if (rq_based && bio_based) {
+   DMERR("Inconsistent table: different target types"
+ " mixed up");
+   return -EINVAL;
+   }
+   }
+
+   return rq_based ? DM_HOOK_AT_REQUEST : DM_HOOK_AT_BIO;
+}
+
+static int set_md_hook_type(struct dm_table *t)
+{
+   int r = check_table_hook_type(t);
+
+   switch (r) {
+   case DM_HOOK_AT_REQUEST:
+   return dm_set_md_request_based(t->md);
+   case DM_HOOK_AT_BIO:
+   return dm_set_md_bio_based(t->md);
+   default:
+   return r;
+   }
+}
+
 /*
  * Builds the btree to index the map.
  */
@@ -821,6 +870,11 @@ int dm_table_complete(struct dm_table *t
int r = 0;
unsigned int leaf_nodes;
 
+   /* Setup the mapped_device to bio-based dm or request-based dm */
+   r = set_md_hook_type(t);
+   if (r)
+   return r;
+
check_for_valid_limits(&t->limits);
 
/* how many indexes will the btree have ? */
Index: 2.6.25-rc1/drivers/md/dm.c
===================================================================
--- 2.6.25-rc1.orig/drivers/md/dm.c
+++ 2.6.25-rc1/drivers/md/dm.c
@@ -95,6 +95,7 @@ EXPORT_SYMBOL_GPL(dm_get_rq_mapinfo);
 #define DMF_DELETING 4
 #define DMF_NOFLUSH_SUSPENDING 5
 #define DMF_REQUEST_BASED 6
+#define DMF_BIO_BASED 7
 
 /*
  * Work processed by per-device workqueue.
@@ -1398,6 +1399,115 @@ out:
return r;
 }
 
+static void init_queue(struct request_queue *q, struct mapped_device *md)
+{
+   q->queuedata = md;
+   q->backing_dev_info.congested_fn = dm_any_congested;
+   q->backing_dev_info.congested_data = md;
+   blk_queue_make_request(q, dm_request);
+   blk_queue_bounce_limit(q, BLK_BOUNCE_ANY);
+   q->unplug_fn = dm_unplug_all;
+}
+
+int dm_set_md_request_based(struct mapped_device *md)
+{
+   int r = 0;
+
+   if (test_bit(DMF_REQUEST_BASED, &md->flags))
+   /* Initialization is already done */
+   return 0;
+
+   if (test_bit(DMF_BIO_BASED, &md->flags)) {
+   DMERR("Can't change hook type to request-based from bio-based");
+   return -EINVAL;
+   }
+
+   md->tio_pool = mempool_create_slab_pool(MIN_IOS, _rq_tio_cache);
+   if (!md->tio_pool)
+   return -ENOMEM;
+
+   md->queue = blk_init_queue(dm_request_fn, NULL);
+   if (!md->queue) {
+   DMERR("request queue initialization for request-based failed");
+   r = -ENOMEM;
+   goto out_free_tio_pool;
+   }
+
+   md->saved_make_request_fn = md->queue->make_request_fn;
+   init_queue(md->queue, md);
+   set_bit(QUEUE_FLAG_STACKABLE, &md->queue->queue_flags);
+   md->disk->queue = md->queue;
+   r = blk_register_queue(md->disk);
+   if (r) {
+   DMERR("registration of request queue failed");
+   goto out_cleanup_queue;
+   }
+
+   set_bit(DMF_REQUEST_BASED, &md->flags);
+
+   return 0;
+
+out_cleanup_queue:
+   blk_cleanup_queue(md->queue);
+   md->disk->queue = md->queu

Re: arcmsr + archttp64 calls dma_free_coherent() with irqs disabled - dmesg filled with warnings

2008-02-15 Thread James Bottomley

On Fri, 2008-02-15 at 15:57 -0600, James Bottomley wrote:
> On Fri, 2008-02-15 at 10:56 -1000, Joshua Hoblitt wrote:
> > Hi James,
> > 
> > Daniel took the time to patch up the 2.6.24 version.  I've tested it and
> > the warning messages are gone.  Please take a look at:
> > 
> 
> > diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c 
> > b/drivers/scsi/arcmsr/arcmsr_hba.c
> > index f4a202e..4f9ff32 100644
> > --- a/drivers/scsi/arcmsr/arcmsr_hba.c
> > +++ b/drivers/scsi/arcmsr/arcmsr_hba.c
> > @@ -1380,12 +1388,13 @@ static int arcmsr_iop_message_xfer(struct 
> > AdapterControlBlock *acb, \
> >  
> > case ARCMSR_MESSAGE_READ_RQBUFFER: {
> > unsigned long *ver_addr;
> > -   dma_addr_t buf_handle;
> > uint8_t *pQbuffer, *ptmpQbuffer;
> > int32_t allxfer_len = 0;
> > +   void *tmp;
> >  
> > -   ver_addr = pci_alloc_consistent(acb->pdev, 1032, &buf_handle);
> > -   if (!ver_addr) {
> > +   tmp = kmalloc(1032, GFP_KERNEL|GFP_DMA);
> 
> GFP_DMA is pretty pointless for a buffer which never actually gets anywhere 
> near a DMA, isn't it?
> 
> > +   ver_addr = (unsigned long *)tmp;
> 
> No cast needed from void *
> 
> > +   if (!tmp) {
> > retvalue = ARCMSR_MESSAGE_FAIL;
> > goto message_out;
> > }
> > @@ -1421,18 +1430,19 @@ static int arcmsr_iop_message_xfer(struct 
> > AdapterControlBlock *acb, \
> > memcpy(pcmdmessagefld->messagedatabuffer, (uint8_t *)ver_addr, 
> > allxfer_len);
> > pcmdmessagefld->cmdmessage.Length = allxfer_len;
> > pcmdmessagefld->cmdmessage.ReturnCode = 
> > ARCMSR_MESSAGE_RETURNCODE_OK;
> > -   pci_free_consistent(acb->pdev, 1032, ver_addr, buf_handle);
> > +   kfree(tmp);
> > }
> > break;
> >  
> > case ARCMSR_MESSAGE_WRITE_WQBUFFER: {
> > unsigned long *ver_addr;
> > -   dma_addr_t buf_handle;
> > int32_t my_empty_len, user_len, wqbuf_firstindex, 
> > wqbuf_lastindex;
> > uint8_t *pQbuffer, *ptmpuserbuffer;
> > +   void *tmp;
> >  
> > -   ver_addr = pci_alloc_consistent(acb->pdev, 1032, &buf_handle);
> > -   if (!ver_addr) {
> > +   tmp = kmalloc(1032, GFP_KERNEL|GFP_DMA);

Actually, also all the code around here implies we're in atomic context,
so that GFP_KERNEL can't be right either.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 3/3] block: lld busy status exporting interface

2008-02-15 Thread Kiyoshi Ueda
This patch adds an interface for checking an lld's busy status from
the block layer.
(A scsi patch is also included, just as an example.)
This resolves a performance problem with request stacking devices,
described below.


Some drivers, like the scsi mid layer, stop dispatching requests when
they detect a busy state on a low-level device (host/bus/device).
This lets the requests stay in the I/O scheduler's queue,
where they have a chance of being merged.

Request stacking drivers like request-based dm should follow
the same logic.
However, there is no generic interface for the stacking device
to check whether the underlying device(s) are busy.
If the stacking driver dispatches and submits requests to
a busy underlying device, the requests will sit in
the underlying device's queue with no chance of merging.
This causes a performance problem under bursty I/O load.

With this patch, the busy state of the underlying device is exported
via a queue flag, so the stacking driver can check it and
stop dispatching requests while the device is busy.
The underlying device driver must set/clear the flag appropriately.

For example, scsi sets the busy flag when a low-level device is not
ready or the low-level driver rejects dispatching a command,
and clears the busy flag when a command has been dispatched
successfully, since there may be room to dispatch more (the exact
limit check is done in the next loop iteration, for the next request).

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/scsi/scsi_lib.c |   11 ++-
 include/linux/blkdev.h  |4 
 2 files changed, 14 insertions(+), 1 deletion(-)

Index: 2.6.25-rc1/include/linux/blkdev.h
===
--- 2.6.25-rc1.orig/include/linux/blkdev.h
+++ 2.6.25-rc1/include/linux/blkdev.h
@@ -430,6 +430,7 @@ struct request_queue
 #define QUEUE_FLAG_ELVSWITCH   8   /* don't use elevator, just do FIFO */
 #define QUEUE_FLAG_BIDI        9   /* queue supports bidi requests */
 #define QUEUE_FLAG_STACKABLE   10  /* queue supports request stacking */
+#define QUEUE_FLAG_BUSY        11  /* device/host under queue is busy */
 
 enum {
/*
@@ -477,6 +478,9 @@ enum {
 #define blk_queue_flushing(q)  ((q)->ordseq)
 #define blk_queue_stackable(q) \
test_bit(QUEUE_FLAG_STACKABLE, &(q)->queue_flags)
+#define blk_lld_busy(q)        test_bit(QUEUE_FLAG_BUSY, &(q)->queue_flags)
+#define blk_set_lld_busy(q)set_bit(QUEUE_FLAG_BUSY, &(q)->queue_flags)
+#define blk_clear_lld_busy(q)  clear_bit(QUEUE_FLAG_BUSY, &(q)->queue_flags)
 
 #define blk_fs_request(rq) ((rq)->cmd_type == REQ_TYPE_FS)
 #define blk_pc_request(rq) ((rq)->cmd_type == REQ_TYPE_BLOCK_PC)
Index: 2.6.25-rc1/drivers/scsi/scsi_lib.c
===
--- 2.6.25-rc1.orig/drivers/scsi/scsi_lib.c
+++ 2.6.25-rc1/drivers/scsi/scsi_lib.c
@@ -1448,9 +1448,14 @@ static void scsi_request_fn(struct reque
 * accept it.
 */
req = elv_next_request(q);
-   if (!req || !scsi_dev_queue_ready(q, sdev))
+   if (!req)
break;
 
+   if (!scsi_dev_queue_ready(q, sdev)) {
+   blk_set_lld_busy(q);
+   break;
+   }
+
if (unlikely(!scsi_device_online(sdev))) {
sdev_printk(KERN_ERR, sdev,
"rejecting I/O to offline device\n");
@@ -1506,6 +1511,8 @@ static void scsi_request_fn(struct reque
rtn = scsi_dispatch_cmd(cmd);
spin_lock_irq(q->queue_lock);
if(rtn) {
+   blk_set_lld_busy(q);
+
/* we're refusing the command; because of
 * the way locks get dropped, we need to 
 * check here if plugging is required */
@@ -1514,6 +1521,7 @@ static void scsi_request_fn(struct reque
 
break;
}
+   blk_clear_lld_busy(q);
}
 
goto out;
@@ -1530,6 +1538,7 @@ static void scsi_request_fn(struct reque
 * later time.
 */
spin_lock_irq(q->queue_lock);
+   blk_set_lld_busy(q);
blk_requeue_request(q, req);
sdev->device_busy--;
if(sdev->device_busy == 0)
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[APPENDIX PATCH 07/13] dm: add memory pool

2008-02-15 Thread Kiyoshi Ueda
This patch prepares memory pools for request-based dm.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/md/dm.c |   26 +-
 1 files changed, 25 insertions(+), 1 deletion(-)

Index: 2.6.25-rc1/drivers/md/dm.c
===
--- 2.6.25-rc1.orig/drivers/md/dm.c
+++ 2.6.25-rc1/drivers/md/dm.c
@@ -52,6 +52,22 @@ struct dm_target_io {
union map_info info;
 };
 
+/*
+ * For request based dm.
+ * One of these is allocated per request.
+ *
+ * Since we assume "original request : cloned request = 1 : 1",
+ * a counter for the number of clones like struct dm_io.io_count isn't
+ * needed, and struct dm_io and struct target_io can be merged.
+ */
+struct dm_rq_target_io {
+   struct mapped_device *md;
+   struct dm_target *ti;
+   struct request *orig, clone;
+   int error;
+   union map_info info;
+};
+
 union map_info *dm_get_mapinfo(struct bio *bio)
 {
if (bio && bio->bi_private)
@@ -147,6 +163,7 @@ struct mapped_device {
 #define MIN_IOS 256
 static struct kmem_cache *_io_cache;
 static struct kmem_cache *_tio_cache;
+static struct kmem_cache *_rq_tio_cache; /* tio pool for request-based dm */
 
 static int __init local_init(void)
 {
@@ -162,9 +179,13 @@ static int __init local_init(void)
if (!_tio_cache)
goto out_free_io_cache;
 
+   _rq_tio_cache = KMEM_CACHE(dm_rq_target_io, 0);
+   if (!_rq_tio_cache)
+   goto out_free_tio_cache;
+
r = dm_uevent_init();
if (r)
-   goto out_free_tio_cache;
+   goto out_free_rq_tio_cache;
 
_major = major;
r = register_blkdev(_major, _name);
@@ -178,6 +199,8 @@ static int __init local_init(void)
 
 out_uevent_exit:
dm_uevent_exit();
+out_free_rq_tio_cache:
+   kmem_cache_destroy(_rq_tio_cache);
 out_free_tio_cache:
kmem_cache_destroy(_tio_cache);
 out_free_io_cache:
@@ -188,6 +211,7 @@ out_free_io_cache:
 
 static void local_exit(void)
 {
+   kmem_cache_destroy(_rq_tio_cache);
kmem_cache_destroy(_tio_cache);
kmem_cache_destroy(_io_cache);
unregister_blkdev(_major, _name);
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[APPENDIX PATCH 04/13] block: export blk_end_io

2008-02-15 Thread Kiyoshi Ueda
This patch exports blk_end_io() so that request-based dm can use it
to complete their clone.

Request-based dm can't use the blk_end_request interfaces for its
clones, since its completion callback would be called again.
So another request completion interface, one without the stacking
hook, is needed.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 block/blk-core.c   |6 +++---
 include/linux/blkdev.h |3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

Index: 2.6.25-rc1/block/blk-core.c
===
--- 2.6.25-rc1.orig/block/blk-core.c
+++ 2.6.25-rc1/block/blk-core.c
@@ -1938,9 +1938,8 @@ EXPORT_SYMBOL(end_request);
  * 0 - we are done with this request
  * 1 - this request is not freed yet, it still has pending buffers.
  **/
-static int blk_end_io(struct request *rq, int error, unsigned int nr_bytes,
- unsigned int bidi_bytes,
- int (drv_callback)(struct request *))
+int blk_end_io(struct request *rq, int error, unsigned int nr_bytes,
+  unsigned int bidi_bytes, int (drv_callback)(struct request *))
 {
struct request_queue *q = rq->q;
unsigned long flags = 0UL;
@@ -1967,6 +1966,7 @@ static int blk_end_io(struct request *rq
 
return 0;
 }
+EXPORT_SYMBOL_GPL(blk_end_io);
 
 /**
  * blk_end_request - Helper function for drivers to complete the request.
Index: 2.6.25-rc1/include/linux/blkdev.h
===
--- 2.6.25-rc1.orig/include/linux/blkdev.h
+++ 2.6.25-rc1/include/linux/blkdev.h
@@ -701,6 +701,9 @@ extern int __blk_end_request(struct requ
unsigned int nr_bytes);
 extern int blk_end_bidi_request(struct request *rq, int error,
unsigned int nr_bytes, unsigned int bidi_bytes);
+extern int blk_end_io(struct request *rq, int error,
+ unsigned int nr_bytes, unsigned int bidi_bytes,
+ int (drv_callback)(struct request *));
 extern void blk_async_end_request(struct request *rq, int error);
 extern void end_request(struct request *, int);
 extern void end_queued_request(struct request *, int);
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-15 Thread Tim Pepper
On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said:
> 
> The first one is just reverting the data buffer accessors
> conversion. It would be nice if we could just revert it but we
> can't. These changes are necessary to compile the driver against post
> 2.6.24.

Fujita-san,

Unfortunately (and not too surprisingly given what we've tried so far) with
only the first of your series reverted the driver is working fine for me
again.

I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly
similar-sounding issues with other drivers.  Could some memory be left
uninitialised?  I did try changing all the ips.c kmalloc's to
kzalloc's, but that didn't help.  Also, that thread ties into the PCI
GART.  The machines we've been using are liable to use PCI Calgary,
although given that my .config has:
CONFIG_GART_IOMMU=y
CONFIG_CALGARY_IOMMU=y
# CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
and that when booting this mainline kernel I don't see any
Calgary-related messages like those I get from eg: Ubuntu's
2.6.22-14-server, I'm probably not actually running the Calgary IOMMU
code in these repros.

Anyway, I greatly appreciate your efforts so far in trying to find what
could be wrong here!

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[APPENDIX PATCH 13/13] dm-mpath: convert to request-based

2008-02-15 Thread Kiyoshi Ueda
This patch converts dm-multipath target to request-based from bio-based.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/md/dm-mpath.c |  228 +-
 drivers/md/dm-rq-record.h |   36 +++
 2 files changed, 204 insertions(+), 60 deletions(-)

Index: 2.6.25-rc1/drivers/md/dm-mpath.c
===
--- 2.6.25-rc1.orig/drivers/md/dm-mpath.c
+++ 2.6.25-rc1/drivers/md/dm-mpath.c
@@ -8,8 +8,7 @@
 #include "dm.h"
 #include "dm-path-selector.h"
 #include "dm-hw-handler.h"
-#include "dm-bio-list.h"
-#include "dm-bio-record.h"
+#include "dm-rq-record.h"
 #include "dm-uevent.h"
 
 #include 
@@ -80,7 +79,7 @@ struct multipath {
unsigned pg_init_count; /* Number of times pg_init called */
 
struct work_struct process_queued_ios;
-   struct bio_list queued_ios;
+   struct list_head queued_ios;
unsigned queue_size;
 
struct work_struct trigger_event;
@@ -89,22 +88,22 @@ struct multipath {
 * We must use a mempool of dm_mpath_io structs so that we
 * can resubmit bios on error.
 */
-   mempool_t *mpio_pool;
+   mempool_t *mpio_pool; /* REMOVE ME */
 };
 
 /*
  * Context information attached to each bio we process.
  */
-struct dm_mpath_io {
+struct dm_mpath_io { /* REMOVE ME */
struct pgpath *pgpath;
-   struct dm_bio_details details;
+   struct dm_rq_details details;
 };
 
 typedef int (*action_fn) (struct pgpath *pgpath);
 
 #define MIN_IOS 256/* Mempool size */
 
-static struct kmem_cache *_mpio_cache;
+static struct kmem_cache *_mpio_cache; /* REMOVE ME */
 
 static struct workqueue_struct *kmultipathd;
 static void process_queued_ios(struct work_struct *work);
@@ -174,6 +173,7 @@ static struct multipath *alloc_multipath
m = kzalloc(sizeof(*m), GFP_KERNEL);
if (m) {
INIT_LIST_HEAD(&m->priority_groups);
+   INIT_LIST_HEAD(&m->queued_ios);
spin_lock_init(&m->lock);
m->queue_io = 1;
INIT_WORK(&m->process_queued_ios, process_queued_ios);
@@ -304,12 +304,13 @@ static int __must_push_back(struct multi
dm_noflush_suspending(m->ti));
 }
 
-static int map_io(struct multipath *m, struct bio *bio,
+static int map_io(struct multipath *m, struct request *clone,
  struct dm_mpath_io *mpio, unsigned was_queued)
 {
int r = DM_MAPIO_REMAPPED;
unsigned long flags;
struct pgpath *pgpath;
+   struct block_device *bdev;
 
spin_lock_irqsave(&m->lock, flags);
 
@@ -326,19 +327,28 @@ static int map_io(struct multipath *m, s
if ((pgpath && m->queue_io) ||
(!pgpath && m->queue_if_no_path)) {
/* Queue for the daemon to resubmit */
-   bio_list_add(&m->queued_ios, bio);
+   list_add_tail(&clone->queuelist, &m->queued_ios);
m->queue_size++;
if ((m->pg_init_required && !m->pg_init_in_progress) ||
!m->queue_io)
queue_work(kmultipathd, &m->process_queued_ios);
pgpath = NULL;
+   clone->q = NULL;
+   clone->rq_disk = NULL;
r = DM_MAPIO_SUBMITTED;
-   } else if (pgpath)
-   bio->bi_bdev = pgpath->path.dev->bdev;
-   else if (__must_push_back(m))
+   } else if (pgpath) {
+   bdev = pgpath->path.dev->bdev;
+   clone->q = bdev_get_queue(bdev);
+   clone->rq_disk = bdev->bd_disk;
+   } else if (__must_push_back(m)) {
+   clone->q = NULL;
+   clone->rq_disk = NULL;
r = DM_MAPIO_REQUEUE;
-   else
+   } else {
+   clone->q = NULL;
+   clone->rq_disk = NULL;
r = -EIO;   /* Failed */
+   }
 
mpio->pgpath = pgpath;
 
@@ -378,30 +388,28 @@ static void dispatch_queued_ios(struct m
 {
int r;
unsigned long flags;
-   struct bio *bio = NULL, *next;
struct dm_mpath_io *mpio;
union map_info *info;
+   struct request *clone, *n;
+   LIST_HEAD(cl);
 
spin_lock_irqsave(&m->lock, flags);
-   bio = bio_list_get(&m->queued_ios);
+   list_splice_init(&m->queued_ios, &cl);
spin_unlock_irqrestore(&m->lock, flags);
 
-   while (bio) {
-   next = bio->bi_next;
-   bio->bi_next = NULL;
+   list_for_each_entry_safe(clone, n, &cl, queuelist) {
+   list_del_init(&clone->queuelist);
 
-   info = dm_get_mapinfo(bio);
+   info = dm_get_rq_mapinfo(clone);
mpio = info->ptr;
 
-   r = map_io(m, bio, mpio, 1);
+   r = map_io(m, clone, mpio, 1);
if (r < 0)
-   bio_endio(bio, r);
+   blk_end_req

[APPENDIX PATCH 09/13] dm: add core functions

2008-02-15 Thread Kiyoshi Ueda
This patch adds core functions for request-based dm.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/md/dm.c |  452 +++-
 drivers/md/dm.h |7 
 2 files changed, 456 insertions(+), 3 deletions(-)

Index: 2.6.25-rc1/drivers/md/dm.c
===
--- 2.6.25-rc1.orig/drivers/md/dm.c
+++ 2.6.25-rc1/drivers/md/dm.c
@@ -75,6 +75,14 @@ union map_info *dm_get_mapinfo(struct bi
return NULL;
 }
 
+union map_info *dm_get_rq_mapinfo(struct request *rq)
+{
+   if (rq && rq->end_io_data)
+   return &((struct dm_rq_target_io *)rq->end_io_data)->info;
+   return NULL;
+}
+EXPORT_SYMBOL_GPL(dm_get_rq_mapinfo);
+
 #define MINOR_ALLOCED ((void *)-1)
 
 /*
@@ -86,6 +94,7 @@ union map_info *dm_get_mapinfo(struct bi
 #define DMF_FREEING 3
 #define DMF_DELETING 4
 #define DMF_NOFLUSH_SUSPENDING 5
+#define DMF_REQUEST_BASED 6
 
 /*
  * Work processed by per-device workqueue.
@@ -158,6 +167,9 @@ struct mapped_device {
 
/* forced geometry settings */
struct hd_geometry geometry;
+
+   /* For saving the address of __make_request for request based dm */
+   make_request_fn *saved_make_request_fn;
 };
 
 #define MIN_IOS 256
@@ -395,6 +407,17 @@ static void free_tio(struct mapped_devic
mempool_free(tio, md->tio_pool);
 }
 
+static inline struct dm_rq_target_io *alloc_rq_tio(struct mapped_device *md)
+{
+   return mempool_alloc(md->tio_pool, GFP_ATOMIC);
+}
+
+static inline void free_rq_tio(struct mapped_device *md,
+  struct dm_rq_target_io *tio)
+{
+   mempool_free(tio, md->tio_pool);
+}
+
 static void start_io_acct(struct dm_io *io)
 {
struct mapped_device *md = io->md;
@@ -583,6 +606,181 @@ static void clone_endio(struct bio *bio,
free_tio(md, tio);
 }
 
+static void __requeue_request(struct request_queue *q, struct request *rq)
+{
+   if (elv_queue_empty(q))
+   blk_plug_device(q);
+   blk_requeue_request(q, rq);
+}
+
+static void requeue_request(struct request_queue *q, struct request *rq)
+{
+   unsigned long flags = 0UL;
+
+   spin_lock_irqsave(q->queue_lock, flags);
+   __requeue_request(q, rq);
+   spin_unlock_irqrestore(q->queue_lock, flags);
+}
+
+static void dec_rq_pending(struct dm_rq_target_io *tio)
+{
+   if (!atomic_dec_return(&tio->md->pending))
+   /* nudge anyone waiting on suspend queue */
+   wake_up(&tio->md->wait);
+}
+
+static void blk_update_cloned_rq(struct request *rq, struct request *clone)
+{
+   clone->nr_phys_segments = rq->nr_phys_segments;
+   clone->nr_hw_segments = rq->nr_hw_segments;
+   clone->current_nr_sectors = rq->current_nr_sectors;
+   clone->hard_cur_sectors = rq->hard_cur_sectors;
+   clone->hard_nr_sectors = rq->hard_nr_sectors;
+   clone->nr_sectors = rq->nr_sectors;
+   clone->hard_sector = rq->hard_sector;
+   clone->sector = rq->sector;
+   clone->data_len = rq->data_len;
+   clone->buffer = rq->buffer;
+   clone->data = rq->data;
+   clone->bio = rq->bio;
+   clone->biotail = rq->biotail;
+}
+
+static void finish_clone(struct request *clone)
+{
+   if (!clone->q)
+   /*
+* The clone was not dispatched to an underlying device, which
+* means the caller is not an underlying device driver but dm
+* itself (e.g. dispatch_queued_ios() of dm-multipath).
+* So there is nothing more to do here for this clone.
+*/
+   return;
+
+   /*
+* Just clean up the information about the queue in which the
+* clone was dispatched.
+* The clone is *NOT* actually freed here, because it was allocated
+* from dm's own mempool and REQ_ALLOCED isn't set in clone->cmd_flags.
+*
+* The 'error' and 'nr_bytes' arguments of blk_end_io() don't matter
+* because they aren't used for dm's clones.
+*/
+   if (blk_end_io(clone, 0, 0, 0, NULL))
+   DMWARN("dm ignores the immediate return request of callback.");
+}
+
+static void clean_clone(struct request *clone)
+{
+   finish_clone(clone);
+   clone->special = NULL;
+   clone->errors = 0;
+   clone->endio_error = 0;
+}
+
+/**
+ * Must be called without the queue lock
+ **/
+static int clone_end_request(struct request *clone, int error,
+unsigned int nr_bytes, unsigned int bidi_bytes,
+int (drv_callback)(struct request *))
+{
+   int r = 0, rw = rq_data_dir(clone), requeued = 0;
+   struct dm_rq_target_io *tio = clone->end_io_data;
+   dm_request_endio_first_fn endio_first = tio->ti->type->rq_end_io_first;
+   dm_request_endio_fn endio = tio->ti->type->rq_end_io;
+   dm_request_q

[APPENDIX PATCH 08/13] dm: add target interfaces

2008-02-15 Thread Kiyoshi Ueda
This patch adds target interfaces for request-based dm.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 include/linux/device-mapper.h |   23 +++
 1 files changed, 23 insertions(+)

Index: 2.6.25-rc1/include/linux/device-mapper.h
===
--- 2.6.25-rc1.orig/include/linux/device-mapper.h
+++ 2.6.25-rc1/include/linux/device-mapper.h
@@ -10,6 +10,8 @@
 
 #ifdef __KERNEL__
 
+struct request;
+struct request_queue;
 struct dm_target;
 struct dm_table;
 struct dm_dev;
@@ -45,6 +47,9 @@ typedef void (*dm_dtr_fn) (struct dm_tar
 typedef int (*dm_map_fn) (struct dm_target *ti, struct bio *bio,
  union map_info *map_context);
 
+typedef int (*dm_map_request_fn) (struct dm_target *ti, struct request *clone,
+ union map_info *map_context);
+
 /*
  * Returns:
  * < 0 : error (currently ignored)
@@ -57,6 +62,18 @@ typedef int (*dm_endio_fn) (struct dm_ta
struct bio *bio, int error,
union map_info *map_context);
 
+typedef int (*dm_request_endio_first_fn) (struct dm_target *ti,
+ struct request *clone, int error,
+ union map_info *map_context);
+
+typedef int (*dm_request_endio_fn) (struct dm_target *ti,
+   struct request *clone, int error,
+   union map_info *map_context);
+
+typedef void (*dm_request_queue_in_tgt_fn) (struct dm_target *ti,
+   struct request *clone,
+   union map_info *map_context);
+
 typedef void (*dm_flush_fn) (struct dm_target *ti);
 typedef void (*dm_presuspend_fn) (struct dm_target *ti);
 typedef void (*dm_postsuspend_fn) (struct dm_target *ti);
@@ -71,6 +88,7 @@ typedef int (*dm_message_fn) (struct dm_
 typedef int (*dm_ioctl_fn) (struct dm_target *ti, struct inode *inode,
struct file *filp, unsigned int cmd,
unsigned long arg);
+typedef int (*dm_congested_fn) (struct dm_target *ti);
 
 void dm_error(const char *message);
 
@@ -98,7 +116,11 @@ struct target_type {
dm_ctr_fn ctr;
dm_dtr_fn dtr;
dm_map_fn map;
+   dm_map_request_fn map_rq;
dm_endio_fn end_io;
+   dm_request_endio_first_fn rq_end_io_first;
+   dm_request_endio_fn rq_end_io;
+   dm_request_queue_in_tgt_fn queue_in_tgt;
dm_flush_fn flush;
dm_presuspend_fn presuspend;
dm_postsuspend_fn postsuspend;
@@ -107,6 +129,7 @@ struct target_type {
dm_status_fn status;
dm_message_fn message;
dm_ioctl_fn ioctl;
+   dm_congested_fn congested;
 };
 
 struct io_restrictions {
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] SCSI: fix data corruption caused by ses v2

2008-02-15 Thread Yinghai Lu
On Friday 15 February 2008 07:53:06 am James Bottomley wrote:
> On Wed, 2008-02-13 at 16:25 -0800, Yinghai Lu wrote:
> > one system: the initrd gets corrupted:
> > 
> > RAMDISK: Compressed image found at block 0
> > RAMDISK: incomplete write (-28 != 2048) 134217728
> > crc error
> > VFS: Mounted root (ext2 filesystem).
> > Freeing unused kernel memory: 388k freed
> > init_special_inode: bogus i_mode (17)
> > Warning: unable to open an initial console.
> > init_special_inode: bogus i_mode (17)
> > init_special_inode: bogus i_mode (17)
> > Kernel panic - not syncing: No init found.  Try passing init= option to 
> > kernel.
> > 
> > bisected to
> > commit 9927c68864e9c39cc317b4f559309ba29e642168
> > Author: James Bottomley <[EMAIL PROTECTED]>
> > Date:   Sun Feb 3 15:48:56 2008 -0600
> > 
> > [SCSI] ses: add new Enclosure ULD
> > 
> > changes:
> > 1. change char to unsigned char to avoid type change later.
> > 2. preserve len for page1
> > 3. need to move desc_ptr even if the entry is not
> > enclosure_component_device/raid, so that desc_ptr stays at the
> > right position.
> > 4. record page7 len, and double-check whether desc_ptr is out of
> > bounds before touching it.
> > 5. fix typo in subenclosure checking: should use hdr_buf instead.
> > 
> > Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>
> 
> OK, I added this with a fixup to eliminate the spurious goto
> 

good
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 2/3] block: move internal request completion to kblockd

2008-02-15 Thread Kiyoshi Ueda
This patch eliminates the use of __blk_end_request() and
end_queued_request() in the block layer.

In the current request stacking design, a driver is not ready
for request stacking if it holds the queue lock while completing
a request.
However, some block layer functions do exactly that, so the block
layer itself is not ready for request stacking yet.

To complete all requests without the queue lock held, this patch uses
kblockd in such cases.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 block/blk-barrier.c|8 ++--
 block/blk-core.c   |   49 +
 block/blk-settings.c   |2 ++
 block/blk.h|2 ++
 block/elevator.c   |4 ++--
 include/linux/blkdev.h |   25 +
 6 files changed, 82 insertions(+), 8 deletions(-)

Index: 2.6.25-rc1/include/linux/blkdev.h
===
--- 2.6.25-rc1.orig/include/linux/blkdev.h
+++ 2.6.25-rc1/include/linux/blkdev.h
@@ -236,6 +236,23 @@ struct request {
 
/* for bidi */
struct request *next_rq;
+
+   /*
+* For calling the request completion interface without the queue lock
+* using workqueue.
+*
+* The work handler needs to know the error code and the completion
+* size of the request to complete it.
+* We don't need to pass the completion size to the work handler,
+* because the workqueue completion method doesn't allow partial
+* completion and the work handler can use the whole size of
+* the request.
+* For the error code, we need to pass it to the work handler because
+* no member in the struct request can be used for the purpose.
+* (->errors should not be used, because the upper layer may expect
+*  that driver-specific error codes are there.)
+*/
+   int endio_error;
 };
 
 /*
@@ -393,6 +410,13 @@ struct request_queue
 #if defined(CONFIG_BLK_DEV_BSG)
struct bsg_class_device bsg_dev;
 #endif
+
+   /*
+* For request completion without queue lock.
+* The workqueue completion method doesn't allow partial completion.
+*/
+   struct work_struct  endio_work;
+   struct list_headendio_list;
 };
 
 #define QUEUE_FLAG_CLUSTER 0   /* cluster several segments into 1 */
@@ -669,6 +693,7 @@ extern int __blk_end_request(struct requ
unsigned int nr_bytes);
 extern int blk_end_bidi_request(struct request *rq, int error,
unsigned int nr_bytes, unsigned int bidi_bytes);
+extern void blk_async_end_request(struct request *rq, int error);
 extern void end_request(struct request *, int);
 extern void end_queued_request(struct request *, int);
 extern void end_dequeued_request(struct request *, int);
Index: 2.6.25-rc1/block/elevator.c
===
--- 2.6.25-rc1.orig/block/elevator.c
+++ 2.6.25-rc1/block/elevator.c
@@ -721,7 +721,7 @@ struct request *elv_next_request(struct 
 * not ever see it.
 */
if (blk_empty_barrier(rq)) {
-   end_queued_request(rq, 1);
+   blk_async_end_request(rq, 0);
continue;
}
if (!(rq->cmd_flags & REQ_STARTED)) {
@@ -788,7 +788,7 @@ struct request *elv_next_request(struct 
break;
} else if (ret == BLKPREP_KILL) {
rq->cmd_flags |= REQ_QUIET;
-   end_queued_request(rq, 0);
+   blk_async_end_request(rq, -EIO);
} else {
printk(KERN_ERR "%s: bad return=%d\n", __FUNCTION__,
ret);
Index: 2.6.25-rc1/block/blk-barrier.c
===
--- 2.6.25-rc1.orig/block/blk-barrier.c
+++ 2.6.25-rc1/block/blk-barrier.c
@@ -108,8 +108,7 @@ void blk_ordered_complete_seq(struct req
q->ordseq = 0;
rq = q->orig_bar_rq;
 
-   if (__blk_end_request(rq, q->orderr, blk_rq_bytes(rq)))
-   BUG();
+   blk_async_end_request(rq, q->orderr);
 }
 
 static void pre_flush_end_io(struct request *rq, int error)
@@ -225,10 +224,7 @@ int blk_do_ordered(struct request_queue 
 * This can happen when the queue switches to
 * ORDERED_NONE while this request is on it.
 */
-   blkdev_dequeue_request(rq);
-   if (__blk_end_request(rq, -EOPNOTSUPP,
- blk_rq_bytes(rq)))
-   BUG();
+   blk_async_end_request(rq, -EOPNOTSUPP);
*rqp = NULL;
  

[APPENDIX PATCH 03/13] block: export blk_register_queue

2008-02-15 Thread Kiyoshi Ueda
This patch exports blk_register_queue().

Request-based dm and bio-based dm will coexist, since some target
drivers are a better fit for bio-based dm.
dm decides the hook type for a dm device, and initializes the queue,
when a table is loaded into the device, not at device creation time.
At that point, request-based dm sets q->request_fn and needs to
register the queue again.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 block/blk-sysfs.c |1 +
 1 files changed, 1 insertion(+)

Index: 2.6.25-rc1/block/blk-sysfs.c
===
--- 2.6.25-rc1.orig/block/blk-sysfs.c
+++ 2.6.25-rc1/block/blk-sysfs.c
@@ -295,6 +295,7 @@ int blk_register_queue(struct gendisk *d
 
return 0;
 }
+EXPORT_SYMBOL_GPL(blk_register_queue);
 
 void blk_unregister_queue(struct gendisk *disk)
 {
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 0/3] request stacking and request-based dm-multipath

2008-02-15 Thread Kiyoshi Ueda
Hi Jens, list,

I'm working on device-mapper multipath (dm-multipath) and trying to
move its layer down below the I/O scheduler so that dm-multipath can
make better multipathing decisions.
(This is referred to as request-based dm-multipath.)

In that design, dm-multipath handles and submits struct request to the
underlying devices instead of struct bio.
So requests are stacked, whereas today dm and md stack bios.

The following 3 patches are implementation of the request stacking
framework.
  1/3: add rq->complete_io hook for request stacking
  2/3: move block-layer internal request completion to kblockd
  3/3: export driver's busy state and an example of scsi

The first 2 patches enable request stacking.
The last patch is necessary to avoid a performance problem where
requests are sent down too early while the underlying device is busy.

I also attach the request-based dm-multipath patch-set as an appendix
to show how the request stacking interfaces are used.
(With this version of the patch-set, request-based dm-multipath
 works without userspace (multipath-tools and libdevmapper) change.)

This patch-set should be applied on top of 2.6.25-rc1.

Although I'm planning to discuss this topic in the Storage Workshop, too,
any comments and feedbacks in advance are very appreciated and helpful.


Below is the explanation about needs and details of the request stacking
patches.

SUMMARY
===
Request-based dm-multipath needs this patch-set to hook in before each
chunk of a request is completed, check it for errors, and retry it on
another path if an error is detected.


BACKGROUND
==
The patch-set, which adds the rq->complete_io() hook, is necessary
to allow device stacking at the request level, i.e. request-based
device-mapper multipath.
Currently, device-mapper is implemented as a stacking block device
at the BIO level.  OTOH, request-based dm will stack at the request
level for better multipathing decisions.
To stack devices at the request level, the completion procedure needs
a hook.
For example, dm-multipath has to check for errors and, if necessary,
retry on other paths before returning the I/O result to the upper
layer.
struct request currently has an 'end_io' hook.  But it's called at
a very late stage of completion handling, where the I/O result has
already been returned to the upper layer.
So we need something else here.

The first approach to hooking in the completion of each chunk of a
request was adding a new rq->end_io_first() hook and calling it at
the top of __end_that_request_first().
  - http://marc.info/?l=linux-kernel&m=116656637425880&w=2
  - http://marc.info/?l=linux-scsi&m=115520444515914&w=2
However, Jens pointed out that redesigning rq->end_io() as a full
completion handler would be better:

On Thu, 21 Dec 2006 08:49:47 +0100, Jens Axboe wrote:
> Ok, I see what you are getting at. The current ->end_io() is called when
> the request has fully completed, you want notification for each chunk
> potentially completed.
>
> I think a better design here would be to use ->end_io() as the full
> completion handler, similar to how bio->bi_end_io() works. A request
> originating from __make_request() would set something ala:
.
> instead of calling the functions manually. That would allow you to get
> notification right at the beginning and do what you need, without adding
> a special hook for this.

I thought his comment was reasonable, and I modified the patches
based on his suggestion:
  - http://marc.info/?l=linux-kernel&m=118860086901958&w=2

But the patch was ugly.
After considering it, I came to think that changing the role
of ->end_io() is not feasible.
The role of ->end_io() is a destructor, and its current users only
want to do something at the destruction of struct request; they don't
want to do other work such as data completion.
So I think it is reasonable to add another hook for request stacking
(or to rename the current ->end_io() to ->dtor() and make ->end_io()
 a request stacking hook).

Thanks,
Kiyoshi Ueda
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[APPENDIX PATCH 12/13] dm-mpath: add hw-handler interface

2008-02-15 Thread Kiyoshi Ueda
This patch adds a hw-handler interface for request-based dm-multipath.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/md/dm-hw-handler.h |1 +
 1 files changed, 1 insertion(+)

Index: 2.6.25-rc1/drivers/md/dm-hw-handler.h
===================================================================
--- 2.6.25-rc1.orig/drivers/md/dm-hw-handler.h
+++ 2.6.25-rc1/drivers/md/dm-hw-handler.h
@@ -35,6 +35,7 @@ struct hw_handler_type {
void (*pg_init) (struct hw_handler *hwh, unsigned bypassed,
 struct dm_path *path);
unsigned (*error) (struct hw_handler *hwh, struct bio *bio);
+   unsigned (*error_rq) (struct hw_handler *hwh, struct request *rq);
int (*status) (struct hw_handler *hwh, status_type_t type,
   char *result, unsigned int maxlen);
 };


[APPENDIX PATCH 06/13] dm: tidy local_init

2008-02-15 Thread Kiyoshi Ueda
This patch tidies local_init() as preparation for request-based dm.
No functional change.

This patch is just a cleanup of the code and is not functionally
related to request-based dm, but it is included here due to a
literal dependency.

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/md/dm.c |   34 +-
 1 files changed, 17 insertions(+), 17 deletions(-)

Index: 2.6.25-rc1/drivers/md/dm.c
===================================================================
--- 2.6.25-rc1.orig/drivers/md/dm.c
+++ 2.6.25-rc1/drivers/md/dm.c
@@ -150,40 +150,40 @@ static struct kmem_cache *_tio_cache;
 
 static int __init local_init(void)
 {
-   int r;
+   int r = -ENOMEM;
 
/* allocate a slab for the dm_ios */
_io_cache = KMEM_CACHE(dm_io, 0);
if (!_io_cache)
-   return -ENOMEM;
+   return r;
 
/* allocate a slab for the target ios */
_tio_cache = KMEM_CACHE(dm_target_io, 0);
-   if (!_tio_cache) {
-   kmem_cache_destroy(_io_cache);
-   return -ENOMEM;
-   }
+   if (!_tio_cache)
+   goto out_free_io_cache;
 
r = dm_uevent_init();
-   if (r) {
-   kmem_cache_destroy(_tio_cache);
-   kmem_cache_destroy(_io_cache);
-   return r;
-   }
+   if (r)
+   goto out_free_tio_cache;
 
_major = major;
r = register_blkdev(_major, _name);
-   if (r < 0) {
-   kmem_cache_destroy(_tio_cache);
-   kmem_cache_destroy(_io_cache);
-   dm_uevent_exit();
-   return r;
-   }
+   if (r < 0)
+   goto out_uevent_exit;
 
if (!_major)
_major = r;
 
return 0;
+
+out_uevent_exit:
+   dm_uevent_exit();
+out_free_tio_cache:
+   kmem_cache_destroy(_tio_cache);
+out_free_io_cache:
+   kmem_cache_destroy(_io_cache);
+
+   return r;
 }
 
 static void local_exit(void)


[APPENDIX PATCH 11/13] dm: reject bad table load

2008-02-15 Thread Kiyoshi Ueda
This patch rejects bad table loads for request-based dm.

The following table loads are rejected:
  - tables including a non-stackable device
  - tables shrinking the current restrictions

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 drivers/md/dm-table.c |   48 --
 drivers/md/dm.c   |   25 +
 include/linux/device-mapper.h |9 +++
 3 files changed, 80 insertions(+), 2 deletions(-)

Index: 2.6.25-rc1/drivers/md/dm-table.c
===================================================================
--- 2.6.25-rc1.orig/drivers/md/dm-table.c
+++ 2.6.25-rc1/drivers/md/dm-table.c
@@ -108,6 +108,8 @@ static void combine_restrictions_low(str
lhs->bounce_pfn = min_not_zero(lhs->bounce_pfn, rhs->bounce_pfn);
 
lhs->no_cluster |= rhs->no_cluster;
+
+   lhs->no_stack |= rhs->no_stack;
 }
 
 /*
@@ -578,6 +580,8 @@ void dm_set_device_limits(struct dm_targ
rs->bounce_pfn = min_not_zero(rs->bounce_pfn, q->bounce_pfn);
 
rs->no_cluster |= !test_bit(QUEUE_FLAG_CLUSTER, &q->queue_flags);
+
+   rs->no_stack |= !blk_queue_stackable(q);
 }
 EXPORT_SYMBOL_GPL(dm_set_device_limits);
 
@@ -704,8 +708,13 @@ int dm_split_args(int *argc, char ***arg
return 0;
 }
 
-static void check_for_valid_limits(struct io_restrictions *rs)
+static int check_for_valid_limits(struct io_restrictions *rs,
+ struct mapped_device *md)
 {
+   int r = 0;
+   struct request_queue *q;
+
+   /* Set maximum value if no restriction */
if (!rs->max_sectors)
rs->max_sectors = SAFE_MAX_SECTORS;
if (!rs->max_hw_sectors)
@@ -722,6 +731,39 @@ static void check_for_valid_limits(struc
rs->seg_boundary_mask = -1;
if (!rs->bounce_pfn)
rs->bounce_pfn = -1;
+
+   /* Request-based dm allows to load only request stackable tables */
+   if (dm_request_based(md) && rs->no_stack) {
+   DMERR("table load rejected: including non-stackable devices");
+   return -EINVAL;
+   }
+
+   /* First table loading must be allowed */
+   if (!dm_request_based(md) || !dm_bound_table(md))
+   return 0;
+
+   q  = dm_get_queue(md);
+   if (!q) {
+   DMERR("can't get queue from the mapped device");
+   return -EINVAL;
+   }
+
+   if ((rs->max_sectors < q->max_sectors) ||
+   (rs->max_hw_sectors < q->max_hw_sectors) ||
+   (rs->max_phys_segments < q->max_phys_segments) ||
+   (rs->max_hw_segments < q->max_hw_segments) ||
+   (rs->hardsect_size > q->hardsect_size) ||
+   (rs->max_segment_size < q->max_segment_size) ||
+   (rs->seg_boundary_mask < q->seg_boundary_mask) ||
+   (rs->bounce_pfn < q->bounce_pfn) ||
+   (rs->no_cluster && test_bit(QUEUE_FLAG_CLUSTER, &q->queue_flags))) {
+   DMERR("table load rejected: shrinking current restriction");
+   r = -EINVAL;
+   }
+
+   dm_put_queue(q);
+
+   return r;
 }
 
 int dm_table_add_target(struct dm_table *t, const char *type,
@@ -875,7 +917,9 @@ int dm_table_complete(struct dm_table *t
if (r)
return r;
 
-   check_for_valid_limits(&t->limits);
+   r = check_for_valid_limits(&t->limits, t->md);
+   if (r)
+   return r;
 
/* how many indexes will the btree have ? */
leaf_nodes = dm_div_up(t->num_targets, KEYS_PER_NODE);
Index: 2.6.25-rc1/drivers/md/dm.c
===================================================================
--- 2.6.25-rc1.orig/drivers/md/dm.c
+++ 2.6.25-rc1/drivers/md/dm.c
@@ -96,6 +96,7 @@ EXPORT_SYMBOL_GPL(dm_get_rq_mapinfo);
 #define DMF_NOFLUSH_SUSPENDING 5
 #define DMF_REQUEST_BASED 6
 #define DMF_BIO_BASED 7
+#define DMF_BOUND_TABLE 8
 
 /*
  * Work processed by per-device workqueue.
@@ -1672,6 +1673,7 @@ static int __bind(struct mapped_device *
write_lock(&md->map_lock);
md->map = t;
dm_table_set_restrictions(t, q);
+   set_bit(DMF_BOUND_TABLE, &md->flags);
write_unlock(&md->map_lock);
 
return 0;
@@ -1912,6 +1914,19 @@ static void start_queue(struct request_q
spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
+struct request_queue *dm_get_queue(struct mapped_device *md)
+{
+   if (blk_get_queue(md->queue))
+   return NULL;
+
+   return md->queue;
+}
+
+void dm_put_queue(struct request_queue *q)
+{
+   blk_put_queue(q);
+}
+
 /*
  * Functions to lock and unlock any filesystem running on the
  * device.
@@ -2174,6 +2189,16 @@ int dm_suspended(struct mapped_device *m
return test_bit(DMF_SUSPENDED, &md->flags);
 }
 
+int dm_request_based(struct mapped_device *md)
+{
+   return test_bit(DMF_REQUEST_BASED, &md->flags);
+}
+
+int dm_bound_table(struct mapped_device *md)
+{
+   return test_bit(DMF_BOUND_TABLE, 

[RFC PATCH 1/3] block: add rq->complete_io hook for request stacking

2008-02-15 Thread Kiyoshi Ueda
This patch adds a ->complete_io() hook for request stacking.
Request stacking drivers (such as request-based dm) can set
a callback for completion.
(The hook is not called in blk_end_io(), since request-based dm uses
 it for clone completion in the following appendix patches.)

For the stacking to work without deadlock between stacking devices,
both the submission function and the completion function must be
called without the queue lock.
So the stacking is not available for __blk_end_request(), which
is called with the queue lock held.
The patch adds a queue flag (QUEUE_FLAG_STACKABLE) and a check to
make sure that the stacking is not enabled in drivers calling
__blk_end_request().

This means that only the scsi mid-layer, cciss and i2o are ready for
the stacking.
I believe current dm-multipath users are using scsi, cciss and dasd.
So at least dasd needs to be changed not to use __blk_end_request()
for request-based dm-multipath.



Below is a detailed explanation of why the stacking is not possible
for drivers using __blk_end_request().

If the completion function is called with the queue lock held,
we have 2 deadlock problems:
  a). when the stacking driver completes the hooked request
  b). when the stacking driver submits other requests in that
  completion context

  Suppose we have the following stack:

        +-------------------------+
        | stacking device: DEV#2  |
        +-------------------------+
                     |
        +-------------------------+
        | real device: DEV#1      |
        +-------------------------+

  For example of a):
1. The device driver takes the queue lock for DEV#1
2. The device driver calls __blk_end_request() for the request
   and it is hooked by the stacking driver
3. The stacking driver tries to take the queue lock for DEV#1
   to complete the hooked request
=> deadlock happens on the queue lock for DEV#1

  For example of b), assume that a) is worked around somehow:
1. The device driver takes the queue lock for DEV#1
2. The device driver calls __blk_end_request() for the request
   and it is hooked by the stacking driver
3. The stacking driver completes the hooked request
4. The stacking driver dequeues the next request from DEV#2
   to dispatch quickly due to some performance reasons
5. The stacking driver tries to take the queue lock for DEV#1
   to submit the next request
=> deadlock happens on the queue lock for DEV#1

To prevent such deadlock problems in the stacking, I'd like to say
that drivers which hold the queue lock when completing requests are
not ready for the stacking.
So drivers using __blk_end_request() (and end_[queued|dequeued]_request())
need to be modified under this request stacking design, if we need to
use such devices for the stacking.

To prevent request stacking drivers from using such unstackable
devices, a queue flag, QUEUE_FLAG_STACKABLE, is added.
And device drivers can set it to be used by request stacking drivers.
(scsi patch is included just for an example of the setting.)
To detect wrong use of the flag and the hook, some checks are also
put in __blk_end_request().


Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 block/blk-core.c|   31 +++
 drivers/scsi/scsi_lib.c |7 +++
 include/linux/blkdev.h  |7 +++
 3 files changed, 45 insertions(+)

Index: 2.6.25-rc1/block/blk-core.c
===================================================================
--- 2.6.25-rc1.orig/block/blk-core.c
+++ 2.6.25-rc1/block/blk-core.c
@@ -138,6 +138,7 @@ void rq_init(struct request_queue *q, st
rq->data = NULL;
rq->sense = NULL;
rq->end_io = NULL;
+   rq->complete_io = NULL;
rq->end_io_data = NULL;
rq->next_rq = NULL;
 }
@@ -1917,6 +1918,9 @@ static int blk_end_io(struct request *rq
  **/
 int blk_end_request(struct request *rq, int error, unsigned int nr_bytes)
 {
+   if (rq->complete_io)
+   return rq->complete_io(rq, error, nr_bytes, 0, NULL);
+
return blk_end_io(rq, error, nr_bytes, 0, NULL);
 }
 EXPORT_SYMBOL_GPL(blk_end_request);
@@ -1936,6 +1940,27 @@ EXPORT_SYMBOL_GPL(blk_end_request);
  **/
 int __blk_end_request(struct request *rq, int error, unsigned int nr_bytes)
 {
+   if (unlikely(blk_queue_stackable(rq->q)))
+   printk(KERN_WARNING "dev: %s: Not ready for request stacking, "
+  "but the device driver set it stackable. "
+  "Need to fix the device driver!\n",
+  rq->rq_disk ? rq->rq_disk->disk_name : "?");
+
+   if (unlikely(rq->complete_io)) {
+   /*
+* If we invoke the ->complete_io here, the request submitter's
+* handler would get deadlock on the queue lock.
+*
+* This happens, when the request submitter didn't check
+* whe

Re: ips.c broken since 2.6.23 on x86_64?

2008-02-15 Thread FUJITA Tomonori
On Fri, 15 Feb 2008 14:50:57 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said:
> > 
> > The first one is just reverting the data buffer accessors
> > conversion. It would be nice if we could just revert it but we
> > can't. These changes are necessary to compile the driver against post
> > 2.6.24.
> 
> Fujita-san,
> 
> Unfortunately (and not too surprisingly given what we've tried so far) with
> only the first of your series reverted the driver is working fine for me
> again.

Do you mean that you applied only the following two patches against
2.6.24, and then it doesn't work?

0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch
0002-ips-kill-the-map_single-path-in-ips_scmd_buf_write.patch

If so, the second patch is broken. Did you see a BUG_ON message (I
added some BUG_ONs to the patch)?


> I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly
> similar sounding issues with other drivers.  Could there be some memory
> uninitialised?  I did try changing all the ips.c kmalloc's to kzalloc's,
> but that didn't help.  Also that thread ties into pci gart.  The machines
> we've been using are liable to getting pci calgary although given my
> .config has:
> CONFIG_GART_IOMMU=y
> CONFIG_CALGARY_IOMMU=y
> # CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
> and that when booting this mainline I don't see any Calgary related
> messages like I get from eg: Ubuntu's 2.6.22-14-server...I'm probably not
> actually running the calgary iommu code in these repros.

Yes, probably your machine doesn't use any IOMMU hardware
(the nommu_map_sg function was in your crash log).


> Anyway, I greatly appreciate your efforts so far in trying to find what
> could be wrong here!

Really sorry about the troubles and thanks for testing.


[PATCH 1/2] SCSI/gdth: PCI probe cleanups, prep for PCI hotplug API conversion

2008-02-15 Thread Jeff Garzik
- Reduce uses of gdth_pci_str::pdev, preferring a local variable
  (or function arg) 'pdev' instead.

- Reduce uses of gdth_pcistr array, preferring local variable
  (or function arg) 'pcistr' instead.

- Eliminate lone use of gdth_pci_str::irq, using equivalent
  pdev->irq instead

- Eliminate assign-only gdth_pci_str::io_mm

Note:  If the indentation seems weird, that's because a line was
converted from spaces to tabs when it was modified.

Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
---
NOTE: this patch series supersedes the previous "gdth: convert to PCI
hotplug API" patch.

 drivers/scsi/gdth.c |   59 --
 drivers/scsi/gdth.h |2 -
 2 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 6d67f5c..56c2b91 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -653,7 +653,6 @@ static void __init gdth_search_dev(gdth_pci_str *pcistr, 
ushort *cnt,
 
 /* GDT PCI controller found, resources are already in pdev */
 pcistr[*cnt].pdev = pdev;
-pcistr[*cnt].irq = pdev->irq;
 base0 = pci_resource_flags(pdev, 0);
 base1 = pci_resource_flags(pdev, 1);
 base2 = pci_resource_flags(pdev, 2);
@@ -668,7 +667,6 @@ static void __init gdth_search_dev(gdth_pci_str *pcistr, 
ushort *cnt,
 !(base1 & IORESOURCE_IO)) 
 continue;
 pcistr[*cnt].dpmem = pci_resource_start(pdev, 2);
-pcistr[*cnt].io_mm = pci_resource_start(pdev, 0);
 pcistr[*cnt].io= pci_resource_start(pdev, 1);
 }
 TRACE2(("Controller found at %d/%d, irq %d, dpmem 0x%lx\n",
@@ -913,7 +911,8 @@ static int __init gdth_init_isa(ulong32 
bios_adr,gdth_ha_str *ha)
 #endif /* CONFIG_ISA */
 
 #ifdef CONFIG_PCI
-static int __init gdth_init_pci(gdth_pci_str *pcistr,gdth_ha_str *ha)
+static int __init gdth_init_pci(struct pci_dev *pdev, gdth_pci_str *pcistr,
+   gdth_ha_str *ha)
 {
 register gdt6_dpram_str __iomem *dp6_ptr;
 register gdt6c_dpram_str __iomem *dp6c_ptr;
@@ -925,14 +924,14 @@ static int __init gdth_init_pci(gdth_pci_str 
*pcistr,gdth_ha_str *ha)
 
 TRACE(("gdth_init_pci()\n"));
 
-if (pcistr->pdev->vendor == PCI_VENDOR_ID_INTEL)
+if (pdev->vendor == PCI_VENDOR_ID_INTEL)
 ha->oem_id = OEM_ID_INTEL;
 else
 ha->oem_id = OEM_ID_ICP;
-ha->brd_phys = (pcistr->pdev->bus->number << 8) | (pcistr->pdev->devfn & 
0xf8);
-ha->stype = (ulong32)pcistr->pdev->device;
-ha->irq = pcistr->irq;
-ha->pdev = pcistr->pdev;
+ha->brd_phys = (pdev->bus->number << 8) | (pdev->devfn & 0xf8);
+ha->stype = (ulong32)pdev->device;
+ha->irq = pdev->irq;
+ha->pdev = pdev;
 
 if (ha->pdev->device <= PCI_DEVICE_ID_VORTEX_GDT6000B) {  /* GDT6000/B */
 TRACE2(("init_pci() dpmem %lx irq %d\n",pcistr->dpmem,ha->irq));
@@ -960,8 +959,7 @@ static int __init gdth_init_pci(gdth_pci_str 
*pcistr,gdth_ha_str *ha)
 continue;
 }
 iounmap(ha->brd);
-pci_write_config_dword(pcistr->pdev, 
-   PCI_BASE_ADDRESS_0, i);
+   pci_write_config_dword(pdev, PCI_BASE_ADDRESS_0, i);
 ha->brd = ioremap(i, sizeof(gdt6_dpram_str)); 
 if (ha->brd == NULL) {
 printk("GDT-PCI: Initialization error (DPMEM remap 
error)\n");
@@ -1070,8 +1068,7 @@ static int __init gdth_init_pci(gdth_pci_str 
*pcistr,gdth_ha_str *ha)
 continue;
 }
 iounmap(ha->brd);
-pci_write_config_dword(pcistr->pdev, 
-   PCI_BASE_ADDRESS_2, i);
+   pci_write_config_dword(pdev, PCI_BASE_ADDRESS_2, i);
 ha->brd = ioremap(i, sizeof(gdt6c_dpram_str)); 
 if (ha->brd == NULL) {
 printk("GDT-PCI: Initialization error (DPMEM remap 
error)\n");
@@ -1163,16 +1160,16 @@ static int __init gdth_init_pci(gdth_pci_str 
*pcistr,gdth_ha_str *ha)
 }
 
 /* manipulate config. space to enable DPMEM, start RP controller */
-pci_read_config_word(pcistr->pdev, PCI_COMMAND, &command);
+   pci_read_config_word(pdev, PCI_COMMAND, &command);
 command |= 6;
-pci_write_config_word(pcistr->pdev, PCI_COMMAND, command);
-if (pci_resource_start(pcistr->pdev, 8) == 1UL)
-pci_resource_start(pcistr->pdev, 8) = 0UL;
+   pci_write_config_word(pdev, PCI_COMMAND, command);
+   if (pci_resource_start(pdev, 8) == 1UL)
+   pci_resource_start(pdev, 8) = 0UL;
 i = 0xFEFF0001UL;
-pci_write_config_dword(pcistr->pdev, PCI_ROM_ADDRESS, i);
+   pci_write_config_dword(pdev, PCI_ROM_ADDRESS, i);
 gdth_delay(1);
-pci_write_config_dword(pcistr->pdev, PCI_ROM_ADDRESS,
-   pci_

[PATCH 2/2 v2] SCSI/gdth: convert to PCI hotplug API

2008-02-15 Thread Jeff Garzik
Version 2:
- rediff'd against latest kernel (fix akpm-noticed conflicts)

- remove PCI device sort, which greatly simplifies PCI probe,
  permitting direct, per-HBA function calls rather than an indirect
  route to the same end result.

- remove need for pcistr[]

Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
---
NOTE: MAXHA is still used after this, and must be removed in a separate
patch (after other code is updated).

 drivers/scsi/gdth.c |  196 --
 1 files changed, 94 insertions(+), 102 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 56c2b91..ad9aff2 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -595,123 +595,111 @@ static int __init gdth_search_isa(ulong32 bios_adr)
 #endif /* CONFIG_ISA */
 
 #ifdef CONFIG_PCI
-static void gdth_search_dev(gdth_pci_str *pcistr, ushort *cnt,
-ushort vendor, ushort dev);
+static bool gdth_pci_registered;
 
-static int __init gdth_search_pci(gdth_pci_str *pcistr)
+static bool gdth_search_vortex(ushort device)
 {
-ushort device, cnt;
-
-TRACE(("gdth_search_pci()\n"));
-
-cnt = 0;
-for (device = 0; device <= PCI_DEVICE_ID_VORTEX_GDT6555; ++device)
-gdth_search_dev(pcistr, &cnt, PCI_VENDOR_ID_VORTEX, device);
-for (device = PCI_DEVICE_ID_VORTEX_GDT6x17RP; 
- device <= PCI_DEVICE_ID_VORTEX_GDTMAXRP; ++device)
-gdth_search_dev(pcistr, &cnt, PCI_VENDOR_ID_VORTEX, device);
-gdth_search_dev(pcistr, &cnt, PCI_VENDOR_ID_VORTEX, 
-PCI_DEVICE_ID_VORTEX_GDTNEWRX);
-gdth_search_dev(pcistr, &cnt, PCI_VENDOR_ID_VORTEX, 
-PCI_DEVICE_ID_VORTEX_GDTNEWRX2);
-gdth_search_dev(pcistr, &cnt, PCI_VENDOR_ID_INTEL,
-PCI_DEVICE_ID_INTEL_SRC);
-gdth_search_dev(pcistr, &cnt, PCI_VENDOR_ID_INTEL,
-PCI_DEVICE_ID_INTEL_SRC_XSCALE);
-return cnt;
+   if (device <= PCI_DEVICE_ID_VORTEX_GDT6555)
+   return true;
+   if (device >= PCI_DEVICE_ID_VORTEX_GDT6x17RP &&
+   device <= PCI_DEVICE_ID_VORTEX_GDTMAXRP)
+   return true;
+   if (device == PCI_DEVICE_ID_VORTEX_GDTNEWRX ||
+   device == PCI_DEVICE_ID_VORTEX_GDTNEWRX2)
+   return true;
+   return false;
 }
 
+static int gdth_pci_probe_one(gdth_pci_str *pcistr, gdth_ha_str **ha_out);
+static int gdth_pci_init_one(struct pci_dev *pdev,
+const struct pci_device_id *ent);
+static void gdth_pci_remove_one(struct pci_dev *pdev);
+static void gdth_remove_one(gdth_ha_str *ha);
+
 /* Vortex only makes RAID controllers.
  * We do not really want to specify all 550 ids here, so wildcard match.
  */
-static struct pci_device_id gdthtable[] __maybe_unused = {
-{PCI_VENDOR_ID_VORTEX,PCI_ANY_ID,PCI_ANY_ID, PCI_ANY_ID},
-{PCI_VENDOR_ID_INTEL,PCI_DEVICE_ID_INTEL_SRC,PCI_ANY_ID,PCI_ANY_ID}, 
-
{PCI_VENDOR_ID_INTEL,PCI_DEVICE_ID_INTEL_SRC_XSCALE,PCI_ANY_ID,PCI_ANY_ID}, 
-{0}
+static const struct pci_device_id gdthtable[] = {
+   { PCI_VDEVICE(VORTEX, PCI_ANY_ID) },
+   { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_SRC) },
+   { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_SRC_XSCALE) },
+   { } /* terminate list */
 };
-MODULE_DEVICE_TABLE(pci,gdthtable);
+MODULE_DEVICE_TABLE(pci, gdthtable);
 
-static void __init gdth_search_dev(gdth_pci_str *pcistr, ushort *cnt,
-   ushort vendor, ushort device)
+static struct pci_driver gdth_pci_driver = {
+   .name   = "gdth",
+   .id_table   = gdthtable,
+   .probe  = gdth_pci_init_one,
+   .remove = gdth_pci_remove_one,
+};
+
+static void gdth_pci_remove_one(struct pci_dev *pdev)
 {
-ulong base0, base1, base2;
-struct pci_dev *pdev;
+   gdth_ha_str *ha = pci_get_drvdata(pdev);
+
+   pci_set_drvdata(pdev, NULL);
+
+   list_del(&ha->list);
+   gdth_remove_one(ha);
+
+   pci_disable_device(pdev);
+}
+
+static int gdth_pci_init_one(struct pci_dev *pdev,
+  const struct pci_device_id *ent)
+{
+   ushort vendor = pdev->vendor;
+   ushort device = pdev->device;
+   ulong base0, base1, base2;
+   int rc;
+   gdth_pci_str gdth_pcistr;
+   gdth_ha_str *ha = NULL;
 
-TRACE(("gdth_search_dev() cnt %d vendor %x device %x\n",
-  *cnt, vendor, device));
+   TRACE(("gdth_search_dev() cnt %d vendor %x device %x\n",
+  gdth_ctr_count, vendor, device));
 
-pdev = NULL;
-while ((pdev = pci_get_device(vendor, device, pdev))
-   != NULL) {
-if (pci_enable_device(pdev))
-continue;
-if (*cnt >= MAXHA) {
-pci_dev_put(pdev);
-return;
-}
+   memset(&gdth_pcistr, 0, sizeof(gdth_pcistr));
+
+   if (vendor == PCI_VENDOR_ID_VORTEX && !gdth_search_vortex(device))
+   return -ENODEV;
+
+   rc = pc

[PATCH] qla2xxx: fix compile warning

2008-02-15 Thread FUJITA Tomonori
scsi/qla2xxx/qla_dfs.c: In function 'qla2x00_dfs_fce_show':
scsi/qla2xxx/qla_dfs.c:26: warning: format '%llx' expects type 'long long 
unsigned int', but argument 3 has type 'uint64_t'

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 drivers/scsi/qla2xxx/qla_dfs.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_dfs.c b/drivers/scsi/qla2xxx/qla_dfs.c
index 1479c60..2cd899b 100644
--- a/drivers/scsi/qla2xxx/qla_dfs.c
+++ b/drivers/scsi/qla2xxx/qla_dfs.c
@@ -23,7 +23,7 @@ qla2x00_dfs_fce_show(struct seq_file *s, void *unused)
mutex_lock(&ha->fce_mutex);
 
seq_printf(s, "FCE Trace Buffer\n");
-   seq_printf(s, "In Pointer = %llx\n\n", ha->fce_wr);
+   seq_printf(s, "In Pointer = %llx\n\n", (unsigned long long)ha->fce_wr);
seq_printf(s, "Base = %llx\n\n", (unsigned long long) ha->fce_dma);
seq_printf(s, "FCE Enable Registers\n");
seq_printf(s, "%08x %08x %08x %08x %08x %08x\n",
-- 
1.5.3.4
