Re: ips.c broken since 2.6.23 on x86_64?

2008-02-19 Thread FUJITA Tomonori
On Tue, 19 Feb 2008 10:06:39 -0600
James Bottomley <[EMAIL PROTECTED]> wrote:

> On Tue, 2008-02-19 at 17:02 +0900, FUJITA Tomonori wrote:
> > ips did scsi_add_host(sh, NULL) so scsi_dma_map uses
> > shost_gendev.parent that isn't initialized properly, then the kernel
> > crashes. 2.6.23 and 2.6.24 have this bug.
> > 
> > We can fix this by calling scsi_add_host with pdev->dev, in the
> > standard way (like the following way) but this bug was fixed in the
> > current Linus tree by:
> > 
> > commit 2551a13e61d3c3df6c2da6de5a3ece78e6d67111
> > Author: Jeff Garzik <[EMAIL PROTECTED]>
> > Date:   Thu Dec 13 16:14:10 2007 -0800
> > 
> > [SCSI] ips: handle scsi_add_host() failure, and other err cleanups
> > 
> > 
> > James, the legitimate way to fix stable trees is sending this commit
> > (not sending a patch that was not committed upstream)?
> 
> Well, the upstream patch doesn't look so bad as a stable candidate to my
> eye.  Just because it's an unintended bugfix doesn't automatically
> invalidate it.
> 
> The reason stable likes backports of existing upstream patches is
> because they've supposedly been well tested in upstream.  Although that
> doesn't apply in this case because the other bug rather prevented
> testing, the principle is still sound.
> 
> So, would there be any problems simply backporting this?

Thanks, I see. There is no problem. Please backport this.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-19 Thread James Bottomley
On Tue, 2008-02-19 at 17:02 +0900, FUJITA Tomonori wrote:
> ips did scsi_add_host(sh, NULL) so scsi_dma_map uses
> shost_gendev.parent that isn't initialized properly, then the kernel
> crashes. 2.6.23 and 2.6.24 have this bug.
> 
> We can fix this by calling scsi_add_host with pdev->dev, in the
> standard way (like the following way) but this bug was fixed in the
> current Linus tree by:
> 
> commit 2551a13e61d3c3df6c2da6de5a3ece78e6d67111
> Author: Jeff Garzik <[EMAIL PROTECTED]>
> Date:   Thu Dec 13 16:14:10 2007 -0800
> 
> [SCSI] ips: handle scsi_add_host() failure, and other err cleanups
> 
> 
> James, the legitimate way to fix stable trees is sending this commit
> (not sending a patch that was not committed upstream)?

Well, the upstream patch doesn't look so bad as a stable candidate to my
eye.  Just because it's an unintended bugfix doesn't automatically
invalidate it.

The reason stable likes backports of existing upstream patches is
because they've supposedly been well tested in upstream.  Although that
doesn't apply in this case because the other bug rather prevented
testing, the principle is still sound.

So, would there be any problems simply backporting this?

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-19 Thread FUJITA Tomonori
On Mon, 18 Feb 2008 19:48:49 -0800
"Tim Pepper" <[EMAIL PROTECTED]> wrote:

> On Feb 18, 2008 4:11 PM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > Can you please help me just once more? 2.6.25-rc2 fixed this bug in a
> > bit different way by chance. Please test 2.6.25-rc2 with the attached
> > patch to make sure that ips in 2.6.25 works well.
> 
> Confirmed...the patch below against 2.6.25-rc2 also works for me.

Thanks a lot!

I'll make sure that the ips driver in 2.6.23-stable, 2.6.24-stable
(and 2.6.25) will work well.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-19 Thread FUJITA Tomonori
ips did scsi_add_host(sh, NULL) so scsi_dma_map uses
shost_gendev.parent that isn't initialized properly, then the kernel
crashes. 2.6.23 and 2.6.24 have this bug.

We can fix this by calling scsi_add_host with pdev->dev, in the
standard way (like the following way) but this bug was fixed in the
current Linus tree by:

commit 2551a13e61d3c3df6c2da6de5a3ece78e6d67111
Author: Jeff Garzik <[EMAIL PROTECTED]>
Date:   Thu Dec 13 16:14:10 2007 -0800

[SCSI] ips: handle scsi_add_host() failure, and other err cleanups


James, the legitimate way to fix stable trees is sending this commit
(not sending a patch that was not committed upstream)?


On Mon, 18 Feb 2008 22:32:46 +0900
FUJITA Tomonori <[EMAIL PROTECTED]> wrote:

> On Sun, 17 Feb 2008 15:37:02 -0800
> Tim Pepper <[EMAIL PROTECTED]> wrote:
> 
> > On Mon 19 Feb at 07:31:56 +0900 [EMAIL PROTECTED] said:
> > > 
> > > Can you apply the 0001 and 0002 against 2.6.24 and see how it works?
> > > If it works well, then please apply the 0001, 0002 and 0003.
> > 
> > Fujita-san,
> > 
> > I've started through the patches in order, cumulatively and after applying
> > 0005 things break.  I wont be able to test anything else until tomorrow
> > when I can phycisally reset the machine...
> 
> Great, thanks a lot!
> 
> Can you apply this patch after the 0005 patch and see how it works? If
> it works, then please continue to test 0006, 0007 ...
> 
> 
> diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
> index 05bb6ea..39cdd68 100644
> --- a/drivers/scsi/ips.c
> +++ b/drivers/scsi/ips.c
> @@ -6906,7 +6906,7 @@ ips_register_scsi(int index)
>   sh->max_channel = ha->nbus - 1;
>   sh->can_queue = ha->max_cmds - 1;
>  
> - scsi_add_host(sh, NULL);
> + scsi_add_host(sh, &ha->pcidev->dev);
>   scsi_scan_host(sh);
>  
>   return 0;
> -- 
> 1.5.3.7
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-18 Thread Tim Pepper
On Feb 18, 2008 4:11 PM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> Can you please help me just once more? 2.6.25-rc2 fixed this bug in a
> bit different way by chance. Please test 2.6.25-rc2 with the attached
> patch to make sure that ips in 2.6.25 works well.

Confirmed...the patch below against 2.6.25-rc2 also works for me.

Thank you again Fujita-san!

>
> diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
> index bb152fb..429592a 100644
> --- a/drivers/scsi/ips.c
> +++ b/drivers/scsi/ips.c
> @@ -1576,7 +1576,7 @@ ips_make_passthru(ips_ha_t *ha, struct scsi_cmnd *SC, 
> ips_scb_t *scb, int intr)
> METHOD_TRACE("ips_make_passthru", 1);
>
>  scsi_for_each_sg(SC, sg, scsi_sg_count(SC), i)
> -length += sg[i].length;
> +length += sg->length;
>
> if (length < sizeof (ips_passthru_t)) {
> /* wrong size */
> @@ -3510,15 +3510,16 @@ ips_scmd_buf_write(struct scsi_cmnd *scmd, void 
> *data, unsigned int count)
>  struct scatterlist *sg = scsi_sglist(scmd);
>
>  for (i = 0, xfer_cnt = 0;
> - (i < scsi_sg_count(scmd)) && (xfer_cnt < count); i++) {
> -min_cnt = min(count - xfer_cnt, sg[i].length);
> + (i < scsi_sg_count(scmd)) && (xfer_cnt < count);
> + i++, sg = sg_next(sg)) {
> +min_cnt = min(count - xfer_cnt, sg->length);
>
>  /* kmap_atomic() ensures addressability of the data buffer.*/
>  /* local_irq_save() protects the KM_IRQ0 address slot. */
>  local_irq_save(flags);
> -buffer = kmap_atomic(sg_page(&sg[i]), KM_IRQ0) + 
> sg[i].offset;
> +buffer = kmap_atomic(sg_page(sg), KM_IRQ0) + sg->offset;
>  memcpy(buffer, &cdata[xfer_cnt], min_cnt);
> -kunmap_atomic(buffer - sg[i].offset, KM_IRQ0);
> +kunmap_atomic(buffer - sg->offset, KM_IRQ0);
>  local_irq_restore(flags);
>
>  xfer_cnt += min_cnt;
> @@ -3543,15 +3544,16 @@ ips_scmd_buf_read(struct scsi_cmnd *scmd, void *data, 
> unsigned int count)
>  struct scatterlist *sg = scsi_sglist(scmd);
>
>  for (i = 0, xfer_cnt = 0;
> - (i < scsi_sg_count(scmd)) && (xfer_cnt < count); i++) {
> -min_cnt = min(count - xfer_cnt, sg[i].length);
> + (i < scsi_sg_count(scmd)) && (xfer_cnt < count);
> + i++, sg = sg_next(sg)) {
> +min_cnt = min(count - xfer_cnt, sg->length);
>
>  /* kmap_atomic() ensures addressability of the data buffer.*/
>  /* local_irq_save() protects the KM_IRQ0 address slot. */
>  local_irq_save(flags);
> -buffer = kmap_atomic(sg_page(&sg[i]), KM_IRQ0) + 
> sg[i].offset;
> +buffer = kmap_atomic(sg_page(sg), KM_IRQ0) + sg->offset;
>  memcpy(&cdata[xfer_cnt], buffer, min_cnt);
> -kunmap_atomic(buffer - sg[i].offset, KM_IRQ0);
> +kunmap_atomic(buffer - sg->offset, KM_IRQ0);
>  local_irq_restore(flags);
>
>  xfer_cnt += min_cnt;
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-18 Thread FUJITA Tomonori
On Mon, 18 Feb 2008 15:30:58 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> On Mon 18 Feb at 22:32:46 +0900 [EMAIL PROTECTED] said:
> > 
> > diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
> > index 05bb6ea..39cdd68 100644
> > --- a/drivers/scsi/ips.c
> > +++ b/drivers/scsi/ips.c
> > @@ -6906,7 +6906,7 @@ ips_register_scsi(int index)
> > sh->max_channel = ha->nbus - 1;
> > sh->can_queue = ha->max_cmds - 1;
> > 
> > -   scsi_add_host(sh, NULL);
> > +   scsi_add_host(sh, &ha->pcidev->dev);
> > scsi_scan_host(sh);
> > 
> > return 0;
> 
> Fujita-san,
> 
> This applies and runs well on top of your 0005 patch!  The rest of the
> patches also then apply in order and run successfully.

Great, thanks a lot!


> Just to confirm, I applied the above alone to a clean 2.6.24 and things
> again build and run successfully.  For completeness I also reproduced
> the problem against 2.6.23.16 and verified the above patch fixes on that
> kernel version as well.

Nice. There is another bug on 2.6.24 but we rarely hit this so 2.6.24
works most of the time:

http://marc.info/?l=linux-scsi&m=120303487528875&w=2


> Assuming this patch is accepted for 2.6.25, please also queue it for
> the 2.6.23/24 stable trees.

Yes, I will take care about it.

Can you please help me just once more? 2.6.25-rc2 fixed this bug in a
bit different way by chance. Please test 2.6.25-rc2 with the attached
patch to make sure that ips in 2.6.25 works well.


> Thank you very much for your help in tracking this issue down!

No problem. I should have fixed it long time ago. Really sorry about
it.


diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index bb152fb..429592a 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -1576,7 +1576,7 @@ ips_make_passthru(ips_ha_t *ha, struct scsi_cmnd *SC, 
ips_scb_t *scb, int intr)
METHOD_TRACE("ips_make_passthru", 1);
 
 scsi_for_each_sg(SC, sg, scsi_sg_count(SC), i)
-length += sg[i].length;
+length += sg->length;
 
if (length < sizeof (ips_passthru_t)) {
/* wrong size */
@@ -3510,15 +3510,16 @@ ips_scmd_buf_write(struct scsi_cmnd *scmd, void *data, 
unsigned int count)
 struct scatterlist *sg = scsi_sglist(scmd);
 
 for (i = 0, xfer_cnt = 0;
- (i < scsi_sg_count(scmd)) && (xfer_cnt < count); i++) {
-min_cnt = min(count - xfer_cnt, sg[i].length);
+ (i < scsi_sg_count(scmd)) && (xfer_cnt < count);
+ i++, sg = sg_next(sg)) {
+min_cnt = min(count - xfer_cnt, sg->length);
 
 /* kmap_atomic() ensures addressability of the data buffer.*/
 /* local_irq_save() protects the KM_IRQ0 address slot. */
 local_irq_save(flags);
-buffer = kmap_atomic(sg_page(&sg[i]), KM_IRQ0) + sg[i].offset;
+buffer = kmap_atomic(sg_page(sg), KM_IRQ0) + sg->offset;
 memcpy(buffer, &cdata[xfer_cnt], min_cnt);
-kunmap_atomic(buffer - sg[i].offset, KM_IRQ0);
+kunmap_atomic(buffer - sg->offset, KM_IRQ0);
 local_irq_restore(flags);
 
 xfer_cnt += min_cnt;
@@ -3543,15 +3544,16 @@ ips_scmd_buf_read(struct scsi_cmnd *scmd, void *data, 
unsigned int count)
 struct scatterlist *sg = scsi_sglist(scmd);
 
 for (i = 0, xfer_cnt = 0;
- (i < scsi_sg_count(scmd)) && (xfer_cnt < count); i++) {
-min_cnt = min(count - xfer_cnt, sg[i].length);
+ (i < scsi_sg_count(scmd)) && (xfer_cnt < count);
+ i++, sg = sg_next(sg)) {
+min_cnt = min(count - xfer_cnt, sg->length);
 
 /* kmap_atomic() ensures addressability of the data buffer.*/
 /* local_irq_save() protects the KM_IRQ0 address slot. */
 local_irq_save(flags);
-buffer = kmap_atomic(sg_page(&sg[i]), KM_IRQ0) + sg[i].offset;
+buffer = kmap_atomic(sg_page(sg), KM_IRQ0) + sg->offset;
 memcpy(&cdata[xfer_cnt], buffer, min_cnt);
-kunmap_atomic(buffer - sg[i].offset, KM_IRQ0);
+kunmap_atomic(buffer - sg->offset, KM_IRQ0);
 local_irq_restore(flags);
 
 xfer_cnt += min_cnt;
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-18 Thread Tim Pepper
On Mon 18 Feb at 06:57:14 -0800 [EMAIL PROTECTED] said:
> The path needs to be triggered, it is the path to handle spoofing
> of the Adapter's inquiry.
> 
> You need more printk instrumentation to determine *why* it is not
> reaching that code path. What is the result of scb->scsi_cmd.
> scb->bus, ips_is_passthru(scb->scsi_cmd)?
> 
> The sg breakup issue may need to be looked at, but keep in mind
> the driver is going down a path that was not intended.

Mark,

Fujita Tomonori has noted the following:

-   scsi_add_host(sh, NULL);
+   scsi_add_host(sh, &ha->pcidev->dev);

which appears to fix things for me.

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-18 Thread Tim Pepper
On Mon 18 Feb at 22:32:46 +0900 [EMAIL PROTECTED] said:
> 
> diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
> index 05bb6ea..39cdd68 100644
> --- a/drivers/scsi/ips.c
> +++ b/drivers/scsi/ips.c
> @@ -6906,7 +6906,7 @@ ips_register_scsi(int index)
>   sh->max_channel = ha->nbus - 1;
>   sh->can_queue = ha->max_cmds - 1;
> 
> - scsi_add_host(sh, NULL);
> + scsi_add_host(sh, &ha->pcidev->dev);
>   scsi_scan_host(sh);
> 
>   return 0;

Fujita-san,

This applies and runs well on top of your 0005 patch!  The rest of the
patches also then apply in order and run successfully.

Just to confirm, I applied the above alone to a clean 2.6.24 and things
again build and run successfully.  For completeness I also reproduced
the problem against 2.6.23.16 and verified the above patch fixes on that
kernel version as well.

Assuming this patch is accepted for 2.6.25, please also queue it for
the 2.6.23/24 stable trees.

Thank you very much for your help in tracking this issue down!

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: ips.c broken since 2.6.23 on x86_64?

2008-02-18 Thread Salyzyn, Mark
The path needs to be triggered, it is the path to handle spoofing of the 
Adapter's inquiry.

You need more printk instrumentation to determine *why* it is not reaching that 
code path. What is the result of scb->scsi_cmd. scb->bus, 
ips_is_passthru(scb->scsi_cmd)?

The sg breakup issue may need to be looked at, but keep in mind the driver is 
going down a path that was not intended.

Sincerely -- Mark Salyzyn

> -Original Message-
> From: Tim Pepper [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 13, 2008 7:04 PM
> To: linux-scsi@vger.kernel.org
> Cc: FUJITA Tomonori; Salyzyn, Mark
> Subject: Re: ips.c broken since 2.6.23 on x86_64?
>
> (replaced missing cc's including linux-scsi)
>
> On Wed 13 Feb at 14:39:07 -0800 [EMAIL PROTECTED] said:
> >
> > - Is the command path code even reaching the controller's
> processor inquiry
> >   spoofing section?
> >
> > if (scb->scsi_cmd->cmnd[0] == INQUIRY) {
> > IPS_SCSI_INQ_DATA inquiry;
> >
> > memset(&inquiry, 0,
> >sizeof (IPS_SCSI_INQ_DATA));
> >
> > inquiry.DeviceType =
> > IPS_SCSI_INQ_TYPE_PROCESSOR;
>
> I just added printk's in ips_send_cmd()'s INQUIRY case just before
> the above condition test and just within the conditional block.
> Neither showed on the console.
>
> > The target_id check may be flawed? It is not using the
> > scmd/sdev accessors and could be the wrong value as a
> result. Wild goose
> > since arrays are working correctly (?)
>
> In the original case the arrays appeared to be working
> correctly although
> we were worried.  In the repro case we don't actually have any disk
> attached...Forgot to mention that in the original email.
>
> > - There appears to be a missing 'break' statement for the
> START_STOP command
> >   immediately preceding the TEST_UNIT_READY/INQUIRY cases.
> What are the side
> >   effects of that? It appears to be innocuous.
>
> No change noted with the missing break added.
>
> > - ips_scmd_buf_write is used for array capacity and other
> fiddly bits and
> >   must be functioning correctly (?), so I can NOT harm it's
> functionality
> >   except to question if the kmap_atomic returned a non-null
> value, it's
> >   return value apparently is not checked. It's failure
> could speak of
> >   problem(s) elsewhere.
>
> I've got a printk there and haven't seen any output from it.
> Haven't seen
> anything from any of the printk's I've tried actually before:
>
> In the run above to check if ips_send_cmd() hits the
> condition you were
> curious about...I did capture the following from printk's I added in
> ips_allocatescbs():
>
> pci_alloc_consistent returns ha->scbs@ 0x18446604437762007040
> pci_alloc_consistent returns ips_sg.list @ 0x18446604437762002944
> pci_alloc_consistent returns ha->scbs@ 0x18446604437698871296
> pci_alloc_consistent returns ips_sg.list @ 0x18446604437756837888
>
> I take that as ips_init_phase2() being called and presumable returning
> SUCCESS.
>
> --
> Tim Pepper  <[EMAIL PROTECTED]>
> IBM Linux Technology Center
>
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-18 Thread FUJITA Tomonori
On Sun, 17 Feb 2008 15:37:02 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> On Mon 19 Feb at 07:31:56 +0900 [EMAIL PROTECTED] said:
> > 
> > Can you apply the 0001 and 0002 against 2.6.24 and see how it works?
> > If it works well, then please apply the 0001, 0002 and 0003.
> 
> Fujita-san,
> 
> I've started through the patches in order, cumulatively and after applying
> 0005 things break.  I wont be able to test anything else until tomorrow
> when I can phycisally reset the machine...

Great, thanks a lot!

Can you apply this patch after the 0005 patch and see how it works? If
it works, then please continue to test 0006, 0007 ...


diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 05bb6ea..39cdd68 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -6906,7 +6906,7 @@ ips_register_scsi(int index)
sh->max_channel = ha->nbus - 1;
sh->can_queue = ha->max_cmds - 1;
 
-   scsi_add_host(sh, NULL);
+   scsi_add_host(sh, &ha->pcidev->dev);
scsi_scan_host(sh);
 
return 0;
-- 
1.5.3.7

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread Tim Pepper
On Mon 19 Feb at 07:31:56 +0900 [EMAIL PROTECTED] said:
> 
> Can you apply the 0001 and 0002 against 2.6.24 and see how it works?
> If it works well, then please apply the 0001, 0002 and 0003.

Fujita-san,

I've started through the patches in order, cumulatively and after applying
0005 things break.  I wont be able to test anything else until tomorrow
when I can phycisally reset the machine...

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread Tim Pepper
On Mon 18 Feb at 07:31:56 +0900 [EMAIL PROTECTED] said:
> 
> Can you apply the 0001 and 0002 against 2.6.24 and see how it works?
> If it works well, then please apply the 0001, 0002 and 0003.
> 
> Please the patches in a step-by-step way and tell me which patch
> breaks the driver.

I wasn't thinking Friday...we already expected 0001 to fix it, so the
hope is one of the others breaks it and that is useful info.  I'll apply
and test the rest of the series.  Sorry!

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread FUJITA Tomonori
On Sun, 17 Feb 2008 13:15:40 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> On Sat 16 Feb at 09:41:48 +0900 [EMAIL PROTECTED] said:
> > 
> > Do you mean that you applied only the following two patches against
> > 2.6.24, and then it doesn't work?
> > 
> > 0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch
> 
> I only applies the 0001 patch and the driver appeared to function fine.

I see, thanks. It would be nice if we can push only the 0001 patch to
mainline, however as I explained, we can't unfortunately. The
0002-0009 patches are necessary for post 2.6.24.


> > If so, the second patch is broken. Did you saw BUG_ON message (I added
> > some BUG_ON to the patch)?
> 
> I did not apply the 0002 patch.

Can you apply the 0001 and 0002 against 2.6.24 and see how it works?
If it works well, then please apply the 0001, 0002 and 0003.

Please the patches in a step-by-step way and tell me which patch
breaks the driver.

Thanks,
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread Tim Pepper
On Sun 17 Feb at 15:09:06 +0200 [EMAIL PROTECTED] said:
> > 
> > You just need to also check for sg != NULL in the for() loop.
> > 
> > Tim Please test below patch. It's ontop of 2.6.24 but should also apply to
> > 2.6.25-rcx


> Disregard that patch it is totally wrong. That's not it. Sorry
> 
> Boaz
> 

You sure?  I was about to give it a quick run...but I guess I wont then. :)

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread Tim Pepper
On Sat 16 Feb at 09:41:48 +0900 [EMAIL PROTECTED] said:
> 
> Do you mean that you applied only the following two patches against
> 2.6.24, and then it doesn't work?
> 
> 0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch

I only applies the 0001 patch and the driver appeared to function fine.

> If so, the second patch is broken. Did you saw BUG_ON message (I added
> some BUG_ON to the patch)?

I did not apply the 0002 patch.

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread Boaz Harrosh
On Sun, Feb 17 2008 at 14:52 +0200, Boaz Harrosh <[EMAIL PROTECTED]> wrote:
> On Sat, Feb 16 2008 at 2:41 +0200, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
>> On Fri, 15 Feb 2008 14:50:57 -0800
>> Tim Pepper <[EMAIL PROTECTED]> wrote:
>>
>>> On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said:
 The first one is just reverting the data buffer accessors
 conversion. It would be nice if we could just revert it but we
 can't. These changes are necessary to compile the driver against post
 2.6.24.
>>> Fujita-san,
>>>
>>> Unfortunately (and not too surprisingly given what we've tried so far) with
>>> only the first of your series reverted the driver is working fine for me
>>> again.
>> Do you mean that you applied only the following two patches against
>> 2.6.24, and then it doesn't work?
>>
>> 0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch
>> 0002-ips-kill-the-map_single-path-in-ips_scmd_buf_write.patch
>>
>> If so, the second patch is broken. Did you saw BUG_ON message (I added
>> some BUG_ON to the patch)?
>>
>>
>>> I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly
>>> similar sounding issues with other drivers.  Could there be some memory
>>> uninitialised?  I did try changing all the ips.c kmalloc's to kzalloc's,
>>> but that didn't help.  Also that thread ties into pci gart.  The machines
>>> we've been using are liable to getting pci calgary although given my
>>> .config has:
>>> CONFIG_GART_IOMMU=y
>>> CONFIG_CALGARY_IOMMU=y
>>> # CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
>>> and that when booting this mainline I don't see any Calgary related
>>> messages like I get from eg: Ubuntu's 2.6.22-14-server...I'm probably not
>>> actually running the calgary iommu code in these repros.
>> Yes, probabaly, your machine doesn't use any IOMMU hardware
>> (nommu_map_sg function was in your crash log).
>>
>>
>>> Anyway, I greatly appreciate your efforts so far in trying to find what
>>> could be wrong here!
>> Really sorry about the troubles and thanks for testing.
>> -
> Tomo hi
> It looks like the same bug we had with USB's isd200 and protocol.c. An 
> overflow
> of a data buffer bigger then then the sglist. There 2 it was in the INQUIRY 
> command.
> 
> You just need to also check for sg != NULL in the for() loop.
> 
> Tim Please test below patch. It's ontop of 2.6.24 but should also apply to
> 2.6.25-rcx
> 
> Boaz
> -- 
> From ec20bea25c9fe2400378b19c128b15fef3c7cbb6 Mon Sep 17 00:00:00 2001
> From: Boaz Harrosh <[EMAIL PROTECTED]>
> Date: Sun, 17 Feb 2008 14:50:25 +0200
> Subject: [PATCH] ips: Avoid overflow in writing scsi command data
> 
> Signed-off-by: Boaz Harrosh <[EMAIL PROTECTED]>
> ---
>  drivers/scsi/ips.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
> index 5c5a9b2..1d12253 100644
> --- a/drivers/scsi/ips.c
> +++ b/drivers/scsi/ips.c
> @@ -3517,7 +3517,7 @@ ips_scmd_buf_write(struct scsi_cmnd *scmd, void *data, 
> unsigned int count)

Disregard that patch it is totally wrong. That's not it. Sorry

Boaz

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread Boaz Harrosh
On Sat, Feb 16 2008 at 2:41 +0200, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> On Fri, 15 Feb 2008 14:50:57 -0800
> Tim Pepper <[EMAIL PROTECTED]> wrote:
> 
>> On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said:
>>> The first one is just reverting the data buffer accessors
>>> conversion. It would be nice if we could just revert it but we
>>> can't. These changes are necessary to compile the driver against post
>>> 2.6.24.
>> Fujita-san,
>>
>> Unfortunately (and not too surprisingly given what we've tried so far) with
>> only the first of your series reverted the driver is working fine for me
>> again.
> 
> Do you mean that you applied only the following two patches against
> 2.6.24, and then it doesn't work?
> 
> 0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch
> 0002-ips-kill-the-map_single-path-in-ips_scmd_buf_write.patch
> 
> If so, the second patch is broken. Did you saw BUG_ON message (I added
> some BUG_ON to the patch)?
> 
> 
>> I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly
>> similar sounding issues with other drivers.  Could there be some memory
>> uninitialised?  I did try changing all the ips.c kmalloc's to kzalloc's,
>> but that didn't help.  Also that thread ties into pci gart.  The machines
>> we've been using are liable to getting pci calgary although given my
>> .config has:
>> CONFIG_GART_IOMMU=y
>> CONFIG_CALGARY_IOMMU=y
>> # CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
>> and that when booting this mainline I don't see any Calgary related
>> messages like I get from eg: Ubuntu's 2.6.22-14-server...I'm probably not
>> actually running the calgary iommu code in these repros.
> 
> Yes, probabaly, your machine doesn't use any IOMMU hardware
> (nommu_map_sg function was in your crash log).
> 
> 
>> Anyway, I greatly appreciate your efforts so far in trying to find what
>> could be wrong here!
> 
> Really sorry about the troubles and thanks for testing.
> -
Tomo hi
It looks like the same bug we had with USB's isd200 and protocol.c. An overflow
of a data buffer bigger then then the sglist. There 2 it was in the INQUIRY 
command.

You just need to also check for sg != NULL in the for() loop.

Tim Please test below patch. It's ontop of 2.6.24 but should also apply to
2.6.25-rcx

Boaz
-- 
>From ec20bea25c9fe2400378b19c128b15fef3c7cbb6 Mon Sep 17 00:00:00 2001
From: Boaz Harrosh <[EMAIL PROTECTED]>
Date: Sun, 17 Feb 2008 14:50:25 +0200
Subject: [PATCH] ips: Avoid overflow in writing scsi command data

Signed-off-by: Boaz Harrosh <[EMAIL PROTECTED]>
---
 drivers/scsi/ips.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 5c5a9b2..1d12253 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -3517,7 +3517,7 @@ ips_scmd_buf_write(struct scsi_cmnd *scmd, void *data, 
unsigned int count)
 struct scatterlist *sg = scsi_sglist(scmd);
 
 for (i = 0, xfer_cnt = 0;
- (i < scsi_sg_count(scmd)) && (xfer_cnt < count); i++) {
+ (i < scsi_sg_count(scmd)) && (xfer_cnt < count) && sg; i++) {
 min_cnt = min(count - xfer_cnt, sg[i].length);
 
 /* kmap_atomic() ensures addressability of the data buffer.*/
-- 
1.5.3.3

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-15 Thread FUJITA Tomonori
On Fri, 15 Feb 2008 14:50:57 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said:
> > 
> > The first one is just reverting the data buffer accessors
> > conversion. It would be nice if we could just revert it but we
> > can't. These changes are necessary to compile the driver against post
> > 2.6.24.
> 
> Fujita-san,
> 
> Unfortunately (and not too surprisingly given what we've tried so far) with
> only the first of your series reverted the driver is working fine for me
> again.

Do you mean that you applied only the following two patches against
2.6.24, and then it doesn't work?

0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch
0002-ips-kill-the-map_single-path-in-ips_scmd_buf_write.patch

If so, the second patch is broken. Did you saw BUG_ON message (I added
some BUG_ON to the patch)?


> I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly
> similar sounding issues with other drivers.  Could there be some memory
> uninitialised?  I did try changing all the ips.c kmalloc's to kzalloc's,
> but that didn't help.  Also that thread ties into pci gart.  The machines
> we've been using are liable to getting pci calgary although given my
> .config has:
> CONFIG_GART_IOMMU=y
> CONFIG_CALGARY_IOMMU=y
> # CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
> and that when booting this mainline I don't see any Calgary related
> messages like I get from eg: Ubuntu's 2.6.22-14-server...I'm probably not
> actually running the calgary iommu code in these repros.

Yes, probabaly, your machine doesn't use any IOMMU hardware
(nommu_map_sg function was in your crash log).


> Anyway, I greatly appreciate your efforts so far in trying to find what
> could be wrong here!

Really sorry about the troubles and thanks for testing.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-15 Thread Tim Pepper
On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said:
> 
> The first one is just reverting the data buffer accessors
> conversion. It would be nice if we could just revert it but we
> can't. These changes are necessary to compile the driver against post
> 2.6.24.

Fujita-san,

Unfortunately (and not too surprisingly given what we've tried so far) with
only the first of your series reverted the driver is working fine for me
again.

I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly
similar sounding issues with other drivers.  Could there be some memory
uninitialised?  I did try changing all the ips.c kmalloc's to kzalloc's,
but that didn't help.  Also that thread ties into pci gart.  The machines
we've been using are liable to getting pci calgary although given my
.config has:
CONFIG_GART_IOMMU=y
CONFIG_CALGARY_IOMMU=y
# CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
and that when booting this mainline I don't see any Calgary related
messages like I get from eg: Ubuntu's 2.6.22-14-server...I'm probably not
actually running the calgary iommu code in these repros.

Anyway, I greatly appreciate your efforts so far in trying to find what
could be wrong here!

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-15 Thread FUJITA Tomonori
On Thu, 14 Feb 2008 17:16:36 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> On Fri 15 Feb at 09:13:16 +0900 [EMAIL PROTECTED] said:
> > 
> > Thanks. So we surely have a bug in the non-breakup part.
> > 
> > I've just found one bug. Can you try this patch against 2.6.24?
> 
> Tested and unfortunately no change.  Behaves same as the breakup-revert patch.

Thanks and sorry about that.

Now I don't have any idea. I split the patch. Can you please apply
them in a step-by-step way and tell me which patch breaks the driver?

There are nine patches against 2.6.24:

http://www.kernel.org/pub/linux/kernel/people/tomo/ips/

The first one is just reverting the data buffer accessors
conversion. It would be nice if we could just revert it but we
can't. These changes are necessary to compile the driver against post
2.6.24.

Really sorry about the troubles,
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-14 Thread Tim Pepper
On Fri 15 Feb at 09:13:16 +0900 [EMAIL PROTECTED] said:
> 
> Thanks. So we surely have a bug in the non-breakup part.
> 
> I've just found one bug. Can you try this patch against 2.6.24?

Tested and unfortunately no change.  Behaves same as the breakup-revert patch.

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-14 Thread FUJITA Tomonori
On Thu, 14 Feb 2008 15:55:49 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> On Thu 14 Feb at 20:48:38 +0900 [EMAIL PROTECTED] said:
> > 
> > I have a slight doubt on the breakup code though I'm not sure you hit
> > the code. Reverting only the breakup part works? The patch is against
> > 2.6.24.
> 
> I've tested this revert you posted.  Essentially the same trace is logged
> (see below).  The machine doesn't die though.  But /proc/scsi/scsi doesn't
> show the hardware and I am getting an additional trace dumped now (see
> futher below).

Thanks. So we surely have a bug in the non-breakup part.

I've just found one bug. Can you try this patch against 2.6.24?

Thanks,

diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 5c5a9b2..d8f5eb5 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -1575,12 +1575,12 @@ ips_make_passthru(ips_ha_t *ha, struct scsi_cmnd *SC, 
ips_scb_t *scb, int intr)
ips_passthru_t *pt;
int length = 0;
int i, ret;
-struct scatterlist *sg = scsi_sglist(SC);
+struct scatterlist *sg;
 
METHOD_TRACE("ips_make_passthru", 1);
 
 scsi_for_each_sg(SC, sg, scsi_sg_count(SC), i)
-length += sg[i].length;
+length += sg->length;
 
if (length < sizeof (ips_passthru_t)) {
/* wrong size */
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ips.c broken since 2.6.23 on x86_64?

2008-02-14 Thread Tim Pepper
On Thu 14 Feb at 20:48:38 +0900 [EMAIL PROTECTED] said:
> 
> I have a slight doubt on the breakup code though I'm not sure you hit
> the code. Reverting only the breakup part works? The patch is against
> 2.6.24.

I've tested this revert you posted.  Essentially the same trace is logged
(see below).  The machine doesn't die though.  But /proc/scsi/scsi doesn't
show the hardware and I am getting an additional trace dumped now (see
futher below).

Feb 14 16:04:13 ipstest kernel: [ 1046.531180] ACPI: PCI Interrupt 
:02:01.0[A] -> GSI 18 (level, low) -> IRQ 18
Feb 14 16:04:13 ipstest kernel: [ 1046.553420] scsi3 : IBM PCI ServeRAID 
7.12.05  Build 761 
Feb 14 16:04:13 ipstest kernel: [ 1046.554103] PGD 7cc1f067 PUD 7c4f6067 PMD 0
Feb 14 16:04:13 ipstest kernel: [ 1046.554348] CPU 0
Feb 14 16:04:13 ipstest kernel: [ 1046.554460] Modules linked in: ips aic94xx
Feb 14 16:04:13 ipstest kernel: [ 1046.554587] Pid: 4476, comm: scsi_scan_3 Not 
tainted 2.6.24-fujita-revert-breakup #1
Feb 14 16:04:13 ipstest kernel: [ 1046.554827] RIP: 0010:[check_addr+16/80]  
[check_addr+16/80] check_addr+0x10/0x50
Feb 14 16:04:13 ipstest kernel: [ 1046.555079] RSP: 0018:81007b0f78f0  
EFLAGS: 00010082
Feb 14 16:04:13 ipstest kernel: [ 1046.555208] RAX:  RBX: 
81007b92a300 RCX: 0024
Feb 14 16:04:13 ipstest kernel: [ 1046.555350] RDX: 7b92a600 RSI: 
807a7340 RDI: 806e7f43
Feb 14 16:04:13 ipstest kernel: [ 1046.555494] RBP: 81007b0f78f0 R08: 
0500 R09: 
Feb 14 16:04:13 ipstest kernel: [ 1046.555632] R10: 81007c463b38 R11: 
0060 R12: 
Feb 14 16:04:13 ipstest kernel: [ 1046.555784] R13: 0001 R14: 
807a7340 R15: 81007c463a80
Feb 14 16:04:13 ipstest kernel: [ 1046.555924] FS:  () 
GS:807d6000() knlGS:
Feb 14 16:04:13 ipstest kernel: [ 1046.556173] CS:  0010 DS: 0018 ES: 0018 CR0: 
8005003b
Feb 14 16:04:13 ipstest kernel: [ 1046.556306] CR2:  CR3: 
7cd02000 CR4: 06e0
Feb 14 16:04:13 ipstest kernel: [ 1046.556443] DR0:  DR1: 
 DR2: 
Feb 14 16:04:13 ipstest kernel: [ 1046.556582] DR3:  DR6: 
0ff0 DR7: 0400
Feb 14 16:04:13 ipstest kernel: [ 1046.556731] Process scsi_scan_3 (pid: 4476, 
threadinfo 81007b0f6000, task 81007b9295c0)
Feb 14 16:04:13 ipstest kernel: [ 1046.556972] Stack:  81007b0f7920 
80213118 81007c463a80 81007b0fcf50
Feb 14 16:04:13 ipstest kernel: [ 1046.557221]   
81007cd14450 81007b0f7930 8047a4c2
Feb 14 16:04:13 ipstest kernel: [ 1046.557463]  81007b0f79b0 
8801c28e 81007c463ae8 00017c463ae8
Feb 14 16:04:13 ipstest kernel: [ 1046.557603] Call Trace:
Feb 14 16:04:13 ipstest kernel: [ 1046.557830]  [nommu_map_sg+104/176] 
nommu_map_sg+0x68/0xb0
Feb 14 16:04:13 ipstest kernel: [ 1046.557949]  [scsi_dma_map+66/96] 
scsi_dma_map+0x42/0x60
Feb 14 16:04:13 ipstest kernel: [ 1046.558083]  [_end+124740294/2127405112] 
:ips:ips_next+0x33e/0xc10
Feb 14 16:04:13 ipstest kernel: [ 1046.558205]  [scsi_done+0/48] 
scsi_done+0x0/0x30
Feb 14 16:04:13 ipstest kernel: [ 1046.558332]  [_end+124752974/2127405112] 
:ips:ips_queue+0x106/0x1f0
Feb 14 16:04:13 ipstest kernel: [ 1046.558459]  [scsi_dispatch_cmd+453/736] 
scsi_dispatch_cmd+0x1c5/0x2e0
Feb 14 16:04:13 ipstest kernel: [ 1046.558589]  [scsi_request_fn+491/976] 
scsi_request_fn+0x1eb/0x3d0
Feb 14 16:04:13 ipstest kernel: [ 1046.558725]  [__generic_unplug_device+37/48] 
__generic_unplug_device+0x25/0x30
Feb 14 16:04:13 ipstest kernel: [ 1046.558851]  [blk_execute_rq_nowait+99/176] 
blk_execute_rq_nowait+0x63/0xb0
Feb 14 16:04:13 ipstest kernel: [ 1046.558982]  [blk_execute_rq+122/224] 
blk_execute_rq+0x7a/0xe0
Feb 14 16:04:13 ipstest kernel: [ 1046.559110]  [scsi_execute+240/288] 
scsi_execute+0xf0/0x120
Feb 14 16:04:13 ipstest kernel: [ 1046.559237]  [scsi_execute_req+134/240] 
scsi_execute_req+0x86/0xf0
Feb 14 16:04:13 ipstest kernel: [ 1046.559366]  
[scsi_probe_and_add_lun+594/3472] scsi_probe_and_add_lun+0x252/0xd90
Feb 14 16:04:13 ipstest kernel: [ 1046.559501]  [sas_expander_match+27/160] 
sas_expander_match+0x1b/0xa0
Feb 14 16:04:13 ipstest kernel: [ 1046.559633]  [get_device+23/32] 
get_device+0x17/0x20
Feb 14 16:04:13 ipstest kernel: [ 1046.559755]  [__scsi_scan_target+220/1680] 
__scsi_scan_target+0xdc/0x690
Feb 14 16:04:13 ipstest kernel: [ 1046.559887]  [enqueue_task_fair+32/64] 
enqueue_task_fair+0x20/0x40
Feb 14 16:04:13 ipstest kernel: [ 1046.560019]  [enqueue_task+19/48] 
enqueue_task+0x13/0x30
Feb 14 16:04:13 ipstest kernel: [ 1046.560141]  [update_curr+101/192] 
update_curr+0x65/0xc0
Feb 14 16:04:13 ipstest kernel: [ 1046.560268]  [scsi_scan_channel+139/160] 
scsi_scan_channel+0x8b/0xa0
Feb 14 16:04:13 ipstest kernel: [ 1046.560397]  
[scsi_scan_host_selected+158/352

Re: ips.c broken since 2.6.23 on x86_64?

2008-02-14 Thread FUJITA Tomonori
On Wed, 13 Feb 2008 13:43:24 -0800
Tim Pepper <[EMAIL PROTECTED]> wrote:

> We recently upgraded a production x86_64 machine with serveraid
> cards to 2.6.24 and noted that /proc/scsi/scsi showed garbage for our
> serveraid service processors.  sg_inq also returned garbage from the
> service processors' sg devices.  After a few iterations I started seeing
> meaninful stuff in the garbage.  Not sure if it was returning live memory
> or just unzero'd.  Either way not good so we went back to a known good,
> older kernel and tried to repro on a similar machine.  We got different,
> but still bad results in terms of pointing at memory badness.
> 
> FWIW, the original machine had the following hardware:
> scsi0 : IBM PCI ServeRAID 7.12.05  Build 761 
> scsi1 : IBM PCI ServeRAID 7.12.05  Build 761 
> and the repro's have been on a machine with just:
> scsi0 : IBM PCI ServeRAID 7.12.05  Build 761 
> 
> On the repro machine I'm getting a hang on ips driver load with the following
> logged:
> 
> Feb 13 13:16:08 ipstest kernel: [  915.236563] scsi3 : IBM PCI ServeRAID 
> 7.12.05  Build 761 
> Feb 13 13:16:08 ipstest kernel: [  915.236839] Unable to handle kernel NULL 
> pointer dereference at  RIP:
> Feb 13 13:16:08 ipstest kernel: [  915.236863]  [check_addr+16/80] 
> check_addr+0x10/0x50
> Feb 13 13:16:08 ipstest kernel: [  915.237209] PGD 79fff067 PUD 7a898067 PMD 0
> Feb 13 13:16:08 ipstest kernel: [  915.237341] Oops:  [1] SMP
> Feb 13 13:16:08 ipstest kernel: [  915.237463] CPU 1
> Feb 13 13:16:08 ipstest kernel: [  915.239436] Modules linked in: ips aic94xx
> Feb 13 13:16:08 ipstest kernel: [  915.239559] Pid: 5213, comm: scsi_scan_3 
> Not tainted 2.6.23-ips_as_module #3
> Feb 13 13:16:08 ipstest kernel: [  915.239692] RIP: 0010:[check_addr+16/80]  
> [check_addr+16/80] check_addr+0x10/0x50
> Feb 13 13:16:08 ipstest kernel: [  915.239932] RSP: 0018:810076d87900  
> EFLAGS: 00010082
> Feb 13 13:16:08 ipstest kernel: [  915.240059] RAX:  RBX: 
> 81007b636300 RCX: 0024
> Feb 13 13:16:08 ipstest kernel: [  915.240196] RDX: 7b636b00 RSI: 
> 8077cde0 RDI: 806c4ed5
> Feb 13 13:16:08 ipstest kernel: [  915.240332] RBP: 810076d87900 R08: 
> 0500 R09: 
> Feb 13 13:16:08 ipstest kernel: [  915.240468] R10: 81007aa33b40 R11: 
> 0060 R12: 
> Feb 13 13:16:08 ipstest kernel: [  915.240605] R13: 0001 R14: 
> 8077cde0 R15: 81007aa33a80
> Feb 13 13:16:08 ipstest kernel: [  915.240741] FS:  () 
> GS:810001039300() knlGS:
> Feb 13 13:16:08 ipstest kernel: [  915.240981] CS:  0010 DS: 0018 ES: 0018 
> CR0: 8005003b
> Feb 13 13:16:08 ipstest kernel: [  915.24] CR2:  CR3: 
> 78a98000 CR4: 06e0
> Feb 13 13:16:08 ipstest kernel: [  915.241248] DR0:  DR1: 
>  DR2: 
> Feb 13 13:16:08 ipstest kernel: [  915.241384] DR3:  DR6: 
> 0ff0 DR7: 0400
> Feb 13 13:16:08 ipstest kernel: [  915.241520] Process scsi_scan_3 (pid: 
> 5213, threadinfo 810076d86000, task 81007be26720)
> Feb 13 13:16:08 ipstest kernel: [  915.241761] Stack:  810076d87930 
> 802125c3 81007aa33a80 81007480cf50
> Feb 13 13:16:08 ipstest kernel: [  915.242006]   
> 81007ba38ca8 810076d87940 8046fb42
> Feb 13 13:16:08 ipstest kernel: [  915.242248]  810076d879c0 
> 8801c2ee 81007aa33af0 00017aa33af0
> Feb 13 13:16:08 ipstest kernel: [  915.242389] Call Trace:
> Feb 13 13:16:08 ipstest kernel: [  915.242606]  [nommu_map_sg+115/144] 
> nommu_map_sg+0x73/0x90
> Feb 13 13:16:08 ipstest kernel: [  915.242736]  [scsi_dma_map+66/96] 
> scsi_dma_map+0x42/0x60
> Feb 13 13:16:08 ipstest kernel: [  915.242867]  [_end+124884230/2127548952] 
> :ips:ips_next+0x33e/0xc00
> Feb 13 13:16:08 ipstest kernel: [  915.242986]  [scsi_done+0/48] 
> scsi_done+0x0/0x30
> Feb 13 13:16:08 ipstest kernel: [  915.243114]  [_end+124896894/2127548952] 
> :ips:ips_queue+0x106/0x1f0
> Feb 13 13:16:08 ipstest kernel: [  915.243240]  [scsi_dispatch_cmd+498/784] 
> scsi_dispatch_cmd+0x1f2/0x310
> Feb 13 13:16:08 ipstest kernel: [  915.243370]  [scsi_request_fn+491/976] 
> scsi_request_fn+0x1eb/0x3d0
> Feb 13 13:16:08 ipstest kernel: [  915.243500]  
> [__generic_unplug_device+37/48] __generic_unplug_device+0x25/0x30
> Feb 13 13:16:08 ipstest kernel: [  915.243630]  
> [blk_execute_rq_nowait+99/176] blk_execute_rq_nowait+0x63/0xb0
> Feb 13 13:16:08 ipstest kernel: [  915.243761]  [blk_execute_rq+122/224] 
> blk_execute_rq+0x7a/0xe0
> Feb 13 13:16:08 ipstest kernel: [  915.243889]  [scsi_execute+240/288] 
> scsi_execute+0xf0/0x120
> Feb 13 13:16:08 ipstest kernel: [  915.244016]  [scsi_execute_req+134/240] 
> scsi_execute_req+0x86/0xf0
> Feb 13 13:16:08 ipstest kernel: [  915.24414

Re: ips.c broken since 2.6.23 on x86_64?

2008-02-13 Thread Tim Pepper
(replaced missing cc's including linux-scsi)

On Wed 13 Feb at 14:39:07 -0800 [EMAIL PROTECTED] said:
> 
> - Is the command path code even reaching the controller's processor inquiry
>   spoofing section?
> 
> if (scb->scsi_cmd->cmnd[0] == INQUIRY) {
> IPS_SCSI_INQ_DATA inquiry;
> 
> memset(&inquiry, 0,
>sizeof (IPS_SCSI_INQ_DATA));
> 
> inquiry.DeviceType =
> IPS_SCSI_INQ_TYPE_PROCESSOR;

I just added printk's in ips_send_cmd()'s INQUIRY case just before
the above condition test and just within the conditional block.
Neither showed on the console.

> The target_id check may be flawed? It is not using the
> scmd/sdev accessors and could be the wrong value as a result. Wild goose
> since arrays are working correctly (?)

In the original case the arrays appeared to be working correctly although
we were worried.  In the repro case we don't actually have any disk
attached...Forgot to mention that in the original email.

> - There appears to be a missing 'break' statement for the START_STOP command
>   immediately preceding the TEST_UNIT_READY/INQUIRY cases. What are the side
>   effects of that? It appears to be innocuous.

No change noted with the missing break added.

> - ips_scmd_buf_write is used for array capacity and other fiddly bits and
>   must be functioning correctly (?), so I can NOT harm it's functionality
>   except to question if the kmap_atomic returned a non-null value, it's
>   return value apparently is not checked. It's failure could speak of
>   problem(s) elsewhere.

I've got a printk there and haven't seen any output from it.  Haven't seen
anything from any of the printk's I've tried actually before:

In the run above to check if ips_send_cmd() hits the condition you were
curious about...I did capture the following from printk's I added in
ips_allocatescbs():

pci_alloc_consistent returns ha->scbs@ 0x18446604437762007040
pci_alloc_consistent returns ips_sg.list @ 0x18446604437762002944
pci_alloc_consistent returns ha->scbs@ 0x18446604437698871296
pci_alloc_consistent returns ips_sg.list @ 0x18446604437756837888

I take that as ips_init_phase2() being called and presumable returning
SUCCESS.

-- 
Tim Pepper  <[EMAIL PROTECTED]>
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html