Re: [bug] ata subsystem related crash with latest -git

2007-10-20 Thread Torsten Kaiser
[Just catching with reading lkml to this post]

On 10/18/07, Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> Theory - ata_sg_is_last() isn't returning true for the last entry. Can
> you double check that it correcly marks the last entry in mv_fill_sg()?
> Alternatively, just try this patch.

I "hate" to point this out, but I already reported that sata_sil24
fails on 1. Sep.:
http://lkml.org/lkml/2007/9/1/95

In the thread "sata_sil24 broken since 2.6.23-rc4-mm1" I spent over a
week to trace this back to ata_sg_is_last (finally found on 7. Oct):
http://lkml.org/lkml/2007/10/7/43

Thanks for finally patching this now...

Torsten
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-20 Thread Torsten Kaiser
[Just catching with reading lkml to this post]

On 10/18/07, Jens Axboe [EMAIL PROTECTED] wrote:

 Theory - ata_sg_is_last() isn't returning true for the last entry. Can
 you double check that it correcly marks the last entry in mv_fill_sg()?
 Alternatively, just try this patch.

I hate to point this out, but I already reported that sata_sil24
fails on 1. Sep.:
http://lkml.org/lkml/2007/9/1/95

In the thread sata_sil24 broken since 2.6.23-rc4-mm1 I spent over a
week to trace this back to ata_sg_is_last (finally found on 7. Oct):
http://lkml.org/lkml/2007/10/7/43

Thanks for finally patching this now...

Torsten
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-19 Thread FUJITA Tomonori
On Thu, 18 Oct 2007 19:14:29 +0200
Jens Axboe <[EMAIL PROTECTED]> wrote:

> On Thu, Oct 18 2007, Arjan van de Ven wrote:
> > On Thu, 18 Oct 2007 09:55:04 -0700 (PDT)
> > Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > 
> > > 
> > > 
> > > On Thu, 18 Oct 2007, Jens Axboe wrote:
> > > > -   unsigned long addr = page_to_phys(s->page) +
> > > > s->offset; 
> > > > +   unsigned long addr = page_to_phys(sg_page(s)) +
> > > > s->offset; 
> > > 
> > > Umm. May I suggest (I haven't read the whole thread yet, maybe
> > > somebody else already did) that
> > > 
> > >   static inline unsigned long sg_phys(struct scatterlist *sg)
> > >   {
> > >   return  page_to_phys(sg_page(sg)) + sg->offset;
> > >   }
> > > 
> > > would be a good thing to have?
> > > 
> > > Very few drivers should care so much about the *page* itself (or the 
> > > offset). That's something that the generic allocation code etc cares 
> > > about, but the driver is almost bound to care mostly about the actual
> > > DMA address
> > 
> >  but will that work for systems with IOMMU ? or is it fundamentally
> > the wrong interface
> 
> They use foo_to_bus() on the address. sg_phys() should of course only be
> used where the user previously did page_to_phys() on the sg page.

I can take care of IOMMU stuff when I'll send IOMMU merging fix
patchset:

http://marc.info/?l=linux-scsi=119079718126157=2
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-19 Thread FUJITA Tomonori
On Thu, 18 Oct 2007 19:14:29 +0200
Jens Axboe [EMAIL PROTECTED] wrote:

 On Thu, Oct 18 2007, Arjan van de Ven wrote:
  On Thu, 18 Oct 2007 09:55:04 -0700 (PDT)
  Linus Torvalds [EMAIL PROTECTED] wrote:
  
   
   
   On Thu, 18 Oct 2007, Jens Axboe wrote:
-   unsigned long addr = page_to_phys(s-page) +
s-offset; 
+   unsigned long addr = page_to_phys(sg_page(s)) +
s-offset; 
   
   Umm. May I suggest (I haven't read the whole thread yet, maybe
   somebody else already did) that
   
 static inline unsigned long sg_phys(struct scatterlist *sg)
 {
 return  page_to_phys(sg_page(sg)) + sg-offset;
 }
   
   would be a good thing to have?
   
   Very few drivers should care so much about the *page* itself (or the 
   offset). That's something that the generic allocation code etc cares 
   about, but the driver is almost bound to care mostly about the actual
   DMA address
  
   but will that work for systems with IOMMU ? or is it fundamentally
  the wrong interface
 
 They use foo_to_bus() on the address. sg_phys() should of course only be
 used where the user previously did page_to_phys() on the sg page.

I can take care of IOMMU stuff when I'll send IOMMU merging fix
patchset:

http://marc.info/?l=linux-scsim=119079718126157w=2
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Linus Torvalds wrote:
Very few drivers should care so much about the *page* itself (or the 
offset). That's something that the generic allocation code etc cares 
about, but the driver is almost bound to care mostly about the actual DMA 
address, so adding that helper function that abstracts the sg access would 
be helpful in hiding some of the cruft?



FWIW libata cares about both.  When doing DMA, it cares about the DMA 
address.  When doing PIO, it cares about the actual page, because it 
does a kmap before sending the data through a 16-bit/32-bit data FIFO.


Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Arjan van de Ven
On Thu, 18 Oct 2007 09:55:04 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> 
> 
> On Thu, 18 Oct 2007, Jens Axboe wrote:
> > -   unsigned long addr = page_to_phys(s->page) +
> > s->offset; 
> > +   unsigned long addr = page_to_phys(sg_page(s)) +
> > s->offset; 
> 
> Umm. May I suggest (I haven't read the whole thread yet, maybe
> somebody else already did) that
> 
>   static inline unsigned long sg_phys(struct scatterlist *sg)
>   {
>   return  page_to_phys(sg_page(sg)) + sg->offset;
>   }
> 
> would be a good thing to have?
> 
> Very few drivers should care so much about the *page* itself (or the 
> offset). That's something that the generic allocation code etc cares 
> about, but the driver is almost bound to care mostly about the actual
> DMA address

 but will that work for systems with IOMMU ? or is it fundamentally
the wrong interface
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Arjan van de Ven wrote:
> On Thu, 18 Oct 2007 09:55:04 -0700 (PDT)
> Linus Torvalds <[EMAIL PROTECTED]> wrote:
> 
> > 
> > 
> > On Thu, 18 Oct 2007, Jens Axboe wrote:
> > > - unsigned long addr = page_to_phys(s->page) +
> > > s->offset; 
> > > + unsigned long addr = page_to_phys(sg_page(s)) +
> > > s->offset; 
> > 
> > Umm. May I suggest (I haven't read the whole thread yet, maybe
> > somebody else already did) that
> > 
> > static inline unsigned long sg_phys(struct scatterlist *sg)
> > {
> > return  page_to_phys(sg_page(sg)) + sg->offset;
> > }
> > 
> > would be a good thing to have?
> > 
> > Very few drivers should care so much about the *page* itself (or the 
> > offset). That's something that the generic allocation code etc cares 
> > about, but the driver is almost bound to care mostly about the actual
> > DMA address
> 
>  but will that work for systems with IOMMU ? or is it fundamentally
> the wrong interface

They use foo_to_bus() on the address. sg_phys() should of course only be
used where the user previously did page_to_phys() on the sg page.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jens Axboe wrote:
> On Thu, Oct 18 2007, Linus Torvalds wrote:
> > 
> > 
> > On Thu, 18 Oct 2007, Jens Axboe wrote:
> > > - unsigned long addr = page_to_phys(s->page) + s->offset; 
> > > + unsigned long addr = page_to_phys(sg_page(s)) + s->offset; 
> > 
> > Umm. May I suggest (I haven't read the whole thread yet, maybe somebody 
> > else already did) that
> > 
> > static inline unsigned long sg_phys(struct scatterlist *sg)
> > {
> > return  page_to_phys(sg_page(sg)) + sg->offset;
> > }
> > 
> > would be a good thing to have?
> 
> Sure thing, it's used quite a lot.

Actually, only 11 occurrences in the patch. But still a nice little
cleanup.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Linus Torvalds wrote:
> 
> 
> On Thu, 18 Oct 2007, Jens Axboe wrote:
> > -   unsigned long addr = page_to_phys(s->page) + s->offset; 
> > +   unsigned long addr = page_to_phys(sg_page(s)) + s->offset; 
> 
> Umm. May I suggest (I haven't read the whole thread yet, maybe somebody 
> else already did) that
> 
>   static inline unsigned long sg_phys(struct scatterlist *sg)
>   {
>   return  page_to_phys(sg_page(sg)) + sg->offset;
>   }
> 
> would be a good thing to have?

Sure thing, it's used quite a lot.

> Very few drivers should care so much about the *page* itself (or the 
> offset). That's something that the generic allocation code etc cares 
> about, but the driver is almost bound to care mostly about the actual DMA 
> address, so adding that helper function that abstracts the sg access would 
> be helpful in hiding some of the cruft?

I'll add it to the mix.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Linus Torvalds


On Thu, 18 Oct 2007, Jens Axboe wrote:
> - unsigned long addr = page_to_phys(s->page) + s->offset; 
> + unsigned long addr = page_to_phys(sg_page(s)) + s->offset; 

Umm. May I suggest (I haven't read the whole thread yet, maybe somebody 
else already did) that

static inline unsigned long sg_phys(struct scatterlist *sg)
{
return  page_to_phys(sg_page(sg)) + sg->offset;
}

would be a good thing to have?

Very few drivers should care so much about the *page* itself (or the 
offset). That's something that the generic allocation code etc cares 
about, but the driver is almost bound to care mostly about the actual DMA 
address, so adding that helper function that abstracts the sg access would 
be helpful in hiding some of the cruft?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Olof Johansson wrote:
> On Thu, Oct 18, 2007 at 04:38:36PM +0200, Jens Axboe wrote:
> > On Thu, Oct 18 2007, Benny Halevy wrote:
> > > On Oct. 18, 2007, 16:05 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > > On Thu, Oct 18 2007, Jens Axboe wrote:
> > > >> On Thu, Oct 18 2007, Benny Halevy wrote:
> > >   return sg;
> > >   }
> > >  @@ -83,6 +96,9 @@ static inline struct scatterlist *sg_last(struct 
> > >  scatterlist *sgl,
> > >   ret = sg;
> > >   
> > >   #endif
> > >  +#ifdef CONFIG_DEBUG_SG
> > >  +BUG_ON(sgl[0].sg_magic != SG_MAGIC);
> > > >>> can it also do BUG_ON(!sg_is_last(sg))?
> > > >> That would make sense, definitely. I'll add that.
> > > > 
> > > > BUG_ON(!sg_is_last(ret));
> > > > 
> > > > it should be, not sg. That's what I merged.
> > > > 
> > > right. of course.
> > 
> > OK, that found something interesting - mapping a request may shrink it,
> > so we need to update the end marker to move it earlier in the list.
> > Basically just a
> > 
> > if (sg)
> > __sg_mark_end(sg);
> > 
> > at the bottom of blk_rq_map_sg(). Updated patch below, booted on other
> > machines now as well without incident.
> 
> PPC needs this. I'm guessing other arches might too. Otherwise seems to boot
> and work well (with Jeff's sata patch).

Oh duh, indeed I didn't add that to the archs when converting. Thanks
for the patch, I'll update the rest as well.

And thanks a lot for boot testing it!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Olof Johansson
On Thu, Oct 18, 2007 at 04:38:36PM +0200, Jens Axboe wrote:
> On Thu, Oct 18 2007, Benny Halevy wrote:
> > On Oct. 18, 2007, 16:05 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > On Thu, Oct 18 2007, Jens Axboe wrote:
> > >> On Thu, Oct 18 2007, Benny Halevy wrote:
> > return sg;
> >   }
> >  @@ -83,6 +96,9 @@ static inline struct scatterlist *sg_last(struct 
> >  scatterlist *sgl,
> > ret = sg;
> >   
> >   #endif
> >  +#ifdef CONFIG_DEBUG_SG
> >  +  BUG_ON(sgl[0].sg_magic != SG_MAGIC);
> > >>> can it also do BUG_ON(!sg_is_last(sg))?
> > >> That would make sense, definitely. I'll add that.
> > > 
> > > BUG_ON(!sg_is_last(ret));
> > > 
> > > it should be, not sg. That's what I merged.
> > > 
> > right. of course.
> 
> OK, that found something interesting - mapping a request may shrink it,
> so we need to update the end marker to move it earlier in the list.
> Basically just a
> 
> if (sg)
> __sg_mark_end(sg);
> 
> at the bottom of blk_rq_map_sg(). Updated patch below, booted on other
> machines now as well without incident.

PPC needs this. I'm guessing other arches might too. Otherwise seems to boot
and work well (with Jeff's sata patch).


Thanks,

-Olof

diff --git a/include/asm-powerpc/scatterlist.h 
b/include/asm-powerpc/scatterlist.h
index b9f1dbc..fcf7d55 100644
--- a/include/asm-powerpc/scatterlist.h
+++ b/include/asm-powerpc/scatterlist.h
@@ -14,6 +14,9 @@
 #include 
 
 struct scatterlist {
+#ifdef CONFIG_DEBUG_SG
+   unsigned long sg_magic;
+#endif
unsigned long page_link;
unsigned int offset;
unsigned int length;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Olof Johansson
On Thu, Oct 18, 2007 at 06:42:46AM -0400, Jeff Garzik wrote:
> Jens Axboe wrote:
>> On Thu, Oct 18 2007, Jeff Garzik wrote:
>>> Jens Axboe wrote:
 That should work as well. WRT ata_sg_is_last(), if we go ahead with my
 recent sg chaining updates, we can keep the test as it would be a single
 conditional and not require any looping.
 Let me know when you have tested this!
>>> The patch I attached to the last email got both sata_mv test boxes 
>>> working reliably (so far).
>>>
>>> I worked up a patch that kills ata_sg_is_last() (plus the 
>>> max_phys_segments sata_mv fix), see attached.  I'm thinking this is what 
>>> I like to see in upstream.
>> Great!
>>> Of course, this doesn't explain why ata_sg_is_last() was broken, but 
>>> since it's working _and_ slightly more efficient, I don't really care :)
>> Tomo and I agreed to kill sg_last() a few days ago anyways, so this is
>> perfectly fine with me.
>
> Yep, the [attached] patch that kills ata_sg_is_last() is working here on 
> both machines that were previously croaking.
>
> It would be nice to get pdc_adma, sata_sil24 and ipr it-works test done, 
> but IMO the patch is pretty straightforward and should be OK.

Tested-by: Olof Johansson <[EMAIL PROTECTED]>

Looks ok on SATA_SIL24 and SATA_MV here on PPC (together with Jens' latest
patch). Both barfed before.


Thanks!

-Olof
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Benny Halevy
On Oct. 18, 2007, 16:05 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote:
> On Thu, Oct 18 2007, Jens Axboe wrote:
>> On Thu, Oct 18 2007, Benny Halevy wrote:
return sg;
  }
 @@ -83,6 +96,9 @@ static inline struct scatterlist *sg_last(struct 
 scatterlist *sgl,
ret = sg;
  
  #endif
 +#ifdef CONFIG_DEBUG_SG
 +  BUG_ON(sgl[0].sg_magic != SG_MAGIC);
>>> can it also do BUG_ON(!sg_is_last(sg))?
>> That would make sense, definitely. I'll add that.
> 
> BUG_ON(!sg_is_last(ret));
> 
> it should be, not sg. That's what I merged.
> 
right. of course.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Mark Lord

Mark Lord wrote:

On Thu, Oct 18 2007, Mark Lord wrote:

> Jens wrote:

>> OK, I think that covers every arch out there. I haven't been able to
>> compile any of them, but it's mostly search'n replace operations. 
I hope

>> nothing is missing linux/scatterlist.h includes...

>
> Patch fails on drivers/scsi/scsi_lib.c.
>
> I replaced that part of the patch with this updated portion instead:


Hmm, what are you applying against? Must be a clean tree, throw away any
patches that you already applied in this thread.

Updated below.



I'll re-pull everything fresh again from kernel.org and rebuild
with the "updated below" patch you posted.  Thanks.


Okay, fresh pull of everything from kernel.org,
and now your latest patch does apply cleanly to -git13.

Something weird (at this end).

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Mark Lord wrote:
> Mark Lord wrote:
>>> On Thu, Oct 18 2007, Mark Lord wrote:
 > Jens wrote:
> >> OK, I think that covers every arch out there. I haven't been able to
> >> compile any of them, but it's mostly search'n replace operations. I 
> hope
> >> nothing is missing linux/scatterlist.h includes...
 >
 > Patch fails on drivers/scsi/scsi_lib.c.
 >
 > I replaced that part of the patch with this updated portion instead:
>>>
>>> Hmm, what are you applying against? Must be a clean tree, throw away any
>>> patches that you already applied in this thread.
>>>
>>> Updated below.
>> I'll re-pull everything fresh again from kernel.org and rebuild
>> with the "updated below" patch you posted.  Thanks.
>
> Okay, fresh pull of everything from kernel.org,
> and now your latest patch does apply cleanly to -git13.
>
> Something weird (at this end).

Thanks for confirming, I did double check that my HEAD was uptodate -
and it is.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Mark Lord

On Thu, Oct 18 2007, Mark Lord wrote:

> Jens wrote:

>> OK, I think that covers every arch out there. I haven't been able to
>> compile any of them, but it's mostly search'n replace operations. I hope
>> nothing is missing linux/scatterlist.h includes...

>
> Patch fails on drivers/scsi/scsi_lib.c.
>
> I replaced that part of the patch with this updated portion instead:


Hmm, what are you applying against? Must be a clean tree, throw away any
patches that you already applied in this thread.

Updated below.



I'll re-pull everything fresh again from kernel.org and rebuild
with the "updated below" patch you posted.  Thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jens Axboe wrote:
> On Thu, Oct 18 2007, Benny Halevy wrote:
> > >   return sg;
> > >  }
> > > @@ -83,6 +96,9 @@ static inline struct scatterlist *sg_last(struct 
> > > scatterlist *sgl,
> > >   ret = sg;
> > >  
> > >  #endif
> > > +#ifdef CONFIG_DEBUG_SG
> > > + BUG_ON(sgl[0].sg_magic != SG_MAGIC);
> > 
> > can it also do BUG_ON(!sg_is_last(sg))?
> 
> That would make sense, definitely. I'll add that.

BUG_ON(!sg_is_last(ret));

it should be, not sg. That's what I merged.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Mark Lord wrote:
> Jens Axboe wrote:
>> On Thu, Oct 18 2007, Mark Lord wrote:
>>> Jens Axboe wrote:
 On Thu, Oct 18 2007, Mark Lord wrote:
> Jens Axboe wrote:
>> On Thu, Oct 18 2007, Jeff Garzik wrote:
>>> Mark Lord wrote:
 Okay, mine is dying with EIP at blk_rq_map_sg+0xcb/0x160.
 Screen photo is at http://rtr.ca/recent/2.6.23-git12-crash.jpg,
 but the top was cut off (isn't there a new config option or patch
 to do double-columns or scrollback or something ???.
>>> Is this a sata_mv box?  If so, could you try this patch?
>> If anything, that shrinks the size of the resulting request. Did this
>> patch make any difference to you?
> Not a sata_mv box, so no point here.
 Can you try the big patch I just posted?
>>> I'll hunt for it and try it, but your earlier little patch already works 
>>> fine.
>
> I found the latest rev, and it failed to apply cleanly on -git12 or -git13
> due to scsi_lib.c. After fixing that portion (replacement chunk below),
> I'm now running with -git12, with the sg list debug option enabled (no 
> messages).
>
> Looks okay so far

OK, thanks a lot for testing!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Mark Lord

Jens Axboe wrote:

On Thu, Oct 18 2007, Mark Lord wrote:

Jens wrote:

OK, I think that covers every arch out there. I haven't been able to
compile any of them, but it's mostly search'n replace operations. I hope
nothing is missing linux/scatterlist.h includes...

Patch fails on drivers/scsi/scsi_lib.c.

I replaced that part of the patch with this updated portion instead:


Hmm, what are you applying against? Must be a clean tree, throw away any
patches that you already applied in this thread.



Squeaky-clean linux-2.6.23 + patch-2.6.23-git13 + your patch.
Fails on scsi_lib.c.

-ml
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Mark Lord

Jens Axboe wrote:

On Thu, Oct 18 2007, Mark Lord wrote:

Jens Axboe wrote:

On Thu, Oct 18 2007, Mark Lord wrote:

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Mark Lord wrote:

Okay, mine is dying with EIP at blk_rq_map_sg+0xcb/0x160.
Screen photo is at http://rtr.ca/recent/2.6.23-git12-crash.jpg,
but the top was cut off (isn't there a new config option or patch
to do double-columns or scrollback or something ???.

Is this a sata_mv box?  If so, could you try this patch?

If anything, that shrinks the size of the resulting request. Did this
patch make any difference to you?

Not a sata_mv box, so no point here.

Can you try the big patch I just posted?
I'll hunt for it and try it, but your earlier little patch already works 
fine.


I found the latest rev, and it failed to apply cleanly on -git12 or -git13
due to scsi_lib.c. After fixing that portion (replacement chunk below),
I'm now running with -git12, with the sg list debug option enabled (no 
messages).

Looks okay so far


--- a/drivers/scsi/scsi_lib.c   2007-10-18 09:35:28.0 -0400
+++ b/drivers/scsi/scsi_lib.c   2007-10-18 09:46:47.0 -0400
@@ -295,7 +295,7 @@
int i, err, nr_vecs = 0;

for_each_sg(sgl, sg, nsegs, i) {
-   page = sg->page;
+   page = sg_page(sg);
off = sg->offset;
len = sg->length;
data_len += len;
@@ -764,6 +764,8 @@
if (unlikely(!sgl))
goto enomem;

+   sg_init_table(sgl, sgp->size);
+
/*
 * first loop through, set initial index and return value
 */
@@ -779,6 +781,13 @@
sg_chain(prev, SCSI_MAX_SG_SEGMENTS, sgl);

/*
+* if we have nothing left, mark the last segment as
+* end-of-list
+*/
+   if (!left)
+   sg_mark_end(sgl, this);
+
+   /*
 * don't allow subsequent mempool allocs to sleep, it would
 * violate the mempool principle.
 */
@@ -2351,7 +2360,7 @@
*offset = *offset - len_complete + sg->offset;

/* Assumption: contiguous pages can be accessed as "page + i" */
-   page = nth_page(sg->page, (*offset >> PAGE_SHIFT));
+   page = nth_page(sg_page(sg), (*offset >> PAGE_SHIFT));
*offset &= ~PAGE_MASK;

/* Bytes in this sg-entry from *offset to the end of the page */
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Benny Halevy wrote:
> > return sg;
> >  }
> > @@ -83,6 +96,9 @@ static inline struct scatterlist *sg_last(struct 
> > scatterlist *sgl,
> > ret = sg;
> >  
> >  #endif
> > +#ifdef CONFIG_DEBUG_SG
> > +   BUG_ON(sgl[0].sg_magic != SG_MAGIC);
> 
> can it also do BUG_ON(!sg_is_last(sg))?

That would make sense, definitely. I'll add that.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Benny Halevy wrote:
> On Oct. 18, 2007, 15:32 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote:
> >  static inline struct scatterlist *sg_next(struct scatterlist *sg)
> >  {
> > -   sg++;
> > -
> > -   if (unlikely(sg_is_chain(sg)))
> > +#ifdef CONFIG_DEBUG_SG
> > +   BUG_ON(sg->sg_magic != SG_MAGIC);
> > +#endif
> > +   if (sg_is_last(sg))
> > +   sg = NULL;
> > +   else if (sg_is_chain(sg))
> > sg = sg_chain_ptr(sg);
> > +   else
> > +   sg++;
> >  
> 
> Jens, again, please correct me if I'm wrong, but when sg points at the
> entry right before a chain entry this implementation of sg_next will
> return a pointer to the chain entry here, which I believe it must not.
> 
> > return sg;
> >  }
> > 
> 
> here's how I think sg_next should be implemented:
> 
>   */
>  static inline struct scatterlist *sg_next(struct scatterlist *sg)
>  {
> +#ifdef CONFIG_DEBUG_SG
> + BUG_ON(sg->sg_magic != SG_MAGIC);
> +#endif
> + if (sg_is_last(sg))
> + return NULL;
> +
>   sg++;
>  
>   if (unlikely(sg_is_chain(sg)))
>   sg = sg_chain_ptr(sg);
>  
>   return sg;
>  }

Yep, thanks for catching that!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Benny Halevy
On Oct. 18, 2007, 15:32 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote:
>  static inline struct scatterlist *sg_next(struct scatterlist *sg)
>  {
> - sg++;
> -
> - if (unlikely(sg_is_chain(sg)))
> +#ifdef CONFIG_DEBUG_SG
> + BUG_ON(sg->sg_magic != SG_MAGIC);
> +#endif
> + if (sg_is_last(sg))
> + sg = NULL;
> + else if (sg_is_chain(sg))
>   sg = sg_chain_ptr(sg);
> + else
> + sg++;
>  

Jens, again, please correct me if I'm wrong, but when sg points at the
entry right before a chain entry this implementation of sg_next will
return a pointer to the chain entry here, which I believe it must not.

>   return sg;
>  }
> 

here's how I think sg_next should be implemented:

  */
 static inline struct scatterlist *sg_next(struct scatterlist *sg)
 {
+#ifdef CONFIG_DEBUG_SG
+   BUG_ON(sg->sg_magic != SG_MAGIC);
+#endif
+   if (sg_is_last(sg))
+   return NULL;
+
sg++;
 
if (unlikely(sg_is_chain(sg)))
sg = sg_chain_ptr(sg);
 
return sg;
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Mark Lord

Jens wrote:

OK, I think that covers every arch out there. I haven't been able to
compile any of them, but it's mostly search'n replace operations. I hope
nothing is missing linux/scatterlist.h includes...


Patch fails on drivers/scsi/scsi_lib.c.

I replaced that part of the patch with this updated portion instead:



--- a/drivers/scsi/scsi_lib.c   2007-10-18 09:35:28.0 -0400
+++ b/drivers/scsi/scsi_lib.c   2007-10-18 09:46:47.0 -0400
@@ -295,7 +295,7 @@
int i, err, nr_vecs = 0;

for_each_sg(sgl, sg, nsegs, i) {
-   page = sg->page;
+   page = sg_page(sg);
off = sg->offset;
len = sg->length;
data_len += len;
@@ -764,6 +764,8 @@
if (unlikely(!sgl))
goto enomem;

+   sg_init_table(sgl, sgp->size);
+
/*
 * first loop through, set initial index and return value
 */
@@ -779,6 +781,13 @@
sg_chain(prev, SCSI_MAX_SG_SEGMENTS, sgl);

/*
+* if we have nothing left, mark the last segment as
+* end-of-list
+*/
+   if (!left)
+   sg_mark_end(sgl, this);
+
+   /*
 * don't allow subsequent mempool allocs to sleep, it would
 * violate the mempool principle.
 */
@@ -2351,7 +2360,7 @@
*offset = *offset - len_complete + sg->offset;

/* Assumption: contiguous pages can be accessed as "page + i" */
-   page = nth_page(sg->page, (*offset >> PAGE_SHIFT));
+   page = nth_page(sg_page(sg), (*offset >> PAGE_SHIFT));
*offset &= ~PAGE_MASK;

/* Bytes in this sg-entry from *offset to the end of the page */
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Mark Lord wrote:
> Jens Axboe wrote:
>> On Thu, Oct 18 2007, Mark Lord wrote:
>>> Jens Axboe wrote:
 On Thu, Oct 18 2007, Jeff Garzik wrote:
> Mark Lord wrote:
>> Okay, mine is dying with EIP at blk_rq_map_sg+0xcb/0x160.
>> Screen photo is at http://rtr.ca/recent/2.6.23-git12-crash.jpg,
>> but the top was cut off (isn't there a new config option or patch
>> to do double-columns or scrollback or something ???.
> Is this a sata_mv box?  If so, could you try this patch?
 If anything, that shrinks the size of the resulting request. Did this
 patch make any difference to you?
>>> Not a sata_mv box, so no point here.
>> Can you try the big patch I just posted?
>
> I'll hunt for it and try it, but your earlier little patch already works 
> fine.

I'll send it privately.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Mark Lord

Jens Axboe wrote:

On Thu, Oct 18 2007, Mark Lord wrote:

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Mark Lord wrote:

Okay, mine is dying with EIP at blk_rq_map_sg+0xcb/0x160.
Screen photo is at http://rtr.ca/recent/2.6.23-git12-crash.jpg,
but the top was cut off (isn't there a new config option or patch
to do double-columns or scrollback or something ???.

Is this a sata_mv box?  If so, could you try this patch?

If anything, that shrinks the size of the resulting request. Did this
patch make any difference to you?

Not a sata_mv box, so no point here.


Can you try the big patch I just posted?


I'll hunt for it and try it, but your earlier little patch already works fine.

Cheers

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Mark Lord wrote:
> Jens Axboe wrote:
>> On Thu, Oct 18 2007, Jeff Garzik wrote:
>>> Mark Lord wrote:
 Okay, mine is dying with EIP at blk_rq_map_sg+0xcb/0x160.
 Screen photo is at http://rtr.ca/recent/2.6.23-git12-crash.jpg,
 but the top was cut off (isn't there a new config option or patch
 to do double-columns or scrollback or something ???.
>>> Is this a sata_mv box?  If so, could you try this patch?
>> If anything, that shrinks the size of the resulting request. Did this
>> patch make any difference to you?
>
> Not a sata_mv box, so no point here.

Can you try the big patch I just posted?

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Mark Lord

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Mark Lord wrote:

Okay, mine is dying with EIP at blk_rq_map_sg+0xcb/0x160.
Screen photo is at http://rtr.ca/recent/2.6.23-git12-crash.jpg,
but the top was cut off (isn't there a new config option or patch
to do double-columns or scrollback or something ???.

Is this a sata_mv box?  If so, could you try this patch?


If anything, that shrinks the size of the resulting request. Did this
patch make any difference to you?


Not a sata_mv box, so no point here.

ata_piix.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Benny Halevy
On Oct. 18, 2007, 14:15 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote:

>  /**
>   * sg_next - return the next scatterlist entry in a list
> @@ -43,10 +51,15 @@ static inline void sg_init_one(struct scatterlist *sg, 
> const void *buf,
>   */
>  static inline struct scatterlist *sg_next(struct scatterlist *sg)
>  {
> - sg++;
> -
> - if (unlikely(sg_is_chain(sg)))
> +#ifdef CONFIG_DEBUG_SG
> + BUG_ON(sg->sg_magic != SG_MAGIC);
> +#endif
> + if (sg_is_last(sg))
> + sg = NULL;
> + else if (sg_is_chain(sg))
>   sg = sg_chain_ptr(sg);
> + else
> + sg++;

Hmm, sg_next is not supposed to return a pointer to the chain entry
itself, but rather skip it.  I think that the fix needs only
check the "last" flag before incrementing sg.

+#ifdef CONFIG_DEBUG_SG
+   BUG_ON(sg->sg_magic != SG_MAGIC);
+#endif
+   if (sg_is_last(sg))
+   return NULL;
+
sg++
 
if (unlikely(sg_is_chain(sg)))
sg = sg_chain_ptr(sg);

>  
>   return sg;
>  }
> @@ -83,6 +96,9 @@ static inline struct scatterlist *sg_last(struct 
> scatterlist *sgl,
>   ret = sg;
>  
>  #endif
> +#ifdef CONFIG_DEBUG_SG
> + BUG_ON(sgl[0].sg_magic != SG_MAGIC);

can it also do BUG_ON(!sg_is_last(sg))?

> +#endif
>   return ret;
>  }
>  

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, David Miller wrote:
> From: Jens Axboe <[EMAIL PROTECTED]>
> Date: Thu, 18 Oct 2007 14:15:47 +0200
> 
> > On Thu, Oct 18 2007, Jens Axboe wrote:
> > > On Thu, Oct 18 2007, David Miller wrote:
> > > > From: Jens Axboe <[EMAIL PROTECTED]>
> > > > Date: Thu, 18 Oct 2007 13:57:02 +0200
> > > > 
> > > > > Thanks a lot, Dave! The patch is a monster right now, I'll work on
> > > > > splitting it into a 3-step process. Any arch help is greatly
> > > > > appreciated.
> > > > 
> > > > I have some other bits that my compile hit, such as some things in the
> > > > crypto layer.
> > > 
> > > Yeah, I have tons of that so far. I hope to have an allyesconfig
> > > compiling pretty soonish, will send that out then.
> > 
> > OK here goes, this compiles with allyesconfig on x86-64. Not too bad,
> > the scsi/ drivers were by far the worst.
> 
> It build cleanly here on sparc64 too.

Super

> It's late and I'm about to hit bed so I'm too chicken to do a test
> boot it right now :-)

Don't boot it Dave, odds are that something will break and you'll then
be stuck debugging that since you can't relax and sleep until it's
working :-)

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread David Miller
From: Jens Axboe <[EMAIL PROTECTED]>
Date: Thu, 18 Oct 2007 14:15:47 +0200

> On Thu, Oct 18 2007, Jens Axboe wrote:
> > On Thu, Oct 18 2007, David Miller wrote:
> > > From: Jens Axboe <[EMAIL PROTECTED]>
> > > Date: Thu, 18 Oct 2007 13:57:02 +0200
> > > 
> > > > Thanks a lot, Dave! The patch is a monster right now, I'll work on
> > > > splitting it into a 3-step process. Any arch help is greatly
> > > > appreciated.
> > > 
> > > I have some other bits that my compile hit, such as some things in the
> > > crypto layer.
> > 
> > Yeah, I have tons of that so far. I hope to have an allyesconfig
> > compiling pretty soonish, will send that out then.
> 
> OK here goes, this compiles with allyesconfig on x86-64. Not too bad,
> the scsi/ drivers were by far the worst.

It build cleanly here on sparc64 too.

It's late and I'm about to hit bed so I'm too chicken to do a test
boot it right now :-)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, David Miller wrote:
> From: Jens Axboe <[EMAIL PROTECTED]>
> Date: Thu, 18 Oct 2007 13:57:02 +0200
> 
> > Thanks a lot, Dave! The patch is a monster right now, I'll work on
> > splitting it into a 3-step process. Any arch help is greatly
> > appreciated.
> 
> I have some other bits that my compile hit, such as some things in the
> crypto layer.

Yeah, I have tons of that so far. I hope to have an allyesconfig
compiling pretty soonish, will send that out then.

> But I hesitate to send them to you because I think the on-stack cases
> need some helpers such that DEBUG_SG works for them.

Indeed. I convert where appropriate, but I'm not anal about it. If they
don't use sg_next() and/or for_each_sg() on their list, it should be ok.
Don't want to make the changes more than necessary right now.

> BTW, you missed a case in drivers/usb/core/message.c because of
> the config used in your build.  This thing below is a good
> argument for trying to avoid HIGHMEM et al. ifdefs in drivers :-)

Thanks :-)

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread David Miller
From: Jens Axboe <[EMAIL PROTECTED]>
Date: Thu, 18 Oct 2007 13:57:02 +0200

> Thanks a lot, Dave! The patch is a monster right now, I'll work on
> splitting it into a 3-step process. Any arch help is greatly
> appreciated.

I have some other bits that my compile hit, such as some things in the
crypto layer.

But I hesitate to send them to you because I think the on-stack cases
need some helpers such that DEBUG_SG works for them.

BTW, you missed a case in drivers/usb/core/message.c because of
the config used in your build.  This thing below is a good
argument for trying to avoid HIGHMEM et al. ifdefs in drivers :-)

--- drivers/usb/core/message.c~ 2007-10-18 01:46:44.0 -0700
+++ drivers/usb/core/message.c  2007-10-18 03:15:20.0 -0700
@@ -438,7 +438,7 @@
io->urbs[i]->transfer_buffer = NULL;
 #else
io->urbs[i]->transfer_buffer =
-   page_address(sg[i].page) + sg[i].offset;
+   page_address(sg_page([i])) + sg[i].offset;
 #endif
} else {
/* hc may use _only_ transfer_buffer */
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, David Miller wrote:
> From: Jens Axboe <[EMAIL PROTECTED]>
> Date: Thu, 18 Oct 2007 10:21:45 +0200
> 
> > I like it. Basically the only real change is using bit 2 as a
> > termination point, so we avoid going beyond the end of the sgtable.
> > Here's a starting point, it actually booted for me in the first go
> > (boggle). Only x86 so far, archs will need to be converted. And lots
> > more drivers I'm sure, I only fixed up the ones that botched my compile.
> > 
> > So just consider this a directional patch.
> 
> Here are some sparc64 bits:

Thanks a lot, Dave! The patch is a monster right now, I'll work on
splitting it into a 3-step process. Any arch help is greatly
appreciated.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread David Miller
From: Jens Axboe <[EMAIL PROTECTED]>
Date: Thu, 18 Oct 2007 10:21:45 +0200

> I like it. Basically the only real change is using bit 2 as a
> termination point, so we avoid going beyond the end of the sgtable.
> Here's a starting point, it actually booted for me in the first go
> (boggle). Only x86 so far, archs will need to be converted. And lots
> more drivers I'm sure, I only fixed up the ones that botched my compile.
> 
> So just consider this a directional patch.

Here are some sparc64 bits:

diff --git a/arch/sparc64/kernel/iommu.c b/arch/sparc64/kernel/iommu.c
index 29af777..73852a2 100644
--- a/arch/sparc64/kernel/iommu.c
+++ b/arch/sparc64/kernel/iommu.c
@@ -473,7 +473,7 @@ static void dma_4u_unmap_single(struct device *dev, 
dma_addr_t bus_addr,
 }
 
 #define SG_ENT_PHYS_ADDRESS(SG)\
-   (__pa(page_address((SG)->page)) + (SG)->offset)
+   (__pa(page_address(sg_page(SG))) + (SG)->offset)
 
 static void fill_sg(iopte_t *iopte, struct scatterlist *sg,
int nused, int nelems,
@@ -566,7 +566,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
if (nelems == 1) {
sglist->dma_address =
dma_4u_map_single(dev,
- (page_address(sglist->page) +
+ (page_address(sg_page(sglist)) +
   sglist->offset),
  sglist->length, direction);
if (unlikely(sglist->dma_address == DMA_ERROR_CODE))
diff --git a/arch/sparc64/kernel/iommu_common.c 
b/arch/sparc64/kernel/iommu_common.c
index d7ca900..ec863e0 100644
--- a/arch/sparc64/kernel/iommu_common.c
+++ b/arch/sparc64/kernel/iommu_common.c
@@ -73,7 +73,7 @@ static int verify_one_map(struct scatterlist *dma_sg, struct 
scatterlist **__sg,
 
daddr = dma_sg->dma_address;
sglen = sg->length;
-   sgaddr = (unsigned long) (page_address(sg->page) + sg->offset);
+   sgaddr = (unsigned long) (page_address(sg_page(sg)) + sg->offset);
while (dlen > 0) {
unsigned long paddr;
 
@@ -123,7 +123,7 @@ static int verify_one_map(struct scatterlist *dma_sg, 
struct scatterlist **__sg,
sg = sg_next(sg);
if (--nents <= 0)
break;
-   sgaddr = (unsigned long) (page_address(sg->page) + sg->offset);
+   sgaddr = (unsigned long) (page_address(sg_page(sg)) + 
sg->offset);
sglen = sg->length;
}
if (dlen < 0) {
@@ -191,7 +191,7 @@ void verify_sglist(struct scatterlist *sglist, int nents, 
iopte_t *iopte, int np
printk("sg(%d): page_addr(%p) off(%x) length(%x) "
   "dma_address[%016x] dma_length[%016x]\n",
   i,
-  page_address(sg->page), sg->offset,
+  page_address(sg_page(sg)), sg->offset,
   sg->length,
   sg->dma_address, sg->dma_length);
}
@@ -207,15 +207,15 @@ unsigned long prepare_sg(struct scatterlist *sg, int 
nents)
unsigned long prev;
u32 dent_addr, dent_len;
 
-   prev  = (unsigned long) (page_address(sg->page) + sg->offset);
+   prev  = (unsigned long) (page_address(sg_page(sg)) + sg->offset);
prev += (unsigned long) (dent_len = sg->length);
-   dent_addr = (u32) ((unsigned long)(page_address(sg->page) + sg->offset)
+   dent_addr = (u32) ((unsigned long)(page_address(sg_page(sg)) + 
sg->offset)
   & (IO_PAGE_SIZE - 1UL));
while (--nents) {
unsigned long addr;
 
sg = sg_next(sg);
-   addr = (unsigned long) (page_address(sg->page) + sg->offset);
+   addr = (unsigned long) (page_address(sg_page(sg)) + sg->offset);
if (! VCONTIG(prev, addr)) {
dma_sg->dma_address = dent_addr;
dma_sg->dma_length = dent_len;
diff --git a/arch/sparc64/kernel/pci_sun4v.c b/arch/sparc64/kernel/pci_sun4v.c
index fe46ace..5324a34 100644
--- a/arch/sparc64/kernel/pci_sun4v.c
+++ b/arch/sparc64/kernel/pci_sun4v.c
@@ -366,7 +366,7 @@ static void dma_4v_unmap_single(struct device *dev, 
dma_addr_t bus_addr,
 }
 
 #define SG_ENT_PHYS_ADDRESS(SG)\
-   (__pa(page_address((SG)->page)) + (SG)->offset)
+   (__pa(page_address(sg_page(SG))) + (SG)->offset)
 
 static long fill_sg(long entry, struct device *dev,
struct scatterlist *sg,
@@ -478,7 +478,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
if (nelems == 1) {
sglist->dma_address =
dma_4v_map_single(dev,
- (page_address(sglist->page) +
+ 

Re: [PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jeff Garzik <[EMAIL PROTECTED]> wrote:

> I never had to apply the changes you included, to fix problems here.

perhaps because you are not running a CONFIG_DEBUG_PAGEALLOC=y kernel?

I recently fixed DEBUG_PAGEALLOC (it would crash upon bootup on x86 most 
of the time on any real hardware - so i doubt people were able to use it 
all that much). As long as you try Linus' latest -git tree (which has 
the latest arch/x86 merge) and use the 32-bit x86 kernel it should work 
fine for you too, and you will probably be able to trigger similar 
crashes too.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Ingo Molnar wrote:
> 
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
> 
> > > -#define SCSI_MAX_SG_SEGMENTS 128
> > > +#define SCSI_MAX_SG_SEGMENTS 129
> > 
> > this one finally made the trick and it's booting fine now, without any 
> > crashes!
> 
> hm, spoke too soon - i just got the ata_qc_issue() crash below. I've 
> attached further below the current fixes/workarounds that i have applied 
> at the moment. Any ideas?

Hang on Ingo, will post an updated patch soonish!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Ingo Molnar wrote:

* Jeff Garzik <[EMAIL PROTECTED]> wrote:

Tomo and I agreed to kill sg_last() a few days ago anyways, so this 
is perfectly fine with me.
Yep, the [attached] patch that kills ata_sg_is_last() is working here 
on both machines that were previously croaking.


It would be nice to get pdc_adma, sata_sil24 and ipr it-works test 
done, but IMO the patch is pretty straightforward and should be OK.


just a quick question: i have Jens's workarounds applied right now (see 
patch below). Am i now crash/corruption-safe, or do i need your patch 
too? And once your patch [and the other sg_*() patches] are upstream i 
dont need the workaround anymore, correct?


You need my patch if and only if you use one of the drivers touched by 
the patch.  ata_sg_is_last() was a driver helper function, so my fix 
never really touched core code.


I never had to apply the changes you included, to fix problems here.

And looking at those changes...

-   q->max_phys_segments = max_segments;
+   q->max_phys_segments = max_segments - 1;

 ...

-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129


I wonder if libata should be doing

blk_queue_max_phys_segments(q, q->max_phys_segments - 1)

to account for the pad entry that libata owns.

Jeff



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > -#define SCSI_MAX_SG_SEGMENTS   128
> > +#define SCSI_MAX_SG_SEGMENTS   129
> 
> this one finally made the trick and it's booting fine now, without any 
> crashes!

hm, spoke too soon - i just got the ata_qc_issue() crash below. I've 
attached further below the current fixes/workarounds that i have applied 
at the moment. Any ideas?

Ingo

->
[  155.259466] kjournald starting.  Commit interval 5 seconds
[  155.265103] EXT3 FS on sda5, internal journal
[  155.269319] EXT3-fs: mounted filesystem with ordered data mode.
[  156.458225] BUG: unable to handle kernel paging request at virtual address 
7d5ac000
[  156.465723] printing eip: 784e9300 *pde = 00ddd027 *pte = 055ac000 
[  156.471964] Oops:  [#1] DEBUG_PAGEALLOC
[  156.476123] 
[  156.477597] Pid: 0, comm: swapper Not tainted (2.6.23 #40)
[  156.483055] EIP: 0060:[<784e9300>] EFLAGS: 00010006 CPU: 0
[  156.488520] EIP is at ata_qc_issue+0xd0/0x340
[  156.492848] EAX: 3d328000 EBX: 7d5ac000 ECX: 0020 EDX: 0020
[  156.499087] ESI: 7d5ab480 EDI: 7d5abe00 EBP: 7b54007c ESP: 78a13e1c
[  156.505328]  DS: 007b ES: 007b FS:  GS:  SS: 0068
[  156.510700] Process swapper (pid: 0, ti=78a12000 task=789753e0 
task.ti=78a12000)
[  156.517893] Stack: 7d5ac000 7b54 7b54  7d5abff0 7b54007c 
7d5ab480 7b5417a4 
[  156.526213]784c2330 784ef49e 784f1ff3 7b52de98 7d5ab480 7b54 
7b5417a4 7d5ab480 
[  156.534531]7b54 7b524004 784f20e0 784ef180 784c2330 7d5ab480 
0216 7b524004 
[  156.542851] Call Trace:
[  156.545452]  [<784c2330>] scsi_done+0x0/0x20
[  156.549698]  [<784ef49e>] ata_scsi_translate+0xbe/0x140
[  156.554897]  [<784f1ff3>] ata_scsi_queuecmd+0x33/0x200
[  156.560010]  [<784f20e0>] ata_scsi_queuecmd+0x120/0x200
[  156.565210]  [<784ef180>] ata_scsi_rw_xlat+0x0/0x220
[  156.570150]  [<784c2330>] scsi_done+0x0/0x20
[  156.574395]  [<784c2bb2>] scsi_dispatch_cmd+0x152/0x290
[  156.579596]  [<78135aa7>] trace_hardirqs_on+0x67/0xb0
[  156.584622]  [<784c8abe>] scsi_request_fn+0x1be/0x370
[  156.589649]  [<78407ef6>] blk_run_queue+0x36/0x80
[  156.594328]  [<784c73c0>] scsi_next_command+0x30/0x50
[  156.599354]  [<784c754b>] scsi_end_request+0xab/0xe0
[  156.604294]  [<784c8239>] scsi_io_completion+0xa9/0x3d0
[  156.609493]  [<78135aa7>] trace_hardirqs_on+0x67/0xb0
[  156.614520]  [<78404f85>] blk_done_softirq+0x45/0x80
[  156.619460]  [<78404fb3>] blk_done_softirq+0x73/0x80
[  156.624400]  [<7811d2f3>] __do_softirq+0x53/0xb0
[  156.628992]  [<7811d3b8>] do_softirq+0x68/0x70
[  156.633412]  [<78105351>] do_IRQ+0x51/0x90
[  156.637486]  [<7810290f>] restore_nocheck+0x12/0x15
[  156.642339]  [<7810388e>] common_interrupt+0x2e/0x40
[  156.647277]  [<7810f4c0>] pgd_dtor+0x0/0x50
[  156.651437]  [<7815f1d0>] quicklist_trim+0x0/0x90
[  156.656117]  [<7810f4bb>] check_pgt_cache+0x1b/0x20
[  156.660970]  [<78100c52>] cpu_idle+0x32/0x60
[  156.665217]  [<78a14b35>] start_kernel+0x265/0x300
[  156.669983]  [<78a14380>] unknown_bootoption+0x0/0x1e0
[  156.675096]  ===
[  156.678649] Code: 84 d9 01 00 00 7e 32 31 d2 89 f6 8b 1c 24 83 c2 01 8b 03 
2b 05 18 ed d7 78 c1 f8 05 c1 e0 0c 03 43 04 89 43 08 83 c3 10 89 1c 24 <8b> 03 
a8 01 0f 85 58 02 00 00 39 ca 75 d2 f0 83 44 24 00 00 85 
[  156.697455] EIP: [<784e9300>] ata_qc_issue+0xd0/0x340 SS:ESP 0068:78a13e1c
[  156.704822] Kernel panic - not syncing: Fatal exception in interrupt

---
 block/ll_rw_blk.c |2 +-
 drivers/ata/libata-core.c |2 +-
 drivers/scsi/scsi_lib.c   |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

Index: linux/block/ll_rw_blk.c
===
--- linux.orig/block/ll_rw_blk.c
+++ linux/block/ll_rw_blk.c
@@ -631,7 +631,7 @@ void blk_queue_max_phys_segments(struct 
printk("%s: set to minimum %d\n", __FUNCTION__, max_segments);
}
 
-   q->max_phys_segments = max_segments;
+   q->max_phys_segments = max_segments - 1;
 }
 
 EXPORT_SYMBOL(blk_queue_max_phys_segments);
Index: linux/drivers/ata/libata-core.c
===
--- linux.orig/drivers/ata/libata-core.c
+++ linux/drivers/ata/libata-core.c
@@ -4664,7 +4664,7 @@ static int ata_sg_setup(struct ata_queue
 {
struct ata_port *ap = qc->ap;
struct scatterlist *sg = qc->__sg;
-   struct scatterlist *lsg = sg_last(qc->__sg, qc->n_elem);
+   struct scatterlist *lsg = >__sg[qc->n_elem - 1];
int n_elem, pre_n_elem, dir, trim_sg = 0;
 
VPRINTK("ENTER, ata%u\n", ap->print_id);
Index: linux/drivers/scsi/scsi_lib.c
===
--- linux.orig/drivers/scsi/scsi_lib.c
+++ linux/drivers/scsi/scsi_lib.c
@@ -39,7 +39,7 @@
  * (unless chaining is used). Should ideally fit inside a single page, to
  * avoid a higher order allocation.
  */
-#define SCSI_MAX_SG_SEGMENTS   128

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Ingo Molnar wrote:
> 
> * Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > > It would always be nice. For this case I don't think it's very 
> > > interesting, if we pursue the improved sg iteration setup.
> > 
> > BTW, I think it's pretty clear that ata_sg_is_last() is broken. It's 
> > likely a one-off in the n_iter test.
> 
> fixing that would be a -stable candidate, as a potential data corruptor 
> - or is it more benign?

I think it's safe to say that it was sg chaining introduced breakage, so
it should work fine in 2.6.23.x.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jeff Garzik <[EMAIL PROTECTED]> wrote:

> >Tomo and I agreed to kill sg_last() a few days ago anyways, so this 
> >is perfectly fine with me.
> 
> Yep, the [attached] patch that kills ata_sg_is_last() is working here 
> on both machines that were previously croaking.
> 
> It would be nice to get pdc_adma, sata_sil24 and ipr it-works test 
> done, but IMO the patch is pretty straightforward and should be OK.

just a quick question: i have Jens's workarounds applied right now (see 
patch below). Am i now crash/corruption-safe, or do i need your patch 
too? And once your patch [and the other sg_*() patches] are upstream i 
dont need the workaround anymore, correct?

Ingo

---
 block/ll_rw_blk.c |2 +-
 drivers/ata/libata-core.c |2 +-
 drivers/scsi/scsi_lib.c   |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

Index: linux/block/ll_rw_blk.c
===
--- linux.orig/block/ll_rw_blk.c
+++ linux/block/ll_rw_blk.c
@@ -631,7 +631,7 @@ void blk_queue_max_phys_segments(struct 
printk("%s: set to minimum %d\n", __FUNCTION__, max_segments);
}
 
-   q->max_phys_segments = max_segments;
+   q->max_phys_segments = max_segments - 1;
 }
 
 EXPORT_SYMBOL(blk_queue_max_phys_segments);
Index: linux/drivers/ata/libata-core.c
===
--- linux.orig/drivers/ata/libata-core.c
+++ linux/drivers/ata/libata-core.c
@@ -4664,7 +4664,7 @@ static int ata_sg_setup(struct ata_queue
 {
struct ata_port *ap = qc->ap;
struct scatterlist *sg = qc->__sg;
-   struct scatterlist *lsg = sg_last(qc->__sg, qc->n_elem);
+   struct scatterlist *lsg = >__sg[qc->n_elem - 1];
int n_elem, pre_n_elem, dir, trim_sg = 0;
 
VPRINTK("ENTER, ata%u\n", ap->print_id);
Index: linux/drivers/scsi/scsi_lib.c
===
--- linux.orig/drivers/scsi/scsi_lib.c
+++ linux/drivers/scsi/scsi_lib.c
@@ -39,7 +39,7 @@
  * (unless chaining is used). Should ideally fit inside a single page, to
  * avoid a higher order allocation.
  */
-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129
 
 struct scsi_host_sg_pool {
size_t  size;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Ingo Molnar wrote:

* Jens Axboe <[EMAIL PROTECTED]> wrote:

It would always be nice. For this case I don't think it's very 
interesting, if we pursue the improved sg iteration setup.
BTW, I think it's pretty clear that ata_sg_is_last() is broken. It's 
likely a one-off in the n_iter test.


fixing that would be a -stable candidate, as a potential data corruptor 
- or is it more benign?


It is confirmed working prior to the sg-chaining stuff, so 2.6.23.1 is OK...

Jeff



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jens Axboe <[EMAIL PROTECTED]> wrote:

> > It would always be nice. For this case I don't think it's very 
> > interesting, if we pursue the improved sg iteration setup.
> 
> BTW, I think it's pretty clear that ata_sg_is_last() is broken. It's 
> likely a one-off in the n_iter test.

fixing that would be a -stable candidate, as a potential data corruptor 
- or is it more benign?

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Jens Axboe wrote:

That should work as well. WRT ata_sg_is_last(), if we go ahead with my
recent sg chaining updates, we can keep the test as it would be a single
conditional and not require any looping.
Let me know when you have tested this!
The patch I attached to the last email got both sata_mv test boxes working 
reliably (so far).


I worked up a patch that kills ata_sg_is_last() (plus the max_phys_segments 
sata_mv fix), see attached.  I'm thinking this is what I like to see in 
upstream.


Great!

Of course, this doesn't explain why ata_sg_is_last() was broken, but since 
it's working _and_ slightly more efficient, I don't really care :)


Tomo and I agreed to kill sg_last() a few days ago anyways, so this is
perfectly fine with me.


Yep, the [attached] patch that kills ata_sg_is_last() is working here on 
both machines that were previously croaking.


It would be nice to get pdc_adma, sata_sil24 and ipr it-works test done, 
but IMO the patch is pretty straightforward and should be OK.


Jeff


diff --git a/drivers/ata/pdc_adma.c b/drivers/ata/pdc_adma.c
index 8d1b03d..199f7e1 100644
--- a/drivers/ata/pdc_adma.c
+++ b/drivers/ata/pdc_adma.c
@@ -318,7 +318,7 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
struct scatterlist *sg;
struct ata_port *ap = qc->ap;
struct adma_port_priv *pp = ap->private_data;
-   u8  *buf = pp->pkt;
+   u8  *buf = pp->pkt, *last_buf = NULL;
int i = (2 + buf[3]) * 8;
u8 pFLAGS = pORD | ((qc->tf.flags & ATA_TFLAG_WRITE) ? pDIRO : 0);
 
@@ -334,8 +334,7 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
*(__le32 *)(buf + i) = cpu_to_le32(len);
i += 4;
 
-   if (ata_sg_is_last(sg, qc))
-   pFLAGS |= pEND;
+   last_buf = [i];
buf[i++] = pFLAGS;
buf[i++] = qc->dev->dma_mode & 0xf;
buf[i++] = 0;   /* pPKLW */
@@ -348,6 +347,10 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
VPRINTK("PRD[%u] = (0x%lX, 0x%X)\n", i/4,
(unsigned long)addr, len);
}
+
+   if (likely(last_buf))
+   *last_buf |= pEND;
+
return i;
 }
 
diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..7f1b13e 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -421,7 +421,6 @@ static void mv_error_handler(struct ata_port *ap);
 static void mv_post_int_cmd(struct ata_queued_cmd *qc);
 static void mv_eh_freeze(struct ata_port *ap);
 static void mv_eh_thaw(struct ata_port *ap);
-static int mv_slave_config(struct scsi_device *sdev);
 static int mv_init_one(struct pci_dev *pdev, const struct pci_device_id *ent);
 
 static void mv5_phy_errata(struct mv_host_priv *hpriv, void __iomem *mmio,
@@ -459,7 +458,7 @@ static struct scsi_host_template mv5_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -477,7 +476,7 @@ static struct scsi_host_template mv6_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -756,17 +755,6 @@ static void mv_irq_clear(struct ata_port *ap)
 {
 }
 
-static int mv_slave_config(struct scsi_device *sdev)
-{
-   int rc = ata_scsi_slave_config(sdev);
-   if (rc)
-   return rc;
-
-   blk_queue_max_phys_segments(sdev->request_queue, MV_MAX_SG_CT / 2);
-
-   return 0;   /* scsi layer doesn't check return value, sigh */
-}
-
 static void mv_set_edma_ptrs(void __iomem *port_mmio,
 struct mv_host_priv *hpriv,
 struct mv_port_priv *pp)
@@ -1138,7 +1126,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc->ap->private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg, *last_sg = NULL;
 
mv_sg = pp->sg_tbl;
ata_for_each_sg(sg, qc) {
@@ -1159,13 +1147,13 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
-   if (!sg_len && ata_sg_is_last(sg, qc))
-   mv_sg->flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
-
+   last_sg = mv_sg;
mv_sg++;
}
-
}
+

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jens Axboe wrote:
> On Thu, Oct 18 2007, Ingo Molnar wrote:
> > 
> > * Jens Axboe <[EMAIL PROTECTED]> wrote:
> > 
> > > > Of course, this doesn't explain why ata_sg_is_last() was broken, but 
> > > > since it's working _and_ slightly more efficient, I don't really 
> > > > care :)
> > > 
> > > Tomo and I agreed to kill sg_last() a few days ago anyways, so this is 
> > > perfectly fine with me.
> > 
> > it would still be nice to figure out exactly why it was broken.
> 
> It would always be nice. For this case I don't think it's very
> interesting, if we pursue the improved sg iteration setup.

BTW, I think it's pretty clear that ata_sg_is_last() is broken. It's
likely a one-off in the n_iter test.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Ingo Molnar wrote:
> 
> * Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > > Of course, this doesn't explain why ata_sg_is_last() was broken, but 
> > > since it's working _and_ slightly more efficient, I don't really 
> > > care :)
> > 
> > Tomo and I agreed to kill sg_last() a few days ago anyways, so this is 
> > perfectly fine with me.
> 
> it would still be nice to figure out exactly why it was broken.

It would always be nice. For this case I don't think it's very
interesting, if we pursue the improved sg iteration setup.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jens Axboe <[EMAIL PROTECTED]> wrote:

> > Of course, this doesn't explain why ata_sg_is_last() was broken, but 
> > since it's working _and_ slightly more efficient, I don't really 
> > care :)
> 
> Tomo and I agreed to kill sg_last() a few days ago anyways, so this is 
> perfectly fine with me.

it would still be nice to figure out exactly why it was broken.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
> Jens Axboe wrote:
>> That should work as well. WRT ata_sg_is_last(), if we go ahead with my
>> recent sg chaining updates, we can keep the test as it would be a single
>> conditional and not require any looping.
>> Let me know when you have tested this!
>
> The patch I attached to the last email got both sata_mv test boxes working 
> reliably (so far).
>
> I worked up a patch that kills ata_sg_is_last() (plus the max_phys_segments 
> sata_mv fix), see attached.  I'm thinking this is what I like to see in 
> upstream.

Great!

> Of course, this doesn't explain why ata_sg_is_last() was broken, but since 
> it's working _and_ slightly more efficient, I don't really care :)

Tomo and I agreed to kill sg_last() a few days ago anyways, so this is
perfectly fine with me.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

That should work as well. WRT ata_sg_is_last(), if we go ahead with my
recent sg chaining updates, we can keep the test as it would be a single
conditional and not require any looping.

Let me know when you have tested this!


The patch I attached to the last email got both sata_mv test boxes 
working reliably (so far).


I worked up a patch that kills ata_sg_is_last() (plus the 
max_phys_segments sata_mv fix), see attached.  I'm thinking this is what 
I like to see in upstream.


Of course, this doesn't explain why ata_sg_is_last() was broken, but 
since it's working _and_ slightly more efficient, I don't really care :)


Jeff


diff --git a/drivers/ata/pdc_adma.c b/drivers/ata/pdc_adma.c
index 8d1b03d..199f7e1 100644
--- a/drivers/ata/pdc_adma.c
+++ b/drivers/ata/pdc_adma.c
@@ -318,7 +318,7 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
struct scatterlist *sg;
struct ata_port *ap = qc->ap;
struct adma_port_priv *pp = ap->private_data;
-   u8  *buf = pp->pkt;
+   u8  *buf = pp->pkt, *last_buf = NULL;
int i = (2 + buf[3]) * 8;
u8 pFLAGS = pORD | ((qc->tf.flags & ATA_TFLAG_WRITE) ? pDIRO : 0);
 
@@ -334,8 +334,7 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
*(__le32 *)(buf + i) = cpu_to_le32(len);
i += 4;
 
-   if (ata_sg_is_last(sg, qc))
-   pFLAGS |= pEND;
+   last_buf = [i];
buf[i++] = pFLAGS;
buf[i++] = qc->dev->dma_mode & 0xf;
buf[i++] = 0;   /* pPKLW */
@@ -348,6 +347,10 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
VPRINTK("PRD[%u] = (0x%lX, 0x%X)\n", i/4,
(unsigned long)addr, len);
}
+
+   if (likely(last_buf))
+   *last_buf |= pEND;
+
return i;
 }
 
diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..7f1b13e 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -421,7 +421,6 @@ static void mv_error_handler(struct ata_port *ap);
 static void mv_post_int_cmd(struct ata_queued_cmd *qc);
 static void mv_eh_freeze(struct ata_port *ap);
 static void mv_eh_thaw(struct ata_port *ap);
-static int mv_slave_config(struct scsi_device *sdev);
 static int mv_init_one(struct pci_dev *pdev, const struct pci_device_id *ent);
 
 static void mv5_phy_errata(struct mv_host_priv *hpriv, void __iomem *mmio,
@@ -459,7 +458,7 @@ static struct scsi_host_template mv5_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -477,7 +476,7 @@ static struct scsi_host_template mv6_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -756,17 +755,6 @@ static void mv_irq_clear(struct ata_port *ap)
 {
 }
 
-static int mv_slave_config(struct scsi_device *sdev)
-{
-   int rc = ata_scsi_slave_config(sdev);
-   if (rc)
-   return rc;
-
-   blk_queue_max_phys_segments(sdev->request_queue, MV_MAX_SG_CT / 2);
-
-   return 0;   /* scsi layer doesn't check return value, sigh */
-}
-
 static void mv_set_edma_ptrs(void __iomem *port_mmio,
 struct mv_host_priv *hpriv,
 struct mv_port_priv *pp)
@@ -1138,7 +1126,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc->ap->private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg, *last_sg = NULL;
 
mv_sg = pp->sg_tbl;
ata_for_each_sg(sg, qc) {
@@ -1159,13 +1147,13 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
-   if (!sg_len && ata_sg_is_last(sg, qc))
-   mv_sg->flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
-
+   last_sg = mv_sg;
mv_sg++;
}
-
}
+
+   if (likely(last_sg))
+   last_sg->flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
 }
 
 static inline void mv_crqb_pack_cmd(__le16 *cmdw, u8 data, u8 addr, unsigned 
last)
diff --git a/drivers/ata/sata_sil24.c b/drivers/ata/sata_sil24.c
index b061927..26ebffc 100644
--- a/drivers/ata/sata_sil24.c
+++ b/drivers/ata/sata_sil24.c
@@ -796,16 +796,19 @@ static inline void sil24_fill_sg(struct 

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
> Jens Axboe wrote:
>> The sata_mv construct looks a bit odd. Does this work? That last
>
> The sata_mv construct worked just fine before sg chaining :)

Yes I know, but I'm trying to works towards getting rid of sg_last() and
ata_sg_is_last() anyway :-)

>> end_mv_sg test should always be true, just being paranoid...
>> diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
>> index 4df8311..5397eea 100644
>> --- a/drivers/ata/sata_mv.c
>> +++ b/drivers/ata/sata_mv.c
>> @@ -1138,8 +1138,9 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
>>  {
>>  struct mv_port_priv *pp = qc->ap->private_data;
>>  struct scatterlist *sg;
>> -struct mv_sg *mv_sg;
>> +struct mv_sg *mv_sg, *end_mv_sg;
>>  +   end_mv_sg = NULL;
>>  mv_sg = pp->sg_tbl;
>>  ata_for_each_sg(sg, qc) {
>>  dma_addr_t addr = sg_dma_address(sg);
>> @@ -1158,14 +1159,12 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
>>  sg_len -= len;
>>  addr += len;
>> -
>> -if (!sg_len && ata_sg_is_last(sg, qc))
>> -mv_sg->flags_size |= 
>> cpu_to_le32(EPRD_FLAG_END_OF_TBL);
>> -
>> +end_mv_sg = mv_sg;
>>  mv_sg++;
>>  }
>> -
>>  }
>> +if (end_mv_sg)
>> +end_mv_sg->flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
>>  }
>>  
>
> I'm testing a similar patch based on ata_fill_sg()'s method, which 
> basically does something similar to what you've done here (see attached).  
> I had noticed that ata_fill_sg() did not call ata_sg_is_last().
>
> If this fixes the problem, I think the best solution would be to delete 
> ata_sg_is_last().  In the few users that exist, we should be able to 
> eliminate the test programmatically as you and ata_fill_sg() have done -- 
> thereby eliminating a branch per loop in a hotpath.
>
> Off to test the attached...  if that doesn't work I'll try your version, 
> though there shouldn't be much difference.

That should work as well. WRT ata_sg_is_last(), if we go ahead with my
recent sg chaining updates, we can keep the test as it would be a single
conditional and not require any looping.

Let me know when you have tested this!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

The sata_mv construct looks a bit odd. Does this work? That last


The sata_mv construct worked just fine before sg chaining :)



end_mv_sg test should always be true, just being paranoid...

diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..5397eea 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -1138,8 +1138,9 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc->ap->private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg, *end_mv_sg;
 
+	end_mv_sg = NULL;

mv_sg = pp->sg_tbl;
ata_for_each_sg(sg, qc) {
dma_addr_t addr = sg_dma_address(sg);
@@ -1158,14 +1159,12 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 
 			sg_len -= len;

addr += len;
-
-   if (!sg_len && ata_sg_is_last(sg, qc))
-   mv_sg->flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
-
+   end_mv_sg = mv_sg;
mv_sg++;
}
-
}
+   if (end_mv_sg)
+   end_mv_sg->flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
 }
 


I'm testing a similar patch based on ata_fill_sg()'s method, which 
basically does something similar to what you've done here (see 
attached).  I had noticed that ata_fill_sg() did not call ata_sg_is_last().


If this fixes the problem, I think the best solution would be to delete 
ata_sg_is_last().  In the few users that exist, we should be able to 
eliminate the test programmatically as you and ata_fill_sg() have done 
-- thereby eliminating a branch per loop in a hotpath.


Off to test the attached...  if that doesn't work I'll try your version, 
though there shouldn't be much difference.


Jeff


diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..42b5a9e 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -421,7 +421,6 @@ static void mv_error_handler(struct ata_port *ap);
 static void mv_post_int_cmd(struct ata_queued_cmd *qc);
 static void mv_eh_freeze(struct ata_port *ap);
 static void mv_eh_thaw(struct ata_port *ap);
-static int mv_slave_config(struct scsi_device *sdev);
 static int mv_init_one(struct pci_dev *pdev, const struct pci_device_id *ent);
 
 static void mv5_phy_errata(struct mv_host_priv *hpriv, void __iomem *mmio,
@@ -459,7 +458,7 @@ static struct scsi_host_template mv5_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -477,7 +476,7 @@ static struct scsi_host_template mv6_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -756,17 +755,6 @@ static void mv_irq_clear(struct ata_port *ap)
 {
 }
 
-static int mv_slave_config(struct scsi_device *sdev)
-{
-   int rc = ata_scsi_slave_config(sdev);
-   if (rc)
-   return rc;
-
-   blk_queue_max_phys_segments(sdev->request_queue, MV_MAX_SG_CT / 2);
-
-   return 0;   /* scsi layer doesn't check return value, sigh */
-}
-
 static void mv_set_edma_ptrs(void __iomem *port_mmio,
 struct mv_host_priv *hpriv,
 struct mv_port_priv *pp)
@@ -1138,34 +1126,35 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc->ap->private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg = pp->sg_tbl;
+   unsigned int idx = 0;
 
-   mv_sg = pp->sg_tbl;
ata_for_each_sg(sg, qc) {
-   dma_addr_t addr = sg_dma_address(sg);
-   u32 sg_len = sg_dma_len(sg);
+   u64 addr;
+   u32 offset, sg_len, len;
+
+   addr = sg_dma_address(sg);
+   sg_len = sg_dma_len(sg);
 
while (sg_len) {
-   u32 offset = addr & 0x;
-   u32 len = sg_len;
+   offset = addr & 0x;
+   len = sg_len;
 
if ((offset + sg_len > 0x1))
len = 0x1 - offset;
 
-   mv_sg->addr = cpu_to_le32(addr & 0x);
-   mv_sg->addr_hi = cpu_to_le32((addr >> 16) >> 16);
-   mv_sg->flags_size = cpu_to_le32(len & 0x);
+   mv_sg[idx].addr = 

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
> Jens Axboe wrote:
>> index 4df8311..b858183 100644
>> --- a/drivers/ata/sata_mv.c
>> +++ b/drivers/ata/sata_mv.c
>> @@ -1139,6 +1139,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
>>  struct mv_port_priv *pp = qc->ap->private_data;
>>  struct scatterlist *sg;
>>  struct mv_sg *mv_sg;
>> +int end_marked = 0;
>>  mv_sg = pp->sg_tbl;
>>  ata_for_each_sg(sg, qc) {
>> @@ -1159,13 +1160,15 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
>>  sg_len -= len;
>>  addr += len;
>>  -   if (!sg_len && ata_sg_is_last(sg, qc))
>> +if (!sg_len && ata_sg_is_last(sg, qc)) {
>>  mv_sg->flags_size |= 
>> cpu_to_le32(EPRD_FLAG_END_OF_TBL);
>> +end_marked++;
>> +}
>>  mv_sg++;
>>  }
>> -
>>  }
>> +BUG_ON(end_marked != 1);
>
>
> Your BUG_ON() does indeed trip, here.
>
> Its surprising that other folks don't explode, considering that 
> mv_fill_sg() intentionally mirrors the logic in ata_fill_sg().

The sata_mv construct looks a bit odd. Does this work? That last
end_mv_sg test should always be true, just being paranoid...

diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..5397eea 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -1138,8 +1138,9 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc->ap->private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg, *end_mv_sg;
 
+   end_mv_sg = NULL;
mv_sg = pp->sg_tbl;
ata_for_each_sg(sg, qc) {
dma_addr_t addr = sg_dma_address(sg);
@@ -1158,14 +1159,12 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 
sg_len -= len;
addr += len;
-
-   if (!sg_len && ata_sg_is_last(sg, qc))
-   mv_sg->flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
-
+   end_mv_sg = mv_sg;
mv_sg++;
}
-
}
+   if (end_mv_sg)
+   end_mv_sg->flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
 }
 
 static inline void mv_crqb_pack_cmd(__le16 *cmdw, u8 data, u8 addr, unsigned 
last)

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Benny Halevy wrote:



On 10/18/07, *Jeff Garzik* <[EMAIL PROTECTED] > wrote:

Jens Axboe wrote:
 > index 4df8311..b858183 100644
 > --- a/drivers/ata/sata_mv.c
 > +++ b/drivers/ata/sata_mv.c
 > @@ -1139,6 +1139,7 @@ static void mv_fill_sg(struct
ata_queued_cmd *qc)
 >   struct mv_port_priv *pp = qc->ap->private_data;
 >   struct scatterlist *sg;
 >   struct mv_sg *mv_sg;
 > + int end_marked = 0;
 >
 >   mv_sg = pp->sg_tbl;
 >   ata_for_each_sg(sg, qc) {
 > @@ -1159,13 +1160,15 @@ static void mv_fill_sg(struct
ata_queued_cmd *qc)
 >   sg_len -= len;
 >   addr += len;
 >
 > - if (!sg_len && ata_sg_is_last(sg, qc))
 > + if (!sg_len && ata_sg_is_last(sg, qc)) { 



I'm not sure, but shouldn't that be || rather than &&?


sg_len is zero at the end of each scatterlist entry, so we need to test 
the additional ata_sg_is_last() condition to determine if we are really 
at the end of the PRD table.


Jeff



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

index 4df8311..b858183 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -1139,6 +1139,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
struct mv_port_priv *pp = qc->ap->private_data;
struct scatterlist *sg;
struct mv_sg *mv_sg;
+   int end_marked = 0;
 
 	mv_sg = pp->sg_tbl;

ata_for_each_sg(sg, qc) {
@@ -1159,13 +1160,15 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
-			if (!sg_len && ata_sg_is_last(sg, qc))

+   if (!sg_len && ata_sg_is_last(sg, qc)) {
mv_sg->flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
+   end_marked++;
+   }
 
 			mv_sg++;

}
-
}
+   BUG_ON(end_marked != 1);



Your BUG_ON() does indeed trip, here.

Its surprising that other folks don't explode, considering that 
mv_fill_sg() intentionally mirrors the logic in ata_fill_sg().


Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jeff Garzik wrote:
I noticed that 
sg_tablesize in sata_mv is not LIBATA_MAX_PRD.  This is expected 
behavior, but I wonder if that difference -- most notably being smaller 
than SCSI_MAX_SG_SEGMENTS -- would trigger any latent bugs.


Another sata_mv difference from the rest: the chip does not support 
ATAPI, so we never care about qc->pad


Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Ingo Molnar wrote:

* Jens Axboe <[EMAIL PROTECTED]> wrote:

--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -39,7 +39,7 @@
  * (unless chaining is used). Should ideally fit inside a single page, 
to

  * avoid a higher order allocation.
  */
-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129
this one finally made the trick and it's booting fine now, without any 
crashes!
Alas, this didn't help me here.  I did manage to capture the error messages 
this time, and in the two machines I'm actively testing on, sata_mv is the 
driver that's dying in both cases.  Machine A additionally has sata_nv, 
which is working.  Machine B additionally has ata_piix, which is working.  
So in both cases, its sata_mv that is throwing SError complaints:


Theory - ata_sg_is_last() isn't returning true for the last entry. Can
you double check that it correcly marks the last entry in mv_fill_sg()?
Alternatively, just try this patch.


Will check in a few minutes, after my current test:  I noticed that 
sg_tablesize in sata_mv is not LIBATA_MAX_PRD.  This is expected 
behavior, but I wonder if that difference -- most notably being smaller 
than SCSI_MAX_SG_SEGMENTS -- would trigger any latent bugs.


Anyway, we will both have answers momentarily...

Jeff



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
> Ingo Molnar wrote:
>> * Jens Axboe <[EMAIL PROTECTED]> wrote:
>>> --- a/drivers/scsi/scsi_lib.c
>>> +++ b/drivers/scsi/scsi_lib.c
>>> @@ -39,7 +39,7 @@
>>>   * (unless chaining is used). Should ideally fit inside a single page, 
>>> to
>>>   * avoid a higher order allocation.
>>>   */
>>> -#define SCSI_MAX_SG_SEGMENTS   128
>>> +#define SCSI_MAX_SG_SEGMENTS   129
>> this one finally made the trick and it's booting fine now, without any 
>> crashes!
>
> Alas, this didn't help me here.  I did manage to capture the error messages 
> this time, and in the two machines I'm actively testing on, sata_mv is the 
> driver that's dying in both cases.  Machine A additionally has sata_nv, 
> which is working.  Machine B additionally has ata_piix, which is working.  
> So in both cases, its sata_mv that is throwing SError complaints:

Theory - ata_sg_is_last() isn't returning true for the last entry. Can
you double check that it correcly marks the last entry in mv_fill_sg()?
Alternatively, just try this patch.


diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..b858183 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -1139,6 +1139,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
struct mv_port_priv *pp = qc->ap->private_data;
struct scatterlist *sg;
struct mv_sg *mv_sg;
+   int end_marked = 0;
 
mv_sg = pp->sg_tbl;
ata_for_each_sg(sg, qc) {
@@ -1159,13 +1160,15 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
-   if (!sg_len && ata_sg_is_last(sg, qc))
+   if (!sg_len && ata_sg_is_last(sg, qc)) {
mv_sg->flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
+   end_marked++;
+   }
 
mv_sg++;
}
-
}
+   BUG_ON(end_marked != 1);
 }
 
 static inline void mv_crqb_pack_cmd(__le16 *cmdw, u8 data, u8 addr, unsigned 
last)

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Ingo Molnar wrote:

* Jens Axboe <[EMAIL PROTECTED]> wrote:


--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -39,7 +39,7 @@
  * (unless chaining is used). Should ideally fit inside a single page, to
  * avoid a higher order allocation.
  */
-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129


this one finally made the trick and it's booting fine now, without any 
crashes!


Alas, this didn't help me here.  I did manage to capture the error 
messages this time, and in the two machines I'm actively testing on, 
sata_mv is the driver that's dying in both cases.  Machine A 
additionally has sata_nv, which is working.  Machine B additionally has 
ata_piix, which is working.  So in both cases, its sata_mv that is 
throwing SError complaints:



ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x40 action 0x6 frozen
ata7.00: edma_err 0x0480, EDMA self-disable
ata7: SError: { Handshk }
ata7.00: cmd ca/00:08:c7:40:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 out
 res 50/00:00:ce:40:00/00:00:00:00:00/e0 Emask 0x100 (unknown error)
ata7.00: status: { DRDY }
ata7: hard resetting link
ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 300)


Still digging...  this behavior showed up after libata changes went in, 
FWIW.


Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Wed, Oct 17 2007, David Miller wrote:
> From: Linus Torvalds <[EMAIL PROTECTED]>
> Date: Wed, 17 Oct 2007 18:07:19 -0700 (PDT)
> 
> > sg_next() - as it stands now - never actually looks at the SG that its 
> > argument points to: it explicitly *only* looks at the next one.
> > 
> > That's the bug. If sg_next() looked at the actual *current* sg entry, we 
> > wouldn't have any issues to begin with, and that's what I'm arguing we 
> > should do in the longer run (where "longer run" is defined as "when Jens 
> > does it asap").
> 
> What the thing really wants is some kind of indication of state,
> without having to bloat up the scatterlist structure.
> 
> I believe that we have enough of a limited set of accessors to
> sg->page that we can more aggressively encode things in the lower
> bits.
> 
> I'm thinking of encoding the low two bits of sg->page as
> follows:
> 
> 1) bits == 0
> 
>then the SG list is linear and sg_next() is sg++
> 
> 2) bits == 1
> 
>the nest SG is an indirect chunk, sg_next() is
>therefore something like:
> 
>   next = *((struct scatterlist **)(sg + 1));
> 
> 3) bits == 2
> 
>this is the last entry in the scatterlist, sg_next() is NULL
> 
> So for the cases of ARCH_HAS_SG_CHAIN not being set (ie. back
> compatible), we can do no bit encoding in page->flags and just do
> sg_next() == sg++, as is done now.
> 
> When doing SG chaining, in each non-linear chunk we have to allocate
> one more pointer past the end of the scatterlist array (instead of a
> full extra scatterlist entry for the indirect pointer encode).  Next,
> all sg->page accesses have to be guarded to clear the state bits
> out first.
> 
> I don't know, maybe it would work, and would make the loop termination
> issues easier to handle properly.

I like it. Basically the only real change is using bit 2 as a
termination point, so we avoid going beyond the end of the sgtable.
Here's a starting point, it actually booted for me in the first go
(boggle). Only x86 so far, archs will need to be converted. And lots
more drivers I'm sure, I only fixed up the ones that botched my compile.

So just consider this a directional patch.

diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index 5cdfab6..daaf636 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -302,7 +302,7 @@ static int dma_map_sg_nonforce(struct device *dev, struct 
scatterlist *sg,
 #endif
 
for_each_sg(sg, s, nents, i) {
-   unsigned long addr = page_to_phys(s->page) + s->offset; 
+   unsigned long addr = page_to_phys(sg_page(s)) + s->offset; 
if (nonforced_iommu(dev, addr, s->length)) { 
addr = dma_map_area(dev, addr, s->length, dir);
if (addr == bad_dma_address) { 
@@ -397,7 +397,7 @@ static int gart_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
start_sg = sgmap = sg;
ps = NULL; /* shut up gcc */
for_each_sg(sg, s, nents, i) {
-   dma_addr_t addr = page_to_phys(s->page) + s->offset;
+   dma_addr_t addr = page_to_phys(sg_page(s)) + s->offset;
s->dma_address = addr;
BUG_ON(s->length == 0); 
 
diff --git a/arch/x86/kernel/pci-nommu_64.c b/arch/x86/kernel/pci-nommu_64.c
index e85d436..d64a4a5 100644
--- a/arch/x86/kernel/pci-nommu_64.c
+++ b/arch/x86/kernel/pci-nommu_64.c
@@ -62,8 +62,8 @@ static int nommu_map_sg(struct device *hwdev, struct 
scatterlist *sg,
int i;
 
for_each_sg(sg, s, nents, i) {
-   BUG_ON(!s->page);
-   s->dma_address = virt_to_bus(page_address(s->page) +s->offset);
+   BUG_ON(!sg_page(s));
+   s->dma_address = virt_to_bus(page_address(sg_page(s)) 
+s->offset);
if (!check_addr("map_sg", hwdev, s->dma_address, s->length))
return 0;
s->dma_length = s->length;
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 3935469..d02783c 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1354,10 +1354,11 @@ new_segment:
else
sg = sg_next(sg);
 
-   memset(sg, 0, sizeof(*sg));
-   sg->page = bvec->bv_page;
+   sg_set_page(sg, bvec->bv_page);
sg->length = nbytes;
sg->offset = bvec->bv_offset;
+   sg_dma_len(sg) = 0;
+   sg_dma_address(sg) = 0;
nsegs++;
}
bvprv = bvec;
diff --git a/crypto/digest.c b/crypto/digest.c
index e56de67..8871dec 100644
--- a/crypto/digest.c
+++ b/crypto/digest.c
@@ -41,7 +41,7 @@ static int update2(struct hash_desc *desc,
return 0;
 
for (;;) {
-   struct page *pg = sg->page;
+   struct page *pg = sg_page(sg);
unsigned int 

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Mark Lord wrote:

Mark Lord wrote:

Linus Torvalds wrote:

On Wed, 17 Oct 2007, Mark Lord wrote:

It would be good to have something soon-ish.
This "dead at boot time" issue is impacting the general ability to test
patches against latest -git in time for the current merge window.

In the meantime, does the patch I sent out help people?

Your patch from this posting http://lkml.org/lkml/2007/10/17/285
does not seem to make much difference here.

It still crashes at exactly the same place.

However, Jens's patch from that same thread:
http://lkml.org/lkml/2007/10/17/269
..allowed me to boot and post this followup message from -git12
Jeff: try that one.

That's already in my upstream kernel, here.  commits
ba951841ceb7fa5b06ad48caa5270cc2ae17941e and 
a3bec5c5aea0da263111c4d8f8eabc1f8560d7bf.


sata_mv and sata_nv still reliably poop themselves here, whereas its rock 
solid with 2.6.23.1.  Sounds like different issues from yours, as I see a 
stream of SATA errors on the bad kernels, errors which are often a symptom 
of something whacked in the DMA engine (misprogramming causes the silicon 
to generate bogus FIS's, which the device then chokes on)


Do you know if this poop involves the segment padding that sometimes
goes on in libata?


Definitely not, in this case -- it's all ATA, nothing ATAPI.

It throws SError { Handshk } which then triggers the EH to reset the 
link, and so it goes, over and over :)  The same thing happens when I 
intentionally screw up the PRD tables.  Not much more data points than 
that, so far.


I'll try the SCSI_MAX_SG_SEGMENTS patch too, to see if that fixes things.

Jeff



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Ingo Molnar wrote:
> 
> * Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -39,7 +39,7 @@
> >   * (unless chaining is used). Should ideally fit inside a single page, to
> >   * avoid a higher order allocation.
> >   */
> > -#define SCSI_MAX_SG_SEGMENTS   128
> > +#define SCSI_MAX_SG_SEGMENTS   129
> 
> this one finally made the trick and it's booting fine now, without any 
> crashes!

Super, that validates the theory the theory that Linus put forth. So the
bug is clear now, this morning I'll work on proper sg looping.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
> Mark Lord wrote:
>> Mark Lord wrote:
>>> Linus Torvalds wrote:

 On Wed, 17 Oct 2007, Mark Lord wrote:
> It would be good to have something soon-ish.
> This "dead at boot time" issue is impacting the general ability to test
> patches against latest -git in time for the current merge window.

 In the meantime, does the patch I sent out help people?
>>>
>>> Your patch from this posting http://lkml.org/lkml/2007/10/17/285
>>> does not seem to make much difference here.
>>>
>>> It still crashes at exactly the same place.
>> However, Jens's patch from that same thread:
>> http://lkml.org/lkml/2007/10/17/269
>> ..allowed me to boot and post this followup message from -git12
>> Jeff: try that one.
>
> That's already in my upstream kernel, here.  commits
> ba951841ceb7fa5b06ad48caa5270cc2ae17941e and 
> a3bec5c5aea0da263111c4d8f8eabc1f8560d7bf.
>
> sata_mv and sata_nv still reliably poop themselves here, whereas its rock 
> solid with 2.6.23.1.  Sounds like different issues from yours, as I see a 
> stream of SATA errors on the bad kernels, errors which are often a symptom 
> of something whacked in the DMA engine (misprogramming causes the silicon 
> to generate bogus FIS's, which the device then chokes on)

Do you know if this poop involves the segment padding that sometimes
goes on in libata?

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jens Axboe <[EMAIL PROTECTED]> wrote:

> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -39,7 +39,7 @@
>   * (unless chaining is used). Should ideally fit inside a single page, to
>   * avoid a higher order allocation.
>   */
> -#define SCSI_MAX_SG_SEGMENTS 128
> +#define SCSI_MAX_SG_SEGMENTS 129

this one finally made the trick and it's booting fine now, without any 
crashes!

Tested-by: Ingo Molnar <[EMAIL PROTECTED]>

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
> Mark Lord wrote:
>> Okay, mine is dying with EIP at blk_rq_map_sg+0xcb/0x160.
>> Screen photo is at http://rtr.ca/recent/2.6.23-git12-crash.jpg,
>> but the top was cut off (isn't there a new config option or patch
>> to do double-columns or scrollback or something ???.
>
> Is this a sata_mv box?  If so, could you try this patch?

If anything, that shrinks the size of the resulting request. Did this
patch make any difference to you?

Now sata_mv should not be doing this (already discussed), but as long as
it's only lowering the physical sg count then it should not cause any
bugs at least.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
 Mark Lord wrote:
 Okay, mine is dying with EIP at blk_rq_map_sg+0xcb/0x160.
 Screen photo is at http://rtr.ca/recent/2.6.23-git12-crash.jpg,
 but the top was cut off (isn't there a new config option or patch
 to do double-columns or scrollback or something ???.

 Is this a sata_mv box?  If so, could you try this patch?

If anything, that shrinks the size of the resulting request. Did this
patch make any difference to you?

Now sata_mv should not be doing this (already discussed), but as long as
it's only lowering the physical sg count then it should not cause any
bugs at least.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jens Axboe [EMAIL PROTECTED] wrote:

 --- a/drivers/scsi/scsi_lib.c
 +++ b/drivers/scsi/scsi_lib.c
 @@ -39,7 +39,7 @@
   * (unless chaining is used). Should ideally fit inside a single page, to
   * avoid a higher order allocation.
   */
 -#define SCSI_MAX_SG_SEGMENTS 128
 +#define SCSI_MAX_SG_SEGMENTS 129

this one finally made the trick and it's booting fine now, without any 
crashes!

Tested-by: Ingo Molnar [EMAIL PROTECTED]

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
 Mark Lord wrote:
 Mark Lord wrote:
 Linus Torvalds wrote:

 On Wed, 17 Oct 2007, Mark Lord wrote:
 It would be good to have something soon-ish.
 This dead at boot time issue is impacting the general ability to test
 patches against latest -git in time for the current merge window.

 In the meantime, does the patch I sent out help people?

 Your patch from this posting http://lkml.org/lkml/2007/10/17/285
 does not seem to make much difference here.

 It still crashes at exactly the same place.
 However, Jens's patch from that same thread:
 http://lkml.org/lkml/2007/10/17/269
 ..allowed me to boot and post this followup message from -git12
 Jeff: try that one.

 That's already in my upstream kernel, here.  commits
 ba951841ceb7fa5b06ad48caa5270cc2ae17941e and 
 a3bec5c5aea0da263111c4d8f8eabc1f8560d7bf.

 sata_mv and sata_nv still reliably poop themselves here, whereas its rock 
 solid with 2.6.23.1.  Sounds like different issues from yours, as I see a 
 stream of SATA errors on the bad kernels, errors which are often a symptom 
 of something whacked in the DMA engine (misprogramming causes the silicon 
 to generate bogus FIS's, which the device then chokes on)

Do you know if this poop involves the segment padding that sometimes
goes on in libata?

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Ingo Molnar wrote:
 
 * Jens Axboe [EMAIL PROTECTED] wrote:
 
  --- a/drivers/scsi/scsi_lib.c
  +++ b/drivers/scsi/scsi_lib.c
  @@ -39,7 +39,7 @@
* (unless chaining is used). Should ideally fit inside a single page, to
* avoid a higher order allocation.
*/
  -#define SCSI_MAX_SG_SEGMENTS   128
  +#define SCSI_MAX_SG_SEGMENTS   129
 
 this one finally made the trick and it's booting fine now, without any 
 crashes!

Super, that validates the theory the theory that Linus put forth. So the
bug is clear now, this morning I'll work on proper sg looping.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Mark Lord wrote:

Mark Lord wrote:

Linus Torvalds wrote:

On Wed, 17 Oct 2007, Mark Lord wrote:

It would be good to have something soon-ish.
This dead at boot time issue is impacting the general ability to test
patches against latest -git in time for the current merge window.

In the meantime, does the patch I sent out help people?

Your patch from this posting http://lkml.org/lkml/2007/10/17/285
does not seem to make much difference here.

It still crashes at exactly the same place.

However, Jens's patch from that same thread:
http://lkml.org/lkml/2007/10/17/269
..allowed me to boot and post this followup message from -git12
Jeff: try that one.

That's already in my upstream kernel, here.  commits
ba951841ceb7fa5b06ad48caa5270cc2ae17941e and 
a3bec5c5aea0da263111c4d8f8eabc1f8560d7bf.


sata_mv and sata_nv still reliably poop themselves here, whereas its rock 
solid with 2.6.23.1.  Sounds like different issues from yours, as I see a 
stream of SATA errors on the bad kernels, errors which are often a symptom 
of something whacked in the DMA engine (misprogramming causes the silicon 
to generate bogus FIS's, which the device then chokes on)


Do you know if this poop involves the segment padding that sometimes
goes on in libata?


Definitely not, in this case -- it's all ATA, nothing ATAPI.

It throws SError { Handshk } which then triggers the EH to reset the 
link, and so it goes, over and over :)  The same thing happens when I 
intentionally screw up the PRD tables.  Not much more data points than 
that, so far.


I'll try the SCSI_MAX_SG_SEGMENTS patch too, to see if that fixes things.

Jeff



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Wed, Oct 17 2007, David Miller wrote:
 From: Linus Torvalds [EMAIL PROTECTED]
 Date: Wed, 17 Oct 2007 18:07:19 -0700 (PDT)
 
  sg_next() - as it stands now - never actually looks at the SG that its 
  argument points to: it explicitly *only* looks at the next one.
  
  That's the bug. If sg_next() looked at the actual *current* sg entry, we 
  wouldn't have any issues to begin with, and that's what I'm arguing we 
  should do in the longer run (where longer run is defined as when Jens 
  does it asap).
 
 What the thing really wants is some kind of indication of state,
 without having to bloat up the scatterlist structure.
 
 I believe that we have enough of a limited set of accessors to
 sg-page that we can more aggressively encode things in the lower
 bits.
 
 I'm thinking of encoding the low two bits of sg-page as
 follows:
 
 1) bits == 0
 
then the SG list is linear and sg_next() is sg++
 
 2) bits == 1
 
the nest SG is an indirect chunk, sg_next() is
therefore something like:
 
   next = *((struct scatterlist **)(sg + 1));
 
 3) bits == 2
 
this is the last entry in the scatterlist, sg_next() is NULL
 
 So for the cases of ARCH_HAS_SG_CHAIN not being set (ie. back
 compatible), we can do no bit encoding in page-flags and just do
 sg_next() == sg++, as is done now.
 
 When doing SG chaining, in each non-linear chunk we have to allocate
 one more pointer past the end of the scatterlist array (instead of a
 full extra scatterlist entry for the indirect pointer encode).  Next,
 all sg-page accesses have to be guarded to clear the state bits
 out first.
 
 I don't know, maybe it would work, and would make the loop termination
 issues easier to handle properly.

I like it. Basically the only real change is using bit 2 as a
termination point, so we avoid going beyond the end of the sgtable.
Here's a starting point, it actually booted for me in the first go
(boggle). Only x86 so far, archs will need to be converted. And lots
more drivers I'm sure, I only fixed up the ones that botched my compile.

So just consider this a directional patch.

diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index 5cdfab6..daaf636 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -302,7 +302,7 @@ static int dma_map_sg_nonforce(struct device *dev, struct 
scatterlist *sg,
 #endif
 
for_each_sg(sg, s, nents, i) {
-   unsigned long addr = page_to_phys(s-page) + s-offset; 
+   unsigned long addr = page_to_phys(sg_page(s)) + s-offset; 
if (nonforced_iommu(dev, addr, s-length)) { 
addr = dma_map_area(dev, addr, s-length, dir);
if (addr == bad_dma_address) { 
@@ -397,7 +397,7 @@ static int gart_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
start_sg = sgmap = sg;
ps = NULL; /* shut up gcc */
for_each_sg(sg, s, nents, i) {
-   dma_addr_t addr = page_to_phys(s-page) + s-offset;
+   dma_addr_t addr = page_to_phys(sg_page(s)) + s-offset;
s-dma_address = addr;
BUG_ON(s-length == 0); 
 
diff --git a/arch/x86/kernel/pci-nommu_64.c b/arch/x86/kernel/pci-nommu_64.c
index e85d436..d64a4a5 100644
--- a/arch/x86/kernel/pci-nommu_64.c
+++ b/arch/x86/kernel/pci-nommu_64.c
@@ -62,8 +62,8 @@ static int nommu_map_sg(struct device *hwdev, struct 
scatterlist *sg,
int i;
 
for_each_sg(sg, s, nents, i) {
-   BUG_ON(!s-page);
-   s-dma_address = virt_to_bus(page_address(s-page) +s-offset);
+   BUG_ON(!sg_page(s));
+   s-dma_address = virt_to_bus(page_address(sg_page(s)) 
+s-offset);
if (!check_addr(map_sg, hwdev, s-dma_address, s-length))
return 0;
s-dma_length = s-length;
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 3935469..d02783c 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1354,10 +1354,11 @@ new_segment:
else
sg = sg_next(sg);
 
-   memset(sg, 0, sizeof(*sg));
-   sg-page = bvec-bv_page;
+   sg_set_page(sg, bvec-bv_page);
sg-length = nbytes;
sg-offset = bvec-bv_offset;
+   sg_dma_len(sg) = 0;
+   sg_dma_address(sg) = 0;
nsegs++;
}
bvprv = bvec;
diff --git a/crypto/digest.c b/crypto/digest.c
index e56de67..8871dec 100644
--- a/crypto/digest.c
+++ b/crypto/digest.c
@@ -41,7 +41,7 @@ static int update2(struct hash_desc *desc,
return 0;
 
for (;;) {
-   struct page *pg = sg-page;
+   struct page *pg = sg_page(sg);
unsigned int offset = sg-offset;
unsigned int l = sg-length;
 
diff --git 

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Ingo Molnar wrote:

* Jens Axboe [EMAIL PROTECTED] wrote:


--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -39,7 +39,7 @@
  * (unless chaining is used). Should ideally fit inside a single page, to
  * avoid a higher order allocation.
  */
-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129


this one finally made the trick and it's booting fine now, without any 
crashes!


Alas, this didn't help me here.  I did manage to capture the error 
messages this time, and in the two machines I'm actively testing on, 
sata_mv is the driver that's dying in both cases.  Machine A 
additionally has sata_nv, which is working.  Machine B additionally has 
ata_piix, which is working.  So in both cases, its sata_mv that is 
throwing SError complaints:



ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x40 action 0x6 frozen
ata7.00: edma_err 0x0480, EDMA self-disable
ata7: SError: { Handshk }
ata7.00: cmd ca/00:08:c7:40:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 out
 res 50/00:00:ce:40:00/00:00:00:00:00/e0 Emask 0x100 (unknown error)
ata7.00: status: { DRDY }
ata7: hard resetting link
ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 300)


Still digging...  this behavior showed up after libata changes went in, 
FWIW.


Jeff


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
 Ingo Molnar wrote:
 * Jens Axboe [EMAIL PROTECTED] wrote:
 --- a/drivers/scsi/scsi_lib.c
 +++ b/drivers/scsi/scsi_lib.c
 @@ -39,7 +39,7 @@
   * (unless chaining is used). Should ideally fit inside a single page, 
 to
   * avoid a higher order allocation.
   */
 -#define SCSI_MAX_SG_SEGMENTS   128
 +#define SCSI_MAX_SG_SEGMENTS   129
 this one finally made the trick and it's booting fine now, without any 
 crashes!

 Alas, this didn't help me here.  I did manage to capture the error messages 
 this time, and in the two machines I'm actively testing on, sata_mv is the 
 driver that's dying in both cases.  Machine A additionally has sata_nv, 
 which is working.  Machine B additionally has ata_piix, which is working.  
 So in both cases, its sata_mv that is throwing SError complaints:

Theory - ata_sg_is_last() isn't returning true for the last entry. Can
you double check that it correcly marks the last entry in mv_fill_sg()?
Alternatively, just try this patch.


diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..b858183 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -1139,6 +1139,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
struct mv_port_priv *pp = qc-ap-private_data;
struct scatterlist *sg;
struct mv_sg *mv_sg;
+   int end_marked = 0;
 
mv_sg = pp-sg_tbl;
ata_for_each_sg(sg, qc) {
@@ -1159,13 +1160,15 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
-   if (!sg_len  ata_sg_is_last(sg, qc))
+   if (!sg_len  ata_sg_is_last(sg, qc)) {
mv_sg-flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
+   end_marked++;
+   }
 
mv_sg++;
}
-
}
+   BUG_ON(end_marked != 1);
 }
 
 static inline void mv_crqb_pack_cmd(__le16 *cmdw, u8 data, u8 addr, unsigned 
last)

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Ingo Molnar wrote:

* Jens Axboe [EMAIL PROTECTED] wrote:

--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -39,7 +39,7 @@
  * (unless chaining is used). Should ideally fit inside a single page, 
to

  * avoid a higher order allocation.
  */
-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129
this one finally made the trick and it's booting fine now, without any 
crashes!
Alas, this didn't help me here.  I did manage to capture the error messages 
this time, and in the two machines I'm actively testing on, sata_mv is the 
driver that's dying in both cases.  Machine A additionally has sata_nv, 
which is working.  Machine B additionally has ata_piix, which is working.  
So in both cases, its sata_mv that is throwing SError complaints:


Theory - ata_sg_is_last() isn't returning true for the last entry. Can
you double check that it correcly marks the last entry in mv_fill_sg()?
Alternatively, just try this patch.


Will check in a few minutes, after my current test:  I noticed that 
sg_tablesize in sata_mv is not LIBATA_MAX_PRD.  This is expected 
behavior, but I wonder if that difference -- most notably being smaller 
than SCSI_MAX_SG_SEGMENTS -- would trigger any latent bugs.


Anyway, we will both have answers momentarily...

Jeff



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jeff Garzik wrote:
I noticed that 
sg_tablesize in sata_mv is not LIBATA_MAX_PRD.  This is expected 
behavior, but I wonder if that difference -- most notably being smaller 
than SCSI_MAX_SG_SEGMENTS -- would trigger any latent bugs.


Another sata_mv difference from the rest: the chip does not support 
ATAPI, so we never care about qc-pad


Jeff


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

index 4df8311..b858183 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -1139,6 +1139,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
struct mv_port_priv *pp = qc-ap-private_data;
struct scatterlist *sg;
struct mv_sg *mv_sg;
+   int end_marked = 0;
 
 	mv_sg = pp-sg_tbl;

ata_for_each_sg(sg, qc) {
@@ -1159,13 +1160,15 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
-			if (!sg_len  ata_sg_is_last(sg, qc))

+   if (!sg_len  ata_sg_is_last(sg, qc)) {
mv_sg-flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
+   end_marked++;
+   }
 
 			mv_sg++;

}
-
}
+   BUG_ON(end_marked != 1);



Your BUG_ON() does indeed trip, here.

Its surprising that other folks don't explode, considering that 
mv_fill_sg() intentionally mirrors the logic in ata_fill_sg().


Jeff


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Benny Halevy wrote:



On 10/18/07, *Jeff Garzik* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote:

Jens Axboe wrote:
  index 4df8311..b858183 100644
  --- a/drivers/ata/sata_mv.c
  +++ b/drivers/ata/sata_mv.c
  @@ -1139,6 +1139,7 @@ static void mv_fill_sg(struct
ata_queued_cmd *qc)
struct mv_port_priv *pp = qc-ap-private_data;
struct scatterlist *sg;
struct mv_sg *mv_sg;
  + int end_marked = 0;
 
mv_sg = pp-sg_tbl;
ata_for_each_sg(sg, qc) {
  @@ -1159,13 +1160,15 @@ static void mv_fill_sg(struct
ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
  - if (!sg_len  ata_sg_is_last(sg, qc))
  + if (!sg_len  ata_sg_is_last(sg, qc)) { 



I'm not sure, but shouldn't that be || rather than ?


sg_len is zero at the end of each scatterlist entry, so we need to test 
the additional ata_sg_is_last() condition to determine if we are really 
at the end of the PRD table.


Jeff



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
 Jens Axboe wrote:
 index 4df8311..b858183 100644
 --- a/drivers/ata/sata_mv.c
 +++ b/drivers/ata/sata_mv.c
 @@ -1139,6 +1139,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
  struct mv_port_priv *pp = qc-ap-private_data;
  struct scatterlist *sg;
  struct mv_sg *mv_sg;
 +int end_marked = 0;
  mv_sg = pp-sg_tbl;
  ata_for_each_sg(sg, qc) {
 @@ -1159,13 +1160,15 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
  sg_len -= len;
  addr += len;
  -   if (!sg_len  ata_sg_is_last(sg, qc))
 +if (!sg_len  ata_sg_is_last(sg, qc)) {
  mv_sg-flags_size |= 
 cpu_to_le32(EPRD_FLAG_END_OF_TBL);
 +end_marked++;
 +}
  mv_sg++;
  }
 -
  }
 +BUG_ON(end_marked != 1);


 Your BUG_ON() does indeed trip, here.

 Its surprising that other folks don't explode, considering that 
 mv_fill_sg() intentionally mirrors the logic in ata_fill_sg().

The sata_mv construct looks a bit odd. Does this work? That last
end_mv_sg test should always be true, just being paranoid...

diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..5397eea 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -1138,8 +1138,9 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc-ap-private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg, *end_mv_sg;
 
+   end_mv_sg = NULL;
mv_sg = pp-sg_tbl;
ata_for_each_sg(sg, qc) {
dma_addr_t addr = sg_dma_address(sg);
@@ -1158,14 +1159,12 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 
sg_len -= len;
addr += len;
-
-   if (!sg_len  ata_sg_is_last(sg, qc))
-   mv_sg-flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
-
+   end_mv_sg = mv_sg;
mv_sg++;
}
-
}
+   if (end_mv_sg)
+   end_mv_sg-flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
 }
 
 static inline void mv_crqb_pack_cmd(__le16 *cmdw, u8 data, u8 addr, unsigned 
last)

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

The sata_mv construct looks a bit odd. Does this work? That last


The sata_mv construct worked just fine before sg chaining :)



end_mv_sg test should always be true, just being paranoid...

diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..5397eea 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -1138,8 +1138,9 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc-ap-private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg, *end_mv_sg;
 
+	end_mv_sg = NULL;

mv_sg = pp-sg_tbl;
ata_for_each_sg(sg, qc) {
dma_addr_t addr = sg_dma_address(sg);
@@ -1158,14 +1159,12 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 
 			sg_len -= len;

addr += len;
-
-   if (!sg_len  ata_sg_is_last(sg, qc))
-   mv_sg-flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
-
+   end_mv_sg = mv_sg;
mv_sg++;
}
-
}
+   if (end_mv_sg)
+   end_mv_sg-flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
 }
 


I'm testing a similar patch based on ata_fill_sg()'s method, which 
basically does something similar to what you've done here (see 
attached).  I had noticed that ata_fill_sg() did not call ata_sg_is_last().


If this fixes the problem, I think the best solution would be to delete 
ata_sg_is_last().  In the few users that exist, we should be able to 
eliminate the test programmatically as you and ata_fill_sg() have done 
-- thereby eliminating a branch per loop in a hotpath.


Off to test the attached...  if that doesn't work I'll try your version, 
though there shouldn't be much difference.


Jeff


diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..42b5a9e 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -421,7 +421,6 @@ static void mv_error_handler(struct ata_port *ap);
 static void mv_post_int_cmd(struct ata_queued_cmd *qc);
 static void mv_eh_freeze(struct ata_port *ap);
 static void mv_eh_thaw(struct ata_port *ap);
-static int mv_slave_config(struct scsi_device *sdev);
 static int mv_init_one(struct pci_dev *pdev, const struct pci_device_id *ent);
 
 static void mv5_phy_errata(struct mv_host_priv *hpriv, void __iomem *mmio,
@@ -459,7 +458,7 @@ static struct scsi_host_template mv5_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -477,7 +476,7 @@ static struct scsi_host_template mv6_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -756,17 +755,6 @@ static void mv_irq_clear(struct ata_port *ap)
 {
 }
 
-static int mv_slave_config(struct scsi_device *sdev)
-{
-   int rc = ata_scsi_slave_config(sdev);
-   if (rc)
-   return rc;
-
-   blk_queue_max_phys_segments(sdev-request_queue, MV_MAX_SG_CT / 2);
-
-   return 0;   /* scsi layer doesn't check return value, sigh */
-}
-
 static void mv_set_edma_ptrs(void __iomem *port_mmio,
 struct mv_host_priv *hpriv,
 struct mv_port_priv *pp)
@@ -1138,34 +1126,35 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc-ap-private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg = pp-sg_tbl;
+   unsigned int idx = 0;
 
-   mv_sg = pp-sg_tbl;
ata_for_each_sg(sg, qc) {
-   dma_addr_t addr = sg_dma_address(sg);
-   u32 sg_len = sg_dma_len(sg);
+   u64 addr;
+   u32 offset, sg_len, len;
+
+   addr = sg_dma_address(sg);
+   sg_len = sg_dma_len(sg);
 
while (sg_len) {
-   u32 offset = addr  0x;
-   u32 len = sg_len;
+   offset = addr  0x;
+   len = sg_len;
 
if ((offset + sg_len  0x1))
len = 0x1 - offset;
 
-   mv_sg-addr = cpu_to_le32(addr  0x);
-   mv_sg-addr_hi = cpu_to_le32((addr  16)  16);
-   mv_sg-flags_size = cpu_to_le32(len  0x);
+   mv_sg[idx].addr = cpu_to_le32(addr  0x);
+ 

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
 Jens Axboe wrote:
 The sata_mv construct looks a bit odd. Does this work? That last

 The sata_mv construct worked just fine before sg chaining :)

Yes I know, but I'm trying to works towards getting rid of sg_last() and
ata_sg_is_last() anyway :-)

 end_mv_sg test should always be true, just being paranoid...
 diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
 index 4df8311..5397eea 100644
 --- a/drivers/ata/sata_mv.c
 +++ b/drivers/ata/sata_mv.c
 @@ -1138,8 +1138,9 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
  {
  struct mv_port_priv *pp = qc-ap-private_data;
  struct scatterlist *sg;
 -struct mv_sg *mv_sg;
 +struct mv_sg *mv_sg, *end_mv_sg;
  +   end_mv_sg = NULL;
  mv_sg = pp-sg_tbl;
  ata_for_each_sg(sg, qc) {
  dma_addr_t addr = sg_dma_address(sg);
 @@ -1158,14 +1159,12 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
  sg_len -= len;
  addr += len;
 -
 -if (!sg_len  ata_sg_is_last(sg, qc))
 -mv_sg-flags_size |= 
 cpu_to_le32(EPRD_FLAG_END_OF_TBL);
 -
 +end_mv_sg = mv_sg;
  mv_sg++;
  }
 -
  }
 +if (end_mv_sg)
 +end_mv_sg-flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
  }
  

 I'm testing a similar patch based on ata_fill_sg()'s method, which 
 basically does something similar to what you've done here (see attached).  
 I had noticed that ata_fill_sg() did not call ata_sg_is_last().

 If this fixes the problem, I think the best solution would be to delete 
 ata_sg_is_last().  In the few users that exist, we should be able to 
 eliminate the test programmatically as you and ata_fill_sg() have done -- 
 thereby eliminating a branch per loop in a hotpath.

 Off to test the attached...  if that doesn't work I'll try your version, 
 though there shouldn't be much difference.

That should work as well. WRT ata_sg_is_last(), if we go ahead with my
recent sg chaining updates, we can keep the test as it would be a single
conditional and not require any looping.

Let me know when you have tested this!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

That should work as well. WRT ata_sg_is_last(), if we go ahead with my
recent sg chaining updates, we can keep the test as it would be a single
conditional and not require any looping.

Let me know when you have tested this!


The patch I attached to the last email got both sata_mv test boxes 
working reliably (so far).


I worked up a patch that kills ata_sg_is_last() (plus the 
max_phys_segments sata_mv fix), see attached.  I'm thinking this is what 
I like to see in upstream.


Of course, this doesn't explain why ata_sg_is_last() was broken, but 
since it's working _and_ slightly more efficient, I don't really care :)


Jeff


diff --git a/drivers/ata/pdc_adma.c b/drivers/ata/pdc_adma.c
index 8d1b03d..199f7e1 100644
--- a/drivers/ata/pdc_adma.c
+++ b/drivers/ata/pdc_adma.c
@@ -318,7 +318,7 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
struct scatterlist *sg;
struct ata_port *ap = qc-ap;
struct adma_port_priv *pp = ap-private_data;
-   u8  *buf = pp-pkt;
+   u8  *buf = pp-pkt, *last_buf = NULL;
int i = (2 + buf[3]) * 8;
u8 pFLAGS = pORD | ((qc-tf.flags  ATA_TFLAG_WRITE) ? pDIRO : 0);
 
@@ -334,8 +334,7 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
*(__le32 *)(buf + i) = cpu_to_le32(len);
i += 4;
 
-   if (ata_sg_is_last(sg, qc))
-   pFLAGS |= pEND;
+   last_buf = buf[i];
buf[i++] = pFLAGS;
buf[i++] = qc-dev-dma_mode  0xf;
buf[i++] = 0;   /* pPKLW */
@@ -348,6 +347,10 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
VPRINTK(PRD[%u] = (0x%lX, 0x%X)\n, i/4,
(unsigned long)addr, len);
}
+
+   if (likely(last_buf))
+   *last_buf |= pEND;
+
return i;
 }
 
diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..7f1b13e 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -421,7 +421,6 @@ static void mv_error_handler(struct ata_port *ap);
 static void mv_post_int_cmd(struct ata_queued_cmd *qc);
 static void mv_eh_freeze(struct ata_port *ap);
 static void mv_eh_thaw(struct ata_port *ap);
-static int mv_slave_config(struct scsi_device *sdev);
 static int mv_init_one(struct pci_dev *pdev, const struct pci_device_id *ent);
 
 static void mv5_phy_errata(struct mv_host_priv *hpriv, void __iomem *mmio,
@@ -459,7 +458,7 @@ static struct scsi_host_template mv5_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -477,7 +476,7 @@ static struct scsi_host_template mv6_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -756,17 +755,6 @@ static void mv_irq_clear(struct ata_port *ap)
 {
 }
 
-static int mv_slave_config(struct scsi_device *sdev)
-{
-   int rc = ata_scsi_slave_config(sdev);
-   if (rc)
-   return rc;
-
-   blk_queue_max_phys_segments(sdev-request_queue, MV_MAX_SG_CT / 2);
-
-   return 0;   /* scsi layer doesn't check return value, sigh */
-}
-
 static void mv_set_edma_ptrs(void __iomem *port_mmio,
 struct mv_host_priv *hpriv,
 struct mv_port_priv *pp)
@@ -1138,7 +1126,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc-ap-private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg, *last_sg = NULL;
 
mv_sg = pp-sg_tbl;
ata_for_each_sg(sg, qc) {
@@ -1159,13 +1147,13 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
-   if (!sg_len  ata_sg_is_last(sg, qc))
-   mv_sg-flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
-
+   last_sg = mv_sg;
mv_sg++;
}
-
}
+
+   if (likely(last_sg))
+   last_sg-flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
 }
 
 static inline void mv_crqb_pack_cmd(__le16 *cmdw, u8 data, u8 addr, unsigned 
last)
diff --git a/drivers/ata/sata_sil24.c b/drivers/ata/sata_sil24.c
index b061927..26ebffc 100644
--- a/drivers/ata/sata_sil24.c
+++ b/drivers/ata/sata_sil24.c
@@ -796,16 +796,19 @@ static inline void sil24_fill_sg(struct ata_queued_cmd 
*qc,
 

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jeff Garzik wrote:
 Jens Axboe wrote:
 That should work as well. WRT ata_sg_is_last(), if we go ahead with my
 recent sg chaining updates, we can keep the test as it would be a single
 conditional and not require any looping.
 Let me know when you have tested this!

 The patch I attached to the last email got both sata_mv test boxes working 
 reliably (so far).

 I worked up a patch that kills ata_sg_is_last() (plus the max_phys_segments 
 sata_mv fix), see attached.  I'm thinking this is what I like to see in 
 upstream.

Great!

 Of course, this doesn't explain why ata_sg_is_last() was broken, but since 
 it's working _and_ slightly more efficient, I don't really care :)

Tomo and I agreed to kill sg_last() a few days ago anyways, so this is
perfectly fine with me.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jens Axboe [EMAIL PROTECTED] wrote:

  Of course, this doesn't explain why ata_sg_is_last() was broken, but 
  since it's working _and_ slightly more efficient, I don't really 
  care :)
 
 Tomo and I agreed to kill sg_last() a few days ago anyways, so this is 
 perfectly fine with me.

it would still be nice to figure out exactly why it was broken.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Ingo Molnar wrote:
 
 * Jens Axboe [EMAIL PROTECTED] wrote:
 
   Of course, this doesn't explain why ata_sg_is_last() was broken, but 
   since it's working _and_ slightly more efficient, I don't really 
   care :)
  
  Tomo and I agreed to kill sg_last() a few days ago anyways, so this is 
  perfectly fine with me.
 
 it would still be nice to figure out exactly why it was broken.

It would always be nice. For this case I don't think it's very
interesting, if we pursue the improved sg iteration setup.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Jens Axboe wrote:
 On Thu, Oct 18 2007, Ingo Molnar wrote:
  
  * Jens Axboe [EMAIL PROTECTED] wrote:
  
Of course, this doesn't explain why ata_sg_is_last() was broken, but 
since it's working _and_ slightly more efficient, I don't really 
care :)
   
   Tomo and I agreed to kill sg_last() a few days ago anyways, so this is 
   perfectly fine with me.
  
  it would still be nice to figure out exactly why it was broken.
 
 It would always be nice. For this case I don't think it's very
 interesting, if we pursue the improved sg iteration setup.

BTW, I think it's pretty clear that ata_sg_is_last() is broken. It's
likely a one-off in the n_iter test.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Jens Axboe wrote:

On Thu, Oct 18 2007, Jeff Garzik wrote:

Jens Axboe wrote:

That should work as well. WRT ata_sg_is_last(), if we go ahead with my
recent sg chaining updates, we can keep the test as it would be a single
conditional and not require any looping.
Let me know when you have tested this!
The patch I attached to the last email got both sata_mv test boxes working 
reliably (so far).


I worked up a patch that kills ata_sg_is_last() (plus the max_phys_segments 
sata_mv fix), see attached.  I'm thinking this is what I like to see in 
upstream.


Great!

Of course, this doesn't explain why ata_sg_is_last() was broken, but since 
it's working _and_ slightly more efficient, I don't really care :)


Tomo and I agreed to kill sg_last() a few days ago anyways, so this is
perfectly fine with me.


Yep, the [attached] patch that kills ata_sg_is_last() is working here on 
both machines that were previously croaking.


It would be nice to get pdc_adma, sata_sil24 and ipr it-works test done, 
but IMO the patch is pretty straightforward and should be OK.


Jeff


diff --git a/drivers/ata/pdc_adma.c b/drivers/ata/pdc_adma.c
index 8d1b03d..199f7e1 100644
--- a/drivers/ata/pdc_adma.c
+++ b/drivers/ata/pdc_adma.c
@@ -318,7 +318,7 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
struct scatterlist *sg;
struct ata_port *ap = qc-ap;
struct adma_port_priv *pp = ap-private_data;
-   u8  *buf = pp-pkt;
+   u8  *buf = pp-pkt, *last_buf = NULL;
int i = (2 + buf[3]) * 8;
u8 pFLAGS = pORD | ((qc-tf.flags  ATA_TFLAG_WRITE) ? pDIRO : 0);
 
@@ -334,8 +334,7 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
*(__le32 *)(buf + i) = cpu_to_le32(len);
i += 4;
 
-   if (ata_sg_is_last(sg, qc))
-   pFLAGS |= pEND;
+   last_buf = buf[i];
buf[i++] = pFLAGS;
buf[i++] = qc-dev-dma_mode  0xf;
buf[i++] = 0;   /* pPKLW */
@@ -348,6 +347,10 @@ static int adma_fill_sg(struct ata_queued_cmd *qc)
VPRINTK(PRD[%u] = (0x%lX, 0x%X)\n, i/4,
(unsigned long)addr, len);
}
+
+   if (likely(last_buf))
+   *last_buf |= pEND;
+
return i;
 }
 
diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 4df8311..7f1b13e 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -421,7 +421,6 @@ static void mv_error_handler(struct ata_port *ap);
 static void mv_post_int_cmd(struct ata_queued_cmd *qc);
 static void mv_eh_freeze(struct ata_port *ap);
 static void mv_eh_thaw(struct ata_port *ap);
-static int mv_slave_config(struct scsi_device *sdev);
 static int mv_init_one(struct pci_dev *pdev, const struct pci_device_id *ent);
 
 static void mv5_phy_errata(struct mv_host_priv *hpriv, void __iomem *mmio,
@@ -459,7 +458,7 @@ static struct scsi_host_template mv5_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -477,7 +476,7 @@ static struct scsi_host_template mv6_sht = {
.use_clustering = 1,
.proc_name  = DRV_NAME,
.dma_boundary   = MV_DMA_BOUNDARY,
-   .slave_configure= mv_slave_config,
+   .slave_configure= ata_scsi_slave_config,
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 };
@@ -756,17 +755,6 @@ static void mv_irq_clear(struct ata_port *ap)
 {
 }
 
-static int mv_slave_config(struct scsi_device *sdev)
-{
-   int rc = ata_scsi_slave_config(sdev);
-   if (rc)
-   return rc;
-
-   blk_queue_max_phys_segments(sdev-request_queue, MV_MAX_SG_CT / 2);
-
-   return 0;   /* scsi layer doesn't check return value, sigh */
-}
-
 static void mv_set_edma_ptrs(void __iomem *port_mmio,
 struct mv_host_priv *hpriv,
 struct mv_port_priv *pp)
@@ -1138,7 +1126,7 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
 {
struct mv_port_priv *pp = qc-ap-private_data;
struct scatterlist *sg;
-   struct mv_sg *mv_sg;
+   struct mv_sg *mv_sg, *last_sg = NULL;
 
mv_sg = pp-sg_tbl;
ata_for_each_sg(sg, qc) {
@@ -1159,13 +1147,13 @@ static void mv_fill_sg(struct ata_queued_cmd *qc)
sg_len -= len;
addr += len;
 
-   if (!sg_len  ata_sg_is_last(sg, qc))
-   mv_sg-flags_size |= 
cpu_to_le32(EPRD_FLAG_END_OF_TBL);
-
+   last_sg = mv_sg;
mv_sg++;
}
-
}
+
+   if 

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jens Axboe [EMAIL PROTECTED] wrote:

  It would always be nice. For this case I don't think it's very 
  interesting, if we pursue the improved sg iteration setup.
 
 BTW, I think it's pretty clear that ata_sg_is_last() is broken. It's 
 likely a one-off in the n_iter test.

fixing that would be a -stable candidate, as a potential data corruptor 
- or is it more benign?

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Ingo Molnar wrote:

* Jens Axboe [EMAIL PROTECTED] wrote:

It would always be nice. For this case I don't think it's very 
interesting, if we pursue the improved sg iteration setup.
BTW, I think it's pretty clear that ata_sg_is_last() is broken. It's 
likely a one-off in the n_iter test.


fixing that would be a -stable candidate, as a potential data corruptor 
- or is it more benign?


It is confirmed working prior to the sg-chaining stuff, so 2.6.23.1 is OK...

Jeff



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jeff Garzik [EMAIL PROTECTED] wrote:

 Tomo and I agreed to kill sg_last() a few days ago anyways, so this 
 is perfectly fine with me.
 
 Yep, the [attached] patch that kills ata_sg_is_last() is working here 
 on both machines that were previously croaking.
 
 It would be nice to get pdc_adma, sata_sil24 and ipr it-works test 
 done, but IMO the patch is pretty straightforward and should be OK.

just a quick question: i have Jens's workarounds applied right now (see 
patch below). Am i now crash/corruption-safe, or do i need your patch 
too? And once your patch [and the other sg_*() patches] are upstream i 
dont need the workaround anymore, correct?

Ingo

---
 block/ll_rw_blk.c |2 +-
 drivers/ata/libata-core.c |2 +-
 drivers/scsi/scsi_lib.c   |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

Index: linux/block/ll_rw_blk.c
===
--- linux.orig/block/ll_rw_blk.c
+++ linux/block/ll_rw_blk.c
@@ -631,7 +631,7 @@ void blk_queue_max_phys_segments(struct 
printk(%s: set to minimum %d\n, __FUNCTION__, max_segments);
}
 
-   q-max_phys_segments = max_segments;
+   q-max_phys_segments = max_segments - 1;
 }
 
 EXPORT_SYMBOL(blk_queue_max_phys_segments);
Index: linux/drivers/ata/libata-core.c
===
--- linux.orig/drivers/ata/libata-core.c
+++ linux/drivers/ata/libata-core.c
@@ -4664,7 +4664,7 @@ static int ata_sg_setup(struct ata_queue
 {
struct ata_port *ap = qc-ap;
struct scatterlist *sg = qc-__sg;
-   struct scatterlist *lsg = sg_last(qc-__sg, qc-n_elem);
+   struct scatterlist *lsg = qc-__sg[qc-n_elem - 1];
int n_elem, pre_n_elem, dir, trim_sg = 0;
 
VPRINTK(ENTER, ata%u\n, ap-print_id);
Index: linux/drivers/scsi/scsi_lib.c
===
--- linux.orig/drivers/scsi/scsi_lib.c
+++ linux/drivers/scsi/scsi_lib.c
@@ -39,7 +39,7 @@
  * (unless chaining is used). Should ideally fit inside a single page, to
  * avoid a higher order allocation.
  */
-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129
 
 struct scsi_host_sg_pool {
size_t  size;
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Ingo Molnar wrote:
 
 * Jens Axboe [EMAIL PROTECTED] wrote:
 
   It would always be nice. For this case I don't think it's very 
   interesting, if we pursue the improved sg iteration setup.
  
  BTW, I think it's pretty clear that ata_sg_is_last() is broken. It's 
  likely a one-off in the n_iter test.
 
 fixing that would be a -stable candidate, as a potential data corruptor 
 - or is it more benign?

I think it's safe to say that it was sg chaining introduced breakage, so
it should work fine in 2.6.23.x.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jeff Garzik

Ingo Molnar wrote:

* Jeff Garzik [EMAIL PROTECTED] wrote:

Tomo and I agreed to kill sg_last() a few days ago anyways, so this 
is perfectly fine with me.
Yep, the [attached] patch that kills ata_sg_is_last() is working here 
on both machines that were previously croaking.


It would be nice to get pdc_adma, sata_sil24 and ipr it-works test 
done, but IMO the patch is pretty straightforward and should be OK.


just a quick question: i have Jens's workarounds applied right now (see 
patch below). Am i now crash/corruption-safe, or do i need your patch 
too? And once your patch [and the other sg_*() patches] are upstream i 
dont need the workaround anymore, correct?


You need my patch if and only if you use one of the drivers touched by 
the patch.  ata_sg_is_last() was a driver helper function, so my fix 
never really touched core code.


I never had to apply the changes you included, to fix problems here.

And looking at those changes...

-   q-max_phys_segments = max_segments;
+   q-max_phys_segments = max_segments - 1;

 ...

-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129


I wonder if libata should be doing

blk_queue_max_phys_segments(q, q-max_phys_segments - 1)

to account for the pad entry that libata owns.

Jeff



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

  -#define SCSI_MAX_SG_SEGMENTS   128
  +#define SCSI_MAX_SG_SEGMENTS   129
 
 this one finally made the trick and it's booting fine now, without any 
 crashes!

hm, spoke too soon - i just got the ata_qc_issue() crash below. I've 
attached further below the current fixes/workarounds that i have applied 
at the moment. Any ideas?

Ingo

-
[  155.259466] kjournald starting.  Commit interval 5 seconds
[  155.265103] EXT3 FS on sda5, internal journal
[  155.269319] EXT3-fs: mounted filesystem with ordered data mode.
[  156.458225] BUG: unable to handle kernel paging request at virtual address 
7d5ac000
[  156.465723] printing eip: 784e9300 *pde = 00ddd027 *pte = 055ac000 
[  156.471964] Oops:  [#1] DEBUG_PAGEALLOC
[  156.476123] 
[  156.477597] Pid: 0, comm: swapper Not tainted (2.6.23 #40)
[  156.483055] EIP: 0060:[784e9300] EFLAGS: 00010006 CPU: 0
[  156.488520] EIP is at ata_qc_issue+0xd0/0x340
[  156.492848] EAX: 3d328000 EBX: 7d5ac000 ECX: 0020 EDX: 0020
[  156.499087] ESI: 7d5ab480 EDI: 7d5abe00 EBP: 7b54007c ESP: 78a13e1c
[  156.505328]  DS: 007b ES: 007b FS:  GS:  SS: 0068
[  156.510700] Process swapper (pid: 0, ti=78a12000 task=789753e0 
task.ti=78a12000)
[  156.517893] Stack: 7d5ac000 7b54 7b54  7d5abff0 7b54007c 
7d5ab480 7b5417a4 
[  156.526213]784c2330 784ef49e 784f1ff3 7b52de98 7d5ab480 7b54 
7b5417a4 7d5ab480 
[  156.534531]7b54 7b524004 784f20e0 784ef180 784c2330 7d5ab480 
0216 7b524004 
[  156.542851] Call Trace:
[  156.545452]  [784c2330] scsi_done+0x0/0x20
[  156.549698]  [784ef49e] ata_scsi_translate+0xbe/0x140
[  156.554897]  [784f1ff3] ata_scsi_queuecmd+0x33/0x200
[  156.560010]  [784f20e0] ata_scsi_queuecmd+0x120/0x200
[  156.565210]  [784ef180] ata_scsi_rw_xlat+0x0/0x220
[  156.570150]  [784c2330] scsi_done+0x0/0x20
[  156.574395]  [784c2bb2] scsi_dispatch_cmd+0x152/0x290
[  156.579596]  [78135aa7] trace_hardirqs_on+0x67/0xb0
[  156.584622]  [784c8abe] scsi_request_fn+0x1be/0x370
[  156.589649]  [78407ef6] blk_run_queue+0x36/0x80
[  156.594328]  [784c73c0] scsi_next_command+0x30/0x50
[  156.599354]  [784c754b] scsi_end_request+0xab/0xe0
[  156.604294]  [784c8239] scsi_io_completion+0xa9/0x3d0
[  156.609493]  [78135aa7] trace_hardirqs_on+0x67/0xb0
[  156.614520]  [78404f85] blk_done_softirq+0x45/0x80
[  156.619460]  [78404fb3] blk_done_softirq+0x73/0x80
[  156.624400]  [7811d2f3] __do_softirq+0x53/0xb0
[  156.628992]  [7811d3b8] do_softirq+0x68/0x70
[  156.633412]  [78105351] do_IRQ+0x51/0x90
[  156.637486]  [7810290f] restore_nocheck+0x12/0x15
[  156.642339]  [7810388e] common_interrupt+0x2e/0x40
[  156.647277]  [7810f4c0] pgd_dtor+0x0/0x50
[  156.651437]  [7815f1d0] quicklist_trim+0x0/0x90
[  156.656117]  [7810f4bb] check_pgt_cache+0x1b/0x20
[  156.660970]  [78100c52] cpu_idle+0x32/0x60
[  156.665217]  [78a14b35] start_kernel+0x265/0x300
[  156.669983]  [78a14380] unknown_bootoption+0x0/0x1e0
[  156.675096]  ===
[  156.678649] Code: 84 d9 01 00 00 7e 32 31 d2 89 f6 8b 1c 24 83 c2 01 8b 03 
2b 05 18 ed d7 78 c1 f8 05 c1 e0 0c 03 43 04 89 43 08 83 c3 10 89 1c 24 8b 03 
a8 01 0f 85 58 02 00 00 39 ca 75 d2 f0 83 44 24 00 00 85 
[  156.697455] EIP: [784e9300] ata_qc_issue+0xd0/0x340 SS:ESP 0068:78a13e1c
[  156.704822] Kernel panic - not syncing: Fatal exception in interrupt

---
 block/ll_rw_blk.c |2 +-
 drivers/ata/libata-core.c |2 +-
 drivers/scsi/scsi_lib.c   |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

Index: linux/block/ll_rw_blk.c
===
--- linux.orig/block/ll_rw_blk.c
+++ linux/block/ll_rw_blk.c
@@ -631,7 +631,7 @@ void blk_queue_max_phys_segments(struct 
printk(%s: set to minimum %d\n, __FUNCTION__, max_segments);
}
 
-   q-max_phys_segments = max_segments;
+   q-max_phys_segments = max_segments - 1;
 }
 
 EXPORT_SYMBOL(blk_queue_max_phys_segments);
Index: linux/drivers/ata/libata-core.c
===
--- linux.orig/drivers/ata/libata-core.c
+++ linux/drivers/ata/libata-core.c
@@ -4664,7 +4664,7 @@ static int ata_sg_setup(struct ata_queue
 {
struct ata_port *ap = qc-ap;
struct scatterlist *sg = qc-__sg;
-   struct scatterlist *lsg = sg_last(qc-__sg, qc-n_elem);
+   struct scatterlist *lsg = qc-__sg[qc-n_elem - 1];
int n_elem, pre_n_elem, dir, trim_sg = 0;
 
VPRINTK(ENTER, ata%u\n, ap-print_id);
Index: linux/drivers/scsi/scsi_lib.c
===
--- linux.orig/drivers/scsi/scsi_lib.c
+++ linux/drivers/scsi/scsi_lib.c
@@ -39,7 +39,7 @@
  * (unless chaining is used). Should ideally fit inside a single page, to
  * avoid a higher order allocation.
  */
-#define SCSI_MAX_SG_SEGMENTS   128
+#define SCSI_MAX_SG_SEGMENTS   129
 
 struct scsi_host_sg_pool {

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, Ingo Molnar wrote:
 
 * Ingo Molnar [EMAIL PROTECTED] wrote:
 
   -#define SCSI_MAX_SG_SEGMENTS 128
   +#define SCSI_MAX_SG_SEGMENTS 129
  
  this one finally made the trick and it's booting fine now, without any 
  crashes!
 
 hm, spoke too soon - i just got the ata_qc_issue() crash below. I've 
 attached further below the current fixes/workarounds that i have applied 
 at the moment. Any ideas?

Hang on Ingo, will post an updated patch soonish!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Ingo Molnar

* Jeff Garzik [EMAIL PROTECTED] wrote:

 I never had to apply the changes you included, to fix problems here.

perhaps because you are not running a CONFIG_DEBUG_PAGEALLOC=y kernel?

I recently fixed DEBUG_PAGEALLOC (it would crash upon bootup on x86 most 
of the time on any real hardware - so i doubt people were able to use it 
all that much). As long as you try Linus' latest -git tree (which has 
the latest arch/x86 merge) and use the 32-bit x86 kernel it should work 
fine for you too, and you will probably be able to trigger similar 
crashes too.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread David Miller
From: Jens Axboe [EMAIL PROTECTED]
Date: Thu, 18 Oct 2007 10:21:45 +0200

 I like it. Basically the only real change is using bit 2 as a
 termination point, so we avoid going beyond the end of the sgtable.
 Here's a starting point, it actually booted for me in the first go
 (boggle). Only x86 so far, archs will need to be converted. And lots
 more drivers I'm sure, I only fixed up the ones that botched my compile.
 
 So just consider this a directional patch.

Here are some sparc64 bits:

diff --git a/arch/sparc64/kernel/iommu.c b/arch/sparc64/kernel/iommu.c
index 29af777..73852a2 100644
--- a/arch/sparc64/kernel/iommu.c
+++ b/arch/sparc64/kernel/iommu.c
@@ -473,7 +473,7 @@ static void dma_4u_unmap_single(struct device *dev, 
dma_addr_t bus_addr,
 }
 
 #define SG_ENT_PHYS_ADDRESS(SG)\
-   (__pa(page_address((SG)-page)) + (SG)-offset)
+   (__pa(page_address(sg_page(SG))) + (SG)-offset)
 
 static void fill_sg(iopte_t *iopte, struct scatterlist *sg,
int nused, int nelems,
@@ -566,7 +566,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
if (nelems == 1) {
sglist-dma_address =
dma_4u_map_single(dev,
- (page_address(sglist-page) +
+ (page_address(sg_page(sglist)) +
   sglist-offset),
  sglist-length, direction);
if (unlikely(sglist-dma_address == DMA_ERROR_CODE))
diff --git a/arch/sparc64/kernel/iommu_common.c 
b/arch/sparc64/kernel/iommu_common.c
index d7ca900..ec863e0 100644
--- a/arch/sparc64/kernel/iommu_common.c
+++ b/arch/sparc64/kernel/iommu_common.c
@@ -73,7 +73,7 @@ static int verify_one_map(struct scatterlist *dma_sg, struct 
scatterlist **__sg,
 
daddr = dma_sg-dma_address;
sglen = sg-length;
-   sgaddr = (unsigned long) (page_address(sg-page) + sg-offset);
+   sgaddr = (unsigned long) (page_address(sg_page(sg)) + sg-offset);
while (dlen  0) {
unsigned long paddr;
 
@@ -123,7 +123,7 @@ static int verify_one_map(struct scatterlist *dma_sg, 
struct scatterlist **__sg,
sg = sg_next(sg);
if (--nents = 0)
break;
-   sgaddr = (unsigned long) (page_address(sg-page) + sg-offset);
+   sgaddr = (unsigned long) (page_address(sg_page(sg)) + 
sg-offset);
sglen = sg-length;
}
if (dlen  0) {
@@ -191,7 +191,7 @@ void verify_sglist(struct scatterlist *sglist, int nents, 
iopte_t *iopte, int np
printk(sg(%d): page_addr(%p) off(%x) length(%x) 
   dma_address[%016x] dma_length[%016x]\n,
   i,
-  page_address(sg-page), sg-offset,
+  page_address(sg_page(sg)), sg-offset,
   sg-length,
   sg-dma_address, sg-dma_length);
}
@@ -207,15 +207,15 @@ unsigned long prepare_sg(struct scatterlist *sg, int 
nents)
unsigned long prev;
u32 dent_addr, dent_len;
 
-   prev  = (unsigned long) (page_address(sg-page) + sg-offset);
+   prev  = (unsigned long) (page_address(sg_page(sg)) + sg-offset);
prev += (unsigned long) (dent_len = sg-length);
-   dent_addr = (u32) ((unsigned long)(page_address(sg-page) + sg-offset)
+   dent_addr = (u32) ((unsigned long)(page_address(sg_page(sg)) + 
sg-offset)
(IO_PAGE_SIZE - 1UL));
while (--nents) {
unsigned long addr;
 
sg = sg_next(sg);
-   addr = (unsigned long) (page_address(sg-page) + sg-offset);
+   addr = (unsigned long) (page_address(sg_page(sg)) + sg-offset);
if (! VCONTIG(prev, addr)) {
dma_sg-dma_address = dent_addr;
dma_sg-dma_length = dent_len;
diff --git a/arch/sparc64/kernel/pci_sun4v.c b/arch/sparc64/kernel/pci_sun4v.c
index fe46ace..5324a34 100644
--- a/arch/sparc64/kernel/pci_sun4v.c
+++ b/arch/sparc64/kernel/pci_sun4v.c
@@ -366,7 +366,7 @@ static void dma_4v_unmap_single(struct device *dev, 
dma_addr_t bus_addr,
 }
 
 #define SG_ENT_PHYS_ADDRESS(SG)\
-   (__pa(page_address((SG)-page)) + (SG)-offset)
+   (__pa(page_address(sg_page(SG))) + (SG)-offset)
 
 static long fill_sg(long entry, struct device *dev,
struct scatterlist *sg,
@@ -478,7 +478,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
if (nelems == 1) {
sglist-dma_address =
dma_4v_map_single(dev,
- (page_address(sglist-page) +
+ (page_address(sg_page(sglist)) +
   

Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread David Miller
From: Jens Axboe [EMAIL PROTECTED]
Date: Thu, 18 Oct 2007 13:57:02 +0200

 Thanks a lot, Dave! The patch is a monster right now, I'll work on
 splitting it into a 3-step process. Any arch help is greatly
 appreciated.

I have some other bits that my compile hit, such as some things in the
crypto layer.

But I hesitate to send them to you because I think the on-stack cases
need some helpers such that DEBUG_SG works for them.

BTW, you missed a case in drivers/usb/core/message.c because of
the config used in your build.  This thing below is a good
argument for trying to avoid HIGHMEM et al. ifdefs in drivers :-)

--- drivers/usb/core/message.c~ 2007-10-18 01:46:44.0 -0700
+++ drivers/usb/core/message.c  2007-10-18 03:15:20.0 -0700
@@ -438,7 +438,7 @@
io-urbs[i]-transfer_buffer = NULL;
 #else
io-urbs[i]-transfer_buffer =
-   page_address(sg[i].page) + sg[i].offset;
+   page_address(sg_page(sg[i])) + sg[i].offset;
 #endif
} else {
/* hc may use _only_ transfer_buffer */
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, David Miller wrote:
 From: Jens Axboe [EMAIL PROTECTED]
 Date: Thu, 18 Oct 2007 13:57:02 +0200
 
  Thanks a lot, Dave! The patch is a monster right now, I'll work on
  splitting it into a 3-step process. Any arch help is greatly
  appreciated.
 
 I have some other bits that my compile hit, such as some things in the
 crypto layer.

Yeah, I have tons of that so far. I hope to have an allyesconfig
compiling pretty soonish, will send that out then.

 But I hesitate to send them to you because I think the on-stack cases
 need some helpers such that DEBUG_SG works for them.

Indeed. I convert where appropriate, but I'm not anal about it. If they
don't use sg_next() and/or for_each_sg() on their list, it should be ok.
Don't want to make the changes more than necessary right now.

 BTW, you missed a case in drivers/usb/core/message.c because of
 the config used in your build.  This thing below is a good
 argument for trying to avoid HIGHMEM et al. ifdefs in drivers :-)

Thanks :-)

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, David Miller wrote:
 From: Jens Axboe [EMAIL PROTECTED]
 Date: Thu, 18 Oct 2007 10:21:45 +0200
 
  I like it. Basically the only real change is using bit 2 as a
  termination point, so we avoid going beyond the end of the sgtable.
  Here's a starting point, it actually booted for me in the first go
  (boggle). Only x86 so far, archs will need to be converted. And lots
  more drivers I'm sure, I only fixed up the ones that botched my compile.
  
  So just consider this a directional patch.
 
 Here are some sparc64 bits:

Thanks a lot, Dave! The patch is a monster right now, I'll work on
splitting it into a 3-step process. Any arch help is greatly
appreciated.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread David Miller
From: Jens Axboe [EMAIL PROTECTED]
Date: Thu, 18 Oct 2007 14:15:47 +0200

 On Thu, Oct 18 2007, Jens Axboe wrote:
  On Thu, Oct 18 2007, David Miller wrote:
   From: Jens Axboe [EMAIL PROTECTED]
   Date: Thu, 18 Oct 2007 13:57:02 +0200
   
Thanks a lot, Dave! The patch is a monster right now, I'll work on
splitting it into a 3-step process. Any arch help is greatly
appreciated.
   
   I have some other bits that my compile hit, such as some things in the
   crypto layer.
  
  Yeah, I have tons of that so far. I hope to have an allyesconfig
  compiling pretty soonish, will send that out then.
 
 OK here goes, this compiles with allyesconfig on x86-64. Not too bad,
 the scsi/ drivers were by far the worst.

It build cleanly here on sparc64 too.

It's late and I'm about to hit bed so I'm too chicken to do a test
boot it right now :-)

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [bug] ata subsystem related crash with latest -git

2007-10-18 Thread Jens Axboe
On Thu, Oct 18 2007, David Miller wrote:
 From: Jens Axboe [EMAIL PROTECTED]
 Date: Thu, 18 Oct 2007 14:15:47 +0200
 
  On Thu, Oct 18 2007, Jens Axboe wrote:
   On Thu, Oct 18 2007, David Miller wrote:
From: Jens Axboe [EMAIL PROTECTED]
Date: Thu, 18 Oct 2007 13:57:02 +0200

 Thanks a lot, Dave! The patch is a monster right now, I'll work on
 splitting it into a 3-step process. Any arch help is greatly
 appreciated.

I have some other bits that my compile hit, such as some things in the
crypto layer.
   
   Yeah, I have tons of that so far. I hope to have an allyesconfig
   compiling pretty soonish, will send that out then.
  
  OK here goes, this compiles with allyesconfig on x86-64. Not too bad,
  the scsi/ drivers were by far the worst.
 
 It build cleanly here on sparc64 too.

Super

 It's late and I'm about to hit bed so I'm too chicken to do a test
 boot it right now :-)

Don't boot it Dave, odds are that something will break and you'll then
be stuck debugging that since you can't relax and sleep until it's
working :-)

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   >