Re: 2.6.23-rc4-mm1 myri10ge module link error on x86_64

2007-10-13 Thread Avuton Olrich
, 6 Sep 2007 15:37:51 -0400 > >> > >>> I got a link error on myri10ge when building 2.6.23-rc4-mm1 on x86_64 : > >>> > >>> ERROR: "lro_flush_all" [drivers/net/myri10ge/myri10ge.ko] undefined! > >>> ERROR: "lro_receive_frags" [drivers/

Re: 2.6.23-rc4-mm1 myri10ge module link error on x86_64

2007-10-13 Thread Avuton Olrich
On 9/7/07, Jeff Garzik [EMAIL PROTECTED] wrote: David Miller wrote: From: David Miller [EMAIL PROTECTED] Date: Thu, 06 Sep 2007 13:40:38 -0700 (PDT) From: Mathieu Desnoyers [EMAIL PROTECTED] Date: Thu, 6 Sep 2007 15:37:51 -0400 I got a link error on myri10ge when building 2.6.23-rc4

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Torsten Kaiser
On 10/11/07, Tejun Heo <[EMAIL PROTECTED]> wrote: > Torsten Kaiser wrote: > >>> That missing +1 would explain why the SGE_TRM never gets set. > >> Thanks a lot for tracking this down. Does changing the above code fix > >> your problem? > > > > I did not try it. > > I'm not a libata expert and

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Jens Axboe
On Thu, Oct 11 2007, Tejun Heo wrote: > Jens Axboe wrote: > > This is the old ata_sg_is_last: > > > > static inline int > > ata_sg_is_last(struct scatterlist *sg, struct ata_queued_cmd *qc) > > { > > if (sg == &qc->pad_sgent) > > return 1; > > if (qc->pad_len) > >

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Tejun Heo
Jens Axboe wrote: > This is the old ata_sg_is_last: > > static inline int > ata_sg_is_last(struct scatterlist *sg, struct ata_queued_cmd *qc) > { > if (sg == &qc->pad_sgent) > return 1; > if (qc->pad_len) > return 0; > if (((sg - qc->__sg) + 1)
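A minimal sketch, assuming simplified stand-in types, of why a missing `+ 1` in a last-element check like the one quoted above could leave a terminator flag such as SGE_TRM unset. `struct sg_entry`, `struct cmd` and every function name below are hypothetical and are not libata's; the sketch only illustrates the off-by-one, not the actual driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's structures. */
struct sg_entry {
	unsigned int len;
};

struct cmd {
	struct sg_entry *sg;	/* first element of the table */
	unsigned int n_elem;	/* number of elements in the table */
};

/* Intended check: zero-based index of sg, plus one, equals the count. */
static int sg_is_last(const struct sg_entry *sg, const struct cmd *c)
{
	return (size_t)(sg - c->sg) + 1 == c->n_elem;
}

/* Off-by-one variant: without the "+ 1" no valid index can match. */
static int sg_is_last_buggy(const struct sg_entry *sg, const struct cmd *c)
{
	return (size_t)(sg - c->sg) == c->n_elem;
}

/* Count how many elements a walk over the table flags as the last one. */
static unsigned int count_terminators(const struct cmd *c,
		int (*is_last)(const struct sg_entry *, const struct cmd *))
{
	unsigned int i, hits = 0;

	for (i = 0; i < c->n_elem; i++)
		if (is_last(&c->sg[i], c))
			hits++;
	return hits;
}

/* Self-check: the intended test flags exactly one element, the buggy one none. */
static void self_check(void)
{
	struct sg_entry table[3] = { {512}, {512}, {256} };
	struct cmd c = { table, 3 };

	assert(count_terminators(&c, sg_is_last) == 1);
	assert(count_terminators(&c, sg_is_last_buggy) == 0);
}
```

With the buggy variant a walk over the table never reports a last element, so a driver relying on such a check would never mark the final entry.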

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Jens Axboe
On Thu, Oct 11 2007, Tejun Heo wrote: > Torsten Kaiser wrote: > > Looking closer at > > http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=ec6fdded4d76aa54aa57341e5dfdd61c507b1dcd > > the change to libata.h seems bogus : > > > > in ata_qc_first_sg: > > old

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Tejun Heo
Torsten Kaiser wrote: >>> That missing +1 would explain why the SGE_TRM never gets set. >> Thanks a lot for tracking this down. Does changing the above code fix >> your problem? > > I did not try it. > I'm not a libata expert and while this change looks suspicious, I > can't be 100% sure if

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Tejun Heo
Torsten Kaiser wrote: That missing +1 would explain why the SGE_TRM never gets set. Thanks a lot for tracking this down. Does changing the above code fix your problem? I did not try it. I'm not a libata expert and while this change looks suspicious, I can't be 100% sure if that change

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Jens Axboe
On Thu, Oct 11 2007, Tejun Heo wrote: Torsten Kaiser wrote: Looking closer at http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=ec6fdded4d76aa54aa57341e5dfdd61c507b1dcd the change to libata.h seems bogus : in ata_qc_first_sg: old

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Tejun Heo
Jens Axboe wrote: This is the old ata_sg_is_last: static inline int ata_sg_is_last(struct scatterlist *sg, struct ata_queued_cmd *qc) { if (sg == &qc->pad_sgent) return 1; if (qc->pad_len) return 0; if (((sg - qc->__sg) + 1) ==

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Jens Axboe
On Thu, Oct 11 2007, Tejun Heo wrote: Jens Axboe wrote: This is the old ata_sg_is_last: static inline int ata_sg_is_last(struct scatterlist *sg, struct ata_queued_cmd *qc) { if (sg == &qc->pad_sgent) return 1; if (qc->pad_len) return

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-11 Thread Torsten Kaiser
On 10/11/07, Tejun Heo [EMAIL PROTECTED] wrote: Torsten Kaiser wrote: That missing +1 would explain why the SGE_TRM never gets set. Thanks a lot for tracking this down. Does changing the above code fix your problem? I did not try it. I'm not a libata expert and while this change

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-10 Thread Torsten Kaiser
On 10/11/07, Tejun Heo <[EMAIL PROTECTED]> wrote: > Torsten Kaiser wrote: > > Looking closer at > > http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=ec6fdded4d76aa54aa57341e5dfdd61c507b1dcd > > the change to libata.h seems bogus : > > > > in ata_qc_first_sg: > >

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-10 Thread Tejun Heo
Torsten Kaiser wrote: > Looking closer at > http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=ec6fdded4d76aa54aa57341e5dfdd61c507b1dcd > the change to libata.h seems bogus : > > in ata_qc_first_sg: > old new > return qc->__sg

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-10 Thread Tejun Heo
Torsten Kaiser wrote: Looking closer at http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=ec6fdded4d76aa54aa57341e5dfdd61c507b1dcd the change to libata.h seems bogus : in ata_qc_first_sg: old new return qc->__sg

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-10 Thread Torsten Kaiser
On 10/11/07, Tejun Heo [EMAIL PROTECTED] wrote: Torsten Kaiser wrote: Looking closer at http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=ec6fdded4d76aa54aa57341e5dfdd61c507b1dcd the change to libata.h seems bogus : in ata_qc_first_sg: old

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-07 Thread Torsten Kaiser
[Adding Jens Axboe, the author of what looks like the probable cause] On 10/7/07, Torsten Kaiser <[EMAIL PROTECTED]> wrote: > My sil24_fill_sg now looks like this: > static inline void sil24_fill_sg(struct ata_queued_cmd *qc, > struct sil24_sge *sge) > { >

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-07 Thread Torsten Kaiser
On 10/5/07, Torsten Kaiser <[EMAIL PROTECTED]> wrote: > So I will use the weekend to see if I can find out who issues this > command and add more debug to that place... I added some DPRINTK to sil24_qc_issue and sil24_fill_sg, but I only found one suspicious thing. My sil24_fill_sg now looks

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-07 Thread Torsten Kaiser
On 10/5/07, Torsten Kaiser [EMAIL PROTECTED] wrote: So I will use the weekend to see if I can find out who issues this command and add more debug to that place... I added some DPRINTK to sil24_qc_issue and sil24_fill_sg, but I only found one suspicious thing. My sil24_fill_sg now looks like

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-07 Thread Torsten Kaiser
[Adding Jens Axboe, the author of what looks like the probable cause] On 10/7/07, Torsten Kaiser [EMAIL PROTECTED] wrote: My sil24_fill_sg now looks like this: static inline void sil24_fill_sg(struct ata_queued_cmd *qc, struct sil24_sge *sge) { struct

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-05 Thread Torsten Kaiser
On 10/4/07, Matt Mackall <[EMAIL PROTECTED]> wrote: > On Thu, Oct 04, 2007 at 07:32:52AM +0200, Torsten Kaiser wrote: > > So now I'm rather out of ideas what to test... :( > > I'd give your previous bisect step another try. Yes, I thought about that too. But I never seemed to need more than two

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-05 Thread Torsten Kaiser
On 10/4/07, Matt Mackall [EMAIL PROTECTED] wrote: On Thu, Oct 04, 2007 at 07:32:52AM +0200, Torsten Kaiser wrote: So now I'm rather out of ideas what to test... :( I'd give your previous bisect step another try. Yes, I thought about that too. But I never seemed to need more than two tries to

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-04 Thread Matt Mackall
On Thu, Oct 04, 2007 at 07:32:52AM +0200, Torsten Kaiser wrote: > On 10/3/07, Matt Mackall <[EMAIL PROTECTED]> wrote: > > Well I can see no reason why the vma we just got to by the mm->mmap > > would have a vm_mm != mm, but I've certainly been wrong before. > > > > Try changing it to: > > > >

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-04 Thread Matt Mackall
On Thu, Oct 04, 2007 at 07:32:52AM +0200, Torsten Kaiser wrote: On 10/3/07, Matt Mackall [EMAIL PROTECTED] wrote: Well I can see no reason why the vma we just got to by the mm->mmap would have a vm_mm != mm, but I've certainly been wrong before. Try changing it to: for (vma =

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
On 10/3/07, Matt Mackall <[EMAIL PROTECTED]> wrote: > Well I can see no reason why the vma we just got to by the mm->mmap > would have a vm_mm != mm, but I've certainly been wrong before. > > Try changing it to: > > for (vma = mm->mmap; vma; vma = vma->vm_next) > if

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
On 10/3/07, Matt Mackall <[EMAIL PROTECTED]> wrote: > On Wed, Oct 03, 2007 at 07:36:55PM +0200, Torsten Kaiser wrote: > > Of note might be that at the time of this error init has not been > > started. I'm using a program from initramfs to start the RAID. > > The initramfs was primarily built

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Matt Mackall
On Wed, Oct 03, 2007 at 07:36:55PM +0200, Torsten Kaiser wrote: > On 10/3/07, Matt Mackall <[EMAIL PROTECTED]> wrote: > > On Wed, Oct 03, 2007 at 05:55:10PM +0200, Torsten Kaiser wrote: > > > This patch removes clear_refs_smap() from fs/proc/task_mmu.c by moving > > > its code to a new function.

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
On 10/3/07, Matt Mackall <[EMAIL PROTECTED]> wrote: > On Wed, Oct 03, 2007 at 05:55:10PM +0200, Torsten Kaiser wrote: > > This patch removes clear_refs_smap() from fs/proc/task_mmu.c by moving > > its code to a new function. But during the move the main for-loop from > > clear_refs_smap was

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Matt Mackall
On Wed, Oct 03, 2007 at 05:55:10PM +0200, Torsten Kaiser wrote: > [CC added to author of the bad patch] > > Short recap: > Since 2.6.23-rc4-mm1 all -mm kernels randomly fail one of two drives on > my Silicon Image 3132. This failure happens when my initramfs wants to

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
[CC added to author of the bad patch] Short recap: Since 2.6.23-rc4-mm1 all -mm kernels randomly fail one of two drives on my Silicon Image 3132. This failure happens when my initramfs wants to start the RAID that is on these drives. The first error libata throws is: Oct 3 16:56:46 treogen

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
ws. The interpretation XFER_PIO_0 = 0x08 > seems wrong... Even all of the last bisect steps still showed the same line... > If I look at what patches remain, it seems that some other (earlier) > patch that is new in 2.6.23-rc4-mm1 is the trigger, but it will only > fail together with a se

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
still showed the same line... If I look at what patches remain, it seems that some other (earlier) patch that is new in 2.6.23-rc4-mm1 is the trigger, but it will only fail together with a second patch. I'm now finished with bisecting, still 2 patches, but I don't want to spend another two

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Matt Mackall
On Wed, Oct 03, 2007 at 05:55:10PM +0200, Torsten Kaiser wrote: [CC added to author of the bad patch] Short recap: Since 2.6.23-rc4-mm1 all -mm kernels randomly fail one of two drives on my Silicon Image 3132. This failure happens when my initramfs wants to start the RAID

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
On 10/3/07, Matt Mackall [EMAIL PROTECTED] wrote: On Wed, Oct 03, 2007 at 05:55:10PM +0200, Torsten Kaiser wrote: This patch removes clear_refs_smap() from fs/proc/task_mmu.c by moving its code to a new function. But during the move the main for-loop from clear_refs_smap was changed:

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Matt Mackall
On Wed, Oct 03, 2007 at 07:36:55PM +0200, Torsten Kaiser wrote: On 10/3/07, Matt Mackall [EMAIL PROTECTED] wrote: On Wed, Oct 03, 2007 at 05:55:10PM +0200, Torsten Kaiser wrote: This patch removes clear_refs_smap() from fs/proc/task_mmu.c by moving its code to a new function. But during

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
On 10/3/07, Matt Mackall [EMAIL PROTECTED] wrote: On Wed, Oct 03, 2007 at 07:36:55PM +0200, Torsten Kaiser wrote: Of note might be that at the time of this error init has not been started. I'm using a program from initramfs to start the RAID. The initramfs was primarily built using the

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-03 Thread Torsten Kaiser
On 10/3/07, Matt Mackall [EMAIL PROTECTED] wrote: Well I can see no reason why the vma we just got to by the mm->mmap would have a vm_mm != mm, but I've certainly been wrong before. Try changing it to: for (vma = mm->mmap; vma; vma = vma->vm_next) if

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-01 Thread Torsten Kaiser
tch was safe, but one time it also showed the error. If I look at what patches remain, it seems that some other (earlier) patch that is new in 2.6.23-rc4-mm1 is the trigger, but it will only fail together with a second patch. > > > [the 2.6.23-rc4-mm1 series-file has 2013 lines] > > >

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-10-01 Thread Torsten Kaiser
no conclusive proof that it really is the cause. As noted in this thread, a long time I thought that rc7 with the sg-chaining-patch was safe, but one time it also showed the error. If I look at what patches remain, it seems that some other (earlier) patch that is new in 2.6.23-rc4-mm1 is the trigger

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Torsten Kaiser
I thought that rc7 with the sg-chaining-patch was safe, but one time it also showed the error. > > It's not just 2.6.23-rc4-mm1. All -mm's after rc4 are broken for me. > > Confirmed breakage on -rc4-mm1, -rc6-mm1 and -rc8-mm1. I'm just > > narrowing on rc4-mm1 because that was the first

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Tejun Heo
like something is going wrong with request DMA or sg mapping. Maybe some change in block/*.[hc]? > It's not just 2.6.23-rc4-mm1. All -mm's after rc4 are broken for me. > Confirmed breakage on -rc4-mm1, -rc6-mm1 and -rc8-mm1. I'm just > narrowing on rc4-mm1 because that was the first versio

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Torsten Kaiser
4-mm makes it visible that the initialization of the SiI3132 > > is incomplete? > > I can't tell but there is a pretty large userbase of sil24/32 and you > seem to be the only one to report this problem yet. I think it might be > coming somewhere else than libata or sata_sil24

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-30 Thread Andi Kleen
On Sunday 30 September 2007 16:06:59 Thomas Gleixner wrote: > On Sun, 30 Sep 2007, Andi Kleen wrote: > > >>> OK, this explains 2) and 3). I just looked into the code and the logic > >>> vs. noapictimer on SMP is completely broken. > > > > noapictimer really doesn't make any sense on non SMP imho

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-30 Thread Thomas Gleixner
On Sun, 30 Sep 2007, Andi Kleen wrote: OK, this explains 2) and 3). I just looked into the code and the logic vs. noapictimer on SMP is completely broken. noapictimer really doesn't make any sense on non SMP imho with the old timer architecture. That is why I never bothered to implement it.

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Tejun Heo
Hello, Torsten. Torsten Kaiser wrote: > On 9/28/07, Torsten Kaiser <[EMAIL PROTECTED]> wrote: >> So in case of -rc3-mm1 I'm pretty sure that it works. > > That's still the case. Ah... that's weird. It would be much better if -rc3-mm1 is broken too. :-P >> Not completely sure is if

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-30 Thread Andi Kleen
> > PIT keeps jiffies (and the system) running, but the local APIC timer > interrupts can get out of sync due to this C1E effect. The way C1E works on AMD is that even when one core is woken up by the PIT the APIC timer resumes on the other core on the socket too because the deep power saving

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-30 Thread Andi Kleen
> > > OK, this explains 2) and 3). I just looked into the code and the logic > > vs. noapictimer on SMP is completely broken. noapictimer really doesn't make any sense on non SMP imho with the old timer architecture. That is why I never bothered to implement it. It's purely a UP hack. > ..and

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Torsten Kaiser
ices That "Operation continuing on 1 devices" is a 'little' bit misleading. A RAID5 with two failed devices will not continue to operate. :( The same error is repeated several times, so I expect the first error also looked like that. Other things I have done to narrow it down: Comp

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Torsten Kaiser
is a 'little' bit misleading. A RAID5 with two failed devices will not continue to operate. :( The same error is repeated several times, so I expect the first error also looked like that. Other things I have done to narrow it down: Comparing 2.6.23-rc3-mm1 and 2.6.23-rc4-mm1 I found the following hunk

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-30 Thread Andi Kleen
OK, this explains 2) and 3). I just looked into the code and the logic vs. noapictimer on SMP is completely broken. noapictimer really doesn't make any sense on non SMP imho with the old timer architecture. That is why I never bothered to implement it. It's purely a UP hack. ..and

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-30 Thread Andi Kleen
PIT keeps jiffies (and the system) running, but the local APIC timer interrupts can get out of sync due to this C1E effect. The way C1E works on AMD is that even when one core is woken up by the PIT the APIC timer resumes on the other core on the socket too because the deep power saving

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Tejun Heo
Hello, Torsten. Torsten Kaiser wrote: On 9/28/07, Torsten Kaiser [EMAIL PROTECTED] wrote: So in case of -rc3-mm1 I'm pretty sure that it works. That's still the case. Ah... that's weird. It would be much better if -rc3-mm1 is broken too. :-P Not completely sure is if 2.6.23-rc7-sglist

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-30 Thread Andi Kleen
On Sunday 30 September 2007 16:06:59 Thomas Gleixner wrote: On Sun, 30 Sep 2007, Andi Kleen wrote: OK, this explains 2) and 3). I just looked into the code and the logic vs. noapictimer on SMP is completely broken. noapictimer really doesn't make any sense on non SMP imho with the old

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Torsten Kaiser
behavior for something like that for just one -mm version sounds very weird. It's not just 2.6.23-rc4-mm1. All -mm's after rc4 are broken for me. Confirmed breakage on -rc4-mm1, -rc6-mm1 and -rc8-mm1. I'm just narrowing on rc4-mm1 because that was the first version to break. I'm currently trying

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Tejun Heo
]? It's not just 2.6.23-rc4-mm1. All -mm's after rc4 are broken for me. Confirmed breakage on -rc4-mm1, -rc6-mm1 and -rc8-mm1. I'm just narrowing on rc4-mm1 because that was the first version to break. I'm currently trying to bisect 2.6.23-rc4-mm1. Here is the current status: Have you tested

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-30 Thread Torsten Kaiser
time it also showed the error. It's not just 2.6.23-rc4-mm1. All -mm's after rc4 are broken for me. Confirmed breakage on -rc4-mm1, -rc6-mm1 and -rc8-mm1. I'm just narrowing on rc4-mm1 because that was the first version to break. I'm currently trying to bisect 2.6.23-rc4-mm1. Here

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Torsten Kaiser
On 9/27/07, Tejun Heo <[EMAIL PROTECTED]> wrote: > Torsten Kaiser wrote: > > Known good is for me 2.6.23-rc3-mm1, the first known bad is 2.6.23-rc4-mm1. > > I will try to look at the diff between these revisions some more, but > > the change in sata_sil24.c

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Tejun Heo
Torsten Kaiser wrote: > Known good is for me 2.6.23-rc3-mm1, the first known bad is 2.6.23-rc4-mm1. > I will try to look at the diff between these revisions some more, but > the change in sata_sil24.c looked like a perfect match for the > symptoms I was seeing. I think the first thin

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Torsten Kaiser
good point (v2.6.22? v2.6.23?) and > known bad point (HEAD, aka the most recent commit in > libata-dev.git#upstream) Known good is for me 2.6.23-rc3-mm1, the first known bad is 2.6.23-rc4-mm1. I will try to look at the diff between these revisions some more, but the change i

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-27 Thread Rafael J. Wysocki
On Thursday, 27 September 2007 01:21, Thomas Gleixner wrote: > On Thu, 2007-09-27 at 01:30 +0200, Rafael J. Wysocki wrote: > > > > Tested for a couple of times with each kernel, the results seem to be > > > > reproducible 100% of the time. > > > > > > Thanks for going through this debug marathon.

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Jeff Garzik
Torsten Kaiser wrote: On 9/27/07, Tejun Heo <[EMAIL PROTECTED]> wrote: Tejun Heo wrote: Torsten Kaiser wrote: Comparing the drivers/ata directory from rc3-mm1 and rc4-mm1 the following change looked the most suspicious to me:

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Torsten Kaiser
On 9/27/07, Tejun Heo <[EMAIL PROTECTED]> wrote: > Tejun Heo wrote: > > Torsten Kaiser wrote: > >> Comparing the drivers/ata directory from rc3-mm1 and rc4-mm1 the > >> following change looked the most suspicious to me: > >>

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Torsten Kaiser
On 9/27/07, Tejun Heo [EMAIL PROTECTED] wrote: Tejun Heo wrote: Torsten Kaiser wrote: Comparing the drivers/ata directory from rc3-mm1 and rc4-mm1 the following change looked the most suspicious to me:

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Jeff Garzik
Torsten Kaiser wrote: On 9/27/07, Tejun Heo [EMAIL PROTECTED] wrote: Tejun Heo wrote: Torsten Kaiser wrote: Comparing the drivers/ata directory from rc3-mm1 and rc4-mm1 the following change looked the most suspicious to me:

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-27 Thread Rafael J. Wysocki
On Thursday, 27 September 2007 01:21, Thomas Gleixner wrote: On Thu, 2007-09-27 at 01:30 +0200, Rafael J. Wysocki wrote: Tested for a couple of times with each kernel, the results seem to be reproducible 100% of the time. Thanks for going through this debug marathon. No big

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Torsten Kaiser
the most recent commit in libata-dev.git#upstream) Known good is for me 2.6.23-rc3-mm1, the first known bad is 2.6.23-rc4-mm1. I will try to look at the diff between these revisions some more, but the change in sata_sil24.c looked like a perfect match for the symptoms I was seeing. What I

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Tejun Heo
Torsten Kaiser wrote: Known good is for me 2.6.23-rc3-mm1, the first known bad is 2.6.23-rc4-mm1. I will try to look at the diff between these revisions some more, but the change in sata_sil24.c looked like a perfect match for the symptoms I was seeing. I think the first thing to do here

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-27 Thread Torsten Kaiser
On 9/27/07, Tejun Heo [EMAIL PROTECTED] wrote: Torsten Kaiser wrote: Known good is for me 2.6.23-rc3-mm1, the first known bad is 2.6.23-rc4-mm1. I will try to look at the diff between these revisions some more, but the change in sata_sil24.c looked like a perfect match for the symptoms I

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-26 Thread Tejun Heo
Tejun Heo wrote: > Torsten Kaiser wrote: >> Comparing the drivers/ata directory from rc3-mm1 and rc4-mm1 the >> following change looked the most suspicious to me: >>

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-26 Thread Tejun Heo
Torsten Kaiser wrote: > Comparing the drivers/ata directory from rc3-mm1 and rc4-mm1 the > following change looked the most suspicious to me: >

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
On Thu, 2007-09-27 at 01:30 +0200, Rafael J. Wysocki wrote: > > > Tested for a couple of times with each kernel, the results seem to be > > > reproducible 100% of the time. > > > > Thanks for going through this debug marathon. > > No big deal. I'm glad that you've found what's up. > > Well, we

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Rafael J. Wysocki
Thomas, On Wednesday, 26 September 2007 23:34, Thomas Gleixner wrote: > Rafael, > > On Wed, 2007-09-26 at 23:00 +0200, Rafael J. Wysocki wrote: > > > > > First, with the "x86-64: Disable local APIC timer use on AMD systems > > > > > with C1E" > > > > > patch and my collection of suspend patches

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
On Wed, 2007-09-26 at 15:22 -0700, Linus Torvalds wrote: > > On Wed, 26 Sep 2007, Thomas Gleixner wrote: > > > > > > 1) current Linus' tree doesn't boot with any command line (regression) > > > > > > [ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0 > > Reverted. > > >

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Linus Torvalds
On Wed, 26 Sep 2007, Thomas Gleixner wrote: > > > > 1) current Linus' tree doesn't boot with any command line (regression) > > > > [ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0 Reverted. > OK, this explains 2) and 3). I just looked into the code and the logic > vs.

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
Rafael, On Wed, 2007-09-26 at 23:00 +0200, Rafael J. Wysocki wrote: > > > > First, with the "x86-64: Disable local APIC timer use on AMD systems > > > > with C1E" > > > > patch and my collection of suspend patches applied, the box doesn't boot > > > > (the suspend patches don't even touch the

[REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Rafael J. Wysocki
On Wednesday, 26 September 2007 21:49, Rafael J. Wysocki wrote: > On Wednesday, 26 September 2007 20:51, Thomas Gleixner wrote: > > On Wed, 2007-09-26 at 17:25 +0200, Rafael J. Wysocki wrote: > > > There still are some oddities. > > > > > > First, with the "x86-64: Disable local APIC timer use on

sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-26 Thread Torsten Kaiser
As reported in the "2.6.23-rc4-mm1" thread and the "What's in linux-2.6-block.git for 2.6.24" thread, I'm having trouble: sometimes on bootup one drive on the SiI-3132 throws errors and becomes inaccessible. The latest kernel on which I have seen this error was 2.6.23-rc7-mm1.

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-26 Thread Rafael J. Wysocki
On Wednesday, 26 September 2007 20:51, Thomas Gleixner wrote: > On Wed, 2007-09-26 at 17:25 +0200, Rafael J. Wysocki wrote: > > There still are some oddities. > > > > First, with the "x86-64: Disable local APIC timer use on AMD systems with > > C1E" > > patch and my collection of suspend patches

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-26 Thread Thomas Gleixner
On Wed, 2007-09-26 at 17:25 +0200, Rafael J. Wysocki wrote: > There still are some oddities. > > First, with the "x86-64: Disable local APIC timer use on AMD systems with C1E" > patch and my collection of suspend patches applied, the box doesn't boot > (the suspend patches don't even touch the

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-26 Thread Rafael J. Wysocki
Thomas, On Tuesday, 25 September 2007 23:24, Thomas Gleixner wrote: > Rafael, > > On Tue, 2007-09-25 at 23:28 +0200, Rafael J. Wysocki wrote: > > > I'm a bit confused by your earlier confirmation, that mainline w/o the > > > -hrt patches boots fine, when you add "apicmaintimer" to the kernel > >

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-26 Thread Rafael J. Wysocki
Thomas, On Tuesday, 25 September 2007 23:24, Thomas Gleixner wrote: Rafael, On Tue, 2007-09-25 at 23:28 +0200, Rafael J. Wysocki wrote: I'm a bit confused by your earlier confirmation, that mainline w/o the -hrt patches boots fine, when you add "apicmaintimer" to the kernel command

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-26 Thread Thomas Gleixner
On Wed, 2007-09-26 at 17:25 +0200, Rafael J. Wysocki wrote: There still are some oddities. First, with the "x86-64: Disable local APIC timer use on AMD systems with C1E" patch and my collection of suspend patches applied, the box doesn't boot (the suspend patches don't even touch the boot

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-26 Thread Rafael J. Wysocki
On Wednesday, 26 September 2007 20:51, Thomas Gleixner wrote: On Wed, 2007-09-26 at 17:25 +0200, Rafael J. Wysocki wrote: There still are some oddities. First, with the "x86-64: Disable local APIC timer use on AMD systems with C1E" patch and my collection of suspend patches applied, the

sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-26 Thread Torsten Kaiser
As reported in the "2.6.23-rc4-mm1" thread and the "What's in linux-2.6-block.git for 2.6.24" thread, I'm having trouble: sometimes on bootup one drive on the SiI-3132 throws errors and becomes inaccessible. The latest kernel on which I have seen this error was 2.6.23-rc7-mm1. Out of 7 boots, 2 times

[REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Rafael J. Wysocki
On Wednesday, 26 September 2007 21:49, Rafael J. Wysocki wrote: On Wednesday, 26 September 2007 20:51, Thomas Gleixner wrote: On Wed, 2007-09-26 at 17:25 +0200, Rafael J. Wysocki wrote: There still are some oddities. First, with the x86-64: Disable local APIC timer use on AMD systems

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
Rafael, On Wed, 2007-09-26 at 23:00 +0200, Rafael J. Wysocki wrote: First, with the x86-64: Disable local APIC timer use on AMD systems with C1E patch and my collection of suspend patches applied, the box doesn't boot (the suspend patches don't even touch the boot code, so they

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Linus Torvalds
On Wed, 26 Sep 2007, Thomas Gleixner wrote: 1) current Linus' tree doesn't boot with any command line (regression) [ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0 Reverted. OK, this explains 2) and 3). I just looked into the code and the logic vs.

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
On Wed, 2007-09-26 at 15:22 -0700, Linus Torvalds wrote: On Wed, 26 Sep 2007, Thomas Gleixner wrote: 1) current Linus' tree doesn't boot with any command line (regression) [ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0 Reverted. OK, this explains 2)

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Rafael J. Wysocki
Thomas, On Wednesday, 26 September 2007 23:34, Thomas Gleixner wrote: Rafael, On Wed, 2007-09-26 at 23:00 +0200, Rafael J. Wysocki wrote: First, with the x86-64: Disable local APIC timer use on AMD systems with C1E patch and my collection of suspend patches applied, the box

Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
On Thu, 2007-09-27 at 01:30 +0200, Rafael J. Wysocki wrote: Tested for a couple of times with each kernel, the results seem to be reproducible 100% of the time. Thanks for going through this debug marathon. No big deal. I'm glad that you've found what's up. Well, we still have

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-26 Thread Tejun Heo
Torsten Kaiser wrote: Comparing the driver/ata directory from rc3-mm1 and rc4-mm1 the following change looked the most suspicious to me:

Re: sata_sil24 broken since 2.6.23-rc4-mm1

2007-09-26 Thread Tejun Heo
Tejun Heo wrote: Torsten Kaiser wrote: Comparing the driver/ata directory from rc3-mm1 and rc4-mm1 the following change looked the most suspicious to me:

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
Rafael, On Tue, 2007-09-25 at 23:28 +0200, Rafael J. Wysocki wrote: > > I'm a bit confused by your earlier confirmation, that mainline w/o the > > -hrt patches boots fine, when you add "apicmaintimer" to the kernel > > command line. "apicmaintimer" stops the PIT like we do in -hrt and we > > just

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Rafael J. Wysocki
Thomas, On Tuesday, 25 September 2007 22:46, Thomas Gleixner wrote: > Rafael, > > On Tue, 2007-09-25 at 22:07 +0200, Rafael J. Wysocki wrote: > > On Tuesday, 25 September 2007 15:17, Thomas Gleixner wrote: > > > On Tue, 2007-09-25 at 15:16 +0200, Rafael J. Wysocki wrote: > > [--snip--] > > > >

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
Rafael, On Tue, 2007-09-25 at 22:07 +0200, Rafael J. Wysocki wrote: > On Tuesday, 25 September 2007 15:17, Thomas Gleixner wrote: > > On Tue, 2007-09-25 at 15:16 +0200, Rafael J. Wysocki wrote: > [--snip--] > > > > I start to get desperate. Below is a patch, which moves the apic timer > >

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Rafael J. Wysocki
On Tuesday, 25 September 2007 15:17, Thomas Gleixner wrote: > On Tue, 2007-09-25 at 15:16 +0200, Rafael J. Wysocki wrote: [--snip--] > > I start to get desperate. Below is a patch, which moves the apic timer > disable check after the calibration routine. Can you please apply on top > of -hrt and

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
On Tue, 2007-09-25 at 15:16 +0200, Rafael J. Wysocki wrote: > > > There seems to be a history effect in the box, to make things more > > > "interesting". > > > > Did you connect this box to Andrews VAIO during KS ? > > No, but it's famous for being interestingly broken nevertheless. :) > > > I

Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Rafael J. Wysocki
On Monday, 24 September 2007 21:13, Thomas Gleixner wrote: > On Mon, 2007-09-24 at 21:11 +0200, Rafael J. Wysocki wrote: > > > /me scratches head > > > > Retested. > > > > > We know, that > > > - disabling local apic timers work > > > > This works reproducibly across the board. > > Ok > > >
