Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-10-17 Thread Pavel Machek
On Fri 2007-09-21 10:06:15, Thomas Gleixner wrote:
> On Fri, 2007-09-21 at 14:51 +1000, Paul Mackerras wrote:
> > Linus Torvalds writes:
> > 
> > > It would indeed be nice if we could just take CPU's down early (while 
> > > everything is working), and run the whole suspend code with just one CPU, 
> > > rather than having to worry about the ordering between CPU and device 
> > > takedown.
> > 
> > That is certainly what we want to do on powerpc.
> 
> I would have expected that we do it exactly this way and it took me by
> surprise, that we do not.

Well, we used to do that, but the ACPI spec forbids it, and it means
userspace sees plugs/unplugs.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki
Thomas,

On Friday, 21 September 2007 21:16, Thomas Gleixner wrote:
> Rafael,
> 
> On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
> > On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> > > I simply rmmod'ed the processor module before suspend and the problem is
> > > solved as well. The cpuidle patches make this problem more prominent due
> > > to the possible more direct switch into lower power states, when we wait
> > > for a long time on something.
> > 
> > So, perhaps we can add .suspend()/.resume() routines to the processor
> > driver and use them to disable/enable the cpuidle functionality during a
> > suspend/resume?
> 
> http://tglx.de/private/tglx/p.diff
> 
> untested yet, but I'm on the way to do that :)

Heh, I thought of the same thing. :-)

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
Rafael,

On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
> On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> > I simply rmmod'ed the processor module before suspend and the problem is
> > solved as well. The cpuidle patches make this problem more prominent due
> > to the possible more direct switch into lower power states, when we wait for
> > a long time on something. 
> 
> So, perhaps we can add .suspend()/.resume() routines to the processor driver
> and use them to disable/enable the cpuidle functionality during a
> suspend/resume?

http://tglx.de/private/tglx/p.diff

untested yet, but I'm on the way to do that :)

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki
On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> Rafael,
> 
> On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
> > > > If you need any help from me with that, please let me know.
> > > 
> > > I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> > > After debugging the swsusp_suspend() code path I figured out, that we
> > > end up in C2 or deeper power states while we run the suspend code. The
> > > same happens when we come back on resume. It looks like we disable stuff
> > > in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
> > 
> > Hm, can you please run the test I've suggested in another branch of the
> > thread, ie.
> > 
> > # echo shutdown > /sys/power/disk
> > # echo disk > /sys/power/state
> > 
> > without your debugging code in disk.c?
> > 
> > This makes the hibernation code omit the major ACPI hooks, so if it works,
> > we'll know that these hooks are responsible for the problem.
> 
> Yes, this works fine. We still go into C3, but this no longer seems to
> brick the box.
> 
> > > I hacked the idle loop arch code to use halt() right before we call
> > > device_suspend() and switch back to the acpi idle code right after
> > > device_resume(). This solves the problem as well.
> > 
> > Well, that seems less intrusive than changing the code ordering right before
> > the major kernel release, but I think we should do our best to understand
> > what _exactly_ is happening here.
> 
> I found some other subtle thinko in the clock events code while I was
> heading down the swsusp_suspend code path. I wait for confirmation that
> it does not brick some endangered boxen, though. Still with this change
> in the clock events code, my VAIO goes into C2 or C3 and causes the box
> to wait for a helping keystroke.
> 
> The correct solution would be, that the ACPI code ignores the lower
> C-states during suspend / resume.

Yes, certainly.

> I simply rmmod'ed the processor module before suspend and the problem is
> solved as well. The cpuidle patches make this problem more prominent due
> to the possible more direct switch into lower power states, when we wait for
> a long time on something. 

So, perhaps we can add .suspend()/.resume() routines to the processor driver
and use them to disable/enable the cpuidle functionality during a
suspend/resume?

> I think we really should not fiddle with the various cpu states during
> the critical parts of suspend / resume. Let's keep it simple. We have
> the same policy during boot and I think the suspend / resume critical
> parts have similar constraints.

I completely agree.

Greetings,
Rafael
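
The .suspend()/.resume() idea discussed above can be pictured with the
following sketch. It is a user-space model, not processor-driver code:
struct processor_model, processor_model_suspend(), processor_model_resume()
and pick_cstate() are hypothetical names introduced only for illustration,
and the fallback to C1 while the flag is set is an assumption about how
"ignoring the lower C-states" might look.

```c
/* Hypothetical model of a processor driver that refuses deep C-states
 * between its suspend and resume callbacks. None of these names exist
 * in the kernel; they only illustrate the proposal. */
enum cstate { C1 = 1, C2 = 2, C3 = 3 };

struct processor_model {
    enum cstate deepest;  /* deepest state the platform offers */
    int suspended;        /* nonzero between suspend and resume */
};

/* Called before devices are suspended: stop using C2 and deeper. */
static int processor_model_suspend(struct processor_model *p)
{
    p->suspended = 1;
    return 0;
}

/* Called after devices are resumed: allow deep states again. */
static int processor_model_resume(struct processor_model *p)
{
    p->suspended = 0;
    return 0;
}

/* Pick the state the idle loop may enter for a requested depth. */
static enum cstate pick_cstate(const struct processor_model *p,
                               enum cstate wanted)
{
    if (p->suspended)
        return C1;  /* assumption: C1 (halt) is always safe */
    return wanted > p->deepest ? p->deepest : wanted;
}
```

With such a flag the idle path never has to know which phase of suspend or
resume is running; it only asks pick_cstate() and gets C1 until resume has
completed.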


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
Rafael,

On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
> > > If you need any help from me with that, please let me know.
> > 
> > I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> > After debugging the swsusp_suspend() code path I figured out, that we
> > end up in C2 or deeper power states while we run the suspend code. The
> > same happens when we come back on resume. It looks like we disable stuff
> > in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
> 
> Hm, can you please run the test I've suggested in another branch of the
> thread, ie.
> 
> # echo shutdown > /sys/power/disk
> # echo disk > /sys/power/state
> 
> without your debugging code in disk.c?
> 
> This makes the hibernation code omit the major ACPI hooks, so if it works,
> we'll know that these hooks are responsible for the problem.

Yes, this works fine. We still go into C3, but this no longer seems to
brick the box.

> > I hacked the idle loop arch code to use halt() right before we call
> > device_suspend() and switch back to the acpi idle code right after
> > device_resume(). This solves the problem as well.
> 
> Well, that seems less intrusive than changing the code ordering right before
> the major kernel release, but I think we should do our best to understand what
> _exactly_ is happening here.

I found some other subtle thinko in the clock events code while I was
heading down the swsusp_suspend code path. I wait for confirmation that
it does not brick some endangered boxen, though. Still with this change
in the clock events code, my VAIO goes into C2 or C3 and causes the box
to wait for a helping keystroke.

The correct solution would be, that the ACPI code ignores the lower
C-states during suspend / resume. I simply rmmod'ed the processor module
before suspend and the problem is solved as well. The cpuidle patches
make this problem more prominent due to the possible more direct switch
into lower power states, when we wait for a long time on something.

I think we really should not fiddle with the various cpu states during
the critical parts of suspend / resume. Let's keep it simple. We have
the same policy during boot and I think the suspend / resume critical
parts have similar constraints.

tglx








Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki
Thomas,

On Friday, 21 September 2007 14:59, Thomas Gleixner wrote:
> Rafael,
> 
> On Fri, 2007-09-21 at 00:30 +0200, Rafael J. Wysocki wrote:
> > > -ETOOTIRED led me to a wrong conclusion, but still it is a valuable
> > > hint that this change is making things work again.
> > 
> > Yes, it is.
> > 
> > > I need to go down into the details of the swsusp_suspend() code path to
> > > figure out, what's the root cause. 
> > 
> > If you need any help from me with that, please let me know.
> 
> I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> After debugging the swsusp_suspend() code path I figured out, that we
> end up in C2 or deeper power states while we run the suspend code. The
> same happens when we come back on resume. It looks like we disable stuff
> in the ACPI BIOS, which makes the C2 and deeper power states misbehave.

Hm, can you please run the test I've suggested in another branch of the
thread, ie.

# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state

without your debugging code in disk.c?

This makes the hibernation code omit the major ACPI hooks, so if it works,
we'll know that these hooks are responsible for the problem.

> I hacked the idle loop arch code to use halt() right before we call
> device_suspend() and switch back to the acpi idle code right after
> device_resume(). This solves the problem as well.

Well, that seems less intrusive than changing the code ordering right before
the major kernel release, but I think we should do our best to understand what
_exactly_ is happening here.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
Rafael,

On Fri, 2007-09-21 at 00:30 +0200, Rafael J. Wysocki wrote:
> > -ETOOTIRED led me to a wrong conclusion, but still it is a valuable
> > hint that this change is making things work again.
> 
> Yes, it is.
> 
> > I need to go down into the details of the swsusp_suspend() code path to
> > figure out, what's the root cause. 
> 
> If you need any help from me with that, please let me know.

I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
After debugging the swsusp_suspend() code path I figured out, that we
end up in C2 or deeper power states while we run the suspend code. The
same happens when we come back on resume. It looks like we disable stuff
in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
I hacked the idle loop arch code to use halt() right before we call
device_suspend() and switch back to the acpi idle code right after
device_resume(). This solves the problem as well.

Len, any opinion on this one ?

tglx
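
The halt() workaround described above amounts to swapping the idle routine
around the device suspend/resume window. A minimal sketch of that idea
follows. It is a user-space model under stated assumptions: idle_fn_t,
acpi_idle_model(), halt_idle_model(), idle_force_halt() and idle_restore()
are hypothetical names, and the counters only stand in for entering a deep
C-state or executing halt. It is not the patch that was tested.

```c
#include <stddef.h>

typedef void (*idle_fn_t)(void);

/* Counters standing in for "entered C2/C3" and "executed halt". */
static int deep_calls;
static int halt_calls;

static void acpi_idle_model(void) { deep_calls++; }
static void halt_idle_model(void) { halt_calls++; }

/* The routine the idle loop currently calls (assumed default: ACPI idle). */
static idle_fn_t current_idle = acpi_idle_model;
static idle_fn_t saved_idle;

/* To be called right before the device_suspend() step. */
static void idle_force_halt(void)
{
    if (saved_idle == NULL) {
        saved_idle = current_idle;
        current_idle = halt_idle_model;
    }
}

/* To be called right after the device_resume() step. */
static void idle_restore(void)
{
    if (saved_idle != NULL) {
        current_idle = saved_idle;
        saved_idle = NULL;
    }
}

/* One pass through the idle loop. */
static void cpu_idle_once(void)
{
    current_idle();
}
```

Saving the previous routine rather than hard-coding the ACPI one keeps the
sketch neutral about which idle implementation was active before suspend.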




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki
On Friday, 21 September 2007 09:56, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 19:35 -0400, Len Brown wrote:
> > > > (Btw, the above commit message points to just my response with a testing
> > > > patch to the real email: the actual explanation of the INSANE ordering is
> > > > from Len Brown in
> > > > 
> > > > https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> > > > 
> > > > and there Len claims that we *must* wake up CPU's early).
> > > 
> > > ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
> > > turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
> > > 
> > > However, it seems that bugzilla entry may just be bogus. It talks about 
> > > "it appears that some firmware in the future may depend on that sequence 
> > > for correction operation"
> > > 
> > > Len, Shaohua, what are the real issues here? 
> > 
> > Intel's reference BIOS for Core Duo performs some re-initialization
> > in _WAK that will get blown away if INIT follows _WAK.
> > IIRC, it is related to re-initializing the thermal sensors.
> > I opened bug 5651 when the BIOS team informed me of this issue.
> > 
> > Yes, bringing a processor offline and then online again w/o
> > an intervening suspend or reset would not evaluate _WAK,
> > and thus may still run into the issue.
> 
> If this is true, then we should disable the sys//cpu/online entry
> right away.

Or drop the execution of _INI from the CPU hotplug, if possible ...

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki
Thomas,

On Thursday, 20 September 2007 23:53, Thomas Gleixner wrote:
> Rafael,
> 
> On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
> > > We disable everything in device_suspend()
> > 
> > No, we don't.  sysdevs are _not_ suspended in device_suspend().
> > They are suspended in device_power_down(), which is called
> > _after_ disable_nonboot_cpus() (from swsusp_suspend()).
> > 
> > > including timekeeping,
> > 
> > No, the timekeeping is suspended in device_power_down() (or at least it
> > should be).
> 
> Damn, you are right. Reading through 30 different logs confused me.
> 
> > >   enable_nonboot_cpus();
> > 
> > Actually, we can't do this here, because of ACPI and some interrupt handling
> > related problems.  Unfortunately, platform_finish() needs to go _after_
> > enable_nonboot_cpus() and device_resume() needs to go after 
> > platform_finish().
> > Analogously, disable_nonboot_cpus() has to go after platform_prepare().
> >
> > Otherwise, some systems will break.
> 
> Well, I don't buy this one. The system would break in the same way, when
> I take CPU#1 offline before I initiate the suspend.
> 
> > > and non-surprisingly the "my VAIO needs help from keyboard" problem went
> > > away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> > > work at all on my VAIO due to some yet not identified wreckage)
> > 
> > Hm, I really don't know why it helps, but that's not because of the
> > timekeeping suspend, IMO.
> 
> It is related. We rely on some subtle thing which is not up when we
> resume the non boot cpu.
> 
> > > I did not yet look into the suspend to ram code, but I guess that there
> > > is an equivalent problem.
> > 
> > Yes, the code ordering is the same, but it's not totally wrong, IMHO.
> > 
> > > But I have no idea why this affects Andrew's jinxed VAIO (UP machine),
> > > though I suspect that we have more timekeeping/timer-dependent code
> > > somewhere waiting to bite us.
> > 
> > That's possible.
> > 
> > > Also I still need to debug why the HIBERNATION_TEST code path (which has
> > > a msleep(5000) in it) does not fail,
> > 
> > See above. :-)
> 
> Yes. It makes sense. When I change the TEST code path to:
> 
> - printk("swsusp debug: Waiting for 5 seconds.\n");
> - msleep(5000);
> + printk("swsusp debug: before swsusp_suspend\n");
> + error = swsusp_suspend();
> 
> then I have the same effect as I get from real hibernation. And we
> actually shut down time keeping somewhere in that code path.
> 
> ACPI: PCI interrupt for device :00:1b.0 disabled
> swsusp debug: before swsusp_suspend
> Suspend timekeeping
> swsusp: critical section: 
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> ACPI: PCI Interrupt :00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
> -> works fine
> 
> This is with my patch applied. Without that I get:
> 
> CPU1 is down
> swsusp debug: before swsusp_suspend
> Suspend timekeeping
> swsusp: critical section: 
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> Enabling non-boot CPUs
> --> Waits forever until a key is pressed

Can you please run one more test?

Namely, without your debugging code in disk.c, please try

# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state

Greetings,
Rafael
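
The ordering constraints quoted in this message can be written down as a
small table and checked mechanically. The sketch below is only a model: the
full call sequence in hib_steps[] is an assumption (in particular the
position of device_suspend() and device_power_up()), while the four
relations tested by ordering_ok() are the ones stated in the mail. The
helper names step_index(), runs_before() and ordering_ok() are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumed call order for one hibernation cycle. Only the relative
 * constraints checked in ordering_ok() come from the thread. */
static const char *const hib_steps[] = {
    "device_suspend",
    "platform_prepare",
    "disable_nonboot_cpus",
    "device_power_down",      /* sysdevs and timekeeping go down here */
    "device_power_up",
    "enable_nonboot_cpus",
    "platform_finish",
    "device_resume",
};

#define HIB_NSTEPS (sizeof(hib_steps) / sizeof(hib_steps[0]))

/* Return the position of a step in the table, or -1 if it is unknown. */
static int step_index(const char *name)
{
    for (size_t i = 0; i < HIB_NSTEPS; i++)
        if (strcmp(hib_steps[i], name) == 0)
            return (int)i;
    return -1;
}

/* Return 1 if both steps exist and 'first' runs before 'second'. */
static int runs_before(const char *first, const char *second)
{
    int a = step_index(first);
    int b = step_index(second);

    return a >= 0 && b >= 0 && a < b;
}

/* Return 1 if the table satisfies every constraint stated in the mail. */
static int ordering_ok(void)
{
    return runs_before("platform_prepare", "disable_nonboot_cpus") &&
           runs_before("disable_nonboot_cpus", "device_power_down") &&
           runs_before("enable_nonboot_cpus", "platform_finish") &&
           runs_before("platform_finish", "device_resume");
}
```

Moving "enable_nonboot_cpus" after "device_resume" in the table, which is
roughly what the proposed reordering does, makes ordering_ok() return 0;
that is the conflict being argued about here.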


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
On Fri, 2007-09-21 at 14:51 +1000, Paul Mackerras wrote:
> Linus Torvalds writes:
> 
> > It would indeed be nice if we could just take CPU's down early (while 
> > everything is working), and run the whole suspend code with just one CPU, 
> > rather than having to worry about the ordering between CPU and device 
> > takedown.
> 
> That is certainly what we want to do on powerpc.

I would have expected that we do it exactly this way and it took me by
surprise, that we do not.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
On Thu, 2007-09-20 at 19:35 -0400, Len Brown wrote:
> > > (Btw, the above commit message points to just my response with a testing 
> > > patch to the real email: the actual explanation of the INSANE ordering is 
> > > from Len Brown in
> > > 
> > >   
> > > https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> > > 
> > > and there Len claims that we *must* wake up CPU's early).
> > 
> > ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
> > turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
> > 
> > However, it seems that bugzilla entry may just be bogus. It talks about 
> > "it appears that some firmware in the future may depend on that sequence 
> > for correction operation"
> > 
> > Len, Shaohua, what are the real issues here? 
> 
> Intel's reference BIOS for Core Duo performs some re-initialization
> in _WAK that will get blown away if INIT follows _WAK.
> IIRC, it is related to re-initializing the thermal sensors.
> I opened bug 5651 when the BIOS team informed me of this issue.
> 
> Yes, bringing a processor offline and then online again w/o
> an intervening suspend or reset would not evaluate _WAK,
> and thus may still run into the issue.

If this is true, then we should disable the sys//cpu/online entry
right away.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki
Thomas,

On Friday, 21 September 2007 14:59, Thomas Gleixner wrote:
> Rafael,
>
> On Fri, 2007-09-21 at 00:30 +0200, Rafael J. Wysocki wrote:
> > > -ETOOTIRED led me too a wrong conclusion, but still it is a valuable
> > > hint that this change is making things work again.
> >
> > Yes, it is.
> >
> > > I need to go down into the details of the swsusp_suspend() code path to
> > > figure out, what's the root cause.
> >
> > If you need any help from me with that, please let me know.
>
> I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> After debugging the swsusp_suspend() code path I figured out, that we
> end up in C2 or deeper power states while we run the suspend code. The
> same happens when we come back on resume. It looks like we disable stuff
> in the ACPI BIOS, which makes the C2 and deeper power states misbehave.

Hm, can you please run the test I've suggested in another branch of the
thread, ie.

# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state

without your debugging code in disk.c?

This makes the hibernation code omit the major ACPI hooks, so if it works,
we'll know that these hooks are responsible for the problem.

> I hacked the idle loop arch code to use halt() right before we call
> device_suspend() and switch back to the acpi idle code right after
> device_resume(). This solves the problem as well.

Well, that seems less intrusive than changing the code ordering right before
the major kernel release, but I think we should do our best to understand what
_exactly_ is happening here.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
Rafael,

On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
> > > If you need any help from me with that, please let me know.
> >
> > I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> > After debugging the swsusp_suspend() code path I figured out, that we
> > end up in C2 or deeper power states while we run the suspend code. The
> > same happens when we come back on resume. It looks like we disable stuff
> > in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
>
> Hm, can you please run the test I've suggested in another branch of the
> thread, ie.
>
> # echo shutdown > /sys/power/disk
> # echo disk > /sys/power/state
>
> without your debugging code in disk.c?
>
> This makes the hibernation code omit the major ACPI hooks, so if it works,
> we'll know that these hooks are responsible for the problem.

Yes, this works fine. We still go into C3, but this no longer seems to
brick the box.

> > I hacked the idle loop arch code to use halt() right before we call
> > device_suspend() and switch back to the acpi idle code right after
> > device_resume(). This solves the problem as well.
>
> Well, that seems less intrusive than changing the code ordering right before
> the major kernel release, but I think we should do our best to understand what
> _exactly_ is happening here.

I found another subtle thinko in the clock events code while I was
heading down the swsusp_suspend() code path. I'm waiting for confirmation
that the fix does not brick some endangered boxen, though. Even with this
change in the clock events code, my VAIO goes into C2 or C3 and causes the
box to wait for a helping keystroke.

The correct solution would be for the ACPI code to ignore the deeper
C-states during suspend / resume. I simply rmmod'ed the processor module
before suspend and the problem is solved as well. The cpuidle patches
make this problem more prominent because they can switch more directly
into deeper power states when we wait for a long time on something.

I think we really should not fiddle with the various cpu states during
the critical parts of suspend / resume. Let's keep it simple. We have
the same policy during boot and I think the suspend / resume critical
parts have similar constraints.

tglx


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki
On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> Rafael,
>
> On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
> > > > If you need any help from me with that, please let me know.
> > >
> > > I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> > > After debugging the swsusp_suspend() code path I figured out, that we
> > > end up in C2 or deeper power states while we run the suspend code. The
> > > same happens when we come back on resume. It looks like we disable stuff
> > > in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
> >
> > Hm, can you please run the test I've suggested in another branch of the
> > thread, ie.
> >
> > # echo shutdown > /sys/power/disk
> > # echo disk > /sys/power/state
> >
> > without your debugging code in disk.c?
> >
> > This makes the hibernation code omit the major ACPI hooks, so if it works,
> > we'll know that these hooks are responsible for the problem.
>
> Yes, this works fine. We still go into C3, but this seems not longer to
> brick the box.
>
> > > I hacked the idle loop arch code to use halt() right before we call
> > > device_suspend() and switch back to the acpi idle code right after
> > > device_resume(). This solves the problem as well.
> >
> > Well, that seems less intrusive than changing the code ordering right before
> > the major kernel release, but I think we should do our best to understand what
> > _exactly_ is happening here.
>
> I found some other subtle thinko in the clock events code while I was
> heading down the swsusp_suspend code path. I wait for confirmation that
> it does not brick some endangered boxen, though. Still with this change
> in the clock events code, my VAIO goes into C2 or C3 and causes the box
> to wait for a helping keystroke.
>
> The correct solution would be, that the ACPI code ignores the lower
> C-states during suspend / resume.

Yes, certainly.

> I simply rmmod'ed the processor module before suspend and the problem is
> solved as well. The cpuidle patches make this problem more prominent due
> to the possible more direct switch into lower power states, when we wait for
> a long time on something.

So, perhaps we can add .suspend()/.resume() routines to the processor driver
and use them to disable/enable the cpuidle functionality during a
suspend/resume?

> I think we really should not fiddle with the various cpu states during
> the critical parts of suspend / resume. Let's keep it simple. We have
> the same policy during boot and I think the suspend / resume critical
> parts have similar constraints.

I completely agree.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki
Thomas,

On Friday, 21 September 2007 21:16, Thomas Gleixner wrote:
> Rafael,
>
> On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
> > On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> > > I simply rmmod'ed the processor module before suspend and the problem is
> > > solved as well. The cpuidle patches make this problem more prominent due
> > > to the possible more direct switch into lower power states, when we wait for
> > > a long time on something.
> >
> > So, perhaps we can add a .suspend()/.resume() routines to the processor driver
> > and use them to disable/enable the cpuidle functionality during a
> > suspend/resume?
>
> http://tglx.de/private/tglx/p.diff
>
> untested yet, but I'm on the way to do that :)

Heh, I thought of the same thing. :-)

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
Rafael,

On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
> On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> > I simply rmmod'ed the processor module before suspend and the problem is
> > solved as well. The cpuidle patches make this problem more prominent due
> > to the possible more direct switch into lower power states, when we wait for
> > a long time on something.
>
> So, perhaps we can add a .suspend()/.resume() routines to the processor driver
> and use them to disable/enable the cpuidle functionality during a
> suspend/resume?

http://tglx.de/private/tglx/p.diff

untested yet, but I'm on the way to do that :)

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Paul Mackerras
Linus Torvalds writes:

> It would indeed be nice if we could just take CPU's down early (while 
> everything is working), and run the whole suspend code with just one CPU, 
> rather than having to worry about the ordering between CPU and device 
> takedown.

That is certainly what we want to do on powerpc.

Paul.


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Len Brown
On Thursday 20 September 2007 17:55, Linus Torvalds wrote:
> 
> On Thu, 20 Sep 2007, Linus Torvalds wrote:
> > 
> > (Btw, the above commit message points to just my response with a testing 
> > patch to the real email: the actual explanation of the INSANE ordering is 
> > from Len Brown in
> > 
> > 
> > https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> > 
> > and there Len claims that we *must* wake up CPU's early).
> 
> ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
> turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
> 
> Howerver, it seems that bugzilla entry may just be bogus. It talks about 
> "it appears that some firmware in the future may depend on that sequence 
> for correction operation"
> 
> Len, Shaohua, what are the real issues here? 

Intel's reference BIOS for Core Duo performs some re-initialization
in _WAK that will get blown away if INIT follows _WAK.
IIRC, it is related to re-initializing the thermal sensors.
I opened bug 5651 when the BIOS team informed me of this issue.

Yes, bringing a processor offline and then online again w/o
an intervening suspend or reset would not evaluate _WAK,
and thus may still run into the issue.

I don't know if this is a widespread issue and a commonly
used BIOS hook, or if it is specific to certain processors.

-Len

> It would indeed be nice if we could just take CPU's down early (while 
> everything is working), and run the whole suspend code with just one CPU, 
> rather than having to worry about the ordering between CPU and device 
> takedown.
> 
> That said, at least with STR, the situation is:
> 
>  1) suspend_console
>  2)   device_suspend(PMSG_SUSPEND)  (==   ->suspend)
>  3) disable_nonboot_cpus()
>  4)   device_power_down(PMSG_SUSPEND) (==   ->suspend_late)
>  5) pm_ops->enter()
>  6)   device_power_up() (==   ->resume_early)
>  7) enable_nonboot_cpus()
>  8) pm_finish()
>  9)   device_resume()   (==   ->resume
> 10) resume_console
> 
> So if we agree that things like timers etc should *never* be suspended by 
> the early suspend, and *always* use "suspend_late/resume_early", then at 
> least STR should be ok.
> 
> And I think that's a damn reasonable thing to agree on: timers (and 
> anything else that CPU shutdown/bringup could *possibly* care about) 
> should be considered core enough that they had better be on the 
> suspend_late/resume_early list.
> 
> Thomas, Rafael, can you verify that at least STR is ok in this respect?
> 
>   Linus


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Friday, 21 September 2007 00:05, Thomas Gleixner wrote:
> Linus,
> 
> On Thu, 2007-09-20 at 14:55 -0700, Linus Torvalds wrote:
> > And I think that's a damn reasonable thing to agree on: timers (and 
> > anything else that CPU shutdown/bringup could *possibly* care about) 
> > should be considered core enough that they had better be on the 
> > suspend_late/resume_early list.
> > 
> > Thomas, Rafael, can you verify that at least STR is ok in this respect?
> 
> -ETOOTIRED led me too a wrong conclusion, but still it is a valuable
> hint that this change is making things work again.

Yes, it is.

> I need to go down into the details of the swsusp_suspend() code path to
> figure out, what's the root cause. 

If you need any help from me with that, please let me know.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
Thomas,

On Thursday, 20 September 2007 23:53, Thomas Gleixner wrote:
> Rafael,
> 
> On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
> > > We disable everything in device_suspend()
> > 
> > No, we don't.  sysdevs are _not_ suspended in device_suspend().
> > They are suspended in device_power_down(), which is called
> > _after_ disable_nonboot_cpus() (from swsusp_suspend()).
> > 
> > > including timekeeping,
> > 
> > > No, the timekeeping is suspended in device_power_down() (or at least it should
> > > be).
> 
> Damn, you are right. Reading through 30 different logs confused me.
> 
> > >   enable_nonboot_cpus();
> > 
> > Actually, we can't do this here, because of ACPI and some interrupt handling
> > related problems.  Unfortunately, platform_finish() needs to go _after_
> > > enable_nonboot_cpus() and device_resume() needs to go after platform_finish().
> > Analogously, disable_nonboot_cpus() has to go after platform_prepare().
> >
> > Otherwise, some systems will break.
> 
> Well, I don't buy this one. The system would break in the same way, when
> I take CPU#1 offline before I initiate the suspend.

I was referring to the resume part.  If we call enable_nonboot_cpus(), which
executes the _INI ACPI control method, after platform_finish(), which executes
the _WAK global ACPI control method, things will break.  That already happened
in the past, when the code ordering was different, AFAICS.

> > > and non-surprisingly the "my VAIO needs help from keyboard" problem went
> > > away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> > > work at all on my VAIO due to some yet not identified wreckage)
> > 
> > > Hm, I really don't know why it helps, but that's not because of the timekeeping
> > > suspend, IMO.
> 
> It is related. We rely on some subtle thing which is not up when we
> resume the non boot cpu.

Yes, it looks so.

> > > I did not yet look into the suspend to ram code, but I guess that there
> > > is an equivalent problem.
> > 
> > Yes, the code ordering is the same, but it's not totally wrong, IMHO.
> > 
> > > But I have no idea why this affects Andrews jinxed VAIO (UP machine),
> > > though I suspect that we have more timekeeping/timer depending code
> > > somewhere waiting to bite us.
> > 
> > That's possible.
> > 
> > > Also I still need to debug why the HIBERNATION_TEST code path (which has
> > > a msleep(5000) in it) does not fail,
> > 
> > See above. :-)
> 
> Yes. It makes sense. When I change the TEST code path to:
> 
> - printk("swsusp debug: Waiting for 5 seconds.\n");
> - msleep(5000);
> + printk("swsusp debug: before swsusp_suspend\n");
> + error = swsusp_suspend();
> 
> then I have the same effect as I get from real hibernation. And we
> actually shut down time keeping somewhere in that code path.
> 
> ACPI: PCI interrupt for device :00:1b.0 disabled
> swsusp debug: before swsusp_suspend
> Suspend timekeeping

Exactly.  timekeeping_suspend() is called from device_power_down(), which is
called from swsusp_suspend() (after disabling interrupts).

> swsusp: critical section: 
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> ACPI: PCI Interrupt :00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
> -> works fine
> 
> This is with my patch applied. Without that I get:
> 
> CPU1 is down
> swsusp debug: before swsusp_suspend
> Suspend timekeeping
> swsusp: critical section: 
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> Enabling non-boot CPUs
> --> Waits for ever until a key is pressed

Well, perhaps there's something else that we should suspend late and resume
early, but we don't?

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Linus,

On Thu, 2007-09-20 at 14:55 -0700, Linus Torvalds wrote:
> And I think that's a damn reasonable thing to agree on: timers (and 
> anything else that CPU shutdown/bringup could *possibly* care about) 
> should be considered core enough that they had better be on the 
> suspend_late/resume_early list.
> 
> Thomas, Rafael, can you verify that at least STR is ok in this respect?

-ETOOTIRED led me to a wrong conclusion, but still it is a valuable
hint that this change is making things work again. I need to go down
into the details of the swsusp_suspend() code path to figure out what's
the root cause.

Sorry for the noise, but I'm zooming in.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 23:54 +0200, Rafael J. Wysocki wrote:
> > Hmm. This is close to the ordering we have in STR too.
> > 
> > I have some dim memory of there being some ACPI reason why it had to be 
> > done that way.
> 
> Yes.  We're executing _INI from the CPU initialization code and that shouldn't
> be done after _WAK, which is called from platform_finish().

If I tear down CPU#1 right before I tell the kernel to hibernate, then
the box must explode in the same way. It does not. On none of 4 tested
laptops. 

Of course only the jinxed VAIO exposes the "please press a key"
problem.

I need to follow down the swsusp_suspend() code path to figure out, why
this breaks the box.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Linus Torvalds


On Thu, 20 Sep 2007, Linus Torvalds wrote:
> 
> (Btw, the above commit message points to just my response with a testing 
> patch to the real email: the actual explanation of the INSANE ordering is 
> from Len Brown in
> 
>   
> https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> 
> and there Len claims that we *must* wake up CPU's early).

..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 

However, it seems that bugzilla entry may just be bogus. It talks about 
"it appears that some firmware in the future may depend on that sequence 
for correction operation"

Len, Shaohua, what are the real issues here? 

It would indeed be nice if we could just take CPU's down early (while 
everything is working), and run the whole suspend code with just one CPU, 
rather than having to worry about the ordering between CPU and device 
takedown.

That said, at least with STR, the situation is:

 1) suspend_console
 2)   device_suspend(PMSG_SUSPEND)      (== ->suspend)
 3) disable_nonboot_cpus()
 4)   device_power_down(PMSG_SUSPEND)   (== ->suspend_late)
 5) pm_ops->enter()
 6)   device_power_up()                 (== ->resume_early)
 7) enable_nonboot_cpus()
 8) pm_finish()
 9)   device_resume()                   (== ->resume)
10) resume_console

So if we agree that things like timers etc should *never* be suspended by 
the early suspend, and *always* use "suspend_late/resume_early", then at 
least STR should be ok.

And I think that's a damn reasonable thing to agree on: timers (and 
anything else that CPU shutdown/bringup could *possibly* care about) 
should be considered core enough that they had better be on the 
suspend_late/resume_early list.

Thomas, Rafael, can you verify that at least STR is ok in this respect?

Linus


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
> > We disable everything in device_suspend()
> 
> No, we don't.  sysdevs are _not_ suspended in device_suspend().
> They are suspended in device_power_down(), which is called
> _after_ disable_nonboot_cpus() (from swsusp_suspend()).
> 
> > including timekeeping,
> 
> No, the timekeeping is suspended in device_power_down() (or at least it should
> be).

Damn, you are right. Reading through 30 different logs confused me.

> > enable_nonboot_cpus();
> 
> Actually, we can't do this here, because of ACPI and some interrupt handling
> related problems.  Unfortunately, platform_finish() needs to go _after_
> enable_nonboot_cpus() and device_resume() needs to go after platform_finish().
> Analogously, disable_nonboot_cpus() has to go after platform_prepare().
>
> Otherwise, some systems will break.

Well, I don't buy this one. The system would break in the same way, when
I take CPU#1 offline before I initiate the suspend.

> > and non-surprisingly the "my VAIO needs help from keyboard" problem went
> > away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> > work at all on my VAIO due to some yet not identified wreckage)
> 
> Hm, I really don't know why it helps, but that's not because of the timekeeping
> suspend, IMO.

It is related. We rely on some subtle thing which is not up when we
resume the non boot cpu.

> > I did not yet look into the suspend to ram code, but I guess that there
> > is an equivalent problem.
> 
> Yes, the code ordering is the same, but it's not totally wrong, IMHO.
> 
> > But I have no idea why this affects Andrews jinxed VAIO (UP machine),
> > though I suspect that we have more timekeeping/timer depending code
> > somewhere waiting to bite us.
> 
> That's possible.
> 
> > Also I still need to debug why the HIBERNATION_TEST code path (which has
> > a msleep(5000) in it) does not fail,
> 
> See above. :-)

Yes. It makes sense. When I change the TEST code path to:

-   printk("swsusp debug: Waiting for 5 seconds.\n");
-   msleep(5000);
+   printk("swsusp debug: before swsusp_suspend\n");
+   error = swsusp_suspend();

then I have the same effect as I get from real hibernation. And we
actually shut down time keeping somewhere in that code path.

ACPI: PCI interrupt for device :00:1b.0 disabled
swsusp debug: before swsusp_suspend
Suspend timekeeping
swsusp: critical section: 
swsusp: Need to copy 112429 pages
swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
swsusp: critical section: done (112429 pages copied)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Resume timekeeping
ACPI: PCI Interrupt :00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
-> works fine

This is with my patch applied. Without that I get:

CPU1 is down
swsusp debug: before swsusp_suspend
Suspend timekeeping
swsusp: critical section: 
swsusp: Need to copy 112429 pages
swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
swsusp: critical section: done (112429 pages copied)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Resume timekeeping
Enabling non-boot CPUs
--> Waits for ever until a key is pressed

Thanks,

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 23:35, Linus Torvalds wrote:
> 
> On Thu, 20 Sep 2007, Thomas Gleixner wrote:
> > 
> > In meantime I figured out what's happening. The ordering in
> > hibernate_snapshot() is wrong. It does:

Actually, this is incorrect.  Please read my reply to Thomas, just sent. 

> Hmm. This is close to the ordering we have in STR too.
> 
> I have some dim memory of there being some ACPI reason why it had to be 
> done that way.

Yes.  We're executing _INI from the CPU initialization code and that shouldn't
be done after _WAK, which is called from platform_finish().

> In fact, this was done in commit e3c7db621bed4afb8e231cb005057f2feb5db557, 
> long ago, by Rafael:
> 
> As indicated in a recent thread on Linux-PM, it's necessary to call
> pm_ops->finish() before devce_resume(), but enable_nonboot_cpus() has to be
> called before pm_ops->finish() (cf.
> http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html).  For
> consistency, it seems reasonable to call disable_nonboot_cpus() after
> device_suspend().
>
> This way the suspend code will remain symmetrical with respect to the resume
> code and it may allow us to speed up things in the future by suspending and
> resuming devices and/or saving the suspend image in many threads.
>
> The following series of patches reorders the suspend and resume code so that
> nonboot CPUs are disabled after devices have been suspended and enabled before
> the devices are resumed.  It also causes pm_ops->finish() to be called after
> enable_nonboot_cpus() wherever necessary.
> 
> Hmm?
> 
> It's entirely possible that that commit was simply just buggy, and we 
> should indeed move the CPU down/up to be early/late - we've fixed other 
> ordering issues since that commit went in. But this whole area is very 
> murky.
> 
> (Btw, the above commit message points to just my response with a testing 
> patch to the real email: the actual explanation of the INSANE ordering is 
> from Len Brown in
> 
>   
> https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> 
> and there Len claims that we *must* wake up CPU's early).
> 
> I personally think that the whole ACPI ordering requirements are just 
> insane, but the point of this email is to point these different 
> requirements out, and hopefully we can get something that works for 
> everybody.

Sure.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Linus Torvalds


On Thu, 20 Sep 2007, Thomas Gleixner wrote:
> 
> In meantime I figured out what's happening. The ordering in
> hibernate_snapshot() is wrong. It does:

Hmm. This is close to the ordering we have in STR too.

I have some dim memory of there being some ACPI reason why it had to be 
done that way.

In fact, this was done in commit e3c7db621bed4afb8e231cb005057f2feb5db557, 
long ago, by Rafael:

As indicated in a recent thread on Linux-PM, it's necessary to call
pm_ops->finish() before devce_resume(), but enable_nonboot_cpus() has to be
called before pm_ops->finish() (cf.
http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html).  For
consistency, it seems reasonable to call disable_nonboot_cpus() after
device_suspend().

This way the suspend code will remain symmetrical with respect to the resume
code and it may allow us to speed up things in the future by suspending and
resuming devices and/or saving the suspend image in many threads.

The following series of patches reorders the suspend and resume code so that
nonboot CPUs are disabled after devices have been suspended and enabled before
the devices are resumed.  It also causes pm_ops->finish() to be called after
enable_nonboot_cpus() wherever necessary.

Hmm?

It's entirely possible that that commit was simply just buggy, and we 
should indeed move the CPU down/up to be early/late - we've fixed other 
ordering issues since that commit went in. But this whole area is very 
murky.

(Btw, the above commit message points to just my response with a testing 
patch to the real email: the actual explanation of the INSANE ordering is 
from Len Brown in


https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html

and there Len claims that we *must* wake up CPU's early).

I personally think that the whole ACPI ordering requirements are just 
insane, but the point of this email is to point these different 
requirements out, and hopefully we can get something that works for 
everybody.

Len added to Cc.

Len? Thomas wants to call 'disable_nonboot_cpus()' early, and 
'enable_nonboot_cpus()' late. Can you explain why that is wrong?

Linus


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
Thomas,

On Thursday, 20 September 2007 23:08, Thomas Gleixner wrote:
> Rafael,
> 
> On Thu, 2007-09-20 at 22:39 +0200, Rafael J. Wysocki wrote:
> > > Works as well. What's the difference between this and the real thing ?
> > 
> > The real thing also calls device_power_down(PMSG_FREEZE), which is a
> > counterpart of sysdev_shutdown(), more or less, and I think that's what goes
> > belly up.
> > 
> > You can use the patch below (on top of -rc6-mm1), which just disables the
> > image creation (that should be irrelevant anyway) and see what happens.
> 
> In the meantime I figured out what's happening. The ordering in
> hibernation_snapshot() is wrong. It does:
> 
>   swsusp_shrink_memory();
>   suspend_console();
>   device_suspend(PMSG_FREEZE);
>   platform_prepare(platform_mode);
>
>   disable_nonboot_cpus();
>
>   swsusp_suspend();
>
>   enable_nonboot_cpus();
>
>   platform_finish(platform_mode);
>   device_resume();
>   resume_console();
> 
> We disable everything in device_suspend()

No, we don't.  sysdevs are _not_ suspended in device_suspend().
They are suspended in device_power_down(), which is called
_after_ disable_nonboot_cpus() (from swsusp_suspend()).

> including timekeeping,

No, the timekeeping is suspended in device_power_down() (or at least it should
be).

> so any  code which is depending on working timekeeping and timer
> functionality (which is suspended in timekeeping_suspend() as well) is
> busted. 
> 
> enable_nonboot_cpus() definitely relies on working timekeeping and
> timers depending on the codepath. It's just a surprise that this did not
> blow up earlier (also before clock events).
> 
> I changed the ordering of the above to:
> 
>   disable_nonboot_cpus();
>
>   swsusp_shrink_memory();
>   suspend_console();
>   device_suspend(PMSG_FREEZE);
>   platform_prepare(platform_mode);
>   swsusp_suspend();
>   platform_finish(platform_mode);
>   device_resume();
>   resume_console();
>
>   enable_nonboot_cpus();

Actually, we can't do this here, because of ACPI and some interrupt handling
related problems.  Unfortunately, platform_finish() needs to go _after_
enable_nonboot_cpus() and device_resume() needs to go after platform_finish().
Analogously, disable_nonboot_cpus() has to go after platform_prepare().

Otherwise, some systems will break.
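
The three ordering constraints stated above can be written down as a small
userspace check. This is only a toy model, not kernel code: the step names are
plain strings copied from the listings in this thread, and step_pos(),
comes_before(), ordering_ok(), NSTEPS and the two arrays are hypothetical
helpers made up for illustration.

```c
#include <string.h>

/* Position of step `name` in the sequence, or -1 if it is missing. */
static int step_pos(const char *const seq[], int n, const char *name)
{
    for (int i = 0; i < n; i++)
        if (strcmp(seq[i], name) == 0)
            return i;
    return -1;
}

/* 1 if both steps are present and `first` comes before `second`. */
static int comes_before(const char *const seq[], int n,
                        const char *first, const char *second)
{
    int a = step_pos(seq, n, first);
    int b = step_pos(seq, n, second);

    return a >= 0 && b >= 0 && a < b;
}

/* Checks only the three constraints named in the mail, nothing else. */
static int ordering_ok(const char *const seq[], int n)
{
    return comes_before(seq, n, "platform_prepare", "disable_nonboot_cpus") &&
           comes_before(seq, n, "enable_nonboot_cpus", "platform_finish") &&
           comes_before(seq, n, "platform_finish", "device_resume");
}

/* Ordering currently used by hibernation_snapshot(), per the thread. */
static const char *const current_order[] = {
    "swsusp_shrink_memory", "suspend_console", "device_suspend",
    "platform_prepare", "disable_nonboot_cpus", "swsusp_suspend",
    "enable_nonboot_cpus", "platform_finish", "device_resume",
    "resume_console",
};

/* Ordering proposed in the quoted patch. */
static const char *const proposed_order[] = {
    "disable_nonboot_cpus", "swsusp_shrink_memory", "suspend_console",
    "device_suspend", "platform_prepare", "swsusp_suspend",
    "platform_finish", "device_resume", "resume_console",
    "enable_nonboot_cpus",
};

#define NSTEPS(a) ((int)(sizeof(a) / sizeof((a)[0])))
```

Called from a test program, ordering_ok(current_order, NSTEPS(current_order))
returns 1 and ordering_ok(proposed_order, NSTEPS(proposed_order)) returns 0,
because the proposed order disables the nonboot CPUs before platform_prepare().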

> and non-surprisingly the "my VAIO needs help from keyboard" problem went
> away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> work at all on my VAIO due to some yet not identified wreckage)

Hm, I really don't know why it helps, but that's not because of the timekeeping
suspend, IMO.

> I did not yet look into the suspend to ram code, but I guess that there
> is an equivalent problem.

Yes, the code ordering is the same, but it's not totally wrong, IMHO.

> But I have no idea why this affects Andrew's jinxed VAIO (UP machine),
> though I suspect that we have more timekeeping/timer depending code
> somewhere waiting to bite us.

That's possible.

> Also I still need to debug why the HIBERNATION_TEST code path (which has
> a msleep(5000) in it) does not fail,

See above. :-)

> but I postpone this until tomorrow morning. I'm dead tired after hunting
> this Heisenbug which changes with every other printk added to the code.
> I'm going to add some really noisy messages for everything which accesses
> timekeeping / timers _after_ those systems have been shut down.
> 
> We really need to fix this once and forever _before_ 2.6.23 final, even
> if it requires a -rc8.

Agreed.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 22:39 +0200, Rafael J. Wysocki wrote:
> > Works as well. What's the difference between this and the real thing ?
> 
> The real thing also calls device_power_down(PMSG_FREEZE), which is a
> counterpart of sysdev_shutdown(), more or less, and I think that's what goes
> belly up.
> 
> You can use the patch below (on top of -rc6-mm1), which just disables the
> image creation (that should be irrelevant anyway) and see what happens.

In the meantime I figured out what's happening. The ordering in
hibernation_snapshot() is wrong. It does:

swsusp_shrink_memory();
suspend_console();
device_suspend(PMSG_FREEZE);
platform_prepare(platform_mode);

disable_nonboot_cpus();

swsusp_suspend();

enable_nonboot_cpus();

platform_finish(platform_mode);
device_resume();
resume_console();

We disable everything in device_suspend() including timekeeping, so any
code which is depending on working timekeeping and timer functionality
(which is suspended in timekeeping_suspend() as well) is busted.

enable_nonboot_cpus() definitely relies on working timekeeping and
timers depending on the codepath. It's just a surprise that this did not
blow up earlier (also before clock events).

I changed the ordering of the above to:

disable_nonboot_cpus();

swsusp_shrink_memory();
suspend_console();
device_suspend(PMSG_FREEZE);
platform_prepare(platform_mode);
swsusp_suspend();
platform_finish(platform_mode);
device_resume();
resume_console();

enable_nonboot_cpus();

and non-surprisingly the "my VAIO needs help from keyboard" problem went
away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
work at all on my VAIO due to some yet not identified wreckage)

I did not yet look into the suspend to ram code, but I guess that there
is an equivalent problem.

But I have no idea why this affects Andrew's jinxed VAIO (UP machine),
though I suspect that we have more timekeeping/timer depending code
somewhere waiting to bite us.

Also I still need to debug why the HIBERNATION_TEST code path (which has
a msleep(5000) in it) does not fail, but I postpone this until tomorrow
morning. I'm dead tired after hunting this Heisenbug which changes with
every other printk added to the code. I'm going to add some really noisy
messages for everything which accesses timekeeping / timers _after_
those systems have been shut down.
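
The "noisy messages" idea can be sketched in userspace as a guard that every
time-related entry point calls first. This is an illustrative model under
stated assumptions, not the instrumentation actually added to the kernel: all
model_* names are hypothetical and no kernel API is implied.

```c
#include <stdio.h>

/* Hypothetical flag: nonzero once the modelled timekeeping core is down. */
static int model_timekeeping_suspended;

/* Number of accesses that arrived while timekeeping was suspended. */
static int model_late_accesses;

static void model_timekeeping_suspend(void)
{
    model_timekeeping_suspended = 1;
}

static void model_timekeeping_resume(void)
{
    model_timekeeping_suspended = 0;
}

/*
 * Call this at the top of every timer/timekeeping entry point.
 * Returns 1 if the access is legal, 0 if it came after suspend.
 */
static int model_check_time_access(const char *caller)
{
    if (!model_timekeeping_suspended)
        return 1;

    model_late_accesses++;
    fprintf(stderr, "LATE TIME ACCESS from %s (timekeeping is suspended)\n",
            caller);
    return 0;
}

/* Stand-in for a sleep-like helper: refuses to run when time is stopped. */
static int model_sleep_ms(unsigned int ms)
{
    (void)ms; /* the model does not actually sleep */
    return model_check_time_access(__func__) ? 0 : -1;
}
```

In this model a call to model_sleep_ms() made between model_timekeeping_suspend()
and model_timekeeping_resume() returns -1 and bumps model_late_accesses, which
is the kind of loud, countable signal that makes an ordering bug visible instead
of turning it into a silent hang.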

We really need to fix this once and forever _before_ 2.6.23 final, even
if it requires a -rc8.

Thanks,

tglx

--- a/kernel/power/disk.c   2007-09-11 09:25:24.0 +0200
+++ b/kernel/power/disk.c   2007-09-20 22:47:30.0 +0200
@@ -130,10 +130,14 @@ int hibernation_snapshot(int platform_mo
 {
 	int error;
 
+	error = disable_nonboot_cpus();
+	if (error)
+		goto resume_cpus;
+
 	/* Free memory before shutting down devices. */
 	error = swsusp_shrink_memory();
 	if (error)
-		return error;
+		goto resume_cpus;
 
 	suspend_console();
 	error = device_suspend(PMSG_FREEZE);
@@ -144,23 +148,22 @@ int hibernation_snapshot(int platform_mo
 	if (error)
 		goto Resume_devices;
 
-	error = disable_nonboot_cpus();
-	if (!error) {
-		if (hibernation_mode != HIBERNATION_TEST) {
-			in_suspend = 1;
-			error = swsusp_suspend();
-			/* Control returns here after successful restore */
-		} else {
-			printk("swsusp debug: Waiting for 5 seconds.\n");
-			mdelay(5000);
-		}
+	if (hibernation_mode != HIBERNATION_TEST) {
+		in_suspend = 1;
+		error = swsusp_suspend();
+		/* Control returns here after successful restore */
+	} else {
+		printk("swsusp debug: Waiting for 5 seconds.\n");
+		mdelay(5000);
 	}
-	enable_nonboot_cpus();
+
  Resume_devices:
 	platform_finish(platform_mode);
 	device_resume();
  Resume_console:
 	resume_console();
+resume_cpus:
+	enable_nonboot_cpus();
 	return error;
 }
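
The error unwinding that results from this patch can be modelled in userspace
to see which cleanup steps run on each failure path. This is a rough sketch,
not the kernel function: step(), trace, enum fail_at and
model_hibernation_snapshot() are hypothetical, the HIBERNATION_TEST branch is
left out, and the jump taken when device_suspend() fails is an assumption
because that line is not visible in the hunks above.

```c
#include <string.h>

/* Which modelled step should fail (hypothetical test knob). */
enum fail_at {
    FAIL_NONE,
    FAIL_DISABLE_CPUS,
    FAIL_SHRINK_MEMORY,
    FAIL_DEVICE_SUSPEND,
    FAIL_PLATFORM_PREPARE,
};

/* Space-separated record of the steps that ran. */
static char trace[256];

/* Record a step; return -1 if it is the one told to fail, else 0. */
static int step(const char *name, int fail)
{
    strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
    strncat(trace, " ", sizeof(trace) - strlen(trace) - 1);
    return fail ? -1 : 0;
}

static int model_hibernation_snapshot(enum fail_at f)
{
    int error;

    trace[0] = '\0';

    error = step("disable_nonboot_cpus", f == FAIL_DISABLE_CPUS);
    if (error)
        goto resume_cpus;

    error = step("swsusp_shrink_memory", f == FAIL_SHRINK_MEMORY);
    if (error)
        goto resume_cpus;

    step("suspend_console", 0);
    /* Assumption: this failure path is not shown in the patch hunks. */
    error = step("device_suspend", f == FAIL_DEVICE_SUSPEND);
    if (error)
        goto resume_console;

    error = step("platform_prepare", f == FAIL_PLATFORM_PREPARE);
    if (error)
        goto resume_devices;

    error = step("swsusp_suspend", 0);

resume_devices:
    step("platform_finish", 0);
    step("device_resume", 0);
resume_console:
    step("resume_console", 0);
resume_cpus:
    /* Every path, including early failures, ends by re-enabling CPUs. */
    step("enable_nonboot_cpus", 0);
    return error;
}
```

The point of the goto chain is that cleanup runs in the reverse order of setup
and that enable_nonboot_cpus is reached on every path, so a failure in any
earlier step cannot leave the nonboot CPUs offline in this model.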
 





Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 16:12, Rafael J. Wysocki wrote:
> On Thursday, 20 September 2007 15:43, Thomas Gleixner wrote:
> > On Thu, 2007-09-20 at 15:29 +0200, Rafael J. Wysocki wrote:
> > > > > I haven't had the time to check if any special command line
> > > > > arguments help.
> > > > > Will check tomorrow.
> > > > 
> > > > Can you please disable the patches, which I sent Linus wards:
> > > > 
> > > > timekeeping-access-rtc-outside-xtime-lock.patch
> > > > xtime-supsend-resume-fixup.patch
> > > > acpi-reevaluate-c-p-t-states.patch
> > > > clockevents-enforce-broadcast-on-resume.patch
> > > > clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
> > > > clockevents-prevent-stale-tick-update-on-offline-cpu.patch
> > > 
> > > I have skipped all of them, but the resulting kernel behaves in the same
> > > way (ie. doesn't boot).
> > > 
> > > > Without those patches you get the state of rc4-mm1. It would be
> > > > interesting to know which one interferes with the acpi stuff.
> > > 
> > > It looks like something else went in between -rc4 and -rc6 that broke your
> > > patch.  I wonder what it might be ...
> > 
> > Hmm. Can you please go back in the -hrt project history:
> > http://tglx.de/projects/hrtimers/2.6.23-rc5/patch-2.6.23-rc5-hrt1.patches.tar.bz2
> > http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2

Each of them on top of 2.6.23-rc6 gives the same symptoms as rc6-hrt2 (ie. the
box doesn't boot).

I'm going to check if -rc5 with patch-2.6.23-rc4-hrt1 on top of it works and
if not (I suspect so), I'll bisect the Linus' tree between -rc4 and -rc5 in
order to identify the responsible patch.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 17:49, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 16:50 +0200, Thomas Gleixner wrote:
> > > > > Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?
> > > > 
> > > > My jinxed VAIO variant is SMP, but it looks like the same mysterious
> > > > error.
> > > 
> > > Hm.  Have you tried
> > > 
> > > # echo test > /sys/power/disk
> > > # echo disk > /sys/power/state
> > > 
> > > (should suspend devices and disable the nonboot CPUs, wait for 5 sec. and
> > > restore everything)?
> > 
> > Works fine, but I need to reboot into a non debug kernel to verify.
> 
> Works as well. What's the difference between this and the real thing ?

The real thing also calls device_power_down(PMSG_FREEZE), which is a
counterpart of sysdev_shutdown(), more or less, and I think that's what goes
belly up.

You can use the patch below (on top of -rc6-mm1), which just disables the image
creation (that should be irrelevant anyway) and see what happens.

Greetings,
Rafael

---
 kernel/power/disk.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

Index: linux-2.6.23-rc6-mm1/kernel/power/disk.c
===
--- linux-2.6.23-rc6-mm1.orig/kernel/power/disk.c
+++ linux-2.6.23-rc6-mm1/kernel/power/disk.c
@@ -168,13 +168,14 @@ int create_image(int platform_mode)
 	}
 
 	save_processor_state();
-	error = swsusp_arch_suspend();
-	if (error)
-		printk(KERN_ERR "Error %d while creating the image\n", error);
+	//error = swsusp_arch_suspend();
+	//if (error)
+	//	printk(KERN_ERR "Error %d while creating the image\n", error);
 	/* Restore control flow magically appears here */
 	restore_processor_state();
-	if (!in_suspend)
-		platform_leave(platform_mode);
+	//if (!in_suspend)
+	//	platform_leave(platform_mode);
+	in_suspend = 0;
 	/* NOTE:  device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 16:50 +0200, Thomas Gleixner wrote:
> > > > Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?
> > > 
> > > My jinxed VAIO variant is SMP, but it looks like the same mysterious
> > > error.
> > 
> > Hm.  Have you tried
> > 
> > # echo test > /sys/power/disk
> > # echo disk > /sys/power/state
> > 
> > (should suspend devices and disable the nonboot CPUs, wait for 5 sec. and
> > restore everything)?
> 
> Works fine, but I need to reboot into a non debug kernel to verify.

Works as well. What's the difference between this and the real thing ?

tglx






Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 16:47 +0200, Rafael J. Wysocki wrote:
> On Thursday, 20 September 2007 15:53, Thomas Gleixner wrote:
> > On Thu, 2007-09-20 at 16:12 +0200, Rafael J. Wysocki wrote:
> > > > Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2: 
> > > 
> > > ie. the one on the Vaio (I assume).
> > > 
> > > > I'm still fishing in rather dark water. Depending on the added
> > > > instrumentation points the problem mutates up to the point where it
> > > > vanishes completely. The hang, which requires key strokes again, happens
> > > > consistently at the same place:
> > > > 
> > > > The notifier call in kernel/cpu.c::_cpu_up()
> > > > 
> > > >    ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod,
> > > >                                    hcpu, -1, &nr_calls);
> > > > 
> > > > does not return, but _all_ registered notifiers are called and reach
> > > > their return statement. This reminds me on:
> > > > 
> > > > http://lkml.org/lkml/2007/5/9/46
> > > > 
> > > > Sigh. I have no clue where to dig further.
> > > 
> > > Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?
> > 
> > My jinxed VAIO variant is SMP, but it looks like the same mysterious
> > error.
> 
> Hm.  Have you tried
> 
> # echo test > /sys/power/disk
> # echo disk > /sys/power/state
> 
> (should suspend devices and disable the nonboot CPUs, wait for 5 sec. and
> restore everything)?

Works fine, but I need to reboot into a non debug kernel to verify.

tglx





Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 15:53, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 16:12 +0200, Rafael J. Wysocki wrote:
> > > Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2: 
> > 
> > ie. the one on the Vaio (I assume).
> > 
> > > I'm still fishing in rather dark water. Depending on the added
> > > instrumentation points the problem mutates up to the point where it
> > > vanishes completely. The hang, which requires key strokes again, happens
> > > consistently at the same place:
> > > 
> > > The notifier call in kernel/cpu.c::_cpu_up()
> > > 
> > >    ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod,
> > >                                    hcpu, -1, &nr_calls);
> > > 
> > > does not return, but _all_ registered notifiers are called and reach
> > > their return statement. This reminds me on:
> > > 
> > > http://lkml.org/lkml/2007/5/9/46
> > > 
> > > Sigh. I have no clue where to dig further.
> > 
> > Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?
> 
> My jinxed VAIO variant is SMP, but it looks like the same mysterious
> error.

Hm.  Have you tried

# echo test > /sys/power/disk
# echo disk > /sys/power/state

(should suspend devices and disable the nonboot CPUs, wait for 5 sec. and
restore everything)?

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 16:12 +0200, Rafael J. Wysocki wrote:
> > Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2: 
> 
> ie. the one on the Vaio (I assume).
> 
> > I'm still fishing in rather dark water. Depending on the added
> > instrumentation points the problem mutates up to the point where it
> > vanishes completely. The hang, which requires key strokes again, happens
> > consistently at the same place:
> > 
> > The notifier call in kernel/cpu.c::_cpu_up()
> > 
> >    ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod,
> >                                    hcpu, -1, &nr_calls);
> > 
> > does not return, but _all_ registered notifiers are called and reach
> > their return statement. This reminds me on:
> > 
> > http://lkml.org/lkml/2007/5/9/46
> > 
> > Sigh. I have no clue where to dig further.
> 
> Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?

My jinxed VAIO variant is SMP, but it looks like the same mysterious
error.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 15:43, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 15:29 +0200, Rafael J. Wysocki wrote:
> > > > I haven't had the time to check if any special command line
> > > > arguments help.
> > > > Will check tomorrow.
> > > 
> > > Can you please disable the patches, which I sent Linus wards:
> > > 
> > > timekeeping-access-rtc-outside-xtime-lock.patch
> > > xtime-supsend-resume-fixup.patch
> > > acpi-reevaluate-c-p-t-states.patch
> > > clockevents-enforce-broadcast-on-resume.patch
> > > clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
> > > clockevents-prevent-stale-tick-update-on-offline-cpu.patch
> > 
> > I have skipped all of them, but the resulting kernel behaves in the same
> > way (ie. doesn't boot).
> > 
> > > Without those patches you get the state of rc4-mm1. It would be
> > > interesting to know which one interferes with the acpi stuff.
> > 
> > It looks like something else went in between -rc4 and -rc6 that broke your
> > patch.  I wonder what it might be ...
> 
> Hmm. Can you please go back in the -hrt project history:
> http://tglx.de/projects/hrtimers/2.6.23-rc5/patch-2.6.23-rc5-hrt1.patches.tar.bz2
> http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2

Sure, but it'll take some time. :-)

> Also, can you send me your .config file please ?

Attached is the one I'm using on 2.6.23-rc6 w/ your patches.

> Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2: 

ie. the one on the Vaio (I assume).

> I'm still fishing in rather dark water. Depending on the added
> instrumentation points the problem mutates up to the point where it
> vanishes completely. The hang, which requires key strokes again, happens
> consistently at the same place:
> 
> The notifier call in kernel/cpu.c::_cpu_up()
> 
>    ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
>                                    -1, &nr_calls);
> 
> does not return, but _all_ registered notifiers are called and reach
> their return statement. This reminds me on:
> 
> http://lkml.org/lkml/2007/5/9/46
> 
> Sigh. I have no clue where to dig further.

Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?

Greetings,
Rafael
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.23-rc6-hrt
# Thu Sep 20 14:26:03 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_NONIRQ_WAKEUP=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_NR_QUICK=2
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
# CONFIG_TASK_XACCT is not set
# CONFIG_USER_NS is not set
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_CPUSETS=y
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_BLK_DEV_BSG is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
# CONFIG_TICK_ONESHOT is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 15:29 +0200, Rafael J. Wysocki wrote:
> > > I haven't had the time to check if any special command line
> > > arguments help.
> > > Will check tomorrow.
> > 
> > Can you please disable the patches, which I sent Linus wards:
> > 
> > timekeeping-access-rtc-outside-xtime-lock.patch
> > xtime-supsend-resume-fixup.patch
> > acpi-reevaluate-c-p-t-states.patch
> > clockevents-enforce-broadcast-on-resume.patch
> > clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
> > clockevents-prevent-stale-tick-update-on-offline-cpu.patch
> 
> I have skipped all of them, but the resulting kernel behaves in the same
> way (ie. doesn't boot).
> 
> > Without those patches you get the state of rc4-mm1. It would be
> > interesting to know which one interferes with the acpi stuff.
> 
> It looks like something else went in between -rc4 and -rc6 that broke your
> patch.  I wonder what it might be ...

Hmm. Can you please go back in the -hrt project history:
http://tglx.de/projects/hrtimers/2.6.23-rc5/patch-2.6.23-rc5-hrt1.patches.tar.bz2
http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2

Also, can you send me your .config file please ?

Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2: 

I'm still fishing in rather dark water. Depending on the added
instrumentation points the problem mutates up to the point where it
vanishes completely. The hang, which requires key strokes again, happens
consistently at the same place:

The notifier call in kernel/cpu.c::_cpu_up()

   ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
                                   -1, &nr_calls);

does not return, but _all_ registered notifiers are called and reach
their return statement. This reminds me of:

http://lkml.org/lkml/2007/5/9/46

Sigh. I have no clue where to dig further.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 08:18, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 02:06 +0200, Rafael J. Wysocki wrote:
> > On Wednesday, 19 September 2007 21:21, Thomas Gleixner wrote:
> > > On Wed, 2007-09-19 at 19:44 +0200, Rafael J. Wysocki wrote:
> > > > > > It boots with nohpet alone and suspend/hibernation seem to work
> > > > > > (still, it didn't want to boot right after hibernation, but booted
> > > > > > after I'd switched it off/on manually).
> > > > > 
> > > > > Can you please check, whether
> > > > > 
> > > > > http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch
> > > > > 
> > > > > works for you ?
> > > > 
> > > > Nope.  It's a total disaster. :-(
> > > 
> > > True. I have instrumented it to the point where the broadcast device is
> > > programmed, but no interrupt comes in for totally unknown reasons.
> > > 
> > > > Doesn't boot at all, even with "noacpitimer nohpet", and that's with
> > > > NO_HZ and HIGH_RES_TIMERS unset.
> > > 
> > > > If you have a bisectable patch series, I can try to identify the
> > > > responsible patch.
> > > 
> > > http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patches.tar.bz2
> > > 
> > > The first patches in the queue are the mainline fixups.
> > 
> > It's x86_64-convert-to-clockevents.patch (ie. after applying it the box
> > stops to boot).
> > 
> > I haven't had the time to check if any special command line arguments help.
> > Will check tomorrow.
> 
> Can you please disable the patches, which I sent Linus wards:
> 
> timekeeping-access-rtc-outside-xtime-lock.patch
> xtime-supsend-resume-fixup.patch
> acpi-reevaluate-c-p-t-states.patch
> clockevents-enforce-broadcast-on-resume.patch
> clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
> clockevents-prevent-stale-tick-update-on-offline-cpu.patch

I have skipped all of them, but the resulting kernel behaves in the same
way (ie. doesn't boot).

> Without those patches you get the state of rc4-mm1. It would be
> interesting to know which one interferes with the acpi stuff.

It looks like something else went in between -rc4 and -rc6 that broke your
patch.  I wonder what it might be ...

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 02:06 +0200, Rafael J. Wysocki wrote:
> On Wednesday, 19 September 2007 21:21, Thomas Gleixner wrote:
> > On Wed, 2007-09-19 at 19:44 +0200, Rafael J. Wysocki wrote:
> > > > > It boots with nohpet alone and suspend/hibernation seem to work
> > > > > (still, it didn't want to boot right after hibernation, but booted
> > > > > after I'd switched it off/on manually).
> > > > 
> > > > Can you please check, whether
> > > > 
> > > > http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch
> > > > 
> > > > works for you ?
> > > 
> > > Nope.  It's a total disaster. :-(
> > 
> > True. I have instrumented it to the point where the broadcast device is
> > programmed, but no interrupt comes in for totally unknown reasons.
> > 
> > > Doesn't boot at all, even with "noacpitimer nohpet", and that's with
> > > NO_HZ and HIGH_RES_TIMERS unset.
> > 
> > > If you have a bisectable patch series, I can try to identify the
> > > responsible patch.
> > 
> > http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patches.tar.bz2
> > 
> > The first patches in the queue are the mainline fixups.
> 
> It's x86_64-convert-to-clockevents.patch (ie. after applying it the box stops
> to boot).
> 
> I haven't had the time to check if any special command line arguments help.
> Will check tomorrow.

Can you please disable the patches, which I sent Linus wards:

timekeeping-access-rtc-outside-xtime-lock.patch
xtime-supsend-resume-fixup.patch
acpi-reevaluate-c-p-t-states.patch
clockevents-enforce-broadcast-on-resume.patch
clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
clockevents-prevent-stale-tick-update-on-offline-cpu.patch

Without those patches you get the state of rc4-mm1. It would be
interesting to know which one interferes with the acpi stuff.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 02:06 +0200, Rafael J. Wysocki wrote:
 On Wednesday, 19 September 2007 21:21, Thomas Gleixner wrote:
  On Wed, 2007-09-19 at 19:44 +0200, Rafael J. Wysocki wrote:
 It boots with nohpet alone and suspend/hibernation seem to work 
 (still,
 it didn't want to boot right after hibernation, but booted after I'd 
 switched
 it off/on manually).

Can you please check, whether

http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch

works for you ?
   
   Nope.  It's a total disaster. :-(
  
  True. I have instrumented it to the point where the broadcast device is
  programmed, but no interrupt comes in for totally unknown reasons.
  
   Doesn't boot at all, even with noacpitimer nohpet, and that's with
   NO_HZ and HIGH_RES_TIMERS unset.
  
   If you have a bisectable patch series, I can try to identify the 
   responsible
   patch.
  
  http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patches.tar.bz2
  
  The first patches in the queue are the mainline fixups.
 
 It's x86_64-convert-to-clockevents.patch (ie. after applying it the box stops
 to boot).
 
 I haven't had the time to check if any special command line arguments help.
 Will check tomorrow.

Can you please disable the patches, which I sent Linus wards:

timekeeping-access-rtc-outside-xtime-lock.patch
xtime-supsend-resume-fixup.patch
acpi-reevaluate-c-p-t-states.patch
clockevents-enforce-broadcast-on-resume.patch
clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
clockevents-prevent-stale-tick-update-on-offline-cpu.patch

Without those patches you get the state of rc4-mm1. It would be
interesting to know which one interferes with the acpi stuff.

tglx


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 08:18, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 02:06 +0200, Rafael J. Wysocki wrote:
> > On Wednesday, 19 September 2007 21:21, Thomas Gleixner wrote:
> > > On Wed, 2007-09-19 at 19:44 +0200, Rafael J. Wysocki wrote:
> > > > > > It boots with nohpet alone and suspend/hibernation seem to work (still,
> > > > > > it didn't want to boot right after hibernation, but booted after I'd switched
> > > > > > it off/on manually).
> > > > >
> > > > > Can you please check, whether
> > > > >
> > > > > http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch
> > > > >
> > > > > works for you ?
> > > >
> > > > Nope.  It's a total disaster. :-(
> > >
> > > True. I have instrumented it to the point where the broadcast device is
> > > programmed, but no interrupt comes in for totally unknown reasons.
> > >
> > > > Doesn't boot at all, even with noacpitimer nohpet, and that's with
> > > > NO_HZ and HIGH_RES_TIMERS unset.
> > >
> > > > If you have a bisectable patch series, I can try to identify the responsible
> > > > patch.
> > >
> > > http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patches.tar.bz2
> > >
> > > The first patches in the queue are the mainline fixups.
> >
> > It's x86_64-convert-to-clockevents.patch (ie. after applying it the box stops
> > to boot).
> >
> > I haven't had the time to check if any special command line arguments help.
> > Will check tomorrow.
>
> Can you please disable the patches, which I sent Linus wards:
>
> timekeeping-access-rtc-outside-xtime-lock.patch
> xtime-supsend-resume-fixup.patch
> acpi-reevaluate-c-p-t-states.patch
> clockevents-enforce-broadcast-on-resume.patch
> clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
> clockevents-prevent-stale-tick-update-on-offline-cpu.patch

I have skipped all of them, but the resulting kernel behaves in the same
way (ie. doesn't boot).

> Without those patches you get the state of rc4-mm1. It would be
> interesting to know which one interferes with the acpi stuff.

It looks like something else went in between -rc4 and -rc6 that broke your
patch.  I wonder what it might be ...

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 15:29 +0200, Rafael J. Wysocki wrote:
> > > I haven't had the time to check if any special command line arguments help.
> > > Will check tomorrow.
> >
> > Can you please disable the patches, which I sent Linus wards:
> >
> > timekeeping-access-rtc-outside-xtime-lock.patch
> > xtime-supsend-resume-fixup.patch
> > acpi-reevaluate-c-p-t-states.patch
> > clockevents-enforce-broadcast-on-resume.patch
> > clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
> > clockevents-prevent-stale-tick-update-on-offline-cpu.patch
>
> I have skipped all of them, but the resulting kernel behaves in the same
> way (ie. doesn't boot).
>
> > Without those patches you get the state of rc4-mm1. It would be
> > interesting to know which one interferes with the acpi stuff.
>
> It looks like something else went in between -rc4 and -rc6 that broke your
> patch.  I wonder what it might be ...

Hmm. Can you please go back in the -hrt project history:
http://tglx.de/projects/hrtimers/2.6.23-rc5/patch-2.6.23-rc5-hrt1.patches.tar.bz2
http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2

Also, can you send me your .config file please ?

Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2: 

I'm still fishing in rather dark water. Depending on the added
instrumentation points the problem mutates up to the point where it
vanishes completely. The hang, which requires key strokes again, happens
consistently at the same place:

The notifier call in kernel/cpu.c::_cpu_up()

	ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
					-1, &nr_calls);

does not return, but _all_ registered notifiers are called and reach
their return statement. This reminds me on:

http://lkml.org/lkml/2007/5/9/46

Sigh. I have no clue where to dig further.

tglx
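
As background for the call quoted above: a raw notifier chain is a linked
list of callbacks that is walked in order. The sketch below is a simplified
userspace model of that walk, not the kernel implementation (no locking, no
RCU). The names call_chain and count_call are illustrative assumptions; the
NOTIFY_* values mirror include/linux/notifier.h. Passing -1 as the call
limit means "no limit", and nr_calls reports how many callbacks actually ran.

```c
#include <stddef.h>

/* Return codes as defined in include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000

struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	struct notifier_block *next;
};

/* Example callback (illustrative): bump the counter passed in data. */
int count_call(struct notifier_block *nb, unsigned long action, void *data)
{
	(void)nb;
	(void)action;
	(*(int *)data)++;
	return NOTIFY_OK;
}

/*
 * Walk the chain and call each notifier in turn. Stops early when a
 * callback sets NOTIFY_STOP_MASK or when nr_to_call reaches zero.
 */
int call_chain(struct notifier_block *head, unsigned long action,
	       void *data, int nr_to_call, int *nr_calls)
{
	int ret = NOTIFY_DONE;
	struct notifier_block *nb = head;

	while (nb && nr_to_call) {
		struct notifier_block *next = nb->next;

		ret = nb->notifier_call(nb, action, data);
		if (nr_calls)
			(*nr_calls)++;
		if (ret & NOTIFY_STOP_MASK)
			break;
		nb = next;
		nr_to_call--;
	}
	return ret;
}
```

In this model the walk returns as soon as the last callback has returned,
which is why a call that "does not return" although every notifier reached
its return statement is so puzzling.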




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 16:12 +0200, Rafael J. Wysocki wrote:
> > Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2:
>
> ie. the one on the Vaio (I assume).
>
> > I'm still fishing in rather dark water. Depending on the added
> > instrumentation points the problem mutates up to the point where it
> > vanishes completely. The hang, which requires key strokes again, happens
> > consistently at the same place:
> >
> > The notifier call in kernel/cpu.c::_cpu_up()
> >
> >	ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
> >					-1, &nr_calls);
> >
> > does not return, but _all_ registered notifiers are called and reach
> > their return statement. This reminds me on:
> >
> > http://lkml.org/lkml/2007/5/9/46
> >
> > Sigh. I have no clue where to dig further.
>
> Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?

My jinxed VAIO variant is SMP, but it looks like the same mysterious
error.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 15:43, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 15:29 +0200, Rafael J. Wysocki wrote:
> > > > I haven't had the time to check if any special command line arguments help.
> > > > Will check tomorrow.
> > >
> > > Can you please disable the patches, which I sent Linus wards:
> > >
> > > timekeeping-access-rtc-outside-xtime-lock.patch
> > > xtime-supsend-resume-fixup.patch
> > > acpi-reevaluate-c-p-t-states.patch
> > > clockevents-enforce-broadcast-on-resume.patch
> > > clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
> > > clockevents-prevent-stale-tick-update-on-offline-cpu.patch
> >
> > I have skipped all of them, but the resulting kernel behaves in the same
> > way (ie. doesn't boot).
> >
> > > Without those patches you get the state of rc4-mm1. It would be
> > > interesting to know which one interferes with the acpi stuff.
> >
> > It looks like something else went in between -rc4 and -rc6 that broke your
> > patch.  I wonder what it might be ...
>
> Hmm. Can you please go back in the -hrt project history:
> http://tglx.de/projects/hrtimers/2.6.23-rc5/patch-2.6.23-rc5-hrt1.patches.tar.bz2
> http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2

Sure, but it'll take some time. :-)

> Also, can you send me your .config file please ?

Attached is the one I'm using on 2.6.23-rc6 w/ your patches.

> Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2:

ie. the one on the Vaio (I assume).

> I'm still fishing in rather dark water. Depending on the added
> instrumentation points the problem mutates up to the point where it
> vanishes completely. The hang, which requires key strokes again, happens
> consistently at the same place:
>
> The notifier call in kernel/cpu.c::_cpu_up()
>
>	ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
>					-1, &nr_calls);
>
> does not return, but _all_ registered notifiers are called and reach
> their return statement. This reminds me on:
>
> http://lkml.org/lkml/2007/5/9/46
>
> Sigh. I have no clue where to dig further.

Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?

Greetings,
Rafael
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.23-rc6-hrt
# Thu Sep 20 14:26:03 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_NONIRQ_WAKEUP=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_NR_QUICK=2
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
# CONFIG_TASK_XACCT is not set
# CONFIG_USER_NS is not set
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_CPUSETS=y
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_BLK_DEV_BSG is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
# CONFIG_TICK_ONESHOT is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_X86_PC=y
# CONFIG_X86_VSMP is not set
CONFIG_MK8=y
# 

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 15:53, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 16:12 +0200, Rafael J. Wysocki wrote:
> > > Vs. the suspend / resume wreckage of rc6-mm1 / rc6-hrt2:
> >
> > ie. the one on the Vaio (I assume).
> >
> > > I'm still fishing in rather dark water. Depending on the added
> > > instrumentation points the problem mutates up to the point where it
> > > vanishes completely. The hang, which requires key strokes again, happens
> > > consistently at the same place:
> > >
> > > The notifier call in kernel/cpu.c::_cpu_up()
> > >
> > >	ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
> > >					-1, &nr_calls);
> > >
> > > does not return, but _all_ registered notifiers are called and reach
> > > their return statement. This reminds me on:
> > >
> > > http://lkml.org/lkml/2007/5/9/46
> > >
> > > Sigh. I have no clue where to dig further.
> >
> > Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?
>
> My jinxed VAIO variant is SMP, but it looks like the same mysterious
> error.

Hm.  Have you tried

# echo test > /sys/power/disk
# echo disk > /sys/power/state

(should suspend devices and disable the nonboot CPUs, wait for 5 sec. and
restore everything)?

Greetings,
Rafael
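
The two echo commands above can also be issued from a small C helper, which
is handy when the test cycle is driven from an instrumented test program.
This is only a sketch: write_attr and hibernation_test_cycle are
illustrative names, and the paths are passed in so the code can be tried
against ordinary files before pointing it at /sys/power/disk and
/sys/power/state (which requires root).

```c
#include <stdio.h>

/*
 * Write value plus a newline to path, the way "echo value > path" does.
 * Returns 0 on success and -1 on failure.
 */
int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%s\n", value) < 0) {
		fclose(f);
		return -1;
	}
	/* The data reaches the file when the stream is flushed on close. */
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Select the "test" hibernation mode, then trigger the cycle.
 * On a real system: disk_attr = "/sys/power/disk",
 *                   state_attr = "/sys/power/state".
 */
int hibernation_test_cycle(const char *disk_attr, const char *state_attr)
{
	if (write_attr(disk_attr, "test"))
		return -1;
	return write_attr(state_attr, "disk");
}
```

On a real system the second write is expected to block until the test cycle
has finished and the devices and CPUs have been restored.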


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
On Thu, 2007-09-20 at 16:50 +0200, Thomas Gleixner wrote:
> > > > Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?
> > >
> > > My jinxed VAIO variant is SMP, but it looks like the same mysterious
> > > error.
> >
> > Hm.  Have you tried
> >
> > # echo test > /sys/power/disk
> > # echo disk > /sys/power/state
> >
> > (should suspend devices and disable the nonboot CPUs, wait for 5 sec. and
> > restore everything)?
>
> Works fine, but I need to reboot into a non debug kernel to verify.

Works as well. What's the difference between this and the real thing ?

tglx






Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 17:49, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 16:50 +0200, Thomas Gleixner wrote:
> > > > > Well, the above may affect SMP systems, but the Vaio is UP.  Hmm?
> > > >
> > > > My jinxed VAIO variant is SMP, but it looks like the same mysterious
> > > > error.
> > >
> > > Hm.  Have you tried
> > >
> > > # echo test > /sys/power/disk
> > > # echo disk > /sys/power/state
> > >
> > > (should suspend devices and disable the nonboot CPUs, wait for 5 sec. and
> > > restore everything)?
> >
> > Works fine, but I need to reboot into a non debug kernel to verify.
>
> Works as well. What's the difference between this and the real thing ?

The real thing also calls device_power_down(PMSG_FREEZE), which is a
counterpart of sysdev_shutdown(), more or less, and I think that's what goes
belly up.

You can use the patch below (on top of -rc6-mm1), which just disables the image
creation (that should be irrelevant anyway) and see what happens.

Greetings,
Rafael

---
 kernel/power/disk.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Index: linux-2.6.23-rc6-mm1/kernel/power/disk.c
===================================================================
--- linux-2.6.23-rc6-mm1.orig/kernel/power/disk.c
+++ linux-2.6.23-rc6-mm1/kernel/power/disk.c
@@ -168,13 +168,14 @@ int create_image(int platform_mode)
 	}
 
 	save_processor_state();
-	error = swsusp_arch_suspend();
-	if (error)
-		printk(KERN_ERR "Error %d while creating the image\n", error);
+	//error = swsusp_arch_suspend();
+	//if (error)
+	//	printk(KERN_ERR "Error %d while creating the image\n", error);
 	/* Restore control flow magically appears here */
 	restore_processor_state();
-	if (!in_suspend)
-		platform_leave(platform_mode);
+	//if (!in_suspend)
+	//	platform_leave(platform_mode);
+	in_suspend = 0;
 	/* NOTE:  device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 16:12, Rafael J. Wysocki wrote:
> On Thursday, 20 September 2007 15:43, Thomas Gleixner wrote:
> > On Thu, 2007-09-20 at 15:29 +0200, Rafael J. Wysocki wrote:
> > > > > I haven't had the time to check if any special command line arguments help.
> > > > > Will check tomorrow.
> > > >
> > > > Can you please disable the patches, which I sent Linus wards:
> > > >
> > > > timekeeping-access-rtc-outside-xtime-lock.patch
> > > > xtime-supsend-resume-fixup.patch
> > > > acpi-reevaluate-c-p-t-states.patch
> > > > clockevents-enforce-broadcast-on-resume.patch
> > > > clockevents-do-not-shutdown-broadcast-device-in-oneshot-mode.patch
> > > > clockevents-prevent-stale-tick-update-on-offline-cpu.patch
> > >
> > > I have skipped all of them, but the resulting kernel behaves in the same
> > > way (ie. doesn't boot).
> > >
> > > > Without those patches you get the state of rc4-mm1. It would be
> > > > interesting to know which one interferes with the acpi stuff.
> > >
> > > It looks like something else went in between -rc4 and -rc6 that broke your
> > > patch.  I wonder what it might be ...
> >
> > Hmm. Can you please go back in the -hrt project history:
> > http://tglx.de/projects/hrtimers/2.6.23-rc5/patch-2.6.23-rc5-hrt1.patches.tar.bz2
> > http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2

Each of them on top of 2.6.23-rc6 gives the same symptoms as rc6-hrt2 (ie. the
box doesn't boot).

I'm going to check if -rc5 with patch-2.6.23-rc4-hrt1 on top of it works and
if not (I suspect so), I'll bisect the Linus' tree between -rc4 and -rc5 in
order to identify the responsible patch.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 22:39 +0200, Rafael J. Wysocki wrote:
> > Works as well. What's the difference between this and the real thing ?
>
> The real thing also calls device_power_down(PMSG_FREEZE), which is a
> counterpart of sysdev_shutdown(), more or less, and I think that's what goes
> belly up.
>
> You can use the patch below (on top of -rc6-mm1), which just disables the image
> creation (that should be irrelevant anyway) and see what happens.

In meantime I figured out what's happening. The ordering in
hibernate_snapshot() is wrong. It does:

	swsusp_shrink_memory();
	suspend_console();
	device_suspend(PMSG_FREEZE);
	platform_prepare(platform_mode);

	disable_nonboot_cpus();

	swsusp_suspend();

	enable_nonboot_cpus();

	platform_finish(platform_mode);
	device_resume();
	resume_console();

We disable everything in device_suspend() including timekeeping, so any
code which is depending on working timekeeping and timer functionality
(which is suspended in timekeeping_suspend() as well) is busted.

enable_nonboot_cpus() definitely relies on working timekeeping and
timers depending on the codepath. It's just a surprise that this did not
blow up earlier (also before clock events).

I changed the ordering of the above to:

	disable_nonboot_cpus();

	swsusp_shrink_memory();
	suspend_console();
	device_suspend(PMSG_FREEZE);
	platform_prepare(platform_mode);
	swsusp_suspend();
	platform_finish(platform_mode);
	device_resume();
	resume_console();

	enable_nonboot_cpus();

and non-surprisingly the "my VAIO needs help from keyboard" problem went
away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
work at all on my VAIO due to some yet not identified wreckage)

I did not yet look into the suspend to ram code, but I guess that there
is an equivalent problem.

But I have no idea why this affects Andrews jinxed VAIO (UP machine),
though I suspect that we have more timekeeping/timer depending code
somewhere waiting to bite us.

Also I still need to debug why the HIBERNATION_TEST code path (which has
a msleep(5000) in it) does not fail, but I postpone this until tomorrow
morning. I'm dead tired after hunting this Heisenbug which changes with
every other printk added to the code. I'm going to add some really noisy
messages for everything which accesses timekeeping / timers _after_
those systems have been shut down.

We really need to fix this once and forever _before_ 2.6.23 final, even
if it requires a -rc8.

Thanks,

tglx

--- a/kernel/power/disk.c   2007-09-11 09:25:24.0 +0200
+++ b/kernel/power/disk.c   2007-09-20 22:47:30.0 +0200
@@ -130,10 +130,14 @@ int hibernation_snapshot(int platform_mo
 {
 	int error;
 
+	error = disable_nonboot_cpus();
+	if (error)
+		goto resume_cpus;
+
 	/* Free memory before shutting down devices. */
 	error = swsusp_shrink_memory();
 	if (error)
-		return error;
+		goto resume_cpus;
 
 	suspend_console();
 	error = device_suspend(PMSG_FREEZE);
@@ -144,23 +148,22 @@ int hibernation_snapshot(int platform_mo
 	if (error)
 		goto Resume_devices;
 
-	error = disable_nonboot_cpus();
-	if (!error) {
-		if (hibernation_mode != HIBERNATION_TEST) {
-			in_suspend = 1;
-			error = swsusp_suspend();
-			/* Control returns here after successful restore */
-		} else {
-			printk("swsusp debug: Waiting for 5 seconds.\n");
-			mdelay(5000);
-		}
+	if (hibernation_mode != HIBERNATION_TEST) {
+		in_suspend = 1;
+		error = swsusp_suspend();
+		/* Control returns here after successful restore */
+	} else {
+		printk("swsusp debug: Waiting for 5 seconds.\n");
+		mdelay(5000);
 	}
-	enable_nonboot_cpus();
+
  Resume_devices:
 	platform_finish(platform_mode);
 	device_resume();
  Resume_console:
 	resume_console();
+resume_cpus:
+	enable_nonboot_cpus();
 	return error;
 }
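
The control flow that results from the patch above can be tried out in
userspace. The following is a sketch under stated assumptions, not kernel
code: every step is replaced by a stub that only records its name,
snapshot_model and step are illustrative names, fail_at is a hypothetical
knob that makes one step fail, and the jump taken when device_suspend()
fails is assumed to go to the console-resume label because that part of the
function is not visible in the hunks.

```c
#include <string.h>

static char trace[256];	/* names of the steps, in the order they ran */
static int fail_at;	/* hypothetical knob: step number that fails, 0 = none */

/* Record the step; step number 0 marks steps that cannot fail here. */
static int step(int idx, const char *name)
{
	strcat(trace, name);
	strcat(trace, " ");
	return (idx != 0 && idx == fail_at) ? -1 : 0;
}

int snapshot_model(int fail)
{
	int error;

	trace[0] = '\0';
	fail_at = fail;

	error = step(1, "disable_nonboot_cpus");
	if (error)
		goto resume_cpus;
	error = step(2, "swsusp_shrink_memory");
	if (error)
		goto resume_cpus;
	step(0, "suspend_console");
	error = step(3, "device_suspend");
	if (error)
		goto resume_console;	/* assumed, see above */
	error = step(4, "platform_prepare");
	if (error)
		goto resume_devices;
	error = step(5, "swsusp_suspend");
resume_devices:
	step(0, "platform_finish");
	step(0, "device_resume");
resume_console:
	step(0, "resume_console");
resume_cpus:
	step(0, "enable_nonboot_cpus");
	return error;
}
```

Whichever step fails, the model always ends with enable_nonboot_cpus, which
is the property the added resume_cpus label is there to provide.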
 





Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
Thomas,

On Thursday, 20 September 2007 23:08, Thomas Gleixner wrote:
> Rafael,
>
> On Thu, 2007-09-20 at 22:39 +0200, Rafael J. Wysocki wrote:
> > > Works as well. What's the difference between this and the real thing ?
> >
> > The real thing also calls device_power_down(PMSG_FREEZE), which is a
> > counterpart of sysdev_shutdown(), more or less, and I think that's what goes
> > belly up.
> >
> > You can use the patch below (on top of -rc6-mm1), which just disables the image
> > creation (that should be irrelevant anyway) and see what happens.
>
> In meantime I figured out what's happening. The ordering in
> hibernate_snapshot() is wrong. It does:
>
>	swsusp_shrink_memory();
>	suspend_console();
>	device_suspend(PMSG_FREEZE);
>	platform_prepare(platform_mode);
>
>	disable_nonboot_cpus();
>
>	swsusp_suspend();
>
>	enable_nonboot_cpus();
>
>	platform_finish(platform_mode);
>	device_resume();
>	resume_console();
>
> We disable everything in device_suspend()

No, we don't.  sysdevs are _not_ suspended in device_suspend().
They are suspended in device_power_down(), which is called
_after_ disable_nonboot_cpus() (from swsusp_suspend()).

> including timekeeping,

No, the timekeeping is suspended in device_power_down() (or at least it should
be).

> so any code which is depending on working timekeeping and timer
> functionality (which is suspended in timekeeping_suspend() as well) is
> busted.
>
> enable_nonboot_cpus() definitely relies on working timekeeping and
> timers depending on the codepath. It's just a surprise that this did not
> blow up earlier (also before clock events).
>
> I changed the ordering of the above to:
>
>	disable_nonboot_cpus();
>
>	swsusp_shrink_memory();
>	suspend_console();
>	device_suspend(PMSG_FREEZE);
>	platform_prepare(platform_mode);
>	swsusp_suspend();
>	platform_finish(platform_mode);
>	device_resume();
>	resume_console();
>
>	enable_nonboot_cpus();

Actually, we can't do this here, because of ACPI and some interrupt handling
related problems.  Unfortunately, platform_finish() needs to go _after_
enable_nonboot_cpus() and device_resume() needs to go after platform_finish().
Analogously, disable_nonboot_cpus() has to go after platform_prepare().

Otherwise, some systems will break.

> and non-surprisingly the "my VAIO needs help from keyboard" problem went
> away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> work at all on my VAIO due to some yet not identified wreckage)

Hm, I really don't know why it helps, but that's not because of the timekeeping
suspend, IMO.

> I did not yet look into the suspend to ram code, but I guess that there
> is an equivalent problem.

Yes, the code ordering is the same, but it's not totally wrong, IMHO.

> But I have no idea why this affects Andrews jinxed VAIO (UP machine),
> though I suspect that we have more timekeeping/timer depending code
> somewhere waiting to bite us.

That's possible.

> Also I still need to debug why the HIBERNATION_TEST code path (which has
> a msleep(5000) in it) does not fail,

See above. :-)

> but I postpone this until tomorrow morning. I'm dead tired after hunting
> this Heisenbug which changes with every other printk added to the code.
> I'm going to add some really noisy messages for everything which accesses
> timekeeping / timers _after_ those systems have been shut down.
>
> We really need to fix this once and forever _before_ 2.6.23 final, even
> if it requires a -rc8.

Agreed.

Greetings,
Rafael
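
The ordering constraints stated in the message above (platform_prepare()
before disable_nonboot_cpus(), enable_nonboot_cpus() before
platform_finish(), platform_finish() before device_resume()) can be written
down as a small check. This is a sketch for experimenting with candidate
orderings, not kernel code: the enum and the helpers step_index,
comes_before and pm_order_ok are illustrative names, and only the three
constraints named in the message are encoded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Steps named after the calls discussed in the thread. */
enum pm_step {
	PLATFORM_PREPARE,
	DISABLE_NONBOOT_CPUS,
	ENABLE_NONBOOT_CPUS,
	PLATFORM_FINISH,
	DEVICE_RESUME
};

/* Position of step s in the sequence, or -1 if it does not occur. */
static int step_index(const enum pm_step *seq, size_t n, enum pm_step s)
{
	for (size_t i = 0; i < n; i++)
		if (seq[i] == s)
			return (int)i;
	return -1;
}

/* True if both steps occur and a comes before b. */
static bool comes_before(const enum pm_step *seq, size_t n,
			 enum pm_step a, enum pm_step b)
{
	int ia = step_index(seq, n, a);
	int ib = step_index(seq, n, b);

	return ia >= 0 && ib >= 0 && ia < ib;
}

/* Check a candidate ordering against the three stated constraints. */
bool pm_order_ok(const enum pm_step *seq, size_t n)
{
	return comes_before(seq, n, PLATFORM_PREPARE, DISABLE_NONBOOT_CPUS) &&
	       comes_before(seq, n, ENABLE_NONBOOT_CPUS, PLATFORM_FINISH) &&
	       comes_before(seq, n, PLATFORM_FINISH, DEVICE_RESUME);
}
```

The existing ordering passes this check, while the proposed reordering
(CPUs down first, CPUs up last) violates the first two constraints.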


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Linus Torvalds


On Thu, 20 Sep 2007, Thomas Gleixner wrote:
 
> In meantime I figured out what's happening. The ordering in
> hibernate_snapshot() is wrong. It does:

Hmm. This is close to the ordering we have in STR too.

I have some dim memory of there being some ACPI reason why it had to be 
done that way.

In fact, this was done in commit e3c7db621bed4afb8e231cb005057f2feb5db557, 
long ago, by Rafael:

    As indicated in a recent thread on Linux-PM, it's necessary to call
    pm_ops->finish() before device_resume(), but enable_nonboot_cpus() has to be
    called before pm_ops->finish() (cf.
    http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html).  For
    consistency, it seems reasonable to call disable_nonboot_cpus() after
    device_suspend().

    This way the suspend code will remain symmetrical with respect to the resume
    code and it may allow us to speed up things in the future by suspending and
    resuming devices and/or saving the suspend image in many threads.

    The following series of patches reorders the suspend and resume code so that
    nonboot CPUs are disabled after devices have been suspended and enabled before
    the devices are resumed.  It also causes pm_ops->finish() to be called after
    enable_nonboot_cpus() wherever necessary.

Hmm?

It's entirely possible that that commit was simply just buggy, and we 
should indeed move the CPU down/up to be early/late - we've fixed other 
ordering issues since that commit went in. But this whole area is very 
murky.

(Btw, the above commit message points to just my response with a testing 
patch to the real email: the actual explanation of the INSANE ordering is 
from Len Brown in


https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html

and there Len claims that we *must* wake up CPU's early).

I personally think that the whole ACPI ordering requirements are just 
insane, but the point of this email is to point these different 
requirements out, and hopefully we can get something that works for 
everybody.

Len added to Cc.

Len? Thomas wants to call 'disable_nonboot_cpus()' early, and 
'enable_nonboot_cpus()' late. Can you explain why that is wrong?

Linus


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Thursday, 20 September 2007 23:35, Linus Torvalds wrote:
 
> On Thu, 20 Sep 2007, Thomas Gleixner wrote:
> >
> > In meantime I figured out what's happening. The ordering in
> > hibernate_snapshot() is wrong. It does:

Actually, this is incorrect.  Please read my reply to Thomas, just sent.

> Hmm. This is close to the ordering we have in STR too.
>
> I have some dim memory of there being some ACPI reason why it had to be
> done that way.

Yes.  We're executing _INI from the CPU initialization code and that shouldn't
be done after _WAK, which is called from platform_finish().

> In fact, this was done in commit e3c7db621bed4afb8e231cb005057f2feb5db557,
> long ago, by Rafael:
>
>     As indicated in a recent thread on Linux-PM, it's necessary to call
>     pm_ops->finish() before device_resume(), but enable_nonboot_cpus() has to be
>     called before pm_ops->finish() (cf.
>     http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html).  For
>     consistency, it seems reasonable to call disable_nonboot_cpus() after
>     device_suspend().
>
>     This way the suspend code will remain symmetrical with respect to the resume
>     code and it may allow us to speed up things in the future by suspending and
>     resuming devices and/or saving the suspend image in many threads.
>
>     The following series of patches reorders the suspend and resume code so that
>     nonboot CPUs are disabled after devices have been suspended and enabled before
>     the devices are resumed.  It also causes pm_ops->finish() to be called after
>     enable_nonboot_cpus() wherever necessary.
>
> Hmm?
>
> It's entirely possible that that commit was simply just buggy, and we
> should indeed move the CPU down/up to be early/late - we've fixed other
> ordering issues since that commit went in. But this whole area is very
> murky.
>
> (Btw, the above commit message points to just my response with a testing
> patch to the real email: the actual explanation of the INSANE ordering is
> from Len Brown in
>
>	https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
>
> and there Len claims that we *must* wake up CPU's early).
>
> I personally think that the whole ACPI ordering requirements are just
> insane, but the point of this email is to point these different
> requirements out, and hopefully we can get something that works for
> everybody.

Sure.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
> > We disable everything in device_suspend()
>
> No, we don't.  sysdevs are _not_ suspended in device_suspend().
> They are suspended in device_power_down(), which is called
> _after_ disable_nonboot_cpus() (from swsusp_suspend()).
>
> > including timekeeping,
>
> No, the timekeeping is suspended in device_power_down() (or at least it should
> be).

Damn, you are right. Reading through 30 different logs confused me.

> >	enable_nonboot_cpus();
>
> Actually, we can't do this here, because of ACPI and some interrupt handling
> related problems.  Unfortunately, platform_finish() needs to go _after_
> enable_nonboot_cpus() and device_resume() needs to go after platform_finish().
> Analogously, disable_nonboot_cpus() has to go after platform_prepare().
>
> Otherwise, some systems will break.

Well, I don't buy this one. The system would break in the same way, when
I take CPU#1 offline before I initiate the suspend.

> > and non-surprisingly the "my VAIO needs help from keyboard" problem went
> > away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> > work at all on my VAIO due to some yet not identified wreckage)
>
> Hm, I really don't know why it helps, but that's not because of the timekeeping
> suspend, IMO.

It is related. We rely on some subtle thing which is not up when we
resume the non boot cpu.

> > I did not yet look into the suspend to ram code, but I guess that there
> > is an equivalent problem.
>
> Yes, the code ordering is the same, but it's not totally wrong, IMHO.
>
> > But I have no idea why this affects Andrews jinxed VAIO (UP machine),
> > though I suspect that we have more timekeeping/timer depending code
> > somewhere waiting to bite us.
>
> That's possible.
>
> > Also I still need to debug why the HIBERNATION_TEST code path (which has
> > a msleep(5000) in it) does not fail,
>
> See above. :-)

Yes. It makes sense. When I change the TEST code path to:

-   printk(swsusp debug: Waiting for 5 seconds.\n);
-   msleep(5000);
+   printk(swsusp debug: before swsusp_suspend\n);
+   error = swsusp_suspend();

then I have the same effect as I get from real hibernation. And we
actually shut down time keeping somewhere in that code path.

ACPI: PCI interrupt for device :00:1b.0 disabled
swsusp debug: before swsusp_suspend
Suspend timekeeping
swsusp: critical section: 
swsusp: Need to copy 112429 pages
swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
swsusp: critical section: done (112429 pages copied)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Resume timekeeping
ACPI: PCI Interrupt :00:02.0[A] - GSI 16 (level, low) - IRQ 16
- works fine

This is with my patch applied. Without that I get:

CPU1 is down
swsusp debug: before swsusp_suspend
Suspend timekeeping
swsusp: critical section: 
swsusp: Need to copy 112429 pages
swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
swsusp: critical section: done (112429 pages copied)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Resume timekeeping
Enabling non-boot CPUs
-- Waits for ever until a key is pressed

Thanks,

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Linus Torvalds


On Thu, 20 Sep 2007, Linus Torvalds wrote:
> 
> (Btw, the above commit message points to just my response with a testing 
> patch to the real email: the actual explanation of the INSANE ordering is 
> from Len Brown in
> 
>   https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> 
> and there Len claims that we *must* wake up CPU's early).

..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 

However, it seems that bugzilla entry may just be bogus. It talks about 
"it appears that some firmware in the future may depend on that sequence 
for correction operation"

Len, Shaohua, what are the real issues here? 

It would indeed be nice if we could just take CPU's down early (while 
everything is working), and run the whole suspend code with just one CPU, 
rather than having to worry about the ordering between CPU and device 
takedown.

That said, at least with STR, the situation is:

 1) suspend_console
 2)   device_suspend(PMSG_SUSPEND)      (== ->suspend)
 3) disable_nonboot_cpus()
 4)   device_power_down(PMSG_SUSPEND)   (== ->suspend_late)
 5) pm_ops->enter()
 6)   device_power_up()                 (== ->resume_early)
 7) enable_nonboot_cpus()
 8) pm_finish()
 9)   device_resume()                   (== ->resume)
10) resume_console

So if we agree that things like timers etc should *never* be suspended by 
the early suspend, and *always* use suspend_late/resume_early, then at 
least STR should be ok.
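
The ordering argument above can be modelled in a few lines of C. This is
only a sketch under stated assumptions: the step names are taken from the
list, while str_sequence, step_index() and alive_during_cpu_hotplug() are
hypothetical helpers made up for illustration, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of the STR sequence; names copied from the list, not kernel code. */
static const char *const str_sequence[] = {
    "suspend_console",
    "device_suspend",        /* ->suspend */
    "disable_nonboot_cpus",
    "device_power_down",     /* ->suspend_late */
    "pm_ops->enter",
    "device_power_up",       /* ->resume_early */
    "enable_nonboot_cpus",
    "pm_finish",
    "device_resume",         /* ->resume */
    "resume_console",
};

#define STR_STEPS (sizeof(str_sequence) / sizeof(str_sequence[0]))

/* Return the position of a step in the model, or -1 if it is unknown. */
static int step_index(const char *name)
{
    for (size_t i = 0; i < STR_STEPS; i++)
        if (strcmp(str_sequence[i], name) == 0)
            return (int)i;
    return -1;
}

/*
 * Hypothetical check: something suspended in step `down` and resumed in
 * step `up` is still running while the non-boot CPUs are taken down and
 * brought back only if `down` comes after disable_nonboot_cpus and `up`
 * comes before enable_nonboot_cpus.
 */
static int alive_during_cpu_hotplug(const char *down, const char *up)
{
    int d = step_index(down);
    int u = step_index(up);

    if (d < 0 || u < 0)
        return 0;
    return d > step_index("disable_nonboot_cpus") &&
           u < step_index("enable_nonboot_cpus");
}
```

In this model, alive_during_cpu_hotplug("device_power_down",
"device_power_up") holds, while the pair "device_suspend"/"device_resume"
does not, which is the reason for wanting timers on the
suspend_late/resume_early list.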

And I think that's a damn reasonable thing to agree on: timers (and 
anything else that CPU shutdown/bringup could *possibly* care about) 
should be considered core enough that they had better be on the 
suspend_late/resume_early list.

Thomas, Rafael, can you verify that at least STR is ok in this respect?

Linus


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 23:54 +0200, Rafael J. Wysocki wrote:
> > Hmm. This is close to the ordering we have in STR too.
> > 
> > I have some dim memory of there being some ACPI reason why it had to be 
> > done that way.
> 
> Yes.  We're executing _INI from the CPU initialization code and that shouldn't
> be done after _WAK, which is called from platform_finish().

If I tear down CPU#1 right before I tell the kernel to hibernate, then
the box must explode in the same way. It does not. On none of 4 tested
laptops. 

Of course only the jinxed VAIO one exposes the "please press a key"
problem.

I need to follow down the swsusp_suspend() code path to figure out, why
this breaks the box.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Linus,

On Thu, 2007-09-20 at 14:55 -0700, Linus Torvalds wrote:
> And I think that's a damn reasonable thing to agree on: timers (and 
> anything else that CPU shutdown/bringup could *possibly* care about) 
> should be considered core enough that they had better be on the 
> suspend_late/resume_early list.
> 
> Thomas, Rafael, can you verify that at least STR is ok in this respect?

-ETOOTIRED led me to a wrong conclusion, but still it is a valuable
hint that this change is making things work again. I need to go down
into the details of the swsusp_suspend() code path to figure out what's
the root cause.

Sorry for the noise, but I'm zooming in.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
On Friday, 21 September 2007 00:05, Thomas Gleixner wrote:
> Linus,
> 
> On Thu, 2007-09-20 at 14:55 -0700, Linus Torvalds wrote:
> > And I think that's a damn reasonable thing to agree on: timers (and 
> > anything else that CPU shutdown/bringup could *possibly* care about) 
> > should be considered core enough that they had better be on the 
> > suspend_late/resume_early list.
> > 
> > Thomas, Rafael, can you verify that at least STR is ok in this respect?
> 
> -ETOOTIRED led me to a wrong conclusion, but still it is a valuable
> hint that this change is making things work again.

Yes, it is.

> I need to go down into the details of the swsusp_suspend() code path to
> figure out what's the root cause.

If you need any help from me with that, please let me know.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Rafael J. Wysocki
Thomas,

On Thursday, 20 September 2007 23:53, Thomas Gleixner wrote:
> Rafael,
> 
> On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
> > > We disable everything in device_suspend()
> > 
> > No, we don't.  sysdevs are _not_ suspended in device_suspend().
> > They are suspended in device_power_down(), which is called
> > _after_ disable_nonboot_cpus() (from swsusp_suspend()).
> > 
> > > including timekeeping,
> > 
> > No, the timekeeping is suspended in device_power_down() (or at least it
> > should be).
> 
> Damn, you are right. Reading through 30 different logs confused me.
> 
> > > enable_nonboot_cpus();
> > 
> > Actually, we can't do this here, because of ACPI and some interrupt handling
> > related problems.  Unfortunately, platform_finish() needs to go _after_
> > enable_nonboot_cpus() and device_resume() needs to go after
> > platform_finish().
> > Analogously, disable_nonboot_cpus() has to go after platform_prepare().
> > 
> > Otherwise, some systems will break.
> 
> Well, I don't buy this one. The system would break in the same way, when
> I take CPU#1 offline before I initiate the suspend.

I was referring to the resume part.  If we call enable_nonboot_cpus(), which
executes the _INI ACPI control method, after platform_finish(), which executes
the _WAK global ACPI control method, things will break.  That already happened
in the past, when the code ordering was different, AFAICS.
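
The resume-side constraint described here can be written down as a small
checker. This is a sketch, not kernel code: struct resume_trace, record(),
the model_* stand-ins and ini_never_after_wak() are all hypothetical names;
the only facts taken from this thread are that enable_nonboot_cpus() leads
to _INI being executed and that platform_finish() executes _WAK.

```c
#include <assert.h>
#include <string.h>

#define MAX_EVENTS 8

/* Records the order in which the modelled ACPI methods are evaluated. */
struct resume_trace {
    const char *events[MAX_EVENTS];
    int count;
};

static void record(struct resume_trace *t, const char *event)
{
    if (t->count < MAX_EVENTS)
        t->events[t->count++] = event;
}

/* Stand-in for enable_nonboot_cpus(): CPU bringup executes _INI. */
static void model_enable_nonboot_cpus(struct resume_trace *t)
{
    record(t, "_INI");
}

/* Stand-in for platform_finish(): it executes the global _WAK method. */
static void model_platform_finish(struct resume_trace *t)
{
    record(t, "_WAK");
}

/* Return 1 if no _INI was recorded after the first _WAK, otherwise 0. */
static int ini_never_after_wak(const struct resume_trace *t)
{
    int seen_wak = 0;

    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->events[i], "_WAK") == 0)
            seen_wak = 1;
        else if (seen_wak && strcmp(t->events[i], "_INI") == 0)
            return 0;
    }
    return 1;
}
```

Calling model_enable_nonboot_cpus() before model_platform_finish() passes
the check; swapping the two calls, which is what moving
enable_nonboot_cpus() later would amount to, fails it.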

> > > and non-surprisingly the "my VAIO needs help from keyboard" problem went
> > > away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> > > work at all on my VAIO due to some yet not identified wreckage)
> > 
> > Hm, I really don't know why it helps, but that's not because of the
> > timekeeping suspend, IMO.
> 
> It is related. We rely on some subtle thing which is not up when we
> resume the non boot cpu.

Yes, it looks so.

> > > I did not yet look into the suspend to ram code, but I guess that there
> > > is an equivalent problem.
> > 
> > Yes, the code ordering is the same, but it's not totally wrong, IMHO.
> > 
> > > But I have no idea why this affects Andrews jinxed VAIO (UP machine),
> > > though I suspect that we have more timekeeping/timer depending code
> > > somewhere waiting to bite us.
> > 
> > That's possible.
> > 
> > > Also I still need to debug why the HIBERNATION_TEST code path (which has
> > > a msleep(5000) in it) does not fail,
> > 
> > See above. :-)
> 
> Yes. It makes sense. When I change the TEST code path to:
> 
> -   printk("swsusp debug: Waiting for 5 seconds.\n");
> -   msleep(5000);
> +   printk("swsusp debug: before swsusp_suspend\n");
> +   error = swsusp_suspend();
> 
> then I have the same effect as I get from real hibernation. And we
> actually shut down time keeping somewhere in that code path.
> 
> ACPI: PCI interrupt for device :00:1b.0 disabled
> swsusp debug: before swsusp_suspend
> Suspend timekeeping

Exactly.  timekeeping_suspend() is called from device_power_down(), which is
called from swsusp_suspend() (after disabling interrupts).

> swsusp: critical section: 
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> ACPI: PCI Interrupt :00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
> -> works fine
> 
> This is with my patch applied. Without that I get:
> 
> CPU1 is down
> swsusp debug: before swsusp_suspend
> Suspend timekeeping
> swsusp: critical section: 
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> Enabling non-boot CPUs
> --> Waits for ever until a key is pressed

Well, perhaps there's something else that we should suspend late and resume
early, but we don't?

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Len Brown
On Thursday 20 September 2007 17:55, Linus Torvalds wrote:
> 
> On Thu, 20 Sep 2007, Linus Torvalds wrote:
> > 
> > (Btw, the above commit message points to just my response with a testing 
> > patch to the real email: the actual explanation of the INSANE ordering is 
> > from Len Brown in
> > 
> >   https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> > 
> > and there Len claims that we *must* wake up CPU's early).
> 
> ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
> turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
> 
> However, it seems that bugzilla entry may just be bogus. It talks about 
> "it appears that some firmware in the future may depend on that sequence 
> for correction operation"
> 
> Len, Shaohua, what are the real issues here? 

Intel's reference BIOS for Core Duo performs some re-initialization
in _WAK that will get blown away if INIT follows _WAK.
IIRC, it is related to re-initializing the thermal sensors.
I opened bug 5651 when the BIOS team informed me of this issue.

Yes, bringing a processor offline and then online again w/o
an intervening suspend or reset would not evaluate _WAK,
and thus may still run into the issue.

I don't know if this is a widespread issue and a commonly
used BIOS hook, or if it is specific to certain processors.

-Len

> It would indeed be nice if we could just take CPU's down early (while 
> everything is working), and run the whole suspend code with just one CPU, 
> rather than having to worry about the ordering between CPU and device 
> takedown.
> 
> That said, at least with STR, the situation is:
> 
>  1) suspend_console
>  2)   device_suspend(PMSG_SUSPEND)      (== ->suspend)
>  3) disable_nonboot_cpus()
>  4)   device_power_down(PMSG_SUSPEND)   (== ->suspend_late)
>  5) pm_ops->enter()
>  6)   device_power_up()                 (== ->resume_early)
>  7) enable_nonboot_cpus()
>  8) pm_finish()
>  9)   device_resume()                   (== ->resume)
> 10) resume_console
> 
> So if we agree that things like timers etc should *never* be suspended by 
> the early suspend, and *always* use suspend_late/resume_early, then at 
> least STR should be ok.
> 
> And I think that's a damn reasonable thing to agree on: timers (and 
> anything else that CPU shutdown/bringup could *possibly* care about) 
> should be considered core enough that they had better be on the 
> suspend_late/resume_early list.
> 
> Thomas, Rafael, can you verify that at least STR is ok in this respect?
> 
> Linus


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Paul Mackerras
Linus Torvalds writes:

> It would indeed be nice if we could just take CPU's down early (while 
> everything is working), and run the whole suspend code with just one CPU, 
> rather than having to worry about the ordering between CPU and device 
> takedown.

That is certainly what we want to do on powerpc.

Paul.


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-19 Thread Rafael J. Wysocki
On Wednesday, 19 September 2007 21:21, Thomas Gleixner wrote:
> On Wed, 2007-09-19 at 19:44 +0200, Rafael J. Wysocki wrote:
> > > > It boots with nohpet alone and suspend/hibernation seem to work (still,
> > > > it didn't want to boot right after hibernation, but booted after I'd 
> > > > switched
> > > > it off/on manually).
> > > 
> > > Can you please check, whether
> > > 
> > > http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch
> > > 
> > > works for you ?
> > 
> > Nope.  It's a total disaster. :-(
> 
> True. I have instrumented it to the point where the broadcast device is
> programmed, but no interrupt comes in for totally unknown reasons.
> 
> > Doesn't boot at all, even with "noacpitimer nohpet", and that's with
> > NO_HZ and HIGH_RES_TIMERS unset.
> 
> > If you have a bisectable patch series, I can try to identify the responsible
> > patch.
> 
> http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patches.tar.bz2
> 
> The first patches in the queue are the mainline fixups.

It's x86_64-convert-to-clockevents.patch (ie. after applying it the box stops
to boot).

I haven't had the time to check if any special command line arguments help.
Will check tomorrow.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-19 Thread Thomas Gleixner
On Wed, 2007-09-19 at 19:44 +0200, Rafael J. Wysocki wrote:
> > > It boots with nohpet alone and suspend/hibernation seem to work (still,
> > > it didn't want to boot right after hibernation, but booted after I'd 
> > > switched
> > > it off/on manually).
> > 
> > Can you please check, whether
> > 
> > http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch
> > 
> > works for you ?
> 
> Nope.  It's a total disaster. :-(

True. I have instrumented it to the point where the broadcast device is
programmed, but no interrupt comes in for totally unknown reasons.

> Doesn't boot at all, even with "noacpitimer nohpet", and that's with
> NO_HZ and HIGH_RES_TIMERS unset.

> If you have a bisectable patch series, I can try to identify the responsible
> patch.

http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patches.tar.bz2

The first patches in the queue are the mainline fixups.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-19 Thread Rafael J. Wysocki
On Wednesday, 19 September 2007 09:06, Thomas Gleixner wrote:
> On Tue, 2007-09-18 at 23:37 +0200, Rafael J. Wysocki wrote:
> > > > > - The Vaio also hangs during resume-from-RAM, due to git-acpi.patch
> > > > > 
> > > > > - And it hangs during suspend-to-RAM, due to git-acpi.patch
> > > 
> > > Sorry, I was wrong.
> > > 
> > > > On my HP nx6325 it only boots with "noacpitimer nohpet" on the command 
> > > > line,
> > > > but then it works.
> > > 
> > > It _sometimes_ boots with "noacpitimer nohpet" and that's if I press the 
> > > power
> > > button for a couple of times during boot (before any messages appear on 
> > > the
> > > console).
> > > 
> > > > Suspend-to-RAM and hibernation work too. :-) 
> > > 
> > > No, they don't (I must have booted -rc6 instead of it by mistake, sigh).
> > > 
> > > > Since 2.6.23-rc4-mm1 only booted with nohpet because of
> > > > 
> > > > x86_64-convert-to-clockevents.patch
> > > > 
> > > > I guess that the boot problems with this one result from the same patch.
> > > 
> > > Not sure any more ...
> > > 
> > > I'll try to compile it with NO_HZ and HIGH_RES_TIMERS unset.
> > 
> > OK, in that configuration it's much better.
> > 
> > It boots with nohpet alone and suspend/hibernation seem to work (still,
> > it didn't want to boot right after hibernation, but booted after I'd 
> > switched
> > it off/on manually).
> 
> Can you please check, whether
> 
> http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch
> 
> works for you ?

Nope.  It's a total disaster. :-(

Doesn't boot at all, even with "noacpitimer nohpet", and that's with
NO_HZ and HIGH_RES_TIMERS unset.

If you have a bisectable patch series, I can try to identify the responsible
patch.

Greetings,
Rafael


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-19 Thread Thomas Gleixner
On Tue, 2007-09-18 at 23:37 +0200, Rafael J. Wysocki wrote:
> > > > - The Vaio also hangs during resume-from-RAM, due to git-acpi.patch
> > > > 
> > > > - And it hangs during suspend-to-RAM, due to git-acpi.patch
> > 
> > Sorry, I was wrong.
> > 
> > > On my HP nx6325 it only boots with "noacpitimer nohpet" on the command 
> > > line,
> > > but then it works.
> > 
> > It _sometimes_ boots with "noacpitimer nohpet" and that's if I press the 
> > power
> > button for a couple of times during boot (before any messages appear on the
> > console).
> > 
> > > Suspend-to-RAM and hibernation work too. :-) 
> > 
> > No, they don't (I must have booted -rc6 instead of it by mistake, sigh).
> > 
> > > Since 2.6.23-rc4-mm1 only booted with nohpet because of
> > > 
> > > x86_64-convert-to-clockevents.patch
> > > 
> > > I guess that the boot problems with this one result from the same patch.
> > 
> > Not sure any more ...
> > 
> > I'll try to compile it with NO_HZ and HIGH_RES_TIMERS unset.
> 
> OK, in that configuration it's much better.
> 
> It boots with nohpet alone and suspend/hibernation seem to work (still,
> it didn't want to boot right after hibernation, but booted after I'd switched
> it off/on manually).

Can you please check, whether

http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch

works for you ?

tglx





Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-18 Thread Rafael J. Wysocki
On Tuesday, 18 September 2007 22:54, Rafael J. Wysocki wrote:
> On Tuesday, 18 September 2007 22:21, Rafael J. Wysocki wrote:
> > On Tuesday, 18 September 2007 10:18, Andrew Morton wrote:
> > > 
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc6/2.6.23-rc6-mm1/
> > > 
> > > 2.6.23-rc6-mm1 is a 29MB diff against 2.6.23-rc6.
> > > 
> > > It took me over two solid days to get this lot compiling and booting on a 
> > > few
> > > boxes.  This required around ninety fixup patches and patch droppings.  
> > > There
> > > are several bugs in here which I know of (details below) and presumably 
> > > many
> > > more which I don't know of.  I have to say that this just isn't working 
> > > any
> > > more.
> > > 
> > > - The Vaio hangs when quitting X due to x86_64-mm-cpa-clflush.patch, but
> > >   I didn't drop that patch because the iommu patch series depends on it.
> > > 
> > > - The Vaio also hangs during resume-from-RAM, due to git-acpi.patch
> > > 
> > > - And it hangs during suspend-to-RAM, due to git-acpi.patch
> 
> Sorry, I was wrong.
> 
> > On my HP nx6325 it only boots with "noacpitimer nohpet" on the command line,
> > but then it works.
> 
> It _sometimes_ boots with "noacpitimer nohpet" and that's if I press the power
> button for a couple of times during boot (before any messages appear on the
> console).
> 
> > Suspend-to-RAM and hibernation work too. :-) 
> 
> No, they don't (I must have booted -rc6 instead of it by mistake, sigh).
> 
> > Since 2.6.23-rc4-mm1 only booted with nohpet because of
> > 
> > x86_64-convert-to-clockevents.patch
> > 
> > I guess that the boot problems with this one result from the same patch.
> 
> Not sure any more ...
> 
> I'll try to compile it with NO_HZ and HIGH_RES_TIMERS unset.

OK, in that configuration it's much better.

It boots with nohpet alone and suspend/hibernation seem to work (still,
it didn't want to boot right after hibernation, but booted after I'd switched
it off/on manually).

Unfortunately, I get this in dmesg:

ALSA /home/rafael/src/mm/linux-2.6.23-rc6-mm1/sound/pci/hda/hda_intel.c:1758: 
hda-intel: ioremap error

and (obviously) the sound card doesn't work.

Additionally, I've got a couple of these:

WARNING: at /home/rafael/src/mm/linux-2.6.23-rc6-mm1/drivers/usb/core/driver.c:1217 usb_autopm_do_device()

Call Trace:
 [8813885e] :usbcore:usb_autopm_do_device+0x60/0xe9
 [88138910] :usbcore:usb_autosuspend_device+0xc/0xe
 [88131aa8] :usbcore:usb_disconnect+0x15f/0x18c
 [88133305] :usbcore:hub_thread+0x691/0x10a1
 [8024a077] autoremove_wake_function+0x0/0x38
 [88132c74] :usbcore:hub_thread+0x0/0x10a1
 [80249f50] kthread+0x49/0x79
 [8020ce98] child_rip+0xa/0x12
 [80249f07] kthread+0x0/0x79
 [8020ce8e] child_rip+0x0/0x12

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-18 Thread Rafael J. Wysocki
On Tuesday, 18 September 2007 22:54, Rafael J. Wysocki wrote:
> On Tuesday, 18 September 2007 22:21, Rafael J. Wysocki wrote:
> > On Tuesday, 18 September 2007 10:18, Andrew Morton wrote:
> > >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc6/2.6.23-rc6-mm1/
> > >
> > > 2.6.23-rc6-mm1 is a 29MB diff against 2.6.23-rc6.
> > >
> > > It took me over two solid days to get this lot compiling and booting on a few
> > > boxes.  This required around ninety fixup patches and patch droppings.  There
> > > are several bugs in here which I know of (details below) and presumably many
> > > more which I don't know of.  I have to say that this just isn't working any
> > > more.
> > >
> > > - The Vaio hangs when quitting X due to x86_64-mm-cpa-clflush.patch, but
> > >   I didn't drop that patch because the iommu patch series depends on it.
> > >
> > > - The Vaio also hangs during resume-from-RAM, due to git-acpi.patch
> > >
> > > - And it hangs during suspend-to-RAM, due to git-acpi.patch
>
> Sorry, I was wrong.
>
> > On my HP nx6325 it only boots with noacpitimer nohpet on the command line,
> > but then it works.
>
> It _sometimes_ boots with noacpitimer nohpet and that's if I press the power
> button for a couple of times during boot (before any messages appear on the
> console).
>
> > Suspend-to-RAM and hibernation work too. :-)
>
> No, they don't (I must have booted -rc6 instead of it by mistake, sigh).
>
> > Since 2.6.23-rc4-mm1 only booted with nohpet because of
> >
> > x86_64-convert-to-clockevents.patch
> >
> > I guess that the boot problems with this one result from the same patch.
>
> Not sure any more ...
>
> I'll try to compile it with NO_HZ and HIGH_RES_TIMERS unset.

OK, in that configuration it's much better.

It boots with nohpet alone and suspend/hibernation seem to work (still,
it didn't want to boot right after hibernation, but booted after I'd switched
it off/on manually).

Unfortunately, I get this in dmesg:

ALSA /home/rafael/src/mm/linux-2.6.23-rc6-mm1/sound/pci/hda/hda_intel.c:1758: 
hda-intel: ioremap error

and (obviously) the sound card doesn't work.

Additionally, I've got a couple of these:

WARNING: at /home/rafael/src/mm/linux-2.6.23-rc6-mm1/drivers/usb/core/driver.c:1217 usb_autopm_do_device()

Call Trace:
 [8813885e] :usbcore:usb_autopm_do_device+0x60/0xe9
 [88138910] :usbcore:usb_autosuspend_device+0xc/0xe
 [88131aa8] :usbcore:usb_disconnect+0x15f/0x18c
 [88133305] :usbcore:hub_thread+0x691/0x10a1
 [8024a077] autoremove_wake_function+0x0/0x38
 [88132c74] :usbcore:hub_thread+0x0/0x10a1
 [80249f50] kthread+0x49/0x79
 [8020ce98] child_rip+0xa/0x12
 [80249f07] kthread+0x0/0x79
 [8020ce8e] child_rip+0x0/0x12

Greetings,
Rafael