Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-10-24 Thread Pavel Machek
Hi!

  That's certainly possible. We already pass a very small amount of data
  between the boot and resuming kernels at the moment, and it's done quite
  simply - by putting the variables we want to 'transfer' in a nosave
  page/section.
 
 Well, if the boot and image kernels are different, which is now possible on
 x86_64 with some recent patches (currently in -mm), the nosave trick won't
 work.

I guess we should remove the nosave at least from x86-64. If
someone tries to use it, he'll get a nasty surprise.
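
A toy model of the failure described above, written in Python rather than kernel code. It assumes memory is a handful of numbered pages, that the image never contains the image kernel's own nosave page, and that the restore skips the boot kernel's nosave page. The names `restore_image` and `resume` and the page numbers are invented for illustration; nothing here is an actual kernel interface.

```python
# Toy model only: real kernels place nosave data in a linker section,
# not in a Python dict. All names and page numbers are hypothetical.

def restore_image(memory, image_pages, boot_nosave_page):
    """Copy image pages over memory, skipping the boot kernel's nosave page."""
    for page, contents in image_pages.items():
        if page != boot_nosave_page:
            memory[page] = contents
    return memory


def resume(boot_nosave_page, image_nosave_page):
    """Return what the image kernel finds in its nosave page after restore."""
    # The boot kernel owns all of memory and writes the value it wants to
    # hand over into *its* nosave page.
    memory = {page: "boot-kernel data" for page in range(4)}
    memory[boot_nosave_page] = "handover"

    # The image holds every page except the image kernel's own nosave page.
    image_pages = {
        page: "image data" for page in range(4) if page != image_nosave_page
    }
    restore_image(memory, image_pages, boot_nosave_page)

    # The image kernel looks where *its* layout says the nosave page is.
    return memory[image_nosave_page]


if __name__ == "__main__":
    print(resume(boot_nosave_page=2, image_nosave_page=2))
    print(resume(boot_nosave_page=2, image_nosave_page=3))
```

With identical layouts the image kernel reads back the handed-over value. With different layouts it reads leftover boot-kernel data, which is the "nasty surprise" in this simplified picture.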

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-10-24 Thread Rafael J. Wysocki
On Thursday, 11 October 2007 22:54, Pavel Machek wrote:
 Hi!
 
   That's certainly possible. We already pass a very small amount of data
   between the boot and resuming kernels at the moment, and it's done quite
   simply - by putting the variables we want to 'transfer' in a nosave
   page/section.
  
  Well, if the boot and image kernels are different, which is now possible on
  x86_64 with some recent patches (currently in -mm), the nosave trick won't
  work.
 
 I guess we should remove the nosave at least from x86-64. If
 someone tries to use it, he'll get a nasty surprise.

Agreed.

I'll try to prepare a patch for that when I have a bit of time.

Greetings,
Rafael



Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-22 Thread Jeremy Maitin-Shepard
Rafael J. Wysocki writes:

 On Friday, 21 September 2007 23:08, Jeremy Maitin-Shepard wrote:
 Rafael J. Wysocki writes:
 
  On Friday, 21 September 2007 22:26, Jeremy Maitin-Shepard wrote:
  Rafael J. Wysocki writes:
  
  [snip]
  
   The ACPI NVS area is explicitly marked as reserved and we don't save it.
   On x86_64 we don't save any memory areas marked as reserved and yet the
   above happens.
  
  I think you have mentioned before, though, that ACPI is first
  initialized by the boot kernel, before it is later initialized by
  resuming kernel.  This could well be the source of the problem.
 
  No, it's not.  I have tested that too with an ACPI-less boot kernel.
 
 Well, it seems that there just must be some other bug.  I would define
 anything that differs between the post-resume initialization of ACPI

 I'm not sure what you mean.

 from the normal boot initialization of ACPI as a bug.  If the interaction
 with the hardware is the same, then the behavior will be the same.

 The ACPI platform firmware is allowed to preserve information across the
 hibernation-resume cycle, so this need not be the same.

All of my comments related to the case where S4 is not being used
(instead the system is just powered off normally), and a boot kernel
that does not initialize ACPI is used.  In that case, the ACPI platform
firmware should not be able to distinguish a normal boot from a resume
from hibernation.

-- 
Jeremy Maitin-Shepard



Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-22 Thread Rafael J. Wysocki
On Friday, 21 September 2007 20:11, Jeremy Maitin-Shepard wrote:
 Rafael J. Wysocki writes:
 
  On Friday, 21 September 2007 15:14, huang ying wrote:
  On 9/21/07, Rafael J. Wysocki wrote:
   On Friday, 21 September 2007 05:33, Eric W. Biederman wrote:
    Nigel Cunningham writes:
  [--snip--]
   
    No one has yet attacked the hard problem of coming up with separate
    hibernate methods for drivers.
  
   Well, I've been playing a bit with that for some time, but it's not easy
   by any means.
  
   In short, I'm seeing some problems related to the handling of ACPI that
   seem to shatter the entire idea of having separate hibernate methods, at
   least as far as ACPI systems are concerned.
  
  So sad to hear this. Can you detail it a little? Or a link?
 
  Well, the problem is that apparently some systems (eg. my HP nx6325) expect
  us to execute the _PTS ACPI global control method before creating the image
  _and_ to execute acpi_enter_sleep_state(ACPI_STATE_S4) in order to finally
  put the system into the sleep state.  In particular, on nx6325, if we don't
  do that, then after the restore the status of the AC power will not be
  reported correctly (and if you replace the battery while in the sleep state,
  the battery status will not be updated correctly after the restore).
  Similar issues have been reported for other machines.
 
 Suppose that instead of using ACPI S4 state at all, you instead just
 power off.  Yes, you'll lose wakeup event functionality, and flashy
 LEDs, but doesn't this take care of the problem?

Nope.

 The firmware shouldn't see the hibernate as anything other than a shutdown
 and reboot.

Actually, this assumption is apparently wrong.

 ACPI should be initialized normally when resuming, which should take care of
 getting AC power status reported properly.

Well, that doesn't work.  I've tested it, really. :-)

 This should be the behavior, anyway, on the many systems that do not
 support S4.
 
  Now, the ACPI specification requires us to put devices into low power states
  before executing _PTS and that's exactly what we're doing before a suspend
  to RAM.  Thus, it seems that in general we need to do the same for
  hibernation on ACPI systems.
 
 It seems that if ACPI S4 is going to be used, switching to low power
 state is something that should be done only immediately before entering
 that state (i.e. after the image has already been saved).

Doesn't.  Work.

 In particular, it should not be done just before the atomic copy.  It is
 true that (during resume) after the atomic copy snapshot is restored,
 drivers will need to be prepared (i.e. have saved whatever information
 is necessary) to _resume_ devices from the low power state, but that
 does not mean they have to actually be put into that low power state
 before the copy is made.
 
 I agree that for the kexec implementation there may be additional
 issues.  For swsusp, uswsusp, and tuxonice, though, I don't see why
 there should be a problem.  I think that, as was recognized before, all
 of the issues are resolved by properly considering exactly what each
 callback should do and when it should be called.  The problems stem from
 ambiguous specifications, or trying to use the same callback for two
 different purposes or in two different cases.
 
 Let me know if I'm mistaken.

See above. :-)

Greetings,
Rafael



Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-22 Thread Alan Stern
On Fri, 21 Sep 2007, Rafael J. Wysocki wrote:

   Well, the problem is that apparently some systems (eg. my HP nx6325)
   expect us to execute the _PTS ACPI global control method before creating
   the image _and_ to execute acpi_enter_sleep_state(ACPI_STATE_S4) in order
   to finally put the system into the sleep state.  In particular, on nx6325,
   if we don't do that, then after the restore the status of the AC power will
   not be reported correctly (and if you replace the battery while in the
   sleep state, the battery status will not be updated correctly after the
   restore).  Similar issues have been reported for other machines.
  
  Suppose that instead of using ACPI S4 state at all, you instead just
  power off.  Yes, you'll lose wakeup event functionality, and flashy
  LEDs, but doesn't this take care of the problem?
 
 Nope.
 
  The firmware shouldn't see the hibernate as anything other than a shutdown
  and reboot.
 
 Actually, this assumption is apparently wrong.

One gets the impression that the hibernation image includes a memory 
area used by the firmware.  That could explain why devices need to be 
in a low-power state when the image is created -- so that when the 
image is restored, the firmware doesn't get confused about the device 
states.

It would also explain why the firmware sees
resume-from-power-off-hibernation as different from a regular reboot:
because its data area gets overwritten as part of the resume.

In reality it's probably more complicated than this, with weird 
interactions between the firmware and the various ACPI methods.  
Nevertheless, the main idea seems valid.

Alan Stern




Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-22 Thread Kyle Moffett
On Sep 22, 2007, at 06:34:17, Rafael J. Wysocki wrote:
 On Saturday, 22 September 2007 01:19, Kyle Moffett wrote:
 On Sep 21, 2007, at 17:16:59, Jeremy Maitin-Shepard wrote:
 Rafael J. Wysocki writes:
 The ACPI platform firmware is allowed to preserve information
 across the hibernation-resume cycle, so this need not be the same.

 All of my comments related to the case where S4 is not being used  
 (instead the system is just powered off normally), and a boot  
 kernel that does not initialize ACPI is used.  In that case, the  
 ACPI platform firmware should not be able to distinguish a normal  
 boot from a resume from hibernation.

 I think that in order for this to work, there would need to be
 some ABI whereby the resuming kernel can pass its entire ACPI
 state and a bunch of other ACPI-related device details to the
 resumed kernel, which I believe it does not do at the moment.

 In fact we don't need to do this.

 The solution is not to touch ACPI in the boot kernel (ie. the one  
 that loads the image) and pass control to the image kernel.  This  
 is how it's supposed to work according to the spec, more or less  
 (well, there are some ugly details  that need handling, like the  
 restoration of the NVS area).

First of all, we will need to make the resumed kernel throw away  
*ALL* of its ACPI state on S5 and completely reinitialize ACPI as  
though it was booting for the first time on resume.  From what I can  
tell, we throw away all the ACPI state in the boot kernel and  
reinitialize it there, but then the reinitialized state is  
overwritten with the resumed kernel's state and the two don't always  
happen to be the same.  (Like if a battery got replaced or AC status  
changed).

Umm, I don't see how that can possibly work properly.  For a laptop,  
for example, the restore kernel will need to access the disk, the LCD  
display, and possibly the AC/battery and current CPU frequency.  From  
what I understand of ACPI, both of the former may need ACPI code to  
operate properly (Isn't there an ATA taskfile object of some kind?)  
and the latter two almost definitely need ACPI.  Ergo the boot kernel  
may need to initialize and use ACPI just to run an ATA taskfile so it  
can read from the HDD efficiently.

 I believe that what causes problems is the ACPI state data that  
 the kernel stores is *different* between identical sequential  
 boots, especially when you add/remove/replace batteries, AC, etc.

 Rather the ACPI state data that the platform firmware stores may be  
 different, depending on whether you enter S4 or S5 during power  
 off and that determines the interactions between the kernel and  
 the firmware after the next boot.

That's not what he was talking about.  The problem discussed was:
   (A) You hibernate your box, entering S5 (IE: power off)
   (B) You resume the box and the boot kernel inits all the ACPI stuff.
   (C) The boot kernel's ACPI state is completely replaced by the  
resumed kernel's state.
   (D) Hardware stops working mysteriously because of ACPI problems.

The only possible conclusion is that the state between the boot  
kernel and the resume kernel was *different* and so the device failed  
because the ACPI state in the resume kernel doesn't match the actual  
state of the hardware.
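
Steps (A) through (D) can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the `Hardware` and `Kernel` classes, the single `on_ac` flag, and `hibernate_and_resume` are all hypothetical, and the model covers only kernel-side cached state, ignoring anything the platform firmware itself may keep across the cycle.

```python
# Toy model of steps (A)-(D). Not kernel code; every name is hypothetical.

class Hardware:
    """The real state of the machine, which may change while powered off."""

    def __init__(self):
        self.on_ac = True


class Kernel:
    """A kernel that caches what ACPI reported when it was initialized."""

    def __init__(self, name):
        self.name = name
        self.cached_on_ac = None

    def init_acpi(self, hw):
        self.cached_on_ac = hw.on_ac


def hibernate_and_resume(change_while_off):
    """Return True if the resumed kernel's cached state matches the hardware."""
    hw = Hardware()
    image_kernel = Kernel("image")
    image_kernel.init_acpi(hw)

    # (A) Take the snapshot, then power off (S5).
    snapshot = dict(image_kernel.__dict__)
    if change_while_off:
        hw.on_ac = False  # e.g. the AC adapter is unplugged while off

    # (B) The boot kernel starts and initializes ACPI against current hardware.
    running = Kernel("boot")
    running.init_acpi(hw)

    # (C) The boot kernel's state is completely replaced by the image's state.
    running.__dict__.update(snapshot)

    # (D) Anything relying on the cached state now sees stale data.
    return running.cached_on_ac == hw.on_ac


if __name__ == "__main__":
    print(hibernate_and_resume(change_while_off=False))
    print(hibernate_and_resume(change_while_off=True))
```

In this model the resumed kernel agrees with the hardware only when nothing changed while the machine was off, which is the mismatch the conclusion above points at.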

Cheers,
Kyle Moffett



Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-21 Thread Rafael J. Wysocki
On Friday, 21 September 2007 11:49, Pavel Machek wrote:
 Hi!
 
   Seems like good enough for -mm to me.
 
 (For the record, I do not think this is going to be
 hibernation-replacement any time soon. But it is functionality useful
 for other stuff -- dump memory and continue -- and yes it may be able
 to do hibernation in the long term.
 
 It really comes from the other side of reliability:
 
 * swsusp: if your kernel is perfectly healthy, it will work
 
 while this, coming from kdump, is
 
 * if your kernel is not completely trashed, it should work
 
 ...which is why we can't use swsusp to do dump memory and continue -- you
 want to do dumps on slightly broken systems. And yes, as a
 side effect it may be able to do hibernation... why not, let's see how
 it works out).

I generally agree. :-)

Greetings,
Rafael



Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-21 Thread huang ying
On 9/21/07, Rafael J. Wysocki wrote:
 On Friday, 21 September 2007 05:33, Eric W. Biederman wrote:
  Nigel Cunningham writes:
  
   That's not true. Kexec will itself be an implementation, otherwise you'd
   end up with people screaming about no hibernation support.
 
  There needs to be an implementation of hibernation based on kexec with
  return yes.
 
   And it won't result in
   the complete removal of the existing hibernation code from the kernel. At
   the very least, it's going to want the kernel being hibernated to have an
   interface by which it can find out which pages need to be saved.
 
  That interface should be running kernel - user space - target kernel.
  Not direct kernel to kernel.
 
   I wouldn't
   be surprised if it also ends up with an interface in which the kernel
   being hibernated tells it what bdev/sectors in which to save the image as
   well (otherwise you're going to need a dedicated, otherwise untouched
   partition exclusively for the kexec'd kernel to use), or what network
   settings to use if it wants to try to save the image to a network storage
   device.
 
  initramfs.  We already seem to have that interface.  And distros
  seems to do a pretty decent job of using it to configure systems.
 
   On top of
   that, there are all the issues related to device reinitialisation and
   so on,
 
  Yes.
 
   and it looks like there's greatly increased pain for users wanting to
   configure this new implementation.
 
  Not to be callous but that really is a user space and distro issue.
 
   Kexec is by no means proven to be the panacea for all the issues.
 
  I agree.  I'm still not quite convinced it will do a satisfactory job.
  But I think it does make sense to implement a general kexec with
  return and see if that can reasonably be used for handling hibernation
  issues.  If done cleanly and with care the implementation won't be
  hibernation specific.

 Yes, and that's worth doing anyway, IMO.

  Frankly this looks like the best way I can see to implement a general
  mechanism for calling silly firmware/BIOS/EFI services after we
  have a kernel up and running.  It's a little bit like allowing
  X to call iopl(3) and do inb/outb directly.
 
  The configuration issues you raise pretty much exist for kexec on
  panic, and they seem to be being resolved for that case in a
  reasonable way.  I do agree that the current kexec+return effort seems
  to be one of those unfortunate cases where we give every mechanism in
  the kernel to do something in user space and then no one actually
  implements the user space.  That doesn't do any one any good.
 
  For hibernation we don't have the absolute need to step outside of the
  current kernel that we do in the kexec on panic approach.  However we
  have this practical fight about mechanism and policy, and kexec with
  return has this seductive allure that it appears to be the minimal
  necessary mechanism in the kernel.
 
  No one has yet attacked the hard problem of coming up with separate
  hibernate methods for drivers.

 Well, I've been playing a bit with that for some time, but it's not easy
 by any means.

 In short, I'm seeing some problems related to the handling of ACPI that
 seem to shatter the entire idea of having separate hibernate methods, at
 least as far as ACPI systems are concerned.

So sad to hear this. Can you detail it a little? Or a link?

Best Regards,
Huang Ying
