Re: [linux-pm] [PATCH -mm] kexec jump -v9

2008-05-15 Thread Alan Stern
On Wed, 14 May 2008, Eric W. Biederman wrote:

> My take on the situation is this.  For proper handling we
> need driver device_detach and device_reattach methods.
> 
> With the following semantics.  The device_detach methods
> will disable DMA and place the hardware in a sane state
> from which the device driver can reclaim and reinitialize it,
> but the hardware will not be touched.
> 
> device_reattach reattaches the driver to the hardware.

How would these differ from the already-existing remove and probe 
methods?

Alan Stern


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Eric W. Biederman
"Huang, Ying" <[EMAIL PROTECTED]> writes:

> On Wed, 2008-05-14 at 14:43 -0700, Eric W. Biederman wrote:
> [...]
>> Then as a preliminary design let's plan on this.
>> 
>> - Pass the re-entry point as the return address (using the C ABI).
>>   We may want to load the stack pointer etc so we can act as
>>   a direct entry point for new code.
>
> There are some issues about passing entry point as return address. The
> kexec jump (or kexec with return) is used for
>
> - Switching between original kernel (A) and kexeced kernel (B)
> - Call some code (such as BIOS code) in physical mode
>
> 1) When call some code in physical mode, the called code can use a
> simple return to return to kernel A. So there is no return address on
> stack after return to kernel A. Instead, argument 1 is on stack top.
>
> 2) When switch back from kernel B to kernel A, kernel B will call the
> jump back entry of kernel A with C ABI. So, the return address is on
> stack top. And kernel A get jump back entry of kernel B via the return
> address.
>
> Because the stack state is different between 1) and 2), the jump back
> entry of kernel A should distinguish them.

Yes.  Because the stack state is different we need to be careful.

However I don't see that we care how we got to the proper piece of
code.  If we don't care we don't need to distinguish them.

Therefore I see two possible solutions.
1) Write a tiny trampoline that goes in the core file to keep
   the calling conventions sane.

2) After we figure out our address read the stack pointer from
   a fixed location and simply set it.  (This is my preference)

Eric
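
As an illustration only, the two stack states Huang Ying describes, and
the effect of reloading the stack pointer from a fixed location, can be
modelled with a toy stack in plain C.  This is a user-space sketch, not
kernel or assembly code; struct toy_cpu, the toy_* helpers and the
numeric values are all hypothetical and appear nowhere in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_STACK_WORDS 16

/* Toy model of a downward-growing stack; every name here is hypothetical. */
struct toy_cpu {
	uintptr_t stack[TOY_STACK_WORDS];
	size_t sp;	/* index of the word currently on the stack top */
};

static void toy_push(struct toy_cpu *c, uintptr_t v) { c->stack[--c->sp] = v; }
static uintptr_t toy_pop(struct toy_cpu *c) { return c->stack[c->sp++]; }

void toy_init(struct toy_cpu *c) { c->sp = TOY_STACK_WORDS; }

/*
 * Case 1: kernel A called physical-mode code, which came back with a
 * plain return.  The return address has been popped, so argument 1 is
 * left on the stack top.
 */
uintptr_t toy_top_after_plain_return(struct toy_cpu *c, uintptr_t arg1,
				     uintptr_t ret_addr)
{
	toy_push(c, arg1);
	toy_push(c, ret_addr);	/* "call" into the physical-mode code */
	(void)toy_pop(c);	/* "ret" executed by the called code */
	return c->stack[c->sp];
}

/*
 * Case 2: kernel B called kernel A's jump-back entry using the C ABI,
 * so the return address is still on the stack top.
 */
uintptr_t toy_top_after_c_call(struct toy_cpu *c, uintptr_t arg1,
			       uintptr_t ret_addr)
{
	toy_push(c, arg1);
	toy_push(c, ret_addr);	/* "call" into kernel A's entry */
	return c->stack[c->sp];
}

/* Option 2: ignore the incoming stack and load a saved stack pointer. */
void toy_load_saved_sp(struct toy_cpu *c, size_t saved_sp) { c->sp = saved_sp; }
```

In this model the two entry paths leave different words on the stack
top and different stack pointers, but once the saved stack pointer is
loaded both paths continue from the same state, which is the property
option 2 relies on.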



Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Eric W. Biederman
"Huang, Ying" <[EMAIL PROTECTED]> writes:

> Hi, Vivek,
>
> On Wed, 2008-05-14 at 16:52 -0400, Vivek Goyal wrote:
> [...]
>> Ok, I have done some testing on this patch. Currently I have just
>> tested switching back and forth between two kernels and it is working for
>> me.
>> 
>> Just that I had to put LAPIC and IOAPIC in legacy mode for it to work. Few
>> comments/questions are inline.
>
> It seems that for LAPIC and IOAPIC, there is
> lapic_suspend()/lapic_resume() and ioapic_suspend()/ioapic_resume(),
> which will be called before/after kexec jump through
> device_power_down()/device_power_up(). So, the mechanism for
> LAPIC/IOAPIC is there, we may need to check the corresponding
> implementation.

And if you start with the device shutdown path the code is already
there and working.

Eric



Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Rafael J. Wysocki
On Thursday, 15 of May 2008, Huang, Ying wrote:
> On Wed, 2008-05-14 at 15:30 -0700, Eric W. Biederman wrote:
> [...]
> > >  
> > > +	if (image->preserve_context) {
> > > +		KJUMP_MAGIC(control_page) = KJUMP_MAGIC_NUMBER;
> > > +		if (kexec_jump_save_cpu(control_page)) {
> > > +			image->start = KJUMP_ENTRY(control_page);
> > > +			return;
> > 
> > Tricky, and I expect unnecessary.
> > We should be able to just have relocate_new_kernel return?
> 
> OK, I will check this. Maybe we can move CPU state saving code into
> relocate_new_kernel.
> 
> [...]
> > > -static void kernel_kexec(void)
> > > +static int kernel_kexec(void)
> > >  {
> > > +	int ret = -ENOSYS;
> > >  #ifdef CONFIG_KEXEC
> > > -	struct kimage *image;
> > > -	image = xchg(&kexec_image, NULL);
> > > -	if (!image)
> > > -		return;
> > > -	kernel_restart_prepare(NULL);
> > > -	printk(KERN_EMERG "Starting new kernel\n");
> > > -	machine_shutdown();
> > > -	machine_kexec(image);
> > > +	if (xchg(&kexec_lock, 1))
> > > +		return -EBUSY;
> > > +	if (!kexec_image) {
> > > +		ret = -EINVAL;
> > > +		goto unlock;
> > > +	}
> > > +	if (!kexec_image->preserve_context) {
> > > +		kernel_restart_prepare(NULL);
> > > +		printk(KERN_EMERG "Starting new kernel\n");
> > > +		machine_shutdown();
> > > +	}
> > > +	ret = kexec_jump(kexec_image);
> > > +unlock:
> > > +	xchg(&kexec_lock, 0);
> > >  #endif
> > 
> > Ugh.  No.  Not sharing the shutdown methods with reboot and
> > the normal kexec path looks like a recipe for failure to me.
> > 
> > This looks like where we really need to have the conversation.
> > What methods do we use to shutdown the system.
> > 
> > My take on the situation is this.  For proper handling we
> > need driver device_detach and device_reattach methods.
> > 
> > With the following semantics.  The device_detach methods
> > will disable DMA and place the hardware in a sane state
> > from which the device driver can reclaim and reinitialize it,
> > but the hardware will not be touched.
> > 
> > device_reattach reattaches the driver to the hardware.
> 
> Yes. Current device PM callback is not suitable for hibernation (kexec
> based or original). I think we can collaborate with Rafael J. Wysocki on
> the new device drivers hibernation callbacks.

Thanks, I'm also open for collaboration.  There will be a lot of work to do
related to the new callbacks, so any contribution is certainly welcome.

Thanks,
Rafael
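
For readers following the quoted kernel_kexec() hunk, the
xchg()-as-try-lock pattern it uses can be sketched in user space with
C11 atomics.  This is a minimal model under stated assumptions:
atomic_exchange() stands in for the kernel's xchg(), and
toy_kexec_lock, toy_trylock(), toy_unlock() and toy_kernel_kexec() are
hypothetical names, not functions from the patch.

```c
#include <errno.h>
#include <stdatomic.h>

/* 0 = free, 1 = held; models the kexec_lock word in the quoted hunk. */
static atomic_int toy_kexec_lock;

/* Returns 0 if the lock was taken, -EBUSY if it was already held. */
int toy_trylock(void)
{
	/* The exchange returns the old value: nonzero means "already held". */
	return atomic_exchange(&toy_kexec_lock, 1) ? -EBUSY : 0;
}

void toy_unlock(void)
{
	atomic_exchange(&toy_kexec_lock, 0);
}

/* Same control flow as the hunk: try-lock, validate, act, unlock. */
int toy_kernel_kexec(int image_loaded)
{
	int ret;

	if (toy_trylock())
		return -EBUSY;
	if (!image_loaded) {
		ret = -EINVAL;
		goto unlock;
	}
	ret = 0;	/* stand-in for the actual jump */
unlock:
	toy_unlock();
	return ret;
}
```

The point of the pattern is that a second caller never blocks: it sees
the old value 1 and backs out with -EBUSY, while every path that took
the lock leaves through the unlock label.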



Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Vivek Goyal
[..]
> > > +2:
> > > +	call	*%edx
> > 
> > > +	movl	%edi, %edx
> > > +	popl	%edi
> > > +	pushl	%edx
> > > +	jmp	2b
> > > +
> > 
> > What does above piece of code do? Looks like redundant for switching
> > between the kernels? After call *%edx, we never return here. Instead
> > we come back to "kexec_jump_back_entry"?
> 
> For switching between the kernels, this is redundant. Originally another
> feature of kexec jump is to call some code in physical mode. This is
> used to provide a C ABI to called code.
> 

Hi Huang,

Ok, You want to make BIOS calls. We already do that using vm86 mode and
use bios real mode interrupts. So why do we need this interface? Or, IOW,
how is this interface better?

Do you have something in mind where/how are you going to use it?

Thanks
Vivek



Re: [linux-pm] [PATCH -mm] kexec jump -v9

2008-05-15 Thread Eric W. Biederman
Alan Stern <[EMAIL PROTECTED]> writes:

> On Wed, 14 May 2008, Eric W. Biederman wrote:
>
>> My take on the situation is this.  For proper handling we
>> need driver device_detach and device_reattach methods.
>> 
>> With the following semantics.  The device_detach methods
>> will disable DMA and place the hardware in a sane state
>> from which the device driver can reclaim and reinitialize it,
>> but the hardware will not be touched.
>> 
>> device_reattach reattaches the driver to the hardware.
>
> How would these differ from the already-existing remove and probe 
> methods?

Honestly I would like for them not to, and they should be
proper factors of the remove and probe methods.

However we have a fundamental gotcha that we need to handle.
Logical abstractions on physical devices.

i.e.  How do we handle the case of a filesystem on a block
  device, when we remove the block device and then read it?

We have two choices.
1) We go through the pain of teaching the upper layers in the
   kernel of how to deal with hotplug and then we are sane
   when someone removes a usb stick accidentally before
   unmounting it and then reinserts the usb stick.

2) Teach the drivers how to do just the lower half of hotplug/remove.
   In which case with the driver still present and presenting its
   upper layer queues we have the driver relinquish its hardware
   and then later check to see if its hardware is still present
   and reinitialize it.

I don't know if anyone has looked at moving this to an upper layer.
Definitely a question worth asking.  The simpler we can make this
for driver authors the better.  Especially as that will make
the drivers more maintainable long term.

Eric
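
A minimal sketch of choice 2, assuming a driver that stays loaded and
keeps its upper layer queue while it gives up the hardware.  Only the
detach/reattach idea comes from the proposal above; struct toy_hw,
struct toy_driver and the toy_* functions are hypothetical names, and
the "hardware" is just a pair of flags.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_hw {
	bool present;		/* is the device still plugged in? */
	bool dma_enabled;
};

struct toy_driver {
	struct toy_hw *hw;
	bool attached;
	int queued;		/* requests held while detached */
	int completed;		/* requests handed to the hardware */
};

/* Upper layers keep submitting; the driver queues while it has no hardware. */
void toy_submit(struct toy_driver *d)
{
	if (d->attached)
		d->completed++;
	else
		d->queued++;
}

/* Lower half of "remove": stop DMA and let go of the hardware. */
void toy_detach(struct toy_driver *d)
{
	d->hw->dma_enabled = false;
	d->attached = false;
}

/* Lower half of "probe": check the hardware is there, reinitialize, drain. */
int toy_reattach(struct toy_driver *d)
{
	if (!d->hw->present)
		return -ENODEV;
	d->hw->dma_enabled = true;
	d->attached = true;
	d->completed += d->queued;
	d->queued = 0;
	return 0;
}
```

The filesystem-on-a-block-device case maps onto toy_submit(): requests
issued while detached are held rather than failed, and they either
drain on a successful reattach or stay queued if the device is gone.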



Re: [linux-pm] [PATCH -mm] kexec jump -v9

2008-05-15 Thread Eric W. Biederman
"Rafael J. Wysocki" <[EMAIL PROTECTED]> writes:

> On Thursday, 15 of May 2008, Eric W. Biederman wrote:
>> "Rafael J. Wysocki" <[EMAIL PROTECTED]> writes:
>> 
>> Just an added partial data point.  In the kexec case I have not heard
>> anyone screaming to me that ACPI doesn't work after we switch kernels.
>
> You don't remove power from devices while doing that.

No.  It is the second half of S5.  When we go from the boot kernel
to the restored kernel I am talking about.

That path is exactly what happens successfully in the kexec case.
Transitioning from one kernel to another.

If that path works reliably in kexec then we are talking about
something that can be solved without respect to any specific
ACPI implementation.

Eric





Re: [linux-pm] [PATCH -mm] kexec jump -v9

2008-05-15 Thread Alan Stern
On Thu, 15 May 2008, Eric W. Biederman wrote:

> Alan Stern <[EMAIL PROTECTED]> writes:
> 
> > On Wed, 14 May 2008, Eric W. Biederman wrote:
> >
> >> My take on the situation is this.  For proper handling we
> >> need driver device_detach and device_reattach methods.
> >> 
> >> With the following semantics.  The device_detach methods
> >> will disable DMA and place the hardware in a sane state
> >> from which the device driver can reclaim and reinitialize it,
> >> but the hardware will not be touched.
> >> 
> >> device_reattach reattaches the driver to the hardware.
> >
> > How would these differ from the already-existing remove and probe 
> > methods?
> 
> Honestly I would like for them not to, and they should be
> proper factors of the remove and probe methods.

So then there's no need for new methods, right?

> However we have a fundamental gotcha that we need to handle.
> Logical abstractions on physical devices.
> 
> i.e.  How do we handle the case of a filesystem on a block
>   device, when we remove the block device and then read it?

The filesystem code should then receive an error for any I/O operation
it tries to carry out.  That's what happens when you unplug a USB flash 
drive.

> We have two choices.
> 1) We go through the pain of teaching the upper layers in the
>    kernel of how to deal with hotplug and then we are sane
>    when someone removes a usb stick accidentally before
>    unmounting it and then reinserts the usb stick.

I don't understand.  Suppose you teach the filesystem layer about 
hot-unplugging.  So the user removes a USB stick before unmounting it, 
and when the filesystem tries to access the media it learns that the 
device is gone -- and the filesystem is gone with it.  How is that any 
better than getting an I/O error (apart from not filling the system log 
up with error messages)?

> 2) Teach the drivers how to do just the lower half of hotplug/remove.
>    In which case with the driver still present and presenting its
>    upper layer queues we have the driver relinquish its hardware
>    and then later check to see if its hardware is still present
>    and reinitialize it.

That's how usb-storage works in 2.4.  Linus told us to change it,
probably because there was no mechanism for removing the driver's data
structures after a device was unplugged.  They had to be kept around
indefinitely, in case the device was plugged in again.

> I don't know if anyone has looked at moving this to an upper layer.
> Definitely a question worth asking.  The simpler we can make this
> for driver authors the better.  Especially as that will make
> the drivers more maintainable long term.

Maybe you're talking about adding some sort of Persistent-Device
feature to the LVM?

In any event, I'm not sure why you brought all this up.  How is it
relevant to kexec or kexec jump?

Are you worried that there needs to be a way to tell drivers to quiesce 
their devices before doing the kexec?

Alan Stern




Re: [linux-pm] [PATCH -mm] kexec jump -v9

2008-05-15 Thread Rafael J. Wysocki
On Thursday, 15 of May 2008, Eric W. Biederman wrote:
> "Rafael J. Wysocki" <[EMAIL PROTECTED]> writes:
> 
> > On Thursday, 15 of May 2008, Eric W. Biederman wrote:
> >> "Rafael J. Wysocki" <[EMAIL PROTECTED]> writes:
> >> 
> >> Just an added partial data point.  In the kexec case I have not heard
> >> anyone screaming to me that ACPI doesn't work after we switch kernels.
> >
> > You don't remove power from devices while doing that.
> 
> No.  It is the second half of S5.  When we go from the boot kernel
> to the restored kernel I am talking about.

Well, you don't remove the power from devices doing that, do you?

I was referring to the fact that you remove the power from devices after saving
the image (ie. in the "poweroff" stage).  Then, you initialize them and pass
all that to the restored kernel and the question here is:
(a) Should they be reinitialized before the restored kernel has a chance to
access them?
(b) If they should, what state ought they to be in when the restored kernel
accesses them?

That basically depends on how you're going to handle the resuming of devices,
especially on the ACPI bus, in the restored kernel.

If we are to follow ACPI, the answer to (a) is "no", except for devices used to
read the image and it's better if the boot kernel doesn't touch ACPI at all.
Then, the benefit of putting the system into S4 during the "poweroff" stage is
that (a) the resume can be carried out faster and (b) the restored kernel may
use some context preserved by the platform over the sleep state.

Also, that allows you to use the wake up capabilities of some devices that
need not be available from S5.

In any case, however, I don't really think that doing the kexec jump before
creating the image is really necessary.  The kexec jump during resume is in
fact very similar to what the current hibernation code does, but it's slightly
more complicated. :-)

Thanks,
Rafael



Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Eric W. Biederman
"Rafael J. Wysocki" <[EMAIL PROTECTED]> writes:

> Well, it looks like we do similar things concurrently.  Please have a look 
> here: http://kerneltrap.org/Linux/Separating_Suspend_and_Hibernation

Yes.  Part of the reason I wanted to separate these two conversations
is that I knew something was going on.

> Similar patches are in the Greg's tree already.

Taking a look.

I just can't get past the fact that the only reason hibernation can
not use the widely implemented and tested probe/remove is because of
filesystems on block devices, and that you are proposing to add 4
methods for each and every driver to handle that case, when they
don't need ANYTHING!

I wonder how hard teaching the upper layers to deal with
hotplug/remove is?

The more I look at this the more I get the impression that
hibernation and suspend should be solved in separate patches.  I'm
not at all convinced that what is good for the goose is good for
the gander for things like your prepare method.

Hibernation seems to be an extreme case of hotplug.

Suspend seems to be just an extreme case of putting unused
devices in low power state.




I don't like the fact that these methods are power management specific.
How should this impact the greater kernel ecosystem?

+ * The externally visible transitions are handled with the help of the following
+ * callbacks included in this structure:
+ *
+ * @prepare: Prepare the device for the upcoming transition, but do NOT change
+ * its hardware state.  Prevent new children of the device from being
+ * registered after @prepare() returns (the driver's subsystem and
+ * generally the rest of the kernel is supposed to prevent new calls to the
+ * probe method from being made too once @prepare() has succeeded).  If
+ * @prepare() detects a situation it cannot handle (e.g. registration of a
+ * child already in progress), it may return -EAGAIN, so that the PM core
+ * can execute it once again (e.g. after the new child has been registered)
+ * to recover from the race condition.  This method is executed for all
+ * kinds of suspend transitions and is followed by one of the suspend
+ * callbacks: @suspend(), @freeze(), or @poweroff().
+ * The PM core executes @prepare() for all devices before starting to
+ * execute suspend callbacks for any of them, so drivers may assume all of
+ * the other devices to be present and functional while @prepare() is being
+ * executed.  In particular, it is safe to make GFP_KERNEL memory
+ * allocations from within @prepare(), although they are likely to fail in
+ * case of hibernation, if a substantial amount of memory is requested.
+ * However, drivers may NOT assume anything about the availability of the
+ * user space at that time and it is not correct to request firmware from
+ * within @prepare() (it's too late to do that).
+ *
+ * @complete: Undo the changes made by @prepare().  This method is executed for
+ * all kinds of resume transitions, following one of the resume callbacks:
+ * @resume(), @thaw(), @restore().  Also called if the state transition
+ * fails before the driver's suspend callback (@suspend(), @freeze(),
+ * @poweroff()) can be executed (e.g. if the suspend callback fails for one
+ * of the other devices that the PM core has unsuccessfully attempted to
+ * suspend earlier).
+ * The PM core executes @complete() after it has executed the appropriate
+ * resume callback for all devices.

The names above are terrible.  Perhaps: @pause/@unpause.

@pause Stop all device driver user space facing activities, and prepare
   for a possible power state transition.

Essentially these should be very much like bringing an ethernet
interface down.  The device is still there but we can't do anything
with it.  The only difference is that this may not be user visible.

+ * @suspend: Executed before putting the system into a sleep state in which the
+ * contents of main memory are preserved.  Quiesce the device, put it into
+ * a low power state appropriate for the upcoming system state (such as
+ * PCI_D3hot), and enable wakeup events as appropriate.
+ *
+ * @resume: Executed after waking the system up from a sleep state in which the
+ * contents of main memory were preserved.  Put the device into the
+ * appropriate state, according to the information saved in memory by the
+ * preceding @suspend().  The driver starts working again, responding to
+ * hardware events and software requests.  The hardware may have gone
+ * through a power-off reset, or it may have maintained state from the
+ * previous suspend() which the driver may rely on while resuming.  On most
+ * platforms, there are no restrictions on availability of resources like
+ * clocks during @resume().

Unless I have misread something, these are exactly the same as
@poweroff and @restore.

@suspend place the device in a low power state.
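
The retry contract in the quoted @prepare() documentation (return
-EAGAIN so the PM core can run it again, and run @complete() if the
transition fails before the suspend callbacks) can be modelled roughly
as follows.  This is a sketch under assumptions, not the PM core's
code: struct toy_dev, the toy_* functions and the bounded max_retries
are all hypothetical, and a pending child registration is reduced to a
counter that drains by one per attempt.

```c
#include <errno.h>

struct toy_dev {
	int pending_children;	/* child registrations still in progress */
	int prepared;		/* set once prepare has succeeded */
};

/* Driver side: refuse with -EAGAIN while a child is still registering. */
int toy_prepare(struct toy_dev *dev)
{
	if (dev->pending_children > 0) {
		dev->pending_children--;	/* models the child finishing */
		return -EAGAIN;
	}
	dev->prepared = 1;
	return 0;
}

/* Driver side: undo whatever toy_prepare() did. */
void toy_complete(struct toy_dev *dev)
{
	dev->prepared = 0;
}

/*
 * Core side: prepare every device before any suspend callback runs.
 * On failure, call toy_complete() on the devices already prepared.
 */
int toy_prepare_all(struct toy_dev *devs, int n, int max_retries)
{
	int i;

	for (i = 0; i < n; i++) {
		int tries = 0;
		int ret;

		do {
			ret = toy_prepare(&devs[i]);
		} while (ret == -EAGAIN && ++tries <= max_retries);

		if (ret) {
			while (i-- > 0)
				toy_complete(&devs[i]);
			return ret;
		}
	}
	return 0;
}
```

Whether the real core bounds the number of retries is not stated in
the quoted text; the limit here only keeps the model from looping
forever.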

Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Rafael J. Wysocki
On Friday, 16 of May 2008, Eric W. Biederman wrote:
> "Rafael J. Wysocki" <[EMAIL PROTECTED]> writes:
> 
> > Well, it looks like we do similar things concurrently.  Please have a look 
> > here: http://kerneltrap.org/Linux/Separating_Suspend_and_Hibernation
> 
> Yes.  Part of the reason I wanted to separate these two conversations
> is that I knew something was going on.
> 
> > Similar patches are in the Greg's tree already.
> 
> Taking a look.
> 
> I just can't get past the fact that the only reason hibernation can
> not use the widely implemented and tested probe/remove is because of
> filesystems on block devices, and that you are proposing to add 4
> methods for each and every driver to handle that case, when they
> don't need ANYTHING!

Why exactly do you think that removing()/probing() devices just for creating
a hibernation image is a good idea?

Also, ->poweroff() is actually similar to the late phase of ->suspend().

> I wonder how hard teaching the upper layers to deal with
> hotplug/remove is?
> 
> The more I look at this the more I get the impression that
> hibernation and suspend should be solved in separate patches.  I'm
> not at all convinced that what is good for the goose is good for
> the gander for things like your prepare method.

This was discussed a lot with people who had exactly opposite opinions.
With BenH in particular (CCed).
 
> Hibernation seems to be an extreme case of hotplug.

I don't agree with that.

> Suspend seems to be just an extreme case of putting unused
> devices in low power state.

Ditto.

> 
> 
> 
> I don't like the fact that these methods are power management specific.

Please be more specific.

> How should this impact the greater kernel ecosystem?
> 
> + * The externally visible transitions are handled with the help of the following
> + * callbacks included in this structure:
> + *
> + * @prepare: Prepare the device for the upcoming transition, but do NOT change
> + *   its hardware state.  Prevent new children of the device from being
> + *   registered after @prepare() returns (the driver's subsystem and
> + *   generally the rest of the kernel is supposed to prevent new calls to the
> + *   probe method from being made too once @prepare() has succeeded).  If
> + *   @prepare() detects a situation it cannot handle (e.g. registration of a
> + *   child already in progress), it may return -EAGAIN, so that the PM core
> + *   can execute it once again (e.g. after the new child has been registered)
> + *   to recover from the race condition.  This method is executed for all
> + *   kinds of suspend transitions and is followed by one of the suspend
> + *   callbacks: @suspend(), @freeze(), or @poweroff().
> + *   The PM core executes @prepare() for all devices before starting to
> + *   execute suspend callbacks for any of them, so drivers may assume all of
> + *   the other devices to be present and functional while @prepare() is being
> + *   executed.  In particular, it is safe to make GFP_KERNEL memory
> + *   allocations from within @prepare(), although they are likely to fail in
> + *   case of hibernation, if a substantial amount of memory is requested.
> + *   However, drivers may NOT assume anything about the availability of the
> + *   user space at that time and it is not correct to request firmware from
> + *   within @prepare() (it's too late to do that).
> + *
> + * @complete: Undo the changes made by @prepare().  This method is executed for
> + *   all kinds of resume transitions, following one of the resume callbacks:
> + *   @resume(), @thaw(), @restore().  Also called if the state transition
> + *   fails before the driver's suspend callback (@suspend(), @freeze(),
> + *   @poweroff()) can be executed (e.g. if the suspend callback fails for one
> + *   of the other devices that the PM core has unsuccessfully attempted to
> + *   suspend earlier).
> + *   The PM core executes @complete() after it has executed the appropriate
> + *   resume callback for all devices.
> 
> The names above are terrible.  Perhaps: @pause/@unpause.

The names have been discussed too and I don't intend to change them now.
Sorry.

> @pause Stop all device driver user space facing activities, and prepare
>    for a possible power state transition.
> 
> Essentially these should be very much like bringing an ethernet
> interface down.  The device is still there but we can't do anything
> with it.  The only difference is that this may not be user visible.
> 
> + * @suspend: Executed before putting the system into a sleep state in which the
> + *   contents of main memory are preserved.  Quiesce the device, put it into
> + *   a low power state appropriate for the upcoming system state (such as
> + *   PCI_D3hot), and enable wakeup events as appropriate.
> + *
> + * @resume: Executed after waking the system up from a sleep state in which the
> + *   contents of main memory were preserved.  Put the device into the
> + *   appropriate state, according to 

[PATCH] Use target CC and LD to build kdump and kexec_test.

2008-05-15 Thread Jamey Sharp
Signed-off-by: Jamey Sharp <[EMAIL PROTECTED]>
---
Another generic patch extracted from my Windows porting work.

I think this is correct, but review would be appreciated.

 Makefile.in         |    1 +
 configure.ac        |    2 ++
 kdump/Makefile      |    1 +
 kexec_test/Makefile |    4 ++--
 4 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 037f9a4..b51c3a1 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -37,6 +37,7 @@ AR= @AR@
 BUILD_CC   = @BUILD_CC@
 BUILD_CFLAGS   = @BUILD_CFLAGS@
 TARGET_CC  = @TARGET_CC@
+TARGET_LD  = @TARGET_LD@
 TARGET_CFLAGS  = @TARGET_CFLAGS@
 
 
diff --git a/configure.ac b/configure.ac
index b2ad226..beb8b3e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -87,6 +87,7 @@ fi
 dnl Find compiler for target
 if test "${target}" != "${host}" ; then
 	AC_CHECK_PROGS(TARGET_CC, [${target_alias}-gcc ${target}-gcc gcc])
+	AC_CHECK_PROGS(TARGET_LD, [${target_alias}-ld ${target}-ld ld])
 else
 	TARGET_CC="$CC"
 fi
@@ -148,6 +149,7 @@ dnl ---Output variables...
 AC_SUBST([BUILD_CC])
 AC_SUBST([BUILD_CFLAGS])
 AC_SUBST([TARGET_CC])
+AC_SUBST([TARGET_LD])
 AC_SUBST([TARGET_CFLAGS])
 AC_SUBST([ASFLAGS])
 
diff --git a/kdump/Makefile b/kdump/Makefile
index 4a788f9..1e2b72c 100644
--- a/kdump/Makefile
+++ b/kdump/Makefile
@@ -15,6 +15,7 @@ clean += $(KDUMP_OBJS) $(KDUMP_DEPS) $(KDUMP) $(KDUMP_MANPAGE)
 
 -include $(KDUMP_DEPS)
 
+$(KDUMP): CC=$(TARGET_CC)
 $(KDUMP): $(KDUMP_OBJS)
 	@$(MKDIR) -p $(@D)
 	$(CC) $(CFLAGS) $(EXTRA_CFLAGS) -o $@ $(KDUMP_OBJS)
diff --git a/kexec_test/Makefile b/kexec_test/Makefile
index 4848fc4..fec6210 100644
--- a/kexec_test/Makefile
+++ b/kexec_test/Makefile
@@ -26,6 +26,7 @@ clean += $(KEXEC_TEST_OBJS) $(KEXEC_TEST_DEPS) $(KEXEC_TEST)
 
 -include $(KEXEC_TEST_DEPS)
 
+$(KEXEC_TEST): CC=$(TARGET_CC)
 $(KEXEC_TEST): CPPFLAGS+=-DRELOC=$(RELOC)
 $(KEXEC_TEST): ASFLAGS+=-m32
 #$(KEXEC_TEST): LDFLAGS=-m32 -Wl,-e -Wl,_start -Wl,-Ttext -Wl,$(RELOC) \
@@ -34,7 +35,6 @@ $(KEXEC_TEST): LDFLAGS=-melf_i386 -e _start -Ttext $(RELOC)
 
 $(KEXEC_TEST): $(KEXEC_TEST_OBJS)
 	mkdir -p $(@D)
-	#$(LINK.o) -o $@ $^
-	$(LD) $(LDFLAGS) -o $@ $^
+	$(TARGET_LD) $(LDFLAGS) -o $@ $^
 
 endif
-- 
1.5.4.1




[PATCH] Prototype ifdown() in kexec.h, not nested in main().

2008-05-15 Thread Jamey Sharp
Signed-off-by: Jamey Sharp <[EMAIL PROTECTED]>
---
Another generic patch extracted from my Windows porting work.

 kexec/kexec.c |    3 +--
 kexec/kexec.h |    2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kexec/kexec.c b/kexec/kexec.c
index f898428..de9765e 100644
--- a/kexec/kexec.c
+++ b/kexec/kexec.c
@@ -956,8 +956,7 @@ int main(int argc, char *argv[])
 		sync();
 	}
 	if ((result == 0) && do_ifdown) {
-		extern int ifdown(void);
-		(void)ifdown();
+		ifdown();
 	}
 	if ((result == 0) && do_exec) {
 		result = my_exec();
diff --git a/kexec/kexec.h b/kexec/kexec.h
index 2d3a748..9b45476 100644
--- a/kexec/kexec.h
+++ b/kexec/kexec.h
@@ -209,6 +209,8 @@ extern unsigned long add_buffer_phys_virt(struct kexec_info *info,
int buf_end, int phys);
 extern void arch_reuse_initrd(void);
 
+extern int ifdown(void);
+
 extern unsigned char purgatory[];
 extern size_t purgatory_size;
 
-- 
1.5.4.1




[PATCH] Open slurped files in binary mode, on systems where that matters.

2008-05-15 Thread Jamey Sharp
Signed-off-by: Jamey Sharp <[EMAIL PROTECTED]>
---
This patch shouldn't hurt any system that doesn't distinguish between
binary and text files, and helps when running on Windows.

 kexec/kexec.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kexec/kexec.c b/kexec/kexec.c
index de9765e..1da6c1e 100644
--- a/kexec/kexec.c
+++ b/kexec/kexec.c
@@ -30,6 +30,9 @@
 #include 
 #include 
 #include 
+#ifndef _O_BINARY
+#define _O_BINARY 0
+#endif
 #include 
 
 #include "config.h"
@@ -386,7 +389,7 @@ char *slurp_file(const char *filename, off_t *r_size)
 		*r_size = 0;
 		return 0;
 	}
-	fd = open(filename, O_RDONLY);
+	fd = open(filename, O_RDONLY | _O_BINARY);
 	if (fd < 0) {
 		die("Cannot open `%s': %s\n",
 			filename, strerror(errno));
@@ -431,7 +434,7 @@ char *slurp_file_len(const char *filename, off_t size)
 
 	if (!filename)
 		return 0;
-	fd = open(filename, O_RDONLY);
+	fd = open(filename, O_RDONLY | _O_BINARY);
 	if (fd < 0) {
 		fprintf(stderr, "Cannot open %s: %s\n", filename,
 				strerror(errno));
-- 
1.5.4.1
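
A standalone sketch of the fallback-flag technique used in the patch
above, for anyone who wants to try it outside kexec-tools.  It is not
the patch itself: count_bytes_binary() and SLURP_O_BINARY are
hypothetical names.  The sketch uses its own macro because identifiers
that begin with an underscore followed by an uppercase letter are
reserved in C, so defining _O_BINARY in application code is best
avoided where a private name will do.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Use the platform's binary-mode flag when <fcntl.h> provides one, and
 * fall back to 0 (a no-op when OR-ed in) everywhere else.
 */
#if defined(O_BINARY)
#define SLURP_O_BINARY O_BINARY
#elif defined(_O_BINARY)
#define SLURP_O_BINARY _O_BINARY
#else
#define SLURP_O_BINARY 0
#endif

/* Returns the number of bytes read from path, or -1 on any error. */
long count_bytes_binary(const char *path)
{
	char buf[256];
	long total = 0;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY | SLURP_O_BINARY);
	if (fd < 0)
		return -1;
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;
	close(fd);
	return n < 0 ? -1 : total;
}
```

On systems that do not distinguish text from binary files the flag is
0 and open() behaves exactly as before; on systems that do, bytes such
as CR LF pairs are read back unmodified.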




Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Vivek Goyal
On Thu, May 15, 2008 at 01:41:50PM +0800, Huang, Ying wrote:
> Hi, Vivek,
> 
> On Wed, 2008-05-14 at 16:52 -0400, Vivek Goyal wrote:
> [...]
> > Ok, I have done some testing on this patch. Currently I have just
> > tested switching back and forth between two kernels and it is working for
> > me.
> > 
> > Just that I had to put LAPIC and IOAPIC in legacy mode for it to work. Few
> > comments/questions are inline.
> 
> It seems that for LAPIC and IOAPIC, there is
> lapic_suspend()/lapic_resume() and ioapic_suspend()/ioapic_resume(),
> which will be called before/after kexec jump through
> device_power_down()/device_power_up(). So, the mechanism for
> LAPIC/IOAPIC is there, we may need to check the corresponding
> implementation.
> 

ioapic_suspend() is not putting APICs in Legacy mode and that's why
we are seeing the issue. It only saves the IOAPIC routing table entries
and these entries are restored during ioapic_resume().

But I think somebody has to put APICs in legacy mode for normal 
hibernation also. Not sure who does it. Maybe the BIOS, so that during
resume, second kernel can get the timer interrupts.

Thanks
Vivek



Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Huang, Ying
On Thu, 2008-05-15 at 11:39 -0700, Eric W. Biederman wrote:
[...]
> 2) After we figure out our address read the stack pointer from
>    a fixed location and simply set it.  (This is my preference)

Just for confirmation (My English is poor).

Do you mean that kernel A just reads the stack top as the re-entry point,
regardless of whether it is the return address or argument 1?

Best Regards,
Huang Ying




Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Eric W. Biederman
Vivek Goyal <[EMAIL PROTECTED]> writes:

> ioapic_suspend() is not putting APICs in Legacy mode and that's why
> we are seeing the issue. It only saves the IOAPIC routing table entries
> and these entries are restored during ioapic_resume().
>
> But I think somebody has to put APICs in legacy mode for normal 
> hibernation also. Not sure who does it. Maybe the BIOS, so that during
> resume, second kernel can get the timer interrupts.

I doubt anything cares in the suspend to ram case. There should just
be a small BIOS trampoline to get back to linux when the processor
restarts.  And you don't need interrupts for any of that. 

Eric



Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Huang, Ying
On Thu, 2008-05-15 at 16:09 -0400, Vivek Goyal wrote:
[...]
> Ok, You want to make BIOS calls. We already do that using vm86 mode and
> use bios real mode interrupts. So why do we need this interface? Or, IOW,
> how is this interface better?

It can call code in 32-bit physical mode in addition to real mode. So it
can be used to call EFI runtime services, especially to call EFI 64 runtime
services under a 32-bit kernel or vice versa.

The main purpose of kexec jump is hibernation. But I think that if the
effort is small, why not support general 32-bit physical mode code calls
at the same time?

Best Regards,
Huang Ying




Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Huang, Ying
On Thu, 2008-05-15 at 18:35 -0700, Eric W. Biederman wrote:
> Vivek Goyal <[EMAIL PROTECTED]> writes:
> 
> > ioapic_suspend() is not putting APICs in Legacy mode and that's why
> > we are seeing the issue. It only saves the IOAPIC routing table entries
> > and these entries are restored during ioapic_resume().
> >
> > But I think somebody has to put APICs in legacy mode for normal 
> > hibernation also. Not sure who does it. May be BIOS, so that during
> > resume, second kernel can get the timer interrupts.
> 
> I doubt anything cares in the suspend to ram case. There should just
> be a small BIOS trampoline to get back to linux when the processor
> restarts.  And you don't need interrupts for any of that. 

As far as I know, in suspend to RAM an interrupt is used as the wake-up
event, for example a keyboard interrupt.

Best Regards,
Huang Ying



Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Vivek Goyal
On Fri, May 16, 2008 at 09:48:34AM +0800, Huang, Ying wrote:
> On Thu, 2008-05-15 at 16:09 -0400, Vivek Goyal wrote:
> [...]
> > Ok, You want to make BIOS calls. We already do that using vm86 mode and
> > use bios real mode interrupts. So why do we need this interface? Or, IOW,
> > how is this interface better?
> 
> It can call code in 32-bit physical mode in addition to real mode. So It
> can be used to call EFI runtime service, especially call EFI 64 runtime
> service under 32-bit kernel or vice versa.
> 
> The main purpose of kexec jump is for hibernation. But I think if the
> effort is small, why not support general 32-bit physical mode code call
> at same time.
> 

In general, what are the environment requirements for EFI runtime
services? I mean, is it enough that the processor is in protected mode with
paging disabled, or does one need to stop all other CPUs and devices and then
make the call (as we are doing in this case)?

Thanks
Vivek



Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Vivek Goyal
On Thu, May 15, 2008 at 12:57:53PM +0800, Huang, Ying wrote:
> On Wed, 2008-05-14 at 14:43 -0700, Eric W. Biederman wrote:
> [...]
> > Then as a preliminary design let's plan on this.
> > 
> > - Pass the rentry point as the return address (using the C ABI).
> >   We may want to load the stack pointer etc so we can act as
> >   a direct entry point for new code.
> 
> There are some issues about passing entry point as return address. The
> kexec jump (or kexec with return) is used for
> 
> - Switching between original kernel (A) and kexeced kernel (B)
> - Call some code (such as BIOS code) in physical mode
> 
> 1) When call some code in physical mode, the called code can use a
> simple return to return to kernel A. So there is no return address on
> stack after return to kernel A. Instead, argument 1 is on stack top.
> 
> 2) When switch back from kernel B to kernel A, kernel B will call the
> jump back entry of kernel A with C ABI. So, the return address is on
> stack top. And kernel A get jump back entry of kernel B via the return
> address.
> 
> Because the stack state is different between 1) and 2), the jump back
> entry of kernel A should distinguish them. Possible solution can be as
> follow:
> 
> a) Before kernel A call some physical mode code or kernel B, it set
> argument 1 to be a magic number that can not be return address (such as
> -1). Jump back entry of kernel A can check whether the stack top is
> argument 1 or return address.
> 
> b) Distinguish by return address. Such as, called physical mode code
> must return 0, while kernel B must set %eax to some other number.
> 

IMHO, this kind of makes more sense to me when keeping C function-like
semantics in mind.

Both cases can be treated like calls to functions (calling a BIOS function
and jumping to kernel B). The basic difference between the two cases is the
re-entry point. In the BIOS function case, we always re-enter the function at
the start, but in the case of kernel B, except for the first entry, all other
entries happen at a run-time determined address, which needs to be
communicated to kernel A.

I would think that the second kernel B should just execute "ret", with the new
entry address of kernel B passed to kernel A through %eax (the return value of
the function).

I'm not sure if BIOS routines can always return a fixed code so that we can
differentiate between the two cases.

Thanks
Vivek



Re: [PATCH -mm] kexec jump -v9

2008-05-15 Thread Huang, Ying
On Thu, 2008-05-15 at 21:51 -0400, Vivek Goyal wrote:
> On Fri, May 16, 2008 at 09:48:34AM +0800, Huang, Ying wrote:
> > On Thu, 2008-05-15 at 16:09 -0400, Vivek Goyal wrote:
> > [...]
> > > Ok, You want to make BIOS calls. We already do that using vm86 mode and
> > > use bios real mode interrupts. So why do we need this interface? Or, IOW,
> > > how is this interface better?
> > 
> > It can call code in 32-bit physical mode in addition to real mode. So It
> > can be used to call EFI runtime service, especially call EFI 64 runtime
> > service under 32-bit kernel or vice versa.
> > 
> > The main purpose of kexec jump is for hibernation. But I think if the
> > effort is small, why not support general 32-bit physical mode code call
> > at same time.
> > 
> 
> In general what's the environment requirements for EFI runtime 
> services? I mean, just that processor should be in protected mode with
> paging disabled or one need to stop all other cpus and devices and then make
> the call (as we are doing in this case?). 

Putting the processor in protected mode with paging disabled is sufficient.
In one of the previous kexec jump versions, I provided options to choose the
state saved (whether to stop other CPUs, whether to stop devices).

I agree that we should focus on kexec based hibernation for now. But I think
it is reasonable to keep the possibility open with minimal effort.

Best Regards,
Huang Ying




Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Huang, Ying
On Thu, 2008-05-15 at 22:00 -0400, Vivek Goyal wrote:
[...]
> IMHO, this kind of make more sense to me when keeping C function like
> semantics in mind.
> 
> Both the cases can be treated like calls to functions (calling BIOS function
> and jumping to kernel B). The basic difference between two cases is the
> re-entry point. In BIOS function case, we always re-enter the function at the
> start but in case of kernel B, except first entry, all other entries happen
> at a run time determined address, which needs to be communicated to kernel A.
> 
> I would think that second kernel B just should execute "ret" and new entry
> address of kernel B is passed to kernel A through %eax (return value of
> function).

The disadvantage of this solution is that each kernel must know whether it
is the original kernel (A) or the kexeced kernel (B). Different code has to be
used by kernel A and kernel B. And after jumping from A to B and from B to A,
kernel A must use different code when jumping from A to B again than it did
the first time.

Best Regards,
Huang Ying




Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Eric W. Biederman
"Huang, Ying" <[EMAIL PROTECTED]> writes:

> On Thu, 2008-05-15 at 11:39 -0700, Eric W. Biederman wrote:
> [...]
>> 2) After we figure out our address read the stack pointer from
>>    a fixed location and simply set it.  (This is my preference)
>
> Just for confirmation (My English is poor).
>
> Do you mean that kernel A just read the stack top as re-entry point,
> regardless of whether it is return address or argument 1?

What I was thinking was:

In kernel A()

relocate_new_kernel:

        ...

        call    *%eax

kexec_jump_back_entry:
        /* This code should be PIC so figure out where we are */
        call    1f
1:
        popl    %edi
        subl    $(1b - relocate_kernel), %edi

        /* Setup a safe stack */
        leal    PAGE_SIZE(%edi), %esp
        ...


Then in purgatory we can read the address of kexec_jump_back_entry
by examining 0(%esp) and export it in whatever fashion is sane.

However we reach kexec_jump_back_entry we should be fine.

Eric
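
The position-independent sequence above comes down to one subtraction: "call 1f" followed by "popl %edi" yields the run-time address of label 1, and subtracting the link-time distance (1b - relocate_kernel) gives the run-time address of relocate_kernel. The C sketch below only illustrates that arithmetic. LABEL_OFFSET, the 4096-byte PAGE_SIZE and both function names are assumptions made for the example, not values taken from the kernel source.

```c
#include <stdint.h>

/* Hypothetical link-time distance "1b - relocate_kernel" (an assumption). */
#define LABEL_OFFSET 0x40u
/* Page size assumed for i386 in this sketch. */
#define PAGE_SIZE    4096u

/* Mirrors "subl $(1b - relocate_kernel), %edi": given the run-time address
 * of label 1 (what "call 1f; popl %edi" produces), recover the run-time
 * address of relocate_kernel. */
static uint32_t relocate_kernel_base(uint32_t runtime_label_addr)
{
    return runtime_label_addr - LABEL_OFFSET;
}

/* Mirrors "leal PAGE_SIZE(%edi), %esp": the safe stack starts one page
 * above that base and grows downward. */
static uint32_t safe_stack_top(uint32_t runtime_label_addr)
{
    return relocate_kernel_base(runtime_label_addr) + PAGE_SIZE;
}
```

Because the result depends only on where the code is actually running, the same entry code works no matter where the page was copied to.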



Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Huang, Ying
On Thu, 2008-05-15 at 19:25 -0700, Eric W. Biederman wrote:
> "Huang, Ying" <[EMAIL PROTECTED]> writes:
> 
> > On Thu, 2008-05-15 at 11:39 -0700, Eric W. Biederman wrote:
> > [...]
> >> 2) After we figure out our address read the stack pointer from
> >>    a fixed location and simply set it.  (This is my preference)
> >
> > Just for confirmation (My English is poor).
> >
> > Do you mean that kernel A just read the stack top as re-entry point,
> > regardless of whether it is return address or argument 1?
> 
> What I was thinking was:
> 
> In kernel A()
> 
> relocate_new_kernel:
> 
>         ...
> 
>         call    *%eax
> 
> kexec_jump_back_entry:
>         /* This code should be PIC so figure out where we are */
>         call    1f
> 1:
>         popl    %edi
>         subl    $(1b - relocate_kernel), %edi
> 
>         /* Setup a safe stack */
>         leal    PAGE_SIZE(%edi), %esp
>         ...
> 
> 
> Then in purgatory we can read the address of kexec_jump_back_entry
> by examining 0(%esp) and export it in whatever fashion is sane.
> 
> However we reach kexec_jump_back_entry we should be fine.

I think it is reasonable to enable jumping back and forth more than one
time. So the following should be possible:

1. Jump from A to B (actually jump to purgatory, trigger the boot of B)
2. Jump from B to A
3. Jump from A to B again (jump to the kexec_jump_back_entry of B)
4. Jump from B to A
...

So it should be possible to get the re-entry point of kernel B in
kexec_jump_back_entry of kernel A too. So I think that in
kexec_jump_back_entry the caller's stack should be checked to get the
re-entry point of the peer. And the stack state differs depending on where
we come from: from relocate_new_kernel() or from a return.

Best Regards,
Huang Ying




Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Eric W. Biederman
"Huang, Ying" <[EMAIL PROTECTED]> writes:

> The disadvantage of this solution is that kernel B must know it is
> original kernel (A) or kexeced kernel (B). Different code should be used
> by kernel A and kernel B. And after jump from A to B, jump from B to A,
> when jump from A to B again, kernel A must use different code from the
> first time.

I don't know what the case is for keeping two kernels in memory and switching
between them.

I suspect a small piece of trampoline code between the two kernels could
handle the case. (i.e. purgatory pays attention).

That is a fundamental aspect of the design.  A general purpose infrastructure
with trampoline code to adapt it to whatever situation comes up.

Eric



Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Vivek Goyal
On Fri, May 16, 2008 at 10:56:15AM +0800, Huang, Ying wrote:
> On Thu, 2008-05-15 at 19:25 -0700, Eric W. Biederman wrote:
> > "Huang, Ying" <[EMAIL PROTECTED]> writes:
> > 
> > > On Thu, 2008-05-15 at 11:39 -0700, Eric W. Biederman wrote:
> > > [...]
> > >> 2) After we figure out our address read the stack pointer from
> > >>    a fixed location and simply set it.  (This is my preference)
> > >
> > > Just for confirmation (My English is poor).
> > >
> > > Do you mean that kernel A just read the stack top as re-entry point,
> > > regardless of whether it is return address or argument 1?
> > 
> > What I was thinking was:
> > 
> > In kernel A()
> > 
> > relocate_new_kernel:
> > 
> >         ...
> > 
> >         call    *%eax
> > 
> > kexec_jump_back_entry:
> >         /* This code should be PIC so figure out where we are */
> >         call    1f
> > 1:
> >         popl    %edi
> >         subl    $(1b - relocate_kernel), %edi
> > 
> >         /* Setup a safe stack */
> >         leal    PAGE_SIZE(%edi), %esp
> >         ...
> > 
> > 
> > Then in purgatory we can read the address of kexec_jump_back_entry
> > by examining 0(%esp) and export it in whatever fashion is sane.
> > 
> > However we reach kexec_jump_back_entry we should be fine.
> 

Huang is making use of purgatory only for booting kernel B for the first
time. Once kernel B is booted, all the transitions (A-->B and B-->A)
happen without using purgatory. They just keep jumping back and forth
to "kexec_jump_back_entry".

Not using purgatory for later transitions is probably justified as long as
the kernel code is simple and small. Otherwise we shall also have to teach
purgatory the special case of resuming kernel B versus booting kernel B.

> I think it is reasonable to enable jumping back and forth more than one
> time. So the following should be possible:
> 
> 1. Jump from A to B (actually jump to purgatory, trigger the boot of B)
> 2. Jump from B to A
> 3. Jump from A to B again (jump to the kexec_jump_back_entry of B)
> 4. Jump from B to A
> ...
> 
> So it should be possible to get the re-entry point of kernel B in
> kexec_jump_back_entry of kernel A too. So I think in
> kexec_jump_back_entry, the caller's stack should be checked to get
> re-entry point of peer. And the stack state is different depend on where
> come from, from relocate_new_kernel() or return.
> 

To me this idea also looks good. So the control flow will look something
like the following?

relocate_new_kernel:

    if (!preserve_context)
        set registers to known state.
        jump to purgatory.
    else
        goto jump-back-setup:

jump-back-setup:
    - Color the stack.
      move $0x 0(%esp)

    - call %edx

kexec_jump_back_entry:

    - If 0(%esp) is not -1
          image->start = 0(%esp)  //Re entry point of kernel B. Store it.
      else
          We returned from BIOS call. Re-entry point has not changed
          Do nothing.

    - Continue to resume kernel A

Thanks
Vivek
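
The check at kexec_jump_back_entry outlined above can be sketched in C, purely as an illustration. The sketch assumes, as suggested in the thread, that -1 (all bits set) can never be a valid return address and is therefore safe as a stack marker. struct image_state, STACK_SENTINEL and handle_jump_back() are hypothetical names for this example. The real check would be done in assembly on 0(%esp).

```c
#include <stdint.h>

/* -1 as a 32-bit value: the marker written to the stack before the call.
 * Assumed never to be a valid return address. */
#define STACK_SENTINEL UINT32_MAX

/* Hypothetical stand-in for the image->start field from the pseudo-code. */
struct image_state {
    uint32_t start;   /* address used the next time we jump to the peer */
};

/* Decide what the value found at the top of the stack means.
 * Returns 1 if the peer kernel handed us a new re-entry point,
 * 0 if we simply returned from a physical-mode (BIOS-like) call. */
static int handle_jump_back(struct image_state *image, uint32_t stack_top)
{
    if (stack_top != STACK_SENTINEL) {
        /* The peer "called" us: the stack top is its return address,
         * which is where we must re-enter it next time. */
        image->start = stack_top;
        return 1;
    }
    /* The marker is still there: plain return, nothing to update. */
    return 0;
}
```

Either way, execution then continues with resuming kernel A.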
 



Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Eric W. Biederman
"Huang, Ying" <[EMAIL PROTECTED]> writes:


> I think it is reasonable to enable jumping back and forth more than one
> time.

I'm not opposed.  I just don't understand the utility yet.

> So the following should be possible:
>
> 1. Jump from A to B (actually jump to purgatory, trigger the boot of B)
> 2. Jump from B to A
> 3. Jump from A to B again (jump to the kexec_jump_back_entry of B)
  (And we go through purgatory which remembers
   the kexec_jump_back_entry of B)
> 4. Jump from B to A
> ...
>
> So it should be possible to get the re-entry point of kernel B in
> kexec_jump_back_entry of kernel A too. So I think in
> kexec_jump_back_entry, the caller's stack should be checked to get
> re-entry point of peer. And the stack state is different depend on where
> come from, from relocate_new_kernel() or return.

Yes.

Any conditional logic needs to be in purgatory or a similar trampoline.

Eric



Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load

2008-05-15 Thread Huang, Ying
On Thu, 2008-05-15 at 19:55 -0700, Eric W. Biederman wrote:
> "Huang, Ying" <[EMAIL PROTECTED]> writes:
> 
> > The disadvantage of this solution is that kernel B must know it is
> > original kernel (A) or kexeced kernel (B). Different code should be used
> > by kernel A and kernel B. And after jump from A to B, jump from B to A,
> > when jump from A to B again, kernel A must use different code from the
> > first time.
> 
> I don't know what the case is for keeping two kernels in memory and switching
> between them.

This can be used to save the memory image of kernel B and speed up
hibernation. A real boot of kernel B is only needed the first time.

> I suspect a small piece of trampoline code between the two kernels could
> handle the case. (i.e. purgatory pays attention).
> 
> That is a fundamental aspect of the design.  A general purpose infrastructure
> with trampoline code to adapt it to whatever situation comes up.

It is possible to use purgatory to deal with this problem.

Jump from kernel A to kernel B
    Jump to entry of purgatory (purgatory_entry)
    Purgatory saves the return address (kexec_jump_back_entry_A)
    Purgatory sets kexec_jump_back_entry for kernel B to a code
        segment in purgatory, say kexec_jump_back_entry_A_for_B
    Purgatory jumps to entry point of kernel B
Jump from kernel B to kernel A
    Jump to purgatory (kexec_jump_back_entry_A_for_B)
    Purgatory saves the return address (kexec_jump_back_entry_B)
    Purgatory returns to kernel A (kexec_jump_back_entry_A)
Jump from kernel A to kernel B again
    Jump to entry of purgatory (purgatory_entry)
    Purgatory saves the return address (kexec_jump_back_entry_A)
    Purgatory jumps to kexec_jump_back_entry_B

The disadvantage of this solution is that some information is saved in
purgatory (kexec_jump_back_entry_A, kexec_jump_back_entry_B). So
purgatory must be saved too when saving the memory image of kernel A or
kernel B. Purgatory can be seen as a part of kernel B, but it is a
little tricky to think of it as a part of kernel A too.

Best Regards,
Huang Ying
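
The purgatory bookkeeping in the sequence above can be modelled as a tiny state machine. The C sketch below is only a model of the control flow, not purgatory code. struct purgatory_state and both functions are hypothetical, and using 0 to mean "kernel B has not jumped back yet" is an assumption made for the example.

```c
#include <stdint.h>

/* State the trampoline would have to remember between jumps. */
struct purgatory_state {
    uint32_t kernel_b_entry;     /* boot entry point of kernel B */
    uint32_t jump_back_entry_a;  /* saved return address of kernel A */
    uint32_t jump_back_entry_b;  /* saved return address of kernel B,
                                    0 until B has jumped back once */
};

/* Kernel A enters via purgatory_entry. Save A's return address and
 * return the address to jump to: B's boot entry the first time, B's
 * saved re-entry point on every later jump. */
static uint32_t purgatory_enter_from_a(struct purgatory_state *p,
                                       uint32_t return_addr)
{
    p->jump_back_entry_a = return_addr;
    if (p->jump_back_entry_b != 0)
        return p->jump_back_entry_b;
    return p->kernel_b_entry;
}

/* Kernel B enters via kexec_jump_back_entry_A_for_B. Save B's return
 * address and hand control back to kernel A's saved entry. */
static uint32_t purgatory_enter_from_b(struct purgatory_state *p,
                                       uint32_t return_addr)
{
    p->jump_back_entry_b = return_addr;
    return p->jump_back_entry_a;
}
```

The two saved addresses are exactly the information that would have to be preserved along with the memory image, which is the disadvantage pointed out above.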



[RFC] Win32 port of the userspace tools using MinGW.

2008-05-15 Thread Jamey Sharp
OK, I haven't quite gotten around to posting the Windows kernel driver
source that goes with this. So I'm not asking that this patch be merged,
since nobody else can use it yet. :-) I'd love to get review, though:
Does this look like it's in a mergeable state? I'm happy with it, but do
I need to change anything to make it acceptable to you folks?

This patch requires basically all the other patches I've posted.


 Makefile.in   |1 +
 include/byteswap.h|   39 +++
 kexec/Makefile|6 +
 kexec/arch/i386/Makefile  |2 +
 kexec/arch/i386/kexec-x86.c   |2 +
 kexec/arch/i386/x86-linux-setup.c |4 +
 kexec/kexec-syscall.h |7 ++
 kexec/kexec.c |4 +
 kexec/kexec.h |   10 ++
 kexec/win32.c |  214 +
 10 files changed, 289 insertions(+), 0 deletions(-)
 create mode 100644 include/byteswap.h
 create mode 100644 kexec/win32.c

diff --git a/Makefile.in b/Makefile.in
index b51c3a1..bdd6ba7 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -24,6 +24,7 @@ ARCH  = @ARCH@
 OBJDIR = @OBJDIR@
 target = @target@
 host   = @host@
+host_os= @host_os@
 
 # Compiler for building kexec
 CC = @CC@
diff --git a/include/byteswap.h b/include/byteswap.h
new file mode 100644
index 000..cd5a726
--- /dev/null
+++ b/include/byteswap.h
@@ -0,0 +1,39 @@
+/* byteswap.h
+
+Copyright 2005 Red Hat, Inc.
+
+This file is part of Cygwin.
+
+This software is a copyrighted work licensed under the terms of the
+Cygwin license.  Please consult the file "CYGWIN_LICENSE" for
+details. */
+
+#ifndef _BYTESWAP_H
+#define _BYTESWAP_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+static __inline unsigned short
+bswap_16 (unsigned short __x)
+{
+  return (__x >> 8) | (__x << 8);
+}
+
+static __inline unsigned int
+bswap_32 (unsigned int __x)
+{
+  return (bswap_16 (__x & 0x) << 16) | (bswap_16 (__x >> 16));
+}
+
+static __inline unsigned long long
+bswap_64 (unsigned long long __x)
+{
+  return (((unsigned long long) bswap_32 (__x & 0xull)) << 32) | (bswap_32 (__x >> 32));
+}
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _BYTESWAP_H */
diff --git a/kexec/Makefile b/kexec/Makefile
index a80b940..fe05340 100644
--- a/kexec/Makefile
+++ b/kexec/Makefile
@@ -11,16 +11,22 @@ KEXEC_SRCS =
 KEXEC_GENERATED_SRCS =
 
 KEXEC_SRCS += kexec/kexec.c
+ifneq ($(host_os),mingw32msvc)
 KEXEC_SRCS += kexec/ifdown.c
+endif
 KEXEC_SRCS += kexec/kexec-elf.c
 KEXEC_SRCS += kexec/kexec-elf-exec.c
 KEXEC_SRCS += kexec/kexec-elf-core.c
 KEXEC_SRCS += kexec/kexec-elf-rel.c
 KEXEC_SRCS += kexec/kexec-elf-boot.c
 KEXEC_SRCS += kexec/kexec-iomem.c
+ifneq ($(host_os),mingw32msvc)
 KEXEC_SRCS += kexec/crashdump.c
 KEXEC_SRCS += kexec/crashdump-xen.c
 KEXEC_SRCS += kexec/phys_arch.c
+else
+KEXEC_SRCS += kexec/win32.c
+endif
 
 KEXEC_GENERATED_SRCS += $(PURGATORY_HEX_C)
 
diff --git a/kexec/arch/i386/Makefile b/kexec/arch/i386/Makefile
index f2d9636..f9dbb7b 100644
--- a/kexec/arch/i386/Makefile
+++ b/kexec/arch/i386/Makefile
@@ -9,7 +9,9 @@ i386_KEXEC_SRCS += kexec/arch/i386/kexec-multiboot-x86.c
 i386_KEXEC_SRCS += kexec/arch/i386/kexec-beoboot-x86.c
 i386_KEXEC_SRCS += kexec/arch/i386/kexec-nbi.c
 i386_KEXEC_SRCS += kexec/arch/i386/x86-linux-setup.c
+ifneq ($(host_os),mingw32msvc)
 i386_KEXEC_SRCS += kexec/arch/i386/crashdump-x86.c
+endif
 
 dist += kexec/arch/i386/Makefile $(i386_KEXEC_SRCS)\
kexec/arch/i386/kexec-x86.h kexec/arch/i386/crashdump-x86.h \
diff --git a/kexec/arch/i386/kexec-x86.c b/kexec/arch/i386/kexec-x86.c
index 89ccb0b..f937856 100644
--- a/kexec/arch/i386/kexec-x86.c
+++ b/kexec/arch/i386/kexec-x86.c
@@ -32,6 +32,7 @@
 #include "crashdump-x86.h"
 #include 
 
+#ifndef __MINGW32__
 static struct memory_range memory_range[MAX_MEMORY_RANGES];
 
 /* Return a sorted list of memory ranges. */
@@ -113,6 +114,7 @@ int get_memory_ranges(struct memory_range **range, int *ranges,
*ranges = memory_ranges;
return 0;
 }
+#endif /* !defined(__MINGW32__) */
 
 struct file_type file_type[] = {
{ "multiboot-x86", multiboot_x86_probe, multiboot_x86_load, 
diff --git a/kexec/arch/i386/x86-linux-setup.c b/kexec/arch/i386/x86-linux-setup.c
index 4b9a5e5..e750d82 100644
--- a/kexec/arch/i386/x86-linux-setup.c
+++ b/kexec/arch/i386/x86-linux-setup.c
@@ -23,8 +23,10 @@
 #include 
 #include 
 #include 
+#ifndef __MINGW32__
 #include 
 #include 
+#endif
 #include 
 #include 
 #include "../../kexec.h"
@@ -101,6 +103,7 @@ void setup_linux_bootloader_parameters(
 
 int setup_linux_vesafb(struct x86_linux_param_header *real_mode)
 {
+#ifndef __MINGW32__
struct fb_fix_screeninfo fix;
struct fb_var_screeninfo var;
int fd;
@@ -153,6 +156,7 @@ int setup_linux_vesafb(struct x86_linux_param_header *real_mode)
 
  out:
close(fd);
+#endif /* !defined(_

[PATCH] Factor uname-based native architecture detection into a common function.

2008-05-15 Thread Jamey Sharp
This code was copy-pasted into every architecture and was basically
identical.

Besides producing a nice net reduction in code, this factors a
portability challenge into a single function that can be easily replaced
at build-time.

Signed-off-by: Jamey Sharp <[EMAIL PROTECTED]>
---
This would allow arch_compat_trampoline to be simplified, since every
architecture besides i386 has only "return 0;" in the function body now.

Note: I can't confirm that this works on any architecture but i386.
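
For readers skimming the diff, the table-driven lookup this patch moves towards can be sketched roughly as follows. This is an illustration and not the contents of kexec/phys_arch.c: the field names of struct arch_map_entry, the DEMO_* flag values and the function names lookup_arch() and native_arch() are all assumptions for the example. Only the table shape (a machine name paired with a flag, terminated by { 0 }) follows the hunks below.

```c
#include <string.h>
#include <sys/utsname.h>

/* Assumed layout: a uname machine string paired with an arch flag. */
struct arch_map_entry {
    const char *machine;
    unsigned long arch;
};

/* Placeholder flag values for this sketch, not kexec's real constants. */
#define DEMO_ARCH_DEFAULT 0x0ul
#define DEMO_ARCH_X86_64  0x1ul

static const struct arch_map_entry demo_arches[] = {
    { "i386",   DEMO_ARCH_DEFAULT },
    { "i686",   DEMO_ARCH_DEFAULT },
    { "x86_64", DEMO_ARCH_X86_64 },
    { 0 },  /* terminator: machine == NULL */
};

/* Walk the table until the NULL terminator. On a match, store the flag
 * in *arch and return 0. Return -1 for an unsupported machine. */
static int lookup_arch(const struct arch_map_entry *map,
                       const char *machine, unsigned long *arch)
{
    for (; map->machine != NULL; map++) {
        if (strcmp(map->machine, machine) == 0) {
            *arch = map->arch;
            return 0;
        }
    }
    return -1;
}

/* The only piece that touches uname(). A port to a platform without
 * uname() would replace just this function. */
static int native_arch(const struct arch_map_entry *map, unsigned long *arch)
{
    struct utsname utsname;

    if (uname(&utsname) < 0)
        return -1;
    return lookup_arch(map, utsname.machine, arch);
}
```

Keeping the string matching separate from the uname() call is what lets each architecture shrink to a data table.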

 kexec/Makefile   |1 +
 kexec/arch/arm/kexec-arm.c   |   23 -
 kexec/arch/i386/kexec-x86.c  |   39 -
 kexec/arch/ia64/kexec-ia64.c |   23 -
 kexec/arch/mips/kexec-mips.c |   29 +++
 kexec/arch/ppc/kexec-ppc.c   |   29 +++
 kexec/arch/ppc64/kexec-ppc64.c   |   29 +++
 kexec/arch/s390/kexec-s390.c |8 +-
 kexec/arch/sh/kexec-sh.c |   34 +---
 kexec/arch/x86_64/kexec-x86_64.c |   29 +++
 kexec/kexec.c|6 +
 kexec/kexec.h|8 +++
 kexec/phys_arch.c|   23 ++
 13 files changed, 109 insertions(+), 172 deletions(-)
 create mode 100644 kexec/phys_arch.c

diff --git a/kexec/Makefile b/kexec/Makefile
index 98fed4c..a80b940 100644
--- a/kexec/Makefile
+++ b/kexec/Makefile
@@ -20,6 +20,7 @@ KEXEC_SRCS += kexec/kexec-elf-boot.c
 KEXEC_SRCS += kexec/kexec-iomem.c
 KEXEC_SRCS += kexec/crashdump.c
 KEXEC_SRCS += kexec/crashdump-xen.c
+KEXEC_SRCS += kexec/phys_arch.c
 
 KEXEC_GENERATED_SRCS += $(PURGATORY_HEX_C)
 
diff --git a/kexec/arch/arm/kexec-arm.c b/kexec/arch/arm/kexec-arm.c
index 78a55e6..a7efdb0 100644
--- a/kexec/arch/arm/kexec-arm.c
+++ b/kexec/arch/arm/kexec-arm.c
@@ -12,7 +12,6 @@
 #include 
 #include 
 #include 
-#include 
 #include "../../kexec.h"
 #include "../../kexec-syscall.h"
 #include "kexec-arm.h"
@@ -109,25 +108,13 @@ int arch_process_options(int argc, char **argv)
return 0;
 }
 
+const struct arch_map_entry arches[] = {
+   { "arm", KEXEC_ARCH_ARM },
+   { 0 },
+};
+
 int arch_compat_trampoline(struct kexec_info *info)
 {
-   int result;
-   struct utsname utsname;
-   result = uname(&utsname);
-   if (result < 0) {
-   fprintf(stderr, "uname failed: %s\n",
-   strerror(errno));
-   return -1;
-   }
-   if (strncmp(utsname.machine, "arm",3) == 0)
-   {
-   info->kexec_flags |= KEXEC_ARCH_ARM;
-   }
-   else {
-   fprintf(stderr, "Unsupported machine type: %s\n",
-   utsname.machine);
-   return -1;
-   }
return 0;
 }
 
diff --git a/kexec/arch/i386/kexec-x86.c b/kexec/arch/i386/kexec-x86.c
index 4a41fed..89ccb0b 100644
--- a/kexec/arch/i386/kexec-x86.c
+++ b/kexec/arch/i386/kexec-x86.c
@@ -25,7 +25,6 @@
 #include 
 #include 
 #include 
-#include 
 #include "../../kexec.h"
 #include "../../kexec-elf.h"
 #include "../../kexec-syscall.h"
@@ -223,29 +222,22 @@ int arch_process_options(int argc, char **argv)
return 0;
 }
 
+const struct arch_map_entry arches[] = {
+   /* For compatibility with older patches
+* use KEXEC_ARCH_DEFAULT instead of KEXEC_ARCH_386 here.
+*/
+   { "i386", KEXEC_ARCH_DEFAULT },
+   { "i486", KEXEC_ARCH_DEFAULT },
+   { "i586", KEXEC_ARCH_DEFAULT },
+   { "i686", KEXEC_ARCH_DEFAULT },
+   { "x86_64", KEXEC_ARCH_X86_64 },
+   { 0 },
+};
+
 int arch_compat_trampoline(struct kexec_info *info)
 {
-   int result;
-   struct utsname utsname;
-   result = uname(&utsname);
-   if (result < 0) {
-   fprintf(stderr, "uname failed: %s\n",
-   strerror(errno));
-   return -1;
-   }
-   if ((strcmp(utsname.machine, "i386") == 0) ||
-   (strcmp(utsname.machine, "i486") == 0) ||
-   (strcmp(utsname.machine, "i586") == 0) ||
-   (strcmp(utsname.machine, "i686") == 0)) 
+   if ((info->kexec_flags & KEXEC_ARCH_MASK) == KEXEC_ARCH_X86_64)
{
-   /* For compatibility with older patches 
-* use KEXEC_ARCH_DEFAULT instead of KEXEC_ARCH_386 here.
-*/
-   info->kexec_flags |= KEXEC_ARCH_DEFAULT;
-   }
-   else if (strcmp(utsname.machine, "x86_64") == 0)
-   {
-   info->kexec_flags |= KEXEC_ARCH_X86_64;
if (!info->rhdr.e_shdr) {
fprintf(stderr, 
                                "A trampoline is required for cross architecture support\n");
@@ -256,11 +248,6 @@ int arch_compat_trampoline(struct kexec_info *info)

                info->entry = (void *)elf_rel_get_addr(&info->rhdr, "compat_x86_64");
}
-   el