Hi Chen Yu,

On Sun, Sep 25, 2016 at 12:17:57PM +0800, Chen Yu wrote:
> On some platforms, there is occasional panic triggered when trying to
> resume from hibernation, a typical panic looks like:
> 
> "BUG: unable to handle kernel paging request at ffff880085894000
> IP: [<ffffffff810c5dc2>] load_image_lzo+0x8c2/0xe70"
> 
> Investigation carried out by Lee Chun-Yi shows that this is because
> e820 map has been changed by BIOS across hibernation, and one
> of the page frames from suspend kernel is right located in restore
> kernel's unmapped region, so panic comes out when accessing unmapped
> kernel address.
>

Sorry, I could not get the affected machine back. So for testing I added
a patch that fools the kernel into seeing a changed e820 map on S4 resume.
 
> In order to expose this issue earlier, the md5 hash of e820 map
> is passed from suspend kernel to restore kernel, and the restore
> kernel will terminate the resume process once it finds the md5
> hash are not the same.
>
[...snip] 
> ---
>  arch/x86/power/hibernate_64.c | 92 ++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 90 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> index 9634557..d81b1af 100644
> --- a/arch/x86/power/hibernate_64.c
> +++ b/arch/x86/power/hibernate_64.c
> @@ -11,6 +11,10 @@
>  #include <linux/gfp.h>
>  #include <linux/smp.h>
>  #include <linux/suspend.h>
> +#include <linux/scatterlist.h>
> +#include <linux/kdebug.h>

[...snip]

> @@ -216,5 +297,12 @@ int arch_hibernation_header_restore(void *addr)
>       restore_jump_address = rdr->jump_address;
>       jump_address_phys = rdr->jump_address_phys;
>       restore_cr3 = rdr->cr3;
> -     return (rdr->magic == RESTORE_MAGIC) ? 0 : -EINVAL;
> +
> +     if (rdr->magic != RESTORE_MAGIC)
> +             return -EINVAL;
> +
> +     if (hibernation_e820_mismatch(rdr->e820_digest))
> +             return -ENODEV;
> +
> +     return 0;
>  }
> --

Because check_image_kernel() does not look at the returned error code, the
kernel only prints "PM: Image mismatch: architecture specific data". That
message covers two different failure reasons.
 
I suggest printing a log message, as the restore function on the ARM64
architecture does. Something like this; please feel free to modify the
wording:

Index: linux/arch/x86/power/hibernate_64.c
===================================================================
--- linux.orig/arch/x86/power/hibernate_64.c
+++ linux/arch/x86/power/hibernate_64.c
@@ -298,11 +298,16 @@ int arch_hibernation_header_restore(void
        jump_address_phys = rdr->jump_address_phys;
        restore_cr3 = rdr->cr3;
 
-       if (rdr->magic != RESTORE_MAGIC)
+
+       if (rdr->magic != RESTORE_MAGIC) {
+               pr_crit("Hibernate image not generated by this kernel!\n");
                return -EINVAL;
+       }
 
-       if (hibernation_e820_mismatch(rdr->e820_digest))
+       if (hibernation_e820_mismatch(rdr->e820_digest)) {
+               pr_crit("The e820 saved regions changed!\n");
                return -ENODEV;
+       }
 
        return 0;
 }

The other parts of your patch look good to me.


Thanks a lot!
Joey Lee
