Re: [V2][PATCH 2/2] pefile: Fix the failure of calculation for digest

2016-07-13 Thread Dave Young
Cc crypto list

On 07/13/16 at 09:35pm, Lans Zhang wrote:
> The commit e68503bd68 forgot to set digest_len and thus cause the following
> error reported by kexec when launching a crash kernel:
> "kexec_file_load failed: Bad message"
> 
> Fixes: e68503bd68 (KEYS: Generalise system_verify_data() to provide access to 
> internal content)
> Signed-off-by: Lans Zhang 
> Cc: David Howells 
> Cc: Dave Young 
> Cc: Baoquan He 
> Cc: Vivek Goyal 
> ---
>  crypto/asymmetric_keys/mscode_parser.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/crypto/asymmetric_keys/mscode_parser.c 
> b/crypto/asymmetric_keys/mscode_parser.c
> index 6a76d5c..9492e1c 100644
> --- a/crypto/asymmetric_keys/mscode_parser.c
> +++ b/crypto/asymmetric_keys/mscode_parser.c
> @@ -124,5 +124,10 @@ int mscode_note_digest(void *context, size_t hdrlen,
>   struct pefile_context *ctx = context;
>  
>   ctx->digest = kmemdup(value, vlen, GFP_KERNEL);
> - return ctx->digest ? 0 : -ENOMEM;
> + if (!ctx->digest)
> + return -ENOMEM;
> +
> + ctx->digest_len = vlen;
> +
> + return 0;
>  }
> -- 
> 1.9.1
> 
> 
> ___
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Thiago Jung Bauermann
Am Mittwoch, 13 Juli 2016, 21:59:18 schrieb Arnd Bergmann:
> On Wednesday, July 13, 2016 3:45:41 PM CEST Thiago Jung Bauermann wrote:
> > Am Mittwoch, 13 Juli 2016, 15:13:42 schrieb Arnd Bergmann:
> > > On Wednesday, July 13, 2016 10:41:28 AM CEST Mark Rutland wrote:
> > > > On Wed, Jul 13, 2016 at 10:01:33AM +0200, Arnd Bergmann wrote:
> > > > > - kboot/petitboot with all of the user space being part of the
> > > > > trusted
> > > > > boot> >
> > > > > 
> > > > >   chain: it would be good to allow these to modify the dtb as
> > > > >   needed
> > > > >   without breaking the trust chain, just like we allow grub or
> > > > >   u-boot
> > > > >   to modify the dtb before passing it to the kernel.
> > > > 
> > > > It depends on *what* we need to modify here. We can modify the
> > > > bootargs
> > > > and initrd properties as part of the kexec_file_load syscall, so
> > > > what
> > > > else would we want to alter?
> > > 
> > > I guess petitboot can also just use kexec_load() instead of
> > > kexec_file_load(), as long as the initramfs containing petitboot is
> > > trusted by the kernel.
> > 
> > For secure boot, Petitboot needs to use kexec_file_load, because of the
> > following two features which the system call enables:
> > 
> > 1. only allow loading of signed kernels.
> > 2. "measure" (i.e., record the hashes of) the kernel, initrd, kernel
> > 
> >command line and other boot inputs for the Integrity Measurement
> >Architecture subsystem.
> > 
> > Those can't be done with kexec_load.
> 
> Can't petitboot do both of these in user space?

To be honest I'm not sure if it *can't* be done from userspace but if you do 
it from the kernel you can guarantee that any kernel image that is loaded 
gets verified and measured.

Whereas if you verify and measure the kernel in userspace then if there's a 
vulnerability in the system which allows an attacker to upload their own 
binary, then they can use kexec_load directly and bypass the verification 
and measurement.

So it's a more resilient design.

-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Dave Young
On 07/14/16 at 02:38am, AKASHI Takahiro wrote:
> Apologies for the slow response. I'm attending LinuxCon this week.
> 
> On Wed, Jul 13, 2016 at 10:34:47AM +0100, Mark Rutland wrote:
> > On Wed, Jul 13, 2016 at 10:36:14AM +0800, Dave Young wrote:
> > > But consider we can kexec to a different kernel and a different initrd so 
> > > there
> > > will be use cases to pass a total different dtb as well.
> > 
> > It depends on what you mean by "a different kernel", and what this
> > implies for the DTB.
> > 
> > I expect future arm64 Linux kernels to function with today's DTBs, and
> > the existing boot protocol. The kexec_file_load syscall already has
> > enough information for the kernel to inject the initrd and bootargs
> > properties into a DTB.
> > 
> > In practice on x86 today, kexec_file_load only supports booting to a
> > Linux kernel, because the in-kernel purgatory only implements the x86
> > Linux boot protocol. Analagously, for arm64 I think that the first
> > kernel should use its internal copy of the boot DTB, with /chosen fixed
> > up appropriately, assuming the next kernel is an arm64 Linux image.
> > 
> > If booting another OS, the only parts of the DTB I would expect to
> > change are the properties under chosen, as everything else *should* be
> > OS-independent. However the other OS may have a completely different
> > boot protocol, might not even take a DTB, and will likely need a
> > compeltely different purgatory implementation. So just allowing the DTB
> > to be altered isn't sufficient for that case.
> > 
> > There might be cases where we want a different DTB, but as far as I can
> > tell we have nothing analagous on x86 today. If we do need this, we
> > should have an idea of what real case(s) were trying to solve.
> 
> What I had in my mind was:
> 
> - Kdump
>   As Russel said, we definitely need to modify dtb.
>   In addition to bootargs and initrd proerties (FYI, in my arm64
>   implementation for arm64, eflcorehdr info is also passed as DT
>   property), we may want to remove unnecessary devices and
>   even add a dedicated storage device for storing a core dump image.
> - Say, booting BE kernel on ACPI LE kernel
>   In this case, there is no useful dtb in the kernel.
> 
> Have said that, as Mark said, we may be able to use normal kexec_load
> system call if we don't need a "secure" kexec.
> 
> BTW, why doesn't the current kexec_load have ability of verifying
> a signature of initramfs image? Is IMA/EVM expected to be used
> at runtime?

I believe there are some limitations for verify signatures in kexec_load.
First kexec-tools need to be trusted, but there's no way to sign and
verify signature of shared libraries. There maybe other limitations I
can not remember which are also reasons why Vivek moved to current
file based syscall.

Thanks
Dave

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Dave Young
On 07/13/16 at 10:34am, Mark Rutland wrote:
> On Wed, Jul 13, 2016 at 10:36:14AM +0800, Dave Young wrote:
> > But consider we can kexec to a different kernel and a different initrd so 
> > there
> > will be use cases to pass a total different dtb as well.
> 
> It depends on what you mean by "a different kernel", and what this
> implies for the DTB.
> 

I thought about kexec as a boot loader just like other bootloaders.
So just like a normal boot kexec should also accept external dtb.
But acutally kexec is different because it can get original dtb and
use it. So I agreed if we can not find a real use case that we have
to extend it we should keep current interface.

> I expect future arm64 Linux kernels to function with today's DTBs, and
> the existing boot protocol. The kexec_file_load syscall already has
> enough information for the kernel to inject the initrd and bootargs
> properties into a DTB.
> 
> In practice on x86 today, kexec_file_load only supports booting to a
> Linux kernel, because the in-kernel purgatory only implements the x86
> Linux boot protocol. Analagously, for arm64 I think that the first
> kernel should use its internal copy of the boot DTB, with /chosen fixed
> up appropriately, assuming the next kernel is an arm64 Linux image.
> 
> If booting another OS, the only parts of the DTB I would expect to
> change are the properties under chosen, as everything else *should* be
> OS-independent. However the other OS may have a completely different
> boot protocol, might not even take a DTB, and will likely need a
> compeltely different purgatory implementation. So just allowing the DTB
> to be altered isn't sufficient for that case.
> 
> There might be cases where we want a different DTB, but as far as I can
> tell we have nothing analagous on x86 today. If we do need this, we
> should have an idea of what real case(s) were trying to solve.

Agreed.

Thanks
Dave

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Arnd Bergmann
On Wednesday, July 13, 2016 3:45:41 PM CEST Thiago Jung Bauermann wrote:
> Am Mittwoch, 13 Juli 2016, 15:13:42 schrieb Arnd Bergmann:
> > On Wednesday, July 13, 2016 10:41:28 AM CEST Mark Rutland wrote:
> > > On Wed, Jul 13, 2016 at 10:01:33AM +0200, Arnd Bergmann wrote:
> > > > - kboot/petitboot with all of the user space being part of the trusted
> > > > boot> > 
> > > >   chain: it would be good to allow these to modify the dtb as needed
> > > >   without breaking the trust chain, just like we allow grub or u-boot
> > > >   to modify the dtb before passing it to the kernel.
> > > 
> > > It depends on *what* we need to modify here. We can modify the bootargs
> > > and initrd properties as part of the kexec_file_load syscall, so what
> > > else would we want to alter?
> > 
> > I guess petitboot can also just use kexec_load() instead of
> > kexec_file_load(), as long as the initramfs containing petitboot is
> > trusted by the kernel.
> 
> For secure boot, Petitboot needs to use kexec_file_load, because of the 
> following two features which the system call enables:
> 
> 1. only allow loading of signed kernels.
> 2. "measure" (i.e., record the hashes of) the kernel, initrd, kernel
>command line and other boot inputs for the Integrity Measurement
>Architecture subsystem.
> 
> Those can't be done with kexec_load.

Can't petitboot do both of these in user space?

Arnd

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Arnd Bergmann
On Wednesday, July 13, 2016 6:58:32 PM CEST Mark Rutland wrote:
> 
> >   we may want to remove unnecessary devices and even add a dedicated
> >   storage device for storing a core dump image.
> 
> I suspect that bringing up a minimal number of devices is better
> controlled by a cmdline option. In general, figuring out what is
> necessary and what is not is going to be board specific, so hacking the
> FW tables (DTB or ACPI) is not a very portable/reliable approach.
> 
> Do we actually add devices in practice? More so than the above that
> requires special knowledge of the platform (including things that were
> not described in the boot DTB).
> 
> In the ACPI case modifying a DTB alone is not sufficient to change the
> information regarding devices, as those won't be described in the DTB.
> It's not possible to convert ACPI to DTB in general.

A more likely scenario would be replacing ACPI tables with a DTB that
describes the platform in order to use devices that the ACPI tables
don't contain.

> > - Say, booting BE kernel on ACPI LE kernel
> >   In this case, there is no useful dtb in the kernel.

> If the platform only has ACPI, then you cannot boot a BE kernel to begin
> with. As above one cannot convert ACPI to DTB, so one would need
> extensive platform knowledge for this to work.

I think what he meant was to pass a DTB to the kexec kernel in order
to run BE, while the original kernel can only run LE due to ACPI.

If you boot a LE kernel using DTB, the same DTB should work
for a kexec boot for a BE kernel.

Arnd

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Thiago Jung Bauermann
Am Mittwoch, 13 Juli 2016, 15:13:42 schrieb Arnd Bergmann:
> On Wednesday, July 13, 2016 10:41:28 AM CEST Mark Rutland wrote:
> > On Wed, Jul 13, 2016 at 10:01:33AM +0200, Arnd Bergmann wrote:
> > > - kboot/petitboot with all of the user space being part of the trusted
> > > boot> > 
> > >   chain: it would be good to allow these to modify the dtb as needed
> > >   without breaking the trust chain, just like we allow grub or u-boot
> > >   to modify the dtb before passing it to the kernel.
> > 
> > It depends on *what* we need to modify here. We can modify the bootargs
> > and initrd properties as part of the kexec_file_load syscall, so what
> > else would we want to alter?
> 
> I guess petitboot can also just use kexec_load() instead of
> kexec_file_load(), as long as the initramfs containing petitboot is
> trusted by the kernel.

For secure boot, Petitboot needs to use kexec_file_load, because of the 
following two features which the system call enables:

1. only allow loading of signed kernels.
2. "measure" (i.e., record the hashes of) the kernel, initrd, kernel
   command line and other boot inputs for the Integrity Measurement
   Architecture subsystem.

Those can't be done with kexec_load.

As for what we need to modify, Petitboot does the following modifications to 
the DTB:

1. Set /chosen/linux,stdout-path based on which console is being used to 
interact with it, as Stewart mentioned in another email.
2. Set display properties on /pciex@n/.../vga@0 in machines with an 
OpenFirmware framebuffer.

-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Vivek Goyal
On Wed, Jul 13, 2016 at 06:40:10PM +0100, Russell King - ARM Linux wrote:
> On Wed, Jul 13, 2016 at 09:03:38AM -0400, Vivek Goyal wrote:
> > On Wed, Jul 13, 2016 at 09:26:39AM +0100, Russell King - ARM Linux wrote:
> > > Indeed - maybe Eric knows better, but I can't see any situation where
> > > the dtb we load via kexec should ever affect "the bootloader", unless
> > > the "kernel" that's being loaded into kexec is "the bootloader".
> > > 
> > > Now, going back to the more fundamental issue raised in my first reply,
> > > about the kernel command line.
> > > 
> > > On x86, I can see that it _is_ possible for userspace to specify a
> > > command line, and the kernel loading the image provides the command
> > > line to the to-be-kexeced kernel with very little checking.  So, if
> > > your kernel is signed, what stops the "insecure userspace" loading
> > > a signed kernel but giving it an insecure rootfs and/or console?
> > 
> > It is not kexec specific. I could do this for regular boot too, right?
> > 
> > Command line options are not signed. I thought idea behind secureboot
> > was to execute only trusted code and command line options don't enforce
> > you to execute unsigned code.
> > 
> > So it sounds like different class of security problems which you are
> > referring to and not necessarily covered by secureboot or signed
> > kernel.
> 
> Let me give you an example.
> 
> You have a secure boot setup, where the firmware/ROM validates the boot
> loader.  Good, the boot loader hasn't been tampered with.
> 
> You interrupt the boot loader and are able to modify the command line
> for the booted kernel.
> 
> The boot loader loads the kernel and verifies the kernel's signature.
> Good, the kernel hasn't been tampered with.  The kernel starts running.
> 
> You've plugged in a USB drive to the device, and specified a partition
> containing a root filesystem that you control to the kernel.  The
> validated kernel finds the USB drive, and mounts it, and executes
> your own binaries on the USB drive.

You will require physical access to the machine to be able to
insert your usb drive. And IIRC, argument was that if attacker has
physical access to machine, all bets are off anyway.

> 
> You run a shell on the console.  You now have control of the system,
> and can mount the real rootfs, inspect it, and work out what it does,
> etc.
> 
> At this point, what use was all the validation that the secure boot
> has done?  Absolutely useless.
> 
> If you can change the command line arguments given to the kernel, you
> have no security, no matter how much you verify signatures.  It's
> the illusion of security, nothing more, nothing less.
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Mark Rutland
On Thu, Jul 14, 2016 at 02:38:06AM +0900, AKASHI Takahiro wrote:
> Apologies for the slow response. I'm attending LinuxCon this week.
> 
> On Wed, Jul 13, 2016 at 10:34:47AM +0100, Mark Rutland wrote:
> > On Wed, Jul 13, 2016 at 10:36:14AM +0800, Dave Young wrote:
> > > But consider we can kexec to a different kernel and a different initrd so 
> > > there
> > > will be use cases to pass a total different dtb as well.
> > 
> > It depends on what you mean by "a different kernel", and what this
> > implies for the DTB.
> > 
> > I expect future arm64 Linux kernels to function with today's DTBs, and
> > the existing boot protocol. The kexec_file_load syscall already has
> > enough information for the kernel to inject the initrd and bootargs
> > properties into a DTB.
> > 
> > In practice on x86 today, kexec_file_load only supports booting to a
> > Linux kernel, because the in-kernel purgatory only implements the x86
> > Linux boot protocol. Analagously, for arm64 I think that the first
> > kernel should use its internal copy of the boot DTB, with /chosen fixed
> > up appropriately, assuming the next kernel is an arm64 Linux image.
> > 
> > If booting another OS, the only parts of the DTB I would expect to
> > change are the properties under chosen, as everything else *should* be
> > OS-independent. However the other OS may have a completely different
> > boot protocol, might not even take a DTB, and will likely need a
> > compeltely different purgatory implementation. So just allowing the DTB
> > to be altered isn't sufficient for that case.
> > 
> > There might be cases where we want a different DTB, but as far as I can
> > tell we have nothing analagous on x86 today. If we do need this, we
> > should have an idea of what real case(s) were trying to solve.
> 
> What I had in my mind was:
> 
> - Kdump
>   As Russel said, we definitely need to modify dtb.

I agree that *something* needs to modify the DTB to pass the cmdline and
initrd properties.

What I'm trying to point out that it isn't necessary that *userspace*
does so for the vast majority of kexec_file_load cases.

If userspace where to have to modify things dynamically, then you can't
have a secure deployment. Either you don't verify signatures on things
modified by userspace, giving a backdoor, or each machine has to have a
local copy of (locally) trusted private keys, which comes with other
risks (e.g. offline extraction of the keys).

>   In addition to bootargs and initrd proerties (FYI, in my arm64
>   implementation for arm64, eflcorehdr info is also passed as DT
>   property),

As pointed out, for kexec_file_load we can add code to the kernel can
add bootargs and initrd properties as necessary for this case. The
existing kexec_file_load prototype allows userspace to pass the required
information.

>   we may want to remove unnecessary devices and even add a dedicated
>   storage device for storing a core dump image.

I suspect that bringing up a minimal number of devices is better
controlled by a cmdline option. In general, figuring out what is
necessary and what is not is going to be board specific, so hacking the
FW tables (DTB or ACPI) is not a very portable/reliable approach.

Do we actually add devices in practice? More so than the above that
requires special knowledge of the platform (including things that were
not described in the boot DTB).

In the ACPI case modifying a DTB alone is not sufficient to change the
information regarding devices, as those won't be described in the DTB.
It's not possible to convert ACPI to DTB in general.

> - Say, booting BE kernel on ACPI LE kernel
>   In this case, there is no useful dtb in the kernel.

If the platform only has ACPI, then you cannot boot a BE kernel to begin
with. As above one cannot convert ACPI to DTB, so one would need
extensive platform knowledge for this to work.

I think it's fair to say that this is not a realistic/common case.

> Have said that, as Mark said, we may be able to use normal kexec_load
> system call if we don't need a "secure" kexec.
> 
> BTW, why doesn't the current kexec_load have ability of verifying
> a signature of initramfs image?

I believe the code was written before secure boot was a concern, and in
the absence of secure boot it was expected that a trusted userspace
would verify signatures itself.

> Is IMA/EVM expected to be used at runtime?

Sorry, I'm not sure what those abbreviations mean. Could you expand
them?

Thanks,
Mark.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Russell King - ARM Linux
On Wed, Jul 13, 2016 at 09:03:38AM -0400, Vivek Goyal wrote:
> On Wed, Jul 13, 2016 at 09:26:39AM +0100, Russell King - ARM Linux wrote:
> > Indeed - maybe Eric knows better, but I can't see any situation where
> > the dtb we load via kexec should ever affect "the bootloader", unless
> > the "kernel" that's being loaded into kexec is "the bootloader".
> > 
> > Now, going back to the more fundamental issue raised in my first reply,
> > about the kernel command line.
> > 
> > On x86, I can see that it _is_ possible for userspace to specify a
> > command line, and the kernel loading the image provides the command
> > line to the to-be-kexeced kernel with very little checking.  So, if
> > your kernel is signed, what stops the "insecure userspace" loading
> > a signed kernel but giving it an insecure rootfs and/or console?
> 
> It is not kexec specific. I could do this for regular boot too, right?
> 
> Command line options are not signed. I thought idea behind secureboot
> was to execute only trusted code and command line options don't enforce
> you to execute unsigned code.
> 
> So it sounds like different class of security problems which you are
> referring to and not necessarily covered by secureboot or signed
> kernel.

Let me give you an example.

You have a secure boot setup, where the firmware/ROM validates the boot
loader.  Good, the boot loader hasn't been tampered with.

You interrupt the boot loader and are able to modify the command line
for the booted kernel.

The boot loader loads the kernel and verifies the kernel's signature.
Good, the kernel hasn't been tampered with.  The kernel starts running.

You've plugged in a USB drive to the device, and specified a partition
containing a root filesystem that you control to the kernel.  The
validated kernel finds the USB drive, and mounts it, and executes
your own binaries on the USB drive.

You run a shell on the console.  You now have control of the system,
and can mount the real rootfs, inspect it, and work out what it does,
etc.

At this point, what use was all the validation that the secure boot
has done?  Absolutely useless.

If you can change the command line arguments given to the kernel, you
have no security, no matter how much you verify signatures.  It's
the illusion of security, nothing more, nothing less.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread AKASHI Takahiro
Apologies for the slow response. I'm attending LinuxCon this week.

On Wed, Jul 13, 2016 at 10:34:47AM +0100, Mark Rutland wrote:
> On Wed, Jul 13, 2016 at 10:36:14AM +0800, Dave Young wrote:
> > But consider we can kexec to a different kernel and a different initrd so 
> > there
> > will be use cases to pass a total different dtb as well.
> 
> It depends on what you mean by "a different kernel", and what this
> implies for the DTB.
> 
> I expect future arm64 Linux kernels to function with today's DTBs, and
> the existing boot protocol. The kexec_file_load syscall already has
> enough information for the kernel to inject the initrd and bootargs
> properties into a DTB.
> 
> In practice on x86 today, kexec_file_load only supports booting to a
> Linux kernel, because the in-kernel purgatory only implements the x86
> Linux boot protocol. Analagously, for arm64 I think that the first
> kernel should use its internal copy of the boot DTB, with /chosen fixed
> up appropriately, assuming the next kernel is an arm64 Linux image.
> 
> If booting another OS, the only parts of the DTB I would expect to
> change are the properties under chosen, as everything else *should* be
> OS-independent. However the other OS may have a completely different
> boot protocol, might not even take a DTB, and will likely need a
> compeltely different purgatory implementation. So just allowing the DTB
> to be altered isn't sufficient for that case.
> 
> There might be cases where we want a different DTB, but as far as I can
> tell we have nothing analagous on x86 today. If we do need this, we
> should have an idea of what real case(s) were trying to solve.

What I had in my mind was:

- Kdump
  As Russel said, we definitely need to modify dtb.
  In addition to bootargs and initrd proerties (FYI, in my arm64
  implementation for arm64, eflcorehdr info is also passed as DT
  property), we may want to remove unnecessary devices and
  even add a dedicated storage device for storing a core dump image.
- Say, booting BE kernel on ACPI LE kernel
  In this case, there is no useful dtb in the kernel.

Have said that, as Mark said, we may be able to use normal kexec_load
system call if we don't need a "secure" kexec.

BTW, why doesn't the current kexec_load have ability of verifying
a signature of initramfs image? Is IMA/EVM expected to be used
at runtime?

Thanks,
-Takahiro AKASHI
> Thanks,
> Mark.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Benefit!

2016-07-13 Thread Sgt Debra Moore
Mutual benefit for the both of us from Camp Stanley, get back to me for more 
info.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v22 3/8] arm64: kdump: implement machine_crash_shutdown()

2016-07-13 Thread AKASHI Takahiro
On Wed, Jul 13, 2016 at 10:32:39AM +0100, Suzuki K Poulose wrote:
> On 12/07/16 06:05, AKASHI Takahiro wrote:
> >Primary kernel calls machine_crash_shutdown() to shut down non-boot cpus
> >and save registers' status in per-cpu ELF notes before starting crash
> >dump kernel. See kernel_kexec().
> >Even if not all secondary cpus have shut down, we do kdump anyway.
> >
> >As we don't have to make non-boot(crashed) cpus offline (to preserve
> >correct status of cpus at crash dump) before shutting down, this patch
> >also adds a variant of smp_send_stop().
> >
> >Signed-off-by: AKASHI Takahiro 
> >---
> > arch/arm64/include/asm/hardirq.h  |  2 +-
> > arch/arm64/include/asm/kexec.h| 41 -
> > arch/arm64/include/asm/smp.h  |  2 ++
> > arch/arm64/kernel/machine_kexec.c | 56 --
> > arch/arm64/kernel/smp.c   | 64 
> > +++
> > 5 files changed, 160 insertions(+), 5 deletions(-)
> >
> >diff --git a/arch/arm64/include/asm/hardirq.h 
> >b/arch/arm64/include/asm/hardirq.h
> >index 8740297..1473fc2 100644
> >--- a/arch/arm64/include/asm/hardirq.h
> >+++ b/arch/arm64/include/asm/hardirq.h
> >@@ -20,7 +20,7 @@
> > #include 
> > #include 
> >
> >-#define NR_IPI  6
> >+#define NR_IPI  7
> >
> > typedef struct {
> > unsigned int __softirq_pending;
> >diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> >index 04744dc..a908958 100644
> >--- a/arch/arm64/include/asm/kexec.h
> >+++ b/arch/arm64/include/asm/kexec.h
> >@@ -40,7 +40,46 @@
> > static inline void crash_setup_regs(struct pt_regs *newregs,
> > struct pt_regs *oldregs)
> > {
> >-/* Empty routine needed to avoid build errors. */
> >+if (oldregs) {
> >+memcpy(newregs, oldregs, sizeof(*newregs));
> >+} else {
> >+u64 tmp1, tmp2;
> >+
> >+__asm__ __volatile__ (
> >+"stp x0,   x1, [%2, #16 *  0]\n"
> >+"stp x2,   x3, [%2, #16 *  1]\n"
> >+"stp x4,   x5, [%2, #16 *  2]\n"
> >+"stp x6,   x7, [%2, #16 *  3]\n"
> >+"stp x8,   x9, [%2, #16 *  4]\n"
> >+"stpx10,  x11, [%2, #16 *  5]\n"
> >+"stpx12,  x13, [%2, #16 *  6]\n"
> >+"stpx14,  x15, [%2, #16 *  7]\n"
> >+"stpx16,  x17, [%2, #16 *  8]\n"
> >+"stpx18,  x19, [%2, #16 *  9]\n"
> >+"stpx20,  x21, [%2, #16 * 10]\n"
> >+"stpx22,  x23, [%2, #16 * 11]\n"
> >+"stpx24,  x25, [%2, #16 * 12]\n"
> >+"stpx26,  x27, [%2, #16 * 13]\n"
> >+"stpx28,  x29, [%2, #16 * 14]\n"
> >+"mov %0,  sp\n"
> >+"stpx30,  %0,  [%2, #16 * 15]\n"
> >+
> >+"/* faked current PSTATE */\n"
> >+"mrs %0, CurrentEL\n"
> >+"mrs %1, DAIF\n"
> >+"orr %0, %0, %1\n"
> >+"mrs %1, NZCV\n"
> >+"orr %0, %0, %1\n"
> >+
> >+/* pc */
> >+"adr %1, 1f\n"
> >+"1:\n"
> >+"stp %1, %0,   [%2, #16 * 16]\n"
> >+: "=r" (tmp1), "=r" (tmp2), "+r" (newregs)
> >+:
> >+: "memory"
> >+);
> >+}
> > }
> >
> > #endif /* __ASSEMBLY__ */
> >diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
> >index 0226447..6b0f2c7 100644
> >--- a/arch/arm64/include/asm/smp.h
> >+++ b/arch/arm64/include/asm/smp.h
> >@@ -136,6 +136,8 @@ static inline void cpu_panic_kernel(void)
> >  */
> > bool cpus_are_stuck_in_kernel(void);
> >
> >+extern void smp_send_crash_stop(void);
> >+
> > #endif /* ifndef __ASSEMBLY__ */
> >
> > #endif /* ifndef __ASM_SMP_H */
> >diff --git a/arch/arm64/kernel/machine_kexec.c 
> >b/arch/arm64/kernel/machine_kexec.c
> >index bc96c8a..8ac9dba8 100644
> >--- a/arch/arm64/kernel/machine_kexec.c
> >+++ b/arch/arm64/kernel/machine_kexec.c
> >@@ -9,6 +9,9 @@
> >  * published by the Free Software Foundation.
> >  */
> >
> >+#include 
> >+#include 
> >+#include 
> > #include 
> > #include 
> >
> >@@ -22,6 +25,7 @@
> > extern const unsigned char arm64_relocate_new_kernel[];
> > extern const unsigned long arm64_relocate_new_kernel_size;
> >
> >+bool in_crash_kexec;
> > static unsigned long kimage_start;
> >
> > /**
> >@@ -148,7 +152,8 @@ void machine_kexec(struct kimage *kimage)
> > /*
> >  * New cpus may have become stuck_in_kernel after we loaded the image.
> >  */
> >-BUG_ON(cpus_are_stuck_in_kernel() || (num_online_cpus() > 1));
> >+BUG_ON((cpus_are_stuck_in_kernel() || (num_online_cpus() > 1)) &&
> >+

Re: [PATCH v22 1/8] arm64: kdump: reserve memory for crash dump kernel

2016-07-13 Thread AKASHI Takahiro
Hi,

On Wed, Jul 13, 2016 at 10:12:12AM +0100, Suzuki K Poulose wrote:
> On 12/07/16 06:05, AKASHI Takahiro wrote:
> >On the startup of primary kernel, the memory region used by crash dump
> >kernel must be specified by "crashkernel=" kernel parameter.
> >reserve_crashkernel() will allocate and reserve the region for later use.
> >
> >User space tools, like kexec-tools, will be able to find that region as
> > - "Crash kernel" in /proc/iomem, or
> > - "linux,crashkernel-base" and "linux,crashkernel-size" under
> >   /sys/firmware/devicetree/base/chosen
> >
> >Signed-off-by: AKASHI Takahiro 
> >Signed-off-by: Mark Salter 
> >Signed-off-by: Pratyush Anand 
> >Reviewed-by: James Morse 
> 
> Minor nits below.
> 
> >---
> > arch/arm64/kernel/setup.c |   7 ++-
> > arch/arm64/mm/init.c  | 115 
> > ++
> > 2 files changed, 121 insertions(+), 1 deletion(-)
> >
> >diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> >index c1509e6..cb5eee0 100644
> >--- a/arch/arm64/kernel/setup.c
> >+++ b/arch/arm64/kernel/setup.c
> >@@ -31,7 +31,6 @@
> > #include 
> > #include 
> > #include 
> >-#include 
> > #include 
> > #include 
> > #include 
> >@@ -222,6 +221,12 @@ static void __init request_standard_resources(void)
> > kernel_data.end <= res->end)
> > request_resource(res, _data);
> > }
> >+
> >+#ifdef CONFIG_KEXEC_CORE
> >+/* User space tools will find "Crash kernel" region in /proc/iomem. */
> >+if (crashk_res.end)
> >+insert_resource(_resource, _res);
> >+#endif
> 
> nit: Could we not do this from reserve_crashkernel() ?

Reserve_crashkernel() should be called before arm64_memblock_init().
If the code above is put in reserve_crashkernel(), the inserted
resource will be gone later by calling request_resource() in
request_standard_resource() because the crashkernel memory is part of
System RAM.

> >
> > #include 
> > #include 
> >@@ -76,6 +78,117 @@ static int __init early_initrd(char *p)
> > early_param("initrd", early_initrd);
> > #endif
> >
> >+#ifdef CONFIG_KEXEC_CORE
> >+static unsigned long long crash_size, crash_base;
> >+static struct property crash_base_prop = {
> >+.name = "linux,crashkernel-base",
> >+.length = sizeof(u64),
> >+.value = _base
> >+};
> >+static struct property crash_size_prop = {
> >+.name = "linux,crashkernel-size",
> >+.length = sizeof(u64),
> >+.value = _size,
> >+};
> >+
> 
> >+
> >+/*
> >+ * reserve_crashkernel() - reserves memory for crash kernel
> >+ *
> >+ * This function reserves memory area given in "crashkernel=" kernel command
> >+ * line parameter. The memory reserved is used by dump capture kernel when
> >+ * primary kernel is crashing.
> >+ */
> >+static void __init reserve_crashkernel(void)
> >+{
> >+int ret;
> >+
> >+ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >+_size, _base);
> >+/* no crashkernel= or invalid value specified */
> >+if (ret || !crash_size)
> >+return;
> >+
> >+if (crash_base == 0) {
> 
> ...
> 
> >+memblock_reserve(crash_base, crash_size);
> >+
> >+} else {
> 
> ...
> 
> >+
> >+memblock_reserve(crash_base, crash_size);
> 
> >+}
> 
> Nit: you could move the memblock_reserve() here to a single place.

Sure.

Thanks,
-Takahiro AKASHI

> >+
> >+pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
> >+crash_size >> 20, crash_base >> 20);
> >+
> >+crashk_res.start = crash_base;
> >+crashk_res.end = crash_base + crash_size - 1;
> 
> As mentioned above, may be we could move the insert_resource() here, which
> would keep the generic setup.c code cleaner and is a bit more reader friendly.
> 
> >+}
> >+#else
> 
> Suzuki
> 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v22 8/8] Documentation: dt: chosen properties for arm64 kdump

2016-07-13 Thread AKASHI Takahiro
Mark,

On Tue, Jul 12, 2016 at 11:07:45AM +0100, Mark Rutland wrote:
> Hi,
> 
> Apologies for the delay on this.
> 
> On Tue, Jul 12, 2016 at 02:05:14PM +0900, AKASHI Takahiro wrote:
> > From: James Morse 
> > 
> > Add documentation for
> > linux,crashkernel-base and crashkernel-size,
> > linux,usable-memory-range, and
> > linux,elfcorehdr
> > used by arm64 kexec/kdump to decribe the kdump reserved area, and
> > the elfcorehdr's location within it.
> > 
> > Signed-off-by: James Morse 
> > [takahiro.aka...@linaro.org:
> > renamed "usable-memory" to "usable-memory-range",
> > added "linux,crashkernel-base" and "-size" ]
> > Signed-off-by: AKASHI Takahiro 
> > ---
> >  Documentation/devicetree/bindings/chosen.txt | 45 
> > 
> >  1 file changed, 45 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/chosen.txt 
> > b/Documentation/devicetree/bindings/chosen.txt
> > index 6ae9d82..d7a3a86 100644
> > --- a/Documentation/devicetree/bindings/chosen.txt
> > +++ b/Documentation/devicetree/bindings/chosen.txt
> > @@ -52,3 +52,48 @@ This property is set (currently only on PowerPC, and 
> > only needed on
> >  book3e) by some versions of kexec-tools to tell the new kernel that it
> >  is being booted by kexec, as the booting environment may differ (e.g.
> >  a different secondary CPU release mechanism)
> > +
> > +linux,crashkernel-base
> > +linux,crashkernel-size
> > +--
> > +These properties are set (on PowerPC and arm64) during kdump to tell
> > +use-space tools, like kexec-tools, the base address of the crash-dump
> > +kernel's reserved area of memory and the size. e.g.
> 
> No need to mention what consumes this. Just state what it describes,
> e.g.
> 
>   These properties describe the base and size of the crash-dump
>   kernel's reserved area of memory. Valid for PowerPC and arm64.

Sure.

> > +
> > +/ {
> > +   chosen {
> > +   linux,crashkernel-base = <0x9 0xf000>;
> > +   linux,crashkernel-size = <0x0 0x1000>;
> > +   };
> > +};
> > +
> > +linux,usable-memory-range
> > +-
> > +
> > +This property is set (currently only on arm64) during kdump to tell
> > +the crash-dump kernel the base address of its reserved area of memory,
> > +and the size. e.g.
> 
> The description sounds like this duplicates linux,crashkernel-*. What is
> the difference between the two?

Simply saying, crashkernel-* are for the first(normal) kernel,
while usable-memory-range is for the second(crash-dump) kernel.

The former appears only under /sys/firmware/devicetree/base/
on kdump-enabled kernel  whenever "crashkernel=" command line parameter
is specified. So user space tools, like kexec-tools, will be able to know
what memory region is reserved for kdump.

The latter is added in device-tree blob which is passed to
crash-dump kernel so the kernel recognize what memory region is usable
for system ram. We will see it in /sys/firmware/devicetree/base
as well as /sys/firmware/fdt on crash dump kernel. 

So both can potentially appear at the same time on crash dump kernel, but
they will have different values in that case.
(So we need different names.)

> On powerpc it looks like there's a linux,usable-memory property (without
> the -range suffix). How do these differ?

Please also see Thiago's comment.
Basically "linux,usable-memory" property imposes further restriction
on memory node's "reg" property.

Currently kdump only supports a single memory region as usable memory
of crash dump kernel, and so adding "linux,usable-memory" to every
memory node in dtb is sort of "doing too much" IMHO.
In addition, there are no memory nodes on UEFI(w/ACPI) systems.

> > +
> > +/ {
> > +   chosen {
> > +   linux,usable-memory-range = <0x9 0xf000 0x0 0x1000>;
> > +   };
> > +};
> > +
> > +Please note that, if this property is present, any memory regions under
> > +"memory" nodes will be ignored.
> 
> What exactly do you mean by "ignored"?
> 
> Do we truly ignore this property, or do we map that memory at some
> point, even if not used for general allocation?

Totally ignored. All the regions are excluded from memblock.
See my patch #1.

> > +
> > +linux,elfcorehdr
> > +
> > +
> > +This property is set (currently only on arm64) during kdump to tell
> > +the crash-dump kernel the address and size of the elfcorehdr that describes
> > +the old kernel's memory as an elf file. This memory must reside within
> > +the area described by 'linux,usable-memory-range'. e.g.
> 
> 
> As with the linux,crashkernel-* properties, just state what this
> describes, e.g.
> 
>   This property describes the base and size of the ELF core
>   header, which describes the old kernel's memory as an ELF file.
>   This memory must reside within the range described by
>   linux,usable-memory-range.
> 
> That said, this falling within 

[V2][PATCH 1/2] PKCS#7: Fix kernel panic when referring to the empty AuthorityKeyIdentifier

2016-07-13 Thread Lans Zhang
This fix resolves the following kernel panic if the empty 
AuthorityKeyIdentifier employed.

[  459.041989] PKEY: <==public_key_verify_signature() = 0
[  459.041993] PKCS7: Verified signature 1
[  459.041995] PKCS7: ==> pkcs7_verify_sig_chain()
[  459.041999] PKCS7: verify Sample DB Certificate for SCP: 01
[  459.042002] PKCS7: - issuer Sample KEK Certificate for SCP
[  459.042014] BUG: unable to handle kernel NULL pointer dereference at 
  (null)
[  459.042135] IP: [] pkcs7_verify+0x72c/0x7f0
[  459.042217] PGD 739e6067 PUD 77719067 PMD 0
[  459.042286] Oops:  [#1] PREEMPT SMP
[  459.042328] Modules linked in:
[  459.042368] CPU: 0 PID: 474 Comm: kexec Not tainted 
4.7.0-rc7-WR8.0.0.0_standard+ #18
[  459.042462] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./Aptio CRB, BIOS 5.6.5 10/09/2014
[  459.042586] task: 880073a5 ti: 8800738e8000 task.ti: 
8800738e8000
[  459.042675] RIP: 0010:[]  [] 
pkcs7_verify+0x72c/0x7f0
[  459.042784] RSP: 0018:8800738ebd58  EFLAGS: 00010246
[  459.042845] RAX:  RBX: 880076b7da80 RCX: 0006
[  459.042929] RDX: 0001 RSI: 81c85001 RDI: 81ca00a9
[  459.043014] RBP: 8800738ebd98 R08: 0400 R09: 8800788a304c
[  459.043098] R10:  R11: 60ca R12: 8800769a2bc0
[  459.043182] R13: 880077358300 R14:  R15: 8800769a2dc0
[  459.043268] FS:  7f24cc741700() GS:880074e0() 
knlGS:
[  459.043365] CS:  0010 DS:  ES:  CR0: 80050033
[  459.043431] CR2:  CR3: 73a36000 CR4: 001006f0
[  459.043514] Stack:
[  459.043530]   ffbf0020 31ff813e68b0 
0002
[  459.043644]  8800769a2bc0  007197b8 
0002
[  459.043756]  8800738ebdd8 81153fb1  

[  459.043869] Call Trace:
[  459.043898]  [] verify_pkcs7_signature+0x61/0x140
[  459.043974]  [] verify_pefile_signature+0x2cb/0x830
[  459.044052]  [] ? verify_pefile_signature+0x830/0x830
[  459.044134]  [] bzImage64_verify_sig+0x15/0x20
[  459.046332]  [] arch_kexec_kernel_verify_sig+0x29/0x40
[  459.048552]  [] SyS_kexec_file_load+0x1f4/0x6c0
[  459.050768]  [] ? __do_page_fault+0x1b6/0x550
[  459.052996]  [] entry_SYSCALL_64_fastpath+0x17/0x93
[  459.055242] Code: e8 0a d6 ff ff 85 c0 0f 88 7a fb ff ff 4d 39 fd 4d 89 7d 
08 74 45 4d 89 fd e9 14 fe ff ff 4d 8b 76 08 31 c0 48 c7 c7 a9 00 ca 81 <41> 0f 
b7 36 49 8d 56 02 e8 d0 91 d6 ff 4d 8b 3c 24 4d 85 ff 0f
[  459.060535] RIP  [] pkcs7_verify+0x72c/0x7f0
[  459.063040]  RSP 
[  459.065456] CR2: 
[  459.075998] ---[ end trace c15f0e897cda28dc ]---

Signed-off-by: Lans Zhang 
Signed-off-by: David Howells 
Cc: Dave Young 
Cc: Baoquan He 
Cc: Vivek Goyal 
---
 crypto/asymmetric_keys/pkcs7_verify.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/asymmetric_keys/pkcs7_verify.c 
b/crypto/asymmetric_keys/pkcs7_verify.c
index 44b746e..2ffd697 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -227,7 +227,7 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message 
*pkcs7,
if (asymmetric_key_id_same(p->id, auth))
goto found_issuer_check_skid;
}
-   } else {
+   } else if (sig->auth_ids[1]) {
auth = sig->auth_ids[1];
pr_debug("- want %*phN\n", auth->len, auth->data);
for (p = pkcs7->certs; p; p = p->next) {
-- 
1.9.1


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 2/2] pefile: Fix the failure of calculation for digest

2016-07-13 Thread Lans Zhang

Thanks for your review. I will send V2 soon.

Jia

On 07/13/2016 09:06 PM, David Howells wrote:

Lans Zhang  wrote:


The commit e68503bd forgot to set digest_len and thus cause the following
error reported by kexec when launching a crash kernel:
"kexec_file_load failed: Bad message"


You need to put the commit ID in a "Fixes:" line as per SubmittingPatches.

David




___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[V2][PATCH 2/2] pefile: Fix the failure of calculation for digest

2016-07-13 Thread Lans Zhang
The commit e68503bd68 forgot to set digest_len and thus cause the following
error reported by kexec when launching a crash kernel:
"kexec_file_load failed: Bad message"

Fixes: e68503bd68 (KEYS: Generalise system_verify_data() to provide access to 
internal content)
Signed-off-by: Lans Zhang 
Cc: David Howells 
Cc: Dave Young 
Cc: Baoquan He 
Cc: Vivek Goyal 
---
 crypto/asymmetric_keys/mscode_parser.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crypto/asymmetric_keys/mscode_parser.c 
b/crypto/asymmetric_keys/mscode_parser.c
index 6a76d5c..9492e1c 100644
--- a/crypto/asymmetric_keys/mscode_parser.c
+++ b/crypto/asymmetric_keys/mscode_parser.c
@@ -124,5 +124,10 @@ int mscode_note_digest(void *context, size_t hdrlen,
struct pefile_context *ctx = context;
 
ctx->digest = kmemdup(value, vlen, GFP_KERNEL);
-   return ctx->digest ? 0 : -ENOMEM;
+   if (!ctx->digest)
+   return -ENOMEM;
+
+   ctx->digest_len = vlen;
+
+   return 0;
 }
-- 
1.9.1


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Vivek Goyal
On Wed, Jul 13, 2016 at 09:45:22AM +1000, Stewart Smith wrote:
> Vivek Goyal  writes:
> > On Tue, Jul 12, 2016 at 10:58:09AM -0300, Thiago Jung Bauermann wrote:
> >> Hello Eric,
> >> 
> >> Am Dienstag, 12 Juli 2016, 08:25:48 schrieb Eric W. Biederman:
> >> > AKASHI Takahiro  writes:
> >> > > Device tree blob must be passed to a second kernel on DTB-capable
> >> > > archs, like powerpc and arm64, but the current kernel interface
> >> > > lacks this support.
> >> > > 
> >> > > This patch extends kexec_file_load system call by adding an extra
> >> > > argument to this syscall so that an arbitrary number of file 
> >> > > descriptors
> >> > > can be handed out from user space to the kernel.
> >> > > 
> >> > > See the background [1].
> >> > > 
> >> > > Please note that the new interface looks quite similar to the current
> >> > > system call, but that it won't always mean that it provides the "binary
> >> > > compatibility."
> >> > > 
> >> > > [1] http://lists.infradead.org/pipermail/kexec/2016-June/016276.html
> >> > 
> >> > So this design is wrong.  The kernel already has the device tree blob,
> >> > you should not be extracting it from the kernel munging it, and then
> >> > reinserting it in the kernel if you want signatures and everything to
> >> > pass.
> >> > 
> >> > What x86 does is pass it's equivalent of the device tree blob from one
> >> > kernel to another directly and behind the scenes.  It does not go
> >> > through userspace for this.
> >> > 
> >> > Until a persuasive case can be made for going around the kernel and
> >> > probably adding a feature (like code execution) that can be used to
> >> > defeat the signature scheme I am going to nack this.
> >> 
> >> There are situations where userspace needs to change things in the device 
> >> tree to be used by the next kernel.
> >> 
> >> For example, Petitboot (the boot loader used in OpenPOWER machines) is a 
> >> userspace application running in an intermediary Linux instance and uses 
> >> kexec to load the target OS. It has to modify the device tree that will be 
> >> used by the next kernel so that the next kernel uses the same console that 
> >> petitboot was configured to use (i.e., set the /chosen/linux,stdout-path 
> >> property). It also modifies the device tree to allow the kernel to inherit 
> >> Petitboot's Openfirmware framebuffer.
> >
> > Can some of this be done with the help of kernel command line options for
> > second kernel?
> 
> how would this be any more secure?
> 
> Passing in an address for a framebuffer via command line option means
> you could scribble over any bit of memory, which is the same kind of
> damage you could do by modifying the device tree.

It is not necessarily safer but works with given framework and we don't
have to modify existing system call.

Also it will allow you to pass in only one thing at a time instead of
allowing passing in new unsigned DTB, which can potentially do lot more.

Vivek

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Vivek Goyal
On Wed, Jul 13, 2016 at 09:41:39AM +1000, Stewart Smith wrote:
> Petr Tesarik  writes:
> > On Tue, 12 Jul 2016 13:25:11 -0300
> > Thiago Jung Bauermann  wrote:
> >
> >> Hi Eric,
> >> 
> >> I'm trying to understand your concerns leading to your nack. I hope you 
> >> don't mind expanding your thoughts on them a bit.
> >> 
> >> Am Dienstag, 12 Juli 2016, 08:25:48 schrieb Eric W. Biederman:
> >> > AKASHI Takahiro  writes:
> >> > > Device tree blob must be passed to a second kernel on DTB-capable
> >> > > archs, like powerpc and arm64, but the current kernel interface
> >> > > lacks this support.
> >> > > 
> >> > > This patch extends kexec_file_load system call by adding an extra
> >> > > argument to this syscall so that an arbitrary number of file 
> >> > > descriptors
> >> > > can be handed out from user space to the kernel.
> >> > > 
> >> > > See the background [1].
> >> > > 
> >> > > Please note that the new interface looks quite similar to the current
> >> > > system call, but that it won't always mean that it provides the "binary
> >> > > compatibility."
> >> > > 
> >> > > [1] http://lists.infradead.org/pipermail/kexec/2016-June/016276.html
> >> > 
> >> > So this design is wrong.  The kernel already has the device tree blob,
> >> > you should not be extracting it from the kernel munging it, and then
> >> > reinserting it in the kernel if you want signatures and everything to
> >> > pass.
> >> 
> >> I don't understand how the kernel signature will be invalidated. 
> >> 
> >> There are some types of boot images that can embed a device tree blob in 
> >> them, but the kernel can also be handed a separate device tree blob from 
> >> firmware, the boot loader, or kexec. This latter case is what we are 
> >> discussing, so we are not talking about modifying an embedded blob in the 
> >> kernel image.
> >> 
> >> > What x86 does is pass it's equivalent of the device tree blob from one
> >> > kernel to another directly and behind the scenes.  It does not go
> >> > through userspace for this.
> >> > 
> >> > Until a persuasive case can be made for going around the kernel and
> >> > probably adding a feature (like code execution) that can be used to
> >> > defeat the signature scheme I am going to nack this.
> >> 
> >> I also don't understand what you mean by code execution. How does passing 
> >> a 
> >> device tree blob via kexec enables code execution? How can the signature 
> >> scheme be defeated?
> >
> > I'm not an expert on DTB, so I can't provide an example of code
> > execution, but you have already mentioned the /chosen/linux,stdout-path
> > property. If an attacker redirects the bootloader to an insecure
> > console, they may get access to the system that would otherwise be
> > impossible.
> 
> In this case, the user is sitting at the (or one of the) console(s) of
> the machine. There could be petitboot UIs running on the VGA display,
> IPMI serial over lan, local serial port. The logic behind setting
> /chosen/linux,stdout-path is (currently) mostly to set it for the kernel
> to what the user is interacting with. i.e. if you select an OS installer
> to boot from the VGA console, you get a graphical installer running and
> if you selected it from a text console, you get a text installer running
> (on the appropriate console).
> 
> So the bootloader (petitboot) needs to work out which console is being
> interacted with in order to set up /chosen/linux,stdout-path correctly.
> 
> This specific option could be passed as a kernel command line to the
> next kernel, yes. However, isn't the kernel command line also an attack
> vector? Is *every* command line option safe?

I don't think kernel command line is signed. And we will have to define
what is considered *unsafe*. I am working on the assumption that a
user should not be able to force execution of unsigned code at provileged
level. And passing console on kernel command line should be safe in
that respect?

Vivek

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Arnd Bergmann
On Wednesday, July 13, 2016 10:41:28 AM CEST Mark Rutland wrote:
> On Wed, Jul 13, 2016 at 10:01:33AM +0200, Arnd Bergmann wrote:
> > On Wednesday, July 13, 2016 10:36:14 AM CEST Dave Young wrote:
> > > On 07/12/16 at 03:50pm, Mark Rutland wrote:
> > > > On Tue, Jul 12, 2016 at 04:24:10PM +0200, Arnd Bergmann wrote:
> > > > > On Tuesday, July 12, 2016 10:18:11 AM CEST Vivek Goyal wrote:
> > > > 
> > > > /proc/devicetree (aka /sys/firmware/devicetree) is a filesystem derived
> > > > from the raw DTB (which is exposed at /sys/firmware/fdt).
> > > > 
> > > > The blob that was handed to the kernel at boot time is exposed at
> > > > /sys/firmware/fdt.
> > > 
> > > I believe the blob can be read and passed to kexec kernel in kernel code 
> > > without
> > > the extra fd.
> > > 
> > > But consider we can kexec to a different kernel and a different initrd so 
> > > there
> > > will be use cases to pass a total different dtb as well. From my 
> > > understanding
> > > it is reasonable but yes I think we should think carefully about the 
> > > design.
> > 
> > Ok, I can see four interesting use cases here:
> > 
> > - Using the dtb that the kernel has saved at boot time. Ideally this should 
> > not
> >   require an additional step of signing it, since the running kernel already
> >   trusts it.
> 
> We have sufficient information from the existing kexec_file_load syscall
> prototype to do this in-kernel.

Ok.

> > - A dtb blob from the file system that was produced along with the kernel 
> > image.
> >   If we require a signature on the kernel, the the same requirement should 
> > be
> >   made on the dtb. Whoever signs the kernel can also sign the dtb.
> >   The tricky part here is the kernel command line that is part of the dtb
> >   and that may need to be modified.
> 
> I suspect that for this case, following the example of the existing
> sycall, we'd allow the kernel to modify bootargs and initrd properties
> after verfiying the signature of the DTB.

Makes sense.
 
> The big question is whether this is a realistic case on a secure boot
> system.

What does x86 do here? I assume changes to the command line are also
limited.

> > - Modifying the dtb at for any of the reasons I listed: This should always
> >   be possible when we do not use secure boot, just like booting an unsigned
> >   kernel is.
> 
> This is possible with the existing kexec_load syscall, for the non
> secure boot case.

Ok, let's skip that then.

> > - kboot/petitboot with all of the user space being part of the trusted boot
> >   chain: it would be good to allow these to modify the dtb as needed without
> >   breaking the trust chain, just like we allow grub or u-boot to modify the 
> > dtb
> >   before passing it to the kernel.
> 
> It depends on *what* we need to modify here. We can modify the bootargs
> and initrd properties as part of the kexec_file_load syscall, so what
> else would we want to alter?

I guess petitboot can also just use kexec_load() instead of kexec_file_load(),
as long as the initramfs containing petitboot is trusted by the kernel.

Arnd

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 2/2] pefile: Fix the failure of calculation for digest

2016-07-13 Thread David Howells
Lans Zhang  wrote:

> The commit e68503bd forgot to set digest_len and thus cause the following
> error reported by kexec when launching a crash kernel:
> "kexec_file_load failed: Bad message"

You need to put the commit ID in a "Fixes:" line as per SubmittingPatches.

David

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/2] Add a kexec_crash_loaded() function

2016-07-13 Thread Petr Tesarik
On Wed, 13 Jul 2016 05:52:33 -0700
Josh Triplett  wrote:

> On Wed, Jul 13, 2016 at 02:19:55PM +0200, Petr Tesarik wrote:
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -95,6 +95,12 @@ int kexec_should_crash(struct task_struct *p)
> > return 0;
> >  }
> >  
> > +int kexec_crash_loaded(void)
> > +{
> > +   return !!kexec_crash_image;
> > +}
> 
> Nit: this function should return bool rather than int, since it
> effectively returns true/false.

I thought about this for a moment. Since the return value should also
be used for the corresponding sysctl, which is an integer, I wasn't
sure if bool is the correct type, especially since other boolean
functions in kexec.h also return int... But that could be legacy.

I've got no problem changing the type to 'bool' if that's what it takes.

Petr T

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Vivek Goyal
On Wed, Jul 13, 2016 at 09:26:39AM +0100, Russell King - ARM Linux wrote:
> On Wed, Jul 13, 2016 at 05:55:33PM +1000, Stewart Smith wrote:
> > Russell King - ARM Linux  writes:
> > > On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
> > >> Russell King - ARM Linux  writes:
> > >> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
> > >> >> I'm not an expert on DTB, so I can't provide an example of code
> > >> >> execution, but you have already mentioned the 
> > >> >> /chosen/linux,stdout-path
> > >> >> property. If an attacker redirects the bootloader to an insecure
> > >> >> console, they may get access to the system that would otherwise be
> > >> >> impossible.
> > >> >
> > >> > I fail to see how kexec connects with the boot loader - the DTB image
> > >> > that's being talked about is one which is passed from the currently
> > >> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
> > >> > also ARM64) that's a direct call chain which doesn't involve any
> > >> > boot loader or firmware, and certainly none that would involve the
> > >> > passed DTB image.
> > >> 
> > >> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
> > >> linux kernel and initramfs with a UI (petitboot) - this means we never
> > >> have to write a device driver twice: write a kernel one and you're done
> > >> (for booting from the device and using it in your OS).
> > >
> > > I think you misunderstood my point.
> > >
> > > On ARM, we do not go:
> > >
> > >   kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
> > >
> > > but we go:
> > >
> > >   kernel (kexec'd from) -> kernel (kexec'd to)
> > >
> > > There's no intermediate step involving any bootloader.
> > >
> > > Hence, my point is that the dtb loaded by kexec is _only_ used by the
> > > kernel which is being kexec'd to, not by the bootloader, nor indeed
> > > the kernel which it is loaded into.
> > >
> > > Moreover, if you read the bit that I quoted (which is what I was
> > > replying to), you'll notice that it is talking about the DTB loaded
> > > by kexec somehow causing the _bootloader_ to be redirected to an
> > > alternative console.  This point is wholely false on ARM.
> > 
> > Ahh.. I missed the bootloader bit there.
> > 
> > In which case, we're the same on OpenPOWER, there is no intermediate
> > bootloader - in our case we have linux (with kexec) taking on what uboot
> > or grub is typically used for on other platforms.
> 
> Indeed - maybe Eric knows better, but I can't see any situation where
> the dtb we load via kexec should ever affect "the bootloader", unless
> the "kernel" that's being loaded into kexec is "the bootloader".
> 
> Now, going back to the more fundamental issue raised in my first reply,
> about the kernel command line.
> 
> On x86, I can see that it _is_ possible for userspace to specify a
> command line, and the kernel loading the image provides the command
> line to the to-be-kexeced kernel with very little checking.  So, if
> your kernel is signed, what stops the "insecure userspace" loading
> a signed kernel but giving it an insecure rootfs and/or console?

It is not kexec specific. I could do this for regular boot too, right?

Command line options are not signed. I thought idea behind secureboot
was to execute only trusted code and command line options don't enforce
you to execute unsigned code.

So it sounds like different class of security problems which you are
referring to and not necessarily covered by secureboot or signed
kernel.

Vivek

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [Xen-devel] [PATCH 2/2] Allow kdump with crash_kexec_post_notifiers

2016-07-13 Thread David Vrabel
On 13/07/16 13:20, Petr Tesarik wrote:
> If a crash kernel is loaded, do not crash the running domain. This is
> needed if the kernel is loaded with crash_kexec_post_notifiers, because
> panic notifiers are run before __crash_kexec() in that case, and this
> Xen hook prevents its being called later.

Prioritising the in-kernel kexec image over the hypervisor one seems
sensible behaviour to me.

Reviewed-by: David Vrabel 

David

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/2] Add a kexec_crash_loaded() function

2016-07-13 Thread Josh Triplett
On Wed, Jul 13, 2016 at 02:19:55PM +0200, Petr Tesarik wrote:
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -95,6 +95,12 @@ int kexec_should_crash(struct task_struct *p)
>   return 0;
>  }
>  
> +int kexec_crash_loaded(void)
> +{
> + return !!kexec_crash_image;
> +}

Nit: this function should return bool rather than int, since it
effectively returns true/false.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH 2/2] Allow kdump with crash_kexec_post_notifiers

2016-07-13 Thread Petr Tesarik
If a crash kernel is loaded, do not crash the running domain. This is
needed if the kernel is loaded with crash_kexec_post_notifiers, because
panic notifiers are run before __crash_kexec() in that case, and this
Xen hook prevents its being called later.

Signed-off-by: Petr Tesarik 
---
 arch/x86/xen/enlighten.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 760789a..6e3e7c6 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1325,7 +1325,8 @@ static void xen_crash_shutdown(struct pt_regs *regs)
 static int
 xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr)
 {
-   xen_reboot(SHUTDOWN_crash);
+   if (!kexec_crash_loaded())
+   xen_reboot(SHUTDOWN_crash);
return NOTIFY_DONE;
 }
 


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH 0/2] Allow crash dumps with crash_kexec_post_notifiers

2016-07-13 Thread Petr Tesarik
Hello all,

this patch series makes it possible to save a kernel crash dump when the
kernel command line includes "crash_kexec_post_notifiers". There might
be other approaches, but mine has the advantage that no new sysctl is
required, and the behaviour is the same whether panic notifiers are run
or not: If you load a crash kernel with kexec, it will be used, otherwise
the hypervisor facility is used (using a hypercall).

Feedback welcome!

Petr T

---

Petr Tesarik (2):
  Add a kexec_crash_loaded() function
  Allow kdump with crash_kexec_post_notifiers


 arch/x86/xen/enlighten.c |3 ++-
 include/linux/kexec.h|2 ++
 kernel/kexec_core.c  |5 +
 kernel/ksysfs.c  |2 +-
 4 files changed, 10 insertions(+), 2 deletions(-)

--
Signature

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH 1/2] Add a kexec_crash_loaded() function

2016-07-13 Thread Petr Tesarik
Provide a wrapper function to be used by kernel code to check whether
a crash kernel is loaded. It returns the same value that can be seen
in /sys/kernel/kexec_crash_loaded by userspace programs.

I'm exporting the function, because it will be used by Xen, and it is
possible to compile Xen modules separately to enable the use of PV
drivers with unmodified bare-metal kernels.

Signed-off-by: Petr Tesarik 
---
 include/linux/kexec.h |2 ++
 kernel/kexec_core.c   |6 ++
 kernel/ksysfs.c   |2 +-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index e8acb2b..d461d02 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -228,6 +228,7 @@ extern void *kexec_purgatory_get_symbol_addr(struct kimage 
*image,
 extern void __crash_kexec(struct pt_regs *);
 extern void crash_kexec(struct pt_regs *);
 int kexec_should_crash(struct task_struct *);
+int kexec_crash_loaded(void);
 void crash_save_cpu(struct pt_regs *regs, int cpu);
 void crash_save_vmcoreinfo(void);
 void arch_crash_save_vmcoreinfo(void);
@@ -324,6 +325,7 @@ struct task_struct;
 static inline void __crash_kexec(struct pt_regs *regs) { }
 static inline void crash_kexec(struct pt_regs *regs) { }
 static inline int kexec_should_crash(struct task_struct *p) { return 0; }
+static inline int kexec_crash_loaded(void) { return 0; }
 #define kexec_in_progress false
 #endif /* CONFIG_KEXEC_CORE */
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 56b3ed0..a303dce 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -95,6 +95,12 @@ int kexec_should_crash(struct task_struct *p)
return 0;
 }
 
+int kexec_crash_loaded(void)
+{
+   return !!kexec_crash_image;
+}
+EXPORT_SYMBOL_GPL(kexec_crash_loaded);
+
 /*
  * When kexec transitions to the new kernel there is a one-to-one
  * mapping between physical and virtual addresses.  On processors
diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c
index 152da4a..bf9fc9d 100644
--- a/kernel/ksysfs.c
+++ b/kernel/ksysfs.c
@@ -101,7 +101,7 @@ KERNEL_ATTR_RO(kexec_loaded);
 static ssize_t kexec_crash_loaded_show(struct kobject *kobj,
   struct kobj_attribute *attr, char *buf)
 {
-   return sprintf(buf, "%d\n", !!kexec_crash_image);
+   return sprintf(buf, "%d\n", kexec_crash_loaded());
 }
 KERNEL_ATTR_RO(kexec_crash_loaded);
 


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Mark Rutland
On Wed, Jul 13, 2016 at 10:01:33AM +0200, Arnd Bergmann wrote:
> On Wednesday, July 13, 2016 10:36:14 AM CEST Dave Young wrote:
> > On 07/12/16 at 03:50pm, Mark Rutland wrote:
> > > On Tue, Jul 12, 2016 at 04:24:10PM +0200, Arnd Bergmann wrote:
> > > > On Tuesday, July 12, 2016 10:18:11 AM CEST Vivek Goyal wrote:
> > > 
> > > /proc/devicetree (aka /sys/firmware/devicetree) is a filesystem derived
> > > from the raw DTB (which is exposed at /sys/firmware/fdt).
> > > 
> > > The blob that was handed to the kernel at boot time is exposed at
> > > /sys/firmware/fdt.
> > 
> > I believe the blob can be read and passed to kexec kernel in kernel code 
> > without
> > the extra fd.
> > 
> > But consider we can kexec to a different kernel and a different initrd so 
> > there
> > will be use cases to pass a total different dtb as well. From my 
> > understanding
> > it is reasonable but yes I think we should think carefully about the design.
> 
> Ok, I can see four interesting use cases here:
> 
> - Using the dtb that the kernel has saved at boot time. Ideally this should 
> not
>   require an additional step of signing it, since the running kernel already
>   trusts it.

We have sufficient information from the existing kexec_file_load syscall
prototype to do this in-kernel.

> - A dtb blob from the file system that was produced along with the kernel 
> image.
>   If we require a signature on the kernel, the the same requirement should be
>   made on the dtb. Whoever signs the kernel can also sign the dtb.
>   The tricky part here is the kernel command line that is part of the dtb
>   and that may need to be modified.

I suspect that for this case, following the example of the existing
sycall, we'd allow the kernel to modify bootargs and initrd properties
after verfiying the signature of the DTB.

The big question is whether this is a realistic case on a secure boot
system.

> - Modifying the dtb at for any of the reasons I listed: This should always
>   be possible when we do not use secure boot, just like booting an unsigned
>   kernel is.

This is possible with the existing kexec_load syscall, for the non
secure boot case.

> - kboot/petitboot with all of the user space being part of the trusted boot
>   chain: it would be good to allow these to modify the dtb as needed without
>   breaking the trust chain, just like we allow grub or u-boot to modify the 
> dtb
>   before passing it to the kernel.

It depends on *what* we need to modify here. We can modify the bootargs
and initrd properties as part of the kexec_file_load syscall, so what
else would we want to alter?

Thanks,
Mark.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Mark Rutland
On Wed, Jul 13, 2016 at 10:36:14AM +0800, Dave Young wrote:
> But consider we can kexec to a different kernel and a different initrd so 
> there
> will be use cases to pass a total different dtb as well.

It depends on what you mean by "a different kernel", and what this
implies for the DTB.

I expect future arm64 Linux kernels to function with today's DTBs, and
the existing boot protocol. The kexec_file_load syscall already has
enough information for the kernel to inject the initrd and bootargs
properties into a DTB.

In practice on x86 today, kexec_file_load only supports booting to a
Linux kernel, because the in-kernel purgatory only implements the x86
Linux boot protocol. Analagously, for arm64 I think that the first
kernel should use its internal copy of the boot DTB, with /chosen fixed
up appropriately, assuming the next kernel is an arm64 Linux image.

If booting another OS, the only parts of the DTB I would expect to
change are the properties under chosen, as everything else *should* be
OS-independent. However the other OS may have a completely different
boot protocol, might not even take a DTB, and will likely need a
compeltely different purgatory implementation. So just allowing the DTB
to be altered isn't sufficient for that case.

There might be cases where we want a different DTB, but as far as I can
tell we have nothing analagous on x86 today. If we do need this, we
should have an idea of what real case(s) were trying to solve.

Thanks,
Mark.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v22 3/8] arm64: kdump: implement machine_crash_shutdown()

2016-07-13 Thread Suzuki K Poulose

On 12/07/16 06:05, AKASHI Takahiro wrote:

Primary kernel calls machine_crash_shutdown() to shut down non-boot cpus
and save registers' status in per-cpu ELF notes before starting crash
dump kernel. See kernel_kexec().
Even if not all secondary cpus have shut down, we do kdump anyway.

As we don't have to make non-boot(crashed) cpus offline (to preserve
correct status of cpus at crash dump) before shutting down, this patch
also adds a variant of smp_send_stop().

Signed-off-by: AKASHI Takahiro 
---
 arch/arm64/include/asm/hardirq.h  |  2 +-
 arch/arm64/include/asm/kexec.h| 41 -
 arch/arm64/include/asm/smp.h  |  2 ++
 arch/arm64/kernel/machine_kexec.c | 56 --
 arch/arm64/kernel/smp.c   | 64 +++
 5 files changed, 160 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index 8740297..1473fc2 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -20,7 +20,7 @@
 #include 
 #include 

-#define NR_IPI 6
+#define NR_IPI 7

 typedef struct {
unsigned int __softirq_pending;
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 04744dc..a908958 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -40,7 +40,46 @@
 static inline void crash_setup_regs(struct pt_regs *newregs,
struct pt_regs *oldregs)
 {
-   /* Empty routine needed to avoid build errors. */
+   if (oldregs) {
+   memcpy(newregs, oldregs, sizeof(*newregs));
+   } else {
+   u64 tmp1, tmp2;
+
+   __asm__ __volatile__ (
+   "stpx0,   x1, [%2, #16 *  0]\n"
+   "stpx2,   x3, [%2, #16 *  1]\n"
+   "stpx4,   x5, [%2, #16 *  2]\n"
+   "stpx6,   x7, [%2, #16 *  3]\n"
+   "stpx8,   x9, [%2, #16 *  4]\n"
+   "stp   x10,  x11, [%2, #16 *  5]\n"
+   "stp   x12,  x13, [%2, #16 *  6]\n"
+   "stp   x14,  x15, [%2, #16 *  7]\n"
+   "stp   x16,  x17, [%2, #16 *  8]\n"
+   "stp   x18,  x19, [%2, #16 *  9]\n"
+   "stp   x20,  x21, [%2, #16 * 10]\n"
+   "stp   x22,  x23, [%2, #16 * 11]\n"
+   "stp   x24,  x25, [%2, #16 * 12]\n"
+   "stp   x26,  x27, [%2, #16 * 13]\n"
+   "stp   x28,  x29, [%2, #16 * 14]\n"
+   "mov%0,  sp\n"
+   "stp   x30,  %0,  [%2, #16 * 15]\n"
+
+   "/* faked current PSTATE */\n"
+   "mrs%0, CurrentEL\n"
+   "mrs%1, DAIF\n"
+   "orr%0, %0, %1\n"
+   "mrs%1, NZCV\n"
+   "orr%0, %0, %1\n"
+
+   /* pc */
+   "adr%1, 1f\n"
+   "1:\n"
+   "stp%1, %0,   [%2, #16 * 16]\n"
+   : "=r" (tmp1), "=r" (tmp2), "+r" (newregs)
+   :
+   : "memory"
+   );
+   }
 }

 #endif /* __ASSEMBLY__ */
diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 0226447..6b0f2c7 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -136,6 +136,8 @@ static inline void cpu_panic_kernel(void)
  */
 bool cpus_are_stuck_in_kernel(void);

+extern void smp_send_crash_stop(void);
+
 #endif /* ifndef __ASSEMBLY__ */

 #endif /* ifndef __ASM_SMP_H */
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index bc96c8a..8ac9dba8 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -9,6 +9,9 @@
  * published by the Free Software Foundation.
  */

+#include 
+#include 
+#include 
 #include 
 #include 

@@ -22,6 +25,7 @@
 extern const unsigned char arm64_relocate_new_kernel[];
 extern const unsigned long arm64_relocate_new_kernel_size;

+bool in_crash_kexec;
 static unsigned long kimage_start;

 /**
@@ -148,7 +152,8 @@ void machine_kexec(struct kimage *kimage)
/*
 * New cpus may have become stuck_in_kernel after we loaded the image.
 */
-   BUG_ON(cpus_are_stuck_in_kernel() || (num_online_cpus() > 1));
+   BUG_ON((cpus_are_stuck_in_kernel() || (num_online_cpus() > 1)) &&
+   !WARN_ON(in_crash_kexec));

reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
@@ -200,13 +205,58 @@ void machine_kexec(struct kimage *kimage)
 * 

Re: [PATCH v22 1/8] arm64: kdump: reserve memory for crash dump kernel

2016-07-13 Thread Suzuki K Poulose

On 12/07/16 06:05, AKASHI Takahiro wrote:

On the startup of primary kernel, the memory region used by crash dump
kernel must be specified by "crashkernel=" kernel parameter.
reserve_crashkernel() will allocate and reserve the region for later use.

User space tools, like kexec-tools, will be able to find that region as
- "Crash kernel" in /proc/iomem, or
- "linux,crashkernel-base" and "linux,crashkernel-size" under
  /sys/firmware/devicetree/base/chosen

Signed-off-by: AKASHI Takahiro 
Signed-off-by: Mark Salter 
Signed-off-by: Pratyush Anand 
Reviewed-by: James Morse 


Minor nits below.


---
 arch/arm64/kernel/setup.c |   7 ++-
 arch/arm64/mm/init.c  | 115 ++
 2 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index c1509e6..cb5eee0 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -31,7 +31,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -222,6 +221,12 @@ static void __init request_standard_resources(void)
kernel_data.end <= res->end)
request_resource(res, _data);
}
+
+#ifdef CONFIG_KEXEC_CORE
+   /* User space tools will find "Crash kernel" region in /proc/iomem. */
+   if (crashk_res.end)
+   insert_resource(_resource, _res);
+#endif


nit: Could we not do this from reserve_crashkernel() ?



 #include 
 #include 
@@ -76,6 +78,117 @@ static int __init early_initrd(char *p)
 early_param("initrd", early_initrd);
 #endif

+#ifdef CONFIG_KEXEC_CORE
+static unsigned long long crash_size, crash_base;
+static struct property crash_base_prop = {
+   .name = "linux,crashkernel-base",
+   .length = sizeof(u64),
+   .value = _base
+};
+static struct property crash_size_prop = {
+   .name = "linux,crashkernel-size",
+   .length = sizeof(u64),
+   .value = _size,
+};
+



+
+/*
+ * reserve_crashkernel() - reserves memory for crash kernel
+ *
+ * This function reserves memory area given in "crashkernel=" kernel command
+ * line parameter. The memory reserved is used by dump capture kernel when
+ * primary kernel is crashing.
+ */
+static void __init reserve_crashkernel(void)
+{
+   int ret;
+
+   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+   _size, _base);
+   /* no crashkernel= or invalid value specified */
+   if (ret || !crash_size)
+   return;
+
+   if (crash_base == 0) {


...


+   memblock_reserve(crash_base, crash_size);
+
+   } else {


...


+
+   memblock_reserve(crash_base, crash_size);



+   }


Nit: you could move the memblock_reserve() here to a single place.


+
+   pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
+   crash_size >> 20, crash_base >> 20);
+
+   crashk_res.start = crash_base;
+   crashk_res.end = crash_base + crash_size - 1;


As mentioned above, may be we could move the insert_resource() here, which
would keep the generic setup.c code cleaner and is a bit more reader friendly.


+}
+#else


Suzuki


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2] kexec: Fix kdump failure with notsc

2016-07-13 Thread Baoquan He
On 07/13/16 at 07:46am, Wei, Jiangang wrote:
> On Mon, 2016-07-11 at 18:28 +0800, Wei Jiangang wrote:
> > Hi , Ingo
> > 
> > On Fri, 2016-07-08 at 09:38 +0200, Ingo Molnar wrote:
> > > * Eric W. Biederman  wrote:
> > > 
> > > > Sigh.  Can we please just do the work to rip out the apic shutdown code 
> > > > from the 
> > > > kexec on panic code path?
> > > > 
> > > > I forgetting details but the only reason we have do any apic shutdown 
> > > > is bugs in 
> > > > older kernels that could not initialize a system properly if we did not 
> > > > shut 
> > > > down the apics.
> > > > 
> > > > I certainly don't see an issue with goofy cases like notsc not working 
> > > > on a 
> > > > crash capture kernel if we are not initializing the hardware properly.
> > > > 
> > > > The strategy really needs to be to only do the absolutely essential 
> > > > hardware 
> > > > shutdown in the crashing kernel, every adintional line of code we 
> > > > execute in the 
> > > > crashing kernel increases our chances of hitting a bug.
> > > 
> > > Fully agreed.
> > > 
> > > > Under that policy things like requring we don't pass boot options that 
> > > > inhibit 
> > > > the dump catpure kernel from initializing the hardware from a random 
> > > > state are 
> > > > reasonable requirements.  AKA I don't see any justification in this as 
> > > > to why we 
> > > > would even want to support notsc on the dump capture kernel.  
> > > > Especially when 
> > > > things clearly work when that option is not specified.
> > > 
> > > So at least on the surface it appears 'surprising' that the 'notsc' 
> > > option (which, 
> > > supposedly, disables TSC handling) interferes with being able to fully 
> > > boot. Even 
> > > if 'notsc' is specified we are still using the local APIC, right?
> > 
> > In most case,  It's no problem that using local APIC while notsc is
> > specified.
> > But not for kdump.
> > 
> > We can get evidence, Especially from "Spurious LAPIC timer interrupt on
> > cpu 0".
> > 
> > ###serial log,
> > 
> > [0.00] NR_IRQS:524544 nr_irqs:256 16
> > [0.00] Spurious LAPIC timer interrupt on cpu 0
> > [0.00] Console: colour dummy device 80x25
> > [0.00] console [tty0] enabled
> > [0.00] console [ttyS0] enabled
> > [0.00] tsc: Fast TSC calibration using PIT
> > [0.00] tsc: Detected 2099.947 MHz processor
> > [0.00] Calibrating delay loop...
> > 
> > 
> > Due to the local apic and local apic timer hasn't been setup and enabled
> > fully, The event_handler of clock event is NULL.
> > 
> > ###codes,
> > 
> > static void local_apic_timer_interrupt(void)
> > {
> > int cpu = smp_processor_id();
> > struct clock_event_device *evt = _cpu(lapic_events, cpu);
> > 
> > /*
> >  * Normally we should not be here till LAPIC has been initialized
> > but
> >  * in some cases like kdump, its possible that there is a pending
> > LAPIC
> >  * timer interrupt from previous kernel's context and is delivered
> > in
> >  * new kernel the moment interrupts are enabled.
> >  *
> >  * Interrupts are enabled early and LAPIC is setup much later, hence
> >  * its possible that when we get here evt->event_handler is NULL.
> >  * Check for event_handler being NULL and discard the interrupt as
> >  * spurious.
> >  */
> > if (!evt->event_handler) {
> > pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu);
> > /* Switch it off */
> > lapic_timer_shutdown(evt);
> > return;
> > }
> > 
> >  .
> > }
> > 
> > 
> > IMHO, 
> > If we specify notsc, the dump-capture kernel waits for jiffies being
> > updated early and LAPIC and timer are setup much later, which causes no
> > timer interrupts is passed to BSP. as following,
> > 
> > start_kernel  -->
> > 1)-> calibrate_delay()  -> calibrate_delay_converge()  # hang and wait
> > for jiffies changed
> >
> > 2)-> rest_init() -> kernel_init() ->  -> apic_bsp_setup() ->
> > setup_local_APIC()
> > 
> > -> setup_percpu_clockev().
> > 
> > the setup_percpu_clockev points setup_boot_APIC_clock() which used to
> > setup the boot APIC and timer.
> > 
> > 
> > > So it might be a good idea to find the root cause of this bootup 
> > > fragility even if 
> > > 'notsc' is specified. And I fully agree that it should be fixed in the 
> > > bootup path 
> > > of the dump kernel, not the crash kernel reboot path.
> > 
> 
> Can anyone give some advice or commet on the following idea?
> Thanks in advance.

You can't do this.

The reason is disable_IO_APIC not only disable io-apic by masking each
pin but setup the apic virtual wire mode.

I dig code and found somethings really interesting. lapic_shutdown()
really disable local apic by writing the APIC software enable/disable
flag in the spurious-interrupt vector register, please check intel arch
manual section 10.4.3. However disable_IO_APIC() softwae enable local

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Petr Tesarik
On Wed, 13 Jul 2016 09:26:39 +0100
Russell King - ARM Linux  wrote:

> On Wed, Jul 13, 2016 at 05:55:33PM +1000, Stewart Smith wrote:
> > Russell King - ARM Linux  writes:
> > > On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
> > >> Russell King - ARM Linux  writes:
> > >> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
> > >> >> I'm not an expert on DTB, so I can't provide an example of code
> > >> >> execution, but you have already mentioned the 
> > >> >> /chosen/linux,stdout-path
> > >> >> property. If an attacker redirects the bootloader to an insecure
> > >> >> console, they may get access to the system that would otherwise be
> > >> >> impossible.
> > >> >
> > >> > I fail to see how kexec connects with the boot loader - the DTB image
> > >> > that's being talked about is one which is passed from the currently
> > >> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
> > >> > also ARM64) that's a direct call chain which doesn't involve any
> > >> > boot loader or firmware, and certainly none that would involve the
> > >> > passed DTB image.
> > >> 
> > >> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
> > >> linux kernel and initramfs with a UI (petitboot) - this means we never
> > >> have to write a device driver twice: write a kernel one and you're done
> > >> (for booting from the device and using it in your OS).
> > >
> > > I think you misunderstood my point.
> > >
> > > On ARM, we do not go:
> > >
> > >   kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
> > >
> > > but we go:
> > >
> > >   kernel (kexec'd from) -> kernel (kexec'd to)
> > >
> > > There's no intermediate step involving any bootloader.
> > >
> > > Hence, my point is that the dtb loaded by kexec is _only_ used by the
> > > kernel which is being kexec'd to, not by the bootloader, nor indeed
> > > the kernel which it is loaded into.
> > >
> > > Moreover, if you read the bit that I quoted (which is what I was
> > > replying to), you'll notice that it is talking about the DTB loaded
> > > by kexec somehow causing the _bootloader_ to be redirected to an
> > > alternative console.  This point is wholely false on ARM.
> > 
> > Ahh.. I missed the bootloader bit there.
> > 
> > In which case, we're the same on OpenPOWER, there is no intermediate
> > bootloader - in our case we have linux (with kexec) taking on what uboot
> > or grub is typically used for on other platforms.
> 
> Indeed - maybe Eric knows better, but I can't see any situation where
> the dtb we load via kexec should ever affect "the bootloader", unless
> the "kernel" that's being loaded into kexec is "the bootloader".
> 
> Now, going back to the more fundamental issue raised in my first reply,
> about the kernel command line.
> 
> On x86, I can see that it _is_ possible for userspace to specify a
> command line, and the kernel loading the image provides the command
> line to the to-be-kexeced kernel with very little checking.  So, if
> your kernel is signed, what stops the "insecure userspace" loading
> a signed kernel but giving it an insecure rootfs and/or console?

This is a valid point. If there are kernel options that can be misused
to defeat the purpose of UEFI SecureBoot, then we're in trouble.
Generally, the Linux kernel should treat the command line as untrusted
source.

My point is that modifying the DTB opens a completely new attack
vector. And the goal is not extending the attack surface (because there
are holes in it already), but reducing the attack surface (e.g. by
limiting available kernel command line options).

Petr T

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH 2/2] pefile: Fix the failure of calculation for digest

2016-07-13 Thread Lans Zhang
The commit e68503bd forgot to set digest_len and thus cause the following
error reported by kexec when launching a crash kernel:
"kexec_file_load failed: Bad message"

Signed-off-by: Lans Zhang 
Cc: David Howells 
Cc: Dave Young 
Cc: Baoquan He 
Cc: Vivek Goyal 
---
 crypto/asymmetric_keys/mscode_parser.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crypto/asymmetric_keys/mscode_parser.c 
b/crypto/asymmetric_keys/mscode_parser.c
index 6a76d5c..9492e1c 100644
--- a/crypto/asymmetric_keys/mscode_parser.c
+++ b/crypto/asymmetric_keys/mscode_parser.c
@@ -124,5 +124,10 @@ int mscode_note_digest(void *context, size_t hdrlen,
struct pefile_context *ctx = context;
 
ctx->digest = kmemdup(value, vlen, GFP_KERNEL);
-   return ctx->digest ? 0 : -ENOMEM;
+   if (!ctx->digest)
+   return -ENOMEM;
+
+   ctx->digest_len = vlen;
+
+   return 0;
 }
-- 
1.9.1


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH 1/2] PKCS#7: Fix kernel panic when referring to the empty AuthorityKeyIdentifier

2016-07-13 Thread Lans Zhang
This fix resolves the following kernel panic if the empty 
AuthorityKeyIdentifier employed.

[  459.041989] PKEY: <==public_key_verify_signature() = 0
[  459.041993] PKCS7: Verified signature 1
[  459.041995] PKCS7: ==> pkcs7_verify_sig_chain()
[  459.041999] PKCS7: verify Sample DB Certificate for SCP: 01
[  459.042002] PKCS7: - issuer Sample KEK Certificate for SCP
[  459.042014] BUG: unable to handle kernel NULL pointer dereference at 
  (null)
[  459.042135] IP: [] pkcs7_verify+0x72c/0x7f0
[  459.042217] PGD 739e6067 PUD 77719067 PMD 0
[  459.042286] Oops:  [#1] PREEMPT SMP
[  459.042328] Modules linked in:
[  459.042368] CPU: 0 PID: 474 Comm: kexec Not tainted 
4.7.0-rc7-WR8.0.0.0_standard+ #18
[  459.042462] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./Aptio CRB, BIOS 5.6.5 10/09/2014
[  459.042586] task: 880073a5 ti: 8800738e8000 task.ti: 
8800738e8000
[  459.042675] RIP: 0010:[]  [] 
pkcs7_verify+0x72c/0x7f0
[  459.042784] RSP: 0018:8800738ebd58  EFLAGS: 00010246
[  459.042845] RAX:  RBX: 880076b7da80 RCX: 0006
[  459.042929] RDX: 0001 RSI: 81c85001 RDI: 81ca00a9
[  459.043014] RBP: 8800738ebd98 R08: 0400 R09: 8800788a304c
[  459.043098] R10:  R11: 60ca R12: 8800769a2bc0
[  459.043182] R13: 880077358300 R14:  R15: 8800769a2dc0
[  459.043268] FS:  7f24cc741700() GS:880074e0() 
knlGS:
[  459.043365] CS:  0010 DS:  ES:  CR0: 80050033
[  459.043431] CR2:  CR3: 73a36000 CR4: 001006f0
[  459.043514] Stack:
[  459.043530]   ffbf0020 31ff813e68b0 
0002
[  459.043644]  8800769a2bc0  007197b8 
0002
[  459.043756]  8800738ebdd8 81153fb1  

[  459.043869] Call Trace:
[  459.043898]  [] verify_pkcs7_signature+0x61/0x140
[  459.043974]  [] verify_pefile_signature+0x2cb/0x830
[  459.044052]  [] ? verify_pefile_signature+0x830/0x830
[  459.044134]  [] bzImage64_verify_sig+0x15/0x20
[  459.046332]  [] arch_kexec_kernel_verify_sig+0x29/0x40
[  459.048552]  [] SyS_kexec_file_load+0x1f4/0x6c0
[  459.050768]  [] ? __do_page_fault+0x1b6/0x550
[  459.052996]  [] entry_SYSCALL_64_fastpath+0x17/0x93
[  459.055242] Code: e8 0a d6 ff ff 85 c0 0f 88 7a fb ff ff 4d 39 fd 4d 89 7d 
08 74 45 4d 89 fd e9 14 fe ff ff 4d 8b 76 08 31 c0 48 c7 c7 a9 00 ca 81 <41> 0f 
b7 36 49 8d 56 02 e8 d0 91 d6 ff 4d 8b 3c 24 4d 85 ff 0f
[  459.060535] RIP  [] pkcs7_verify+0x72c/0x7f0
[  459.063040]  RSP 
[  459.065456] CR2: 
[  459.075998] ---[ end trace c15f0e897cda28dc ]---

Signed-off-by: Lans Zhang 
Cc: David Howells 
Cc: Dave Young 
Cc: Baoquan He 
Cc: Vivek Goyal 
---
 crypto/asymmetric_keys/pkcs7_verify.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/crypto/asymmetric_keys/pkcs7_verify.c 
b/crypto/asymmetric_keys/pkcs7_verify.c
index 44b746e..13dcc97 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -227,8 +227,7 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message 
*pkcs7,
if (asymmetric_key_id_same(p->id, auth))
goto found_issuer_check_skid;
}
-   } else {
-   auth = sig->auth_ids[1];
+   } else if ((auth = sig->auth_ids[1])) {
pr_debug("- want %*phN\n", auth->len, auth->data);
for (p = pkcs7->certs; p; p = p->next) {
if (!p->skid)
-- 
1.9.1


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Dave Young
[snip]
> Now, going back to the more fundamental issue raised in my first reply,
> about the kernel command line.
> 
> On x86, I can see that it _is_ possible for userspace to specify a
> command line, and the kernel loading the image provides the command
> line to the to-be-kexeced kernel with very little checking.  So, if
> your kernel is signed, what stops the "insecure userspace" loading
> a signed kernel but giving it an insecure rootfs and/or console?

The kexec_file_load syscall was introduced for secure boot in the first
place. In case UEFI secure boot the signature verification chain only
covers kernel mode binaries. I think there is such problem in both normal
boot and kexec boot.

Thanks
Dave


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Russell King - ARM Linux
On Wed, Jul 13, 2016 at 05:55:33PM +1000, Stewart Smith wrote:
> Russell King - ARM Linux  writes:
> > On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
> >> Russell King - ARM Linux  writes:
> >> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
> >> >> I'm not an expert on DTB, so I can't provide an example of code
> >> >> execution, but you have already mentioned the /chosen/linux,stdout-path
> >> >> property. If an attacker redirects the bootloader to an insecure
> >> >> console, they may get access to the system that would otherwise be
> >> >> impossible.
> >> >
> >> > I fail to see how kexec connects with the boot loader - the DTB image
> >> > that's being talked about is one which is passed from the currently
> >> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
> >> > also ARM64) that's a direct call chain which doesn't involve any
> >> > boot loader or firmware, and certainly none that would involve the
> >> > passed DTB image.
> >> 
> >> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
> >> linux kernel and initramfs with a UI (petitboot) - this means we never
> >> have to write a device driver twice: write a kernel one and you're done
> >> (for booting from the device and using it in your OS).
> >
> > I think you misunderstood my point.
> >
> > On ARM, we do not go:
> >
> > kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
> >
> > but we go:
> >
> > kernel (kexec'd from) -> kernel (kexec'd to)
> >
> > There's no intermediate step involving any bootloader.
> >
> > Hence, my point is that the dtb loaded by kexec is _only_ used by the
> > kernel which is being kexec'd to, not by the bootloader, nor indeed
> > the kernel which it is loaded into.
> >
> > Moreover, if you read the bit that I quoted (which is what I was
> > replying to), you'll notice that it is talking about the DTB loaded
> > by kexec somehow causing the _bootloader_ to be redirected to an
> > alternative console.  This point is wholely false on ARM.
> 
> Ahh.. I missed the bootloader bit there.
> 
> In which case, we're the same on OpenPOWER, there is no intermediate
> bootloader - in our case we have linux (with kexec) taking on what uboot
> or grub is typically used for on other platforms.

Indeed - maybe Eric knows better, but I can't see any situation where
the dtb we load via kexec should ever affect "the bootloader", unless
the "kernel" that's being loaded into kexec is "the bootloader".

Now, going back to the more fundamental issue raised in my first reply,
about the kernel command line.

On x86, I can see that it _is_ possible for userspace to specify a
command line, and the kernel loading the image provides the command
line to the to-be-kexeced kernel with very little checking.  So, if
your kernel is signed, what stops the "insecure userspace" loading
a signed kernel but giving it an insecure rootfs and/or console?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Stewart Smith
Arnd Bergmann  writes:
> On Wednesday, July 13, 2016 10:36:14 AM CEST Dave Young wrote:
>> On 07/12/16 at 03:50pm, Mark Rutland wrote:
>> > On Tue, Jul 12, 2016 at 04:24:10PM +0200, Arnd Bergmann wrote:
>> > > On Tuesday, July 12, 2016 10:18:11 AM CEST Vivek Goyal wrote:
>> > 
>> > /proc/devicetree (aka /sys/firmware/devicetree) is a filesystem derived
>> > from the raw DTB (which is exposed at /sys/firmware/fdt).
>> > 
>> > The blob that was handed to the kernel at boot time is exposed at
>> > /sys/firmware/fdt.
>> 
>> I believe the blob can be read and passed to kexec kernel in kernel code 
>> without
>> the extra fd.
>> 
>> But consider we can kexec to a different kernel and a different initrd so 
>> there
>> will be use cases to pass a total different dtb as well. From my 
>> understanding
>> it is reasonable but yes I think we should think carefully about the design.
>
> Ok, I can see four interesting use cases here:
>
> - Using the dtb that the kernel has saved at boot time. Ideally this should 
> not
>   require an additional step of signing it, since the running kernel already
>   trusts it.

- using current view of the hardware, flattened into a new dtb.
  This should already be trusted, as it's what we're running now (boot +
  runtime changes)

-- 
Stewart Smith
OPAL Architect, IBM.


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Stewart Smith
Ard Biesheuvel  writes:
> On 13 July 2016 at 09:36, Russell King - ARM Linux
>  wrote:
>> On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
>>> Russell King - ARM Linux  writes:
>>> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
>>> >> I'm not an expert on DTB, so I can't provide an example of code
>>> >> execution, but you have already mentioned the /chosen/linux,stdout-path
>>> >> property. If an attacker redirects the bootloader to an insecure
>>> >> console, they may get access to the system that would otherwise be
>>> >> impossible.
>>> >
>>> > I fail to see how kexec connects with the boot loader - the DTB image
>>> > that's being talked about is one which is passed from the currently
>>> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
>>> > also ARM64) that's a direct call chain which doesn't involve any
>>> > boot loader or firmware, and certainly none that would involve the
>>> > passed DTB image.
>>>
>>> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
>>> linux kernel and initramfs with a UI (petitboot) - this means we never
>>> have to write a device driver twice: write a kernel one and you're done
>>> (for booting from the device and using it in your OS).
>>
>> I think you misunderstood my point.
>>
>> On ARM, we do not go:
>>
>> kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
>>
>> but we go:
>>
>> kernel (kexec'd from) -> kernel (kexec'd to)
>>
>> There's no intermediate step involving any bootloader.
>>
>> Hence, my point is that the dtb loaded by kexec is _only_ used by the
>> kernel which is being kexec'd to, not by the bootloader, nor indeed
>> the kernel which it is loaded into.
>>
>> Moreover, if you read the bit that I quoted (which is what I was
>> replying to), you'll notice that it is talking about the DTB loaded
>> by kexec somehow causing the _bootloader_ to be redirected to an
>> alternative console.  This point is wholely false on ARM.
>>
>
> The particular example may not apply, but the argument that the DTB
> -as a description of the hardware topology- needs to be signed if the
> kernel is also signed is valid. We do the same in the UEFI stub, i.e.,
> it normally takes a dtb= argument to allow the DTB to be overridden,
> but this feature is disabled when Secure Boot is in effect. By the
> same reasoning, if any kind of kexec kernel image validation is in
> effect, we should either validate the DTB image as well, or disallow
> external DTBs and only perform kexec with the kernel's current DTB
> (the blob it was booted with, not the unflattened data structure)

DTB booted with != current description of hardware

We could have had: PCI hotplug, CPU/memory/cache offlined due to
hardware error, change in available pstates / CPU frequencies.

There is merit in having a signed dtb if you're booting a signed kernel
in a secure boot scenario. However, we still need to set up /chosen/ and
we still need a way to do something like the offb hack.

-- 
Stewart Smith
OPAL Architect, IBM.


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Russell King - ARM Linux
On Wed, Jul 13, 2016 at 09:47:56AM +0200, Ard Biesheuvel wrote:
> On 13 July 2016 at 09:36, Russell King - ARM Linux
>  wrote:
> > On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
> >> Russell King - ARM Linux  writes:
> >> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
> >> >> I'm not an expert on DTB, so I can't provide an example of code
> >> >> execution, but you have already mentioned the /chosen/linux,stdout-path
> >> >> property. If an attacker redirects the bootloader to an insecure
> >> >> console, they may get access to the system that would otherwise be
> >> >> impossible.
> >> >
> >> > I fail to see how kexec connects with the boot loader - the DTB image
> >> > that's being talked about is one which is passed from the currently
> >> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
> >> > also ARM64) that's a direct call chain which doesn't involve any
> >> > boot loader or firmware, and certainly none that would involve the
> >> > passed DTB image.
> >>
> >> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
> >> linux kernel and initramfs with a UI (petitboot) - this means we never
> >> have to write a device driver twice: write a kernel one and you're done
> >> (for booting from the device and using it in your OS).
> >
> > I think you misunderstood my point.
> >
> > On ARM, we do not go:
> >
> > kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
> >
> > but we go:
> >
> > kernel (kexec'd from) -> kernel (kexec'd to)
> >
> > There's no intermediate step involving any bootloader.
> >
> > Hence, my point is that the dtb loaded by kexec is _only_ used by the
> > kernel which is being kexec'd to, not by the bootloader, nor indeed
> > the kernel which it is loaded into.
> >
> > Moreover, if you read the bit that I quoted (which is what I was
> > replying to), you'll notice that it is talking about the DTB loaded
> > by kexec somehow causing the _bootloader_ to be redirected to an
> > alternative console.  This point is wholely false on ARM.
> >
> 
> The particular example may not apply, but the argument that the DTB
> -as a description of the hardware topology- needs to be signed if the
> kernel is also signed is valid. We do the same in the UEFI stub, i.e.,
> it normally takes a dtb= argument to allow the DTB to be overridden,
> but this feature is disabled when Secure Boot is in effect. By the
> same reasoning, if any kind of kexec kernel image validation is in
> effect, we should either validate the DTB image as well, or disallow
> external DTBs and only perform kexec with the kernel's current DTB
> (the blob it was booted with, not the unflattened data structure)

*Sigh*  yes, I know full well, which is why I said what I said in my
_first_ reply:

"However, your point is valid as an attacker can redirect the console
 and/or mounted root on the to-be-kexec'd kernel if they can modify
 the DTB - and there's a whole host of subtle ways to do that, not
 necessarily just modification of the kernel command line."

and I went on to raise a valid point about the necessity to do that
for crashdump, which has been _completely_ ignored.

So, I just stopped reading your reply after the first three lines,
because we are in fact in agreement... but thanks for trying to waste
my time.

Please, keep with the overall discussion, and stop replying to a single
email as a whole point in isolation to every other email in the thread.
And stop bikeshedding, by picking up on the easy stuff but ignoring the
more fundamental points, like the crashdump issue I mentioned in my
first reply and now this reply.

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered

2016-07-13 Thread Atsushi Kumagai
Hello Petr,

I'm happy to get your reply.

>On Tue, 7 Jun 2016 04:18:48 +
>Atsushi Kumagai  wrote:
>
>> >>+static void
>> >>+exclude_nodata_pages(struct cycle *cycle)
>> >>+{
>> >>+  int i;
>> >>+  unsigned long long phys_start, phys_end;
>> >>+  off_t file_size;
>> >>+
>> >>+  i = 0;
>> >>+  while (get_pt_load_extents(i, _start, _end,
>> >>+ NULL, _size)) {
>> >>+  unsigned long long pfn, pfn_end;
>> >>+
>> >>+  pfn = paddr_to_pfn(phys_start + file_size);
>> >>+  pfn_end = paddr_to_pfn(phys_end);
>> >
>> >Does this code exclude the first pfn of out of PT_LOAD even if there is
>> >no unstored pages ? pfn and pfn_end will point at the next pfn to the
>> >last pfn of PT_LOAD.
>
>Indeed, phys_end is in fact the first physical address that is _not_
>contained in the segment.
>
>Good catch!
>
>> >This will be problem for the environments which have a overlapped PT_LOAD
>> >segment like ia64.
>
>I think it's quite the opposite. Partial pages on IA64 are the only
>ones that were handled correctly with the old code.

I didn't take care partial pages, just considered the case below:

   pfn 1   2   3   4   5   6   7

PT_LAOD(1)   |---+---+---+---|
PT_LAOD(2)   |---+---+---+---|

During checking PT_LOAD(1), the bit of pfn:5 will be dropped if
mem_size equals file_size, but practically the pfn is an used page
as PT_LAOD(2) shows.
If there is only PT_LOAD(1), pfn:5 must be on hole and the actual
harm doesn't occur.

>> >
>> >>+  if (pfn < cycle->start_pfn)
>> >>+  pfn = cycle->start_pfn;
>> >>+  if (pfn_end >= cycle->end_pfn)
>> >>+  pfn_end = cycle->end_pfn - 1;
>> >>+  while (pfn <= pfn_end) {
>> >
>> >Should we change this condition to "pfn < pfn_end" ?
>
>Better than nothing, but if the last page of a LOAD segment is not
>partial, it will not be excluded. Except for partial pages, end_pfn
>refers to the first PFN _after_ the current segment.

If mem_size equals file_size, the last page of a LOAD also shouldn't
be removed. However, the condition of "pfn <= pfn_end" can remove it
since end_pfn refers to the first PFN _after_ the current segment.
I just fixed it.

>> >
>> >>+  clear_bit_on_2nd_bitmap(pfn, cycle);
>> >>+  ++pfn;
>> >>+  }
>> >>+  ++i;
>> >>+  }
>> >>+}
>>
>> I fixed this as below for v1.6.0.
>> Of course your comment would still be helpful.
>
>It comes too late, I'm afraid.
>
>In all ia64 dumps with partial pages I have seen, the full content of
>the partial page is indeed stored in another segment, so the page would
>be marked incorrectly. OTOH I think the off-by-one bug you fixed
>affected all segments, even those without any partial pages; it would
>incorrectly mark the first PFN after the segment as filtered.
>
>The perfect solution is to round the starting PFN up instead of down to
>preserve partial pages, but I'll have to think about it some more
>before I can send a patch.

Again, I didn't consider partial page cases, I also should reconsider this.
Thank you for pointing it out.


Regards,
Atsushi Kumagai

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Arnd Bergmann
On Wednesday, July 13, 2016 10:36:14 AM CEST Dave Young wrote:
> On 07/12/16 at 03:50pm, Mark Rutland wrote:
> > On Tue, Jul 12, 2016 at 04:24:10PM +0200, Arnd Bergmann wrote:
> > > On Tuesday, July 12, 2016 10:18:11 AM CEST Vivek Goyal wrote:
> > 
> > /proc/devicetree (aka /sys/firmware/devicetree) is a filesystem derived
> > from the raw DTB (which is exposed at /sys/firmware/fdt).
> > 
> > The blob that was handed to the kernel at boot time is exposed at
> > /sys/firmware/fdt.
> 
> I believe the blob can be read and passed to kexec kernel in kernel code 
> without
> the extra fd.
> 
> But consider we can kexec to a different kernel and a different initrd so 
> there
> will be use cases to pass a total different dtb as well. From my understanding
> it is reasonable but yes I think we should think carefully about the design.

Ok, I can see four interesting use cases here:

- Using the dtb that the kernel has saved at boot time. Ideally this should not
  require an additional step of signing it, since the running kernel already
  trusts it.

- A dtb blob from the file system that was produced along with the kernel image.
  If we require a signature on the kernel, the the same requirement should be
  made on the dtb. Whoever signs the kernel can also sign the dtb.
  The tricky part here is the kernel command line that is part of the dtb
  and that may need to be modified.

- Modifying the dtb at for any of the reasons I listed: This should always
  be possible when we do not use secure boot, just like booting an unsigned
  kernel is.

- kboot/petitboot with all of the user space being part of the trusted boot
  chain: it would be good to allow these to modify the dtb as needed without
  breaking the trust chain, just like we allow grub or u-boot to modify the dtb
  before passing it to the kernel.

Arnd

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Stewart Smith
Russell King - ARM Linux  writes:
> On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
>> Russell King - ARM Linux  writes:
>> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
>> >> I'm not an expert on DTB, so I can't provide an example of code
>> >> execution, but you have already mentioned the /chosen/linux,stdout-path
>> >> property. If an attacker redirects the bootloader to an insecure
>> >> console, they may get access to the system that would otherwise be
>> >> impossible.
>> >
>> > I fail to see how kexec connects with the boot loader - the DTB image
>> > that's being talked about is one which is passed from the currently
>> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
>> > also ARM64) that's a direct call chain which doesn't involve any
>> > boot loader or firmware, and certainly none that would involve the
>> > passed DTB image.
>> 
>> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
>> linux kernel and initramfs with a UI (petitboot) - this means we never
>> have to write a device driver twice: write a kernel one and you're done
>> (for booting from the device and using it in your OS).
>
> I think you misunderstood my point.
>
> On ARM, we do not go:
>
>   kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
>
> but we go:
>
>   kernel (kexec'd from) -> kernel (kexec'd to)
>
> There's no intermediate step involving any bootloader.
>
> Hence, my point is that the dtb loaded by kexec is _only_ used by the
> kernel which is being kexec'd to, not by the bootloader, nor indeed
> the kernel which it is loaded into.
>
> Moreover, if you read the bit that I quoted (which is what I was
> replying to), you'll notice that it is talking about the DTB loaded
> by kexec somehow causing the _bootloader_ to be redirected to an
> alternative console.  This point is wholely false on ARM.

Ahh.. I missed the bootloader bit there.

In which case, we're the same on OpenPOWER, there is no intermediate
bootloader - in our case we have linux (with kexec) taking on what uboot
or grub is typically used for on other platforms.


-- 
Stewart Smith
OPAL Architect, IBM.


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Ard Biesheuvel
On 13 July 2016 at 09:36, Russell King - ARM Linux
 wrote:
> On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
>> Russell King - ARM Linux  writes:
>> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
>> >> I'm not an expert on DTB, so I can't provide an example of code
>> >> execution, but you have already mentioned the /chosen/linux,stdout-path
>> >> property. If an attacker redirects the bootloader to an insecure
>> >> console, they may get access to the system that would otherwise be
>> >> impossible.
>> >
>> > I fail to see how kexec connects with the boot loader - the DTB image
>> > that's being talked about is one which is passed from the currently
>> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
>> > also ARM64) that's a direct call chain which doesn't involve any
>> > boot loader or firmware, and certainly none that would involve the
>> > passed DTB image.
>>
>> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
>> linux kernel and initramfs with a UI (petitboot) - this means we never
>> have to write a device driver twice: write a kernel one and you're done
>> (for booting from the device and using it in your OS).
>
> I think you misunderstood my point.
>
> On ARM, we do not go:
>
> kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
>
> but we go:
>
> kernel (kexec'd from) -> kernel (kexec'd to)
>
> There's no intermediate step involving any bootloader.
>
> Hence, my point is that the dtb loaded by kexec is _only_ used by the
> kernel which is being kexec'd to, not by the bootloader, nor indeed
> the kernel which it is loaded into.
>
> Moreover, if you read the bit that I quoted (which is what I was
> replying to), you'll notice that it is talking about the DTB loaded
> by kexec somehow causing the _bootloader_ to be redirected to an
> alternative console.  This point is wholely false on ARM.
>

The particular example may not apply, but the argument that the DTB
-as a description of the hardware topology- needs to be signed if the
kernel is also signed is valid. We do the same in the UEFI stub, i.e.,
it normally takes a dtb= argument to allow the DTB to be overridden,
but this feature is disabled when Secure Boot is in effect. By the
same reasoning, if any kind of kexec kernel image validation is in
effect, we should either validate the DTB image as well, or disallow
external DTBs and only perform kexec with the kernel's current DTB
(the blob it was booted with, not the unflattened data structure)

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2] kexec: Fix kdump failure with notsc

2016-07-13 Thread Wei, Jiangang
On Mon, 2016-07-11 at 18:28 +0800, Wei Jiangang wrote:
> Hi , Ingo
> 
> On Fri, 2016-07-08 at 09:38 +0200, Ingo Molnar wrote:
> > * Eric W. Biederman  wrote:
> > 
> > > Sigh.  Can we please just do the work to rip out the apic shutdown code 
> > > from the 
> > > kexec on panic code path?
> > > 
> > > I forgetting details but the only reason we have do any apic shutdown is 
> > > bugs in 
> > > older kernels that could not initialize a system properly if we did not 
> > > shut 
> > > down the apics.
> > > 
> > > I certainly don't see an issue with goofy cases like notsc not working on 
> > > a 
> > > crash capture kernel if we are not initializing the hardware properly.
> > > 
> > > The strategy really needs to be to only do the absolutely essential 
> > > hardware 
> > > shutdown in the crashing kernel, every adintional line of code we execute 
> > > in the 
> > > crashing kernel increases our chances of hitting a bug.
> > 
> > Fully agreed.
> > 
> > > Under that policy things like requring we don't pass boot options that 
> > > inhibit 
> > > the dump catpure kernel from initializing the hardware from a random 
> > > state are 
> > > reasonable requirements.  AKA I don't see any justification in this as to 
> > > why we 
> > > would even want to support notsc on the dump capture kernel.  Especially 
> > > when 
> > > things clearly work when that option is not specified.
> > 
> > So at least on the surface it appears 'surprising' that the 'notsc' option 
> > (which, 
> > supposedly, disables TSC handling) interferes with being able to fully 
> > boot. Even 
> > if 'notsc' is specified we are still using the local APIC, right?
> 
> In most case,  It's no problem that using local APIC while notsc is
> specified.
> But not for kdump.
> 
> We can get evidence, Especially from "Spurious LAPIC timer interrupt on
> cpu 0".
> 
> ###serial log,
> 
> [0.00] NR_IRQS:524544 nr_irqs:256 16
> [0.00] Spurious LAPIC timer interrupt on cpu 0
> [0.00] Console: colour dummy device 80x25
> [0.00] console [tty0] enabled
> [0.00] console [ttyS0] enabled
> [0.00] tsc: Fast TSC calibration using PIT
> [0.00] tsc: Detected 2099.947 MHz processor
> [0.00] Calibrating delay loop...
> 
> 
> Due to the local apic and local apic timer hasn't been setup and enabled
> fully, The event_handler of clock event is NULL.
> 
> ###codes,
> 
> static void local_apic_timer_interrupt(void)
> {
> int cpu = smp_processor_id();
> struct clock_event_device *evt = _cpu(lapic_events, cpu);
> 
> /*
>  * Normally we should not be here till LAPIC has been initialized
> but
>  * in some cases like kdump, its possible that there is a pending
> LAPIC
>  * timer interrupt from previous kernel's context and is delivered
> in
>  * new kernel the moment interrupts are enabled.
>  *
>  * Interrupts are enabled early and LAPIC is setup much later, hence
>  * its possible that when we get here evt->event_handler is NULL.
>  * Check for event_handler being NULL and discard the interrupt as
>  * spurious.
>  */
> if (!evt->event_handler) {
> pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu);
> /* Switch it off */
> lapic_timer_shutdown(evt);
> return;
> }
> 
>  .
> }
> 
> 
> IMHO, 
> If we specify notsc, the dump-capture kernel waits for jiffies being
> updated early and LAPIC and timer are setup much later, which causes no
> timer interrupts is passed to BSP. as following,
> 
> start_kernel  -->
> 1)-> calibrate_delay()  -> calibrate_delay_converge()  # hang and wait
> for jiffies changed
>
> 2)-> rest_init() -> kernel_init() ->  -> apic_bsp_setup() ->
> setup_local_APIC()
> 
> -> setup_percpu_clockev().
> 
> the setup_percpu_clockev points setup_boot_APIC_clock() which used to
> setup the boot APIC and timer.
> 
> 
> > So it might be a good idea to find the root cause of this bootup fragility 
> > even if 
> > 'notsc' is specified. And I fully agree that it should be fixed in the 
> > bootup path 
> > of the dump kernel, not the crash kernel reboot path.
> 

Can anyone give some advice or commet on the following idea?
Thanks in advance.

> Because the lapic and timer are not ready when dump-capture waits them
> to update the jiffies value. so I suggest to put APIC in legacy mode in
> local_apic_timer_interrupt() temporarily, which in the bootup path of
> dump kernel. 
> 
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index dcb52850a28f..af3be93997ed 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -879,6 +879,7 @@ static void local_apic_timer_interrupt(void)
> pr_warning("Spurious LAPIC timer interrupt on cpu %d\n",
> cpu);
> /* Switch it off */
> lapic_timer_shutdown(evt);
> + disable_IO_APIC();
> 

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Russell King - ARM Linux
On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
> Russell King - ARM Linux  writes:
> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
> >> I'm not an expert on DTB, so I can't provide an example of code
> >> execution, but you have already mentioned the /chosen/linux,stdout-path
> >> property. If an attacker redirects the bootloader to an insecure
> >> console, they may get access to the system that would otherwise be
> >> impossible.
> >
> > I fail to see how kexec connects with the boot loader - the DTB image
> > that's being talked about is one which is passed from the currently
> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
> > also ARM64) that's a direct call chain which doesn't involve any
> > boot loader or firmware, and certainly none that would involve the
> > passed DTB image.
> 
> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
> linux kernel and initramfs with a UI (petitboot) - this means we never
> have to write a device driver twice: write a kernel one and you're done
> (for booting from the device and using it in your OS).

I think you misunderstood my point.

On ARM, we do not go:

kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)

but we go:

kernel (kexec'd from) -> kernel (kexec'd to)

There's no intermediate step involving any bootloader.

Hence, my point is that the dtb loaded by kexec is _only_ used by the
kernel which is being kexec'd to, not by the bootloader, nor indeed
the kernel which it is loaded into.

Moreover, if you read the bit that I quoted (which is what I was
replying to), you'll notice that it is talking about the DTB loaded
by kexec somehow causing the _bootloader_ to be redirected to an
alternative console.  This point is wholely false on ARM.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/2] kexec: remove unnecessary unusable_pages

2016-07-13 Thread zhong jiang
On 2016/7/13 13:07, Eric W. Biederman wrote:
> zhong jiang  writes:
>
>> On 2016/7/12 23:19, Eric W. Biederman wrote:
>>> zhongjiang  writes:
>>>
 From: zhong jiang 

 In general, kexec alloc pages from buddy system, it cannot exceed
 the physical address in the system.

 The patch just remove this unnecessary code, no functional change.
>>> On 32bit systems with highmem support kexec can very easily receive a
>>> page from the buddy allocator that can exceed 4GiB.  This doesn't show
>>> up on 64bit systems as typically the memory limits are less than the
>>> address space.  But this code is very necessary on some systems and
>>> removing it is not ok.
>>>
>>> Nacked-by: "Eric W. Biederman" 
>>>
>>   This viewpoint is as opposed to me,  32bit systems architectural decide it 
>> can not
>>   access exceed 4GiB whether the highmem or not.   but there is one 
>> exception, 
>>   when PAE enable, its physical address should be extended to 36,  new 
>> paging  mechanism
>>   established for it.  therefore, the  page from the buddy allocator
>>   can exceed 4GiB.
> Exactly.  And I was dealing with PAE systems in 2001 or so with > 4GiB
> of RAM.  Which is where the unusable_pages work comes from.
>
> Other architectures such as ARM also followed a similar path, so
> it isn't just x86 that has 32bit systems with > 32 address lines.
>
>>   moreover,  on 32bit systems I can not understand why 
>> KEXEC_SOURCE_MEMORY_LIMIT
>>   is defined to -1UL. therefore, kimge_aloc_page allocate page will always 
>> add to unusable_pages.
> -1UL is a short way of writing 0xUL  Which is as close as you
> can get to writing 0x1UL in 32bits.
>
> kimage_alloc_page won't always add to unusable_pages as there is memory
> below 4GiB but it isn't easily found so there may temporarily be a
> memory shortage, as it allocates it's way there.  Unfortunately whenever
> I have looked there are memory zones that line up with the memory the
> kexec is looking for.  So it does a little bit of a weird dance to get
> the memory it needs and to discard the memory it can't use.
>
> Eric
>
  Thanks , I get it.


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 2/2] kexec: add a pmd huge entry condition during the page table

2016-07-13 Thread zhong jiang
On 2016/7/12 23:46, Eric W. Biederman wrote:
> zhongjiang  writes:
>
>> From: zhong jiang 
>>
>> when image is loaded into kernel, we need set up page table for it. and 
>> all valid pfn also set up new mapping. it will tend to establish a pmd 
>> page table in the form of a large page if pud_present is true. 
>> relocate_kernel 
>> points to code segment can locate in the pmd huge entry in 
>> init_transtion_pgtable. 
>> therefore, we need to take the situation into account.
> I can see how in theory this might be necessary but when is a kernel virtual
> address on x86_64 that is above 0x8000 in conflict with an
> identity mapped physicall address that are all below 0x8000?
>
> If anything the code could be simplified to always assume those mappings
> are unoccupied.
>
> Did you run into an actual failure somewhere?
>
> Eric
>
   I  do not understand what you trying to say,  Maybe I miss your point.
  
  The key is how to ensure that relocate_kernel points to the pmd entry is not 
huge page.
 
  Thanks
  zhongjiang
 
>> Signed-off-by: zhong jiang 
>> ---
>>  arch/x86/kernel/machine_kexec_64.c | 20 ++--
>>  1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/machine_kexec_64.c 
>> b/arch/x86/kernel/machine_kexec_64.c
>> index 5a294e4..c33e344 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -14,6 +14,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -34,6 +35,17 @@ static struct kexec_file_ops *kexec_file_loaders[] = {
>>  };
>>  #endif
>>  
>> +static void split_pmd(pmd_t *pmd, pte_t *pte)
>> +{
>> +unsigned long pfn = pmd_pfn(*pmd);
>> +int i = 0;
>> +
>> +do {
>> +set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
>> +pfn++;
>> +} while (pte++, i++, i < PTRS_PER_PTE);
>> +}
>> +
>>  static void free_transition_pgtable(struct kimage *image)
>>  {
>>  free_page((unsigned long)image->arch.pud);
>> @@ -68,15 +80,19 @@ static int init_transition_pgtable(struct kimage *image, 
>> pgd_t *pgd)
>>  set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
>>  }
>>  pmd = pmd_offset(pud, vaddr);
>> -if (!pmd_present(*pmd)) {
>> +if (!pmd_present(*pmd) || pmd_huge(*pmd)) {
>>  pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
>>  if (!pte)
>>  goto err;
>>  image->arch.pte = pte;
>> -set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
>> +if (pmd_huge(*pmd))
>> +split_pmd(pmd, pte);
>> +else
>> +set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
>>  }
>>  pte = pte_offset_kernel(pmd, vaddr);
>>  set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC));
>> +
>>  return 0;
>>  err:
>>  free_transition_pgtable(image);
> .
>



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec