[PATCH kernel RFC 1/2] powerpc/pseries: Call RTAS directly

2019-07-19 Thread Alexey Kardashevskiy
The pseries guests call RTAS via an RTAS entry point which is a firmware image under powernv and a simple HCALL wrapper under QEMU. For the latter, we can skip the binary image and do the HCALL directly, eliminating the need for the RTAS blob entirely. This checks the DT whether the new method is

[PATCH kernel RFC 0/2] powerpc/pseries: Kexec style boot

2019-07-19 Thread Alexey Kardashevskiy
There is a funny exercise to run a guest under QEMU without the SLOF firmware and boot into a kernel directly to use petitboot as a boot loader (a more powerful boot loader than grub and yum); the patchset is posted as "spapr: Kexec style boot". Since there is no SLOF, i.e. no client interface and

[PATCH kernel RFC 2/2] powerpc/pseries: Kexec style ibm,client-architecture-support support

2019-07-19 Thread Alexey Kardashevskiy
This checks the FDT for "/chosen/qemu,h_cas" and calls H_CAS when present. The H_CAS hcall has been implemented in QEMU for ages and currently returns an FDT with a diff to the initial FDT which SLOF updates and returns to the OS. For this patch to work, QEMU needs to provide the full tree instead, so

Re: [PATCH 2/3] DMA mapping: Move SME handling to x86-specific files

2019-07-19 Thread Thiago Jung Bauermann
kbuild test robot writes: > Hi Thiago, > > Thank you for the patch! Yet something to improve: > > [auto build test ERROR on linus/master] > [cannot apply to v5.2 next-20190718] > [if your patch is applied to the wrong git tree, please drop us a note to > help improve the system] > > url:

Re: [PATCH v10 8/9] kselftest: save-and-restore errno to allow for %m formatting

2019-07-19 Thread Aleksa Sarai
On 2019-07-19, shuah wrote: > On 7/19/19 10:42 AM, Aleksa Sarai wrote: > > Previously, using "%m" in a ksft_* format string can result in strange > > output because the errno value wasn't saved before calling other libc > > functions. The solution is to simply save and restore the errno before >

Re: [PATCH v10 8/9] kselftest: save-and-restore errno to allow for %m formatting

2019-07-19 Thread shuah
On 7/19/19 10:42 AM, Aleksa Sarai wrote: Previously, using "%m" in a ksft_* format string can result in strange output because the errno value wasn't saved before calling other libc functions. The solution is to simply save and restore the errno before we format the user-supplied format string.

[PATCH v10 9/9] selftests: add openat2(2) selftests

2019-07-19 Thread Aleksa Sarai
Test all of the various openat2(2) flags, as well as how file descriptor re-opening works. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. In addition, the memfd selftest is fixed to no longer depend on the

[PATCH v10 8/9] kselftest: save-and-restore errno to allow for %m formatting

2019-07-19 Thread Aleksa Sarai
Previously, using "%m" in a ksft_* format string can result in strange output because the errno value wasn't saved before calling other libc functions. The solution is to simply save and restore the errno before we format the user-supplied format string. Signed-off-by: Aleksa Sarai ---
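The save-and-restore idea described above can be sketched in userspace C. This is only an illustration, not the kselftest code: fmt_keep_errno() is a hypothetical helper name, the "# " prefix is made up for the demonstration, and "%m" is assumed to be available as the glibc printf extension that expands to strerror(errno).

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the actual ksft_* implementation): format a
 * message into buf so that "%m" sees the errno the caller had on entry.
 * Returns the length of the resulting string, or -1 on error.
 */
static int fmt_keep_errno(char *buf, size_t len, const char *fmt, ...)
{
	int saved_errno = errno;	/* save before any other libc call */
	va_list args;
	size_t off;
	int n;

	assert(buf != NULL && len > 0);

	/* Emitting a prefix is a libc call and is allowed to change errno. */
	snprintf(buf, len, "# ");
	off = strlen(buf);
	errno = 0;			/* simulate the clobbering explicitly */

	errno = saved_errno;		/* restore so "%m" expands correctly */
	va_start(args, fmt);
	n = vsnprintf(buf + off, len - off, fmt, args);
	va_end(args);

	errno = saved_errno;		/* leave errno as the caller had it */
	return n < 0 ? -1 : (int)off + n;
}
```

With errno set to ENOENT, a call such as fmt_keep_errno(buf, sizeof(buf), "open: %m") should produce a message containing the ENOENT error string even though errno was modified between entry and formatting.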

[PATCH v10 7/9] open: openat2(2) syscall

2019-07-19 Thread Aleksa Sarai
The most obvious syscall to add support for the new LOOKUP_* scoping flags would be openat(2). However, there are a few reasons why this is not the best course of action: * The new LOOKUP_* flags are intended to be security features, and openat(2) will silently ignore all unknown flags. This

[PATCH v10 6/9] namei: aggressively check for nd->root escape on ".." resolution

2019-07-19 Thread Aleksa Sarai
This patch allows for LOOKUP_BENEATH and LOOKUP_IN_ROOT to safely permit ".." resolution (in the case of LOOKUP_BENEATH the resolution will still fail if ".." resolution would resolve a path outside of the root -- while LOOKUP_IN_ROOT will chroot(2)-style scope it). Magic-link jumps are still

[PATCH v10 5/9] namei: LOOKUP_IN_ROOT: chroot-like path resolution

2019-07-19 Thread Aleksa Sarai
The primary motivation for the need for this flag is container runtimes which have to interact with malicious root filesystems in the host namespaces. One of the first requirements for a container runtime to be secure against a malicious rootfs is that they correctly scope symlinks (that is, they

[PATCH v10 4/9] namei: O_BENEATH-style path resolution flags

2019-07-19 Thread Aleksa Sarai
Add the following flags to allow various restrictions on path resolution (these affect the *entire* resolution, rather than just the final path component -- as is the case with LOOKUP_FOLLOW). The primary justification for these flags is to allow for programs to be far more strict about how they

[PATCH v10 3/9] open: O_EMPTYPATH: procfs-less file descriptor re-opening

2019-07-19 Thread Aleksa Sarai
Userspace has made use of /proc/self/fd very liberally to allow for descriptors to be re-opened. There are a wide variety of uses for this feature, but it has always required constructing a pathname and could not be done without procfs mounted. The obvious solution for this is to extend openat(2)
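For context, the existing procfs-based re-opening that this patch wants to make unnecessary looks roughly like the sketch below. reopen_via_procfs() is a hypothetical helper name used only for illustration; the approach assumes procfs is mounted at /proc, which is exactly the dependency the patch description complains about.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: re-open an already-open descriptor with new flags
 * by constructing its /proc/self/fd/<n> pathname and opening that.
 * Returns a new descriptor, or -1 with errno set (e.g. ENOENT when
 * procfs is not mounted).
 */
static int reopen_via_procfs(int fd, int flags)
{
	char path[64];
	int n;

	assert(fd >= 0);

	n = snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	if (n < 0 || (size_t)n >= sizeof(path)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	/* The kernel follows the magic link to the underlying file. */
	return open(path, flags);
}
```

The caller still owns the original descriptor and has to close both. Note that the pathname has to be built by hand and the call fails outright without procfs.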

[PATCH v10 2/9] procfs: switch magic-link modes to be more sane

2019-07-19 Thread Aleksa Sarai
Now that magic-link modes are obeyed for file re-opening purposes, some of the pre-existing magic-link modes need to be adjusted to be more semantically correct. The most blatant example of this is /proc/self/exe, which had a mode of a+rwx even though tautologically the file could never be opened

[PATCH v10 1/9] namei: obey trailing magic-link DAC permissions

2019-07-19 Thread Aleksa Sarai
The ability for userspace to "re-open" file descriptors through /proc/self/fd has been a very useful tool for all sorts of usecases (container runtimes are one common example). However, the current interface for doing this has resulted in some pretty subtle security holes. Userspace can re-open a

[PATCH v10 0/9] namei: openat2(2) path resolution restrictions

2019-07-19 Thread Aleksa Sarai
This patch is being developed here (with snapshots of each series version being stashed in separate branches with names of the form "resolveat/vX-summary"): Patch changelog: v10: * Ensure that unlazy_walk() will fail if we are in a

Re: [PATCH v2] powerpc: slightly improve cache helpers

2019-07-19 Thread Nathan Chancellor
On Fri, Jul 19, 2019 at 10:23:03AM -0500, Segher Boessenkool wrote: > On Thu, Jul 18, 2019 at 08:24:56PM -0700, Nathan Chancellor wrote: > > On Mon, Jul 08, 2019 at 11:49:52PM -0700, Nathan Chancellor wrote: > > > On Tue, Jul 09, 2019 at 07:04:43AM +0200, Christophe Leroy wrote: > > > > Is that a

Re: [PATCH v3 5/6] fs/core/vmcore: Move sev_active() reference to x86 arch code

2019-07-19 Thread Thiago Jung Bauermann
Hello Lianbo, lijiang writes: > 在 2019年07月19日 01:47, Lendacky, Thomas 写道: >> On 7/17/19 10:28 PM, Thiago Jung Bauermann wrote: >>> Secure Encrypted Virtualization is an x86-specific feature, so it shouldn't >>> appear in generic kernel code because it forces non-x86 architectures to >>>

Re: [PATCH v3 0/6] Remove x86-specific code from generic headers

2019-07-19 Thread Thiago Jung Bauermann
Lendacky, Thomas writes: > On 7/18/19 2:44 PM, Thiago Jung Bauermann wrote: >> >> Lendacky, Thomas writes: >> >>> On 7/17/19 10:28 PM, Thiago Jung Bauermann wrote: Hello, This version is mostly about splitting up patch 2/3 into three separate patches, as suggested by

Re: [PATCH v2] powerpc: slightly improve cache helpers

2019-07-19 Thread Segher Boessenkool
On Thu, Jul 18, 2019 at 08:24:56PM -0700, Nathan Chancellor wrote: > On Mon, Jul 08, 2019 at 11:49:52PM -0700, Nathan Chancellor wrote: > > On Tue, Jul 09, 2019 at 07:04:43AM +0200, Christophe Leroy wrote: > > > Is that a Clang bug ? > > > > No idea, it happens with clang-8 and clang-9 though

Re: [PATCH v3 5/7] kexec_elf: remove elf_addr_to_cpu macro

2019-07-19 Thread Michael Ellerman
Sven Schnelle writes: > Hi Michael, > > On Thu, Jul 11, 2019 at 09:08:51PM +1000, Michael Ellerman wrote: >> Sven Schnelle writes: >> > On Wed, Jul 10, 2019 at 05:09:29PM +0200, Christophe Leroy wrote: >> >> Le 10/07/2019 à 16:29, Sven Schnelle a écrit : >> >> > It had only one definition, so

Re: [PATCH v3 0/6] Remove x86-specific code from generic headers

2019-07-19 Thread Lendacky, Thomas
On 7/18/19 2:44 PM, Thiago Jung Bauermann wrote: > > Lendacky, Thomas writes: > >> On 7/17/19 10:28 PM, Thiago Jung Bauermann wrote: >>> Hello, >>> >>> This version is mostly about splitting up patch 2/3 into three separate >>> patches, as suggested by Christoph Hellwig. Two other changes are a

Re: question on "powerpc/pseries/dma: Allow SWIOTLB"

2019-07-19 Thread Christoph Hellwig
On Fri, Jul 19, 2019 at 06:23:59PM +1000, Alexey Kardashevskiy wrote: > It is getting there and I still do not see why "swiotlb=force" should not > work if chosen in the cmdline. Ok, makes sense. But that means we also have the issue in a few other places..

Re: [PATCH 1/2] arch: mark syscall number 435 reserved for clone3

2019-07-19 Thread Christian Brauner
On Fri, Jul 19, 2019 at 09:13:16PM +1000, Michael Ellerman wrote: > Christian Brauner writes: > > On Fri, Jul 19, 2019 at 08:18:02PM +1000, Michael Ellerman wrote: > >> Christian Brauner writes: > >> > On Mon, Jul 15, 2019 at 03:56:04PM +0200, Christian Borntraeger wrote: > >> >> I think Vasily

Re: Crash in kvmppc_xive_release()

2019-07-19 Thread Cédric Le Goater
On 19/07/2019 13:20, Michael Ellerman wrote: > Cédric Le Goater writes: >> On 18/07/2019 14:49, Michael Ellerman wrote: >>> Anyone else seen this? >>> >>> This is running ~176 VMs on a Power9 (1 per thread), host crashes: >> >> This is beyond the underlying limits of XIVE. >> >> As we allocate

Re: Crash in kvmppc_xive_release()

2019-07-19 Thread Michael Ellerman
Cédric Le Goater writes: > On 18/07/2019 14:49, Michael Ellerman wrote: >> Anyone else seen this? >> >> This is running ~176 VMs on a Power9 (1 per thread), host crashes: > > This is beyond the underlying limits of XIVE. > > As we allocate 2K vCPUs per VM, that is 16K EQs for interrupt events.

Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-19 Thread Arnd Bergmann
On Thu, Jul 18, 2019 at 11:52 AM Christoph Hellwig wrote: > > On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote: > > On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote: > > > > Other than m68k, mips, and arm64, everybody else that doesn't have > > > >

Re: [PATCH 1/2] arch: mark syscall number 435 reserved for clone3

2019-07-19 Thread Michael Ellerman
Christian Brauner writes: > On Fri, Jul 19, 2019 at 08:18:02PM +1000, Michael Ellerman wrote: >> Christian Brauner writes: >> > On Mon, Jul 15, 2019 at 03:56:04PM +0200, Christian Borntraeger wrote: >> >> I think Vasily already has a clone3 patch for s390x with 435. >> > >> > A quick follow-up

Re: [PATCH v9 08/10] open: openat2(2) syscall

2019-07-19 Thread Christian Brauner
On Fri, Jul 19, 2019 at 05:12:18AM +0300, Dmitry V. Levin wrote: > On Thu, Jul 18, 2019 at 11:29:50PM +0200, Arnd Bergmann wrote: > [...] > > 5. you get the same problem with seccomp and strace that > >clone3() has -- these and others only track the register > >arguments by default. > >

Re: [PATCH 1/2] arch: mark syscall number 435 reserved for clone3

2019-07-19 Thread Christian Brauner
On Fri, Jul 19, 2019 at 08:18:02PM +1000, Michael Ellerman wrote: > Christian Brauner writes: > > On Mon, Jul 15, 2019 at 03:56:04PM +0200, Christian Borntraeger wrote: > >> I think Vasily already has a clone3 patch for s390x with 435. > > > > A quick follow-up on this. Helge and Michael have

Re: [PATCH 1/2] arch: mark syscall number 435 reserved for clone3

2019-07-19 Thread Michael Ellerman
Christian Brauner writes: > On Mon, Jul 15, 2019 at 03:56:04PM +0200, Christian Borntraeger wrote: >> I think Vasily already has a clone3 patch for s390x with 435. > > A quick follow-up on this. Helge and Michael have asked whether there > are any tests for clone3. Yes, there will be and I try

Re: [PATCH 2/3] DMA mapping: Move SME handling to x86-specific files

2019-07-19 Thread kbuild test robot
Hi Thiago, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [cannot apply to v5.2 next-20190718] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url:

Re: question on "powerpc/pseries/dma: Allow SWIOTLB"

2019-07-19 Thread Alexey Kardashevskiy
On 19/07/2019 18:05, Christoph Hellwig wrote: On Fri, Jul 19, 2019 at 06:00:25PM +1000, Alexey Kardashevskiy wrote: But shouldn't we force usage of the direct ops in that case as the IOMMU is not needed at all? We do, for mappings, but not unmappings and syncing. Well, I mean as in

Re: [PATCH v5 1/7] kvmppc: HMM backend driver to manage pages of secure guest

2019-07-19 Thread Bharata B Rao
On Thu, Jul 18, 2019 at 11:46:41PM -0700, Christoph Hellwig wrote: > On Thu, Jul 11, 2019 at 10:38:48AM +0530, Bharata B Rao wrote: > > Hmmm... I still find it in upstream, guess it will be removed soon? > > > > I find the below commit in mmotm. > > Please take a look at the latest hmm code in

Re: question on "powerpc/pseries/dma: Allow SWIOTLB"

2019-07-19 Thread Christoph Hellwig
On Fri, Jul 19, 2019 at 06:00:25PM +1000, Alexey Kardashevskiy wrote: > > But shouldn't we force usage of the direct ops in that case as the > > IOMMU is not needed at all? > > We do, for mappings, but not unmappings and syncing. Well, I mean as in literally not setting a dma_ops so that the

Re: question on "powerpc/pseries/dma: Allow SWIOTLB"

2019-07-19 Thread Alexey Kardashevskiy
On 19/07/2019 17:53, Christoph Hellwig wrote: On Fri, Jul 19, 2019 at 05:52:37PM +1000, Alexey Kardashevskiy wrote: On 19/07/2019 17:10, Christoph Hellwig wrote: Hey Alexey, what is the use case for the above commit? Shouldn't we handle all addressing limits using the iommu? Our

Re: question on "powerpc/pseries/dma: Allow SWIOTLB"

2019-07-19 Thread Christoph Hellwig
On Fri, Jul 19, 2019 at 05:52:37PM +1000, Alexey Kardashevskiy wrote: > > > On 19/07/2019 17:10, Christoph Hellwig wrote: > > Hey Alexey, > > > > what is the use case for the above commit? Shouldn't we handle all > > addressing limits using the iommu? > > Our secure VMs is the use case, when

Re: question on "powerpc/pseries/dma: Allow SWIOTLB"

2019-07-19 Thread Alexey Kardashevskiy
On 19/07/2019 17:10, Christoph Hellwig wrote: Hey Alexey, what is the use case for the above commit? Shouldn't we handle all addressing limits using the iommu? Our secure VMs is the use case, when only a fraction of system memory is available for DMA. -- Alexey

Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-19 Thread Shawn Anastasio
On 7/19/19 2:06 AM, Christoph Hellwig wrote: > What is inherently architecture specific here over the fact that > the pgprot_* expand to architecture specific bits? What I meant is that different architectures seem to have different criteria for setting the different pgprot_ bits. i.e. ppc

List etiquette

2019-07-19 Thread Stephen Rothwell
Hi all, Just a short note to mention a couple of things: - please do *not* post html formatted emails. I may start just rejecting them soon. - please cut down the quoting in replies to what is needed for context. Thanks -- Cheers, Stephen Rothwell

Re: [PATCH v4 00/25] Add FADump support on PowerNV platform

2019-07-19 Thread Hari Bathini
Sorry, I missed mentioning that this patchset is based on top of upstream kernel plus the below patches: https://patchwork.ozlabs.org/patch/1123582/ https://patchwork.ozlabs.org/patch/1123583/ On 16/07/19 5:01 PM, Hari Bathini wrote: >

Re: [PATCH v3 5/6] fs/core/vmcore: Move sev_active() reference to x86 arch code

2019-07-19 Thread lijiang
在 2019年07月19日 01:47, Lendacky, Thomas 写道: > On 7/17/19 10:28 PM, Thiago Jung Bauermann wrote: >> Secure Encrypted Virtualization is an x86-specific feature, so it shouldn't >> appear in generic kernel code because it forces non-x86 architectures to >> define the sev_active() function, which

question on "powerpc/pseries/dma: Allow SWIOTLB"

2019-07-19 Thread Christoph Hellwig
Hey Alexey, what is the use case for the above commit? Shouldn't we handle all addressing limits using the iommu?

Re: Crash in kvmppc_xive_release()

2019-07-19 Thread Michael Ellerman
Cédric Le Goater writes: > On 18/07/2019 15:14, Cédric Le Goater wrote: >> On 18/07/2019 14:49, Michael Ellerman wrote: >>> Anyone else seen this? >>> >>> This is running ~176 VMs on a Power9 (1 per thread), host crashes: >> >> This is beyond the underlying limits of XIVE. >> >> As we allocate

Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-19 Thread Christoph Hellwig
On Thu, Jul 18, 2019 at 02:46:00PM -0500, Shawn Anastasio wrote: > Personally, I'm not a huge fan of an implicit default for something > inherently architecture-dependent like this at all. What is inherently architecture specific here over the fact that the pgprot_* expand to architecture

Re: [PATCH v5 1/7] kvmppc: HMM backend driver to manage pages of secure guest

2019-07-19 Thread Christoph Hellwig
On Thu, Jul 11, 2019 at 10:38:48AM +0530, Bharata B Rao wrote: > Hmmm... I still find it in upstream, guess it will be removed soon? > > I find the below commit in mmotm. Please take a look at the latest hmm code in mainline, there have also been other significant changes as well.

Re: [PATCH v3 02/11] s390x/mm: Fail when an altmap is used for arch_add_memory()

2019-07-19 Thread Michal Hocko
On Mon 15-07-19 12:51:27, David Hildenbrand wrote: > On 01.07.19 14:46, Michal Hocko wrote: > > On Mon 01-07-19 09:43:06, Michal Hocko wrote: > >> On Mon 27-05-19 13:11:43, David Hildenbrand wrote: > >>> ZONE_DEVICE is not yet supported, fail if an altmap is passed, so we > >>> don't forget

Re: [PATCH v3 06/11] mm/memory_hotplug: Allow arch_remove_pages() without CONFIG_MEMORY_HOTREMOVE

2019-07-19 Thread Michal Hocko
On Mon 15-07-19 12:54:20, David Hildenbrand wrote: [...] > So I'm leaving it like it is. arch_remove_memory() will be mandatory for > architectures implementing arch_add_memory(). I do agree that removing CONFIG_MEMORY_HOTREMOVE makes some sense. But this patch being a mid step should be simpler

Re: [PATCH v3 10/11] mm/memory_hotplug: Make unregister_memory_block_under_nodes() never fail

2019-07-19 Thread Michal Hocko
On Mon 15-07-19 13:10:33, David Hildenbrand wrote: > On 01.07.19 12:27, Michal Hocko wrote: > > On Mon 01-07-19 11:36:44, Oscar Salvador wrote: > >> On Mon, Jul 01, 2019 at 10:51:44AM +0200, Michal Hocko wrote: > >>> Yeah, we do not allow to offline multi zone (node) ranges so the current > >>>