Re: [BUG RT] dump-capture kernel not executed for panic in interrupt context

2020-09-14 Thread Eric W. Biederman
Adding the kexec list as well. Joerg Vehlow writes: > Hi Eric, >> What is this patch supposed to be doing? >> >> What bug is it fixing? > This information is part in the first message of this mail thread. > The patch was intended for the active discussion in this thread, > not for a broad

Re: KASAN: unknown-crash Read in do_exit

2020-09-14 Thread Eric W. Biederman
syzbot writes: > Hello, > > syzbot found the following issue on: Skimming the code it appears this is a feature not a bug. The stack_not_used code deliberately reads the unused/uninitialized portion of the stack, to see if that part of the stack was used. Perhaps someone wants to make this

Re: KASAN: unknown-crash Read in do_exit

2020-09-14 Thread Eric W. Biederman
Dmitry Vyukov writes: > On Mon, Sep 14, 2020 at 2:15 PM Eric W. Biederman > wrote: >> >> syzbot writes: >> >> > Hello, >> > >> > syzbot found the following issue on: >> >> Skimming the code it appears this is a feature not a b

Re: [PATCH] fork: Use helper function mapping_allow_writable() in dup_mmap()

2020-09-13 Thread Eric W. Biederman
Miaohe Lin writes: > Use helper function mapping_allow_writable() to atomic_inc > i_mmap_writable. Why? > Signed-off-by: Miaohe Lin > --- > kernel/fork.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/fork.c b/kernel/fork.c > index 4b328aecabb2..a0586716e327

Re: [BUG RT] dump-capture kernel not executed for panic in interrupt context

2020-09-11 Thread Eric W. Biederman
Joerg Vehlow writes: > Hi, > > here is the new version of the patch based on Peter's suggestion > It looks like it works fine. I added the BUG_ON to __crash_kexec, because it > is > a precondition, that panic_cpu is set correctly, otherwise the whole locking > logic fails. > > The mutex_trylock

Re: [RFC PATCH 0/3] Add writing support to vmcore for reusing oldmem

2020-09-09 Thread Eric W. Biederman
Kairui Song writes: > Currently vmcore only supports reading, this patch series is an RFC > to add writing support to vmcore. It's x86_64 only yet, I'll add other > architecture later if there is no problem with this idea. > > My purpose of adding writing support is to reuse the crashed kernel's

Re: [PATCH] fs: Eliminate a local variable to make the code more clear

2020-09-09 Thread Eric W. Biederman
Hao Lee writes: > On Tue, Sep 08, 2020 at 07:48:57PM +0100, Al Viro wrote: >> On Tue, Sep 08, 2020 at 01:06:56PM +, Hao Lee wrote: >> > ping >> > >> > On Wed, Jul 29, 2020 at 03:21:28PM +, Hao Lee wrote: >> > > The dentry local variable is introduced in 'commit 84d17192d2afd ("get >> >

Re: possible deadlock in proc_pid_syscall (2)

2020-08-31 Thread Eric W. Biederman
pet...@infradead.org writes: > On Sun, Aug 30, 2020 at 07:31:39AM -0500, Eric W. Biederman wrote: > >> I am thinking that for cases where we want to do significant work it >> might be better to ask the process to pause at someplace safe (probably >> get_signal) and then do

Re: possible deadlock in proc_pid_syscall (2)

2020-08-30 Thread Eric W. Biederman
pet...@infradead.org writes: > On Fri, Aug 28, 2020 at 07:01:17AM -0500, Eric W. Biederman wrote: >> This feels like an issue where perf can just do too much under >> exec_update_mutex. In particular calling kern_path from >> create_local_trace_uprobe. Calling into the

Re: possible deadlock in proc_pid_syscall (2)

2020-08-28 Thread Eric W. Biederman
syzbot writes: > Hello, > > syzbot found the following issue on: > > HEAD commit: 15bc20c6 Merge tag 'tty-5.9-rc3' of git://git.kernel.org/p.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=15349f9690 > kernel config:

Re: RFC: inet_timewait_sock->tw_timer list corruption

2020-08-27 Thread Eric W. Biederman
Wang Long writes: > Hi, > > we encountered a kernel panic as following: > > [4394470.273792] general protection fault: [#1] SMP NOPTI > [4394470.274038] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: GW > - - - 4.18.0-80.el8.x86_64 #1 > [4394470.274477] Hardware name:

Re: [PATCH v2 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-25 Thread Eric W. Biederman
Suren Baghdasaryan writes: > Currently __set_oom_adj loops through all processes in the system to > keep oom_score_adj and oom_score_adj_min in sync between processes > sharing their mm. This is done for any task with more than one mm_users, > which includes processes with multiple threads

Re: [PATCH] MAINTAINERS: add namespace entry

2020-08-25 Thread Eric W. Biederman
y such namespaces that haven't gotten a separate maintainers entry (e.g. > time namespaces). I expect this to grow more entries and/or regular > expressions > over time. For now these entries here are sufficient. I intend to route this > patch upstream soon. > > Cc: "Eric W. Bi

Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-20 Thread Eric W. Biederman
Michal Hocko writes: > On Thu 20-08-20 08:56:53, Suren Baghdasaryan wrote: > [...] >> Catching up on the discussion which was going on while I was asleep... >> So it sounds like there is a consensus that oom_adj should be moved to >> mm_struct rather than trying to synchronize it among tasks

Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-20 Thread Eric W. Biederman
Oleg Nesterov writes: > On 08/20, Oleg Nesterov wrote: >> >> On 08/20, Eric W. Biederman wrote: >> > >> > --- a/fs/exec.c >> > +++ b/fs/exec.c >> > @@ -1139,6 +1139,10 @@ static int exec_mmap(struct mm_struct *mm) >> >vmaca

Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-20 Thread Eric W. Biederman
Tetsuo Handa writes: > On 2020/08/20 23:00, Christian Brauner wrote: >> On Thu, Aug 20, 2020 at 10:48:43PM +0900, Tetsuo Handa wrote: >>> On 2020/08/20 22:34, Christian Brauner wrote: On Thu, Aug 20, 2020 at 03:26:31PM +0200, Michal Hocko wrote: > If you can handle vfork by other means

Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-20 Thread Eric W. Biederman
Oleg Nesterov writes: > On 08/20, Eric W. Biederman wrote: >> >> --- a/fs/exec.c >> +++ b/fs/exec.c >> @@ -1139,6 +1139,10 @@ static int exec_mmap(struct mm_struct *mm) >> vmacache_flush(tsk); >> task_unlock(tsk); >> if (old_mm)

Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-20 Thread Eric W. Biederman
Michal Hocko writes: > On Thu 20-08-20 07:54:44, Eric W. Biederman wrote: >> ebied...@xmission.com (Eric W. Biederman) writes: >> >> > Michal Hocko writes: >> > >> >> On Thu 20-08-20 07:34:41, Eric W. Biederman wrote: >> >>> Suren Ba

Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-20 Thread Eric W. Biederman
ebied...@xmission.com (Eric W. Biederman) writes: > Michal Hocko writes: > >> On Thu 20-08-20 07:34:41, Eric W. Biederman wrote: >>> Suren Baghdasaryan writes: >>> >>> > Currently __set_oom_adj loops through all processes in the system to >>>

Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-20 Thread Eric W. Biederman
Michal Hocko writes: > On Thu 20-08-20 07:34:41, Eric W. Biederman wrote: >> Suren Baghdasaryan writes: >> >> > Currently __set_oom_adj loops through all processes in the system to >> > keep oom_score_adj and oom_score_adj_min in sync between processes >

Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

2020-08-20 Thread Eric W. Biederman
Suren Baghdasaryan writes: > Currently __set_oom_adj loops through all processes in the system to > keep oom_score_adj and oom_score_adj_min in sync between processes > sharing their mm. This is done for any task with more than one mm_users, > which includes processes with multiple threads

Re: [PATCH 00/11] Introduce kernel_clone(), kill _do_fork()

2020-08-19 Thread Eric W. Biederman
Christian Brauner writes: > On Wed, Aug 19, 2020 at 08:32:59AM -0500, Eric W. Biederman wrote: >> Matthew Wilcox writes: >> >> > On Wed, Aug 19, 2020 at 10:45:56AM +0200, Christian Brauner wrote: >> >> On Wed, Aug 19, 2020 at 09:43:40AM +0200, pet...@infrade

Re: [PATCH 00/11] Introduce kernel_clone(), kill _do_fork()

2020-08-19 Thread Eric W. Biederman
Matthew Wilcox writes: > On Wed, Aug 19, 2020 at 10:45:56AM +0200, Christian Brauner wrote: >> On Wed, Aug 19, 2020 at 09:43:40AM +0200, pet...@infradead.org wrote: >> > On Tue, Aug 18, 2020 at 06:44:47PM +0100, Matthew Wilcox wrote: >> > > On Tue, Aug 18, 2020 at 07:34:00PM +0200, Christian

Re: [PATCH 11/17] bpf/task_iter: In task_file_seq_get_next use fnext_task

2020-08-18 Thread Eric W. Biederman
> [If your patch is applied to the wrong git tree, kindly drop us a note. > And when submitting patch, we suggest to use '--base' as documented in > https://git-scm.com/docs/git-format-patch] > > url: > https://github.com/0day-ci/linux/commits/Eric-W-Biederman/exec-Move-unshare_files-to

Re: [PATCH 17/17] file: Rename __close_fd to close_fd and remove the files parameter

2020-08-18 Thread Eric W. Biederman
Christoph Hellwig writes: > Please kill off ksys_close as well while you're at it. Good point. ksys_close is just a trivial wrapper around close_fd. So the one caller of ksys_close autofs_dev_ioctl_closemount can be trivially changed to call close_fd. Eric

[PATCH 17/17] file: Rename __close_fd to close_fd and remove the files parameter

2020-08-17 Thread Eric W. Biederman
ything except current->files is passed, by limiting the callers to only operation on current->files. [1] 483ce1d4b8c3 ("take descriptor-related part of close() to file.c") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Signed-off-by: "

[PATCH 13/17] file: Remove get_files_struct

2020-08-17 Thread Eric W. Biederman
and fget_light remove get_files_struct so that it does not gain any new users. [1] https://lkml.kernel.org/r/20180915160423.ga31...@redhat.com Suggested-by: Oleg Nesterov Signed-off-by: "Eric W. Biederman" --- fs/file.c | 13 - include/linux/fdtable.h | 1

[PATCH 14/17] file: Merge __fd_install into fd_install

2020-08-17 Thread Eric W. Biederman
t->files", merge them together by transforming the files parameter into a local variable initialized to current->files. [1] f869e8a7f753 ("expose a low-level variant of fd_install() for binder") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Sign

[PATCH 12/17] proc/fd: In fdinfo seq_show don't use get_files_struct

2020-08-17 Thread Eric W. Biederman
how. The task_lock was already taken in get_files_struct, and so skipping get_files_struct performs less work overall, and avoids the problems with the files_struct reference count. [1] https://lkml.kernel.org/r/20180915160423.ga31...@redhat.com Suggested-by: Oleg Nesterov Signed-off-by: "Eric W. B

[PATCH 16/17] file: Merge __alloc_fd into alloc_fd

2020-08-17 Thread Eric W. Biederman
les", merge them together by transforming the files parameter into a local variable initialized to current->files. [1] dcfadfa4ec5a ("new helper: __alloc_fd()") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Signed-off-by: "Eric W. B

[PATCH 11/17] bpf/task_iter: In task_file_seq_get_next use fnext_task

2020-08-17 Thread Eric W. Biederman
-by: Oleg Nesterov Signed-off-by: "Eric W. Biederman" --- kernel/bpf/task_iter.c | 43 ++ 1 file changed, 10 insertions(+), 33 deletions(-) diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index 232df29793e9..831d42d7543a 100644 --- a/

[PATCH 15/17] file: In f_dupfd read RLIMIT_NOFILE once.

2020-08-17 Thread Eric W. Biederman
. Further this causes alloc_fd to take all of the same arguments as __alloc_fd except for the files_struct argument. Signed-off-by: "Eric W. Biederman" --- fs/file.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/file.c b/fs/file.c index 1a755811669d..50

[PATCH 09/17] file: Implement fnext_task

2020-08-17 Thread Eric W. Biederman
through safely, without needing to increment the count on files_struct. Signed-off-by: "Eric W. Biederman" --- fs/file.c | 21 + include/linux/fdtable.h | 1 + 2 files changed, 22 insertions(+) diff --git a/fs/file.c b/fs/file.c index 8d4b385055e9..88

[PATCH 10/17] proc/fd: In proc_readfd_common use fnext_task

2020-08-17 Thread Eric W. Biederman
descriptor into the generic code, and by removing the need for capturing and releasing a reference on files_struct. [1] https://lkml.kernel.org/r/20180915160423.ga31...@redhat.com Suggested-by: Oleg Nesterov Signed-off-by: Eric W. Biederman --- fs/proc/fd.c | 17 + 1 file changed, 5

[PATCH 08/17] proc/fd: In proc_fd_link use fcheck_task

2020-08-17 Thread Eric W. Biederman
locking, and reference counting. [1] https://lkml.kernel.org/r/20180915160423.ga31...@redhat.com Suggested-by: Oleg Nesterov Signed-off-by: "Eric W. Biederman" --- fs/proc/fd.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/fs/proc/fd.c b/fs/proc/

[PATCH 05/17] bpf: In bpf_task_fd_query use fget_task

2020-08-17 Thread Eric W. Biederman
...@redhat.com Suggested-by: Oleg Nesterov Signed-off-by: "Eric W. Biederman" --- kernel/bpf/syscall.c | 20 +++- 1 file changed, 3 insertions(+), 17 deletions(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 86299a292214..93657d5f6538 100644 --- a/

[PATCH 07/17] proc/fd: In tid_fd_mode use fcheck_task

2020-08-17 Thread Eric W. Biederman
] https://lkml.kernel.org/r/20180915160423.ga31...@redhat.com Suggested-by: Oleg Nesterov Signed-off-by: "Eric W. Biederman" --- fs/proc/fd.c | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/fs/proc/fd.c b/fs/proc/fd.c index 81882a13212d..4048a87c51ee 100644 --- a/fs

[PATCH 06/17] file: Implement fcheck_task

2020-08-17 Thread Eric W. Biederman
As a companion to fget_task implement fcheck_task for use for querying a process about a specific file. Signed-off-by: "Eric W. Biederman" --- fs/file.c | 13 + include/linux/fdtable.h | 1 + 2 files changed, 14 insertions(+) diff --git a/fs/file.c b/fs/fi

[PATCH 03/17] exec: Remove reset_files_struct

2020-08-17 Thread Eric W. Biederman
Now that exec no longer needs to restore the previous value of current->files on error there are no more callers of reset_files_struct so remove it. Signed-off-by: "Eric W. Biederman" --- fs/file.c | 12 include/linux/fdtable.h | 1 - 2 files changed,

[PATCH 01/17] exec: Move unshare_files to fix posix file locking during exec

2020-08-17 Thread Eric W. Biederman
23.21964-1-jlay...@kernel.org [16] https://lkml.kernel.org/r/20180914105310.6454-1-jlay...@kernel.org [17] https://lkml.kernel.org/r/87a7ohs5ow@xmission.com [18] https://lkml.kernel.org/r/87pn8c1uj6.fsf...@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" --- fs/exec.c | 29 +++

[PATCH 04/17] kcmp: In kcmp_epoll_target use fget_task

2020-08-17 Thread Eric W. Biederman
in fget_light having to fallback to fget reducing performance. Suggested-by: Oleg Nesterov Signed-off-by: "Eric W. Biederman" --- kernel/kcmp.c | 20 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/kernel/kcmp.c b/kernel/kcmp.c index b3ff9288c6cc..87

[PATCH 02/17] exec: Simplify unshare_files

2020-08-17 Thread Eric W. Biederman
Now that exec no longer needs to return the unshared files to their previous value there is no reason to return displaced. Instead when unshare_fd creates a copy of the file table, call put_files_struct before returning from unshare_files. Signed-off-by: "Eric W. Biederman" --- fs/

exec: Move unshare_files and guarantee files_struct.count is correct

2020-08-17 Thread Eric W. Biederman
++ include/linux/syscalls.h | 6 +-- kernel/bpf/syscall.c | 20 ++--- kernel/bpf/task_iter.c | 43 +- kernel/fork.c| 12 +++--- kernel/kcmp.c| 20 ++--- 11 files changed, 109 insertions(+), 195 deletions(-) Eric W. Biederman (17): exec

Re: [PATCH] Makefile: Yes. Finally remove '-Wdeclaration-after-statement'

2020-08-17 Thread Eric W. Biederman
Pavel Machek writes: > Hi! > >> > This is not just a matter of style; this is a matter of semantics, >> > especially with regard to: >> > >> > * const Correctness. >> > A const-declared variable must be initialized when defined. >> > >> > * Conditional Compilation. >> > When there

[RFC][PATCH] seccomp: Fail immediately if any thread is performing an exec

2020-08-17 Thread Eric W. Biederman
es not change during or after the calculation of the new credentials during exec. With seccomp_can_sync_threads updated to test in_execve and fail immediately taking cred_guard_mutex is no longer necessary. Signed-off-by: "Eric W. Biederman" --- I think in general this is the right thing to

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-17 Thread Eric W. Biederman
Christian Brauner writes: > On Mon, Aug 17, 2020 at 10:48:01AM -0500, Eric W. Biederman wrote: >> >> Creating names in the kernel for namespaces is very difficult and >> problematic. I have not seen anything that looks like all of the >> problems have been solv

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-17 Thread Eric W. Biederman
Creating names in the kernel for namespaces is very difficult and problematic. I have not seen anything that looks like all of the problems have been solved with restoring these new names. When your filter for your list of namespaces is user namespace creating a new directory in proc is

Re: [PATCH RFC 2/2] lkdtm: Add heap spraying test

2020-08-17 Thread Eric W. Biederman
Alexander Popov writes: > Add a simple test for CONFIG_SLAB_QUARANTINE. > > It performs heap spraying that aims to reallocate the recently freed heap > object. This technique is used for exploiting use-after-free > vulnerabilities in the kernel code. > > This test shows that

Re: [PATCH] proc: Avoid a thundering herd of threads freeing proc dentries

2020-08-17 Thread Eric W. Biederman
wi...@casper.infradead.org writes: > On Mon, Jun 22, 2020 at 10:20:40AM -0500, Eric W. Biederman wrote: >> Junxiao Bi writes: >> > On 6/20/20 9:27 AM, Matthew Wilcox wrote: >> >> On Fri, Jun 19, 2020 at 05:42:45PM -0500, Eric W. Biederman wrote: >> >>>

Re: [PATCH 1/2] kexec: Add quick kexec support for kernel

2020-08-14 Thread Eric W. Biederman
Sang Yan writes: > In normal kexec, relocating kernel may cost 5 ~ 10 seconds, to > copy all segments from vmalloced memory to kernel boot memory, > because of disabled mmu. I haven't seen kexec that slow since I tested on my 16MHz 386. That machine has an excuse: it really is slow. Anything

Re: [PATCH v7 5/7] fs,doc: Enable to enforce noexec mounts or file exec through O_MAYEXEC

2020-08-11 Thread Eric W. Biederman
Mickaël Salaün writes: > Allow for the enforcement of the O_MAYEXEC openat2(2) flag. Thanks to > the noexec option from the underlying VFS mount, or to the file execute > permission, userspace can enforce these execution policies. This may > allow script interpreters to check execution

Re: [PATCH v7 4/7] fs: Introduce O_MAYEXEC flag for openat2(2)

2020-08-11 Thread Eric W. Biederman
Mickaël Salaün writes: > When the O_MAYEXEC flag is passed, openat2(2) may be subject to > additional restrictions depending on a security policy managed by the > kernel through a sysctl or implemented by an LSM thanks to the > inode_permission hook. This new flag is ignored by open(2) and >

Re: [PATCH v7 3/7] exec: Move path_noexec() check earlier

2020-08-11 Thread Eric W. Biederman
Mickaël Salaün writes: > From: Kees Cook > > The path_noexec() check, like the regular file check, was happening too > late, letting LSMs see impossible execve()s. Check it earlier as well > in may_open() and collect the redundant fs/exec.c path_noexec() test > under the same robustness comment

Re: [PATCH v7 2/7] exec: Move S_ISREG() check earlier

2020-08-11 Thread Eric W. Biederman
Mickaël Salaün writes: > From: Kees Cook > > The execve(2)/uselib(2) syscalls have always rejected non-regular > files. Recently, it was noticed that a deadlock was introduced when trying > to execute pipes, as the S_ISREG() test was happening too late. This was > fixed in commit 73601ea5b7b1

Re: [PATCH v7 1/7] exec: Change uselib(2) IS_SREG() failure to EACCES

2020-08-11 Thread Eric W. Biederman
ebied...@xmission.com (Eric W. Biederman) writes: > Mickaël Salaün writes: > >> From: Kees Cook >> >> Change uselib(2)'s S_ISREG() error return to EACCES instead of EINVAL so >> the behavior matches execve(2), and the seemingly documented value. >> The "n

Re: [PATCH v7 1/7] exec: Change uselib(2) IS_SREG() failure to EACCES

2020-08-11 Thread Eric W. Biederman
Mickaël Salaün writes: > From: Kees Cook > > Change uselib(2)'s S_ISREG() error return to EACCES instead of EINVAL so > the behavior matches execve(2), and the seemingly documented value. > The "not a regular file" failure mode of execve(2) is explicitly > documented[1], but it is not mentioned

Re: [PATCH 0/8] namespaces: Introduce generic refcount

2020-08-04 Thread Eric W. Biederman
Christian Brauner writes: > On Mon, Aug 03, 2020 at 01:16:10PM +0300, Kirill Tkhai wrote: >> Every namespace type has its own counter. Some of them are >> of refcount_t, some of them are of kref. >> >> This patchset introduces generic ns_common::count for any >> type of namespaces instead of

Re: [PATCH 1/8] ns: Add common refcount into ns_common add use it as counter for net_ns

2020-08-04 Thread Eric W. Biederman
Kirill Tkhai writes: > On 04.08.2020 15:21, Eric W. Biederman wrote: >> Kirill Tkhai writes: >> >>> Currently, every type of namespaces has its own counter, >>> which is stored in ns-specific part. Say, @net has >>> struct net::count

Re: [PATCH 0/8] namespaces: Introduce generic refcount

2020-08-04 Thread Eric W. Biederman
Christian Brauner writes: > On Tue, Aug 04, 2020 at 07:11:59AM -0500, Eric W. Biederman wrote: >> Christian Brauner writes: >> >> > On Mon, Aug 03, 2020 at 01:16:10PM +0300, Kirill Tkhai wrote: >> >> Every namespace type has its own counter. Some of them a

Re: [PATCH 1/8] ns: Add common refcount into ns_common add use it as counter for net_ns

2020-08-04 Thread Eric W. Biederman
, and converts net namespace to use it first. And the other refcounts on struct net? How do they play into what you are trying to do? For the lack of an explanation. Nacked-by: "Eric W. Biederman" > Signed-off-by: Kirill Tkhai > Acked-by: Christian Brauner > --- > include/lin

[GIT PULL] exec cleanups for v5.9-rc1

2020-08-03 Thread Eric W. Biederman
released I am hoping to quickly rebase and get a lot of changes posted, reviewed and merged. I have a lot of additional fixes and cleanups that just need a little more attention before they are ready to merge. Eric W. Biederman (25): umh: Capture the pid in umh_pipe_setup umh: Mo

Re: [RFC PATCH 0/5] madvise MADV_DOEXEC

2020-08-03 Thread Eric W. Biederman
Steven Sistare writes: > On 7/30/2020 5:58 PM, ebied...@xmission.com wrote: >> Here is another suggestion. >> >> Have a very simple program that does: >> >> for (;;) { >> handle = dlopen("/my/real/program"); >> real_main = dlsym(handle, "main"); >>

Re: [RFC][PATCH] exec: Conceal the other threads from wakeups during exec

2020-07-31 Thread Eric W. Biederman
Linus Torvalds writes: > On Fri, Jul 31, 2020 at 10:19 AM Eric W. Biederman > wrote: >> >> Even limited to opt-in locations I think the trick of being able to >> transform the wait-state may solve that composition problem. > > So the part I found intriguing was t

Re: [RFC][PATCH] exec: Conceal the other threads from wakeups during exec

2020-07-31 Thread Eric W. Biederman
Linus Torvalds writes: > On Thu, Jul 30, 2020 at 4:00 PM Eric W. Biederman > wrote: >> >> The key is the function make_task_wakekill which could probably >> benefit from a little more review and refinement but appears to >> be basically correct. > > You r

Re: [RFC][PATCH] exec: Conceal the other threads from wakeups during exec

2020-07-31 Thread Eric W. Biederman
Oleg Nesterov writes: > Eric, I won't comment the intent, but I too do not understand this idea. > > On 07/30, Eric W. Biederman wrote: >> >> [This change requires more work to handle TASK_STOPPED and TASK_TRACED] > > Yes. And it is not clear to me how can you s

[RFC][PATCH] exec: Conceal the other threads from wakeups during exec

2020-07-30 Thread Eric W. Biederman
up to all of the other threads and clear group_execing_task. This may cause a spurious wake up but that is an uncommon case and the code for TASK_UNINTERRUPTIBLE and TASK_INTERRUPTIBLE is expected to handle spurious wake ups, so it should be fine. Signed-off-by: "Eric W. Biederman" --- fs/exec.c

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-07-30 Thread Eric W. Biederman
Kirill Tkhai writes: > On 30.07.2020 17:34, Eric W. Biederman wrote: >> Kirill Tkhai writes: >> >>> Currently, there is no way to list or iterate all or subset of namespaces >>> in the system. Some namespaces are exposed in /proc/[pid]/ns/ directories, >>

Re: [RFC PATCH 0/5] madvise MADV_DOEXEC

2020-07-30 Thread Eric W. Biederman
Steven Sistare writes: > On 7/30/2020 1:49 PM, Matthew Wilcox wrote: >> On Thu, Jul 30, 2020 at 01:35:51PM -0400, Steven Sistare wrote: >>> mshare + VA reservation is another possible solution. >>> >>> Or MADV_DOEXEC alone, which is ready now. I hope we can get back to >>> reviewing that. >>

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-07-30 Thread Eric W. Biederman
Kirill Tkhai writes: > Currently, there is no way to list or iterate all or subset of namespaces > in the system. Some namespaces are exposed in /proc/[pid]/ns/ directories, > but some also may be as open files, which are not attached to a process. > When a namespace open fd is sent over unix

Re: [RFC][PATCH] exec: Freeze the other threads during a multi-threaded exec

2020-07-30 Thread Eric W. Biederman
Linus Torvalds writes: > On Tue, Jul 28, 2020 at 6:23 AM Eric W. Biederman > wrote: >> >> For exec all I care about are user space threads. So it appears the >> freezer infrastructure adds very little. > > Yeah. 99% of the freezer stuff is for just adding th

Re: [RFC PATCH 3/5] mm: introduce VM_EXEC_KEEP

2020-07-28 Thread Eric W. Biederman
Anthony Yznaga writes: > A vma with the VM_EXEC_KEEP flag is preserved across exec. For anonymous > vmas only. For safety, overlap with fixed address VMAs created in the new > mm during exec (e.g. the stack and elf load segments) is not permitted and > will cause the exec to fail. > (We are

Re: [RFC][PATCH] exec: Freeze the other threads during a multi-threaded exec

2020-07-28 Thread Eric W. Biederman
ebied...@xmission.com (Eric W. Biederman) writes: > Linus Torvalds writes: > >> It also makes for a possible _huge_ latency regression for execve(), >> since freezing really has never been a very low-latency operation. >> >> Other threads doing IO can now basic

Re: [RFC][PATCH] exec: Freeze the other threads during a multi-threaded exec

2020-07-28 Thread Eric W. Biederman
Linus Torvalds writes: > On Mon, Jul 27, 2020 at 2:06 PM Eric W. Biederman > wrote: >> >> Therefore make it simpler to get exec correct by freezing the other >> threads at the beginning of exec. This removes an entire class of >> races, and makes it tractable to

Re: [RFC][PATCH] exec: Freeze the other threads during a multi-threaded exec

2020-07-28 Thread Eric W. Biederman
Aleksa Sarai writes: > On 2020-07-27, Eric W. Biederman wrote: >> To the best of my knowledge processes with more than one thread >> calling exec are not common, and as all of the threads will be killed >> by exec there does not appear to be any useful work a thread can

[RFC][PATCH] exec: Freeze the other threads during a multi-threaded exec

2020-07-27 Thread Eric W. Biederman
as the maximum number of tasks is PID_MAX_LIMIT which has an upper bound of 4 * 1024 * 1024. Signed-off-by: "Eric W. Biederman" --- fs/exec.c| 99 +++- include/linux/sched/signal.h | 10 kernel/fork.c| 3

Re: [RFC PATCH 0/5] madvise MADV_DOEXEC

2020-07-27 Thread Eric W. Biederman
Anthony Yznaga writes: > This patchset adds support for preserving an anonymous memory range across > exec(3) using a new madvise MADV_DOEXEC argument. The primary benefit for > sharing memory in this manner, as opposed to re-attaching to a named shared > memory segment, is to ensure it is

Re: [PATCH v1 2/2] Show /proc/self/net only for CAP_NET_ADMIN

2020-07-27 Thread Eric W. Biederman
Alexey Gladkov writes: > Show /proc/self/net only for CAP_NET_ADMIN if procfs is mounted with > subset=pid option in user namespace. This is done to avoid possible > information leakage. > > Signed-off-by: Alexey Gladkov > --- > fs/proc/proc_net.c | 6 ++ > 1 file changed, 6 insertions(+)

Re: [PATCH] fs/nsfs.c: fix ioctl support of compat processes

2020-07-24 Thread Eric W. Biederman
Michael, As the original author of NS_GET_OWNER_UID can you take a look at this? "Dmitry V. Levin" writes: > On Fri, Jul 24, 2020 at 11:20:26AM +0200, Arnd Bergmann wrote: >> On Fri, Jul 24, 2020 at 2:12 AM Dmitry V. Levin wrote: >> > >> > According to Documentation/driver-api/ioctl.rst, in

Re: [PATCH] kernel: add a kernel_wait helper

2020-07-21 Thread Eric W. Biederman
eads. Acked-by: "Eric W. Biederman" > Signed-off-by: Christoph Hellwig > --- > include/linux/sched/task.h | 1 + > kernel/exit.c | 16 > kernel/umh.c | 29 - > 3 files changed, 21 insertions(+), 25 del

Re: [RFC PATCH 0/5] keys: Security changes, ACLs and Container keyring

2020-07-19 Thread Eric W. Biederman
David Howells writes: > Here are some patches to provide some security changes and some container > support: Nacked-by: "Eric W. Biederman" There remain unfixed security issues in the new mount api. Those need to get fixed before it is even worth anyone's time reviewing

Re: [PATCH 7/7] exec: Implement kernel_execve

2020-07-15 Thread Eric W. Biederman
Christoph Hellwig writes: >> +static int count_strings_kernel(const char *const *argv) >> +{ >> +int i; >> + >> +if (!argv) >> +return 0; >> + >> +for (i = 0; argv[i]; ++i) { >> +if (i >= MAX_ARG_STRINGS) >> +return -E2BIG; >> +

Re: [PATCH 6/6] exec: use force_uaccess_begin during exec and exit

2020-07-14 Thread Eric W. Biederman
Christoph Hellwig writes: > Both exec and exit want to ensure that the uaccess routines actually do > access user pointers. Use the newly added force_uaccess_begin helper > instead of an open coded set_fs for that to prepare for kernel builds > where set_fs() does not exist. Acked

Re: [RFC][PATCHES] converting FDPIC coredumps to regsets

2020-07-14 Thread Eric W. Biederman
The fact that the elf_fdpic code continues to use the non-regset names for the functions it calls, and does not synchronize its structure with the ordinary elf core dumping code may be sensible but it is extremely confusing to follow. As a follow up it would probably be good to synchronize the elf and elf_fdpic coredumping code as much as possible, just to simplify future maintenance. So for as much as I could understand and verify. Acked-by: "Eric W. Biederman" Eric

[PATCH 7/7] exec: Implement kernel_execve

2020-07-14 Thread Eric W. Biederman
for do_execve and verify it is not used. Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-...@lst.de Signed-off-by: "Eric W. Biederman" --- arch/x86/entry/entry_32.S | 2 +- arch/x86/entry/entry_64.S | 2 +- arch/x86/kernel/unwind_frame.c | 2 +-

[PATCH 6/7] exec: Factor bprm_stack_limits out of prepare_arg_pages

2020-07-14 Thread Eric W. Biederman
in kernel_execve. Then remove prepare_arg_pages and compute bprm->argc and bprm->envc directly in do_execveat_common, before bprm_stack_limits is called. Signed-off-by: "Eric W. Biederman" --- fs/exec.c | 23 --- 1 file changed, 12 insertions(+), 11 deletion

[PATCH 5/7] exec: Factor bprm_execve out of do_execve_common

2020-07-14 Thread Eric W. Biederman
ernel_execve that performs the copying differently, this separation of bprm_execve from do_execve_common makes for a nice separation of responsibilities making the exec code easier to navigate. Signed-off-by: "Eric W. Biederman" --- fs/exec.c | 108 +---

[PATCH 4/7] exec: Move bprm_mm_init into alloc_bprm

2020-07-14 Thread Eric W. Biederman
cleanup into the successful return path. This is safe because being on the successful return path implies that begin_new_exec succeeded and set bprm->mm to NULL. As bprm->mm is NULL, the bprm cleanup I am moving into free_bprm will do nothing. Signed-off-by: "Eric W. Biederman" --

[PATCH 3/7] exec: Move initialization of bprm->filename into alloc_bprm

2020-07-14 Thread Eric W. Biederman
y tied to the other variables in struct linux_binprm, and as such is needed to allow the call to alloc_bprm to move. Signed-off-by: "Eric W. Biederman" --- fs/exec.c | 61 ++--- include/linux/binfmts.h | 1 + 2 files changed, 34 insertions(

[PATCH 1/7] exec: Remove unnecessary spaces from binfmts.h

2020-07-14 Thread Eric W. Biederman
The general convention in the Linux kernel is to define a pointer member as "type *name". The declaration of struct linux_binprm has several pointers defined as "type * name". Update them to the form of "type *name" for consistency. Suggested-by: Kees Cook Sig

[PATCH 2/7] exec: Factor out alloc_bprm

2020-07-14 Thread Eric W. Biederman
binprm nor the unsharing depend upon each other so swapping the order in which they are called is trivially safe. To keep things consistent the order of cleanup at the end of do_execve_common is swapped to match the order of initialization. Signed-off-by: "Eric W. Biederman" --- fs/e

[PATCH 0/7] Implementing kernel_execve

2020-07-14 Thread Eric W. Biederman
-next Eric W. Biederman (7): exec: Remove unnecessary spaces from binfmts.h exec: Factor out alloc_bprm exec: Move initialization of bprm->filename into alloc_bprm exec: Move bprm_mm_init into alloc_bprm exec: Factor bprm_execve out of do_execve_common exec: Fac

Re: [PATCH 0/5] RFC: connector: Add network namespace awareness

2020-07-13 Thread Eric W. Biederman
Matt Bennett writes: > On Thu, 2020-07-02 at 13:59 -0500, Eric W. Biederman wrote: >> Matt Bennett writes: >> >> > Previously the connector functionality could only be used by processes >> > running in the >> > default network namesp

Re: [PATCH 0/5] RFC: connector: Add network namespace awareness

2020-07-13 Thread Eric W. Biederman
Matt Bennett writes: > On Thu, 2020-07-02 at 21:10 +0200, Christian Brauner wrote: >> On Thu, Jul 02, 2020 at 08:17:38AM -0500, Eric W. Biederman wrote: >> > Matt Bennett writes: >> > >> > > Previously the connector functionality could on

Re: Linux kernel in-tree Rust support

2020-07-13 Thread Eric W. Biederman
Nick Desaulniers writes: > Hello folks, > I'm working on putting together an LLVM "Micro Conference" for the > upcoming Linux Plumbers Conf > (https://www.linuxplumbersconf.org/event/7/page/47-attend). It's not > solidified yet, but I would really like to run a session on support > for Rust "in

Re: [PATCH] [RFC] kernfs: Allow vm_ops->close() if VMA is never split

2020-07-13 Thread Eric W. Biederman
Richard Weinberger writes: > 10 years ago commit a6849fa1f7d7 ("sysfs: Fail bin file mmap if vma close is > implemented.") > removed support for vm_ops->close() for mmap on sysfs. > As far as I understand the reason is that due to the wrapping in kernfs > every VMA split operation needs to be

[merged][PATCH v3 00/16] Make the user mode driver code a better citizen

2020-07-09 Thread Eric W. Biederman
ission.com Reviewed-by: Greg Kroah-Hartman +Acked-by: Alexei Starovoitov +Tested-by: Alexei Starovoitov Signed-off-by: "Eric W. Biederman" ## include/linux/umh.h ## 2: 2d97bc5269dd ! 2: b044fa2ae50d umh: Move setting PF_UMH into umh_pipe_setup

Re: [PATCH v3 10/16] exec: Remove do_execve_file

2020-07-08 Thread Eric W. Biederman
Luis Chamberlain writes: > On Wed, Jul 08, 2020 at 06:35:25AM +, Luis Chamberlain wrote: >> On Thu, Jul 02, 2020 at 11:41:34AM -0500, Eric W. Biederman wrote: >> > Now that the last caller has been removed, remove this code from exec. >> > >> >

Re: [PATCH v2 00/15] Make the user mode driver code a better citizen

2020-07-07 Thread Eric W. Biederman
Just to make certain I understand what is going on I instrumented a kernel with some print statements. a) The workqueues and timers start before populate_rootfs. b) populate_rootfs does indeed happen long before the bpfilter module is initialized. c) What prevents populate_rootfs and the

Re: [PATCH v3 13/16] exit: Factor thread_group_exited out of pidfd_poll

2020-07-07 Thread Eric W. Biederman
Daniel Borkmann writes: > Hey Eric, are you planning to push the final version into a topic branch > so it can be pulled into bpf-next as discussed earlier? Yes. I just about have it ready. I am taking one last pass through the review comments to make certain I have not missed anything before

Re: [PATCH v3 13/16] exit: Factor thread_group_exited out of pidfd_poll

2020-07-07 Thread Eric W. Biederman
Christian Brauner writes: > On Fri, Jul 03, 2020 at 04:37:47PM -0500, Eric W. Biederman wrote: >> Alexei Starovoitov writes: >> >> > On Thu, Jul 02, 2020 at 11:41:37AM -0500, Eric W. Biederman wrote: >> >> Create an independent helper thread_group_exit
