Re: [PATCH v2] uml: fix W=1 missing-include-dirs warnings

2021-04-16 Thread Anton Ivanov

On 15/04/2021 18:13, Randy Dunlap wrote:

Currently when using "W=1" with UML builds, there are over 700 warnings
like so:

   CC  arch/um/drivers/stderr_console.o
cc1: warning: ./arch/um/include/uapi: No such file or directory [-Wmissing-include-dirs]

but arch/um/ does not have include/uapi/ at all, so add that
subdir and put one Kbuild file into it (since git does not track
empty subdirs).

Signed-off-by: Randy Dunlap 
Cc: Masahiro Yamada 
Cc: Michal Marek 
Cc: linux-kbu...@vger.kernel.org
Cc: Jeff Dike 
Cc: Richard Weinberger 
Cc: Anton Ivanov 
Cc: linux...@lists.infradead.org
---
v2: use Option 4 from v1: add arch/um/include/uapi so that 'make' is
 placated -- and just like all other arch's have.

  arch/um/include/uapi/asm/Kbuild |1 +
  1 file changed, 1 insertion(+)

diff --git a/arch/um/include/uapi/asm/Kbuild b/arch/um/include/uapi/asm/Kbuild
new file mode 100644
index ..f66554cd5c45
--- /dev/null
+++ b/arch/um/include/uapi/asm/Kbuild
@@ -0,0 +1 @@
+# SPDX-License-Identifier: GPL-2.0

___
linux-um mailing list
linux...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


+1

I will forward it to openwrt-dev. Their build process adds uapi to uml,
so if we are going to change this, it would be nice to give them a heads-up.


Acked-By: Anton Ivanov 


--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/



Re: [PATCH 0/4 POC] Allow executing code and syscalls in another address space

2021-04-14 Thread Anton Ivanov

On 14/04/2021 06:52, Andrei Vagin wrote:

We already have process_vm_readv and process_vm_writev to read and write
to a process memory faster than we can do this with ptrace. And now it
is time for process_vm_exec that allows executing code in an address
space of another process. We can do this with ptrace but it is much
slower.
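For reference, the existing process_vm_readv interface mentioned above can be exercised roughly as follows. This is only a sketch: the helper name read_remote is made up for illustration, and reading another process's memory is subject to the usual ptrace access checks.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy len bytes from address remote_addr in process
 * pid into buf using process_vm_readv(2). Returns the number of bytes
 * copied, or -1 with errno set.
 */
static ssize_t read_remote(pid_t pid, void *remote_addr, void *buf, size_t len)
{
	struct iovec local = { .iov_base = buf, .iov_len = len };
	struct iovec remote = { .iov_base = remote_addr, .iov_len = len };

	/* One local and one remote segment; the flags argument must be 0. */
	return process_vm_readv(pid, &local, 1, &remote, 1, 0);
}
```

Calling read_remote(getpid(), src, dst, len) copies within the calling process, which is a convenient way to try the call without a second process.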

= Use-cases =

Here are two known use-cases. The first one is “application kernel”
sandboxes like User-mode Linux and gVisor. In this case, we have a
process that runs the sandbox kernel and a set of stub processes that
are used to manage guest address spaces. Guest code is executed in the
context of stub processes but all system calls are intercepted and
handled in the sandbox kernel. Right now, these sorts of sandboxes use
PTRACE_SYSEMU to trap system calls, but process_vm_exec can
significantly speed them up.


Certainly interesting, but it will require UML to rework most of its memory
management, and we will most likely need extra mm support to make use of
it in UML. We are not likely to get away with just one syscall there.




Another use-case is CRIU (Checkpoint/Restore in User-space). Several
process properties can be received only from the process itself. Right
now, we use parasite code that is injected into the process. We do
this with ptrace but it is slow, unsafe, and tricky. process_vm_exec can
simplify the process of injecting parasite code and it will allow
pre-dumping memory without stopping processes. The pre-dump here is when we
enable a memory tracker and dump the memory while a process continues
running. On each iteration we dump memory that has been changed since
the previous iteration. In the final step, we will stop processes and
dump their full state. Right now the most effective way to dump process
memory is to create a set of pipes and splice memory into these pipes
from the parasite code. With process_vm_exec, we will be able to call
vmsplice directly. It means that we will not need to stop a process to
inject the parasite code.

= How it works =

process_vm_exec has two modes:

* Execute code in an address space of a target process and stop on any
   signal or system call.

* Execute a system call in an address space of a target process.

int process_vm_exec(pid_t pid, struct sigcontext uctx,
unsigned long flags, siginfo_t siginfo,
sigset_t  *sigmask, size_t sizemask)

PID - target process identification. We could consider using pidfd
instead of PID here.

sigcontext contains the process state with which the process will be
resumed after switching the address space; when the process is
stopped, its state is saved back to sigcontext.

siginfo is information about a signal that has interrupted the process.
If a process is interrupted by a system call, siginfo will contain a
synthetic siginfo of the SIGSYS signal.

sigmask is a set of signals that process_vm_exec returns via siginfo.

= How fast is it =

In the fourth patch, you can find two benchmarks that execute a function
that calls system calls in a loop. ptrace_vm_exec uses ptrace to trap
system calls, process_vm_exec uses the process_vm_exec syscall to do the
same thing.

ptrace_vm_exec:   1446 ns/syscall
process_vm_exec:   289 ns/syscall
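
To put the two figures in proportion, the ratio works out to roughly a fivefold reduction in per-syscall cost. The helper below is purely illustrative arithmetic on the numbers quoted above; the function name is not part of the patch set.

```c
#include <stdio.h>

/* Ratio of the slower per-syscall cost to the faster one. */
static double speedup(double slow_ns, double fast_ns)
{
	return slow_ns / fast_ns;
}

/* Print the ratio for the two benchmark figures quoted above. */
static void print_speedup(void)
{
	printf("%.1fx\n", speedup(1446.0, 289.0));
}
```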

PS: This version is just a prototype. Its goal is to collect the initial
feedback, to discuss the interfaces, and maybe to get some advice on
implementation.

Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Anton Ivanov 
Cc: Christian Brauner 
Cc: Dmitry Safonov <0x7f454...@gmail.com>
Cc: Ingo Molnar 
Cc: Jeff Dike 
Cc: Mike Rapoport 
Cc: Michael Kerrisk (man-pages) 
Cc: Oleg Nesterov 
Cc: Peter Zijlstra 
Cc: Richard Weinberger 
Cc: Thomas Gleixner 

Andrei Vagin (4):
   signal: add a helper to restore a process state from sigcontex
   arch/x86: implement the process_vm_exec syscall
   arch/x86: allow to execute syscalls via process_vm_exec
   selftests: add tests for process_vm_exec

  arch/Kconfig  |  15 ++
  arch/x86/Kconfig  |   1 +
  arch/x86/entry/common.c   |  19 +++
  arch/x86/entry/syscalls/syscall_64.tbl|   1 +
  arch/x86/include/asm/sigcontext.h |   2 +
  arch/x86/kernel/Makefile  |   1 +
  arch/x86/kernel/process_vm_exec.c | 160 ++
  arch/x86/kernel/signal.c  | 125 ++
  include/linux/entry-common.h  |   2 +
  include/linux/process_vm_exec.h   |  17 ++
  include/linux/sched.h |   7 +
  include/linux/syscalls.h  |   6 +
  include/uapi/asm-generic/unistd.h |   4 +-
  include/uapi/linux/process_vm_exec.h  |   8 +
  kernel/entry/common.c |   2 +-
  kernel/fork.c |   9 +
  kernel/sys_ni.c   |   2 +
  .../selftests/process_vm_exec/Makefile|   7 +
 

Re: [PATCH] um: add 2 missing libs to fix various build errors

2021-04-09 Thread Anton Ivanov

On 10/04/2021 05:13, Randy Dunlap wrote:

On 4/4/21 11:20 AM, Randy Dunlap wrote:

Fix many build errors (at least 18 build error reports) for uml on i386
by adding 2 more library object files. All missing symbols are
either cmpxchg8b_emu or atomic*386.

Here are a few examples of the build errors that are eliminated:

/usr/bin/ld: core.c:(.text+0xd83): undefined reference to `cmpxchg8b_emu'
/usr/bin/ld: core.c:(.text+0x2bb2): undefined reference to `atomic64_add_386'
/usr/bin/ld: core.c:(.text+0x2c5d): undefined reference to `atomic64_xchg_386'
syscall.c:(.text+0x2f49): undefined reference to `atomic64_set_386'
/usr/bin/ld: syscall.c:(.text+0x2f54): undefined reference to `atomic64_set_386'
syscall.c:(.text+0x33a4): undefined reference to `atomic64_inc_386'
/usr/bin/ld: syscall.c:(.text+0x33ac): undefined reference to `atomic64_inc_386'
/usr/bin/ld: net/ipv4/inet_timewait_sock.o: in function `inet_twsk_alloc':
inet_timewait_sock.c:(.text+0x3d1): undefined reference to `atomic64_read_386'
/usr/bin/ld: inet_timewait_sock.c:(.text+0x3dd): undefined reference to `atomic64_set_386'
/usr/bin/ld: net/ipv4/inet_connection_sock.o: in function `inet_csk_clone_lock':
inet_connection_sock.c:(.text+0x1d74): undefined reference to `atomic64_read_386'
/usr/bin/ld: inet_connection_sock.c:(.text+0x1d80): undefined reference to `atomic64_set_386'
/usr/bin/ld: net/ipv4/tcp_input.o: in function `inet_reqsk_alloc':
tcp_input.c:(.text+0xa345): undefined reference to `atomic64_set_386'
/usr/bin/ld: net/mac80211/wpa.o: in function `ieee80211_crypto_tkip_encrypt':
wpa.c:(.text+0x739): undefined reference to `atomic64_inc_return_386'

Signed-off-by: Randy Dunlap 
Reported-by: kernel test robot 
Cc: Brendan Jackman 
Cc: Alexei Starovoitov 
Cc: kbuild-...@lists.01.org
Cc: Jeff Dike 
Cc: Richard Weinberger 
Cc: Anton Ivanov 
Cc: linux...@lists.infradead.org
Cc: Johannes Berg 
Cc: Johannes Berg 
---
My UML on i386 build environment is br0ken so this is not tested other
than to see that the .o files are built as expected.
If someone can test/verify it, please respond. Thanks.


Hi,
Instead of trying to build this on x86_64, I powered up my 32-bit x86
laptop and verified that this patch fixes the build errors of
undefined references to cmpxchg8b_emu() and atomic64_*_386() functions.
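
As background on the missing symbols: on 32-bit x86 the 64-bit atomic operations are built on the cmpxchg8b instruction, with the cmpxchg8b_emu and atomic64_*_386 routines serving as fallbacks for CPUs that lack it. The following is a userspace sketch of the same idea using C11 atomics, not the kernel's atomic64_t API; the function name is made up for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Add delta to *v atomically with a compare-and-exchange loop and return
 * the new value. A 64-bit compare-and-exchange is the primitive that
 * cmpxchg8b provides on 32-bit x86.
 */
static uint64_t add_return_u64(_Atomic uint64_t *v, uint64_t delta)
{
	uint64_t old = atomic_load(v);

	/* On failure, old is refreshed with the current value; retry. */
	while (!atomic_compare_exchange_weak(v, &old, old + delta))
		;
	return old + delta;
}
```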

There are still some build errors in 2 object files:

/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x8): undefined reference to `X86_FEATURE_XMM2'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x15): undefined reference to `X86_FEATURE_XMM2'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x22): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x2f): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x3c): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x49): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x56): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x63): more undefined references to `X86_FEATURE_XMM' follow

and

/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: drivers/fpga/altera-pr-ip-core.o:(.altinstructions+0x8): undefined reference to `X86_FEATURE_XMM2'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: drivers/fpga/altera-pr-ip-core.o:(.altinstructions+0x15): undefined reference to `X86_FEATURE_XMM2'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: drivers/fpga/altera-pr-ip-core.o:(.altinstructions+0x22): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: drivers/fpga/altera-pr-ip-core.o:(.altinstructions+0x2f): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: drivers/fpga/altera-pr-ip-core.o:(.altinstructions+0x3c): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: drivers/fpga/altera-pr-ip-core.o:(.altinstructions+0x49): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld: drivers/fpga/altera-pr-ip-core.o:(.altinstructions+0x56): undefined reference to `X86_FEATURE_XMM'
/usr/lib/gcc/i586-suse-linux/10/../../../../i586-suse-linux/bin/ld

Re: NFS Caching broken in 4.19.37

2021-02-26 Thread Anton Ivanov

On 26/02/2021 15:03, Timo Rothenpieler wrote:
I think I can reproduce this, or something that at least looks very 
similar to this, on 5.10. Namely on 5.10.17 (On both Client and Server).


I think this is a different issue - see below.



We are running slurm, and for a while now (it coincides with updating
from 5.4 to 5.10, but a whole bunch of other stuff was updated at the
same time, so it took me a while to correlate this) the logs it writes
have been truncated, but only while they're being observed on the
client, using tail -f or something like that.


Looks like this then:

On Server:

store01 /srv/export/home/users/timo/TestRun # ls -l slurm-41101.out
-rw-r--r-- 1 timo timo 1931 Feb 26 15:46 slurm-41101.out
store01 /srv/export/home/users/timo/TestRun # wc -l slurm-41101.out
61 slurm-41101.out


On Client:

timo@login01 ~/TestRun $ ls -l slurm-41101.out
-rw-r--r-- 1 timo timo 1931 Feb 26 15:46 slurm-41101.out
timo@login01 ~/TestRun $ wc -l slurm-41101.out
24 slurm-41101.out


See https://gist.github.com/BtbN/b9eb4fc08ccc53bb20087bce0bf9f826 for 
the respective file-contents.


If I run the same test job, wait until it's done, and then look at its
slurm.out file, it matches between NFS Client and Server.
If I tail -f the slurm.out on an NFS client, the file stops getting 
updated on the client, but keeps getting more logs written to it on 
the NFS server.


The slurm.out file is being written to by another NFS client, which is
running on one of the compute nodes of the system. It's being read
from a login node.


If these are two different clients, then what you see is possible on NFS
with client-side caching. If you have multiple clients reading/writing
to the same files you usually need to tune the caching options and/or
use locking. I suspect that if you leave it for a while (until the cache
expires) it will sort itself out.
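
As an illustration of the locking approach: nfs(5) recommends file locking when cache coherence among clients is required. A reader on the login node could take a shared lock before reading, along these lines. This is a sketch only; the helper name locked_read is hypothetical and it has not been tried against the slurm setup described here.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from the start of path while
 * holding a shared (read) lock on the whole file. Returns the number of
 * bytes read, or -1 on error.
 */
static ssize_t locked_read(const char *path, void *buf, size_t len)
{
	struct flock fl = {
		.l_type = F_RDLCK,	/* shared lock */
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,		/* 0 means "to end of file" */
	};
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* Block until the lock is granted. */
	if (fcntl(fd, F_SETLKW, &fl) < 0) {
		close(fd);
		return -1;
	}

	n = read(fd, buf, len);

	/* Closing the descriptor also drops the lock. */
	close(fd);
	return n;
}
```

The writer would need to take a matching write lock (F_WRLCK) around its updates for this to serialize anything.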


In my test case there is just one client; it missed a file deletion, and
nothing short of an unmount and remount fixes that. I have waited for 30+
minutes. It does not seem to refresh or expire. I also see the opposite
behavior - the bug shows up on 4.x up to at least 5.4. I do not see it
on 5.10.


Brgds,







Timo


On 21.02.2021 16:53, Anton Ivanov wrote:

Client side. This seems to be an entirely client side issue.

A variety of kernels on the clients starting from 4.9 and up to 5.10 
using 4.19 servers. I have observed it on a 4.9 client versus 4.9 
server earlier.


4.9 fails, 4.19 fails, 5.2 fails, 5.4 fails, 5.10 works.

At present the server is at 4.19.67 in all tests.

Linux jain 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 
(2019-11-11) x86_64 GNU/Linux


I can set-up a couple of alternative servers during the week, but so 
far everything is pointing towards a client fs cache issue, not a 
server one.


Brgds,









Re: NFS Caching broken in 4.19.37

2021-02-21 Thread Anton Ivanov

On 21/02/2021 14:37, Bruce Fields wrote:

On Sun, Feb 21, 2021 at 11:38:51AM +, Anton Ivanov wrote:

On 21/02/2021 09:13, Salvatore Bonaccorso wrote:

On Sat, Feb 20, 2021 at 08:16:26PM +, Chuck Lever wrote:

Confirming you are varying client-side kernels. Should the Linux
NFS client maintainers be Cc'd?

Ok, agreed. Let's add them as well. NFS client maintainers, any ideas
on how to tackle this?

This is not observed with Debian backports 5.10 package

uname -a
Linux madding 5.10.0-0.bpo.3-amd64 #1 SMP Debian 5.10.13-1~bpo10+1
(2021-02-11) x86_64 GNU/Linux

I'm still unclear: when you say you tested a certain kernel: are you
varying the client-side kernel version, or the server side, or both at
once?


Client side. This seems to be an entirely client side issue.

A variety of kernels on the clients starting from 4.9 and up to 5.10 
using 4.19 servers. I have observed it on a 4.9 client versus 4.9 server 
earlier.


4.9 fails, 4.19 fails, 5.2 fails, 5.4 fails, 5.10 works.

At present the server is at 4.19.67 in all tests.

Linux jain 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) 
x86_64 GNU/Linux


I can set-up a couple of alternative servers during the week, but so far 
everything is pointing towards a client fs cache issue, not a server one.


Brgds,


--b.






Re: NFS Caching broken in 4.19.37

2021-02-21 Thread Anton Ivanov

On 21/02/2021 09:13, Salvatore Bonaccorso wrote:

Hi,

On Sat, Feb 20, 2021 at 08:16:26PM +, Chuck Lever wrote:




On Feb 20, 2021, at 3:13 PM, Anton Ivanov  
wrote:

On 20/02/2021 20:04, Salvatore Bonaccorso wrote:

Hi,

On Mon, Jul 08, 2019 at 07:19:54PM +0100, Anton Ivanov wrote:

Hi list,

NFS caching appears broken in 4.19.37.

The more cores/threads the easier to reproduce. Tested with identical
results on Ryzen 1600 and 1600X.

1. Mount an openwrt build tree over NFS v4
2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in a
loop
3. Result after 3-4 iterations:

State on the client

ls -laF 
/var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm

total 8
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../

State as seen on the server (mounted via nfs from localhost):

ls -laF 
/var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../
-rw-r--r-- 1 anivanov anivanov   32 Jul  8 11:40 ipcbuf.h

Actual state on the filesystem:

ls -laF 
/exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../
-rw-r--r-- 1 anivanov anivanov   32 Jul  8 11:40 ipcbuf.h

So the client has quite clearly lost the plot. Telling it to drop caches and
re-reading the directory shows the file present.
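
The comparison shown in the listings above (the client's view versus the exported path on the server) could be automated with something like the following. This is a rough sketch; the function names are invented for illustration, and it only compares entry counts, not names or attributes.

```c
#include <dirent.h>
#include <string.h>

/*
 * Count the entries in a directory, excluding "." and "..".
 * Returns -1 if the directory cannot be opened.
 */
static int count_entries(const char *path)
{
	struct dirent *de;
	int count = 0;
	DIR *dir = opendir(path);

	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") && strcmp(de->d_name, ".."))
			count++;
	}
	closedir(dir);
	return count;
}

/* Nonzero when two views of the same directory disagree on entry count. */
static int views_differ(const char *view_a, const char *view_b)
{
	return count_entries(view_a) != count_entries(view_b);
}
```

On the server in this report, view_a would be the NFS mount from localhost and view_b the path under /exports.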

It is possible to reproduce this using a linux kernel tree too, it just takes
many more iterations - 10+ at least.

Both client and server run 4.19.37 from Debian buster. This is filed as
debian bug 931500. I originally thought it to be autofs related, but IMHO it
is actually something fundamentally broken in nfs caching resulting in cache
corruption.

According to the reporter downstream in Debian, at
https://bugs.debian.org/940821#26 this still seems reproducible with
more recent kernels than the one initially reported. Is there anything Anton
can provide to try to track down the issue?

Anton, can you reproduce with current stable series?


100% reproducible with any kernel from 4.9 to 5.4, stable or backports. It may 
exist in earlier versions, but I do not have a machine with anything before 4.9 
to test at present.


Confirming you are varying client-side kernels. Should the Linux
NFS client maintainers be Cc'd?


Ok, agreed. Let's add them as well. NFS client maintainers, any ideas
on how to tackle this?


This is not observed with Debian backports 5.10 package

uname -a
Linux madding 5.10.0-0.bpo.3-amd64 #1 SMP Debian 5.10.13-1~bpo10+1 
(2021-02-11) x86_64 GNU/Linux


I left the testcase running for ~ 4 hours on a 6core/12thread Ryzen. It 
should have blown up 10 times by now.


So one of the commits between 5.4 and 5.10.13 fixed it.

If nobody can think of a particular commit which fixes it, I can try
bisecting it during the week.


A.






It takes anywhere from 1-2 make clean && make cycles to one afternoon,
depending on the number of machine cores. The more cores/threads, the
faster it triggers.

I tried playing with protocol minor versions, caching options, etc - it is 
still reproducible for any nfs4 settings as long as there is client side 
caching of metadata.

A.



Regards,
Salvatore





--
Chuck Lever


Regards,
Salvatore






Re: NFS Caching broken in 4.19.37

2021-02-20 Thread Anton Ivanov

On 20/02/2021 20:04, Salvatore Bonaccorso wrote:

Hi,

On Mon, Jul 08, 2019 at 07:19:54PM +0100, Anton Ivanov wrote:

Hi list,

NFS caching appears broken in 4.19.37.

The more cores/threads the easier to reproduce. Tested with identical
results on Ryzen 1600 and 1600X.

1. Mount an openwrt build tree over NFS v4
2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in a
loop
3. Result after 3-4 iterations:

State on the client

ls -laF 
/var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm

total 8
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../

State as seen on the server (mounted via nfs from localhost):

ls -laF 
/var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../
-rw-r--r-- 1 anivanov anivanov   32 Jul  8 11:40 ipcbuf.h

Actual state on the filesystem:

ls -laF 
/exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../
-rw-r--r-- 1 anivanov anivanov   32 Jul  8 11:40 ipcbuf.h

So the client has quite clearly lost the plot. Telling it to drop caches and
re-reading the directory shows the file present.

It is possible to reproduce this using a linux kernel tree too, it just takes
many more iterations - 10+ at least.

Both client and server run 4.19.37 from Debian buster. This is filed as
debian bug 931500. I originally thought it to be autofs related, but IMHO it
is actually something fundamentally broken in nfs caching resulting in cache
corruption.

According to the reporter downstream in Debian, at
https://bugs.debian.org/940821#26 this still seems reproducible with
more recent kernels than the one initially reported. Is there anything Anton
can provide to try to track down the issue?

Anton, can you reproduce with current stable series?


100% reproducible with any kernel from 4.9 to 5.4, stable or backports. 
It may exist in earlier versions, but I do not have a machine with 
anything before 4.9 to test at present.


It takes anywhere from 1-2 make clean && make cycles to one afternoon,
depending on the number of machine cores. The more cores/threads, the
faster it triggers.


I tried playing with protocol minor versions, caching options, etc - it 
is still reproducible for any nfs4 settings as long as there is client 
side caching of metadata.


A.



Regards,
Salvatore






Re: [PATCH] um: random: register random as hwrng-core device

2020-11-13 Thread Anton Ivanov
_wait_queue(_read_wait, );
+   ignore_sigio_fd(random_fd);
+   deactivate_fd(random_fd, RANDOM_IRQ);
  
-			if (atomic_dec_and_test(_sleep_count)) {

-   ignore_sigio_fd(random_fd);
-   deactivate_fd(random_fd, RANDOM_IRQ);
-   }
+   if (ret < 0)
+   break;
+   } else {
+   break;
}
-   else
-   return n;
-
-   if (signal_pending (current))
-   return ret ? : -ERESTARTSYS;
}
-   return ret;
-}
  
-static const struct file_operations rng_chrdev_ops = {

-   .owner  = THIS_MODULE,
-   .open   = rng_dev_open,
-   .read   = rng_dev_read,
-   .llseek = noop_llseek,
-};
-
-/* rng_init shouldn't be called more than once at boot time */
-static struct miscdevice rng_miscdev = {
-   HWRNG_MINOR,
-   RNG_MODULE_NAME,
-   _chrdev_ops,
-};
+   return ret != -EAGAIN ? ret : 0;
+}
  
  static irqreturn_t random_interrupt(int irq, void *data)

  {
-   wake_up(_read_wait);
+   complete(_data);
  
  	return IRQ_HANDLED;

  }
@@ -126,18 +74,19 @@ static int __init rng_init (void)
goto out;
  
  	random_fd = err;

-
err = um_request_irq(RANDOM_IRQ, random_fd, IRQ_READ, random_interrupt,
 0, "random", NULL);
if (err)
goto err_out_cleanup_hw;
  
  	sigio_broken(random_fd, 1);

+   hwrng.name = RNG_MODULE_NAME;
+   hwrng.read = rng_dev_read;
+   hwrng.quality = 1024;
  
-	err = misc_register (_miscdev);

+   err = hwrng_register();
if (err) {
-   printk (KERN_ERR RNG_MODULE_NAME ": misc device register "
-   "failed\n");
+   pr_err(RNG_MODULE_NAME " registering failed (%d)\n", err);
goto err_out_cleanup_hw;
}
  out:
@@ -161,8 +110,8 @@ static void cleanup(void)
  
  static void __exit rng_cleanup(void)

  {
+   hwrng_unregister();
os_close_file(random_fd);
-   misc_deregister (_miscdev);
  }
  
  module_init (rng_init);

diff --git a/drivers/char/hw_random/Kconfig b/drivers/char/hw_random/Kconfig
index e92c4d9469d8..5952210526aa 100644
--- a/drivers/char/hw_random/Kconfig
+++ b/drivers/char/hw_random/Kconfig
@@ -540,15 +540,15 @@ endif # HW_RANDOM
  
  config UML_RANDOM

depends on UML
-   tristate "Hardware random number generator"
+   select HW_RANDOM
+   tristate "UML Random Number Generator support"
help
  This option enables UML's "hardware" random number generator.  It
  attaches itself to the host's /dev/random, supplying as much entropy
  as the host has, rather than the small amount the UML gets from its
- own drivers.  It registers itself as a standard hardware random number
- generator, major 10, minor 183, and the canonical device name is
- /dev/hwrng.
- The way to make use of this is to install the rng-tools package
- (check your distro, or download from
- http://sourceforge.net/projects/gkernel/).  rngd periodically reads
- /dev/hwrng and injects the entropy into /dev/random.
+ own drivers. It registers itself as a rng-core driver thus providing
+ a device which is usually called /dev/hwrng. This hardware random
+ number generator does feed into the kernel's random number generator
+ entropy pool.
+
+ If unsure, say Y.



Acked-by: Anton Ivanov 
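
For readers skimming the diff: the new read path ends with "return ret != -EAGAIN ? ret : 0;", i.e. "no data available yet" is reported as zero bytes rather than as an error. A userspace analogue of that convention might look like the following. This is only a sketch and not the driver code; the helper name is made up, and it assumes a descriptor that has been put into non-blocking mode.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace analogue of the read path: try a non-blocking
 * read and report "no data available yet" (EAGAIN) as 0 bytes rather than
 * as an error. Other failures are returned as a negative errno value.
 */
static ssize_t read_entropy_nonblock(int fd, void *buf, size_t len)
{
	ssize_t n = read(fd, buf, len);

	if (n < 0)
		return errno == EAGAIN ? 0 : -errno;
	return n;
}
```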



Re: [PATCH] arch: um: convert tasklets to use new tasklet_setup() API

2020-10-19 Thread Anton Ivanov




On 17/08/2020 10:15, Allen Pais wrote:

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
  arch/um/drivers/vector_kern.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/um/drivers/vector_kern.c b/arch/um/drivers/vector_kern.c
index 8735c468230a..06980870ae23 100644
--- a/arch/um/drivers/vector_kern.c
+++ b/arch/um/drivers/vector_kern.c
@@ -1196,9 +1196,9 @@ static int vector_net_close(struct net_device *dev)
  
  /* TX tasklet */
  
-static void vector_tx_poll(unsigned long data)

+static void vector_tx_poll(struct tasklet_struct *t)
  {
-   struct vector_private *vp = (struct vector_private *)data;
+   struct vector_private *vp = from_tasklet(vp, t, tx_poll);
  
  	vp->estats.tx_kicks++;

vector_send(vp->tx_queue);
@@ -1629,7 +1629,7 @@ static void vector_eth_configure(
});
  
  	dev->features = dev->hw_features = (NETIF_F_SG | NETIF_F_FRAGLIST);

-   tasklet_init(>tx_poll, vector_tx_poll, (unsigned long)vp);
+   tasklet_setup(>tx_poll, vector_tx_poll);
INIT_WORK(>reset_tx, vector_reset_tx);
  
  	timer_setup(>tl, vector_timer_expire, 0);




Acked-By: Anton Ivanov 
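
For context, from_tasklet() is a thin wrapper around container_of(): the callback is handed the tasklet itself and recovers the structure that embeds it. The following userspace sketch shows the mechanism with stand-in types; the struct layouts here are simplified for illustration and are not the real kernel definitions.

```c
#include <stddef.h>

/* Stand-ins for the kernel types; only the embedding matters here. */
struct tasklet_struct {
	void (*callback)(struct tasklet_struct *t);
};

struct vector_private {
	unsigned long tx_kicks;
	struct tasklet_struct tx_poll;
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The callback receives the tasklet and recovers its enclosing struct. */
static void vector_tx_poll(struct tasklet_struct *t)
{
	struct vector_private *vp =
		container_of(t, struct vector_private, tx_poll);

	vp->tx_kicks++;
}
```

This is why the (unsigned long)vp argument to tasklet_init() is no longer needed after the conversion.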



Re: [PATCH 3/6] docs: virt: user_mode_linux_howto_v2.rst: fix a literal block markup

2020-10-02 Thread Anton Ivanov




On 02/10/2020 06:49, Mauro Carvalho Chehab wrote:

There's a missing new line for a literal block:

.../Documentation/virt/uml/user_mode_linux_howto_v2.rst:682: WARNING: 
Unexpected indentation.

Fixes: 04301bf5b072 ("docs: replace the old User Mode Linux HowTo with a new 
one")
Signed-off-by: Mauro Carvalho Chehab 
---
  Documentation/virt/uml/user_mode_linux_howto_v2.rst | 1 +
  1 file changed, 1 insertion(+)

diff --git a/Documentation/virt/uml/user_mode_linux_howto_v2.rst 
b/Documentation/virt/uml/user_mode_linux_howto_v2.rst
index f70e6f5873c6..312e431695d9 100644
--- a/Documentation/virt/uml/user_mode_linux_howto_v2.rst
+++ b/Documentation/virt/uml/user_mode_linux_howto_v2.rst
@@ -679,6 +679,7 @@ Starting UML
  
  We can now run UML.

  ::
+
 # linux mem=2048M umid=TEST \
  ubd0=Filesystem.img \
  vec0:transport=tap,ifname=tap0,depth=128,gro=1 \



Thanks.

Acked-By: Anton Ivanov 



Re: [PATCH] um: vector: Use GFP_ATOMIC under spin lock

2020-09-21 Thread Anton Ivanov




On 19/06/2020 06:20, Tiezhu Yang wrote:

Use GFP_ATOMIC instead of GFP_KERNEL under spin lock to fix possible
sleep-in-atomic-context bugs.

Fixes: 9807019a62dc ("um: Loadable BPF "Firmware" for vector drivers")
Signed-off-by: Tiezhu Yang 
---
  arch/um/drivers/vector_kern.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/um/drivers/vector_kern.c b/arch/um/drivers/vector_kern.c
index 8735c46..555203e 100644
--- a/arch/um/drivers/vector_kern.c
+++ b/arch/um/drivers/vector_kern.c
@@ -1403,7 +1403,7 @@ static int vector_net_load_bpf_flash(struct net_device 
*dev,
kfree(vp->bpf->filter);
vp->bpf->filter = NULL;
} else {
-   vp->bpf = kmalloc(sizeof(struct sock_fprog), GFP_KERNEL);
+   vp->bpf = kmalloc(sizeof(struct sock_fprog), GFP_ATOMIC);
if (vp->bpf == NULL) {
netdev_err(dev, "failed to allocate memory for 
firmware\n");
goto flash_fail;
@@ -1415,7 +1415,7 @@ static int vector_net_load_bpf_flash(struct net_device 
*dev,
if (request_firmware(, efl->data, >pdev.dev))
goto flash_fail;
  
-	vp->bpf->filter = kmemdup(fw->data, fw->size, GFP_KERNEL);

+   vp->bpf->filter = kmemdup(fw->data, fw->size, GFP_ATOMIC);
    if (!vp->bpf->filter)
goto free_buffer;
  


Acked-By: Anton Ivanov 


Re: [PATCH v2 2/3] um: some fixes to build UML with musl

2020-07-14 Thread Anton Ivanov




On 14/07/2020 11:23, Ignat Korchagin wrote:

On Tue, Jul 14, 2020 at 9:40 AM Anton Ivanov
 wrote:



On 04/07/2020 09:52, Ignat Korchagin wrote:

musl toolchain and headers are a bit more strict. These fixes enable building
UML with musl and do not seem to break the glibc build.

Signed-off-by: Ignat Korchagin 
---
   arch/um/drivers/daemon_user.c |  1 +
   arch/um/drivers/pcap_user.c   | 12 ++--
   arch/um/drivers/slip_user.c   |  2 +-
   arch/um/drivers/vector_user.c |  4 +---
   arch/um/os-Linux/util.c   |  2 +-
   arch/x86/um/user-offsets.c|  2 +-
   6 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/um/drivers/daemon_user.c b/arch/um/drivers/daemon_user.c
index 3695821d06a2..785baedc3555 100644
--- a/arch/um/drivers/daemon_user.c
+++ b/arch/um/drivers/daemon_user.c
@@ -7,6 +7,7 @@
*/

   #include 
+#include 
   #include 
   #include 
   #include 
diff --git a/arch/um/drivers/pcap_user.c b/arch/um/drivers/pcap_user.c
index bbd20638788a..52ddda3e3b10 100644
--- a/arch/um/drivers/pcap_user.c
+++ b/arch/um/drivers/pcap_user.c
@@ -32,7 +32,7 @@ static int pcap_user_init(void *data, void *dev)
   return 0;
   }

-static int pcap_open(void *data)
+static int pcap_user_open(void *data)


This change in the function name was introduced on purpose to avoid a name clash
with some versions of libpcap which export pcap_open


Yes





   {
   struct pcap_data *pri = data;
   __u32 netmask;
@@ -44,14 +44,14 @@ static int pcap_open(void *data)
   if (pri->filter != NULL) {
   err = dev_netmask(pri->dev, );
   if (err < 0) {
- printk(UM_KERN_ERR "pcap_open : dev_netmask failed\n");
+ printk(UM_KERN_ERR "pcap_user_open : dev_netmask 
failed\n");
   return -EIO;
   }

   pri->compiled = uml_kmalloc(sizeof(struct bpf_program),
   UM_GFP_KERNEL);
   if (pri->compiled == NULL) {
- printk(UM_KERN_ERR "pcap_open : kmalloc failed\n");
+ printk(UM_KERN_ERR "pcap_user_open : kmalloc failed\n");
   return -ENOMEM;
   }

@@ -59,14 +59,14 @@ static int pcap_open(void *data)
  (struct bpf_program *) pri->compiled,
  pri->filter, pri->optimize, netmask);
   if (err < 0) {
- printk(UM_KERN_ERR "pcap_open : pcap_compile failed - "
+ printk(UM_KERN_ERR "pcap_user_open : pcap_compile failed - "
  "'%s'\n", pcap_geterr(pri->pcap));
   goto out;
   }

   err = pcap_setfilter(pri->pcap, pri->compiled);
   if (err < 0) {
- printk(UM_KERN_ERR "pcap_open : pcap_setfilter "
+ printk(UM_KERN_ERR "pcap_user_open : pcap_setfilter "
  "failed - '%s'\n", pcap_geterr(pri->pcap));
   goto out;
   }
@@ -127,7 +127,7 @@ int pcap_user_read(int fd, void *buffer, int len, struct pcap_data *pri)

   const struct net_user_info pcap_user_info = {
   .init   = pcap_user_init,
- .open   = pcap_open,
+ .open   = pcap_user_open,
   .close  = NULL,
   .remove = pcap_remove,
   .add_address= NULL,
diff --git a/arch/um/drivers/slip_user.c b/arch/um/drivers/slip_user.c
index 8016d32b6809..482a19c5105c 100644
--- a/arch/um/drivers/slip_user.c
+++ b/arch/um/drivers/slip_user.c
@@ -9,7 +9,7 @@
   #include 
   #include 
   #include 
-#include 
+#include 
   #include 
   #include 
   #include 
diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index c4a0f26b2824..45d4164ad355 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -18,9 +18,7 @@
   #include 
   #include 
   #include 
-#include 
   #include 
-#include 
   #include 
   #include 
   #include 
@@ -332,7 +330,7 @@ static struct vector_fds *user_init_unix_fds(struct arglist *ifspec, int id)
   }
   switch (id) {
   case ID_BESS:
- if (connect(fd, remote_addr, sizeof(struct sockaddr_un)) < 0) {
+ if (connect(fd, (const struct sockaddr *) remote_addr, sizeof(struct sockaddr_un)) < 0) {
   printk(UM_KERN_ERR "bess open:cannot connect to %s %i", remote_addr->sun_path, -errno);
   goto unix_cleanup;
   }
diff --git a/arch/um/os-Linux/util.c b/arch/um/os-Linux/util.c
index ecf2f390fad2..07327425d06e 100644
--- a/arch/um/os-Linux/util.c
+++ b/arch/um/os-Linux/util.c
@@ -10,7 +10,7 @@
   #include 
   #include 
   #include 
-#include 
+#include 
   #include 
   #include 
   #include

Re: [PATCH v2 2/3] um: some fixes to build UML with musl

2020-07-14 Thread Anton Ivanov



On 04/07/2020 09:52, Ignat Korchagin wrote:

musl toolchain and headers are a bit more strict. These fixes enable building
UML with musl as well as seem not to break on glibc.

Signed-off-by: Ignat Korchagin 
---
  arch/um/drivers/daemon_user.c |  1 +
  arch/um/drivers/pcap_user.c   | 12 ++--
  arch/um/drivers/slip_user.c   |  2 +-
  arch/um/drivers/vector_user.c |  4 +---
  arch/um/os-Linux/util.c   |  2 +-
  arch/x86/um/user-offsets.c|  2 +-
  6 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/um/drivers/daemon_user.c b/arch/um/drivers/daemon_user.c
index 3695821d06a2..785baedc3555 100644
--- a/arch/um/drivers/daemon_user.c
+++ b/arch/um/drivers/daemon_user.c
@@ -7,6 +7,7 @@
   */
  
  #include 

+#include 
  #include 
  #include 
  #include 
diff --git a/arch/um/drivers/pcap_user.c b/arch/um/drivers/pcap_user.c
index bbd20638788a..52ddda3e3b10 100644
--- a/arch/um/drivers/pcap_user.c
+++ b/arch/um/drivers/pcap_user.c
@@ -32,7 +32,7 @@ static int pcap_user_init(void *data, void *dev)
return 0;
  }
  
-static int pcap_open(void *data)

+static int pcap_user_open(void *data)


This change in the function name was introduced on purpose, to avoid a name
clash with some versions of libpcap, which export pcap_open.



  {
struct pcap_data *pri = data;
__u32 netmask;
@@ -44,14 +44,14 @@ static int pcap_open(void *data)
if (pri->filter != NULL) {
err = dev_netmask(pri->dev, );
if (err < 0) {
-   printk(UM_KERN_ERR "pcap_open : dev_netmask failed\n");
+   printk(UM_KERN_ERR "pcap_user_open : dev_netmask failed\n");
return -EIO;
}
  
  		pri->compiled = uml_kmalloc(sizeof(struct bpf_program),

UM_GFP_KERNEL);
if (pri->compiled == NULL) {
-   printk(UM_KERN_ERR "pcap_open : kmalloc failed\n");
+   printk(UM_KERN_ERR "pcap_user_open : kmalloc failed\n");
return -ENOMEM;
}
  
@@ -59,14 +59,14 @@ static int pcap_open(void *data)

   (struct bpf_program *) pri->compiled,
   pri->filter, pri->optimize, netmask);
if (err < 0) {
-   printk(UM_KERN_ERR "pcap_open : pcap_compile failed - "
+   printk(UM_KERN_ERR "pcap_user_open : pcap_compile failed - "
   "'%s'\n", pcap_geterr(pri->pcap));
goto out;
}
  
  		err = pcap_setfilter(pri->pcap, pri->compiled);

if (err < 0) {
-   printk(UM_KERN_ERR "pcap_open : pcap_setfilter "
+   printk(UM_KERN_ERR "pcap_user_open : pcap_setfilter "
   "failed - '%s'\n", pcap_geterr(pri->pcap));
goto out;
}
@@ -127,7 +127,7 @@ int pcap_user_read(int fd, void *buffer, int len, struct pcap_data *pri)
  
  const struct net_user_info pcap_user_info = {

.init   = pcap_user_init,
-   .open   = pcap_open,
+   .open   = pcap_user_open,
.close  = NULL,
.remove = pcap_remove,
.add_address= NULL,
diff --git a/arch/um/drivers/slip_user.c b/arch/um/drivers/slip_user.c
index 8016d32b6809..482a19c5105c 100644
--- a/arch/um/drivers/slip_user.c
+++ b/arch/um/drivers/slip_user.c
@@ -9,7 +9,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  #include 
  #include 
  #include 
diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index c4a0f26b2824..45d4164ad355 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -18,9 +18,7 @@
  #include 
  #include 
  #include 
-#include 
  #include 
-#include 
  #include 
  #include 
  #include 
@@ -332,7 +330,7 @@ static struct vector_fds *user_init_unix_fds(struct arglist *ifspec, int id)
}
switch (id) {
case ID_BESS:
-   if (connect(fd, remote_addr, sizeof(struct sockaddr_un)) < 0) {
+   if (connect(fd, (const struct sockaddr *) remote_addr, sizeof(struct sockaddr_un)) < 0) {
printk(UM_KERN_ERR "bess open:cannot connect to %s %i", remote_addr->sun_path, -errno);
goto unix_cleanup;
}
diff --git a/arch/um/os-Linux/util.c b/arch/um/os-Linux/util.c
index ecf2f390fad2..07327425d06e 100644
--- a/arch/um/os-Linux/util.c
+++ b/arch/um/os-Linux/util.c
@@ -10,7 +10,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  #include 
  #include 
  #include 
diff --git a/arch/x86/um/user-offsets.c b/arch/x86/um/user-offsets.c
index c51dd8363d25..bae61554abcc 100644
--- a/arch/x86/um/user-offsets.c
+++ b/arch/x86/um/user-offsets.c
@@ -2,7 +2,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
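
The connect() hunk above follows from the POSIX prototype, which takes a
const struct sockaddr *, so a struct sockaddr_un * has to be cast explicitly
when the headers do not accept it implicitly. A minimal user-space sketch of
the same pattern follows. connect_unix() is a hypothetical helper name used
only for illustration and is not part of the patch:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: connect a UNIX stream socket to `path`.
 * Returns the connected fd, or -1 on any failure.
 */
int connect_unix(const char *path)
{
	struct sockaddr_un addr;
	int fd;

	/* Reject paths that do not fit into sun_path (including the NUL). */
	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* connect() is declared with const struct sockaddr *, so cast. */
	if (connect(fd, (const struct sockaddr *) &addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With the cast in place the call should compile without pointer-type
warnings regardless of how a given libc declares the argument.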
  

Re: [PATCH] Replace HTTP links with HTTPS ones: user-mode Linux

2020-07-07 Thread Anton Ivanov

On 07/07/2020 21:32, Alexander A. Klimov wrote:

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
   If not .svg:
 For each line:
   If doesn't contain `\bxmlns\b`:
 For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
   If both the HTTP and HTTPS versions
   return 200 OK and serve the same content:
 Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov 
---
  Continuing my work started at 93431e0607e5.
  See also: git log --oneline '--author=Alexander A. Klimov ' v5.7..master

  If there are any URLs to be removed completely or at least not HTTPSified:
  Just clearly say so and I'll *undo my change*.
  See also: https://lkml.org/lkml/2020/6/27/64

  If there are any valid, but yet not changed URLs:
  See: https://lkml.org/lkml/2020/6/26/837

  If you apply the patch, please let me know.
  Rationale:
  I'd like not to submit patches much faster than you maintainers apply them.

  Documentation/virt/uml/user_mode_linux.rst | 2 +-
  arch/um/drivers/Kconfig| 2 +-
  arch/um/drivers/harddog_kern.c | 2 +-
  3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/uml/user_mode_linux.rst b/Documentation/virt/uml/user_mode_linux.rst
index de0f0b2c9d5b..775d3de84331 100644
--- a/Documentation/virt/uml/user_mode_linux.rst
+++ b/Documentation/virt/uml/user_mode_linux.rst
@@ -3753,7 +3753,7 @@ Note:
  
  
Documentation on IP Masquerading, and SNAT, can be found at

-  http://www.netfilter.org.
+  https://www.netfilter.org.
  
  
If you can reach the local net, but not the outside Internet, then

diff --git a/arch/um/drivers/Kconfig b/arch/um/drivers/Kconfig
index 9160ead56e33..85e170149e99 100644
--- a/arch/um/drivers/Kconfig
+++ b/arch/um/drivers/Kconfig
@@ -259,7 +259,7 @@ config UML_NET_VDE
To use this form of networking, you will need to run vde_switch
on the host.
  
-	For more information, see 

+   For more information, see 
That site has a good overview of what VDE is and also examples
of the UML command line to use to enable VDE networking.
  
diff --git a/arch/um/drivers/harddog_kern.c b/arch/um/drivers/harddog_kern.c

index e6d4f43deba8..7a39b8b7ae55 100644
--- a/arch/um/drivers/harddog_kern.c
+++ b/arch/um/drivers/harddog_kern.c
@@ -3,7 +3,7 @@
   *SoftDog 0.05:   A Software Watchdog Device
   *
   *(c) Copyright 1996 Alan Cox , All Rights Reserved.
- * http://www.redhat.com
+ * https://www.redhat.com
   *
   *This program is free software; you can redistribute it and/or
   *modify it under the terms of the GNU General Public License



We should really try to finish the new documentation. The one in the 
kernel tree is very out of date.


The draft is here: https://github.com/kot-begemot-uk/uml-howto-v2


--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [PATCH] Fix null pointer dereference in vector_user_bpf

2020-06-14 Thread Anton Ivanov

On 14/06/2020 02:19, Gaurav Singh wrote:

bpf_prog is checked against NULL after uml_kmalloc(), but it is
later used directly, for example in bpf_prog->filter = bpf, and
is also returned on success. Fix this by doing the NULL check
and returning right away on failure.

Signed-off-by: Gaurav Singh 
---
  arch/um/drivers/vector_user.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index c4a0f26b2824..0e6d6717bf73 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -789,10 +789,12 @@ void *uml_vector_user_bpf(char *filename)
return false;
}
bpf_prog = uml_kmalloc(sizeof(struct sock_fprog), UM_GFP_KERNEL);
-   if (bpf_prog != NULL) {
-   bpf_prog->len = statbuf.st_size / sizeof(struct sock_filter);
-   bpf_prog->filter = NULL;
+   if (bpf_prog == NULL) {
+   printk(KERN_ERR "Failed to allocate bpf prog buffer");
+   return NULL;
}
+   bpf_prog->len = statbuf.st_size / sizeof(struct sock_filter);
+   bpf_prog->filter = NULL;
ffd = os_open_file(filename, of_read(OPENFLAGS()), 0);
if (ffd < 0) {
printk(KERN_ERR "Error %d opening bpf file", -errno);
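
The shape of the fix above (check the allocation once, return early, then use
the pointer unconditionally) can be sketched in plain C as follows. This is
only an illustration: struct filter_prog and filter_prog_new() are
hypothetical stand-ins for struct sock_fprog and the real function, and
malloc() stands in for uml_kmalloc():

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sock_fprog. */
struct filter_prog {
	unsigned short len;
	void *filter;
};

/* Allocate and initialise a program header; NULL on failure. */
struct filter_prog *filter_prog_new(size_t file_size, size_t insn_size)
{
	struct filter_prog *prog;

	if (insn_size == 0)
		return NULL;

	prog = malloc(sizeof(*prog));
	if (prog == NULL) {
		/* Bail out immediately so nothing below sees a NULL pointer. */
		fprintf(stderr, "Failed to allocate filter program buffer\n");
		return NULL;
	}

	/* From here on prog is known to be valid. */
	prog->len = (unsigned short)(file_size / insn_size);
	prog->filter = NULL;
	return prog;
}
```

Returning early keeps every later dereference outside of any conditional, so
there is no path on which a failed allocation reaches the code that fills in
the structure.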



Acked-By: Anton Ivanov 
--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


NFS Caching broken in 4.19.37

2019-07-08 Thread Anton Ivanov

Hi list,

NFS caching appears broken in 4.19.37.

The more cores/threads, the easier it is to reproduce. Tested with identical 
results on Ryzen 1600 and 1600X.


1. Mount an openwrt build tree over NFS v4
2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in 
a loop

3. Result after 3-4 iterations:

State on the client

ls -laF 
/var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm


total 8
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../

State as seen on the server (mounted via nfs from localhost):

ls -laF 
/var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm

total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../
-rw-r--r-- 1 anivanov anivanov   32 Jul  8 11:40 ipcbuf.h

Actual state on the filesystem:

ls -laF 
/exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm

total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../
-rw-r--r-- 1 anivanov anivanov   32 Jul  8 11:40 ipcbuf.h

So the client has quite clearly lost the plot. Telling it to drop caches 
and re-reading the directory shows the file present.


It is possible to reproduce this using a Linux kernel tree too; it just 
takes many more iterations, 10 or more.


Both client and server run 4.19.37 from Debian buster. This is filed as 
Debian bug 931500. I originally thought it to be autofs related, but 
IMHO it is actually something fundamentally broken in NFS caching, 
resulting in cache corruption.


--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [PATCH] x86: Hide the int3_emulate_call/jmp functions from UML

2019-05-20 Thread Anton Ivanov




On 11/05/2019 13:39, Steven Rostedt wrote:


From: "Steven Rostedt (VMware)" 

User Mode Linux does not have access to the ip or sp fields of the
pt_regs, and accessing them causes UML to fail to build. Hide the
int3_emulate_jmp() and int3_emulate_call() instructions from UML, as it
doesn't need them anyway.

Reported-by: kbuild test robot 
Signed-off-by: Steven Rostedt (VMware) 
---

[ I added this to my queue to test too ]

  arch/x86/include/asm/text-patching.h | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/text-patching.h
b/arch/x86/include/asm/text-patching.h index 05861cc08787..0bbb07eaed6b
100644 --- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -39,6 +39,7 @@ extern int poke_int3_handler(struct pt_regs *regs);
  extern void *text_poke_bp(void *addr, const void *opcode, size_t len,
void *handler); extern int after_bootmem;
  
+#ifndef CONFIG_UML_X86

  static inline void int3_emulate_jmp(struct pt_regs *regs, unsigned
long ip) {
regs->ip = ip;
@@ -65,6 +66,7 @@ static inline void int3_emulate_call(struct pt_regs
*regs, unsigned long func) int3_emulate_push(regs, regs->ip -
INT3_INSN_SIZE + CALL_INSN_SIZE); int3_emulate_jmp(regs, func);
  }
-#endif
+#endif /* CONFIG_X86_64 */
+#endif /* !CONFIG_UML_X86 */
  
  #endif /* _ASM_X86_TEXT_PATCHING_H */



The patch has been garbled by an auto-wrap. Can you resend it, please?

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [RESEND PATCH 4/4] um: irq: don't set the chip for all irqs

2019-05-10 Thread Anton Ivanov



On 10/05/2019 17:20, Bartosz Golaszewski wrote:

On Fri, 10 May 2019 at 11:16, Bartosz Golaszewski wrote:

On Wed, 8 May 2019 at 09:13, Richard Weinberger wrote:

- Original Message -

Can you please check?
This patch is already queued in -next. So we need to decide whether to
revert or fix it now.


I am looking at it. It passed tests in my case (I did the usual round).

It works here too. That's why I never noticed.
Yesterday I noticed just because I looked for something else in the kernel logs.

Thanks,
//richard

Hi,

sorry for the late reply - I just came back from vacation.

I see it here too, I'll check if I can find the culprit and fix it today.

Bart

Hi Richard, Anton,

I'm not sure yet what this is caused by. It doesn't seem to break
anything for me but since it's a new error message I guess it's best
to revert this patch (others are good) and revisit it for v5.3.


Can you send me your command line and .config so I can try to reproduce it.

Brgds,



Bart


--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/



Re: [RESEND PATCH 4/4] um: irq: don't set the chip for all irqs

2019-05-08 Thread Anton Ivanov

On 07/05/2019 22:26, Richard Weinberger wrote:

On Thu, Apr 11, 2019 at 11:50 AM Bartosz Golaszewski  wrote:


From: Bartosz Golaszewski 

Setting a chip for an interrupt marks it as allocated. Since UM doesn't
support dynamic interrupt numbers (yet), it means we cannot simply
increase NR_IRQS and then use the free irqs between LAST_IRQ and NR_IRQS
with gpio-mockup or iio testing drivers as irq_alloc_descs() will fail
after being able neither to find an unallocated range of interrupts
nor to expand the range.

Only call irq_set_chip_and_handler() for irqs until LAST_IRQ.

Signed-off-by: Bartosz Golaszewski 
Reviewed-by: Anton Ivanov 
Acked-by: Anton Ivanov 


Just noticed that this triggers the following errors during bootup:
Trying to reregister IRQ 11 FD 8 TYPE 0 ID   (null)
write_sigio_irq : um_request_irq failed, err = -16
Trying to reregister IRQ 11 FD 8 TYPE 0 ID   (null)
write_sigio_irq : um_request_irq failed, err = -16
VFS: Mounted root (hostfs filesystem) readonly on

Can you please check?
This patch is already queued in -next. So we need to decide whether to
revert or fix it now.


I am looking at it. It passed tests in my case (I did the usual round).

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [PATCH 14/15] um: switch to generic version of pte allocation

2019-05-03 Thread Anton Ivanov




On 02/05/2019 16:28, Mike Rapoport wrote:

um allocates PTE pages with __get_free_page() and uses
GFP_KERNEL | __GFP_ZERO for the allocations.

Switch it to the generic version that does exactly the same thing for the
kernel page tables and adds __GFP_ACCOUNT for the user PTEs.

The pte_free() and pte_free_kernel() versions are identical to the generic
ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
  arch/um/include/asm/pgalloc.h | 16 ++--
  arch/um/kernel/mem.c  | 22 --
  2 files changed, 2 insertions(+), 36 deletions(-)

diff --git a/arch/um/include/asm/pgalloc.h b/arch/um/include/asm/pgalloc.h
index 99eb568..d7b282e 100644
--- a/arch/um/include/asm/pgalloc.h
+++ b/arch/um/include/asm/pgalloc.h
@@ -10,6 +10,8 @@
  
  #include 
  
+#include 	/* for pte_{alloc,free}_one */

+
  #define pmd_populate_kernel(mm, pmd, pte) \
set_pmd(pmd, __pmd(_PAGE_TABLE + (unsigned long) __pa(pte)))
  
@@ -25,20 +27,6 @@

  extern pgd_t *pgd_alloc(struct mm_struct *);
  extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
  
-extern pte_t *pte_alloc_one_kernel(struct mm_struct *);

-extern pgtable_t pte_alloc_one(struct mm_struct *);
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_page((unsigned long) pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   pgtable_page_dtor(pte);
-   __free_page(pte);
-}
-
  #define __pte_free_tlb(tlb,pte, address)  \
  do {  \
pgtable_page_dtor(pte); \
diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 99aa11b..2280374 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -215,28 +215,6 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd)
free_page((unsigned long) pgd);
  }
  
-pte_t *pte_alloc_one_kernel(struct mm_struct *mm)

-{
-   pte_t *pte;
-
-   pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_ZERO);
-   return pte;
-}
-
-pgtable_t pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *pte;
-
-   pte = alloc_page(GFP_KERNEL|__GFP_ZERO);
-   if (!pte)
-   return NULL;
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
-   return pte;
-}
-
  #ifdef CONFIG_3_LEVEL_PGTABLES
  pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
  {




Reviewed-by: Anton Ivanov 
Acked-by: Anton Ivanov 

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [PATCH] um: Do not unlock mutex that is not hold.

2019-04-05 Thread Anton Ivanov




On 02/04/2019 09:43, Daniel Walter wrote:

  Return error instead of trying to unlock a mutex that is not held.

Signed-off-by: Daniel Walter 
---
  arch/um/drivers/ubd_kern.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index aca09be2373e..33c1cd6a12ac 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -276,14 +276,14 @@ static int ubd_setup_common(char *str, int *index_out, char **error_out)
str++;
if(!strcmp(str, "sync")){
global_openflags = of_sync(global_openflags);
-   goto out1;
+   return err;
}
  
  		err = -EINVAL;

major = simple_strtoul(str, , 0);
if((*end != '\0') || (end == str)){
*error_out = "Didn't parse major number";
-   goto out1;
+   return err;
}
  
  		mutex_lock(_lock);
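
The rule the patch applies is that an error path may only jump to the unlock
label once the lock has actually been taken. A small user-space sketch of the
same control flow, using a pthread mutex in place of the kernel mutex, is
shown below. table_set(), table_lock and table are hypothetical names chosen
for the example:

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int table[16];

/* Hypothetical parser: store `value` in the slot named by `str`. */
int table_set(const char *str, int value)
{
	char *end;
	unsigned long idx;
	int err = -EINVAL;

	idx = strtoul(str, &end, 0);
	if (*end != '\0' || end == str)
		return err;		/* lock not taken yet: plain return */

	pthread_mutex_lock(&table_lock);
	if (idx >= 16)
		goto out_unlock;	/* lock is held: unlock on the way out */
	table[idx] = value;
	err = 0;
out_unlock:
	pthread_mutex_unlock(&table_lock);
	return err;
}
```

Failures detected before the lock is acquired return directly, while those
detected afterwards go through the single unlock label.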




Reviewed-by: Anton Ivanov 
Acked-by: Anton Ivanov 

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/


Re: [RESEND PATCH 0/4] um: build and irq fixes

2019-04-03 Thread Anton Ivanov




On 03/04/2019 09:39, Bartosz Golaszewski wrote:

On Wed, 3 Apr 2019 at 10:39, Bartosz Golaszewski wrote:


From: Bartosz Golaszewski 

I've previously sent these patches separately. I still don't see them
in next and I don't know what the policy is for picking up uml patches
but I thought I'd resend them rebased together on top of v5.1-rc3.






I test and ack stuff to the extent I can (especially the areas which I 
have worked on recently). Richard has the final say for what goes in on 
the next merge and he does it based on his own and my testing and/or 
markings in patchwork.




And of course I forgot to pick up acks from Anton...


Indeed - I have acked some of these :)




Bartosz Golaszewski (4):
   um: remove unused variable
   um: remove uses of variable length arrays
   um: define set_pte_at() as a static inline function, not a macro
   um: irq: don't set the chip for all irqs

  arch/um/include/asm/pgtable.h |  7 ++-
  arch/um/kernel/irq.c  |  2 +-
  arch/um/kernel/skas/uaccess.c |  1 -
  arch/um/os-Linux/umid.c   | 36 ++-
  4 files changed, 34 insertions(+), 12 deletions(-)

--
2.21.0






--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [PATCH] um: irq: don't set the chip for all irqs

2019-03-15 Thread Anton Ivanov




On 14/03/2019 15:03, Bartosz Golaszewski wrote:

From: Bartosz Golaszewski 

Setting a chip for an interrupt marks it as allocated. Since UM doesn't
support dynamic interrupt numbers (yet), it means we cannot simply
increase NR_IRQS and then use the free irqs between LAST_IRQ and NR_IRQS
with gpio-mockup or iio testing drivers as irq_alloc_descs() will fail
after being able neither to find an unallocated range of interrupts
nor to expand the range.

Only call irq_set_chip_and_handler() for irqs until LAST_IRQ.

Signed-off-by: Bartosz Golaszewski 
---
Note: I plan to introduce support for SPARSE_IRQ but AFAICT it will be
a bit more complicated, so in the meantime I'd like to propose this change.

  arch/um/kernel/irq.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c
index f4874b7ec503..598d7b3d9355 100644
--- a/arch/um/kernel/irq.c
+++ b/arch/um/kernel/irq.c
@@ -479,7 +479,7 @@ void __init init_IRQ(void)
irq_set_chip_and_handler(TIMER_IRQ, _irq_type, handle_edge_irq);
  
  
-	for (i = 1; i < NR_IRQS; i++)

+   for (i = 1; i < LAST_IRQ; i++)
irq_set_chip_and_handler(i, _irq_type, handle_edge_irq);
/* Initialize EPOLL Loop */
os_setup_epoll();



Reviewed-by: Anton Ivanov 
Acked-by: Anton Ivanov 

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [PATCH] um: remove uses of variable length arrays

2019-03-14 Thread Anton Ivanov



On 14/03/2019 13:33, Bartosz Golaszewski wrote:

On Wed, 13 Mar 2019 at 10:45, Anton Ivanov wrote:

On 12/03/2019 13:30, Bartosz Golaszewski wrote:

From: Bartosz Golaszewski 

While the affected code is run in user-mode, the build still warns
about it. Convert all uses of VLA to dynamic allocations.

Signed-off-by: Bartosz Golaszewski 
---
   arch/um/os-Linux/umid.c | 36 +++-
   1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/arch/um/os-Linux/umid.c b/arch/um/os-Linux/umid.c
index 998fbb445458..e261656fe9d7 100644
--- a/arch/um/os-Linux/umid.c
+++ b/arch/um/os-Linux/umid.c
@@ -135,12 +135,18 @@ static int remove_files_and_dir(char *dir)
*/
   static inline int is_umdir_used(char *dir)
   {
- char file[strlen(uml_dir) + UMID_LEN + sizeof("/pid\0")];
- char pid[sizeof("n\0")], *end;
+ char pid[sizeof("n\0")], *end, *file;
   int dead, fd, p, n, err;
+ size_t filelen;

- n = snprintf(file, sizeof(file), "%s/pid", dir);
- if (n >= sizeof(file)) {
+ err = asprintf(, "%s/pid", dir);
+ if (err < 0)
+ return 0;
+
+ filelen = strlen(file);
+
+ n = snprintf(file, filelen, "%s/pid", dir);
+ if (n >= filelen) {
   printk(UM_KERN_ERR "is_umdir_used - pid filename too long\n");
   err = -E2BIG;
   goto out;
@@ -185,6 +191,7 @@ static inline int is_umdir_used(char *dir)
   out_close:
   close(fd);
   out:
+ free(file);
   return 0;
   }

@@ -210,18 +217,21 @@ static int umdir_take_if_dead(char *dir)

   static void __init create_pid_file(void)
   {
- char file[strlen(uml_dir) + UMID_LEN + sizeof("/pid\0")];
- char pid[sizeof("n\0")];
+ char pid[sizeof("n\0")], *file;
   int fd, n;

- if (umid_file_name("pid", file, sizeof(file)))
+ file = malloc(strlen(uml_dir) + UMID_LEN + sizeof("/pid\0"));
+ if (!file)
   return;

+ if (umid_file_name("pid", file, sizeof(file)))
+ goto out;
+
   fd = open(file, O_RDWR | O_CREAT | O_EXCL, 0644);
   if (fd < 0) {
   printk(UM_KERN_ERR "Open of machine pid file \"%s\" failed: "
  "%s\n", file, strerror(errno));
- return;
+ goto out;
   }

   snprintf(pid, sizeof(pid), "%d\n", getpid());
@@ -231,6 +241,8 @@ static void __init create_pid_file(void)
  errno);

   close(fd);
+out:
+ free(file);
   }

   int __init set_umid(char *name)
@@ -385,13 +397,19 @@ __uml_setup("uml_dir=", set_uml_dir,

   static void remove_umid_dir(void)
   {
- char dir[strlen(uml_dir) + UMID_LEN + 1], err;
+ char *dir, err;
+
+ dir = malloc(strlen(uml_dir) + UMID_LEN + 1);
+ if (!dir)
+ return;

   sprintf(dir, "%s%s", uml_dir, umid);
   err = remove_files_and_dir(dir);
   if (err)
   os_warn("%s - remove_files_and_dir failed with err = %d\n",
   __func__, err);
+
+ free(dir);
   }

   __uml_exitcall(remove_umid_dir);


Thanks for bringing it up. It helped me notice that this is actually broken.

A PID can have more than 5 digits nowadays.

--

Do you want to take this patch anyway and then apply the fix for the
array on top of that or do you prefer it be fixed before that?

Bart


I am OK to take it as is and have the PID length fixed after that.

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/



Re: [PATCH] um: define set_pte_at() as a static inline function, not a macro

2019-03-13 Thread Anton Ivanov




On 13/03/2019 10:14, Bartosz Golaszewski wrote:

From: Bartosz Golaszewski 

When defined as macro, the mm argument is unused and subsequently the
variable passed as mm is considered unused by the compiler. This fixes
a build warning.

Signed-off-by: Bartosz Golaszewski 
---
  arch/um/include/asm/pgtable.h | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h
index 9c04562310b3..b377df76cc28 100644
--- a/arch/um/include/asm/pgtable.h
+++ b/arch/um/include/asm/pgtable.h
@@ -263,7 +263,12 @@ static inline void set_pte(pte_t *pteptr, pte_t pteval)
*pteptr = pte_mknewpage(*pteptr);
if(pte_present(*pteptr)) *pteptr = pte_mknewprot(*pteptr);
  }
-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)
+
+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *pteptr, pte_t pteval)
+{
+   set_pte(pteptr, pteval);
+}
  
  #define __HAVE_ARCH_PTE_SAME

  static inline int pte_same(pte_t pte_a, pte_t pte_b)
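
Why the macro form produces the warning can be seen in a reduced example.
The macro drops its first argument during expansion, so a variable passed
only there is never referenced, whereas a function parameter counts as a use.
All names below (struct ctx, store(), store_at_macro(), store_at(), demo())
are hypothetical and only mirror the shape of set_pte()/set_pte_at():

```c
/* Hypothetical stand-in for struct mm_struct. */
struct ctx {
	int id;
};

static inline void store(int *slot, int val)
{
	*slot = val;
}

/* Macro form: `c` vanishes during expansion, so the caller never uses it. */
#define store_at_macro(c, slot, val) store(slot, val)

/* Function form: `c` is a real parameter, so passing a variable is a use. */
static inline void store_at(struct ctx *c, int *slot, int val)
{
	(void)c;	/* intentionally unused inside the function */
	store(slot, val);
}

int demo(void)
{
	struct ctx c = { 1 };
	int slot = 0;

	/* Swapping in store_at_macro() here would leave `c` unreferenced. */
	store_at(&c, &slot, 42);
	return slot;
}
```

Building this with -Wall and replacing the call in demo() with
store_at_macro(&c, &slot, 42) should be enough to see an unused-variable
diagnostic for c, assuming a GCC-compatible compiler.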


Reviewed-by: Anton Ivanov 
Acked-by: Anton Ivanov 
--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [PATCH] um: remove uses of variable length arrays

2019-03-13 Thread Anton Ivanov

On 12/03/2019 13:30, Bartosz Golaszewski wrote:

From: Bartosz Golaszewski 

While the affected code is run in user-mode, the build still warns
about it. Convert all uses of VLA to dynamic allocations.

Signed-off-by: Bartosz Golaszewski 
---
  arch/um/os-Linux/umid.c | 36 +++-
  1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/arch/um/os-Linux/umid.c b/arch/um/os-Linux/umid.c
index 998fbb445458..e261656fe9d7 100644
--- a/arch/um/os-Linux/umid.c
+++ b/arch/um/os-Linux/umid.c
@@ -135,12 +135,18 @@ static int remove_files_and_dir(char *dir)
   */
  static inline int is_umdir_used(char *dir)
  {
-   char file[strlen(uml_dir) + UMID_LEN + sizeof("/pid\0")];
-   char pid[sizeof("n\0")], *end;
+   char pid[sizeof("n\0")], *end, *file;
int dead, fd, p, n, err;
+   size_t filelen;
  
-	n = snprintf(file, sizeof(file), "%s/pid", dir);

-   if (n >= sizeof(file)) {
+   err = asprintf(, "%s/pid", dir);
+   if (err < 0)
+   return 0;
+
+   filelen = strlen(file);
+
+   n = snprintf(file, filelen, "%s/pid", dir);
+   if (n >= filelen) {
printk(UM_KERN_ERR "is_umdir_used - pid filename too long\n");
err = -E2BIG;
goto out;
@@ -185,6 +191,7 @@ static inline int is_umdir_used(char *dir)
  out_close:
close(fd);
  out:
+   free(file);
return 0;
  }
  
@@ -210,18 +217,21 @@ static int umdir_take_if_dead(char *dir)
  
  static void __init create_pid_file(void)

  {
-   char file[strlen(uml_dir) + UMID_LEN + sizeof("/pid\0")];
-   char pid[sizeof("n\0")];
+   char pid[sizeof("n\0")], *file;
int fd, n;
  
-	if (umid_file_name("pid", file, sizeof(file)))

+   file = malloc(strlen(uml_dir) + UMID_LEN + sizeof("/pid\0"));
+   if (!file)
return;
  
+	if (umid_file_name("pid", file, sizeof(file)))

+   goto out;
+
fd = open(file, O_RDWR | O_CREAT | O_EXCL, 0644);
if (fd < 0) {
printk(UM_KERN_ERR "Open of machine pid file \"%s\" failed: "
   "%s\n", file, strerror(errno));
-   return;
+   goto out;
}
  
  	snprintf(pid, sizeof(pid), "%d\n", getpid());

@@ -231,6 +241,8 @@ static void __init create_pid_file(void)
   errno);
  
  	close(fd);

+out:
+   free(file);
  }
  
  int __init set_umid(char *name)

@@ -385,13 +397,19 @@ __uml_setup("uml_dir=", set_uml_dir,
  
  static void remove_umid_dir(void)

  {
-   char dir[strlen(uml_dir) + UMID_LEN + 1], err;
+   char *dir, err;
+
+   dir = malloc(strlen(uml_dir) + UMID_LEN + 1);
+   if (!dir)
+   return;
  
  	sprintf(dir, "%s%s", uml_dir, umid);

err = remove_files_and_dir(dir);
if (err)
os_warn("%s - remove_files_and_dir failed with err = %d\n",
__func__, err);
+
+   free(dir);
  }
  
  __uml_exitcall(remove_umid_dir);
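
One thing to watch when converting a VLA to a heap buffer is that
sizeof(file) stops reporting the buffer length once file is a pointer, so the
length has to be carried in its own variable. A minimal sketch of that
pattern follows. join_path() is a hypothetical helper, not a function from
umid.c:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<dir>/<name>" on the heap; the caller frees the result. */
char *join_path(const char *dir, const char *name)
{
	/* Keep the size in a variable: sizeof(path) would be the pointer size. */
	size_t len = strlen(dir) + 1 + strlen(name) + 1;
	char *path = malloc(len);
	int n;

	if (path == NULL)
		return NULL;

	n = snprintf(path, len, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= len) {
		free(path);
		return NULL;
	}
	return path;
}
```

The same len value is used for both the allocation and the bounds check, so
the two cannot drift apart.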




Thanks for bringing it up. It helped me notice that this is actually broken.

A PID can have more than 5 digits nowadays.
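
As an illustration of the point about PID length, a buffer sized for five
digits overflows as soon as the PID reaches six or seven digits, which a
raised pid_max allows. One way to avoid guessing is to size the buffer for
any value of the integer type and check the snprintf() result. format_pid()
and PID_STR_MAX are hypothetical names for this sketch:

```c
#include <stdio.h>
#include <sys/types.h>

/* Room for any 64-bit signed decimal value, a sign, a newline and a NUL. */
#define PID_STR_MAX 24

/* Write "<pid>\n" into buf; return its length, or -1 if it does not fit. */
int format_pid(char *buf, size_t size, pid_t pid)
{
	int n = snprintf(buf, size, "%ld\n", (long)pid);

	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}
```

A caller would declare char pid[PID_STR_MAX] and treat a negative return as
an error instead of writing a truncated PID to the file.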

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [PATCH] um: remove unused variable

2019-03-13 Thread Anton Ivanov




On 12/03/2019 13:30, Bartosz Golaszewski wrote:

From: Bartosz Golaszewski 

The buf variable is unused. Remove it.

Signed-off-by: Bartosz Golaszewski 
---
  arch/um/kernel/skas/uaccess.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/arch/um/kernel/skas/uaccess.c b/arch/um/kernel/skas/uaccess.c
index 7f06fdbc7ee1..bd3cb694322c 100644
--- a/arch/um/kernel/skas/uaccess.c
+++ b/arch/um/kernel/skas/uaccess.c
@@ -59,7 +59,6 @@ static pte_t *maybe_map(unsigned long virt, int is_write)
  static int do_op_one_page(unsigned long addr, int len, int is_write,
 int (*op)(unsigned long addr, int len, void *arg), void *arg)
  {
-   jmp_buf buf;
struct page *page;
pte_t *pte;
int n;


Reviewed-by: Anton Ivanov 
Acked-by: Anton Ivanov 
--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


Re: [-next] um: Remove duplicated include from vector_user.c

2019-01-20 Thread Anton Ivanov




On 1/3/19 3:12 AM, YueHaibing wrote:

Remove duplicated include.

Signed-off-by: YueHaibing 
---
  arch/um/drivers/vector_user.c | 3 ---
  1 file changed, 3 deletions(-)

diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index d2c17dd..b3f7b3c 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -16,14 +16,12 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include 
@@ -31,7 +29,6 @@
  #include 
  #include 
  #include 
-#include 
  #include "vector_user.h"
  
  #define ID_GRE 0




Reviewed-by: Anton Ivanov 
Acked-by: Anton Ivanov 
--
Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/


Re: [PATCH] um: Remove duplicate headers

2019-01-18 Thread Anton Ivanov




On 18/01/2019 19:42, Richard Weinberger wrote:

Am Freitag, 18. Januar 2019, 20:23:07 CET schrieb Anton Ivanov:


On 18/01/2019 14:58, Sabyasachi Gupta wrote:

Remove sys/socket.h and sys/uio.h which are included more than once

Signed-off-by: Sabyasachi Gupta 
---
   arch/um/drivers/vector_user.c | 2 --
   1 file changed, 2 deletions(-)

diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index d2c17dd..c863921 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -23,7 +23,6 @@
   #include 
   #include 
   #include 
-#include 
   #include 
   #include 
   #include 
@@ -31,7 +30,6 @@
   #include 
   #include 
   #include 
-#include 
   #include "vector_user.h"
   
   #define ID_GRE 0




Hi Sabyasachi,

I believe we have an identical patch enqueued already from a couple of
weeks back.


Hmm, did I miss that one in patchwork?

Thanks,
//richard





https://patchwork.ozlabs.org/patch/1020163/

A.


Re: [PATCH] um: Remove duplicate headers

2019-01-18 Thread Anton Ivanov




On 18/01/2019 14:58, Sabyasachi Gupta wrote:

Remove sys/socket.h and sys/uio.h which are included more than once

Signed-off-by: Sabyasachi Gupta 
---
  arch/um/drivers/vector_user.c | 2 --
  1 file changed, 2 deletions(-)

diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index d2c17dd..c863921 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -23,7 +23,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include 
@@ -31,7 +30,6 @@
  #include 
  #include 
  #include 
-#include 
  #include "vector_user.h"
  
  #define ID_GRE 0




Hi Sabyasachi,

I believe we have an identical patch enqueued already from a couple of 
weeks back.


Best Regards,

A.


Re: [PATCH] um: writev needs <sys/uio.h>

2019-01-02 Thread Anton Ivanov




On 12/27/18 7:33 AM, Christoph Hellwig wrote:

vector_user.c doesn't compile without this for me.

Signed-off-by: Christoph Hellwig 
---
  arch/um/drivers/vector_user.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index 3d8cdbdb4e66..41eefbcdc86f 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -25,6 +25,7 @@
  #include 
  #include 
  #include 
+#include <sys/uio.h>
  #include 
  #include 
  #include 


Acked-by: Anton Ivanov 
--
Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/


Re: [PATCH 2/2] um: vector: Use 'kmalloc_array' instead of 'kmalloc'

2019-01-02 Thread Anton Ivanov




On 12/26/18 7:54 AM, Christophe JAILLET wrote:

Use 'kmalloc_array' instead of 'kmalloc' when appropriate.

Signed-off-by: Christophe JAILLET 
---
I don't know why it has not already been replaced in 6da2ec56059c
("treewide: kmalloc() -> kmalloc_array()").
---
  arch/um/drivers/vector_kern.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/um/drivers/vector_kern.c b/arch/um/drivers/vector_kern.c
index 5b917716289d..dee5246bda81 100644
--- a/arch/um/drivers/vector_kern.c
+++ b/arch/um/drivers/vector_kern.c
@@ -507,12 +507,12 @@ static struct vector_queue *create_queue(
return NULL;
result->max_depth = max_size;
result->dev = vp->dev;
-   result->mmsg_vector = kmalloc(
-   (sizeof(struct mmsghdr) * max_size), GFP_KERNEL);
+   result->mmsg_vector = kmalloc_array(max_size, sizeof(struct mmsghdr),
+   GFP_KERNEL);
if (result->mmsg_vector == NULL)
goto out_mmsg_fail;
-   result->skbuff_vector = kmalloc(
-   (sizeof(void *) * max_size), GFP_KERNEL);
+   result->skbuff_vector = kmalloc_array(max_size, sizeof(void *),
+ GFP_KERNEL);
if (result->skbuff_vector == NULL)
    goto out_skb_fail;
  



Acked-by: Anton Ivanov 
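
For readers unfamiliar with why the array form is preferred: the open-coded sizeof(x) * n can wrap around for a large n, whereas an array allocator can refuse the request. A minimal userspace sketch of that check follows; alloc_array() is a hypothetical analogue written for illustration, not the kernel implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of an overflow-checked array allocation.
 * Returns NULL instead of allocating a short buffer when n * size wraps.
 */
static void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would overflow size_t */
	return malloc(n * size);
}
```

A call such as alloc_array(max_size, sizeof(void *)) behaves like the plain multiplication for sane sizes and fails cleanly for absurd ones.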

--
Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/


Re: [RFC v3 11/19] kunit: add Python libraries for handing KUnit config and kernel

2018-12-11 Thread Anton Ivanov



On 12/11/18 2:41 PM, Steven Rostedt wrote:

On Tue, 11 Dec 2018 15:09:26 +0100
Petr Mladek  wrote:


We have liburcu already, which is good.  The main sticking points are:

  - printk has started adding a lot of %pX enhancements which printf
obviously doesn't know about.

I wonder how big problem it is and if it is worth using another
approach.

No, please do not change the %pX approach.


An alternative would be to replace them with helper functions
that would produce the same string. The meaning would be easier
to understand. But concatenating with the surrounding text
would be less elegant. People might start using pr_cont()
that is problematic (mixed lines).

Also the %pX formats are mostly used to print context of some
structures. Even the helper functions would need some maintenance
to keep them compatible.

BTW: The printk() feature has been introduced 10 years ago by
the commit 4d8a743cdd2690c0bc8 ("vsprintf: add infrastructure
support for extended '%p' specifiers").

trace-cmd and perf know about most of the %pX data and how to read it.
Perhaps we can extend the libtraceevent library to export a generic way
to read data from printk() output for other tools to use.


Going back for a second to using UML for this. UML console at present is 
interrupt driven - it emulates serial IO using several different 
back-ends (file descriptors, xterm or actual tty/ptys). Epoll events on 
the host side are used to trigger the UML interrupts - both read and write.


This works OK for normal use, but may result in all kinds of interesting 
false positives/false negatives when UML is used to run unit tests 
against a change which changes interrupt behavior.


IMO it may be useful to consider some alternatives specifically for unit 
test coverage purposes where printk and/or the whole console output 
altogether bypass some of the IRQ driven semantics.


--

Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/



Re: [RFC v3 01/19] kunit: test: add KUnit test runner core

2018-12-05 Thread Anton Ivanov

On 30/11/2018 03:14, Luis Chamberlain wrote:

On Wed, Nov 28, 2018 at 11:36:18AM -0800, Brendan Higgins wrote:

+#define module_test(module) \
+   static int module_kunit_init##module(void) \
+   { \
+   return kunit_run_tests(); \
+   } \
+   late_initcall(module_kunit_init##module)

Herein lies an assumption that suffices. I'm inclined to believe we
need a new initcall level here to ensure we *do* run after all the
respective kernel init calls. Otherwise we're left at the whims of
link order for kunit. For instance if a kunit test relies on frameworks
which are also late_initcall() we'd have complete incompatibility with
anything linked *after* kunit.


diff --git a/kunit/Kconfig b/kunit/Kconfig
new file mode 100644
index 0..49b44c4f6630a
--- /dev/null
+++ b/kunit/Kconfig
@@ -0,0 +1,17 @@
+#
+# KUnit base configuration
+#
+
+menu "KUnit support"
+
+config KUNIT
+   bool "Enable support for unit tests (KUnit)"
+   depends on UML

Consider using:

if UML
...
endif

That allows the depends to be done once.


+   help
+ Enables support for kernel unit tests (KUnit), a lightweight unit
+ testing and mocking framework for the Linux kernel. These tests are
+ able to be run locally on a developer's workstation without a VM or
+ special hardware.


Some mention of UML may be good here?


For more information, please see
+ Documentation/kunit/
+
+endmenu

I'm a bit conflicted here. This currently depends on UML but yet you
noted on RFC v2 that your intention is to liberate kunit from UML and
ideally allow unit tests to depend only on userspace. I've addressed
tests using both selftests kernels drivers and also re-written kernel
APIs to userspace to test there. I think we may need to live with both.

Then for the UML stuff, I think if we *really* accept that UML will
always be a viable option we should probably consider now throwing these
things under drivers/platform/uml/. This follows the pattern of arch
specific drivers. Whether or not we end up with a complete userspace


UML platform drivers predate that and are under arch/um/drivers/

We should either keep to current convention or consider relocating the 
existing ones - having things spread in different places around the tree 
is not good in the long run (UML already has a few of those under the 
x86 tree, let's not increase the number).



component independent of UML may implicate having a shared component
somewhere else.

Likewise, I realize the goal is to *avoid* using a virtual machine for
these tests, but would it in any way make sense to share kunit to be
supported for other architectures to allow easier-to-write tests as
well?

   Luis




--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661



Re: [PATCH] um: NULL check before kfree is not needed

2018-08-04 Thread Anton Ivanov

On 03/08/18 07:39, YueHaibing wrote:


kfree(NULL) is safe, so this removes the NULL checks before freeing the memory

Signed-off-by: YueHaibing 
---
  arch/um/drivers/vector_kern.c | 15 +--
  arch/um/drivers/vector_user.c |  6 ++
  arch/um/kernel/irq.c  |  3 +--
  3 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/arch/um/drivers/vector_kern.c b/arch/um/drivers/vector_kern.c
index 50ee3bb..c84133c 100644
--- a/arch/um/drivers/vector_kern.c
+++ b/arch/um/drivers/vector_kern.c
@@ -1118,16 +1118,11 @@ static int vector_net_close(struct net_device *dev)
os_close_file(vp->fds->tx_fd);
vp->fds->tx_fd = -1;
}
-   if (vp->bpf != NULL)
-   kfree(vp->bpf);
-   if (vp->fds->remote_addr != NULL)
-   kfree(vp->fds->remote_addr);
-   if (vp->transport_data != NULL)
-   kfree(vp->transport_data);
-   if (vp->header_rxbuffer != NULL)
-   kfree(vp->header_rxbuffer);
-   if (vp->header_txbuffer != NULL)
-   kfree(vp->header_txbuffer);
+   kfree(vp->bpf);
+   kfree(vp->fds->remote_addr);
+   kfree(vp->transport_data);
+   kfree(vp->header_rxbuffer);
+   kfree(vp->header_txbuffer);
if (vp->rx_queue != NULL)
destroy_queue(vp->rx_queue);
if (vp->tx_queue != NULL)
diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index 4d6a78e..3d8cdbd 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -267,8 +267,7 @@ static struct vector_fds *user_init_raw_fds(struct arglist 
*ifspec)
os_close_file(rxfd);
if (txfd >= 0)
os_close_file(txfd);
-   if (result != NULL)
-   kfree(result);
+   kfree(result);
return NULL;
  }
  
@@ -434,8 +433,7 @@ static struct vector_fds *user_init_socket_fds(struct arglist *ifspec, int id)

if (fd >= 0)
os_close_file(fd);
if (result != NULL) {
-   if (result->remote_addr != NULL)
-   kfree(result->remote_addr);
+   kfree(result->remote_addr);
kfree(result);
}
return NULL;
diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c
index 6b7f382..8360fa3 100644
--- a/arch/um/kernel/irq.c
+++ b/arch/um/kernel/irq.c
@@ -244,8 +244,7 @@ static void garbage_collect_irq_entries(void)
to_free = NULL;
}
walk = walk->next;
-   if (to_free != NULL)
-   kfree(to_free);
+   kfree(to_free);
}
  }
  
kfree in both slab and slob checks for NULL before freeing, so this is 
correct. Thanks for noticing.
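
The same property holds for free() in userspace C, where free(NULL) is defined to do nothing. A minimal sketch of an unguarded cleanup path, using a hypothetical struct bufs as a stand-in for the driver's private data:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a structure holding optional buffers. */
struct bufs {
	char *header_rx;
	char *header_tx;
};

/* free(NULL) is a no-op, so no "if (ptr != NULL)" guards are needed. */
static void release_bufs(struct bufs *b)
{
	free(b->header_rx);
	free(b->header_tx);
	b->header_rx = NULL;
	b->header_tx = NULL;
}
```

Calling release_bufs() on a structure where only some of the buffers were ever allocated, or calling it twice, is harmless.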


Richard, please apply,

A.


Re: [uml-devel] [REVIEW][PATCH 19/22] signal/um: Use force_sig_fault in relay_signal.

2018-04-24 Thread Anton Ivanov

Hi Richard,

There was a post to uml-devel during the days when the sourceforge 
mailing list was working in random drop mode which claimed that "this 
fixes the arm build".


I have not kept it locally and I do not see it in the archive (I do not see 
a few other posts there either - including some of mine).


The joys of having a broken list :(

Whoever posted it, if you are reading it, please re-post again so we can 
have a look.


In the meantime we are as you said - x86 only.

A.

On 04/24/18 09:32, Richard Weinberger wrote:

On Fri, Apr 20, 2018 at 6:06 PM, Anton Ivanov
<anton.iva...@kot-begemot.co.uk> wrote:

On 04/20/18 15:38, Eric W. Biederman wrote:

Today user mode linux only works on x86 and x86_64 and this allows
simplifications of relay_signal.


I believe someone recently fixed the ARM port. I have not had the time to
try the fixes though.

Huh? UML has been x86 only for ages.





Re: [uml-devel] [REVIEW][PATCH 19/22] signal/um: Use force_sig_fault in relay_signal.

2018-04-20 Thread Anton Ivanov


On 04/20/18 15:38, Eric W. Biederman wrote:

Today user mode linux only works on x86 and x86_64 and this allows
simplifications of relay_signal.


I believe someone recently fixed the ARM port. I have not had the time 
to try the fixes though.


I have added the new list we are migrating to the cc list.

A.





- x86 always set si_errno to 0 in fault handlers.
- x86 does not implement si_trapno.
- Only si_codes between SI_USER and SI_KERNEL have a fault address.

Therefore warn if si_errno is set (it should never be).
Use force_sig_info in the case where we know we have a good fault.

For signals whose content it is not clear how to relay use plain
force_sig and let the signal sending code come up with an
appropriate generic siginfo.

Cc: Jeff Dike 
Cc: Richard Weinberger 
Cc: user-mode-linux-de...@lists.sourceforge.net
Signed-off-by: "Eric W. Biederman" 
---
  arch/um/kernel/trap.c | 28 +---
  1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index d4d38520c4c6..5f0ff17cd790 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -296,9 +296,6 @@ unsigned long segv(struct faultinfo fi, unsigned long ip, 
int is_user,
  
  void relay_signal(int sig, struct siginfo *si, struct uml_pt_regs *regs)

  {
-   struct faultinfo *fi;
-   struct siginfo clean_si;
-
if (!UPT_IS_USER(regs)) {
if (sig == SIGBUS)
printk(KERN_ERR "Bus error - the host /dev/shm or /tmp "
@@ -308,29 +305,30 @@ void relay_signal(int sig, struct siginfo *si, struct 
uml_pt_regs *regs)
  
  	arch_examine_signal(sig, regs);
  
-	clear_siginfo(&clean_si);

-   clean_si.si_signo = si->si_signo;
-   clean_si.si_errno = si->si_errno;
-   clean_si.si_code = si->si_code;
+   if (unlikely(si->si_errno)) {
+   printk(KERN_ERR "Attempted to relay signal %d (si_code = %d) with 
errno %d\n",
+  sig, si->si_code, si->si_errno);
+   }
switch (sig) {
case SIGILL:
case SIGFPE:
case SIGSEGV:
case SIGBUS:
case SIGTRAP:
-   fi = UPT_FAULTINFO(regs);
-   clean_si.si_addr = (void __user *) FAULT_ADDRESS(*fi);
-   current->thread.arch.faultinfo = *fi;
-#ifdef __ARCH_SI_TRAPNO
-   clean_si.si_trapno = si->si_trapno;
-#endif
-   break;
+   if ((si->si_code > SI_USER) && (si->si_code < SI_KERNEL)) {
+   struct faultinfo *fi = UPT_FAULTINFO(regs);
+   current->thread.arch.faultinfo = *fi;
+   force_sig_fault(sig, si->si_code,
+   (void __user *)FAULT_ADDRESS(*fi),
+   current);
+   break;
+   }
default:
printk(KERN_ERR "Attempted to relay unknown signal %d (si_code = 
%d)\n",
sig, si->si_code);
}
  
-	force_sig_info(sig, &clean_si, current);

+   force_sig(sig, current);
  }
  
  void bus_handler(int sig, struct siginfo *si, struct uml_pt_regs *regs)




Re: [PATCH 1/9] um/drivers/vector_user: Delete unnecessary code in user_init_raw_fds()

2018-03-11 Thread Anton Ivanov

Thanks, well noted.

It still does not fix it completely though.

Re-reading the code, it will leak an fd if the allocation for result fails. 
The "return result;" there should be inside the conditional, falling through 
to the cleanup code if the allocation fails.
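
A minimal userspace sketch of the control flow meant here, under stated assumptions: struct fds, init_fds() and the simulate_oom flag are hypothetical and only mimic the shape of the function, with open()/malloc() standing in for the kernel-side calls.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the result structure. */
struct fds {
	int rx_fd;
};

/*
 * Open a descriptor, then allocate the result. The success return sits
 * inside the conditional, so an allocation failure falls through to the
 * cleanup code and the descriptor is closed instead of leaked.
 * simulate_oom forces the allocation to fail, for demonstration only.
 */
static struct fds *init_fds(const char *path, int simulate_oom)
{
	struct fds *result = NULL;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		goto cleanup;

	result = simulate_oom ? NULL : malloc(sizeof(*result));
	if (result != NULL) {
		result->rx_fd = fd;
		return result;
	}

cleanup:
	if (fd >= 0)
		close(fd);
	return NULL;
}
```

On the failure path the caller gets NULL and no descriptor stays open; on success the caller owns both the structure and the descriptor.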


A.


On 03/11/18 15:16, SF Markus Elfring wrote:

From: Markus Elfring 
Date: Sun, 11 Mar 2018 11:36:18 +0100

* One condition check could never be reached with a non-null pointer
   at the end of this function. Thus remove the corresponding statement.

* Delete an initialisation for the local variable "result"
   which became unnecessary with this refactoring.

Signed-off-by: Markus Elfring 
---
  arch/um/drivers/vector_user.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/um/drivers/vector_user.c b/arch/um/drivers/vector_user.c
index 4291f1a5d342..d6a6207d4061 100644
--- a/arch/um/drivers/vector_user.c
+++ b/arch/um/drivers/vector_user.c
@@ -211,7 +211,7 @@ static struct vector_fds *user_init_raw_fds(struct arglist 
*ifspec)
struct sockaddr_ll sock;
int err = -ENOMEM;
char *iface;
-   struct vector_fds *result = NULL;
+   struct vector_fds *result;
int optval = 1;
  
  
@@ -276,8 +276,6 @@ static struct vector_fds *user_init_raw_fds(struct arglist *ifspec)

os_close_file(rxfd);
if (txfd >= 0)
os_close_file(txfd);
-   if (result != NULL)
-   kfree(result);
return NULL;
  }
  


--
Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/



Re: [PATCH -next] um: vector: fix missing unlock on error in vector_net_open()

2018-01-05 Thread Anton Ivanov

Hi Wei,

I just double-checked.

This issue has been fixed in a patch submitted by Dan Carpenter on 09th 
Dec 2017 which I acknowledged on 11th Dec 2017 and which should be in 
Richard's queue to be applied.


It should at some point show up in linux-next.

Best Regards and once again, thanks for looking into it.

A.


On 01/05/18 07:22, Wei Yongjun wrote:

Add the missing unlock before return from function vector_net_open()
in the error handling case.

Fixes: ad1f62ab2bd4 ("High Performance UML Vector Network Driver")
Signed-off-by: Wei Yongjun 
---
  arch/um/drivers/vector_kern.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/um/drivers/vector_kern.c b/arch/um/drivers/vector_kern.c
index d1d5301..bb83a2d 100644
--- a/arch/um/drivers/vector_kern.c
+++ b/arch/um/drivers/vector_kern.c
@@ -1156,8 +1156,10 @@ static int vector_net_open(struct net_device *dev)
struct vector_device *vdevice;
  
  	spin_lock_irqsave(&vp->lock, flags);

-   if (vp->opened)
+   if (vp->opened) {
+   spin_unlock_irqrestore(&vp->lock, flags);
return -ENXIO;
+   }
vp->opened = true;
spin_unlock_irqrestore(&vp->lock, flags);




--
Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/



Re: [PATCH] [RFC] um: Convert ubd driver to blk-mq

2017-12-03 Thread Anton Ivanov
On 03/12/17 21:54, Richard Weinberger wrote:
> Christoph,
>
> Am Mittwoch, 29. November 2017, 22:46:51 CET schrieb Christoph Hellwig:
>> On Sun, Nov 26, 2017 at 02:10:53PM +0100, Richard Weinberger wrote:
>>> MAX_SG is 64, used for blk_queue_max_segments(). This comes from
>>> a0044bdf60c2 ("uml: batch I/O requests"). Is this still a good/sane
>>> value for blk-mq?
>> blk-mq itself doesn't change the tradeoff.
>>
>>> The driver does IO batching, for each request it issues many UML struct
>>> io_thread_req request to the IO thread on the host side.
>>> One io_thread_req per SG page.
>>> Before the conversion the driver used blk_end_request() to indicate that
>>> a part of the request is done.
>>> blk_mq_end_request() does not take a length parameter, therefore we can
>>> only mark the whole request as done. See the new is_last property on the
>>> driver.
>>> Maybe there is a way to partially end requests too in blk-mq?
>> You can, take a look at scsi_end_request which handles this for blk-mq
>> and the legacy layer.  That being said I wonder if batching really
>> makes that much sense if you execute each segment separately?
> Anton did a lot of performance improvements in this area.
> He has all the details.
> AFAIK batching brings us more throughput because in UML all IO is done by
> a different thread and the IPC has a certain overhead. 

The current UML disk IO is executed in a different thread, using a pipe
for IPC.

What batching helps with is the number of context switches and the number
of syscalls per IO operation.

The non-batching code used 6 syscalls per disk io operation: UML write
to IPC, disk thread read from IPC, actual disk IO, disk thread write to
IPC, (e)poll in UML IRQ controller emulation, UML read from IPC.

With batching this is reduced to 5 calls per batch + number of IO ops
batched. Under load the batches grow to usually 10-30 in size which
yields a syscall reduction of 3-5 times. My code sets the batch size
limit to 64 and you can hit that on some synthetic benchmarks like
dd-ing raw disks.
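
The syscall counts above can be turned into a quick back-of-the-envelope calculation. The sketch below assumes that "5 calls per batch + number of IO ops batched" means 5 + N syscalls for a batch of N operations, against 6 * N without batching. The function names are made up for illustration.

```c
#include <assert.h>

/* Without batching: 6 syscalls for every disk IO operation. */
static unsigned int syscalls_unbatched(unsigned int ops)
{
	return 6u * ops;
}

/* With batching (assumed reading): 5 syscalls per batch plus one per op. */
static unsigned int syscalls_batched(unsigned int ops)
{
	return 5u + ops;
}

/* How many times fewer syscalls a batch of `ops` operations needs. */
static double syscall_reduction(unsigned int ops)
{
	assert(ops > 0);
	return (double)syscalls_unbatched(ops) / (double)syscalls_batched(ops);
}
```

Under that reading a batch of 10 needs 15 syscalls instead of 60 (a factor of 4), a batch of 30 needs 35 instead of 180 (about 5.1), and a full batch of 64 needs 69 instead of 384 (about 5.6). The factor can never reach 6, however large the batch.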

There are further gains from latency reduction. The "round-trip" over the
IPC to tell the disk IO thread to perform an IO operation and to confirm
the results is also reduced if you manage to pass multiple events in one go.

All in all, the difference between batched and non-batched for heavy IO
load is a factor of several for the old blk code in UML. I need to do some
reading to get a better understanding of the new code, whether it needs
batching, and how to match it to the actual blk-mq semantics.

A.

>
>>> Another obstacle with IO batching is that UML IO thread requests can
>>> fail. Not only due to OOM, also because the pipe between the UML kernel
>>> process and the host IO thread can return EAGAIN.
>>> In this case the driver puts the request into a list and retries it
>>> later, when the pipe turns writable.
>>> I’m not sure whether this restart logic makes sense with blk-mq, maybe
>>> there is a way in blk-mq to put back a (partial) request?
>> blk_mq_requeue_request requeues requests that have been partially
>> executed (or not at all for that matter).
> Thanks this is what I needed.
> BTW: How can I know which blk functions are not usable in blk-mq?
> I didn't realize that I can use blk_update_request().
>
> Thanks,
> //richard
>
>

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661



Re: [uml-devel] [PATCH] [RFC] um: Convert ubd driver to blk-mq

2017-11-26 Thread Anton Ivanov
On 26/11/17 13:56, Richard Weinberger wrote:
> Anton,
>
> please don't crop the CC list.

Apologies, I wanted to keep the discussion UML side until we have come
up with something.

Will not do it again.

>
> On Sunday, 26 November 2017, 14:41:12 CET, Anton Ivanov wrote:
>> I need to do some reading on this.
>>
>> First of all - a stupid question: mq's primary advantage is in
>> multi-core systems as it improves io and core utilization. We are still
>> single-core in UML and AFAIK this is likely to stay that way, right?
> Well, someday blk-mq should completely replace the legacy block interface.
> Christoph asked me to convert the UML driver.
> Also to find corner cases in blk-mq.
>  
>> On 26/11/17 13:10, Richard Weinberger wrote:
>>> This is the first attempt to convert the UserModeLinux block driver
>>> (UBD) to blk-mq.
>>> While the conversion itself is rather trivial, a few questions
>>> popped up in my head. Maybe you can help me with them.
>>>
>>> MAX_SG is 64, used for blk_queue_max_segments(). This comes from
>>> a0044bdf60c2 ("uml: batch I/O requests"). Is this still a good/sane
>>> value for blk-mq?
>>>
>>> The driver does IO batching, for each request it issues many UML struct
>>> io_thread_req request to the IO thread on the host side.
>>> One io_thread_req per SG page.
>>> Before the conversion the driver used blk_end_request() to indicate that
>>> a part of the request is done.
>>> blk_mq_end_request() does not take a length parameter, therefore we can
>>> only mark the whole request as done. See the new is_last property on the
>>> driver.
>>> Maybe there is a way to partially end requests too in blk-mq?
>>>
>>> Another obstacle with IO batching is that UML IO thread requests can
>>> fail. Not only due to OOM, also because the pipe between the UML kernel
>>> process and the host IO thread can return EAGAIN.
>>> In this case the driver puts the request into a list and retries it
>>> later, when the pipe turns writable.
>>> I’m not sure whether this restart logic makes sense with blk-mq, maybe
>>> there is a way in blk-mq to put back a (partial) request?
>> This all sounds to me as if blk-mq requests need different inter-thread
>> IPC. We presently rely on the fact that each request to the IO thread is
>> fixed size and there is no natural request grouping coming from upper
>> layers.
>>
>> Unless I am missing something, this looks like we are now getting group
>> requests, right? We need to send a group at a time which is not
>> processed until the whole group has been received in the IO thread. We
>> can still batch groups though, but should not batch individual
>> requests, right?
> The question is, do we really need batching at all with blk-mq?
> Jeff implemented that 10 years ago.

Well, but in that case we need to match our IPC to the existing batching
in the block queue, right?

So my proposal still stands - I suggest we roll back my batching patch
which is no longer needed and change the IPC to match what is coming out
of blk-mq :)

>
>> My first step (before moving to mq) would have been to switch to a unix
>> domain socket pair probably using SOCK_SEQPACKET or SOCK_DGRAM. The
>> latter for a socket pair will return ENOBUFS if you try to push more than
>> the receiving side can handle so we should not have IPC message loss.
>> This way, we can push request groups naturally instead of relying on a
>> "last" flag and keeping track of that for "end of request".
> The pipe is currently a socketpair. UML just calls it "pipe". :-(

I keep forgetting if we applied that patch or not :)

It was a pipe once upon a time and I suggested we change it to a socket pair
due to better buffering behavior for lots of small requests.
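
To make the socket pair idea concrete, here is a minimal userspace sketch of a record-oriented channel built on socketpair() with SOCK_SEQPACKET, where each send() is delivered as exactly one message. `struct demo_req` and the `demo_*` helpers are hypothetical. The real io_thread_req layout and the ubd code look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical request record, for illustration only. */
struct demo_req {
	uint64_t offset;
	uint32_t length;
	uint32_t op;
};

/* Create a connected AF_UNIX SOCK_SEQPACKET pair: 0 on success, -1 on error. */
static int demo_channel_open(int fds[2])
{
	return socketpair(AF_UNIX, SOCK_SEQPACKET, 0, fds);
}

/* Send one record as one message: 0 on success, -1 otherwise. */
static int demo_send_req(int fd, const struct demo_req *req)
{
	ssize_t n;

	assert(req != NULL);
	n = send(fd, req, sizeof(*req), 0);
	return n == (ssize_t)sizeof(*req) ? 0 : -1;
}

/* Receive one message: 0 only if a complete record arrived, -1 otherwise. */
static int demo_recv_req(int fd, struct demo_req *req)
{
	ssize_t n;

	assert(req != NULL);
	n = recv(fd, req, sizeof(*req), 0);
	return n == (ssize_t)sizeof(*req) ? 0 : -1;
}
```

Because message boundaries are preserved, the receiver never sees half a record or two records glued together, which is what a byte-stream pipe cannot promise once record sizes vary.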

>
>> It will be easier to roll back the batching before we do that. Feel free
>> to roll back that commit.
>>
>> Once that is in, the whole batching will need to be redone as it should
>> account for variable IPC record size and use sendmmsg/recvmmsg pair -
>> same as in the vector IO. I am happy to do the honors on that one :)
> Let's see what block guys say.

Ack.
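
For reference, batching over a datagram-style socket with sendmmsg()/recvmmsg() could look roughly like the sketch below. It is Linux-specific and needs _GNU_SOURCE. The `demo_*` names are made up, and `DEMO_BATCH_MAX` of 64 is borrowed from the batch limit discussed in this thread. None of this is taken from the UML sources.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define DEMO_BATCH_MAX 64

/*
 * Send up to DEMO_BATCH_MAX variable-sized records, one message each,
 * with a single syscall. Returns the number of messages sent or -1.
 */
static int demo_send_batch(int fd, struct iovec *recs, unsigned int count)
{
	struct mmsghdr msgs[DEMO_BATCH_MAX];
	unsigned int i;

	assert(recs != NULL);
	if (count > DEMO_BATCH_MAX)
		count = DEMO_BATCH_MAX;
	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < count; i++) {
		msgs[i].msg_hdr.msg_iov = &recs[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}
	return sendmmsg(fd, msgs, count, 0);
}

/*
 * Receive whatever is queued, up to `count` messages, with a single
 * non-blocking syscall. bufs[i].iov_len is the capacity of buffer i;
 * lens[i] is set to the size of message i. Returns the number of
 * messages received or -1.
 */
static int demo_recv_batch(int fd, struct iovec *bufs, unsigned int *lens,
			   unsigned int count)
{
	struct mmsghdr msgs[DEMO_BATCH_MAX];
	unsigned int i;
	int got;

	assert(bufs != NULL && lens != NULL);
	if (count > DEMO_BATCH_MAX)
		count = DEMO_BATCH_MAX;
	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < count; i++) {
		msgs[i].msg_hdr.msg_iov = &bufs[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}
	got = recvmmsg(fd, msgs, count, MSG_DONTWAIT, NULL);
	for (i = 0; got > 0 && i < (unsigned int)got; i++)
		lens[i] = msgs[i].msg_len;
	return got;
}
```

Each record travels as its own message, so the per-message length reported in `msg_len` takes the place of a "last" flag for finding record boundaries.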

A.

>
> Thanks,
> //richard
>



Re: UM: Fine-tuning for some function implementations

2017-01-19 Thread Anton Ivanov

How about tackling some real problems and performance issues instead?

There are a few of those in the network, interrupt and memory 
subsystems. Take your pick.


A.


On 19/01/17 17:13, SF Markus Elfring wrote:

please don't send drive-by patches.

Would you dare to take another look at the published update steps
in any other software combination?

Regards,
Markus





Re: [uml-devel] kernel stalls in balance_dirty_pages_ratelimited()

2014-10-19 Thread Anton Ivanov
On 19/10/14 15:59, Thomas Meyer wrote:
> On Tuesday, 14.10.2014, at 08:31 +0100, Anton Ivanov wrote:
>> I see a very similar stall on writeout to ubd with my patches (easy) and 
>> without (difficult - takes running an IO soak for a few days).
>>
>> It stalls (usually) when trying to flush the journal file of ext4.
> Hi,
>
> here an extract of the trace of all writeback:* tracepoints:
>
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 19322/2052430   #P:1
> #
> #  _-=> irqs-off
> # / _=> need-resched
> #| / _---=> hardirq/softirq
> #|| / _--=> preempt-depth
> #||| / delay
> #   TASK-PID   CPU#  TIMESTAMP  FUNCTION
> #  | |   |      | |
>  yum-1553  [000]   1246.00: writeback_wake_background: 
> bdi 98:0
>  yum-1553  [000]   1246.00: balance_dirty_pages: bdi 
> 98:0: limit=24732 setpoint=16229 dirty=18446744073709551284 
> bdi_setpoint=16227 bdi_dirty=1 dirty_ratelimit=4 task_ratelimit=0 dirtied=1 
> dirtied_pause=0 paused=0 pause=10 period=10 think=0
> kworker/u2:0-2603  [000]   1246.00: global_dirty_state: 
> dirty=18446744073709551284 writeback=0 unstable=0 bg_thresh=5151 thresh=10303 
> limit=24732 dirtied=340953 written=358955
> kworker/u2:0-2603  [000]   1246.00: global_dirty_state: 
> dirty=18446744073709551284 writeback=0 unstable=0 bg_thresh=5151 thresh=10303 
> limit=24732 dirtied=340953 written=358955
> kworker/u2:0-2603  [000]   1246.00: writeback_start: bdi 98:0: 
> sb_dev 0:0 nr_pages=9223372036854775807 sync_mode=0 kupdate=0 range_cyclic=1 
> background=1 reason=background
> kworker/u2:0-2603  [000]   1246.00: writeback_queue_io: bdi 98:0: 
> older=4295061896 age=0 enqueue=1 reason=background
> kworker/u2:0-2603  [000]   1246.00: writeback_single_inode_start: 
> bdi 98:0: ino=29951 state=I_DIRTY_SYNC|I_DIRTY_PAGES|I_SYNC 
> dirtied_when=4295061896 age=8 index=9 to_write=1024 wrote=0
> kworker/u2:0-2603  [000]   1246.00: writeback_write_inode_start: 
> bdi 98:0: ino=29951 sync_mode=0
> kworker/u2:0-2603  [000]   1246.00: writeback_write_inode: bdi 
> 98:0: ino=29951 sync_mode=0
> kworker/u2:0-2603  [000]   1246.00: writeback_single_inode: bdi 
> 98:0: ino=29951 state=I_SYNC dirtied_when=4295061896 age=8 index=9 
> to_write=1024 wrote=1
> kworker/u2:0-2603  [000]   1246.00: writeback_written: bdi 98:0: 
> sb_dev 0:0 nr_pages=9223372036854775806 sync_mode=0 kupdate=0 range_cyclic=1 
> background=1 reason=background
> kworker/u2:0-2603  [000]   1246.00: global_dirty_state: 
> dirty=18446744073709551283 writeback=0 unstable=0 bg_thresh=5151 thresh=10303 
> limit=24732 dirtied=340953 written=358956
> kworker/u2:0-2603  [000]   1246.00: writeback_start: bdi 98:0: 
> sb_dev 0:0 nr_pages=9223372036854775806 sync_mode=0 kupdate=0 range_cyclic=1 
> background=1 reason=background
> kworker/u2:0-2603  [000]   1246.00: writeback_queue_io: bdi 98:0: 
> older=4295061896 age=0 enqueue=0 reason=background
> kworker/u2:0-2603  [000]   1246.00: writeback_written: bdi 98:0: 
> sb_dev 0:0 nr_pages=9223372036854775806 sync_mode=0 kupdate=0 range_cyclic=1 
> background=1 reason=background
> kworker/u2:0-2603  [000]   1246.00: writeback_pages_written: 1
>  yum-1553  [000]   1246.01: writeback_wake_background: 
> bdi 98:0
>  yum-1553  [000]   1246.01: writeback_dirty_inode_start: 
> bdi 98:0: ino=29951 flags=I_DIRTY_SYNC
>  yum-1553  [000]   1246.01: writeback_dirty_inode: bdi 
> 98:0: ino=29951 flags=I_DIRTY_SYNC
>  yum-1553  [000] d...  1246.01: writeback_dirty_page: bdi 
> 98:0: ino=29951 index=8
>  yum-1553  [000]   1246.01: global_dirty_state: 
> dirty=18446744073709551284 writeback=0 unstable=0 bg_thresh=5151 thresh=10303 
> limit=24732 dirtied=340954 written=358956
>  yum-1553  [000]   1246.01: writeback_wake_background: 
> bdi 98:0
>  yum-1553  [000]   1246.01: balance_dirty_pages: bdi 
> 98:0: limit=24732 setpoint=16229 dirty=18446744073709551284 
> bdi_setpoint=16227 bdi_dirty=1 dirty_ratelimit=4 task_ratelimit=0 dirtied=1 
> dirtied_pause=0 paused=0 pause=10 period=10 think=0
> kworker/u2:0-2603  [000]   1246.01: global_dirty_state: 
> dirty=18446744073709551284 writeback=0 unstable=0 bg_thresh=5151 thresh=10303 
> limit=24732 dirtied=340954 written=358956
 kworker/u2:0-2603  [000]   1246.01: global_dirty_state: 
 dirty=18446744073709551284 writeback=0 unstable=0 bg_thresh=5151 thresh=10303 
 limit=24732 dirtied=340954 written=358956
 kworker/u2:0-2603  [000]   1246.01: writeback_start: bdi 98:0: 
 sb_dev 0:0 nr_pages=9223372036854775807 sync_mode=0 kupdate=0 range_cyclic=1 
 background=1 reason=background