Robert Foley <robert.fo...@linaro.org> writes:
> We hit an issue when trying to change the log file from the monitor > console. The root of the issue here is that the qemu_logfile handle > is not thread safe. So when we try to close the file, we end up with > a seg fault. The full analysis is below along with some possible > solutions. > Will plan to post a patch soon, but any comments or opinions on our > proposed solution would be appreciated. Thanks. > > The version of QEMU we are using is: master as of about Oct 15, > 9020e9526cd08c4dc99d54dba48730de2908c970. > > This is what we did to reproduce the issue. > First we enable logging and select the log file. > (qemu) log in_asm,out_asm,op > (qemu) logfile asm.out > > Then we start this command in the guest. This just keeps the guest > performing operations that result in logging to be constantly > generated. > $ for i in {0..1000}; do ls -l; done > > Next we switch to the monitor console and change the file. > (qemu) logfile asm_new.log > > This action causes a seg fault. Please see the stack trace (below). > > The code, which changes the log file unconditionally > (qemu_set_log_filename()), closes the qemu_logfile, sets it to NULL, > and then opens the new file. > Since the file handle is still in use, we end up with a seg fault when > the code which is trying to log ends up using a NULL file handle. Yep, good catch. > We are considering a few solutions. > > A straightforward solution would be to simply prevent the file from > being changed while logging is enabled. In other words, force the > user to first disable logging before changing the log file. > This solution seems to cover the general case. However, if a user > were to disable logging and change the log file in quick succession, > we would still be subject to a similar race. A log call could still > make it through the logging enable check and proceed to use a file > handle that gets changed to NULL. > > Another option is to add a mutex to prevent the qemu_logfile handle > from being changed while it is in use. This certainly works and has > the advantage of being fairly straightforward. Also we are thinking > that since the mutex would only be used when logging is enabled it has > the advantage of not having an effect on the normal case performance. > Another option is to implement a simple atomic ref count and prevent > the file from being changed while there are outstanding references. The mutex is the simplest but I wonder if it show up on heavy usage? The log file is already doing file locking for the areas that want contiguous log statements. I wonder if using RCU to swap out the new and old FD would make things a bit smoother here? You would have to tweak the qemu_log_lock/unlock functions to preserve the current FD around the lock and call_rcu a freeing function when a new handle is set. > We are leaning towards the mutex option, and plan to post a patch > soon, but would appreciate comments or opinions on this solution. > > Thanks, > Rob Foley > > stack trace > ========================================== > Thread 10 "qemu-system-aar" received signal SIGSEGV, Segmentation fault. > [Switching to Thread 0xffff113f9d90 (LWP 9493)] __flockfile > (stream=0x0) at ../sysdeps/pthread/flockfile.c:27 > 27 ../sysdeps/pthread/flockfile.c: No such file or directory. > (gdb) bt > #0 __flockfile (stream=0x0) at ../sysdeps/pthread/flockfile.c:27 > #1 0x0000aaaae0fac8b8 in qemu_flockfile (f=<optimized out>) at > /home/rob/qemu/qemu_unchanged/include/sysemu/os-posix.h:87 > #2 qemu_log_lock () at /home/rob/qemu/qemu_unchanged/include/qemu/log.h:57 > #3 translator_loop (ops=0xaaaae17f1348 <aarch64_translator_ops>, > db=0xffff113f9088, db@entry=0xffff113f9098, > cpu=cpu@entry=0xaaab0a52bc50, > tb=tb@entry=0xffff4c92d000 <code_gen_buffer+814927796>, > max_insns=max_insns@entry=512) at > /home/rob/qemu/qemu_unchanged/accel/tcg/translator.c:121 > #4 0x0000aaaae10c1c18 in gen_intermediate_code > (cpu=cpu@entry=0xaaab0a52bc50, tb=tb@entry=0xffff4c92d000 > <code_gen_buffer+814927796>, max_insns=max_insns@entry=512) > at /home/rob/qemu/qemu_unchanged/target/arm/translate.c:11320 > #5 0x0000aaaae0fab248 in tb_gen_code (cpu=0xaaab0a52bc50, > cpu@entry=0xffffabe2a000, pc=187650897458448, cs_base=65536, > flags=43690, cflags=-16252928, cflags@entry=524288) > at /home/rob/qemu/qemu_unchanged/accel/tcg/translate-all.c:1738 > #6 0x0000aaaae0fa8e74 in tb_find (cf_mask=524288, tb_exit=0, > last_tb=0xffff4c92cc40 <code_gen_buffer+814926836>, > cpu=0xffffabe2a000) > at /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:408 > #7 cpu_exec (cpu=0xffffabe2a000, cpu@entry=0xaaab0a52bc50) at > /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:730 > #8 0x0000aaaae0f6de24 in tcg_cpu_exec (cpu=0xaaab0a52bc50) at > /home/rob/qemu/qemu_unchanged/cpus.c:1454 > #9 0x0000aaaae0f70908 in qemu_tcg_cpu_thread_fn (arg=0xaaab0a52bc50) > at /home/rob/qemu/qemu_unchanged/cpus.c:1762 > #10 0x0000aaaae145bd38 in qemu_thread_start (args=<optimized out>) at > /home/rob/qemu/qemu_unchanged/util/qemu-thread-posix.c:519 > #11 0x0000ffffabe0a088 in start_thread (arg=0xffffcc20410f) at > pthread_create.c:463 > #12 0x0000ffffabd7a4ec in thread_start () at > ../sysdeps/unix/sysv/linux/aarch64/clone.S:78 -- Alex Bennée