Robert Foley <robert.fo...@linaro.org> writes:

> We hit an issue when trying to change the log file from the monitor
> console.  The root of the issue here is that the qemu_logfile handle
> is not thread safe.  So when we try to close the file, we end up with
> a seg fault.  The full analysis is below along with some possible
> solutions.
> Will plan to post a patch soon, but any comments or opinions on our
> proposed solution would be appreciated.  Thanks.
>
> The version of QEMU we are using is: master as of about Oct 15,
> 9020e9526cd08c4dc99d54dba48730de2908c970.
>
> This is what we did to reproduce the issue.
> First we enable logging and select the log file.
> (qemu) log in_asm,out_asm,op
> (qemu) logfile asm.out
>
> Then we start this command in the guest.  This just keeps the guest
> performing operations that result in logging to be constantly
> generated.
> $ for i in {0..1000}; do ls -l; done
>
> Next we switch to the monitor console and change the file.
> (qemu) logfile asm_new.log
>
> This action causes a seg fault.  Please see the stack trace (below).
>
> The code, which changes the log file unconditionally
> (qemu_set_log_filename()), closes the qemu_logfile, sets it to NULL,
> and then opens the new file.
> Since the file handle is still in use, we end up with a seg fault when
> the code which is trying to log ends up using a NULL file handle.

Yep, good catch.

> We are considering a few solutions.
>
> A straightforward solution would be to simply prevent the file from
> being changed while logging is enabled.  In other words, force the
> user to first disable logging before changing the log file.
> This solution seems to cover the general case.  However, if a user
> were to disable logging and change the log file in quick succession,
> we would still be subject to a similar race.  A log call could still
> make it through the logging enable check and proceed to use a file
> handle that gets changed to NULL.
>
> Another option is to add a mutex to prevent the qemu_logfile handle
> from being changed while it is in use.  This certainly works and has
> the advantage of being fairly straightforward.  Also we are thinking
> that since the mutex would only be used when logging is enabled it has
> the advantage of not having an effect on the normal case performance.
> Another option is to implement a simple atomic ref count and prevent
> the file from being changed while there are outstanding references.

The mutex is the simplest but I wonder if it show up on heavy usage? The
log file is already doing file locking for the areas that want
contiguous log statements.

I wonder if using RCU to swap out the new and old FD would make things a
bit smoother here? You would have to tweak the qemu_log_lock/unlock
functions to preserve the current FD around the lock and call_rcu a
freeing function when a new handle is set.

> We are leaning towards the mutex option, and plan to post a patch
> soon, but would appreciate comments or opinions on this solution.
>
> Thanks,
> Rob Foley
>
> stack trace
> ==========================================
> Thread 10 "qemu-system-aar" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xffff113f9d90 (LWP 9493)] __flockfile
> (stream=0x0) at ../sysdeps/pthread/flockfile.c:27
> 27 ../sysdeps/pthread/flockfile.c: No such file or directory.
> (gdb) bt
> #0  __flockfile (stream=0x0) at ../sysdeps/pthread/flockfile.c:27
> #1  0x0000aaaae0fac8b8 in qemu_flockfile (f=<optimized out>) at
> /home/rob/qemu/qemu_unchanged/include/sysemu/os-posix.h:87
> #2  qemu_log_lock () at /home/rob/qemu/qemu_unchanged/include/qemu/log.h:57
> #3  translator_loop (ops=0xaaaae17f1348 <aarch64_translator_ops>,
> db=0xffff113f9088, db@entry=0xffff113f9098,
> cpu=cpu@entry=0xaaab0a52bc50,
>     tb=tb@entry=0xffff4c92d000 <code_gen_buffer+814927796>,
> max_insns=max_insns@entry=512) at
> /home/rob/qemu/qemu_unchanged/accel/tcg/translator.c:121
> #4  0x0000aaaae10c1c18 in gen_intermediate_code
> (cpu=cpu@entry=0xaaab0a52bc50, tb=tb@entry=0xffff4c92d000
> <code_gen_buffer+814927796>, max_insns=max_insns@entry=512)
>     at /home/rob/qemu/qemu_unchanged/target/arm/translate.c:11320
> #5  0x0000aaaae0fab248 in tb_gen_code (cpu=0xaaab0a52bc50,
> cpu@entry=0xffffabe2a000, pc=187650897458448, cs_base=65536,
> flags=43690, cflags=-16252928, cflags@entry=524288)
>     at /home/rob/qemu/qemu_unchanged/accel/tcg/translate-all.c:1738
> #6  0x0000aaaae0fa8e74 in tb_find (cf_mask=524288, tb_exit=0,
> last_tb=0xffff4c92cc40 <code_gen_buffer+814926836>,
> cpu=0xffffabe2a000)
>     at /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:408
> #7  cpu_exec (cpu=0xffffabe2a000, cpu@entry=0xaaab0a52bc50) at
> /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:730
> #8  0x0000aaaae0f6de24 in tcg_cpu_exec (cpu=0xaaab0a52bc50) at
> /home/rob/qemu/qemu_unchanged/cpus.c:1454
> #9  0x0000aaaae0f70908 in qemu_tcg_cpu_thread_fn (arg=0xaaab0a52bc50)
> at /home/rob/qemu/qemu_unchanged/cpus.c:1762
> #10 0x0000aaaae145bd38 in qemu_thread_start (args=<optimized out>) at
> /home/rob/qemu/qemu_unchanged/util/qemu-thread-posix.c:519
> #11 0x0000ffffabe0a088 in start_thread (arg=0xffffcc20410f) at
> pthread_create.c:463
> #12 0x0000ffffabd7a4ec in thread_start () at
> ../sysdeps/unix/sysv/linux/aarch64/clone.S:78


--
Alex Bennée

Reply via email to