Daniel P. Berrangé <berra...@redhat.com> writes:

> On Tue, Sep 23, 2025 at 04:28:42PM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé <berra...@redhat.com> writes:
>> 
>> > Some code makes multiple qemu_log calls to incrementally emit
>> > a single message. Currently timestamps get prepended to all
>> > qemu_log calls, even those continuing a previous incomplete
>> > message.
>> >
>> > This changes the qemu_log so it skips adding a new line prefix,
>> > if the previous qemu_log call did NOT end with a newline.
>> >
>> > Reported-by: Richard Henderson <richard.hender...@linaro.org>
>> > Signed-off-by: Daniel P. Berrangé <berra...@redhat.com>
>> 
>> This patch has kept nagging me in the back of my brain.  So I'm back for
>> a second look.
>> 
>> > ---
>> >  util/log.c | 9 ++++++++-
>> >  1 file changed, 8 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/util/log.c b/util/log.c
>> > index abdcb6b311..2642a55c59 100644
>> > --- a/util/log.c
>> > +++ b/util/log.c
>> > @@ -143,6 +143,12 @@ void qemu_log_unlock(FILE *logfile)
>> >      }
>> >  }
>> >  
>> > +/*
>> > + * 'true' if the previous log message lacked a trailing '\n',
>> > + * and thus the subsequent call must skip any prefix
>> > + */
>> > +static __thread bool incomplete;
>> > +
>> >  void qemu_log(const char *fmt, ...)
>> >  {
>> >      FILE *f;
>> > @@ -154,7 +160,7 @@ void qemu_log(const char *fmt, ...)
>> >       * was emitted if we are delayed acquiring the
>> >       * mutex
>> >       */
>> > -    if (message_with_timestamp) {
>> > +    if (message_with_timestamp && !incomplete) {
>> >          g_autoptr(GDateTime) dt = g_date_time_new_now_utc();
>> >          timestr = g_date_time_format_iso8601(dt);
>> >      }
>> > @@ -170,6 +176,7 @@ void qemu_log(const char *fmt, ...)
>>        f = qemu_log_trylock();
>>        if (f) {
>>            va_list ap;
>> 
>>            if (timestr) {
>>                fprintf(f, "%s ", timestr);
>>            }
>> 
>> >          va_start(ap, fmt);
>> >          vfprintf(f, fmt, ap);
>> >          va_end(ap);
>> > +        incomplete = fmt[strlen(fmt) - 1] != '\n';
>> >          qemu_log_unlock(f);
>> >      }
>> >  }
>> 
>> Two cases:
>> 
>> (A) Single log
>> 
>>     qemu_log_trylock() returns @global_file, and uses RCU to ensure it
>>     remains valid until qemu_log_unlock().  I think.
>> 
>> (B) Log split per thread (-d tid)
>> 
>>     qemu_log_trylock() returns thread-local @thread_file.
>> 
>> In addition, qemu_log_trylock() locks the FILE it returns with
>> flockfile(), so no other thread can write to it until qemu_log_unlock()
>> unlocks it with funlockfile().  This ensures the entire output of in
>> between stays together.
>> 
>> Let's see how this plays with @incomplete.
>> 
>> (B) Log split per thread (-d tid)
>> 
>>     @incomplete is thread-local.  It records wether the last qemu_log()
>>     in this thread was an incomplete line.  If it was, the next
>>     qemu_log() continues the line.  Unless something else wrote to
>>     @thread_file in between, but that's not supposed to happen.  Good.
>> 
>> (A) Single log
>> 
>>     All thread log to the same FILE.  Consider:
>> 
>>     1. Thread 1 starts.  Its @incomplete is initialized to false.
>> 
>>     2. Thread 2 starts.  Its @incomplete is initialized to false.
>> 
>>     3. Thread 1 logs "abra".  Its @incomplete is set to true.
>> 
>>     4. Thread 2 logs "interrupt\n".  Its @incomplete remains false.
>> 
>>     5. Thread 2 logs "cadbra\n".  Its @incomplete goes back to false.
>> 
>>     Resulting log file contents:
>> 
>>         PREFIX "abra" PREFIX "interrupt\n"
>>         "cadabra\n"
>> 
>>     Not good.
>> 
>> We could complicate this code further to mitigate.  For instance, we
>> could use a thread-local @incomplete for (B), and a global one for (A).
>> This ensures log lines start with PREFIX as they should, but does
>> nothing to avoid mixing up line parts from different threads.  My
>> example would then produce
>> 
>>         PREFIX "abrainterrupt\n"
>>         PREFIX "cababra\n"
>> 
>> My take: "Doctor, it hurts when I do that!"  "Don't do that then."
>> Logging incomplete lines with qemu_log() can hurt.  Don't do that then.
>
> I just took a look at linux-user/syscall.c as that is one place that
> heavily uses qemu_log() for incomplete lines.
>
> What I didn't realize was that the expectation is to call
> qemu_log_trylock() (which returns the "FILE *" target) and
> then you can ignore the "FILE *" and just call qemu_log()
> repeatedly, and finally call qemu_log_unlock(FILE *) when
> done.
>
> https://gitlab.com/qemu-project/qemu/-/blob/master/linux-user/strace.c?ref_type=heads#L4396

I can see the qemu_log_trylock() / qemu_log_unlock() bracket.  But the
code within doesn't work the way you describe: it uses fprintf(f, ...).

If it did ignore @f and call qemu_log(), qemu_log() would
qemu_log_trylock() again, taking the RCU read lock and the flockfile()
lock on @f recursively.  Should work.

> This is a slightly wierd API design,

More seriously: entirely undocumented.  The only hint is the presence of
qemu_log_trylock() and qemu_log_unlock().

>                                      but that seems to be
> the intended way to serailize and in that context, my
> patch/hack here will be sufficiently good.

Right, the flockfile() locks out the other thread.


I feel this is more complex than it ought to be.  It also lacks comments
explaining the locking.

On a green field, I'd make the logging function append a newline.  When
the line to be logged is to be built up with multiple printf-like calls,
use g_string_printf().  If you're worried about reallocations, use
something like g_string_sized_new(512).  Log lines longer than that
would be masochism anyway.  If even that single allocation bothers you,
build your logging around fixed thread-local buffers, avoiding stdio.


No objection to your patch.  It digs us deeper into this hole, but we
got worse holes to worry about.


Reply via email to