i386: Fix CR2 handling for non-canonical ad...

Peter Maydell via Qemu-commits Fri, 21 Nov 2025 19:22:53 -0800

  Branch: refs/heads/staging-7.2
  Home:   https://github.com/qemu/qemu
  Commit: 873955f96548001eac120e0ebaae0d1e4c394f4d
      
https://github.com/qemu/qemu/commit/873955f96548001eac120e0ebaae0d1e4c394f4d
  Author: Mathias Krause <[email protected]>
  Date:   2025-10-15 (Wed, 15 Oct 2025)


  Changed paths:
    M target/i386/tcg/sysemu/excp_helper.c

  Log Message:
  -----------
  target/i386: Fix CR2 handling for non-canonical addresses

Commit 3563362ddfae ("target/i386: Introduce structures for mmu_translate")
accidentally modified CR2 for non-canonical address exceptions while these
should lead to a #GP / #SS instead -- without changing CR2.

Fix that.
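
As an illustration only (this is not the code in excp_helper.c), the
distinction can be sketched as follows. The names toy_cpu,
toy_is_canonical and toy_access_fault are hypothetical, and 48
implemented virtual address bits (4-level paging) are assumed:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy fault kinds; not QEMU's types. */
enum toy_fault { TOY_FAULT_GP, TOY_FAULT_PF };

struct toy_cpu {
    uint64_t cr2;   /* page-fault linear address register */
};

/*
 * True if vaddr is canonical when va_bits virtual address bits are
 * implemented: bits 63..va_bits-1 must be all zeros or all ones.
 */
static bool toy_is_canonical(uint64_t vaddr, unsigned va_bits)
{
    uint64_t upper = vaddr >> (va_bits - 1);

    return upper == 0 || upper == (UINT64_MAX >> (va_bits - 1));
}

/*
 * Classify a failed access. A non-canonical address is reported as a
 * general protection style fault (#GP, or #SS for stack accesses) and
 * must leave CR2 alone; only a page fault records the address in CR2.
 */
static enum toy_fault toy_access_fault(struct toy_cpu *cpu, uint64_t vaddr)
{
    if (!toy_is_canonical(vaddr, 48)) {
        return TOY_FAULT_GP;    /* CR2 deliberately untouched */
    }
    cpu->cr2 = vaddr;
    return TOY_FAULT_PF;
}
```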

A KUT test for this was submitted as [1].

[1] https://lore.kernel.org/kvm/[email protected]/

Fixes: 3563362ddfae ("target/i386: Introduce structures for mmu_translate")
Signed-off-by: Mathias Krause <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
(cherry picked from commit df9a3372ddebfcfc135861fa2d53cef6f98065f9)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: f4ede985eb8bb8b6cade8a0475869ace63b1742a
      
https://github.com/qemu/qemu/commit/f4ede985eb8bb8b6cade8a0475869ace63b1742a
  Author: Paolo Bonzini <[email protected]>
  Date:   2025-10-15 (Wed, 15 Oct 2025)

  Changed paths:
    M hw/intc/apic.c
    M target/i386/helper.c
    M target/i386/tcg/sysemu/seg_helper.c

  Log Message:
  -----------
  i386/cpu: Prevent delivering SIPI during SMM in TCG mode

[commit message by YiFei Zhu]

A malicious kernel may control the instruction pointer in SMM in a
multi-processor VM by sending a sequence of IPIs via APIC:

CPU0                    CPU1
IPI(CPU1, MODE_INIT)
                        x86_cpu_exec_reset()
                        apic_init_reset()
                        s->wait_for_sipi = true
IPI(CPU1, MODE_SMI)
                        do_smm_enter()
                        env->hflags |= HF_SMM_MASK;
IPI(CPU1, MODE_STARTUP, vector)
                        do_cpu_sipi()
                        apic_sipi()
                        /* s->wait_for_sipi check passes */
                        cpu_x86_load_seg_cache_sipi(vector)

A different sequence, SMI INIT SIPI, is also buggy in TCG because
INIT is not blocked or latched during SMM. However, it is not
vulnerable to an instruction pointer control in the same way because
x86_cpu_exec_reset clears env->hflags, exiting SMM.
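
A minimal model of the INIT, SMI, SIPI sequence above, assuming a guard
that refuses to act on a startup IPI while the SMM flag is set. All
toy_* names are hypothetical and the sketch makes no claim about how the
real patch is structured or whether the SIPI is latched or dropped:

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_vcpu {
    bool wait_for_sipi;   /* set by INIT, cleared once a SIPI is taken */
    bool in_smm;          /* stands in for HF_SMM_MASK in env->hflags */
    uint32_t cs_base;     /* where execution would start after a SIPI */
};

/* INIT: reset leaves SMM and arms the wait-for-SIPI state. */
static void toy_init(struct toy_vcpu *c)
{
    c->in_smm = false;
    c->wait_for_sipi = true;
    c->cs_base = 0;
}

/* SMI: enter system management mode. */
static void toy_smi(struct toy_vcpu *c)
{
    c->in_smm = true;
}

/* Returns true if the startup IPI was acted on. */
static bool toy_sipi(struct toy_vcpu *c, uint8_t vector)
{
    if (c->in_smm || !c->wait_for_sipi) {
        return false;     /* the guard this sketch is about */
    }
    c->wait_for_sipi = false;
    c->cs_base = (uint32_t)vector << 12;   /* SIPI vector selects a 4K page */
    return true;
}
```

Without the in_smm test in toy_sipi(), the guest-chosen vector would
decide where the CPU resumes while it is still flagged as being in SMM.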

Fixes: a9bad65d2c1f ("target-i386: wake up processors that receive an SMI")
Analyzed-by: YiFei Zhu <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
(cherry picked from commit df32e5c568c9cf68c15a9bbd98d0c3aff19eab63)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 9fed0f14b29d0bfc294a3a0fb8a49b3e3d31e532
      
https://github.com/qemu/qemu/commit/9fed0f14b29d0bfc294a3a0fb8a49b3e3d31e532
  Author: YiFei Zhu <[email protected]>
  Date:   2025-10-15 (Wed, 15 Oct 2025)

  Changed paths:
    M target/i386/tcg/sysemu/smm_helper.c

  Log Message:
  -----------
  i386/tcg/smm_helper: Properly apply DR values on SMM entry / exit

do_smm_enter and helper_rsm set the env->dr, but do not sync the
values with cpu_x86_update_dr7. A malicious kernel may control the
instruction pointer in SMM by setting a breakpoint on the SMI
entry point, and after do_smm_enter cpu->breakpoints contains the
stale breakpoint; and because IDT is not reloaded upon SMI entry,
the debug exception handler controlled by the malicious kernel
is invoked.

Fixes: 01df040b5247 ("x86: Debug register emulation (Jan Kiszka)")
Reported-by: [email protected]
Signed-off-by: YiFei Zhu <[email protected]>
Link: https://lore.kernel.org/r/2bacb9b24e9d337dbe48791aa25d349eb9c52c3a.1758794468.git.zhuyi...@google.com
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
(cherry picked from commit cdba90ac1b0ac789b10c0b5f6ef7e9558237ec66)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 5adcdce946a5d7fb17f69e0e50fa0b6b992acd34
      
https://github.com/qemu/qemu/commit/5adcdce946a5d7fb17f69e0e50fa0b6b992acd34
  Author: Paolo Bonzini <[email protected]>
  Date:   2025-10-15 (Wed, 15 Oct 2025)

  Changed paths:
    M util/async.c

  Log Message:
  -----------
  async: access bottom half flags with qatomic_read

Running test-aio-multithread under TSAN reveals data races on bh->flags.
Because bottom halves may be scheduled or canceled asynchronously,
without taking a lock, adjust aio_compute_bh_timeout() and aio_ctx_check()
to use a relaxed read to access the flags.

Use an acquire load to ensure that anything that was written prior to
qemu_bh_schedule() is visible.
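
The release/acquire pairing can be sketched with plain C11 atomics
standing in for QEMU's qatomic_* helpers. The toy_bh structure and its
functions are hypothetical and much simpler than the real bottom half
code:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_BH_SCHEDULED 1u

struct toy_bh {
    atomic_uint flags;
    int payload;          /* plain data published by the scheduler */
};

static void toy_bh_init(struct toy_bh *bh)
{
    atomic_init(&bh->flags, 0);
    bh->payload = 0;
}

/* Writer: fill in the payload, then publish it with a release RMW. */
static void toy_bh_schedule(struct toy_bh *bh, int payload)
{
    bh->payload = payload;
    atomic_fetch_or_explicit(&bh->flags, TOY_BH_SCHEDULED,
                             memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release above, so a reader
 * that observes TOY_BH_SCHEDULED also observes the payload store.
 */
static bool toy_bh_pending(struct toy_bh *bh)
{
    unsigned flags = atomic_load_explicit(&bh->flags, memory_order_acquire);

    return (flags & TOY_BH_SCHEDULED) != 0;
}
```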

Closes: https://gitlab.com/qemu-project/qemu/-/issues/2749
Closes: https://gitlab.com/qemu-project/qemu/-/issues/851
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
(cherry picked from commit 5142397c79330aab9bef3230991c8ac0c251110f)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: c88a40041fb011a7335719e4ba058f3dd4f93d2d
      
https://github.com/qemu/qemu/commit/c88a40041fb011a7335719e4ba058f3dd4f93d2d
  Author: Paolo Bonzini <[email protected]>
  Date:   2025-10-15 (Wed, 15 Oct 2025)

  Changed paths:
    M target/i386/cpu.c

  Log Message:
  -----------
  target/i386: user: do not set up a valid LDT on reset

In user-mode emulation, QEMU uses the default setting of the LDT base
and limit, which places it at the bottom 64K of virtual address space.
However, by default there is no LDT at all in Linux processes, and
therefore the limit should be 0.

This is visible as a NULL pointer dereference in LSL and LAR instructions
when they try to read the LDT at an unmapped address.

Resolves: #1376
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
(cherry picked from commit 58aa1d08bbc406ba3982f32ffb1bef0ff4f8f369)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: a041632d4c52dc13acd51c9cbae8abecc62f8ccb
      
https://github.com/qemu/qemu/commit/a041632d4c52dc13acd51c9cbae8abecc62f8ccb
  Author: Bastian Blank <[email protected]>
  Date:   2025-10-29 (Wed, 29 Oct 2025)

  Changed paths:
    M linux-user/ioctls.h

  Log Message:
  -----------
  linux-user: Use correct type for FIBMAP and FIGETBSZ emulation

Both the FIBMAP and FIGETBSZ ioctls take "int *" (pointer to 32-bit
integer) as argument, not "long *" as specified in QEMU.  Using the
correct type makes the emulation work in cross endian context.

Neither ioctl seems to be documented. However, the kernel
implementation has always used "int *".
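
To see why the width matters, consider a hypothetical big-endian guest
where long is 64 bits. The helpers below are illustrative only and are
unrelated to linux-user's real conversion machinery; they decode the
same guest bytes as a 32-bit and as a 64-bit big-endian integer:

```c
#include <stdint.h>

/* Decode 4 bytes of guest memory as a big-endian 32-bit integer. */
static uint32_t toy_load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode 8 bytes of guest memory as a big-endian 64-bit integer. */
static uint64_t toy_load_be64(const uint8_t *p)
{
    return ((uint64_t)toy_load_be32(p) << 32) | toy_load_be32(p + 4);
}
```

If the guest stored the int 4096 followed by four unrelated bytes,
toy_load_be32() recovers 4096, whereas toy_load_be64() yields a value
with 4096 in the upper half and the unrelated bytes in the lower half.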

Signed-off-by: Bastian Blank <[email protected]>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3185
Reviewed-by: Peter Maydell <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Reviewed-by: Michael Tokarev <[email protected]>
Signed-off-by: Michael Tokarev <[email protected]>
(cherry picked from commit 7c7089321670fb51022a1c4493cbcc69aa288a0f)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 87ceabcf34b3af01e0c305a18536b540f3d07ad7
      
https://github.com/qemu/qemu/commit/87ceabcf34b3af01e0c305a18536b540f3d07ad7
  Author: Peter Maydell <[email protected]>
  Date:   2025-11-06 (Thu, 06 Nov 2025)

  Changed paths:
    M linux-user/syscall.c

  Log Message:
  -----------
  linux-user: permit sendto() with NULL buf and 0 len

If you pass sendto() a NULL buffer, this is usually an error
(causing an EFAULT return); however if you pass a 0 length then
we should not try to validate the buffer provided. Instead we
skip the copying of the user data and possible processing
through fd_trans_target_to_host_data, and call the host syscall
with NULL, 0.

(unlock_user() permits a NULL buffer pointer for "do nothing"
so we don't need to special case the unlock code.)
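
The rule described above reduces to the following check. This is a
sketch with a hypothetical helper name; the real code validates guest
addresses with lock_user() rather than comparing against NULL:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Returns 0 if the (buf, len) pair may be passed on to the host
 * syscall, or -EFAULT if the buffer is unusable.
 */
static int toy_check_send_buf(const void *buf, size_t len)
{
    if (len == 0) {
        return 0;           /* nothing to copy, so buf is not inspected */
    }
    if (buf == NULL) {
        return -EFAULT;     /* non-empty send from a NULL buffer */
    }
    return 0;
}
```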

Cc: [email protected]
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3102
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Michael Tokarev <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-ID: <[email protected]>
(cherry picked from commit 0db2de22fcbf90adafab9d9dd1fc8203c66bfa75)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: f3929f32fbc985a4faf0aec050229f8f535e13f6
      
https://github.com/qemu/qemu/commit/f3929f32fbc985a4faf0aec050229f8f535e13f6
  Author: Daniel P. Berrangé <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M io/channel-tls.c
    M io/trace-events

  Log Message:
  -----------
  io: add trace event when cancelling TLS handshake

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Daniel P. Berrangé <[email protected]>
(cherry picked from commit 003f15369de4e290a4d2e58292d96c5a506e4ee6)
(Mjt: pick this commit up so that the next changes apply (more) cleanly)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 7769aada0ac2aa876d933560e8a833b28de951df
      
https://github.com/qemu/qemu/commit/7769aada0ac2aa876d933560e8a833b28de951df
  Author: Daniel P. Berrangé <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M io/channel-tls.c

  Log Message:
  -----------
  io: release active GSource in TLS channel finalizer

While code is supposed to call qio_channel_close() before releasing the
last reference on a QIOChannel, this is not guaranteed. QIOChannelFile
and QIOChannelSocket both clean up resources in their finalizer if the
close operation was missed.

This ensures the TLS channel will do the same failsafe cleanup.

Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Daniel P. Berrangé <[email protected]>
(cherry picked from commit 2c147611cf568eb1cd7dc8bf4479b272bad3b9d6)
(Mjt: remove releasing of ioc->bye_ioc_tag, since 7.2.x lacks
 v9.2.0-1773-g30ee88622edf "io: tls: Add qio_channel_tls_bye")
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 911c814c8cc5f836286bd96694843036db83e99f
      
https://github.com/qemu/qemu/commit/911c814c8cc5f836286bd96694843036db83e99f
  Author: Daniel P. Berrangé <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M io/channel-websock.c

  Log Message:
  -----------
  io: move websock resource release to close method

The QIOChannelWebsock object releases all its resources in the
finalize callback. This is later than desired, as callers expect
to be able to call qio_channel_close() to fully close a channel
and release resources related to I/O.

The logic in the finalize method is at most a failsafe to handle
cases where a consumer forgets to call qio_channel_close.

This adds equivalent logic to the close method to release the
resources, using g_clear_handle_id/g_clear_pointer to be robust
against repeated invocations. The finalize method is tweaked
so that the GSource is removed before releasing the underlying
channel.
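
The clear-then-release idiom that makes repeated close calls harmless
can be sketched without GLib. toy_clear_handle() below mimics the
behaviour of g_clear_handle_id(); the channel type and the counter are
hypothetical and exist only for the illustration:

```c
#include <stddef.h>

static int toy_sources_removed;   /* counts releases for the demo */

static void toy_source_remove(unsigned tag)
{
    (void)tag;
    toy_sources_removed++;
}

/* Zero the handle first, then release it, so a second call is a no-op. */
static void toy_clear_handle(unsigned *tag, void (*release)(unsigned))
{
    unsigned old = *tag;

    if (old != 0) {
        *tag = 0;
        release(old);
    }
}

struct toy_channel {
    unsigned io_tag;              /* 0 means "no watch registered" */
};

/* Safe to call from both a close method and a finalizer. */
static void toy_channel_close(struct toy_channel *c)
{
    toy_clear_handle(&c->io_tag, toy_source_remove);
}
```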

Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Daniel P. Berrangé <[email protected]>
(cherry picked from commit 322c3c4f3abee616a18b3bfe563ec29dd67eae63)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: cebdbd038e44af56e74272924dc2bf595a51fd8f
      
https://github.com/qemu/qemu/commit/cebdbd038e44af56e74272924dc2bf595a51fd8f
  Author: Daniel P. Berrangé <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M include/io/channel-websock.h
    M io/channel-websock.c

  Log Message:
  -----------
  io: fix use after free in websocket handshake code

If the QIOChannelWebsock object is freed while it is waiting to
complete a handshake, a GSource is leaked. This can lead to the
callback firing later on and triggering a use-after-free in the
use of the channel. This was observed in the VNC server with the
following trace from valgrind:

==2523108== Invalid read of size 4
==2523108==    at 0x4054A24: vnc_disconnect_start (vnc.c:1296)
==2523108==    by 0x4054A24: vnc_client_error (vnc.c:1392)
==2523108==    by 0x4068A09: vncws_handshake_done (vnc-ws.c:105)
==2523108==    by 0x44863B4: qio_task_complete (task.c:197)
==2523108==    by 0x448343D: qio_channel_websock_handshake_io (channel-websock.c:588)
==2523108==    by 0x6EDB862: UnknownInlinedFun (gmain.c:3398)
==2523108==    by 0x6EDB862: g_main_context_dispatch_unlocked.lto_priv.0 (gmain.c:4249)
==2523108==    by 0x6EDBAE4: g_main_context_dispatch (gmain.c:4237)
==2523108==    by 0x45EC79F: glib_pollfds_poll (main-loop.c:287)
==2523108==    by 0x45EC79F: os_host_main_loop_wait (main-loop.c:310)
==2523108==    by 0x45EC79F: main_loop_wait (main-loop.c:589)
==2523108==    by 0x423A56D: qemu_main_loop (runstate.c:835)
==2523108==    by 0x454F300: qemu_default_main (main.c:37)
==2523108==    by 0x73D6574: (below main) (libc_start_call_main.h:58)
==2523108==  Address 0x57a6e0dc is 28 bytes inside a block of size 103,608 free'd
==2523108==    at 0x5F2FE43: free (vg_replace_malloc.c:989)
==2523108==    by 0x6EDC444: g_free (gmem.c:208)
==2523108==    by 0x4053F23: vnc_update_client (vnc.c:1153)
==2523108==    by 0x4053F23: vnc_refresh (vnc.c:3225)
==2523108==    by 0x4042881: dpy_refresh (console.c:880)
==2523108==    by 0x4042881: gui_update (console.c:90)
==2523108==    by 0x45EFA1B: timerlist_run_timers.part.0 (qemu-timer.c:562)
==2523108==    by 0x45EFC8F: timerlist_run_timers (qemu-timer.c:495)
==2523108==    by 0x45EFC8F: qemu_clock_run_timers (qemu-timer.c:576)
==2523108==    by 0x45EFC8F: qemu_clock_run_all_timers (qemu-timer.c:663)
==2523108==    by 0x45EC765: main_loop_wait (main-loop.c:600)
==2523108==    by 0x423A56D: qemu_main_loop (runstate.c:835)
==2523108==    by 0x454F300: qemu_default_main (main.c:37)
==2523108==    by 0x73D6574: (below main) (libc_start_call_main.h:58)
==2523108==  Block was alloc'd at
==2523108==    at 0x5F343F3: calloc (vg_replace_malloc.c:1675)
==2523108==    by 0x6EE2F81: g_malloc0 (gmem.c:133)
==2523108==    by 0x4057DA3: vnc_connect (vnc.c:3245)
==2523108==    by 0x448591B: qio_net_listener_channel_func (net-listener.c:54)
==2523108==    by 0x6EDB862: UnknownInlinedFun (gmain.c:3398)
==2523108==    by 0x6EDB862: g_main_context_dispatch_unlocked.lto_priv.0 (gmain.c:4249)
==2523108==    by 0x6EDBAE4: g_main_context_dispatch (gmain.c:4237)
==2523108==    by 0x45EC79F: glib_pollfds_poll (main-loop.c:287)
==2523108==    by 0x45EC79F: os_host_main_loop_wait (main-loop.c:310)
==2523108==    by 0x45EC79F: main_loop_wait (main-loop.c:589)
==2523108==    by 0x423A56D: qemu_main_loop (runstate.c:835)
==2523108==    by 0x454F300: qemu_default_main (main.c:37)
==2523108==    by 0x73D6574: (below main) (libc_start_call_main.h:58)
==2523108==

The above can be reproduced by launching QEMU with

  $ qemu-system-x86_64 -vnc localhost:0,websocket=5700

and then repeatedly running:

  for i in {1..100}; do
     (echo -n "GET / HTTP/1.1" && sleep 0.05) | nc -w 1 localhost 5700 &
  done

CVE-2025-11234
Reported-by: Grant Millar | Cylo <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Daniel P. Berrangé <[email protected]>
(cherry picked from commit b7a1f2ca45c7865b9e98e02ae605a65fc9458ae9)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: ce3d901244da4794908c05f25de87e260e11b676
      
https://github.com/qemu/qemu/commit/ce3d901244da4794908c05f25de87e260e11b676
  Author: Daniel P. Berrangé <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M crypto/tlscredsx509.c
    M docs/system/tls.rst
    M tests/unit/crypto-tls-x509-helpers.h
    M tests/unit/test-crypto-tlscredsx509.c
    M tests/unit/test-crypto-tlssession.c
    M tests/unit/test-io-channel-tls.c

  Log Message:
  -----------
  crypto: stop requiring "key encipherment" usage in x509 certs

This usage flag was deprecated by RFC8813, such that it is
forbidden to be present for certs using ECDSA/ECDH algorithms,
and in TLS 1.3 is conceptually obsolete.

As such many valid certs will no longer have this key usage
flag set, and QEMU should not be rejecting them, as this
prevents use of otherwise valid & desirable algorithms.

Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Daniel P. Berrangé <[email protected]>
(cherry picked from commit 3995fc238e0599e0417ba958ffc5c7a609e82a7f)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 514a83a038e6f85660aa5ef0615088e109ffa8ac
      
https://github.com/qemu/qemu/commit/514a83a038e6f85660aa5ef0615088e109ffa8ac
  Author: Richard W.M. Jones <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M block/curl.c
    M contrib/elf2dmp/download.c

  Log Message:
  -----------
  block/curl.c: Use explicit long constants in curl_easy_setopt calls

curl_easy_setopt takes a variable argument that depends on what
CURLOPT you are setting.  Some require a long constant.  Passing a
plain int constant is potentially wrong on some platforms.

With warnings enabled, multiple warnings like this were printed:

../block/curl.c: In function ‘curl_init_state’:
../block/curl.c:474:13: warning: call to ‘_curl_easy_setopt_err_long’ declared with attribute warning: curl_easy_setopt expects a long argument [-Wattribute-warning]
  474 |             curl_easy_setopt(state->curl, CURLOPT_AUTOREFERER, 1) ||
      |             ^
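
The underlying hazard is general to variadic functions: the callee reads
the argument with the type it expects, so the caller has to pass exactly
that type. The toy_setopt() function below is a hypothetical stand-in
for a variadic setter and is not libcurl code:

```c
#include <stdarg.h>
#include <stddef.h>

enum { TOY_OPT_VERBOSE, TOY_OPT_URL };

struct toy_handle {
    long verbose;
    const char *url;
};

/* Variadic setter: the type read by va_arg depends on the option. */
static int toy_setopt(struct toy_handle *h, int opt, ...)
{
    va_list ap;
    int ret = 0;

    va_start(ap, opt);
    switch (opt) {
    case TOY_OPT_VERBOSE:
        h->verbose = va_arg(ap, long);   /* caller must pass a long */
        break;
    case TOY_OPT_URL:
        h->url = va_arg(ap, const char *);
        break;
    default:
        ret = -1;
        break;
    }
    va_end(ap);
    return ret;
}
```

Calling toy_setopt(&h, TOY_OPT_VERBOSE, 1L) is well defined, while
passing a plain 1 would make va_arg read a long where an int was
supplied, which is undefined behaviour and can misbehave on platforms
where the two types differ in size.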

Signed-off-by: Richard W.M. Jones <[email protected]>
Signed-off-by: Chenxi Mao <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Reviewed-by: Akihiko Odaki <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-ID: <[email protected]>
(cherry picked from commit ed26056d90ddff21351f3efd2cb47fea4f0e1d45)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 020a7267911684626703cbab9e811232a18105ab
      
https://github.com/qemu/qemu/commit/020a7267911684626703cbab9e811232a18105ab
  Author: Richard W.M. Jones <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M block/curl.c

  Log Message:
  -----------
  block/curl.c: Fix CURLOPT_VERBOSE parameter type

In commit ed26056d90 ("block/curl.c: Use explicit long constants in
curl_easy_setopt calls") we missed a further call that takes a long
parameter.

Reported-by: Kevin Wolf <[email protected]>
Signed-off-by: Richard W.M. Jones <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
(cherry picked from commit ad97769e9dcf4dbdaae6d859176e5f37fd6a7c66)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: ab84c84130377b6f3430e22ff4d9a71a98e02f74
      
https://github.com/qemu/qemu/commit/ab84c84130377b6f3430e22ff4d9a71a98e02f74
  Author: Eric Blake <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M io/net-listener.c
    M io/trace-events

  Log Message:
  -----------
  qio: Add trace points to net_listener

Upcoming patches will adjust how net_listener watches for new client
connections; adding trace points now makes it easier to debug that the
changes work as intended.  For example, adding
--trace='qio_net_listener*' to the qemu-storage-daemon command line
before --nbd-server will track when the server first starts listening
for clients.

Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Message-ID: <[email protected]>
(cherry picked from commit 59506e59e0f0a773e892104b945d0f15623381a7)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 737c9f03e989ff423f1d224f1e9a89a436b14a80
      
https://github.com/qemu/qemu/commit/737c9f03e989ff423f1d224f1e9a89a436b14a80
  Author: Eric Blake <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M io/net-listener.c

  Log Message:
  -----------
  qio: Unwatch before notify in QIONetListener

When changing the callback registered with QIONetListener, the code
was calling notify on the old opaque data prior to actually removing
the old GSource objects still pointing to that data.  Similarly,
during finalize, it called notify before tearing down the various
GSource objects tied to the data.

In practice, a grep of the QEMU code base found that every existing
client of QIONetListener passes in a NULL notifier (the opaque data,
if non-NULL, outlives the NetListener and so does not need cleanup
when the NetListener is torn down), so this patch has no impact.  And
even if a caller had passed in a reference-counted object with a
notifier of object_unref but kept its own reference on the data, then
the early notify would merely reduce a refcount from (say) 2 to 1, but
not free the object.  However, it is a latent bug waiting to bite any
future caller that passes in data where the notifier actually frees
the object, because the GSource could then trigger a use-after-free if
it loses the race on a last-minute client connection resulting in the
data being passed to one final use of the async callback.

Better is to delay the notify call until after all GSource that have
been given a copy of the opaque data are torn down.

CC: [email protected]
Fixes: 530473924d "io: introduce a network socket listener API", v2.12.0
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Message-ID: <[email protected]>
(cherry picked from commit 6e03d5cdc991f5db86969fc6aeaca96234426263)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: cd990562fab8e483471252227b726070d1b9a61c
      
https://github.com/qemu/qemu/commit/cd990562fab8e483471252227b726070d1b9a61c
  Author: Eric Blake <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M include/io/net-listener.h
    M io/net-listener.c
    M io/trace-events

  Log Message:
  -----------
  qio: Remember context of qio_net_listener_set_client_func_full

io/net-listener.c has two modes of use: asynchronous (the user calls
qio_net_listener_set_client_func to wake up the callback via the
global GMainContext, or qio_net_listener_set_client_func_full to wake
up the callback via the caller's own alternative GMainContext), and
synchronous (the user calls qio_net_listener_wait_client which creates
its own GMainContext and waits for the first client connection before
returning, with no need for a user's callback).  But commit 938c8b79
has a latent logic flaw: when qio_net_listener_wait_client finishes on
its temporary context, it reverts all of the siocs back to the global
GMainContext rather than the potentially non-NULL context they might
have been originally registered with.  Similarly, if the user creates
a net-listener, adds initial addresses, registers an async callback
with a non-default context (which ties to all siocs for the initial
addresses), then adds more addresses with qio_net_listener_add, the
siocs for later addresses are blindly placed in the global context,
rather than sharing the context of the earlier ones.

In practice, I don't think this has caused issues.  As pointed out by
the original commit, all async callers prior to that commit were
already okay with the NULL default context; and the typical usage
pattern is to first add ALL the addresses the listener will pay
attention to before ever setting the async callback.  Likewise, if a
file uses only qio_net_listener_set_client_func instead of
qio_net_listener_set_client_func_full, then it is never using a custom
context, so later assignments of async callbacks will still be to the
same global context as earlier ones.  Meanwhile, any callers that want
to do the sync operation to grab the first client are unlikely to
register an async callback; altogether bypassing the question of
whether later assignments of a GSource are being tied to a different
context over time.

I do note that chardev/char-socket.c is the only file that calls both
qio_net_listener_wait_client (sync for a single client in
tcp_chr_accept_server_sync), and qio_net_listener_set_client_func_full
(several places, all with chr->gcontext, but sometimes with a NULL
callback function during teardown).  But as far as I can tell, the two
uses are mutually exclusive, based on the is_waitconnect parameter to
qmp_chardev_open_socket_server.

That said, it is more robust to remember when an async callback
function is tied to a non-default context, and have both the sync wait
and any late address additions honor that same context.  That way, the
code will be robust even if a later user performs a sync wait for a
specific client in the middle of servicing a longer-lived
QIONetListener that has an async callback for all other clients.

CC: [email protected]
Fixes: 938c8b79 ("qio: store gsources for net listeners", v2.12.0)
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Message-ID: <[email protected]>
(cherry picked from commit b5676493a08b4ff80680aae7a1b1bfef8797c6e7)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: b048921c1c842ed8efd0f47304979823b4db7fbd
      
https://github.com/qemu/qemu/commit/b048921c1c842ed8efd0f47304979823b4db7fbd
  Author: Eric Blake <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M include/io/net-listener.h
    M io/net-listener.c

  Log Message:
  -----------
  qio: Protect NetListener callback with mutex

Without a mutex, NetListener can run into this data race between a
thread changing the async callback function to use when a
client connects, and the thread servicing polling of the listening
sockets:

  Thread 1:
       qio_net_listener_set_client_func(lstnr, f1, ...);
           => foreach sock: socket
               => object_ref(lstnr)
               => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);

  Thread 2:
       poll()
          => event POLLIN on socket
               => ref(GSourceCallback)
               => if (lstnr->io_func) // while lstnr->io_func is f1
                    ...interrupt..

  Thread 1:
       qio_net_listener_set_client_func(lstnr, f2, ...);
          => foreach sock: socket
               => g_source_unref(sock_src)
          => foreach sock: socket
               => object_ref(lstnr)
               => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);

  Thread 2:
               => call lstnr->io_func(lstnr->io_data) // now sees f2
               => return dispatch(sock)
               => unref(GSourceCallback)
                  => destroy-notify
                     => object_unref

Found by inspection; I did not spend the time trying to add sleeps or
execute under gdb to try and actually trigger the race in practice.
This is a SEGFAULT waiting to happen if f2 can become NULL because
thread 1 deregisters the user's callback while thread 2 is trying to
service the callback.  Other messes are also theoretically possible,
such as running callback f1 with an opaque pointer that should only be
passed to f2 (if the client code were to use more than just a binary
choice between a single async function or NULL).

Mitigating factor: if the code that modifies the QIONetListener can
only be reached by the same thread that is executing the polling and
async callbacks, then we are not in the two-thread race documented above
(even though poll can see two clients trying to connect in the same
window of time, any changes made to the listener by the first async
callback will be completed before the thread moves on to the second
client).  However, QEMU is complex enough that this is hard to
generically analyze.  If QMP commands (like nbd-server-stop) are run
in the main loop and the listener uses the main loop, things should be
okay.  But when a client uses an alternative GMainContext, or if
servicing a QMP command hands off to a coroutine to avoid blocking, I
am unable to state with certainty whether a given net listener can be
modified by a thread different from the polling thread running
callbacks.

At any rate, it is worth having the API be robust.  To ensure that
modifying a NetListener can be safely done from any thread, add a
mutex that guarantees atomicity to all members of a listener object
related to callbacks.  This problem has been present since
QIONetListener was introduced.
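
One way to picture the approach, using POSIX mutexes rather than QEMU's
own locking primitives: the setter updates the function and its opaque
data as a pair under the lock, and the dispatcher takes a snapshot of
the pair under the same lock before calling out. All toy_* names are
hypothetical, and whether the real patch invokes the callback outside
the lock is an assumption of this sketch:

```c
#include <pthread.h>
#include <stddef.h>

typedef void (*toy_client_func)(void *opaque);

struct toy_listener {
    pthread_mutex_t lock;     /* protects io_func and io_data together */
    toy_client_func io_func;
    void *io_data;
};

static void toy_listener_init(struct toy_listener *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->io_func = NULL;
    l->io_data = NULL;
}

/* Replace the callback and its data as one atomic step. */
static void toy_listener_set_func(struct toy_listener *l,
                                  toy_client_func func, void *data)
{
    pthread_mutex_lock(&l->lock);
    l->io_func = func;
    l->io_data = data;
    pthread_mutex_unlock(&l->lock);
}

/* Snapshot the pair under the lock, then call without holding it. */
static void toy_listener_dispatch(struct toy_listener *l)
{
    toy_client_func func;
    void *data;

    pthread_mutex_lock(&l->lock);
    func = l->io_func;
    data = l->io_data;
    pthread_mutex_unlock(&l->lock);

    if (func) {
        func(data);           /* never a mix of old func and new data */
    }
}

/* Example callback: counts accepted clients through the opaque pointer. */
static void toy_count_client(void *opaque)
{
    ++*(int *)opaque;
}
```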

Note that this does NOT prevent the case of a second round of the
user's old async callback being invoked with the old opaque data, even
when the user has already tried to change the async callback during
the first async callback; it is only about ensuring that there is no
sharding (the eventual io_func(io_data) call that does get made will
correspond to a particular combination that the user had requested at
some point in time, and not be sharded to a combination that never
existed in practice).  In other words, this patch maintains the status
quo that a user's async callback function already needs to be robust
to parallel clients landing in the same window of poll servicing, even
when only one client is desired, if that particular listener can be
amended in a thread other than the one doing the polling.

CC: [email protected]
Fixes: 53047392 ("io: introduce a network socket listener API", v2.12.0)
Signed-off-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
[eblake: minor commit message wording improvements]
Signed-off-by: Eric Blake <[email protected]>
(cherry picked from commit 9d86181874ab7b0e95ae988f6f80715943c618c6)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: ed8bb28165852bbbded0fe26ed4acd924bcbdcef
      
https://github.com/qemu/qemu/commit/ed8bb28165852bbbded0fe26ed4acd924bcbdcef
  Author: Peter Maydell <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M hw/net/e1000e_core.c

  Log Message:
  -----------
  hw/net/e1000e_core: Don't advance desc_offset for NULL buffer RX descriptors

In e1000e_write_packet_to_guest() we don't write data for RX descriptors
where the buffer address is NULL (as required by the i82574 datasheet
section 7.1.7.2). However, when we do this we still update desc_offset
by the amount of data we would have written to the RX descriptor if
it had a valid buffer pointer, resulting in our dropping that data
entirely. The data sheet is not 100% clear on the subject, but this
seems unlikely to be the correct behaviour.

Rearrange the null-descriptor logic so that we don't treat these
do-nothing descriptors as if we'd really written the data.
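
A much-simplified model of the intended behaviour, with hypothetical
types that ignore packet split, checksums and everything else the real
e1000e_write_packet_to_guest() handles: a descriptor without a buffer is
consumed, but the offset into the packet does not move.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_rx_desc {
    bool has_buffer;      /* false models a NULL buffer address */
    size_t buf_size;
    size_t written;       /* bytes of packet data placed here */
};

/* Returns the number of packet bytes actually placed in descriptors. */
static size_t toy_write_packet(struct toy_rx_desc *descs, size_t ndescs,
                               size_t pkt_len)
{
    size_t desc_offset = 0;

    for (size_t i = 0; i < ndescs && desc_offset < pkt_len; i++) {
        size_t chunk = pkt_len - desc_offset;

        descs[i].written = 0;
        if (!descs[i].has_buffer) {
            continue;     /* uses up the descriptor, not the data */
        }
        if (chunk > descs[i].buf_size) {
            chunk = descs[i].buf_size;
        }
        descs[i].written = chunk;
        desc_offset += chunk;
    }
    return desc_offset;
}
```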

This both fixes a bug and also is a prerequisite to cleaning up
the size calculation logic in the next patch.

(Cc to stable largely because it will be needed for the next patch,
which fixes a more serious bug.)

Cc: [email protected]
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Akihiko Odaki <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
(cherry picked from commit 6da0c9828194eb21e54fe4264cd29a1b85a29f33)
(Mjt: context fixup in hw/net/e1000e_core.c:e1000e_write_packet_to_guest())
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 84b22d847c484aae06e40ac5d9f1eecd75a7716f
      
https://github.com/qemu/qemu/commit/84b22d847c484aae06e40ac5d9f1eecd75a7716f
  Author: Peter Maydell <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M hw/net/e1000e_core.c

  Log Message:
  -----------
  hw/net/e1000e_core: Correct rx oversize packet checks

In e1000e_write_packet_to_guest() we attempt to ensure that we don't
write more of a packet to a descriptor than will fit in the guest
configured receive buffers.  However, this code does not allow for
the "packet split" feature.  When packet splitting is enabled, the
first of up to 4 buffers in the descriptor is used for the packet
header only, with the payload going into buffers 2, 3 and 4.  Our
length check only checks against the total sizes of all 4 buffers,
which meant that if an incoming packet was large enough to fit in (1
+ 2 + 3 + 4) but not into (2 + 3 + 4) and packet splitting was
enabled, we would run into the assertion in
e1000e_write_hdr_frag_to_rx_buffers() that we had enough buffers for
the data:

qemu-system-i386: ../../hw/net/e1000e_core.c:1418: void e1000e_write_payload_frag_to_rx_buffers(E1000ECore *, hwaddr *, E1000EBAState *, const char *, dma_addr_t): Assertion `bastate->cur_idx < MAX_PS_BUFFERS' failed.

A malicious guest could provoke this assertion by configuring the
device into loopback mode, and then sending itself a suitably sized
packet into a suitably arranged rx descriptor.

The code also fails to deal with the possibility that the descriptor
buffers are sized such that the trailing checksum word does not fit
into the last descriptor which has actual data, which might also
trigger this assertion.

Rework the length handling to use two variables:
 * desc_size is the total amount of data DMA'd to the guest
   for the descriptor being processed in this iteration of the loop
 * rx_desc_buf_size is the total amount of space left in it

As we copy data to the guest (packet header, payload, checksum),
update these two variables.  (Previously we attempted to calculate
desc_size once at the top of the loop, but this is too difficult to
do correctly.) Then we can use the variables to ensure that we clamp
the amount of copied payload data to the remaining space in the
descriptor's buffers, even if we've used one of the buffers up in the
packet-split code, and we can tell whether we have enough space for
the full checksum word in this descriptor or whether we're going to
need to split that to the following descriptor.
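
The two-variable accounting described above might be sketched as
follows. This is a simplified model, not the patch itself: the
SketchDescResult type, sketch_fill_desc() and SKETCH_FCS_LEN are
hypothetical names, the checksum word is assumed to be 4 bytes, and the
sketch only reports whether the whole checksum fits rather than
splitting it across descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PS_BUFFERS 4
#define SKETCH_FCS_LEN        4   /* assumed size of the checksum word */

typedef struct {
    size_t desc_size;      /* bytes DMA'd to the guest for this descriptor */
    size_t payload_copied; /* payload bytes placed in this descriptor */
    bool   fcs_fits;       /* whole checksum word fits here as well */
} SketchDescResult;

static SketchDescResult
sketch_fill_desc(const size_t buf_size[SKETCH_MAX_PS_BUFFERS],
                 bool packet_split, size_t hdr_len, size_t payload_left)
{
    SketchDescResult r = { 0, 0, false };
    size_t rx_desc_buf_size = 0; /* space still free in this descriptor */
    size_t copy;

    for (int i = 0; i < SKETCH_MAX_PS_BUFFERS; i++) {
        rx_desc_buf_size += buf_size[i];
    }

    if (packet_split) {
        /* Buffer 0 holds the header only; its spare room is unusable. */
        assert(hdr_len <= buf_size[0]);
        r.desc_size += hdr_len;
        rx_desc_buf_size -= buf_size[0];
    }

    /* Clamp the payload to the space that is really left. */
    copy = payload_left < rx_desc_buf_size ? payload_left : rx_desc_buf_size;
    r.payload_copied = copy;
    r.desc_size += copy;
    rx_desc_buf_size -= copy;

    /* The checksum goes here only after the last payload byte, and
     * only if there is room for all of it. */
    r.fcs_fits = (copy == payload_left) && rx_desc_buf_size >= SKETCH_FCS_LEN;
    if (r.fcs_fits) {
        r.desc_size += SKETCH_FCS_LEN;
    }
    return r;
}
```

For buffers of 128, 256, 256 and 256 bytes, a 64-byte header and an
800-byte payload fit in the sum of all four buffers but not in buffers
2 to 4; with packet splitting enabled the sketch clamps the payload to
768 bytes rather than overrunning.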

I have included comments that hopefully help to make the loop
logic a little clearer.

Cc: [email protected]
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/537
Reviewed-by: Akihiko Odaki <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
(cherry picked from commit 9d946d56a2ac8a6c2df186e20d24810255c83a3f)
(Mjt: rename e1000e_write_payload_frag_to_rx_buffers back to
 e1000e_write_to_rx_buffers for 7.2.x, to compensate for missing in 7.2.x
 v8.1.0-693-g17ccd0164796 "igb: RX payload guest writting refactoring")
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 28efd5e5dd204810250d19e10fb89f4aa0c5161a
      
https://github.com/qemu/qemu/commit/28efd5e5dd204810250d19e10fb89f4aa0c5161a
  Author: Peter Maydell <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M hw/net/e1000e_core.c

  Log Message:
  -----------
  hw/net/e1000e_core: Adjust e1000e_write_payload_frag_to_rx_buffers() assert

An assertion in e1000e_write_payload_frag_to_rx_buffers() attempts to
guard against the calling code accidentally trying to write too much
data to a single RX descriptor, such that the E1000EBAState::cur_idx
indexes off the end of the E1000EBAState::written[] array.

Unfortunately it is overzealous: it asserts that cur_idx is in
range after it has been incremented. This will fire incorrectly
for the case where the guest configures four buffers and exactly
enough bytes are written to fill all four of them.

The only places where we use cur_idx and index into the written[]
array are the functions e1000e_write_hdr_frag_to_rx_buffers() and
e1000e_write_payload_frag_to_rx_buffers(), so we can rewrite this to
assert before doing the array dereference, rather than asserting
after updating cur_idx.
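
A minimal sketch of the "assert before the dereference" pattern, using
hypothetical names (SketchBAState, sketch_write_to_buffers()) and a
simplified buffer model rather than the real device code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PS_BUFFERS 4

typedef struct {
    uint16_t written[SKETCH_MAX_PS_BUFFERS]; /* bytes used per buffer */
    uint8_t  cur_idx;                        /* buffer being filled */
} SketchBAState;

/* Account for len bytes written across the descriptor's buffers. */
static void sketch_write_to_buffers(SketchBAState *s,
                                    const uint16_t buf_size[SKETCH_MAX_PS_BUFFERS],
                                    size_t len)
{
    while (len) {
        size_t room, n;

        /*
         * Check the index right before it is used. cur_idx may equal
         * SKETCH_MAX_PS_BUFFERS once the last buffer is exactly full;
         * that is only a bug if we then try to write more data.
         */
        assert(s->cur_idx < SKETCH_MAX_PS_BUFFERS);

        room = (size_t)(buf_size[s->cur_idx] - s->written[s->cur_idx]);
        n = len < room ? len : room;
        s->written[s->cur_idx] = (uint16_t)(s->written[s->cur_idx] + n);
        len -= n;

        if (s->written[s->cur_idx] == buf_size[s->cur_idx]) {
            s->cur_idx++; /* no assertion here: 4 is a valid end state */
        }
    }
}
```

Filling four 4-byte buffers with exactly 16 bytes leaves cur_idx at 4
without tripping the assertion; only a 17th byte would.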

Cc: [email protected]
Reviewed-by: Akihiko Odaki <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
(cherry picked from commit bab496a18358643b686f69e2b97d73fb98d37e79)
(Mjt: in 7.2.x it is e1000e_write_to_rx_buffers, not
 e1000e_write_payload_frag_to_rx_buffers, due to missing in 7.2.x
 v8.1.0-693-g17ccd0164796 "igb: RX payload guest writting refactoring")
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 516bab6fdfadb2e800aa2a88ad30d20e90b0258d
      
https://github.com/qemu/qemu/commit/516bab6fdfadb2e800aa2a88ad30d20e90b0258d
  Author: Peter Maydell <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M net/net.c

  Log Message:
  -----------
  net: pad packets to minimum length in qemu_receive_packet()

In commits like 969e50b61a28 ("net: Pad short frames to minimum size
before sending from SLiRP/TAP") we switched away from requiring
network devices to handle short frames to instead having the net core
code do the padding of short frames out to the ETH_ZLEN minimum size.
We then dropped the code for handling short frames from the network
devices in a series of commits like 140eae9c8f7 ("hw/net: e1000:
Remove the logic of padding short frames in the receive path").

This missed one route where the device's receive code can still see a
short frame: if the device is in loopback mode and it transmits a
short frame via the qemu_receive_packet() function, this will be fed
back into its own receive code without being padded.

Add the padding logic to qemu_receive_packet().
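
As an illustration of what padding a short frame involves, here is a
standalone sketch. It is not the code added to net/net.c:
sketch_pad_short_frame() and SKETCH_ETH_ZLEN are hypothetical names,
and the minimum length is assumed to be 60 bytes (an Ethernet frame
without its FCS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ZLEN 60 /* assumed minimum frame length, FCS excluded */

/*
 * If the frame in src is shorter than the minimum, copy it to dst and
 * zero-fill the tail. Returns the padded length, or 0 when no padding
 * was needed and the caller should keep using src unchanged.
 */
static size_t sketch_pad_short_frame(uint8_t *dst, size_t dst_cap,
                                     const uint8_t *src, size_t len)
{
    assert(dst_cap >= SKETCH_ETH_ZLEN);

    if (len >= SKETCH_ETH_ZLEN) {
        return 0;
    }
    memcpy(dst, src, len);
    memset(dst + len, 0, SKETCH_ETH_ZLEN - len);
    return SKETCH_ETH_ZLEN;
}
```

A receive path that always sees frames of at least this length no
longer has to handle the short-frame case itself.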

This fixes a buffer overrun which can be triggered in the
e1000_receive_iov() logic via the loopback code path.

Other devices that use qemu_receive_packet() to implement loopback
are cadence_gem, dp8393x, lan9118, msf2-emac, pcnet, rtl8139
and sungem.

Cc: [email protected]
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3043
Reviewed-by: Akihiko Odaki <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
(cherry picked from commit a01344d9d78089e9e585faaeb19afccff2050abf)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: ed8911d1c66c8a83df3259ca64007f8b8d938ab8
      
https://github.com/qemu/qemu/commit/ed8911d1c66c8a83df3259ca64007f8b8d938ab8
  Author: Peter Maydell <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M hw/display/xlnx_dp.c

  Log Message:
  -----------
  hw/display/xlnx_dp.c: Don't abort on AUX FIFO overrun/underrun

The documentation of the Xilinx DisplayPort subsystem at
https://www.xilinx.com/support/documents/ip_documentation/v_dp_txss1/v3_1/pg299-v-dp-txss1.pdf
doesn't say what happens if a guest tries to issue an AUX write
command with a length greater than the amount of data in the AUX
write FIFO, or tries to write more data to the write FIFO than it can
hold, or issues multiple commands that put data into the AUX read
FIFO without reading it such that it overflows.

Currently QEMU will abort() in these guest-error situations, either
in xlnx_dp.c itself or in the fifo8 code.  Make these cases all be
logged as guest errors instead.  We choose to ignore the new data on
overflow, and return 0 on underflow. This is in line with how we handled
the "read from empty RX FIFO" case in commit a09ef5040477.
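
The chosen policy (ignore new data on overflow, return 0 on underflow,
log instead of aborting) can be sketched with a small standalone FIFO.
This does not use QEMU's fifo8 API or logging functions: SketchFifo,
its 16-byte capacity and the fprintf() calls standing in for
guest-error logging are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_FIFO_SIZE 16 /* assumed capacity, for illustration only */

typedef struct {
    uint8_t  data[SKETCH_FIFO_SIZE];
    unsigned head; /* index of the oldest byte */
    unsigned num;  /* bytes currently stored */
} SketchFifo;

/* Push one byte; on overflow, log and drop the new byte. */
static bool sketch_fifo_push(SketchFifo *f, uint8_t v)
{
    assert(f != NULL);
    if (f->num == SKETCH_FIFO_SIZE) {
        fprintf(stderr, "guest error: FIFO overflow, data ignored\n");
        return false;
    }
    f->data[(f->head + f->num) % SKETCH_FIFO_SIZE] = v;
    f->num++;
    return true;
}

/* Pop one byte; on underflow, log and return 0. */
static uint8_t sketch_fifo_pop(SketchFifo *f)
{
    uint8_t v;

    assert(f != NULL);
    if (f->num == 0) {
        fprintf(stderr, "guest error: FIFO underflow, returning 0\n");
        return 0;
    }
    v = f->data[f->head];
    f->head = (f->head + 1) % SKETCH_FIFO_SIZE;
    f->num--;
    return v;
}
```

Either misuse leaves the FIFO in a consistent state, so the guest can
keep running after the error is logged.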

Cc: [email protected]
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1418
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1419
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1424
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Message-id: [email protected]
(cherry picked from commit f52db7f34242d3398bab0bacaa3e5dde99be5258)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: 842f4f8db399aee9b3a0138ed0b2b707984d3582
      
https://github.com/qemu/qemu/commit/842f4f8db399aee9b3a0138ed0b2b707984d3582
  Author: Peter Maydell <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M hw/display/xlnx_dp.c

  Log Message:
  -----------
  hw/display/xlnx_dp: Don't abort for unsupported graphics formats

If the guest writes an invalid or unsupported value to the
AV_BUF_FORMAT register, currently we abort().  Instead, log this as
either a guest error or an unimplemented error and continue.

The existing code treats DP_NL_VID_CB_Y0_CR_Y1 as x8b8g8r8
via a "case 0" that does not use the enum constant name for some
reason; we leave that alone beyond adding a comment about the
weird code.

Documentation of this register seems to be at:
https://docs.amd.com/r/en-US/ug1087-zynq-ultrascale-registers/AV_BUF_FORMAT-DISPLAY_PORT-Register

Cc: [email protected]
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1415
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: [email protected]
(cherry picked from commit 032333eba77b83dfbd74071cc2971f0bda9a3d4f)
Signed-off-by: Michael Tokarev <[email protected]>


  Commit: d0a90254f1a47cea08f8bd1e37deac756283214c
      
https://github.com/qemu/qemu/commit/d0a90254f1a47cea08f8bd1e37deac756283214c
  Author: Peter Maydell <[email protected]>
  Date:   2025-11-15 (Sat, 15 Nov 2025)

  Changed paths:
    M hw/misc/npcm7xx_clk.c

  Log Message:
  -----------
  hw/misc/npcm_clk: Don't divide by zero when calculating frequency

If the guest misprograms the PLL registers to request a zero
divisor, we currently fall over with a division by zero:

../../hw/misc/npcm_clk.c:221:14: runtime error: division by zero
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../hw/misc/npcm_clk.c:221:14

Thread 1 "qemu-system-aar" received signal SIGFPE, Arithmetic exception.
0x00005555584d8f6d in npcm7xx_clk_update_pll (opaque=0x7fffed159a20) at ../../hw/misc/npcm_clk.c:221
221             freq /= PLLCON_INDV(con) * PLLCON_OTDV1(con) * PLLCON_OTDV2(con);

Avoid this by treating this invalid setting like a stopped clock
(setting freq to 0).
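
A minimal sketch of the guard, assuming the frequency is the reference
clock multiplied by a feedback divider and divided by the product of
three divider fields. sketch_pll_freq() and its parameters are
hypothetical and do not reproduce the register decoding done in the
device model.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute a PLL output frequency. A zero divisor is an invalid guest
 * setting and is treated like a stopped clock (frequency 0) rather
 * than being divided by.
 */
static uint64_t sketch_pll_freq(uint64_t ref_hz, uint32_t fbdv,
                                uint32_t indv, uint32_t otdv1,
                                uint32_t otdv2)
{
    uint64_t divisor = (uint64_t)indv * otdv1 * otdv2;

    if (divisor == 0) {
        return 0;
    }
    return ref_hz * fbdv / divisor;
}
```

For example, a 25 MHz reference with fbdv = 32 and dividers 2, 2 and 1
gives 200 MHz, while any zero divider field gives 0.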

Cc: [email protected]
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/549
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: [email protected]
(cherry picked from commit 5fc50b4ec841c8a01e7346c2c804088fc3accb6b)
Signed-off-by: Michael Tokarev <[email protected]>


Compare: https://github.com/qemu/qemu/compare/ee4e62749d53...d0a90254f1a4

