On Fri, 20 Mar 2026 at 05:57, Sun Haoyu <[email protected]> wrote:
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3343
We generally put the Resolves: tag line at the bottom of the
commit message, above the Signed-off-by: line.
> The Linux kernel writes back the remaining timeout for select-family
> syscalls in poll_select_finish(). If that writeback fails, it keeps
> the original return value.
>
> However, QEMU only writes back the timeout on success. If the writeback
> fails, QEMU returns -TARGET_EFAULT. This can lose the remaining
> timeout and change the return value.
>
> Update do_select(), do_pselect6(), and do_ppoll() to always write back
> the timeout to match the Linux kernel's behavior. If the timeout
> writeback fails, keep the original return value.
This surprised me, but it really is what the kernel does:
if the select/poll proper succeeds, it returns the success
value even if the writeback fails:
/*
* If an application puts its timeval in read-only memory, we
* don't want the Linux-specific update to the timeval to
* cause a fault after the select has completed
* successfully. However, because we're not updating the
* timeval, we can't restart the system call.
*/
> Tested with the issue reproducer.
>
> Signed-off-by: Sun Haoyu <[email protected]>
> ---
> Changes since v1:
> - write back the timeout in all cases
> - handle do_select() too
> - keep the original return value if timeout writeback fails
>
> linux-user/syscall.c | 31 ++++++++++++++++---------------
> 1 file changed, 16 insertions(+), 15 deletions(-)
>
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 064bc604c9..877a3b8bb6 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -1394,13 +1394,12 @@ static abi_long do_select(int n,
> return -TARGET_EFAULT;
> if (efd_addr && copy_to_user_fdset(efd_addr, &efds, n))
> return -TARGET_EFAULT;
> -
> - if (target_tv_addr) {
> - tv.tv_sec = ts.tv_sec;
> - tv.tv_usec = ts.tv_nsec / 1000;
> - if (copy_to_user_timeval(target_tv_addr, &tv)) {
> - return -TARGET_EFAULT;
> - }
> + }
> + if (target_tv_addr) {
> + tv.tv_sec = ts.tv_sec;
> + tv.tv_usec = ts.tv_nsec / 1000;
> + if (copy_to_user_timeval(target_tv_addr, &tv)) {
> + return ret;
We're about to "return ret" anyway, so we can simplify this
by dropping the if(), with a comment about why it's OK:
/*
* Like the kernel, we deliberately ignore possible
* failures writing back to the timeval struct.
*/
copy_to_user_timeval(target_tv_addr, &tv);
and similarly with the other functions.
> }
> }
>
> @@ -1529,16 +1528,18 @@ static abi_long do_pselect6(abi_long arg1, abi_long
> arg2, abi_long arg3,
> if (efd_addr && copy_to_user_fdset(efd_addr, &efds, n)) {
> return -TARGET_EFAULT;
> }
> + }
> + if (ts_addr) {
> if (time64) {
> - if (ts_addr && host_to_target_timespec64(ts_addr, &ts)) {
> - return -TARGET_EFAULT;
> + if (host_to_target_timespec64(ts_addr, &ts)) {
> + return ret;
> }
> } else {
> - if (ts_addr && host_to_target_timespec(ts_addr, &ts)) {
> - return -TARGET_EFAULT;
> + if (host_to_target_timespec(ts_addr, &ts)) {
> + return ret;
> }
> }
> - }
> + }
The indent seems to have been accidentally changed here.
> return ret;
> }
> #endif
Otherwise this looks good to me.
thanks
-- PMM