On Fri, 20 Mar 2026 at 05:57, Sun Haoyu <[email protected]> wrote:
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3343

We generally put the Resolves: tag line at the bottom of the
commit message, above the Signed-off-by: line.

> The Linux kernel writes back the remaining timeout for select-family
> syscalls in poll_select_finish(). If that writeback fails, it keeps
> the original return value.
>
> However, QEMU only writes back the timeout on success. If the writeback
> fails, QEMU returns -TARGET_EFAULT. This can lose the remaining
> timeout and change the return value.
>
> Update do_select(), do_pselect6(), and do_ppoll() to always write back
> the timeout to match the Linux kernel's behavior. If the timeout
> writeback fails, keep the original return value.

This surprised me, but it really is what the kernel does:
if the select/poll proper succeeds, it returns the success
value even if the writeback fails:

        /*
         * If an application puts its timeval in read-only memory, we
         * don't want the Linux-specific update to the timeval to
         * cause a fault after the select has completed
         * successfully. However, because we're not updating the
         * timeval, we can't restart the system call.
         */


> Tested with the issue reproducer.
>
> Signed-off-by: Sun Haoyu <[email protected]>
> ---
> Changes since v1:
> - write back the timeout in all cases
> - handle do_select() too
> - keep the original return value if timeout writeback fails
>
>  linux-user/syscall.c | 31 ++++++++++++++++---------------
>  1 file changed, 16 insertions(+), 15 deletions(-)
>
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 064bc604c9..877a3b8bb6 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -1394,13 +1394,12 @@ static abi_long do_select(int n,
>              return -TARGET_EFAULT;
>          if (efd_addr && copy_to_user_fdset(efd_addr, &efds, n))
>              return -TARGET_EFAULT;
> -
> -        if (target_tv_addr) {
> -            tv.tv_sec = ts.tv_sec;
> -            tv.tv_usec = ts.tv_nsec / 1000;
> -            if (copy_to_user_timeval(target_tv_addr, &tv)) {
> -                return -TARGET_EFAULT;
> -            }
> +    }
> +    if (target_tv_addr) {
> +        tv.tv_sec = ts.tv_sec;
> +        tv.tv_usec = ts.tv_nsec / 1000;
> +        if (copy_to_user_timeval(target_tv_addr, &tv)) {
> +            return ret;

We're about to "return ret" anyway, so we can simplify this
by dropping the if(), with a comment about why it's OK:

           /*
            * Like the kernel, we deliberately ignore possible
            * failures writing back to the timeval struct.
            */
           copy_to_user_timeval(target_tv_addr, &tv);

and similarly with the other functions.

>          }
>      }
>
> @@ -1529,16 +1528,18 @@ static abi_long do_pselect6(abi_long arg1, abi_long 
> arg2, abi_long arg3,
>          if (efd_addr && copy_to_user_fdset(efd_addr, &efds, n)) {
>              return -TARGET_EFAULT;
>          }
> +    }
> +    if (ts_addr) {
>          if (time64) {
> -            if (ts_addr && host_to_target_timespec64(ts_addr, &ts)) {
> -                return -TARGET_EFAULT;
> +            if (host_to_target_timespec64(ts_addr, &ts)) {
> +                return ret;
>              }
>          } else {
> -            if (ts_addr && host_to_target_timespec(ts_addr, &ts)) {
> -                return -TARGET_EFAULT;
> +            if (host_to_target_timespec(ts_addr, &ts)) {
> +                return ret;
>              }
>          }
> -    }
> +     }

The indent seems to have been accidentally changed here.

>      return ret;
>  }
>  #endif

Otherwise this looks good to me.

thanks
-- PMM

Reply via email to