Re: tar does not support partial reads

Greg Kurz Tue, 20 Jul 2021 07:41:34 -0700

On Tue, 20 Jul 2021 13:26:50 +0200
Christian Schoenebeck <[email protected]> wrote:


> On Dienstag, 20. Juli 2021 08:27:45 CEST Petr Pisar wrote:
> > V Mon, Jul 19, 2021 at 03:39:53PM -0500, Paul Eggert napsal(a):
> > > On 7/19/21 7:54 AM, Christian Schoenebeck wrote:
> > > > POSIX compliant applications must always expect that read() /
> > > > write() functions might read/write less bytes than requested
> > > 
> > > Although that's true in general, it's not true for regular files. The
> > > POSIX spec for 'read'
> > > <https://pubs.opengroup.org/onlinepubs/9699919799/functions/pread.html>
> > > says, "The value returned may be less than /nbyte/ if the number of
> > > bytes left in the file is less than /nbyte/, if the /read/() request was
> > > interrupted by a signal, or if the file is a pipe or FIFO or special
> > > file and has fewer than /nbyte/ bytes immediately available for reading."
> > > 
> > > So, regular files shouldn't get short reads unless there's an EOF or a
> > > signal.
> > 
> > What does gaurantee there is no signal sent to the process?
> > 
> > -- Petr
> 
> Well, that's one point, but I cannot deny that Paul has a valid argument 
> there 
> as well. 
> 
> However it is common practice to make applications capable to deal with short 
> reads independent of any prerequisites like specific file types. And like I 
> said in my previous email, as far as the Linux kernel is concerned, it 
> clearly 
> sais that applications must be capable of short reads at any time and 
> independent of a specific file type. BSD is yet a another story though.
> 
> And BTW it is actually not QEMU responsible for this particular behaviour, 
> but 
> rather the stock Linux kernel's 9p client that exposes this behaviour to 
> applications: https://github.com/torvalds/linux/tree/master/fs/9p
> 

I agree that nothing can be done at the QEMU level to fix that : the virtio-9p
device is simply filling the buffer sized by the client at mount time. It 
doesn't
know anything about the count argument passed to read() by the application.

So I had a look at the 9p client code in linux and we have :

static ssize_t
v9fs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
        struct p9_fid *fid = iocb->ki_filp->private_data;
        int ret, err = 0;

        p9_debug(P9_DEBUG_VFS, "count %zu offset %lld\n",
                 iov_iter_count(to), iocb->ki_pos);

        if (iocb->ki_filp->f_flags & O_NONBLOCK)
                ret = p9_client_read_once(fid, iocb->ki_pos, to, &err);
        else
                ret = p9_client_read(fid, iocb->ki_pos, to, &err);
        if (!ret)
                return err;

        iocb->ki_pos += ret;
        return ret;
}

p9_client_read_once() sends a single request and propagates
short reads to the caller, while p9_client_read() implements
a full read, i.e. loops on p9_client_read_once() until all the
requested data was read.

strace on tar shows that tar is setting the O_NONBLOCK flag:

openat(AT_FDCWD, "register.h", 
O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 4
...
read(4, "/*\n * Copyright 2006-2018  Thoma"..., 9728) = 4096

This explains why tag is getting short reads.

Looking more closely at what POSIX says about O_NONBLOCK:

When attempting to read a file (other than a pipe or FIFO) that supports 
non-blocking reads and has no data currently available:

    - If O_NONBLOCK is set, read() shall return -1 and set errno to [EAGAIN].

    - If O_NONBLOCK is clear, read() shall block the calling thread until some 
data becomes available.

    - The use of the O_NONBLOCK flag has no effect if there is some data 
available.

and

[EAGAIN]
    The file is neither a pipe, nor a FIFO, nor a socket, the O_NONBLOCK flag 
is set for the file descriptor, and the thread would be delayed in the read 
operation.

The case of the reported issue is thus "O_NONBLOCK is set and some data
is available", which should lead O_NONBLOCK to be ignored, i.e. switch
to a full read instead of propagating the short read IIUC.

Makes sense ?

Cc'ing Dominique and v9fs-developer for greater audience.

> Independent of 9p, you may encounter short reads with network mounted file 
> systems in general as well.
> 
> The rationale behind this exposed behaviour is to allow each application to 
> decide whether they want to consume the partial data currently available and 
> (potentially) reduce the app's overall execution time, or rather to wait for 
> the full amount of data to become available by calling read() again.
> 
> Was there a specific reason in the past for tar to switch from gnulib's 
> (short-read capable) full_read() to safe_read() in 1999?
> 
> Best regards,
> Christian Schoenebeck
> 
>

Re: tar does not support partial reads

Reply via email to