On Mon, 2013-02-25 at 12:22 -0500, Ulrich Drepper wrote:
> When using sendfile with a non-blocking output file descriptor for a
> socket, the operation can result in a partial write because of capacity
> issues.  This is nothing critical and the operation could resume after
> the output queue is cleared.  The problem is: there is no way to
> determine where to resume.
> 
> The system call just returns -EAGAIN without any further indication.
> The caller doesn't know what to resend.
> 
> And this even though the interface of sendfile would be capable of
> communicating this information and the man page (I know, it's not
> authoritative) describes this behavior as well.
> 
> The problem is probably in a few places, here is one (fs/splice.c):
> 
> static ssize_t default_file_splice_write(struct pipe_inode_info *pipe,
>                                          struct file *out, loff_t *ppos,
>                                          size_t len, unsigned int flags)
> {
>         ssize_t ret;
> 
>         ret = splice_from_pipe(pipe, out, ppos, len, flags, write_pipe_buf);
>         if (ret > 0)
>                 *ppos += ret;
> 
>         return ret;
> }
> 
> Note that *ppos is only updated if the call doesn't fail.  We could
> also update the position if ret == -EAGAIN.  This would require
> re-architecting the system a bit to either update *ppos in
> splice_from_pipe etc. or to communicate the number of bytes
> written back from the splice_from_pipe call.  In any case, the result would
> be that the caller knows where to resume the operation.
> 
> I would argue that this doesn't break the ABI.  If existing
> programs today just resend packets from the beginning, they will
> have sent an unpredictable number of bytes in the previous sendfile()
> call, making the state of the communication unpredictable.
> 
> Opinions?  I think as is sendfile() isn't useful with O_NONBLOCK.
> --

I don't understand the issue.

sendfile() returns -EAGAIN (seen from userspace as -1 with errno set to
EAGAIN) only if no bytes were copied to the socket.

If some bytes were copied, sendfile() returns the number of bytes,
exactly like write() would do for a partial write.

I guess the following should work (well... with better tests):

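offset = 0;
while (offset < len) {
    /* With a non-NULL offset pointer, sendfile() itself advances
     * offset by the number of bytes sent, so don't add res again. */
    res = sendfile(sock, fd, &offset, len - offset);
    if (res > 0)
        continue;
    if (res == 0)
        break;              /* end of input file reached early */
    if (errno != EAGAIN)
        break;              /* real error */
    wait_some_event();      /* e.g. wait until sock is writable */
}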
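
For reference, here is a slightly more complete sketch of the same loop, assuming Linux and poll() as the "wait for some event" step. The helper names set_nonblocking() and send_range_nonblock() are made up for illustration and are not part of any existing API. Treat it as a starting point under those assumptions, not as a tested implementation.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: send bytes [*offset, end) of fd to a non-blocking
 * descriptor sock.  Returns 0 when the loop finishes without error and
 * -1 (errno set) on error.  If the input file ends early, the loop stops
 * and the caller can detect this by comparing *offset with end.
 */
static int send_range_nonblock(int sock, int fd, off_t *offset, off_t end)
{
    assert(offset != NULL);

    while (*offset < end) {
        ssize_t res = sendfile(sock, fd, offset, (size_t)(end - *offset));

        if (res > 0)
            continue;       /* kernel already advanced *offset by res */
        if (res == 0)
            break;          /* end of input file */
        if (errno == EINTR)
            continue;       /* interrupted before sending, retry */
        if (errno != EAGAIN)
            return -1;      /* real error */

        /* Nothing was sent: wait until sock accepts more data. */
        struct pollfd pfd = { .fd = sock, .events = POLLOUT };

        while (poll(&pfd, 1, -1) < 0) {
            if (errno != EINTR)
                return -1;
        }
    }
    return 0;
}
```

Since the resume position lives in *offset, a caller that wants to do other work instead of blocking in poll() can keep the offset around, return to its event loop on EAGAIN, and call sendfile() again later with the same offset.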



