[PATCH] Use sendfile to copy data between file descriptors
Busybox already uses sendfile in httpd. This patch proposes to use it globally to copy data between file descriptors. It speeds up the copying on slow systems a lot - below are the times needed to copy a 450Mb file with and without this option enabled on a BeagleBone Black: sendfile: user0m0.000s sys 0m8.170s read/write: user0m0.470s sys 0m16.300s It doesn't add a lot of bloat either: function old new delta bb_full_fd_action233 311 +78 -- (add/remove: 0/0 grow/shrink: 1/0 up/down: 78/0) Total: 78 bytes textdata bss dec hex filename 81283941239552 826514 c9c92 busybox_old 81291741239552 826592 c9ce0 busybox_unstripped This patch also moves USE_SENDFILE feature from httpd subconfiguration to global settings. Signed-off-by: Bartosz Golaszewski bartekg...@gmail.com --- Config.in | 8 +++ libbb/copyfd.c| 59 --- networking/Config.src | 8 --- networking/httpd.c| 6 +++--- 4 files changed, 58 insertions(+), 23 deletions(-) diff --git a/Config.in b/Config.in index b83beb5..4b5ea69 100644 --- a/Config.in +++ b/Config.in @@ -264,6 +264,14 @@ config PAM Use PAM in some busybox applets (currently login and httpd) instead of direct access to password database. +config FEATURE_USE_SENDFILE +bool Use sendfile system call +default y +depends on HTTPD +help + When enabled, busybox will use the kernel sendfile() function + instead of read/write loops where applicable. + config LONG_OPTS bool Support for --long-options default y diff --git a/libbb/copyfd.c b/libbb/copyfd.c index eda2747..e10fe82 100644 --- a/libbb/copyfd.c +++ b/libbb/copyfd.c @@ -9,6 +9,28 @@ #include libbb.h +#if ENABLE_FEATURE_USE_SENDFILE +#include sys/sendfile.h +#endif + +/* + * Returns 1 if all bytes have been copied, 0 otherwise. + */ +static int check_status(int *status, off_t *size, off_t *total, ssize_t rd) +{ + *total += rd; + if (*status 0) { /* if we aren't copying till EOF... */ + *size -= rd; + if (!(*size)) { + /* 'size' bytes copied - all done */ + *status = 0; + return 1; + } + } + + return 0; +} + /* Used by NOFORK applets (e.g. cat) - must not use xmalloc. * size 0 means ignore write errors, used by tar --to-command * size = 0 means copy till EOF @@ -18,6 +40,7 @@ static off_t bb_full_fd_action(int src_fd, int dst_fd, off_t size) int status = -1; off_t total = 0; bool continue_on_write_error = 0; + ssize_t rd; #if CONFIG_FEATURE_COPYBUF_KB = 4 char buffer[CONFIG_FEATURE_COPYBUF_KB * 1024]; enum { buffer_size = sizeof(buffer) }; @@ -56,10 +79,29 @@ static off_t bb_full_fd_action(int src_fd, int dst_fd, off_t size) status = 1; /* copy until eof */ } - while (1) { - ssize_t rd; +#if ENABLE_FEATURE_USE_SENDFILE + { + ssize_t sz = size (size buffer_size) + ? size : MAXINT(ssize_t) - 0x; - rd = safe_read(src_fd, buffer, size buffer_size ? buffer_size : size); + while (1) { + rd = sendfile(dst_fd, src_fd, NULL, sz); + if (rd = 0) { + /* Might be EOF, might be an error, +* to make sure fall back to the read-write +* loop. +*/ + break; + } else if (check_status(status, size, total, rd)) { + break; + } + } + } +#endif + + while (1) { + rd = safe_read(src_fd, buffer, + size buffer_size ? buffer_size : size); if (!rd) { /* eof - all done */ status = 0; @@ -80,15 +122,8 @@ static off_t bb_full_fd_action(int src_fd, int dst_fd, off_t size) dst_fd = -1; } } - total += rd; - if (status 0) { /* if we aren't copying till EOF... */ - size -= rd; - if (!size) { - /* 'size' bytes copied - all done */ - status = 0; - break; - } - } + if (check_status(status, size, total, rd)) + break; } out: diff --git a/networking/Config.src
Re: [PATCH] Use sendfile to copy data between file descriptors
Hi Bartosz, On Mon, Oct 20, 2014 at 01:36:45PM +0200, Bartosz Golaszewski wrote: Busybox already uses sendfile in httpd. This patch proposes to use it globally to copy data between file descriptors. [...] diff --git a/Config.in b/Config.in index b83beb5..4b5ea69 100644 --- a/Config.in +++ b/Config.in @@ -264,6 +264,14 @@ config PAM Use PAM in some busybox applets (currently login and httpd) instead of direct access to password database. +config FEATURE_USE_SENDFILE +bool Use sendfile system call +default y +depends on HTTPD Why depend on HTTPD, then? +help + When enabled, busybox will use the kernel sendfile() function + instead of read/write loops where applicable. + baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}ooO--U--Ooo{= - bar...@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Use sendfile to copy data between file descriptors
Le 20/10/2014 13:36, Bartosz Golaszewski a écrit : Busybox already uses sendfile in httpd. This patch proposes to use it globally to copy data between file descriptors. I haven't been keeping up-to-date with sendfile() in the last couple of years, but AFAICR sendfile() was limited in the type of file it's writing from/to, and the way to perform zero-copy data transmission across generic fds is splice(). Which is not generic either, because splice() needs a pipe as one of its endpoints, so if you're copying from a socket to a file, for instance, you need a pipe in the middle, and splice from the socket to the pipe then splice from the pipe to the file. This is a bit complex to handle for every case. I have done it in skalibs, so the code is there, and I can busyboxify it if needed, but that's quite a bit of code and additional complexity for a gain that would only be felt on large data copies, so I'm not sure it's worth it. -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Use sendfile to copy data between file descriptors
2014-10-20 14:10 GMT+02:00 Baruch Siach bar...@tkos.co.il: Why depend on HTTPD, then? My bad, I'll send another one. Best regards, Bartosz Gołaszewski ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Use sendfile to copy data between file descriptors
Hi Laurent, On Mon, Oct 20, 2014 at 02:34:02PM +0200, Laurent Bercot wrote: Le 20/10/2014 13:36, Bartosz Golaszewski a écrit : Busybox already uses sendfile in httpd. This patch proposes to use it globally to copy data between file descriptors. I haven't been keeping up-to-date with sendfile() in the last couple of years, but AFAICR sendfile() was limited in the type of file it's writing from/to, and the way to perform zero-copy data transmission across generic fds is splice(). This has changed in kernel version 2.6.33. Since that version sendfile() out_fd can be any file. The in_fd parameter must support mmap, though. See the sendfile(2) man page. baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}ooO--U--Ooo{= - bar...@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
[PATCHv2] Use sendfile to copy data between file descriptors
Busybox already uses sendfile in httpd. This patch proposes to use it globally to copy data between file descriptors. It speeds up the copying on slow systems a lot - below are the times needed to copy a 450Mb file with and without this option enabled on a BeagleBone Black: sendfile: user0m0.000s sys 0m8.170s read/write: user0m0.470s sys 0m16.300s It doesn't add a lot of bloat either: function old new delta bb_full_fd_action233 311 +78 -- (add/remove: 0/0 grow/shrink: 1/0 up/down: 78/0) Total: 78 bytes textdata bss dec hex filename 81283941239552 826514 c9c92 busybox_old 81291741239552 826592 c9ce0 busybox_unstripped This patch also moves USE_SENDFILE feature from httpd subconfiguration to global settings. Signed-off-by: Bartosz Golaszewski bartekg...@gmail.com --- Config.in | 7 ++ libbb/copyfd.c| 59 --- networking/Config.src | 8 --- networking/httpd.c| 6 +++--- 4 files changed, 57 insertions(+), 23 deletions(-) diff --git a/Config.in b/Config.in index b83beb5..3edaf8f 100644 --- a/Config.in +++ b/Config.in @@ -264,6 +264,13 @@ config PAM Use PAM in some busybox applets (currently login and httpd) instead of direct access to password database. +config FEATURE_USE_SENDFILE +bool Use sendfile system call +default y +help + When enabled, busybox will use the kernel sendfile() function + instead of read/write loops where applicable. + config LONG_OPTS bool Support for --long-options default y diff --git a/libbb/copyfd.c b/libbb/copyfd.c index eda2747..e10fe82 100644 --- a/libbb/copyfd.c +++ b/libbb/copyfd.c @@ -9,6 +9,28 @@ #include libbb.h +#if ENABLE_FEATURE_USE_SENDFILE +#include sys/sendfile.h +#endif + +/* + * Returns 1 if all bytes have been copied, 0 otherwise. + */ +static int check_status(int *status, off_t *size, off_t *total, ssize_t rd) +{ + *total += rd; + if (*status 0) { /* if we aren't copying till EOF... */ + *size -= rd; + if (!(*size)) { + /* 'size' bytes copied - all done */ + *status = 0; + return 1; + } + } + + return 0; +} + /* Used by NOFORK applets (e.g. cat) - must not use xmalloc. * size 0 means ignore write errors, used by tar --to-command * size = 0 means copy till EOF @@ -18,6 +40,7 @@ static off_t bb_full_fd_action(int src_fd, int dst_fd, off_t size) int status = -1; off_t total = 0; bool continue_on_write_error = 0; + ssize_t rd; #if CONFIG_FEATURE_COPYBUF_KB = 4 char buffer[CONFIG_FEATURE_COPYBUF_KB * 1024]; enum { buffer_size = sizeof(buffer) }; @@ -56,10 +79,29 @@ static off_t bb_full_fd_action(int src_fd, int dst_fd, off_t size) status = 1; /* copy until eof */ } - while (1) { - ssize_t rd; +#if ENABLE_FEATURE_USE_SENDFILE + { + ssize_t sz = size (size buffer_size) + ? size : MAXINT(ssize_t) - 0x; - rd = safe_read(src_fd, buffer, size buffer_size ? buffer_size : size); + while (1) { + rd = sendfile(dst_fd, src_fd, NULL, sz); + if (rd = 0) { + /* Might be EOF, might be an error, +* to make sure fall back to the read-write +* loop. +*/ + break; + } else if (check_status(status, size, total, rd)) { + break; + } + } + } +#endif + + while (1) { + rd = safe_read(src_fd, buffer, + size buffer_size ? buffer_size : size); if (!rd) { /* eof - all done */ status = 0; @@ -80,15 +122,8 @@ static off_t bb_full_fd_action(int src_fd, int dst_fd, off_t size) dst_fd = -1; } } - total += rd; - if (status 0) { /* if we aren't copying till EOF... */ - size -= rd; - if (!size) { - /* 'size' bytes copied - all done */ - status = 0; - break; - } - } + if (check_status(status, size, total, rd)) + break; } out: diff --git a/networking/Config.src b/networking/Config.src
Re: [PATCH] Use sendfile to copy data between file descriptors
2014-10-20 14:45 GMT+02:00 Baruch Siach bar...@tkos.co.il: Hi Laurent, On Mon, Oct 20, 2014 at 02:34:02PM +0200, Laurent Bercot wrote: Le 20/10/2014 13:36, Bartosz Golaszewski a écrit : Busybox already uses sendfile in httpd. This patch proposes to use it globally to copy data between file descriptors. I haven't been keeping up-to-date with sendfile() in the last couple of years, but AFAICR sendfile() was limited in the type of file it's writing from/to, and the way to perform zero-copy data transmission across generic fds is splice(). This has changed in kernel version 2.6.33. Since that version sendfile() out_fd can be any file. The in_fd parameter must support mmap, though. See the sendfile(2) man page. baruch There's even a comment in the kernel code in fs/splice.c: /** 1289 * do_splice_direct - splices data directly between two files (...) 1297 * Description: 1298 *For use by do_sendfile(). splice can easily emulate sendfile, but 1299 *doing it in the application would incur an extra system call 1300 *(splice in + splice out, as compared to just sendfile()). So this helper 1301 *can splice directly through a process-private pipe. 1302 * 1303 */ So apparently sendfile() is ok in this case. Best regards, Bartosz Gołaszewski ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCHv2] Use sendfile to copy data between file descriptors
+config FEATURE_USE_SENDFILE +bool Use sendfile system call +default y +help + When enabled, busybox will use the kernel sendfile() function + instead of read/write loops where applicable. Please include the kernel version requirement here. +#if ENABLE_FEATURE_USE_SENDFILE + { + ssize_t sz = size (size buffer_size) + ? size : MAXINT(ssize_t) - 0x; - rd = safe_read(src_fd, buffer, size buffer_size ? buffer_size : size); + while (1) { And here. Comment should be enough, as the call will probably error out on older kernels. + rd = sendfile(dst_fd, src_fd, NULL, sz); + if (rd = 0) { + /* Might be EOF, might be an error, +* to make sure fall back to the - Lauri -- http://www.fastmail.fm - A no graphics, no pop-ups email service ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
1.22.1: been hit by the zcat bug
That deserves a patch in http://busybox.net/downloads/fixes-1.22.1/, IMO. At least, commits: 7c47b560a8fc97956dd8132bd7f1863d83c19866 b664f740d90880560ce46b11f766625341342e80 640ce3de07807133796bccd0bdfa146bbfc788c7 may be relevantat. Cheers, -- Cristian ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: new lzop patch doesn't have correct format?
On 19 Oct 2014 20:59, Michael D. Setzer II wrote: Saw new patch http://busybox.net/downloads/fixes-1.22.1/busybox-1.22.1-lzop.patch But unlike the other patches the patch -p0 file doesn't work. hmm, the few that i've posted have always just been created by git format-patch which means they work with -p1. that's the convention i'm used to -- apply everything with -p1. let's see if Denys has an opinion -mike signature.asc Description: Digital signature ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCHv2] Use sendfile to copy data between file descriptors
On 20 Oct 2014 21:00, Lauri Kasanen wrote: +config FEATURE_USE_SENDFILE +bool Use sendfile system call +default y +help + When enabled, busybox will use the kernel sendfile() function + instead of read/write loops where applicable. Please include the kernel version requirement here. `man 2 sendfile`: In Linux kernels before 2.6.33, out_fd must refer to a socket. Since Linux 2.6.33 it can be any file. If it is a regular file, then sendfile() changes the file offset appropriately. -mike signature.asc Description: Digital signature ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox