CVS commit: src/bin/cp
Module Name:src Committed By: andvar Date: Fri Jun 7 21:01:00 UTC 2024 Modified Files: src/bin/cp: cp.c Log Message: s/concatentation/concatenation/ in comment. To generate a diff of this commit: cvs rdiff -u -r1.62 -r1.63 src/bin/cp/cp.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files. Modified files: Index: src/bin/cp/cp.c diff -u src/bin/cp/cp.c:1.62 src/bin/cp/cp.c:1.63 --- src/bin/cp/cp.c:1.62 Fri May 22 14:54:30 2020 +++ src/bin/cp/cp.c Fri Jun 7 21:01:00 2024 @@ -1,4 +1,4 @@ -/* $NetBSD: cp.c,v 1.62 2020/05/22 14:54:30 christos Exp $ */ +/* $NetBSD: cp.c,v 1.63 2024/06/07 21:01:00 andvar Exp $ */ /* * Copyright (c) 1988, 1993, 1994 @@ -43,7 +43,7 @@ __COPYRIGHT( #if 0 static char sccsid[] = "@(#)cp.c 8.5 (Berkeley) 4/29/95"; #else -__RCSID("$NetBSD: cp.c,v 1.62 2020/05/22 14:54:30 christos Exp $"); +__RCSID("$NetBSD: cp.c,v 1.63 2024/06/07 21:01:00 andvar Exp $"); #endif #endif /* not lint */ @@ -375,7 +375,7 @@ copy(char *argv[], enum op type, int fts * is the case where the target exists. * * Also, check for "..". This is for correct path - * concatentation for paths ending in "..", e.g. + * concatenation for paths ending in "..", e.g. * cp -R .. /tmp * Paths ending in ".." are changed to ".". This is * tricky, but seems the easiest way to fix the problem.
CVS commit: src/bin/cp
Module Name:src Committed By: andvar Date: Fri Jun 7 21:01:00 UTC 2024 Modified Files: src/bin/cp: cp.c Log Message: s/concatentation/concatenation/ in comment. To generate a diff of this commit: cvs rdiff -u -r1.62 -r1.63 src/bin/cp/cp.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files.
CVS commit: src/bin/cp
Module Name:src Committed By: christos Date: Mon Jan 15 17:41:06 UTC 2024 Modified Files: src/bin/cp: utils.c Log Message: PR/57857: Ricardo Branco: Always copy regular files, even if they appear to be zero-sized. To generate a diff of this commit: cvs rdiff -u -r1.49 -r1.50 src/bin/cp/utils.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files. Modified files: Index: src/bin/cp/utils.c diff -u src/bin/cp/utils.c:1.49 src/bin/cp/utils.c:1.50 --- src/bin/cp/utils.c:1.49 Sun May 17 19:34:11 2020 +++ src/bin/cp/utils.c Mon Jan 15 12:41:06 2024 @@ -1,4 +1,4 @@ -/* $NetBSD: utils.c,v 1.49 2020/05/17 23:34:11 christos Exp $ */ +/* $NetBSD: utils.c,v 1.50 2024/01/15 17:41:06 christos Exp $ */ /*- * Copyright (c) 1991, 1993, 1994 @@ -34,7 +34,7 @@ #if 0 static char sccsid[] = "@(#)utils.c 8.3 (Berkeley) 4/1/94"; #else -__RCSID("$NetBSD: utils.c,v 1.49 2020/05/17 23:34:11 christos Exp $"); +__RCSID("$NetBSD: utils.c,v 1.50 2024/01/15 17:41:06 christos Exp $"); #endif #endif /* not lint */ @@ -179,10 +179,10 @@ copy_file(FTSENT *entp, int dne) rval = 0; /* - * There's no reason to do anything other than close the file - * now if it's regular and empty, so let's not bother. + * We always copy regular files, even if they appear to be 0-sized + * because kernfs and procfs don't return file sizes. */ - bool need_copy = !S_ISREG(fs->st_mode) || fs->st_size > 0; + bool need_copy = S_ISREG(fs->st_mode) || fs->st_size > 0; struct finfo fi;
CVS commit: src/bin/cp
Module Name:src Committed By: christos Date: Mon Jan 15 17:41:06 UTC 2024 Modified Files: src/bin/cp: utils.c Log Message: PR/57857: Ricardo Branco: Always copy regular files, even if they appear to be zero-sized. To generate a diff of this commit: cvs rdiff -u -r1.49 -r1.50 src/bin/cp/utils.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files.
CVS commit: src/bin/cp
Module Name:src Committed By: christos Date: Mon Sep 23 18:01:09 UTC 2019 Modified Files: src/bin/cp: utils.c Log Message: PR/54564: Jan Schaumann: cp of a fifo yields an empty file Don't short-circuit 0 sized stat entries if they don't belong to regular files. Also don't try to mmap non-regular files. To generate a diff of this commit: cvs rdiff -u -r1.46 -r1.47 src/bin/cp/utils.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files.
CVS commit: src/bin/cp
Module Name:src Committed By: christos Date: Mon Sep 23 18:01:09 UTC 2019 Modified Files: src/bin/cp: utils.c Log Message: PR/54564: Jan Schaumann: cp of a fifo yields an empty file Don't short-circuit 0 sized stat entries if they don't belong to regular files. Also don't try to mmap non-regular files. To generate a diff of this commit: cvs rdiff -u -r1.46 -r1.47 src/bin/cp/utils.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files. Modified files: Index: src/bin/cp/utils.c diff -u src/bin/cp/utils.c:1.46 src/bin/cp/utils.c:1.47 --- src/bin/cp/utils.c:1.46 Tue Jul 17 09:04:58 2018 +++ src/bin/cp/utils.c Mon Sep 23 14:01:09 2019 @@ -1,4 +1,4 @@ -/* $NetBSD: utils.c,v 1.46 2018/07/17 13:04:58 darcy Exp $ */ +/* $NetBSD: utils.c,v 1.47 2019/09/23 18:01:09 christos Exp $ */ /*- * Copyright (c) 1991, 1993, 1994 @@ -34,7 +34,7 @@ #if 0 static char sccsid[] = "@(#)utils.c 8.3 (Berkeley) 4/1/94"; #else -__RCSID("$NetBSD: utils.c,v 1.46 2018/07/17 13:04:58 darcy Exp $"); +__RCSID("$NetBSD: utils.c,v 1.47 2019/09/23 18:01:09 christos Exp $"); #endif #endif /* not lint */ @@ -174,87 +174,83 @@ copy_file(FTSENT *entp, int dne) rval = 0; - /* + /* * There's no reason to do anything other than close the file - * now if it's empty, so let's not bother. + * now if it's regular and empty, so let's not bother. */ - if (fs->st_size > 0) { - struct finfo fi; - - fi.from = entp->fts_path; - fi.to = to.p_path; - fi.size = fs->st_size; - - /* - * Mmap and write if less than 8M (the limit is so - * we don't totally trash memory on big files). - * This is really a minor hack, but it wins some CPU back. - */ - bool use_read; + bool need_copy = !S_ISREG(fs->st_mode) || fs->st_size > 0; - use_read = true; - if (fs->st_size <= MMAP_MAX_SIZE) { - size_t fsize = (size_t)fs->st_size; - p = mmap(NULL, fsize, PROT_READ, MAP_FILE|MAP_SHARED, - from_fd, (off_t)0); - if (p != MAP_FAILED) { -size_t remainder; - -use_read = false; - -(void) madvise(p, (size_t)fs->st_size, - MADV_SEQUENTIAL); - -/* - * Write out the data in small chunks to - * avoid locking the output file for a - * long time if the reading the data from - * the source is slow. - */ -remainder = fsize; -do { - ssize_t chunk; - - chunk = (remainder > MMAP_MAX_WRITE) ? - MMAP_MAX_WRITE : remainder; - if (write(to_fd, [fsize - remainder], - chunk) != chunk) { - warn("%s", to.p_path); - rval = 1; - break; - } - remainder -= chunk; - ptotal += chunk; - if (pinfo) - progress(, ptotal); -} while (remainder > 0); + struct finfo fi; -if (munmap(p, fsize) < 0) { - warn("%s", entp->fts_path); - rval = 1; -} - } - } + fi.from = entp->fts_path; + fi.to = to.p_path; + fi.size = fs->st_size; - if (use_read) { - while ((rcount = read(from_fd, buf, MAXBSIZE)) > 0) { -wcount = write(to_fd, buf, (size_t)rcount); -if (rcount != wcount || wcount == -1) { + /* + * Mmap and write if less than 8M (the limit is so + * we don't totally trash memory on big files). + * This is really a minor hack, but it wins some CPU back. + */ + if (S_ISREG(fs->st_mode) && fs->st_size && fs->st_size <= MMAP_MAX_SIZE) { + size_t fsize = (size_t)fs->st_size; + p = mmap(NULL, fsize, PROT_READ, MAP_FILE|MAP_SHARED, + from_fd, (off_t)0); + if (p != MAP_FAILED) { + size_t remainder; + + need_copy = false; + + (void) madvise(p, (size_t)fs->st_size, MADV_SEQUENTIAL); + + /* + * Write out the data in small chunks to + * avoid locking the output file for a + * long time if the reading the data from + * the source is slow. + */ + remainder = fsize; + do { +ssize_t chunk; + +chunk = (remainder > MMAP_MAX_WRITE) ? +MMAP_MAX_WRITE : remainder; +if (write(to_fd, [fsize - remainder], +chunk) != chunk) { warn("%s", to.p_path); rval = 1; break; } -ptotal += wcount; +remainder -= chunk; +ptotal += chunk; if (pinfo) progress(, ptotal); - } - if (rcount < 0) { + } while (remainder > 0); + + if (munmap(p, fsize) < 0) { warn("%s", entp->fts_path); rval = 1; } } } + if (need_copy) { + while ((rcount = read(from_fd, buf, MAXBSIZE)) > 0) { + wcount = write(to_fd, buf, (size_t)rcount); + if (rcount != wcount || wcount == -1) { +warn("%s", to.p_path); +rval = 1; +break; + } + ptotal += wcount; + if (pinfo) +progress(, ptotal); + } + if (rcount < 0) { + warn("%s", entp->fts_path); + rval = 1; + } + } + if (pflag && (fcpxattr(from_fd, to_fd) != 0)) warn("%s: error copying extended attributes", to.p_path);
Re: CVS commit: src/bin/cp
On Tue, 21 Dec 2010, Christos Zoulas wrote: Modified Files: src/bin/cp: cp.1 cp.c utils.c Log Message: Add -a archive flag. from Aleksey Cheusov Please add a note under STANDARDS in the man page saying that -a is non-standard. --apb (Alan Barrett)
Re: CVS commit: src/bin/cp
On Mon, Oct 25, 2010 at 05:49:11PM +0100, David Laight wrote: No, since in general the file is also being extended (certainly in this case it is) it also has to lock the file size, and that's going to deny stat() until it's done. A stat request during a write can safely return the old size. Yes it can, if it has it. Hence multiversion... -- David A. Holland dholl...@netbsd.org
Re: CVS commit: src/bin/cp
On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote: Anyway, ISTM that writing from the mmap buffer in say 64K chunks would retain nearly all the advantages and get rid of the latency problem. The way the code is currently written it only uses mmap(2) for files smaller than 8MB anyway. Your suggested change would require more work than reducing the size of the mapped memory. There is also the problem that the overhead per call to mmap(2) or munmap(2) is high on some platforms, IIRC alpha is one of them. Changing the code as you suggested above might therefore impact performance on some ports. Kind regards -- Matthias Scheler http://zhadum.org.uk/
Re: CVS commit: src/bin/cp
On Mon, Oct 25, 2010 at 08:52:43AM +0100, Matthias Scheler wrote: On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote: Anyway, ISTM that writing from the mmap buffer in say 64K chunks would retain nearly all the advantages and get rid of the latency problem. The way the code is currently written it only uses mmap(2) for files smaller than 8MB anyway. Your suggested change would require more work than reducing the size of the mapped memory. Forget that, I misunderstood what you said. We can still use one mmap(2) but should write out that memory with multile write(2) system calls. Kind regards -- Matthias Scheler http://zhadum.org.uk/
Re: CVS commit: src/bin/cp
On Mon, Oct 25, 2010 at 11:17:22AM +0200, Juergen Hannken-Illjes wrote: On Sun, Oct 24, 2010 at 05:21:06AM +, David Holland wrote: On Fri, Oct 22, 2010 at 05:56:06PM +, Antti Kantee wrote: Disable mmap path. With the current vnode locking scheme it has a very annoying property: if the source media is slow (like a slow network), the target file will be locked for the duration of the entire max 8MB write and cause processes attempting to e.g. stat() it to tstile (for several minutes in the worst case). Revisit this if/when vnode locking gets a little smarter. Wouldn't it be better to just ratchet back the block size to something like 64K that happens faster? Or first fault the mapped region instead of madvise() like: Do we implement MADV_WILLNEED? Joerg
Re: CVS commit: src/bin/cp
On Mon, Oct 25, 2010 at 06:30:54PM +0200, Joerg Sonnenberger wrote: On Mon, Oct 25, 2010 at 11:17:22AM +0200, Juergen Hannken-Illjes wrote: On Sun, Oct 24, 2010 at 05:21:06AM +, David Holland wrote: On Fri, Oct 22, 2010 at 05:56:06PM +, Antti Kantee wrote: Disable mmap path. With the current vnode locking scheme it has a very annoying property: if the source media is slow (like a slow network), the target file will be locked for the duration of the entire max 8MB write and cause processes attempting to e.g. stat() it to tstile (for several minutes in the worst case). Revisit this if/when vnode locking gets a little smarter. Wouldn't it be better to just ratchet back the block size to something like 64K that happens faster? Or first fault the mapped region instead of madvise() like: Do we implement MADV_WILLNEED? According to the man page This WILL NOT fault pages in from backing store. -- Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)
Re: CVS commit: src/bin/cp
On Mon, Oct 25, 2010 at 11:17:22AM +0200, Juergen Hannken-Illjes wrote: On Sun, Oct 24, 2010 at 05:21:06AM +, David Holland wrote: On Fri, Oct 22, 2010 at 05:56:06PM +, Antti Kantee wrote: Disable mmap path. With the current vnode locking scheme it has a very annoying property: if the source media is slow (like a slow network), the target file will be locked for the duration of the entire max 8MB write and cause processes attempting to e.g. stat() it to tstile (for several minutes in the worst case). Revisit this if/when vnode locking gets a little smarter. Wouldn't it be better to just ratchet back the block size to something like 64K that happens faster? Or first fault the mapped region instead of madvise() like: int pgsz = getpagesize(); char *q; volatile char c; for (q = p; q p+fs-st_size; q += pgsz) c = *q; That won't really help - the pages could easily be discarded almost immediately - eg in order to fault in a later page. David -- David Laight: da...@l8s.co.uk
Re: CVS commit: src/bin/cp
On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote: I think write() only needs to lock the the file enough to ensure that the file offset is correct. No, since in general the file is also being extended (certainly in this case it is) it also has to lock the file size, and that's going to deny stat() until it's done. A stat request during a write can safely return the old size. David -- David Laight: da...@l8s.co.uk
Re: CVS commit: src/bin/cp
On Mon, Oct 25, 2010 at 06:41:11PM +0200, Juergen Hannken-Illjes wrote: Do we implement MADV_WILLNEED? According to the man page This WILL NOT fault pages in from backing store. The version of the man page I have says It might or might not fault pages in from backing store. Joerg
re: CVS commit: src/bin/cp
On Mon, Oct 25, 2010 at 08:52:43AM +0100, Matthias Scheler wrote: On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote: Anyway, ISTM that writing from the mmap buffer in say 64K chunks would retain nearly all the advantages and get rid of the latency problem. The way the code is currently written it only uses mmap(2) for files smaller than 8MB anyway. Your suggested change would require more work than reducing the size of the mapped memory. Forget that, I misunderstood what you said. We can still use one mmap(2) but should write out that memory with multile write(2) system calls. FWIW, bozohttpd defaults to mapping upto 64MB regions and writing out up to 64KB at a time. .mrg.
Re: CVS commit: src/bin/cp
On Mon, Oct 25, 2010 at 06:46:36PM +0200, Joerg Sonnenberger wrote: On Mon, Oct 25, 2010 at 06:41:11PM +0200, Juergen Hannken-Illjes wrote: Do we implement MADV_WILLNEED? According to the man page This WILL NOT fault pages in from backing store. The version of the man page I have says It might or might not fault pages in from backing store. Mine was 5.0.2, yours looks more -current :-) -- Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)
Re: CVS commit: src/bin/cp
On Tue, Oct 26, 2010 at 05:05:28AM +1100, matthew green wrote: On Mon, Oct 25, 2010 at 08:52:43AM +0100, Matthias Scheler wrote: On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote: Anyway, ISTM that writing from the mmap buffer in say 64K chunks would retain nearly all the advantages and get rid of the latency problem. The way the code is currently written it only uses mmap(2) for files smaller than 8MB anyway. Your suggested change would require more work than reducing the size of the mapped memory. Forget that, I misunderstood what you said. We can still use one mmap(2) but should write out that memory with multile write(2) system calls. FWIW, bozohttpd defaults to mapping upto 64MB regions and writing out up to 64KB at a time. cp(1) now maps a region upto 8MB and writes it out in chunks of at most 64KB. Kind regards -- Matthias Scheler http://zhadum.org.uk/
Re: CVS commit: src/bin/cp
(adding tech-kern because this seems likely to become lengthy; if following up please drop source-changes-d) On Sun, Oct 24, 2010 at 11:30:43AM +0100, David Laight wrote: [mmap mode disabled in cp due to long vnode lock waits] Because individual write() calls are supposed to be atomic, I don't think there is such a thing as a locking improvement that'll help with this behavior. :-/ I think write() only needs to lock the the file enough to ensure that the file offset is correct. No, since in general the file is also being extended (certainly in this case it is) it also has to lock the file size, and that's going to deny stat() until it's done. Possibly the written range needs locking against other accesses - but I think the app is supposed to use file locking for that No, the reason single syscalls are atomic is specifically so apps don't have to use file locking for simple cases, like e.g. writing lines to log files. (and mmap will always be non-atomic w.r.t. write). Yes, but the mmap in this case is for reading. (what the code in cp does is map the source and then write to the destination file from the mapped region) Actually if 2 writes are issued for the same part of a file, the kernel can act as if they were requested in either order - since the app(s) cannot know the order the calls would be made in! As long as nothing else happens in between that establishes an ordering, yes. Which means it could just sleep the 2nd write until the first terminates! Yes indeed, that's what locks do :-) Writes with O_APPEND (and writes that extend the file) are more problematical since you cant allow a second such write to start until the first has completed - for instance it might try to read from an unmapped user-space address and return a short length. Right. Except I guess going to some kind of multiversion model for vnodes. Don't you just need 2 locks? One for locking the data areas, and the other for the file data itself. Nope. Or at least, that doesn't fix the objectionable behavior. Because you need to lock the file size, if you want stat() to be able to complete while the operation is in progress you need to be able to let stat() read a version of the file size from before the operation began. One can do multiversion schemes in the kernel; apparently it's neither as ghastly or as expensive as one would think. See for example the transactional Linux paper from SOSP 2009. But it'd be a pretty big hacking job to do it in NetBSD. Anyway, ISTM that writing from the mmap buffer in say 64K chunks would retain nearly all the advantages and get rid of the latency problem. -- David A. Holland dholl...@netbsd.org