CVS commit: src/bin/cp

2024-06-07 Thread Andrius Varanavicius
Module Name:src
Committed By:   andvar
Date:   Fri Jun  7 21:01:00 UTC 2024

Modified Files:
src/bin/cp: cp.c

Log Message:
s/concatentation/concatenation/ in comment.


To generate a diff of this commit:
cvs rdiff -u -r1.62 -r1.63 src/bin/cp/cp.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/bin/cp/cp.c
diff -u src/bin/cp/cp.c:1.62 src/bin/cp/cp.c:1.63
--- src/bin/cp/cp.c:1.62	Fri May 22 14:54:30 2020
+++ src/bin/cp/cp.c	Fri Jun  7 21:01:00 2024
@@ -1,4 +1,4 @@
-/* $NetBSD: cp.c,v 1.62 2020/05/22 14:54:30 christos Exp $ */
+/* $NetBSD: cp.c,v 1.63 2024/06/07 21:01:00 andvar Exp $ */
 
 /*
  * Copyright (c) 1988, 1993, 1994
@@ -43,7 +43,7 @@ __COPYRIGHT(
 #if 0
 static char sccsid[] = "@(#)cp.c	8.5 (Berkeley) 4/29/95";
 #else
-__RCSID("$NetBSD: cp.c,v 1.62 2020/05/22 14:54:30 christos Exp $");
+__RCSID("$NetBSD: cp.c,v 1.63 2024/06/07 21:01:00 andvar Exp $");
 #endif
 #endif /* not lint */
 
@@ -375,7 +375,7 @@ copy(char *argv[], enum op type, int fts
 			 * is the case where the target exists.
 			 *
 			 * Also, check for "..".  This is for correct path
-			 * concatentation for paths ending in "..", e.g.
+			 * concatenation for paths ending in "..", e.g.
 			 *	cp -R .. /tmp
 			 * Paths ending in ".." are changed to ".".  This is
 			 * tricky, but seems the easiest way to fix the problem.



CVS commit: src/bin/cp

2024-06-07 Thread Andrius Varanavicius
Module Name:src
Committed By:   andvar
Date:   Fri Jun  7 21:01:00 UTC 2024

Modified Files:
src/bin/cp: cp.c

Log Message:
s/concatentation/concatenation/ in comment.


To generate a diff of this commit:
cvs rdiff -u -r1.62 -r1.63 src/bin/cp/cp.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



CVS commit: src/bin/cp

2024-01-15 Thread Christos Zoulas
Module Name:src
Committed By:   christos
Date:   Mon Jan 15 17:41:06 UTC 2024

Modified Files:
src/bin/cp: utils.c

Log Message:
PR/57857: Ricardo Branco: Always copy regular files, even if they appear to
be zero-sized.


To generate a diff of this commit:
cvs rdiff -u -r1.49 -r1.50 src/bin/cp/utils.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/bin/cp/utils.c
diff -u src/bin/cp/utils.c:1.49 src/bin/cp/utils.c:1.50
--- src/bin/cp/utils.c:1.49	Sun May 17 19:34:11 2020
+++ src/bin/cp/utils.c	Mon Jan 15 12:41:06 2024
@@ -1,4 +1,4 @@
-/* $NetBSD: utils.c,v 1.49 2020/05/17 23:34:11 christos Exp $ */
+/* $NetBSD: utils.c,v 1.50 2024/01/15 17:41:06 christos Exp $ */
 
 /*-
  * Copyright (c) 1991, 1993, 1994
@@ -34,7 +34,7 @@
 #if 0
 static char sccsid[] = "@(#)utils.c	8.3 (Berkeley) 4/1/94";
 #else
-__RCSID("$NetBSD: utils.c,v 1.49 2020/05/17 23:34:11 christos Exp $");
+__RCSID("$NetBSD: utils.c,v 1.50 2024/01/15 17:41:06 christos Exp $");
 #endif
 #endif /* not lint */
 
@@ -179,10 +179,10 @@ copy_file(FTSENT *entp, int dne)
 	rval = 0;
 
 	/* 
-	 * There's no reason to do anything other than close the file
-	 * now if it's regular and empty, so let's not bother.
+	 * We always copy regular files, even if they appear to be 0-sized
+	 * because kernfs and procfs don't return file sizes.
 	 */
-	bool need_copy = !S_ISREG(fs->st_mode) || fs->st_size > 0;
+	bool need_copy = S_ISREG(fs->st_mode) || fs->st_size > 0;
 
 	struct finfo fi;
 



CVS commit: src/bin/cp

2024-01-15 Thread Christos Zoulas
Module Name:src
Committed By:   christos
Date:   Mon Jan 15 17:41:06 UTC 2024

Modified Files:
src/bin/cp: utils.c

Log Message:
PR/57857: Ricardo Branco: Always copy regular files, even if they appear to
be zero-sized.


To generate a diff of this commit:
cvs rdiff -u -r1.49 -r1.50 src/bin/cp/utils.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



CVS commit: src/bin/cp

2019-09-23 Thread Christos Zoulas
Module Name:src
Committed By:   christos
Date:   Mon Sep 23 18:01:09 UTC 2019

Modified Files:
src/bin/cp: utils.c

Log Message:
PR/54564: Jan Schaumann: cp of a fifo yields an empty file
Don't short-circuit 0 sized stat entries if they don't belong to regular files.
Also don't try to mmap non-regular files.


To generate a diff of this commit:
cvs rdiff -u -r1.46 -r1.47 src/bin/cp/utils.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



CVS commit: src/bin/cp

2019-09-23 Thread Christos Zoulas
Module Name:src
Committed By:   christos
Date:   Mon Sep 23 18:01:09 UTC 2019

Modified Files:
src/bin/cp: utils.c

Log Message:
PR/54564: Jan Schaumann: cp of a fifo yields an empty file
Don't short-circuit 0 sized stat entries if they don't belong to regular files.
Also don't try to mmap non-regular files.


To generate a diff of this commit:
cvs rdiff -u -r1.46 -r1.47 src/bin/cp/utils.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/bin/cp/utils.c
diff -u src/bin/cp/utils.c:1.46 src/bin/cp/utils.c:1.47
--- src/bin/cp/utils.c:1.46	Tue Jul 17 09:04:58 2018
+++ src/bin/cp/utils.c	Mon Sep 23 14:01:09 2019
@@ -1,4 +1,4 @@
-/* $NetBSD: utils.c,v 1.46 2018/07/17 13:04:58 darcy Exp $ */
+/* $NetBSD: utils.c,v 1.47 2019/09/23 18:01:09 christos Exp $ */
 
 /*-
  * Copyright (c) 1991, 1993, 1994
@@ -34,7 +34,7 @@
 #if 0
 static char sccsid[] = "@(#)utils.c	8.3 (Berkeley) 4/1/94";
 #else
-__RCSID("$NetBSD: utils.c,v 1.46 2018/07/17 13:04:58 darcy Exp $");
+__RCSID("$NetBSD: utils.c,v 1.47 2019/09/23 18:01:09 christos Exp $");
 #endif
 #endif /* not lint */
 
@@ -174,87 +174,83 @@ copy_file(FTSENT *entp, int dne)
 
 	rval = 0;
 
-	/*
+	/* 
 	 * There's no reason to do anything other than close the file
-	 * now if it's empty, so let's not bother.
+	 * now if it's regular and empty, so let's not bother.
 	 */
-	if (fs->st_size > 0) {
-		struct finfo fi;
-
-		fi.from = entp->fts_path;
-		fi.to = to.p_path;
-		fi.size = fs->st_size;
-
-		/*
-		 * Mmap and write if less than 8M (the limit is so
-		 * we don't totally trash memory on big files).
-		 * This is really a minor hack, but it wins some CPU back.
-		 */
-		bool use_read;
+	bool need_copy = !S_ISREG(fs->st_mode) || fs->st_size > 0;
 
-		use_read = true;
-		if (fs->st_size <= MMAP_MAX_SIZE) {
-			size_t fsize = (size_t)fs->st_size;
-			p = mmap(NULL, fsize, PROT_READ, MAP_FILE|MAP_SHARED,
-			from_fd, (off_t)0);
-			if (p != MAP_FAILED) {
-size_t remainder;
-
-use_read = false;
-
-(void) madvise(p, (size_t)fs->st_size,
- MADV_SEQUENTIAL);
-
-/*
- * Write out the data in small chunks to
- * avoid locking the output file for a
- * long time if the reading the data from
- * the source is slow.
- */
-remainder = fsize;
-do {
-	ssize_t chunk;
-
-	chunk = (remainder > MMAP_MAX_WRITE) ?
-	MMAP_MAX_WRITE : remainder;
-	if (write(to_fd, [fsize - remainder],
-	chunk) != chunk) {
-		warn("%s", to.p_path);
-		rval = 1;
-		break;
-	}
-	remainder -= chunk;
-	ptotal += chunk;
-	if (pinfo)
-		progress(, ptotal);
-} while (remainder > 0);
+	struct finfo fi;
 
-if (munmap(p, fsize) < 0) {
-	warn("%s", entp->fts_path);
-	rval = 1;
-}
-			}
-		}
+	fi.from = entp->fts_path;
+	fi.to = to.p_path;
+	fi.size = fs->st_size;
 
-		if (use_read) {
-			while ((rcount = read(from_fd, buf, MAXBSIZE)) > 0) {
-wcount = write(to_fd, buf, (size_t)rcount);
-if (rcount != wcount || wcount == -1) {
+	/*
+	 * Mmap and write if less than 8M (the limit is so
+	 * we don't totally trash memory on big files).
+	 * This is really a minor hack, but it wins some CPU back.
+	 */
+	if (S_ISREG(fs->st_mode) && fs->st_size && fs->st_size <= MMAP_MAX_SIZE) {
+		size_t fsize = (size_t)fs->st_size;
+		p = mmap(NULL, fsize, PROT_READ, MAP_FILE|MAP_SHARED,
+		from_fd, (off_t)0);
+		if (p != MAP_FAILED) {
+			size_t remainder;
+
+			need_copy = false;
+
+			(void) madvise(p, (size_t)fs->st_size, MADV_SEQUENTIAL);
+
+			/*
+			 * Write out the data in small chunks to
+			 * avoid locking the output file for a
+			 * long time if the reading the data from
+			 * the source is slow.
+			 */
+			remainder = fsize;
+			do {
+ssize_t chunk;
+
+chunk = (remainder > MMAP_MAX_WRITE) ?
+MMAP_MAX_WRITE : remainder;
+if (write(to_fd, [fsize - remainder],
+chunk) != chunk) {
 	warn("%s", to.p_path);
 	rval = 1;
 	break;
 }
-ptotal += wcount;
+remainder -= chunk;
+ptotal += chunk;
 if (pinfo)
 	progress(, ptotal);
-			}
-			if (rcount < 0) {
+			} while (remainder > 0);
+
+			if (munmap(p, fsize) < 0) {
 warn("%s", entp->fts_path);
 rval = 1;
 			}
 		}
 	}
 
+	if (need_copy) {
+		while ((rcount = read(from_fd, buf, MAXBSIZE)) > 0) {
+			wcount = write(to_fd, buf, (size_t)rcount);
+			if (rcount != wcount || wcount == -1) {
+warn("%s", to.p_path);
+rval = 1;
+break;
+			}
+			ptotal += wcount;
+			if (pinfo)
+progress(, ptotal);
+		}
+		if (rcount < 0) {
+			warn("%s", entp->fts_path);
+			rval = 1;
+		}
+	}
+
 	if (pflag && (fcpxattr(from_fd, to_fd) != 0))
 		warn("%s: error copying extended attributes", to.p_path);
 



Re: CVS commit: src/bin/cp

2010-12-21 Thread Alan Barrett
On Tue, 21 Dec 2010, Christos Zoulas wrote:
 Modified Files:
   src/bin/cp: cp.1 cp.c utils.c
 
 Log Message:
 Add -a archive flag. from Aleksey Cheusov

Please add a note under STANDARDS in the man page saying that
-a is non-standard.

--apb (Alan Barrett)


Re: CVS commit: src/bin/cp

2010-10-26 Thread David Holland
On Mon, Oct 25, 2010 at 05:49:11PM +0100, David Laight wrote:
   No, since in general the file is also being extended (certainly in
   this case it is) it also has to lock the file size, and that's going
   to deny stat() until it's done.
  
  A stat request during a write can safely return the old size.

Yes it can, if it has it. Hence multiversion...

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/bin/cp

2010-10-25 Thread Matthias Scheler
On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote:
 Anyway, ISTM that writing from the mmap buffer in say 64K chunks would
 retain nearly all the advantages and get rid of the latency problem.

The way the code is currently written it only uses mmap(2) for
files smaller than 8MB anyway. Your suggested change would require
more work than reducing the size of the mapped memory.

There is also the problem that the overhead per call to mmap(2) or
munmap(2) is high on some platforms, IIRC alpha is one of them.
Changing the code as you suggested above might therefore impact
performance on some ports.

Kind regards

-- 
Matthias Scheler  http://zhadum.org.uk/


Re: CVS commit: src/bin/cp

2010-10-25 Thread Matthias Scheler
On Mon, Oct 25, 2010 at 08:52:43AM +0100, Matthias Scheler wrote:
 On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote:
  Anyway, ISTM that writing from the mmap buffer in say 64K chunks would
  retain nearly all the advantages and get rid of the latency problem.
 
 The way the code is currently written it only uses mmap(2) for
 files smaller than 8MB anyway. Your suggested change would require
 more work than reducing the size of the mapped memory.

Forget that, I misunderstood what you said. We can still use one
mmap(2) but should write out that memory with multile write(2)
system calls.

Kind regards

-- 
Matthias Scheler  http://zhadum.org.uk/


Re: CVS commit: src/bin/cp

2010-10-25 Thread Joerg Sonnenberger
On Mon, Oct 25, 2010 at 11:17:22AM +0200, Juergen Hannken-Illjes wrote:
 On Sun, Oct 24, 2010 at 05:21:06AM +, David Holland wrote:
  On Fri, Oct 22, 2010 at 05:56:06PM +, Antti Kantee wrote:
Disable mmap path.  With the current vnode locking scheme it has
a very annoying property: if the source media is slow (like a slow
network), the target file will be locked for the duration of the
entire max 8MB write and cause processes attempting to e.g. stat()
it to tstile (for several minutes in the worst case).  Revisit
this if/when vnode locking gets a little smarter.
  
  Wouldn't it be better to just ratchet back the block size to something
  like 64K that happens faster?
 
 Or first fault the mapped region instead of madvise() like:

Do we implement MADV_WILLNEED?

Joerg


Re: CVS commit: src/bin/cp

2010-10-25 Thread Juergen Hannken-Illjes
On Mon, Oct 25, 2010 at 06:30:54PM +0200, Joerg Sonnenberger wrote:
 On Mon, Oct 25, 2010 at 11:17:22AM +0200, Juergen Hannken-Illjes wrote:
  On Sun, Oct 24, 2010 at 05:21:06AM +, David Holland wrote:
   On Fri, Oct 22, 2010 at 05:56:06PM +, Antti Kantee wrote:
 Disable mmap path.  With the current vnode locking scheme it has
 a very annoying property: if the source media is slow (like a slow
 network), the target file will be locked for the duration of the
 entire max 8MB write and cause processes attempting to e.g. stat()
 it to tstile (for several minutes in the worst case).  Revisit
 this if/when vnode locking gets a little smarter.
   
   Wouldn't it be better to just ratchet back the block size to something
   like 64K that happens faster?
  
  Or first fault the mapped region instead of madvise() like:
 
 Do we implement MADV_WILLNEED?

According to the man page This WILL NOT fault pages in from backing store.

-- 
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: CVS commit: src/bin/cp

2010-10-25 Thread David Laight
On Mon, Oct 25, 2010 at 11:17:22AM +0200, Juergen Hannken-Illjes wrote:
 On Sun, Oct 24, 2010 at 05:21:06AM +, David Holland wrote:
  On Fri, Oct 22, 2010 at 05:56:06PM +, Antti Kantee wrote:
Disable mmap path.  With the current vnode locking scheme it has
a very annoying property: if the source media is slow (like a slow
network), the target file will be locked for the duration of the
entire max 8MB write and cause processes attempting to e.g. stat()
it to tstile (for several minutes in the worst case).  Revisit
this if/when vnode locking gets a little smarter.
  
  Wouldn't it be better to just ratchet back the block size to something
  like 64K that happens faster?
 
 Or first fault the mapped region instead of madvise() like:
 
   int pgsz = getpagesize();
   char *q;
   volatile char c;
 
   for (q = p; q  p+fs-st_size; q += pgsz)
   c = *q;

That won't really help - the pages could easily be discarded almost
immediately - eg in order to fault in a later page.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/bin/cp

2010-10-25 Thread David Laight
On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote:
   
   I think write() only needs to lock the the file enough to ensure that
   the file offset is correct.
 
 No, since in general the file is also being extended (certainly in
 this case it is) it also has to lock the file size, and that's going
 to deny stat() until it's done.

A stat request during a write can safely return the old size.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/bin/cp

2010-10-25 Thread Joerg Sonnenberger
On Mon, Oct 25, 2010 at 06:41:11PM +0200, Juergen Hannken-Illjes wrote:
  Do we implement MADV_WILLNEED?
 
 According to the man page This WILL NOT fault pages in from backing store.

The version of the man page I have says It might or might not fault
pages in from backing store.

Joerg


re: CVS commit: src/bin/cp

2010-10-25 Thread matthew green

 On Mon, Oct 25, 2010 at 08:52:43AM +0100, Matthias Scheler wrote:
  On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote:
   Anyway, ISTM that writing from the mmap buffer in say 64K chunks would
   retain nearly all the advantages and get rid of the latency problem.
  
  The way the code is currently written it only uses mmap(2) for
  files smaller than 8MB anyway. Your suggested change would require
  more work than reducing the size of the mapped memory.
 
 Forget that, I misunderstood what you said. We can still use one
 mmap(2) but should write out that memory with multile write(2)
 system calls.

FWIW, bozohttpd defaults to mapping upto 64MB regions and writing
out up to 64KB at a time.


.mrg.


Re: CVS commit: src/bin/cp

2010-10-25 Thread Juergen Hannken-Illjes
On Mon, Oct 25, 2010 at 06:46:36PM +0200, Joerg Sonnenberger wrote:
 On Mon, Oct 25, 2010 at 06:41:11PM +0200, Juergen Hannken-Illjes wrote:
   Do we implement MADV_WILLNEED?
  
  According to the man page This WILL NOT fault pages in from backing store.
 
 The version of the man page I have says It might or might not fault
 pages in from backing store.

Mine was 5.0.2, yours looks more -current :-)

-- 
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: CVS commit: src/bin/cp

2010-10-25 Thread Matthias Scheler
On Tue, Oct 26, 2010 at 05:05:28AM +1100, matthew green wrote:
 
  On Mon, Oct 25, 2010 at 08:52:43AM +0100, Matthias Scheler wrote:
   On Sun, Oct 24, 2010 at 10:56:40PM +, David Holland wrote:
Anyway, ISTM that writing from the mmap buffer in say 64K chunks would
retain nearly all the advantages and get rid of the latency problem.
   
   The way the code is currently written it only uses mmap(2) for
   files smaller than 8MB anyway. Your suggested change would require
   more work than reducing the size of the mapped memory.
  
  Forget that, I misunderstood what you said. We can still use one
  mmap(2) but should write out that memory with multile write(2)
  system calls.
 
 FWIW, bozohttpd defaults to mapping upto 64MB regions and writing
 out up to 64KB at a time.

cp(1) now maps a region upto 8MB and writes it out in chunks of at
most 64KB.

Kind regards

-- 
Matthias Scheler  http://zhadum.org.uk/


Re: CVS commit: src/bin/cp

2010-10-24 Thread David Holland
(adding tech-kern because this seems likely to become lengthy; if
following up please drop source-changes-d)

On Sun, Oct 24, 2010 at 11:30:43AM +0100, David Laight wrote:
[mmap mode disabled in cp due to long vnode lock waits]
  
   Because individual write() calls are supposed to be atomic, I don't
   think there is such a thing as a locking improvement that'll help with
   this behavior. :-/
  
  I think write() only needs to lock the the file enough to ensure that
  the file offset is correct.

No, since in general the file is also being extended (certainly in
this case it is) it also has to lock the file size, and that's going
to deny stat() until it's done.

  Possibly the written range needs locking against other accesses -
  but I think the app is supposed to use file locking for that

No, the reason single syscalls are atomic is specifically so apps
don't have to use file locking for simple cases, like e.g. writing
lines to log files.

   (and mmap will always be non-atomic w.r.t. write).

Yes, but the mmap in this case is for reading.

(what the code in cp does is map the source and then write to the
destination file from the mapped region)

  Actually if 2 writes are issued for the same part of a file, the
  kernel can act as if they were requested in either order - since the
  app(s) cannot know the order the calls would be made in!

As long as nothing else happens in between that establishes an
ordering, yes.

  Which means it could just sleep the 2nd write until the first terminates!

Yes indeed, that's what locks do :-)

  Writes with O_APPEND (and writes that extend the file) are more
  problematical since you cant allow a second such write to start until
  the first has completed - for instance it might try to read from
  an unmapped user-space address and return a short length.

Right.

   Except I guess going to some kind of multiversion model for vnodes.
  
  Don't you just need 2 locks? One for locking the data areas, and the
  other for the file data itself.

Nope. Or at least, that doesn't fix the objectionable behavior.
Because you need to lock the file size, if you want stat() to be able
to complete while the operation is in progress you need to be able to
let stat() read a version of the file size from before the operation
began.

One can do multiversion schemes in the kernel; apparently it's neither
as ghastly or as expensive as one would think. See for example the
transactional Linux paper from SOSP 2009. But it'd be a pretty big
hacking job to do it in NetBSD.

Anyway, ISTM that writing from the mmap buffer in say 64K chunks would
retain nearly all the advantages and get rid of the latency problem.

-- 
David A. Holland
dholl...@netbsd.org