On 9/21/25 11:12 AM, Pádraig Brady wrote:
On 21/09/2025 06:30, Paul Eggert wrote:
On 2025-09-20 14:34, Kye Hunter wrote:

I don't
think there's any need to look beyond the most intuitive implementation
of the --update=older option: If the timestamps on the target are newer
than the timestamps on the source, don't do anything.

We're talking only mtime, right? --update looks only at mtime, not at
atime or ctime or whatever.

I'm only seeing extra calls to chmod being applied to directories

That's as per the documentation: --update affects only non-directories.

/* Avoid calling chown if we know it's not necessary.  */ , so it seems reasonable that the behavior of preserving the mode should be the same.

Sure, but the problem here is that it appears cp doesn't know whether
calling chmod is necessary. If deciding whether it's necessary is likely
more expensive than just doing the chmod, cp might as well just do the
chmod.

Again, to make this more concrete let's consider this code:
touch foo
cp -l foo bar
mkdir qux
cp --preserve=links --update=older --target-directory="./qux" *
strace cp --preserve=links --update=older --target-directory="./qux" *

Here, with the --update=older option, it seems unexpected that ./qux/foo
and ./qux/bar are being relinked the second time they are copied from
".", when what the manual describes is "--update[=UPDATE] control
which existing files are updated" and "'older' [...] results in files
being replaced if they're older than the corresponding source file."

--update affects only how non-directories are processed. In this
particular case, non-directories are indeed skipped; only the directory
qux is updated. So 'cp' is conforming to its documentation.

You're right that 'cp' could avoid some relinking if it kept track of
st_dev and st_ino for all destinations it's already updated. But 'cp'
would need to spend more RAM to do that, no? That could be a significant
burden in a recursive copy.

Code to show the current behavior of cp for scenario 2:

Unfortunately I'm not following that scenario or its description (it's
fairly long....).

This line even
comes with the helpful comment /* Note we currently replace DST_NAME
unconditionally, even if it was a newer separate file.  */ , so actually
this behavior is somehow intentional, but I don't see any justification
given for this exception to the expected behavior of --update=older or
--no-preserve=links.

Nor do I. It seems a clear bug, or at least a problem. However, there
clearly was intent here: see Pádraig's commit
2aea1828a1aab158f68cccf3eac408203889021e dated 2011-07-27 which puts
into tests/cp/preserve/link the comment "a separate newer file in dest
will be overwritten!" indicating that the behavior you subscribe is at
least surprising.

I would be inclined to change the behavior to be the way you suggest.
This would simplify cp and make it a bit faster, and would make its
behavior easier to explain. However we're close to a release right now
and we should wait for Pádraig's opinion as he wrote the code (and the
test case) in question.

Thanks for helping me think through all this.

Yes this seems a bit hairy to change just before (hopefully tomorrow's) release.
As for the code/test, it's a while ago, but the exclamation mark by me
suggests the code change was a refactor, the behavior was surprising,
but worth adding a test for documentation at least. I.e. the behavior
could be changed IMHO (I've not thought about this much TBH).

cheers,
Padraig

cheers,
Padraig

Yeah definitely not the kind of thing I would want to change the day before a release; looking forward to this change in 9.9! (or 10.0, idk what your release cycle is like.)

I think it's fair to say that, besides this bug with linking, the rest of the odd behavior we've been talking about seems to be conforming to documentation. Digging though the code a bit, I have two suggestions for some reasonably simple changes to cp that I think would be fairly clear improvements for the user. I don't think the two ideas are necessarily exclusive, but I'm not sure that there would be much point in doing both either.

1. At
   https://github.com/coreutils/coreutils/blob/master/src/copy.c#L2691
   , cp_internal calls xcopy_acl, which calls qcopy_acl, which calls
   chmod_or_fchmod (and then other stuff about ACLs). There doesn't
   seem to be anything simple to do at this level about the ACLs, but
   in cp_internal the variables src_sb and dst_sb already contain the
   mode bits for the source and destination files. So if the signatures
   of xcopy_acl and qcopy_acl were modified to accept an optional
   ONLY_ACLS flag, then the mode bits in src_sb and dst_sb could be
   compared in copy_internal to tell qcopy_acl that it can skip the
   call to chmod_or_fchmod. Probably this kind of optimization would
   not be useful in the normal use-case of cp, but I think it makes
   sense that when the --update=older flag is present, most destination
   files probably exist and don't need their mode changed. So the tree
   of if statements starting at
   https://github.com/coreutils/coreutils/blob/master/src/copy.c#L2689
   could contain a section for the case of --update=older and in that
   case pass the extra flag to xcopy_acl.
2. It makes sense that the --update=older option is mainly interested
   in the file's data, and is using the mtime to check if that has
   changed. I think it would not be too difficult of a change to add
   another "--update-attrs" flag which would have similar usage to
   --update, and could use fairly simple logic with the ctime to check
   if the other file attributes have changed. Similar to the first
   point, the ctime is already present in src_sb and dst_sb, so the
   implementation would mostly just be a matter of adding a little
   extra logic to check for a --update-attrs flag and to compare the
   ctimes. This would involve changes in more areas of the code, but it
   seems like they would mostly be fairly simple. It's not totally
   clear to me exactly how the OS is deciding when ctime should be set
   (I don't think I even knew ctime existed 2 weeks ago 😂), but even
   if ctime does not perfectly capture the idea of "only updating
   attributes when they've changed in the source", it seems like this
   approach would still do reasonably well for the simplicity of
   implementing it.

There might also be some room to clarify the documentation of the --update flag, e.g., that as you say, it is only intended to affect non-directories, and is only concerned with the file's data, rather than its attributes.

Thanks,

-Kye

--
Kye E. Hunter
PGP: 6859 E2DE D598 49EA 9319  10CD DEF2 BA03 A6BE 3062
--


Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature

Reply via email to