On 9/21/25 11:12 AM, Pádraig Brady wrote:
On 21/09/2025 06:30, Paul Eggert wrote:On 2025-09-20 14:34, Kye Hunter wrote:I don't think there's any need to look beyond the most intuitive implementation of the --update=older option: If the timestamps on the target are newer than the timestamps on the source, don't do anything.We're talking only mtime, right? --update looks only at mtime, not at atime or ctime or whatever.I'm only seeing extra calls to chmod being applied to directoriesThat's as per the documentation: --update affects only non-directories./* Avoid calling chown if we know it's not necessary. */ , so it seems reasonable that the behavior of preserving the mode should be the same.Sure, but the problem here is that it appears cp doesn't know whether calling chmod is necessary. If deciding whether it's necessary is likely more expensive than just doing the chmod, cp might as well just do the chmod.Again, to make this more concrete let's consider this code: touch foo cp -l foo bar mkdir qux cp --preserve=links --update=older --target-directory="./qux" * strace cp --preserve=links --update=older --target-directory="./qux" *Here, with the --update=older option, it seems unexpected that ./qux/fooand ./qux/bar are being relinked the second time they are copied from ".", when what the manual describes is "--update[=UPDATE] control which existing files are updated" and "'older' [...] results in files being replaced if they're older than the corresponding source file."--update affects only how non-directories are processed. In this particular case, non-directories are indeed skipped; only the directory qux is updated. So 'cp' is conforming to its documentation. You're right that 'cp' could avoid some relinking if it kept track of st_dev and st_ino for all destinations it's already updated. But 'cp' would need to spend more RAM to do that, no? That could be a significant burden in a recursive copy.Code to show the current behavior of cp for scenario 2:Unfortunately I'm not following that scenario or its description (it's fairly long....).This line even comes with the helpful comment /* Note we currently replace DST_NAMEunconditionally, even if it was a newer separate file. */ , so actuallythis behavior is somehow intentional, but I don't see any justification given for this exception to the expected behavior of --update=older or --no-preserve=links.Nor do I. It seems a clear bug, or at least a problem. However, there clearly was intent here: see Pádraig's commit 2aea1828a1aab158f68cccf3eac408203889021e dated 2011-07-27 which puts into tests/cp/preserve/link the comment "a separate newer file in dest will be overwritten!" indicating that the behavior you subscribe is at least surprising. I would be inclined to change the behavior to be the way you suggest. This would simplify cp and make it a bit faster, and would make its behavior easier to explain. However we're close to a release right now and we should wait for Pádraig's opinion as he wrote the code (and the test case) in question. Thanks for helping me think through all this.Yes this seems a bit hairy to change just before (hopefully tomorrow's) release.As for the code/test, it's a while ago, but the exclamation mark by me suggests the code change was a refactor, the behavior was surprising, but worth adding a test for documentation at least. I.e. the behavior could be changed IMHO (I've not thought about this much TBH). cheers, Padraig cheers, Padraig
Yeah definitely not the kind of thing I would want to change the day before a release; looking forward to this change in 9.9! (or 10.0, idk what your release cycle is like.)
I think it's fair to say that, besides this bug with linking, the rest of the odd behavior we've been talking about seems to be conforming to documentation. Digging though the code a bit, I have two suggestions for some reasonably simple changes to cp that I think would be fairly clear improvements for the user. I don't think the two ideas are necessarily exclusive, but I'm not sure that there would be much point in doing both either.
1. At https://github.com/coreutils/coreutils/blob/master/src/copy.c#L2691 , cp_internal calls xcopy_acl, which calls qcopy_acl, which calls chmod_or_fchmod (and then other stuff about ACLs). There doesn't seem to be anything simple to do at this level about the ACLs, but in cp_internal the variables src_sb and dst_sb already contain the mode bits for the source and destination files. So if the signatures of xcopy_acl and qcopy_acl were modified to accept an optional ONLY_ACLS flag, then the mode bits in src_sb and dst_sb could be compared in copy_internal to tell qcopy_acl that it can skip the call to chmod_or_fchmod. Probably this kind of optimization would not be useful in the normal use-case of cp, but I think it makes sense that when the --update=older flag is present, most destination files probably exist and don't need their mode changed. So the tree of if statements starting at https://github.com/coreutils/coreutils/blob/master/src/copy.c#L2689 could contain a section for the case of --update=older and in that case pass the extra flag to xcopy_acl. 2. It makes sense that the --update=older option is mainly interested in the file's data, and is using the mtime to check if that has changed. I think it would not be too difficult of a change to add another "--update-attrs" flag which would have similar usage to --update, and could use fairly simple logic with the ctime to check if the other file attributes have changed. Similar to the first point, the ctime is already present in src_sb and dst_sb, so the implementation would mostly just be a matter of adding a little extra logic to check for a --update-attrs flag and to compare the ctimes. This would involve changes in more areas of the code, but it seems like they would mostly be fairly simple. It's not totally clear to me exactly how the OS is deciding when ctime should be set (I don't think I even knew ctime existed 2 weeks ago 😂), but even if ctime does not perfectly capture the idea of "only updating attributes when they've changed in the source", it seems like this approach would still do reasonably well for the simplicity of implementing it.There might also be some room to clarify the documentation of the --update flag, e.g., that as you say, it is only intended to affect non-directories, and is only concerned with the file's data, rather than its attributes.
Thanks, -Kye -- Kye E. Hunter PGP: 6859 E2DE D598 49EA 9319 10CD DEF2 BA03 A6BE 3062 --
OpenPGP_signature.asc
Description: OpenPGP digital signature
