Alex wrote:
> Does the GNU coreutils 'cp' utility guarantee that the target file
> after copying is the same as the source one?

Mostly it is a correct-by-construction process. GNU 'cp' will report
any error that occurs during the copy. If no error is reported then
the copy was correct.

> If it doesn't then I'll need to make my own diff-ing or checksums
> verifying, right? Or maybe, all the copying is implemented via
> 100%-reliable low-level calls, so all the checking I'm talking about
> is redundant?

The 'cp' command reads and writes files using the kernel system calls.
The only way to end up with a file that isn't identical to the source
is if the kernel is buggy and incorrectly reports success when in
actuality the operation had failed. Otherwise, if the read and write
calls both return success then the file will be successfully copied.
Therefore in the 'cp' command itself there isn't a need to do an
additional comparison check, and indeed, especially on large files,
such a check would be a severe performance penalty.

Note that "sparse" files are somewhat of a special case and can be
expanded or preserved depending upon the options used for the copy.
But I don't think that is what you are talking about.

In summary, I don't think you need to do an additional integrity check
if 'cp' reports success.

There are times when it is useful to be able to deduce whether a
/previous/ run of 'cp' was successful. For example, the 'cp' command
may have been prevented from finishing because power was lost to the
system. Obviously no success or failure was reported, the calling
process didn't get to run either, and the files might not be
identical. There may be a partially written file on disk in that case.
Even if you added a post-copy check you could be in this condition,
since the post-copy check couldn't run with the power off either.

The 'rsync' tool is very useful in such situations for two reasons.
One is that it will re-sync the files only if they are not the same,
which makes recovery efficient, and it does nothing if nothing needs
to be done, which makes doing nothing very efficient too.
Another is that rsync copies files to a temporary location and then
renames them into place when the full file is available, so as to
avoid a time when only a partial file is in place. However even that
venerable technique fails on some newer buggy filesystem
implementations that try to optimize too much and reorder actions.
(Trying to avoid starting a long discussion about it here, but people
who recognize what I am referring to will know the arguments on both
sides.)

In any case, using 'rsync' is useful if you need to be able to run the
same command repeatedly and want to avoid unnecessary copies.

Bob
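
P.S. For illustration only, here is a minimal Python sketch of the
copy-to-a-temporary-name-then-rename technique described above. It is
not rsync's actual code. The function name copy_via_rename, the
".copy-" prefix and the buffer size are all made up for the example.
The fsync before the rename is the usual precaution against the
reordering problem mentioned above; whether it is strictly needed
depends on the filesystem.

```python
import os
import tempfile


def copy_via_rename(src, dst, bufsize=1024 * 1024):
    """Copy src to dst so that dst never holds a partial file.

    Hypothetical helper for illustration. Only file contents are
    copied; ownership, permissions and timestamps are not.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Create the temporary file in the destination directory so the
    # final rename stays within one filesystem.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".copy-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            while True:
                chunk = inp.read(bufsize)
                if not chunk:
                    break
                # Raises OSError if the write fails.
                out.write(chunk)
            # Push the data to disk before the name becomes visible.
            out.flush()
            os.fsync(out.fileno())
        # Rename into place; this replaces any existing dst.
        os.replace(tmp, dst)
    except BaseException:
        # On any failure remove the partial temporary file and leave
        # dst exactly as it was.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

Calling copy_via_rename("big.iso", "/backup/big.iso") (both paths
hypothetical) either raises an exception or leaves a complete file
under the final name. After a crash the worst case is a stray
".copy-*" temporary file, which is easy to recognize and delete.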