On 06/05/18 18:58, Jason Smith wrote: > Hello, Pádraig, > > I was actually the one to identify the bug and analyze the code underlying it > (Illia, my colleague, was kind enough to volunteer to file the bug report and > attempt a patch), and so I may be in the better position to follow up on > this. I appreciate your addressing the matter so promptly. > > We are indeed provisionally using --remove-destination; however, we would > prefer to use --force instead for the atomicity it affords. > > Regarding the proposed solution, I think that it well addresses the present > issue. But there are related issues that I should raise while we are > discussing it and that may influence its resolution. > > To begin, it seems problematic to make the return value in the case of > hard-link creation dependent on whether the source and the destination are > associated with the same device ID, as the purpose of the same_file_ok > function seems to be to indicate whether it matters that the files may > somehow be the same. If their file systems are different, that is not a > problem with the files' being the same; it is a problem with their being on > different file systems. The resulting error message (that the operation > cannot be performed because the files are the same) is therefore incorrect > and misleading. Perhaps, as it seems to me, this check should rather be > performed outside this function and associated with its own error message. > > In fact, even if the source and destination are completely unrelated, this > function may still return false and cause the same-file error message to be > displayed. For example, let [source] be a regular file or directory on one > file system, and let [destination] be a symbolic link on another. Then the > following command will result in a same-file error, even if [destination] > does not refer to [source]: > > cp --recursive --link --no-dereference [source] [destination] > > Moreover, the name and description of the function seem misleading. One would > think from reading these that the source and destination files passed to it > have already been determined to be equivalent in some way, and that the > function is meant to determine whether it is all right in that case to > overwrite the destination; however, it appears that the function is called > whenever the destination already exists, regardless of its relation to the > source. This discrepancy can lead to misunderstandings and problems such as > the kind last described. Ideally, the function's name and description would > be made more accurate and precise, and the function used strictly accordingly. > > Assuming we were to move the file-system check outside the same_file_ok > function, we would be left with the following code within the function: > > /* It's ok to remove a destination symbolic link when creating a symbolic > link or hard link. */ > if (S_ISLNK (dst_sb_link->st_mode) && (x->symbolic_link || x->hard_link)) > return true; > > But as we earlier in the function handle the cases when both the source and > the destination are symbolic links, and there does not seem to be a reason to > constrain overwriting of symbolic links otherwise (at least within this > function), we can perhaps simplify this even further to the following: > > /* At this point, it's ok to remove a destination symbolic link. */ > if (S_ISLNK (dst_sb_link->st_mode)) > return true; > > But I have not analyzed all the relevant code to determine the full impact of > this change. > > Finally, it would seem to make sense for the same_file_ok function to return > true immediately if the source and destination files passed to it have > different inode numbers or are on different file systems (and are thus not > the same file). In fact, this is done in lines 1488-89 if DEREF_NEVER is not > set; however, it is not done if DEREF_NEVER is set. For reference, those > lines read as follows: > > if (!same) > return true; > > Cursory analysis suggests that if these lines were unconditionally executed > near the top of the function, certain problems might be avoided. > > Cheers, > Jason
same_file_ok() has accreted awkward complexity, having started out with a series of FIXMEs. It gained support for `mv hardlink1 hardlink2` in fc6073d6 and then having that removed in 222d7ac0. As part of the change in fc6073d6 (2003), it actually made the problematic clause we've been analyzing in same_file_ok() a noop. Thus that clause stagnated for 14 years and so when being converted as part of the recent 37696788 change it no longer became a noop. So while I agree this whole area definitely needs a refactor, let's address the specific issues for now, so that distributors can fix the issue with minimal risk. Since this clause was a noop for 14 years, we should discount it. However the recent change also tries to handle this case: touch foo ln -s foo symlink cp -dl foo symlink Up until the recent 37696788, cp did nothing but after it errors out saying 'symlink' exists. This could be seen as correct as -f is not specified. Now it could also be argued that a noop is ok as the symlink points to the right place, and on some systems we simulate hardlinks to symlinks with symlinks. But on the otherhand `ln foo symlink` does give EEXIST, so for consistency it would be good to keep this behavior of 37696788. The attached patch hopefully handles everything, and adds tests for the two problematic cases. thanks again, Pádraig.
From df3e53f003533feb9a4ae1daa01ad1a7cd9e0c96 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]> Date: Mon, 14 May 2018 02:26:05 -0700 Subject: [PATCH] cp: fix symlink checks when overwriting files Ensure this _does_ recreate the symlink Given "path1" and "path2" are on different devices. $ touch "path1/file" $ cd path2/; ln -s path1/file $ cp -sf path1/file . Ensure this does _not_ overwrite file $ touch file $ ln -s file l1 $ cp -sf l1 file * src/copy.c (same_file_ok): Remove device ids from consideration, instead deferring to future EXDEV with --link or allowing the first case above to work. Also ensure that we do not exist this function too early, when the destination file is not a symlink, which protects against the second case. * tests/cp/cross-dev-symlink.sh: Add a test for the first case. * tests/cp/same-file.sh: Add a test for the second case above. * NEWS: Mention the bug fixes. * THANKS.in: Mention the reporters who also analyzed the code. Fixes https://bugs.gnu.org/31364 --- NEWS | 7 +++++++ THANKS.in | 2 ++ src/copy.c | 18 ++++++++---------- tests/cp/cross-dev-symlink.sh | 38 ++++++++++++++++++++++++++++++++++++++ tests/cp/same-file.sh | 15 ++++++++++++++- tests/local.mk | 1 + 6 files changed, 70 insertions(+), 11 deletions(-) create mode 100755 tests/cp/cross-dev-symlink.sh diff --git a/NEWS b/NEWS index de02814..7aa2925 100644 --- a/NEWS +++ b/NEWS @@ -4,6 +4,13 @@ GNU coreutils NEWS -*- outline -*- ** Bug fixes + 'cp --symlink SRC DST' will again correctly validate DST. + If DST is a regular file and SRC is a symlink to DST, + then cp will no longer allow that operation to clobber DST. + Also if DST is a symlink, then it can always be replaced, + even if it points to SRC on a separate device. + [bug introduced with coreutils-8.27] + 'ls -aA' is now equivalent to 'ls -A', since -A now overrides -a. [bug introduced in coreutils-5.3.0] diff --git a/THANKS.in b/THANKS.in index c63cc9b..3951b66 100644 --- a/THANKS.in +++ b/THANKS.in @@ -261,6 +261,7 @@ Ian Kent [email protected] Ian Lance Taylor [email protected] Ian Turner [email protected] Iida Yosiaki [email protected] +Illia Bobyr [email protected] Ilya N. Golubev [email protected] Ingo Saitz [email protected] Ivan Labath [email protected] @@ -285,6 +286,7 @@ Jan-Pawel Wrozstinski [email protected] Jari Aalto [email protected] Jarkko Hietaniemi [email protected] Jarod Wilson [email protected] +Jason Smith [email protected] Jean Charles Delepine [email protected] Jean-Pierre Tosoni [email protected] Jeff Moore [email protected] diff --git a/src/copy.c b/src/copy.c index 4998c83..0407c56 100644 --- a/src/copy.c +++ b/src/copy.c @@ -1627,14 +1627,9 @@ same_file_ok (char const *src_name, struct stat const *src_sb, } } - /* It's ok to remove a destination symlink. But that works only - when creating symbolic links, or when the source and destination - are on the same file system and when creating hard links or when - unlinking before opening the destination. */ - if (x->symbolic_link - || ((x->hard_link || x->unlink_dest_before_opening) - && S_ISLNK (dst_sb_link->st_mode))) - return dst_sb_link->st_dev == src_sb_link->st_dev; + /* It's ok to recreate a destination symlink. */ + if (x->symbolic_link && S_ISLNK (dst_sb_link->st_mode)) + return true; if (x->dereference == DEREF_NEVER) { @@ -1651,10 +1646,13 @@ same_file_ok (char const *src_name, struct stat const *src_sb, if ( ! SAME_INODE (tmp_src_sb, tmp_dst_sb)) return true; - /* FIXME: shouldn't this be testing whether we're making symlinks? */ if (x->hard_link) { - *return_now = true; + /* It's ok to attempt to hardlink the same file, + and return early if not replacing a symlink. + Note we need to return early to avoid a later + unlink() of DST (when SRC is a symlink). */ + *return_now = ! S_ISLNK (dst_sb_link->st_mode); return true; } } diff --git a/tests/cp/cross-dev-symlink.sh b/tests/cp/cross-dev-symlink.sh new file mode 100755 index 0000000..e945b40 --- /dev/null +++ b/tests/cp/cross-dev-symlink.sh @@ -0,0 +1,38 @@ +#!/bin/sh +# Ensure symlinks can be replaced across devices + +# Copyright (C) 2018 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <https://www.gnu.org/licenses/>. + +. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src +print_ver_ cp +require_root_ + +cwd=$(pwd) +cleanup_() { cd /; umount "$cwd/mnt"; } + +truncate -s100M fs.img || framework_failure_ +mkfs -t ext4 fs.img || skip_ 'failed to create ext4 file system' +mkdir mnt || framework_failure_ +mount fs.img mnt || skip_ 'failed to mount ext4 file system' + +mkdir mnt/path1 || framework_failure_ +touch mnt/path1/file || framework_failure_ +mkdir path2 || framework_failure_ +cd path2 && ln -s ../mnt/path1/file || framework_failure_ + +cp -sf ../mnt/path1/file . 2>err || fail=1 + +Exit $fail diff --git a/tests/cp/same-file.sh b/tests/cp/same-file.sh index 01bdb98..e50a991 100755 --- a/tests/cp/same-file.sh +++ b/tests/cp/same-file.sh @@ -47,7 +47,7 @@ contents=XYZ for args in 'foo symlink' 'symlink foo' 'foo foo' 'sl1 sl2' \ 'foo hardlink' 'hlsl sl2'; do for options in '' -d -f -df --rem -b -bd -bf -bdf \ - -l -dl -fl -dfl -bl -bdl -bfl -bdfl; do + -l -dl -fl -dfl -bl -bdl -bfl -bdfl -s -sf; do case $args$options in # These tests are not portable. # They all involve making a hard link to a symbolic link. @@ -100,6 +100,7 @@ for args in 'foo symlink' 'symlink foo' 'foo foo' 'sl1 sl2' \ # and put brackets around the output. if test -s _err; then sed ' + s/symbolic link/symlink/ s/^[^:]*:\([^:]*\).*/cp:\1/ 1s/^/[/ $s/$/]/ @@ -149,6 +150,8 @@ cat <<\EOF | sed "$remove_these_sed" > expected 0 -bdl (foo symlink symlink.~1~ -> foo) 0 -bfl (foo symlink symlink.~1~ -> foo) 0 -bdfl (foo symlink symlink.~1~ -> foo) +1 -s [cp: cannot create symlink 'symlink' to 'foo'] (foo symlink -> foo) +0 -sf (foo symlink -> foo) 1 [cp: 'symlink' and 'foo' are the same file] (foo symlink -> foo) 1 -d [cp: 'symlink' and 'foo' are the same file] (foo symlink -> foo) @@ -164,6 +167,8 @@ cat <<\EOF | sed "$remove_these_sed" > expected 0 -fl (foo symlink -> foo) 0 -bl (foo symlink -> foo) 0 -bfl (foo symlink -> foo) +1 -s [cp: 'symlink' and 'foo' are the same file] (foo symlink -> foo) +1 -sf [cp: 'symlink' and 'foo' are the same file] (foo symlink -> foo) 1 [cp: 'foo' and 'foo' are the same file] (foo) 1 -d [cp: 'foo' and 'foo' are the same file] (foo) @@ -182,6 +187,8 @@ cat <<\EOF | sed "$remove_these_sed" > expected 0 -bdl (foo) 0 -bfl (foo foo.~1~) 0 -bdfl (foo foo.~1~) +1 -s [cp: 'foo' and 'foo' are the same file] (foo) +1 -sf [cp: 'foo' and 'foo' are the same file] (foo) 1 [cp: 'sl1' and 'sl2' are the same file] (foo sl1 -> foo sl2 -> foo) 0 -d (foo sl1 -> foo sl2 -> foo) @@ -196,6 +203,8 @@ cat <<\EOF | sed "$remove_these_sed" > expected 0 -fl (foo sl1 -> foo sl2) 0 -bl (foo sl1 -> foo sl2 sl2.~1~ -> foo) 0 -bfl (foo sl1 -> foo sl2 sl2.~1~ -> foo) +1 -s [cp: cannot create symlink 'sl2' to 'sl1'] (foo sl1 -> foo sl2 -> foo) +0 -sf (foo sl1 -> foo sl2 -> sl1) 1 [cp: 'foo' and 'hardlink' are the same file] (foo hardlink) 1 -d [cp: 'foo' and 'hardlink' are the same file] (foo hardlink) @@ -214,6 +223,8 @@ cat <<\EOF | sed "$remove_these_sed" > expected 0 -bdl (foo hardlink) 0 -bfl (foo hardlink) 0 -bdfl (foo hardlink) +1 -s [cp: 'foo' and 'hardlink' are the same file] (foo hardlink) +1 -sf [cp: 'foo' and 'hardlink' are the same file] (foo hardlink) 1 [cp: 'hlsl' and 'sl2' are the same file] (foo hlsl -> foo sl2 -> foo) 0 -d (foo hlsl -> foo sl2 -> foo) @@ -232,6 +243,8 @@ cat <<\EOF | sed "$remove_these_sed" > expected 0 -bdl (foo hlsl -> foo sl2 -> foo) 0 -bfl (foo hlsl -> foo sl2 sl2.~1~ -> foo) 0 -bdfl (foo hlsl -> foo sl2 -> foo) +1 -s [cp: cannot create symlink 'sl2' to 'hlsl'] (foo hlsl -> foo sl2 -> foo) +0 -sf (foo hlsl -> foo sl2 -> hlsl) EOF diff --git a/tests/local.mk b/tests/local.mk index e60ea1d..1f8b189 100644 --- a/tests/local.mk +++ b/tests/local.mk @@ -113,6 +113,7 @@ all_root_tests = \ tests/cp/cp-mv-enotsup-xattr.sh \ tests/cp/capability.sh \ tests/cp/sparse-fiemap.sh \ + tests/cp/cross-dev-symlink.sh \ tests/dd/skip-seek-past-dev.sh \ tests/df/problematic-chars.sh \ tests/df/over-mount-device.sh \ -- 2.9.3
