[PATCH] libb/remove_file.c: Use correct algorithm to delete directory
Following algorithm to delete directory is wrong: opendir(); while (readdir()) unlink(); closedir(); This is because it is unspecified which entry readdir() will return if files were added/removed since the last call to opendir() or rewinddir(): http://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html Also, readdir() may buffer several directory entries per actual read operation. Because of all this, readdir() function may skip some entries if files were added/removed to/from the directory and the delete directory will fail. This can be very easily verified with the following test: In one terminal run command below: while true; do echo "aaa bbb ccc" >rand/$RANDOM.txt; done And in the other terminal try to delete directory: busybox rm -rf rand This will fail with the error: rm: can't remove 'rand': Directory not empty (Of course we are talking about deleting non-empty directories now.) The following algorithm is (one of) correct one(s): opendir(); do { unlinked_files = 0; while (readdir()) { unlink(); unlinked_files = 1; } if (unlinked_files) rewinddir(); } while (unlinked_files); closedir(); This commit corrects the algorithm to delete the directory. Signed-off-by: Leonid Fedorenchik --- libbb/remove_file.c | 28 ++-- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/libbb/remove_file.c b/libbb/remove_file.c index eaca293..b609be9 100644 --- a/libbb/remove_file.c +++ b/libbb/remove_file.c @@ -51,16 +51,24 @@ int FAST_FUNC remove_file(const char *path, int flags) return -1; } - while ((d = readdir(dp)) != NULL) { - char *new_path; - - new_path = concat_subpath_file(path, d->d_name); - if (new_path == NULL) - continue; - if (remove_file(new_path, flags) < 0) - status = -1; - free(new_path); - } + int unlinked_files; + do { + unlinked_files = 0; + while ((d = readdir(dp)) != NULL) { + char *new_path; + + new_path = concat_subpath_file(path, d->d_name); + if (new_path == NULL) + continue; + if (remove_file(new_path, flags) < 0) + status = -1; + free(new_path); + unlinked_files = 1; + } + if (unlinked_files) { + rewinddir(dp); + } + } while (unlinked_files); if (closedir(dp) < 0) { bb_perror_msg("can't close '%s'", path); -- 2.3.2 -- Best regards, Leonid Fedorenchik Software Engineer Paragon Software Group Skype: leonid.fedorenchik http://www.paragon-software.com ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] libb/remove_file.c: Use correct algorithm to delete directory
On 13/04/2015 12:54, Leonid Fedorenchik wrote: opendir(); do { unlinked_files = 0; while (readdir()) { unlink(); unlinked_files = 1; } if (unlinked_files) rewinddir(); } while (unlinked_files); closedir(); That is actually incorrect. It depends on non-portable and unspecified directory implementation. The readdir() specification page you linked says: "If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified." Which means that it is very possible for readdir() to keep returning NULL after a rewinddir() even if files have been added to the directory in the meantime. The fact that your readdir() implementation doesn't is accidental. If you really want to use that algorithm, you have to loop around opendir() and closedir(): re-open the directory after every pass, until it is empty. But this is dangerous: it opens a race condition where you could delete a new directory created by another process after you successfully deleted the old one. In other words, there is no way to atomically delete a file hierarchy in Unix, and every "rm -rf" implementation is a compromise. A way that I have found safer than most is the following: * atomically rename() the directory to a non-existing, unique, random, hard to predict directory name. * recursively delete the newly named directory with your favorite deletion algorithm. No matter what method you choose, there is still a race condition, but the chances it will be triggered are significantly reduced. -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] libb/remove_file.c: Use correct algorithm to delete directory
On 13/04/2015 13:38, Laurent Bercot wrote: Which means that it is very possible for readdir() to keep returning NULL after a rewinddir() even if files have been added to the directory in the meantime. Sorry, I was wrong about that. rewinddir() actually sets the directory in the same state as opendir(), so re-opening the directory after every pass is unnecessary. The reason why the algorithm is still susceptible to a race condition is that a file can always been added between the most recent invocation of opendir()/rewinddir() and the next invocation of readdir(). In other words, a concurrent process that constantly creates files will compete with the process that unlinks them, and the results are unpredictable. Also, the concurrent process can still create a file between a readdir() that returns NULL (which will be interpreted as "success, the directory is empty, we can now unlink it) and the unlink(): so unlinking the directory name can still fail. I apologize for the confusion: I gave the wrong reason why this algorithm is not better than the current busybox one. The fact remains that it is impossible to guarantee that the unlink() of a directory will succeed, and adding a rewinddir() loop does not add value. -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox