[PATCH] libb/remove_file.c: Use correct algorithm to delete directory

2015-04-13 Thread Leonid Fedorenchik
Following algorithm to delete directory is wrong:

opendir();
while (readdir()) unlink();
closedir();

This is because it is unspecified which entry readdir() will return if
files were added/removed since the last call to opendir() or rewinddir():
http://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html
Also, readdir() may buffer several directory entries per actual read
operation.
Because of all this, readdir() function may skip some entries if files
were added/removed to/from the directory and the delete directory will
fail.

This can be very easily verified with the following test:
In one terminal run command below:

while true; do echo "aaa bbb ccc" >rand/$RANDOM.txt; done

And in the other terminal try to delete directory:

busybox rm -rf rand

This will fail with the error:

rm: can't remove 'rand': Directory not empty

(Of course we are talking about deleting non-empty directories now.)

The following algorithm is (one of) correct one(s):

opendir();
do {
unlinked_files = 0;
while (readdir()) {
unlink();
unlinked_files = 1;
}
if (unlinked_files)
rewinddir();
} while (unlinked_files);
closedir();

This commit corrects the algorithm to delete the directory.

Signed-off-by: Leonid Fedorenchik 
---
 libbb/remove_file.c | 28 ++--
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/libbb/remove_file.c b/libbb/remove_file.c
index eaca293..b609be9 100644
--- a/libbb/remove_file.c
+++ b/libbb/remove_file.c
@@ -51,16 +51,24 @@ int FAST_FUNC remove_file(const char *path, int flags)
return -1;
}
 
-   while ((d = readdir(dp)) != NULL) {
-   char *new_path;
-
-   new_path = concat_subpath_file(path, d->d_name);
-   if (new_path == NULL)
-   continue;
-   if (remove_file(new_path, flags) < 0)
-   status = -1;
-   free(new_path);
-   }
+   int unlinked_files;
+   do {
+   unlinked_files = 0;
+   while ((d = readdir(dp)) != NULL) {
+   char *new_path;
+
+   new_path = concat_subpath_file(path, d->d_name);
+   if (new_path == NULL)
+   continue;
+   if (remove_file(new_path, flags) < 0)
+   status = -1;
+   free(new_path);
+   unlinked_files = 1;
+   }
+   if (unlinked_files) {
+   rewinddir(dp);
+   }
+   } while (unlinked_files);
 
if (closedir(dp) < 0) {
bb_perror_msg("can't close '%s'", path);
-- 
2.3.2

-- 

Best regards,
Leonid Fedorenchik

Software Engineer
Paragon Software Group
Skype: leonid.fedorenchik
http://www.paragon-software.com
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: [PATCH] libb/remove_file.c: Use correct algorithm to delete directory

2015-04-13 Thread Laurent Bercot

On 13/04/2015 12:54, Leonid Fedorenchik wrote:

opendir();
do {
unlinked_files = 0;
while (readdir()) {
unlink();
unlinked_files = 1;
}
if (unlinked_files)
rewinddir();
} while (unlinked_files);
closedir();


 That is actually incorrect. It depends on non-portable and
unspecified directory implementation.

 The readdir() specification page you linked says:

 "If a file is removed from or added to the directory after the most
recent call to opendir() or rewinddir(), whether a subsequent call to
readdir() returns an entry for that file is unspecified."

 Which means that it is very possible for readdir() to keep returning
NULL after a rewinddir() even if files have been added to the directory
in the meantime. The fact that your readdir() implementation doesn't is
accidental.

 If you really want to use that algorithm, you have to loop around
opendir() and closedir(): re-open the directory after every pass, until
it is empty. But this is dangerous: it opens a race condition where you
could delete a new directory created by another process after you
successfully deleted the old one.

 In other words, there is no way to atomically delete a file hierarchy
in Unix, and every "rm -rf" implementation is a compromise.

 A way that I have found safer than most is the following:
 * atomically rename() the directory to a non-existing, unique, random,
hard to predict directory name.
 * recursively delete the newly named directory with your favorite
deletion algorithm. No matter what method you choose, there is still
a race condition, but the chances it will be triggered are significantly
reduced.

--
 Laurent

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: [PATCH] libb/remove_file.c: Use correct algorithm to delete directory

2015-04-13 Thread Laurent Bercot

On 13/04/2015 13:38, Laurent Bercot wrote:

  Which means that it is very possible for readdir() to keep returning
NULL after a rewinddir() even if files have been added to the directory
in the meantime.


 Sorry, I was wrong about that. rewinddir() actually sets the directory
in the same state as opendir(), so re-opening the directory after every
pass is unnecessary.

 The reason why the algorithm is still susceptible to a race condition is
that a file can always been added between the most recent invocation of
opendir()/rewinddir() and the next invocation of readdir(). In other words,
a concurrent process that constantly creates files will compete with the
process that unlinks them, and the results are unpredictable. Also, the
concurrent process can still create a file between a readdir() that returns
NULL (which will be interpreted as "success, the directory is empty, we can
now unlink it) and the unlink(): so unlinking the directory name can still
fail.

 I apologize for the confusion: I gave the wrong reason why this algorithm
is not better than the current busybox one. The fact remains that it is
impossible to guarantee that the unlink() of a directory will succeed, and
adding a rewinddir() loop does not add value.

--
 Laurent

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox