Re: Deletion race in NtSetFileInformation ? (Directory not empty error in rm -r -f)

2010-09-14 Thread Earl Chew
Corinna Vinschen wrote:
> ...or having a cwd below the directory.  Trying to remove a directory
> which is the CWD of some process is the most common reason that the
> directory is blocked, because the Win32 CWD is opened without the
> FILE_SHARE_DELETE flag.  Especially something like `rm -rf ../foo'
> is suspicious, if foo is the CWD of the current shell.

Hmm ... the other thing that I just remembered is that I first noticed
this problem on 1.7.5-1 on Win7, and the thing that made me suspicious
was that replacing the offending command with:

strace rm -f -r ...

made the command suddenly work! But ...

sleep 1 ; rm -f -r ...

failed in the same way  :-(

I haven't tried reproducing this particular behaviour on 1.7.7 (yet).

Earl

--
Problem reports:   http://cygwin.com/problems.html
FAQ:   http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple



Re: Deletion race in NtSetFileInformation ? (Directory not empty error in rm -r -f)

2010-09-14 Thread Corinna Vinschen
On Sep 14 09:39, Earl Chew wrote:
> > There shouldn't be any race.  When you set the delete disposition,
> > the file is actually deleted as soon as the last handle to the file
> > is closed.  If the file isn't opened by another process, it will
> > disappear right at the NtClose at the end of unlink_nt.  Please note
> > that the call to check_dir_not_empty already takes place *only* if
> > trying to open the directory failed with STATUS_SHARING_VIOLATION.
> > So there *was* another process blocking things.
> 
> Corinna,
> 
> Yes, I noticed that check wrt STATUS_SHARING_VIOLATION.
> 
> These actions are performed as a consequence of a shell script
> that launches a Makefile, which performs the rm -r -f ..., so within
> that context there is definitely scope for oversight, and we
> might inadvertently have a process getting in the way.
> 
> When you describe the other process blocking things, what might
> that other process be doing?
> 
> I presume that the other process having the directory in question
> open, or as its cwd, is sufficient.

...or having a cwd below the directory.  Trying to remove a directory
which is the CWD of some process is the most common reason that the
directory is blocked, because the Win32 CWD is opened without the
FILE_SHARE_DELETE flag.  Especially something like `rm -rf ../foo'
is suspicious, if foo is the CWD of the current shell.

We're trying to revert this to the Linux way again in 1.7.8 (see the
thread starting at http://cygwin.com/ml/cygwin/2010-09/msg00342.html),
but even after that the problem remains for any non-Cygwin process.
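As a rough illustration of the cwd condition described above, the following Python sketch tests whether a given working directory is the target directory itself or lies somewhere below it. The function name `cwd_blocks_removal` and the example paths are hypothetical; the check is a pure path comparison that ignores symlinks and case-insensitivity, and it does not query any real process.

```python
import posixpath


def cwd_blocks_removal(cwd, target):
    """Return True if cwd is the target directory or lies below it.

    Hypothetical helper: a plain string comparison on normalised POSIX
    paths, not a query of what any real process has open.
    """
    cwd = posixpath.normpath(cwd)
    target = posixpath.normpath(target)
    return cwd == target or cwd.startswith(target.rstrip("/") + "/")


# The `rm -rf ../foo' case: the shell's cwd is foo itself.
shell_cwd = "/home/user/foo"  # assumed example path
target = posixpath.join(shell_cwd, "../foo")
print(cwd_blocks_removal(shell_cwd, target))

# A cwd below the directory counts as well.
print(cwd_blocks_removal("/build/win32/dll", "/build/win32"))
```

Normalising both paths first is what makes `../foo` resolve to the shell's own cwd, which is why that form of the command is worth a second look.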

> Is there anything else I should be on the lookout for?

Virus scanners, etc.  There *might* be some unfortunate interaction
with a scanner which keeps handles of just deleted files or dirs open.


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader  cygwin AT cygwin DOT com
Red Hat




Re: Deletion race in NtSetFileInformation ? (Directory not empty error in rm -r -f)

2010-09-14 Thread Earl Chew
> There shouldn't be any race.  When you set the delete disposition,
> the file is actually deleted as soon as the last handle to the file
> is closed.  If the file isn't opened by another process, it will
> disappear right at the NtClose at the end of unlink_nt.  Please note
> that the call to check_dir_not_empty already takes place *only* if
> trying to open the directory failed with STATUS_SHARING_VIOLATION.
> So there *was* another process blocking things.

Corinna,

Yes, I noticed that check wrt STATUS_SHARING_VIOLATION.

These actions are performed as a consequence of a shell script
that launches a Makefile, which performs the rm -r -f ..., so within
that context there is definitely scope for oversight, and we
might inadvertently have a process getting in the way.

When you describe the other process blocking things, what might
that other process be doing?

I presume that the other process having the directory in question
open, or as its cwd, is sufficient.

Is there anything else I should be on the lookout for?

Earl






Re: Deletion race in NtSetFileInformation ? (Directory not empty error in rm -r -f)

2010-09-14 Thread Corinna Vinschen
On Sep 13 14:20, Earl Chew wrote:
> I have a Makefile which performs "rm -f -r" as part of a clean target.
> On Win7 with 1.7.5-1 this can fail with:
> 
> rm -f -r win32
> rm: cannot remove directory `win32': Directory not empty
> 
> 
> I tried 1.7.7-1 but the problem still seems to be there.
> 
> 
> Doing a little digging, I find that /bin/rm calls
> unlinkat("win32/dll"), which eventually calls unlink_nt().
> 
> 
> A short time later, /bin/rm calls unlinkat("win32") and
> fails at check_dir_not_empty() because it finds the following
> entries:
> 
>  413407 [main] rm 3612 check_dir_not_empty: File name: 2 0x2E 0x610E 0x10 "."
>  413493 [main] rm 3612 check_dir_not_empty: File name: 4 0x2E 0x2E 0x18   ".."
>  413574 [main] rm 3612 check_dir_not_empty: File name: 6 0x64 0x6C 0x6C   "dll"
> 
> Huh?  Wasn't this the directory that was just deleted?
> 
> Taking a look in the directory after the fact shows that the parent directory
> appears to be empty:
> 
> W:> dir win32
>  Volume in drive W is OS
>  Volume Serial Number is C0E0-BBEE
> 
>  Directory of W:\cerberus\acl\col_\ato\win32
> 
> 13/09/2010  01:57 PM    <DIR>          .
> 13/09/2010  01:57 PM    <DIR>          ..
>    0 File(s)  0 bytes
>    2 Dir(s)  392,720,297,984 bytes free
> 
> 
> Hmm ... my reading of unlink_nt() is that the directory "win32/dll"
> is deleted by setting FileDispositionInformation via NtSetFileInformation().

Yes, that's how it is done by the Win32 API as well.

> Since the file entry seems to be found during the subsequent
> check_dir_not_empty() call when trying to delete the parent
> directory, is some form of explicit synchronisation required
> when deleting the child "win32/dll" to be sure that the
> deletion is actually complete?

There shouldn't be any race.  When you set the delete disposition,
the file is actually deleted as soon as the last handle to the file
is closed.  If the file isn't opened by another process, it will
disappear right at the NtClose at the end of unlink_nt.  Please note
that the call to check_dir_not_empty already takes place *only* if
trying to open the directory failed with STATUS_SHARING_VIOLATION.
So there *was* another process blocking things.
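To make the delete-on-close behaviour described above concrete, here is a toy model in Python. It is only a sketch of the semantics as stated in this message (an entry with the delete disposition set disappears when its last handle is closed); `ToyVolume` and `toy_unlink` are made-up names and nothing here calls the real NT API or mirrors the actual unlink_nt code.

```python
class ToyVolume:
    """Toy model of delete-on-close. NOT the real NT implementation."""

    def __init__(self):
        # name -> {"handles": open handle count, "delete_pending": bool}
        self.entries = {}

    def create(self, name):
        self.entries[name] = {"handles": 0, "delete_pending": False}

    def open(self, name):
        self.entries[name]["handles"] += 1
        return name  # the name doubles as the "handle" in this model

    def set_delete_disposition(self, handle):
        self.entries[handle]["delete_pending"] = True

    def close(self, handle):
        entry = self.entries[handle]
        entry["handles"] -= 1
        # The entry only goes away once the LAST handle is closed.
        if entry["handles"] == 0 and entry["delete_pending"]:
            del self.entries[handle]

    def listdir(self):
        return sorted(self.entries)


def toy_unlink(vol, name):
    """Open, mark for deletion, close: the sequence described above."""
    handle = vol.open(name)
    vol.set_delete_disposition(handle)
    vol.close(handle)


vol = ToyVolume()

# Case 1: nobody else has "dll" open, so it vanishes at close.
vol.create("dll")
toy_unlink(vol, "dll")
print(vol.listdir())  # []

# Case 2: another process (assumed here) still holds a handle.
vol.create("dll")
other = vol.open("dll")
toy_unlink(vol, "dll")
print(vol.listdir())  # ['dll']  (still enumerated in the parent)
vol.close(other)
print(vol.listdir())  # []
```

In the second case the entry is still enumerated after the unlink returns, which matches the "dll" entry seen by check_dir_not_empty in the strace output, and it is gone once the other handle is closed, which matches the empty listing seen afterwards.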


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader  cygwin AT cygwin DOT com
Red Hat




Deletion race in NtSetFileInformation ? (Directory not empty error in rm -r -f)

2010-09-13 Thread Earl Chew
I have a Makefile which performs "rm -f -r" as part of a clean target.
On Win7 with 1.7.5-1 this can fail with:

rm -f -r win32
rm: cannot remove directory `win32': Directory not empty


I tried 1.7.7-1 but the problem still seems to be there.


Doing a little digging, I find that /bin/rm calls
unlinkat("win32/dll"), which eventually calls unlink_nt().


A short time later, /bin/rm calls unlinkat("win32") and
fails at check_dir_not_empty() because it finds the following
entries:

 413407 [main] rm 3612 check_dir_not_empty: File name: 2 0x2E 0x610E 0x10 "."
 413493 [main] rm 3612 check_dir_not_empty: File name: 4 0x2E 0x2E 0x18   ".."
 413574 [main] rm 3612 check_dir_not_empty: File name: 6 0x64 0x6C 0x6C   "dll"

Huh?  Wasn't this the directory that was just deleted?

Taking a look in the directory after the fact shows that the parent directory
appears to be empty:

W:> dir win32
 Volume in drive W is OS
 Volume Serial Number is C0E0-BBEE

 Directory of W:\cerberus\acl\col_\ato\win32

13/09/2010  01:57 PM    <DIR>          .
13/09/2010  01:57 PM    <DIR>          ..
   0 File(s)  0 bytes
   2 Dir(s)  392,720,297,984 bytes free


Hmm ... my reading of unlink_nt() is that the directory "win32/dll"
is deleted by setting FileDispositionInformation via NtSetFileInformation().

Since the file entry seems to be found during the subsequent
check_dir_not_empty() call when trying to delete the parent
directory, is some form of explicit synchronisation required
when deleting the child "win32/dll" to be sure that the
deletion is actually complete?


Earl
