[issue26111] On Windows, os.scandir will keep a handle on the directory until the iterator is exhausted

2021-02-25 Thread Steve Dower


Steve Dower added the comment:

> FYI, in Windows 10, deleting files and directories now tries a POSIX delete

Yeah, FWIW, I haven't been able to get clear guidance on what I can/cannot 
publicly announce we've done in this space. But since you've found it I guess I 
can say sorry that I couldn't announce it more loudly! :)

A number of our other issues should be able to be closed soon once the changes 
get out in the open.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26111] On Windows, os.scandir will keep a handle on the directory until the iterator is exhausted

2021-02-24 Thread Eryk Sun


Eryk Sun added the comment:

Issue 25994 added support for the context-manager protocol and close() method 
in 3.6. So it's at least much easier to ensure that the handle gets closed. 
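A minimal sketch of that pattern, assuming Python 3.6 or later. The names 
find_entry and demo are made up for illustration and are not part of the os 
module:

```python
import os
import tempfile


def find_entry(directory, name):
    """Return the path of the entry called `name`, or None if it is absent."""
    # Leaving the with-block calls close() on the iterator, even when the
    # loop returns early, so the directory handle is not left open.
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name == name:
                return entry.path
    return None


def demo():
    """Create test/file1 and test/file2, look up file1, then remove it all."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "test")
        os.mkdir(target)
        for fname in ("file1", "file2"):
            with open(os.path.join(target, fname), "w"):
                pass
        found = find_entry(target, "file1")
        # The iterator is already closed here, so nothing holds the directory.
        for fname in ("file1", "file2"):
            os.remove(os.path.join(target, fname))
        os.rmdir(target)
        return os.path.basename(found), os.path.exists(target)
```

Calling demo() stops the scan after the first match and then removes the 
directory without having to exhaust the iterator first.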

The documentation of scandir() links to WinAPI FindFirstFile and FindNextFile, 
which at least mention the "search handle". It's not made explicit that this 
encapsulates a handle for a kernel file object, nor is there any discussion of 
which operations (e.g. move, rename, delete) are allowed directly on the 
directory while it is open. Similarly, the directory stream that's returned by 
POSIX opendir() and used by readdir() may or may not encapsulate a file 
descriptor. 

I don't think Python's documentation is the best place to discuss 
platform-specific implementation details in most cases. Exceptions should be 
made in some cases, but I don't think this is one of them because I can't even 
link to a document about the implementation details of FindNextFile. At a lower 
level I can link to documents about the NtQueryDirectoryFile[Ex] system call, 
but that's not much help in terms of officially documenting what FindNextFile 
does. Microsoft prefers to keep the Windows API details opaque, which gives 
them wiggle room.

FYI, in Windows 10, deleting files and directories now tries a POSIX delete (if 
supported by the filesystem) that immediately unlinks the name as soon as the 
handle that's used to perform the delete is closed, such as the handle that's 
opened to implement DeleteFile (os.unlink) and RemoveDirectory (os.rmdir). NTFS 
supports this feature by moving the file/directory to a reserved 
"\$Extend\$Deleted" directory:

>>> import os, win32file
>>> os.mkdir('spam')
>>> h = win32file.CreateFile('spam', 0, 0, None, 3, 0x0200_0000, None)
>>> print(win32file.GetFinalPathNameByHandle(h, 0))
\\?\C:\Temp\test\test\spam

>>> os.rmdir('spam')
>>> print(win32file.GetFinalPathNameByHandle(h, 0))
\\?\C:\$Extend\$Deleted\0010949A5E2FE5BB

Of course, none of the above is documented for RemoveDirectory().

--
resolution:  -> third party
stage:  -> resolved
status: open -> closed




[issue26111] On Windows, os.scandir will keep a handle on the directory until the iterator is exhausted

2016-01-14 Thread Eryk Sun

Eryk Sun added the comment:

> That behavior on Windows is quite counterintuitive.

It's counter-intuitive from a POSIX point of view, in which anonymous files are 
allowed. In contrast, Windows allows any existing reference to unset the delete 
disposition, so the name cannot be unlinked until all references are closed.
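As a rough illustration of the difference, the following sketch tries to 
unlink a file while a descriptor for it is still open. The function name 
unlink_while_open and its return labels are invented for this example, and the 
outcome depends on the platform, the filesystem and how the file was opened:

```python
import os
import tempfile


def unlink_while_open():
    """Try to unlink a file that is still open and report what happened."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"spam")
        try:
            os.unlink(path)
        except PermissionError:
            # Typical on Windows when the open handle does not share delete
            # access: the unlink is refused outright.
            return "refused"
        # The unlink call succeeded; check whether the data is still
        # reachable through the open descriptor and whether the name is gone.
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 4)
        name_gone = not os.path.exists(path)
        if data == b"spam" and name_gone:
            # POSIX behavior: the file lives on without a name.
            return "anonymous"
        return "pending"
    finally:
        os.close(fd)
        if os.path.exists(path):
            os.unlink(path)
```

On a POSIX system this is expected to report "anonymous"; on Windows the 
result depends on the sharing mode of the open handle and on whether the 
system applies a POSIX delete.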

--




[issue26111] On Windows, os.scandir will keep a handle on the directory until the iterator is exhausted

2016-01-14 Thread Martin Panter

Martin Panter added the comment:

Can you explain how it is different? The way I see it, both problems are about 
the scandir() iterator holding an open reference (file descriptor or handle) to 
a directory/folder, when the iterator was not exhausted, but the caller no 
longer needs it.

--




[issue26111] On Windows, os.scandir will keep a handle on the directory until the iterator is exhausted

2016-01-14 Thread Remy Roy

Remy Roy added the comment:

From my point of view, Issue 25994 is about the potential file 
descriptor/handle leaks and this issue is about being unable to perform some 
filesystem calls because of a hidden unclosed file descriptor/handle.

I am not going to protest if you want to treat them as the same issue.

--




[issue26111] On Windows, os.scandir will keep a handle on the directory until the iterator is exhausted

2016-01-14 Thread Eryk Sun

Eryk Sun added the comment:

If you own the only reference you can also delete the reference, which 
deallocates the iterator and closes the handle.
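A small sketch of the two ways to release an unexhausted iterator, assuming 
CPython 3.6 or later, where close() exists and deallocating an iterator that 
was neither closed nor exhausted emits a ResourceWarning. The helper name 
warnings_on_release is made up for this example:

```python
import gc
import os
import tempfile
import warnings


def warnings_on_release(explicit_close):
    """Count ResourceWarnings seen while releasing an unexhausted iterator."""
    with tempfile.TemporaryDirectory() as root:
        for fname in ("file1", "file2"):
            with open(os.path.join(root, fname), "w"):
                pass
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            it = os.scandir(root)
            next(it)  # consume one entry, leaving the iterator unexhausted
            if explicit_close:
                it.close()  # releases the handle deterministically
            # Dropping the only reference deallocates the iterator in
            # CPython; gc.collect() is a hint for other implementations.
            del it
            gc.collect()
        return sum(issubclass(w.category, ResourceWarning) for w in caught)
```

Both paths release the handle, but only the explicit close() does so without 
relying on reference counting and without triggering the warning.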

Can you provide concrete examples where os.remove and os.chmod fail? At least 
in Windows 7 and 10 the directory handle is opened with the normal read and 
write sharing, but also with delete sharing. This sharing mode is fairly close 
to POSIX behavior (an important distinction is noted below). I get the 
following results in Windows 10:

>>> import os, stat
>>> os.mkdir('test')
>>> f = open('test/file1', 'w'); f.close()
>>> f = open('test/file2', 'w'); f.close()
>>> it = os.scandir('test')
>>> next(it)
<DirEntry 'file1'>

rename, chmod, and rmdir operations succeed:

>>> os.rename('test', 'spam')
>>> os.chmod('spam', stat.S_IREAD)
>>> os.chmod('spam', stat.S_IWRITE)
>>> os.remove('spam/file1')
>>> os.remove('spam/file2')
>>> os.rmdir('spam')

Apparently cached entries can be an issue, but this caching is up to WinAPI 
FindNextFile and the system call NtQueryDirectoryFile:

>>> next(it)
<DirEntry 'file2'>

An important distinction is that a deleted file in Windows doesn't actually get 
unlinked until all handles and kernel pointer references are closed. Also, once 
the delete disposition is set, no *new* handles can be created for the existing 
file or directory (all access is denied), and a new file or directory with same 
name cannot be created.

>>> os.listdir('spam')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [WinError 5] Access is denied: 'spam'

>>> f = open('spam', 'w')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [Errno 13] Permission denied: 'spam'

If we had another handle we could use that to rename "spam" to get it out of 
the way, at least. Without that, AFAIK, all we can do is deallocate the 
iterator or wait for it to be exhausted, which closes the handle and thus 
allows Windows to finally unlink "spam":

>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Creating a new file named "spam" is allowed now:

>>> f = open('spam', 'w')
>>> f.close()

--
nosy: +eryksun




[issue26111] On Windows, os.scandir will keep a handle on the directory until the iterator is exhausted

2016-01-14 Thread Ben Hoyt

Changes by Ben Hoyt:


--
nosy: +benhoyt




[issue26111] On Windows, os.scandir will keep a handle on the directory until the iterator is exhausted

2016-01-14 Thread Remy Roy

New submission from Remy Roy:

On Windows, os.scandir will keep a handle on the directory being scanned until 
the iterator is exhausted. This behavior can cause various problems if you try 
to use some filesystem calls like os.chmod or os.remove on the directory while 
the handle is still being kept.

There are some use cases where the iterator is not going to be exhausted like 
looking for a specific entry in a directory and breaking from the loop 
prematurely.

This behavior should at least be documented.  Alternatively, it might be 
interesting to provide a way to prematurely end the scan and close the handle 
without having to exhaust the iterator.

As a workaround, you can force the exhaustion after you are done with the 
iterator with something like:

for entry in iterator:
    pass
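Wrapped into a helper, the workaround could look roughly like the sketch 
below, which also works on Python 3.5 because it does not rely on close(). The 
name first_match and the predicate argument are hypothetical:

```python
import os
import tempfile


def first_match(directory, predicate):
    """Return the name of the first entry accepted by `predicate`, or None."""
    iterator = os.scandir(directory)
    try:
        for entry in iterator:
            if predicate(entry):
                return entry.name
        return None
    finally:
        # Drain whatever is left so the iterator is exhausted and the
        # directory handle is released, even after an early return.
        for _ in iterator:
            pass


# Usage: look for one specific entry and stop as soon as it is found.
with tempfile.TemporaryDirectory() as scratch:
    for fname in ("file1", "file2"):
        with open(os.path.join(scratch, fname), "w"):
            pass
    match = first_match(scratch, lambda e: e.name == "file2")
    missing = first_match(scratch, lambda e: e.name == "eggs")
```

Draining costs a full pass over the remaining entries, which is the price of 
not having a way to end the scan early.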

This is going to affect os.walk as well since it uses os.scandir .

The original github issue can be found on 
https://github.com/benhoyt/scandir/issues/58 .

--
components: Windows
messages: 258212
nosy: paul.moore, remyroy, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: On Windows, os.scandir will keep a handle on the directory until the 
iterator is exhausted
type: behavior
versions: Python 3.5
