Steve Dower <steve.do...@python.org> added the comment:

Okay, so it sounds like there's a class of files where we can't rely on the 
FindFileData having the right values. But we get enough information to be able 
to just suppress the caching behaviour for those, right?

Basically, my criteria for fixing this in the runtime are that we should not 
add any new system calls during iteration, and cannot switch to always 
bypassing the cache for DirEntry.stat().

What this probably means is if we can detect a link from the FFD struct (which 
I think we can?) then we can cache the attributes we trust and send .stat() 
through the real call.

What it also means is that the "file still in use by another app" scenario will 
probably have to manually use os.stat(). We can't detect it, and it's the same 
race condition as calling os.stat() shortly before the update flushes anyway.

I won't accept having to make a second set of system calls on every file just 
in case one of them is being modified by another application. That's not the 
normal case, and the point of scandir is to improve performance in the normal 
enumeration cases.

Updating the documentation to mention/emphasise that some DirEntry.stat() 
fields may not update immediately, and so using os.stat() for current data is 
required, may be helpful. Though I think that's already implied by the line 
that says "Call os.stat() to fetch up-to-date information."
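
For anyone who needs current data today, a minimal sketch of the manual 
workaround might look like the following. The helper name 
scan_with_fresh_stat is hypothetical and only for illustration; it uses 
nothing beyond os.scandir(), DirEntry.stat() and os.stat(), and it pays for 
one extra system call per file, which is exactly the cost scandir avoids in 
the normal case.

```python
import os


def scan_with_fresh_stat(path):
    """Yield (name, cached, fresh) for each regular file in ``path``.

    Hypothetical helper, not part of the os module.
    """
    with os.scandir(path) as it:
        for entry in it:
            if not entry.is_file():
                continue
            # May be filled in from the directory listing (e.g. on Windows),
            # so some fields can lag behind the file's real state.
            cached = entry.stat()
            # Always a real system call, so it reflects the current state.
            fresh = os.stat(entry.path)
            yield entry.name, cached, fresh
```

Comparing cached.st_size with fresh.st_size (or the st_mtime values) for an 
entry is one way to see whether the directory listing was stale for that 
file at the time of the scan.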

So if someone wants to improve the docs, or has a way to recognise links (with 
unreliable data in the directory listing) and not pre-fill the stat object, 
feel free to submit a PR. Otherwise, unfortunately, we're pretty much bound by 
Windows's own optimisations here.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41106>
_______________________________________