[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2017-06-28 Thread STINNER Victor

Changes by STINNER Victor :


--
pull_requests: +2536

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-29 Thread STINNER Victor

STINNER Victor added the comment:

The issue looks to be fixed in Python 3.5 (future version 3.5.2) and 3.6, I 
close the issue.

I repeat myself: On Python 3, don't store filenames as bytes, it's a bad 
practice. Use unicode, it works on all platforms and gives access to the full 
Unicode Character Set!

--
resolution:  -> fixed
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-29 Thread STINNER Victor

STINNER Victor added the comment:

Since all test_os now pass again on 3.x buildbots (after a few months of red 
buildbots), I backported the _DummyDirEntry fixes to 3.5.

David:
> The 3.5 branch failure about the mis-matched tuple values from the walk is a 
> separate issue, (...)

I'm sorry, I don't understand you. The test_os failure was the same on Python 
3.5 and 3.6:


- ('@test_4312_tmp\\TEST1\\SUB2', ['broken_link', 'link'], ['tmp3'])
? --
+ ('@test_4312_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 'tmp3'])
? ++

This error should now be fixed on Python 3.5 and 3.6.

Is there a remaining open issue related to os.scandir() (bytes or Unicode)?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-29 Thread STINNER Victor

STINNER Victor added the comment:

> BTW, with respect to changeset 2b25fa7e3b7a, I'm not sure it has quite the 
> right semantics for is_dir, at least not if it's supposed to parallel 
> os.path.isdir().  I believe that should return True for a symbolic link to a 
> directory, which it doesn't look like this change would, if is_symlink 
> happened to have been called first.  It's possible the semantics of how 
> _DummyDirEntry is used precludes that scenario, but it seems a little 
> fragile.  I'd probably just use lstat() for is_symlink, but otherwise not 
> touch is_dir.

I think that you misunderstood the code. The "use cache lstat" path is only 
taken if the file is *not* symbolic link.

I tested manually _DummyDirEntry on a symbolic link to a directory: _lstat is 
filled by the constructor, is_dir() *and* is_symlink() returns True. Both are 
not exclusive, is_dir() works as follow_symlinks=True, whereas is_symlink() 
works as follow_symlinks=False.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-29 Thread Roundup Robot

Roundup Robot added the comment:

New changeset c38ac7ab8d9a by Victor Stinner in branch '3.5':
Issue #25911: Backport os._DummyDirEntry fixes
https://hg.python.org/cpython/rev/c38ac7ab8d9a

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-29 Thread David Bolen

David Bolen added the comment:

I'm including some comments here from an email thread I had with Victor about 
this issue on the Win8 buildbot, which led to his recent changeset 2b25fa7e3b7a.

The Win8 3.x failure (the 5 != 4 length error) was due to the revision of 
os._DummyDirEntry (introduced I believe in eb91d0387d59) using stat() rather 
than lstat() to check for a symbolic link, which couldn't work, which in turn 
failed to exclude the symbolic directory link from the walk in the test.

The 3.5 branch failure about the mis-matched tuple values from the walk is a 
separate issue, where the broken directory link is still counted as a directory 
(rather than a file).  It seems due to os.path.isdir() - in the older 
os._DummyDirEntry implementation - returning True for the broken link in the 
test, rather than False as on Unix.  That's because on Unix the underlying 
os.stat call fails, but on Windows isdir is replaced with nt._isdir which 
avoids os.stat but is just a shim for.  The success on Windows sort of makes 
sense since the target_is_directory parameter to os.symlink() was used to 
create the broken link in the test, so Windows knows it's a directory, even if 
the link target is missing.

If the bytes walk implementation needs to treat broken links like files on 
Windows (matching the non-bytes version), then it can't depend on isdir() 
failing.  The non-bytes version (using scandir()) works even on Windows since 
internally scandir appears to build up mode bits out of os.stat/os.lstat calls 
and never uses isdir.

Back-porting the 3.x implementation of DummyDirEntry would be one quick way to 
fix the 3.5 issue as it also avoids isdir().

BTW, with respect to changeset 2b25fa7e3b7a, I'm not sure it has quite the 
right semantics for is_dir, at least not if it's supposed to parallel 
os.path.isdir().  I believe that should return True for a symbolic link to a 
directory, which it doesn't look like this change would, if is_symlink happened 
to have been called first.  It's possible the semantics of how _DummyDirEntry 
is used precludes that scenario, but it seems a little fragile.  I'd probably 
just use lstat() for is_symlink, but otherwise not touch is_dir.

Finally, as to why this only showed up on the Win8 buildbot - it turns out 
that's the only machine where the tests could create a symbolic link and thus 
encounter the scenario - they were prevented on Win7 and Win10 under the normal 
buildbot user due to lack of privileges (which was a surprise to me).  That's 
also what was happening on Victor's local Win8 VM.  So the failing conditions 
were just being skipped.  The Win8 buildbot was itself sort of pure luck, as 
early on I must have set it up to run under an elevated (administrative) 
command prompt, probably due to trying to handle some other issue.  I've now 
done the same thing for Win10 so it began seeing the same failures in the last 
few tests.

--
nosy: +db3l

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-29 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 2b25fa7e3b7a by Victor Stinner in branch 'default':
Fix os._DummyDirEntry.is_symlink()
https://hg.python.org/cpython/rev/2b25fa7e3b7a

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Some tests were failed on Windows 8 even in 3.4 (issue23808).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-25 Thread STINNER Victor

STINNER Victor added the comment:

Update: test_os pass on all Windows buildbots except one!

I don't understand why the test fails on this buildbot.

http://buildbot.python.org/all/builders/AMD64%20Windows8%203.x/builds/1906/steps/test/logs/stdio

==
FAIL: test_walk_bottom_up (test.test_os.BytesWalkTests)
--
Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\test_os.py", line 895, 
in test_walk_bottom_up
self.assertEqual(len(all), 4)
AssertionError: 5 != 4

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-25 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 82da02b5bf22 by Victor Stinner in branch 'default':
Issue #25911: more info on test_os failure
https://hg.python.org/cpython/rev/82da02b5bf22

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-24 Thread STINNER Victor

STINNER Victor added the comment:

I faile to reproduce the bug by running test_os alone on Windows 8.

I wrote a script to try to isolate a sequential list of tests which reproduce 
the bug on Windows, but I failed to find such list after 2 hours:
https://bitbucket.org/haypo/misc/src/3e2e6138798c6cec061544ff116e50a8d1d6d2b8/python/isolate.py?fileviewer=file-view-default

So I'm unable to reproduce the bug, I tried a change to enhance 
os._DummyDirEntry, it should now mimick better the C implementation of 
os.DirEntry. Let's see if the test still fails randomly...

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-24 Thread Roundup Robot

Roundup Robot added the comment:

New changeset eb91d0387d59 by Victor Stinner in branch 'default':
Enhance os._DummyDirEntry
https://hg.python.org/cpython/rev/eb91d0387d59

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-24 Thread Roundup Robot

Roundup Robot added the comment:

New changeset d54ee39b061f by Victor Stinner in branch 'default':
Fix DeprecationWarning on Windows
https://hg.python.org/cpython/rev/d54ee39b061f

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-22 Thread STINNER Victor

STINNER Victor added the comment:

Windows buildbots still fail (sometimes?) because of this issue :-/

http://buildbot.python.org/all/builders/AMD64%20Windows8%203.x/builds/1874/steps/test/logs/stdio

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Deprecation warnings are suppressed, but my attempt to fix test failures failed.

http://buildbot.python.org/all/builders/AMD64%20Windows8%203.x/builds/1832/steps/test/logs/stdio
==
FAIL: test_walk_bottom_up (test.test_os.BytesWalkTests)
--
Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\test_os.py", line 894, 
in test_walk_bottom_up
self.sub2_tree)
AssertionError: Tuples differ: ('@test_4060_tmp\\TEST1\\SUB2', ['broken_link', 
'link'], ['tmp3']) != ('@test_4060_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 
'tmp3'])

First differing element 1:
['broken_link', 'link']
['link']

- ('@test_4060_tmp\\TEST1\\SUB2', ['broken_link', 'link'], ['tmp3'])
? --

+ ('@test_4060_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 'tmp3'])
? ++


==
FAIL: test_walk_prune (test.test_os.BytesWalkTests)
--
Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\test_os.py", line 874, 
in test_walk_prune
self.assertEqual(all[1], self.sub2_tree)
AssertionError: Tuples differ: ('@test_4060_tmp\\TEST1\\SUB2', ['broken_link', 
'link'], ['tmp3']) != ('@test_4060_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 
'tmp3'])

First differing element 1:
['broken_link', 'link']
['link']

- ('@test_4060_tmp\\TEST1\\SUB2', ['broken_link', 'link'], ['tmp3'])
? --

+ ('@test_4060_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 'tmp3'])
? ++


--

In some cases a broken link is considered as a link to regular file, but in 
other cases  is considered as a link to directory.

It is hard to fix this Windows issue without Windows. Needed a help of Windows 
experts.

--
assignee: serhiy.storchaka -> 

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-14 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
stage: patch review -> needs patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-08 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 757159fb4847 by Serhiy Storchaka in branch '3.5':
Issue #25911: Tring to silence deprecation warnings in bytes path walk tests.
https://hg.python.org/cpython/rev/757159fb4847

New changeset ecf4e51c222f by Serhiy Storchaka in branch 'default':
Issue #25911: Tring to silence deprecation warnings in bytes path walk tests.
https://hg.python.org/cpython/rev/ecf4e51c222f

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-03-08 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 19a3e0e664af by Serhiy Storchaka in branch '3.4':
Issues #23808, #25911: Trying to fix walk tests on Windows.
https://hg.python.org/cpython/rev/19a3e0e664af

New changeset f9e22717722d by Serhiy Storchaka in branch '3.5':
Issues #23808, #25911: Trying to fix walk tests on Windows.
https://hg.python.org/cpython/rev/f9e22717722d

New changeset da020e408c7f by Serhiy Storchaka in branch 'default':
Issues #23808, #25911: Trying to fix walk tests on Windows.
https://hg.python.org/cpython/rev/da020e408c7f

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-02-24 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 9bffe39e8273 by Serhiy Storchaka in branch '3.5':
Fixed a bug in os.walk() with bytes path on Windows caused by merging fixes
https://hg.python.org/cpython/rev/9bffe39e8273

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-02-11 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

New tests emit deprecation warnings on Windows and failed.

http://buildbot.python.org/all/builders/AMD64%20Windows8%203.x/builds/1753/steps/test/logs/stdio
test_rmdir_on_directory_link_to_missing_target (test.test_os.Win32SymlinkTests) 
... D:\buildarea\3.x.bolen-windows8\build\lib\os.py:447: DeprecationWarning: 
The Windows bytes API has been deprecated, use Unicode filenames instead
  for name in listdir(dir):
D:\buildarea\3.x.bolen-windows8\build\lib\os.py:441: DeprecationWarning: The 
Windows bytes API has been deprecated, use Unicode filenames instead
  return path.isdir(self.path)
D:\buildarea\3.x.bolen-windows8\build\lib\ntpath.py:249: DeprecationWarning: 
The Windows bytes API has been deprecated, use Unicode filenames instead
  st = os.lstat(path)
D:\buildarea\3.x.bolen-windows8\build\lib\test\test_os.py:2688: 
DeprecationWarning: The Windows bytes API has been deprecated, use Unicode 
filenames instead
  func(name, *func_args)
D:\buildarea\3.x.bolen-windows8\build\lib\unittest\case.py:176: 
DeprecationWarning: The Windows bytes API has been deprecated, use Unicode 
filenames instead
  callable_obj(*args, **kwargs)
D:\buildarea\3.x.bolen-windows8\build\lib\test\test_os.py:1881: 
DeprecationWarning: The Windows bytes API has been deprecated, use Unicode 
filenames instead
  sorted(os.listdir(path)),
D:\buildarea\3.x.bolen-windows8\build\lib\test\test_os.py:1867: 
DeprecationWarning: The Windows bytes API has been deprecated, use Unicode 
filenames instead
  sorted(os.listdir(os.fsencode(support.TESTFN))),
test test_os failed
skipped 'currently fails; consider for improvement'

==
FAIL: test_walk_bottom_up (test.test_os.BytesWalkTests)
--
Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\test_os.py", line 894, 
in test_walk_bottom_up
self.sub2_tree)
AssertionError: Tuples differ: ('@test_4312_tmp\\TEST1\\SUB2', ['broken_link', 
'link'], ['tmp3']) != ('@test_4312_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 
'tmp3'])

First differing element 1:
['broken_link', 'link']
['link']

- ('@test_4312_tmp\\TEST1\\SUB2', ['broken_link', 'link'], ['tmp3'])
? --

+ ('@test_4312_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 'tmp3'])
? ++


==
FAIL: test_walk_prune (test.test_os.BytesWalkTests)
--
Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\test_os.py", line 874, 
in test_walk_prune
self.assertEqual(all[1], self.sub2_tree)
AssertionError: Tuples differ: ('@test_4312_tmp\\TEST1\\SUB2', ['broken_link', 
'link'], ['tmp3']) != ('@test_4312_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 
'tmp3'])

First differing element 1:
['broken_link', 'link']
['link']

- ('@test_4312_tmp\\TEST1\\SUB2', ['broken_link', 'link'], ['tmp3'])
? --

+ ('@test_4312_tmp\\TEST1\\SUB2', ['link'], ['broken_link', 'tmp3'])
? ++

Proposed patch should silence warnings. Could anyone please test it on Windows?

--
status: closed -> open
Added file: http://bugs.python.org/file41897/test_walk_bytes_deprecate.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-02-11 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
resolution: fixed -> 
stage: resolved -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-02-08 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-02-08 Thread STINNER Victor

STINNER Victor added the comment:

Bastien:
> In py3.4 and below we used to be able to use bytes filenames with os.walk(), 
> it’s now impossible under windows due to the limitation of os.scandir().
>
> This issue was reported to Blender tracker 
> (https://developer.blender.org/T47018).

Again, why do you use bytes? Blender only supports Python 3. You must use 
Unicode on all platforms. On Windows, it gives you support of the full Unicode 
character set for free.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-02-08 Thread STINNER Victor

STINNER Victor added the comment:

> New changeset 5310f94772f4 by Serhiy Storchaka in branch '3.5':
> Issue #25911: Restored support of bytes paths in os.walk() on Windows.
> https://hg.python.org/cpython/rev/5310f94772f4
> 
> New changeset b060af2a58b6 by Serhiy Storchaka in branch 'default':
> Issue #25911: Restored support of bytes paths in os.walk() on Windows.
> https://hg.python.org/cpython/rev/b060af2a58b6

While this change is ok for Python 3.5, I would prefer to *drop* support for 
bytes filenames on Windows. I started a thread on python-dev for that:
https://mail.python.org/pipermail/python-dev/2016-February/143150.html

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-02-08 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 5310f94772f4 by Serhiy Storchaka in branch '3.5':
Issue #25911: Restored support of bytes paths in os.walk() on Windows.
https://hg.python.org/cpython/rev/5310f94772f4

New changeset b060af2a58b6 by Serhiy Storchaka in branch 'default':
Issue #25911: Restored support of bytes paths in os.walk() on Windows.
https://hg.python.org/cpython/rev/b060af2a58b6

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-02-08 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
assignee:  -> serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-01-16 Thread Campbell Barton

Campbell Barton added the comment:

A related question, (realize this isn't a support forum), but what is the 
equivalent API calls in Python3.5 so we can update scripts.

Should we be doing: os.walk(os.fsdecode(path)) ?

>From Python's documentation, os.fsdecode is using 'strict' handler which may 
>raise exceptions.
I'm mainly concerned that changes here will introduce bugs where bytes 
previously resolved the paths (albeit working in an imperfect way).

If this is not the best place to ask about this, then can post elsewhere.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-01-13 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

> Did we want a deprecation warning too?

I didn't tested on Windows. If bytes paths considered deprecated on Windows, 
os.listdir() should emit a warning. Otherwise I don't think we should special 
case os.walk().

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2016-01-13 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Fixed a typo found by Eryk Sun.

--
Added file: http://bugs.python.org/file41598/walk_bytes_2.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-23 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

There is open issue16700 for the documentation.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-23 Thread Eryk Sun

Eryk Sun added the comment:

Considering there's no plan to implement bytes paths for scandir on Windows, 
then the following line in the os docs needs to be modified: "All functions 
accepting path or file names accept both bytes and string objects, and result 
in an object of the same type, if a path or file name is returned." It should 
be changed to acknowledge that some functions on Windows do not support bytes 
paths, and that existing support is deprecated and may be removed in a future 
release.

Also, the docs for listdir should warn that using bytes returns invalid results 
on Windows when filenames contain characters that aren't mapped in the system 
locale codepage.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-23 Thread Steve Dower

Steve Dower added the comment:

Did we want a deprecation warning too? I don't see it in the patch.

Otherwise, LGTM.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-23 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch that restores the ability of os.walk() to work with bytes 
filenames on Windows.

--
keywords: +patch
stage: test needed -> patch review
Added file: http://bugs.python.org/file41397/walks_bytes.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-21 Thread Ben Hoyt

Ben Hoyt added the comment:

Just to clarify what Victor and Zachary mentioned: bytes filenames have been 
deprecated only *on Windows* since Python 3.3; bytes filenames are still fine 
on POSIX.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-21 Thread STINNER Victor

STINNER Victor added the comment:

Python 3.5.0 and 3.5.1 are out. I don't think that it's worth to add back
bytes support is os.walk() in 3.5.2. I would prefer to do the opposite and
really drop bytes support for all filenames in the whole stdlib in 3.6.

It's really a bad idea to use bytes on Windows. With Python 3, it becomes
possible and simple to handle filenames as Unicode on all platforms and in
all cases (see PEP 393 for undecodable filenames).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-21 Thread Zachary Ware

Zachary Ware added the comment:

I like Serhiy's suggestion to use the old walk implementation on Windows if the 
argument is bytes, with a DeprecationWarning (like all other bytes operations 
in os on Windows).  There was no warning that bytes paths would stop working 
with os.walk, so we should fix it and I'm bumping up the priority.

bytes filenames have been deprecated since Python 3.3 (#13374); I don't think 
scandir (new in 3.5) should support bytes.

--
priority: normal -> high

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-21 Thread Eryk Sun

Eryk Sun added the comment:

The ANSI API is problematic because it returns a best-fit encoding to the 
system codepage. For example:

>>> os.listdir('.')
['ƠƨưƸǀLjǐǘǠǨǰǸ']

>>> os.listdir(b'.')
[b'O?u?|?iu?Kj?']

To somewhat work around this problem, listdir and scandir could return the 
cAlternateFilename of the WIN32_FIND_DATA struct if present. This is the 
classic 8.3 short name that Microsoft file systems create for MS-DOS 
compatibility. For NTFS it can be disabled in the registry, or per volume, but 
I assume whoever does that knows what to expect. 

Also, since Python wouldn't need the short name for a wide-character path, 
there's no point in asking for it. (For NTFS it's a separate entry in the MFT. 
If it exists, which is known ahead of time, finding the entry requires a second 
lookup.) In this case it's better to call FindFirstFileExW and request only 
FindExInfoBasic. Generally the difference is inconsequential, but in a 
contrived example with 1 similarly-named files from "ĀāĂă" to 
"ĀāĂă" and short names from "~1" to "~1", skipping the short name 
lookup shaved about 10% off the total time. For this test, I replaced the 
FindFirstFileW call in posix_scandir with the following call:

iterator->handle = FindFirstFileExW(path_strW,
FindExInfoBasic,
&iterator->file_data,
FindExSearchNameMatch,
NULL, 0);

--
nosy: +eryksun

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-20 Thread Campbell Barton

Campbell Barton added the comment:

@haypo, I checked available info online and couldn't find any reference to byte 
file-paths were deprecated since Python3.2.

On the contrary, the 3.2 release notes [0] state:

 "countless fixes regarding bytes/string issues; among them full support for a 
bytes environment (filenames, environment variables)"


Also docs for open 3.2 [1] and 3.5 [2] say that byte filenames are supported 
with no mention of deprecation.

Since this is already working properly in 3.5 for other systems besides 
ms-windows, and worked in 3.4x.
Dropping support on a single platform seems a rather problematic regression.



[0]: https://www.python.org/download/releases/3.2/
[1]: https://docs.python.org/3.2/library/functions.html#open
[2]: https://docs.python.org/3.5/library/functions.html#open

--
nosy: +ideasman42

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-20 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

For softer transition we can return old os.walk() implementation for bytes (may 
be with a deprecation warning). Using os.scandir() is just an optimization, and 
this shouldn't break existing program.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-20 Thread STINNER Victor

STINNER Victor added the comment:

Regressions are not cool, but it was a deliberate choice to not support
bytes filenames in os.scandir(). If you use bytes, you can get filenames
which are invalid: you will be unable to use the filename with open() to
read its content for example.

Bytes filename are deprecated since python 3.2 if i recall correctly. It's
time to use the right time, it's also simpler to use on Python 3.

I suggest to document the regression rather than adding bytes support to
os.scandir or don't use scandir in os.walk().

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-19 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
nosy: +benhoyt, haypo

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-19 Thread SilentGhost

SilentGhost added the comment:

Perhaps, I'm looking in the wrong place, but there doesn't seem be any test for 
bytes filenames.

--
keywords: +3.5regression
nosy: +SilentGhost
stage:  -> test needed
type:  -> behavior
versions: +Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25911] Regression: os.walk now using os.scandir() breaks bytes filenames on windows

2015-12-19 Thread Bastien Montagne

New submission from Bastien Montagne:

In py3.4 and below we used to be able to use bytes filenames with os.walk(), 
it’s now impossible under windows due to the limitation of os.scandir().

This issue was reported to Blender tracker 
(https://developer.blender.org/T47018).

I assume this should be considered as a regression? At least, this should be 
mentioned in the documentation…

Thanks,
Bastien

--
components: Library (Lib), Windows
messages: 256731
nosy: mont29, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: Regression: os.walk now using os.scandir() breaks bytes filenames on 
windows
versions: Python 3.5

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com