[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2021-02-25 Thread Eryk Sun


Change by Eryk Sun:


--
resolution:  -> out of date
stage: needs patch -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2016-09-12 Thread Eryk Sun

Eryk Sun added the comment:

> It might be worth testing a patch that changes expanduser to 
> decode the environment variables 

If expanduser() is passed a unicode path, it can use 
_winreg.ExpandEnvironmentStrings(u'%USERPROFILE%') instead of decoding 
os.environ['USERPROFILE']. In 2.7, os.environ is a lossy ANSI encoding of the 
native Unicode environment block.
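
A minimal sketch of that idea, not a proposed patch. The helper name 
user_profile_text and the non-Windows fallback are hypothetical and exist only 
so the example runs anywhere; the module is called winreg on Python 3 and 
_winreg on Python 2.

```python
import os
import sys


def user_profile_text():
    """Return the current user's profile directory as a text string."""
    if sys.platform == "win32":
        try:
            import winreg  # module name on Python 3
        except ImportError:
            import _winreg as winreg  # module name on Python 2
        # A unicode argument is expanded as text, so the value does not
        # pass through the ANSI copy of the environment in os.environ.
        return winreg.ExpandEnvironmentStrings(u"%USERPROFILE%")
    # Hypothetical fallback so the sketch also runs off Windows.
    return os.path.expanduser(u"~")


profile = user_profile_text()
```

The result is already text, so joining it with other unicode components 
involves no implicit decoding.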

--
nosy: +eryksun




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2016-09-12 Thread Robert Collins

Robert Collins added the comment:

Given two (or more) parameters where one is unicode and one is not, upcasting 
will occur multiple times in path.join on Windows: 
 - '\\' is str and will cast up safely in all codecs
 - the other str (or bytes) parameter will be upcast using 
sys.getdefaultencoding(), which is usually ASCII on Windows

This will then fail when the str parameter is not valid ASCII.

From this we can conclude that this is a failure to use path.join correctly: 
if all the parameters passed in were unicode, no error would occur as only 
'\\' would be getting coerced to unicode.

The interesting question is why there was a str parameter that wasn't valid 
ASCII; and that lies with path.expanduser() which is returning a str for the 
non-ascii home directory.

Changing that to return unicode rather than a str with no specified encoding 
when HOME, HOMEPATH, etc. contain non-ASCII characters is a change that would 
worry me: specifically, we'd encounter code that assumes it is always str, 
e.g. by calling path.join(expanduser('~fred'), '\xe1\xbd\x84D'), which will 
then blow up.

Worth noting too is that 

 expanduser(u'~user/\u14ffd')

will also blow up in the same way in the same situation - as it ends up 
decoding the user home path when it concatenates userhome and path[i:].

So, what to do:
 - It might be worth testing a patch that changes expanduser to decode the 
environment variables - I'm not sure whether we'd want the filesystemencoding 
or the defaultencoding for handling these environment variables. Steve Dower 
probably knows :).
 - Or we say 'sorry, too hard in 2.7' and move on: join *itself* is fine here, 
given the limits of 2.7.
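
The first option could be sketched roughly as follows. This is an illustration 
only: decode_env_value is a hypothetical helper, the filesystem encoding is 
used merely as a default guess (which codec is right is exactly the open 
question above), and the sample value assumes a home directory encoded with 
code page 1250.

```python
import sys


def decode_env_value(raw, encoding=None):
    """Return an environment value as text, decoding byte strings explicitly."""
    if isinstance(raw, bytes):
        # Explicit decode instead of the implicit ASCII upcast; the default
        # codec chosen here is an assumption, not a settled answer.
        return raw.decode(encoding or sys.getfilesystemencoding())
    return raw  # already text


# Hypothetical value: "C:\Users\Jarosław" as cp1250 bytes (0xb3 is "ł").
home = decode_env_value(b"C:\\Users\\Jaros\xb3aw", "cp1250")
```

With the value decoded up front, a later join with a unicode component has 
nothing left to coerce except ASCII-only separators.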

--
nosy: +rbcollins, steve.dower




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2015-02-15 Thread Vinay Sajip

Vinay Sajip added the comment:

> There is a bug in distlib.resources.

As far as I know, this is no longer the case - a change was made in 
distlib.resources to get around the problem:

https://bitbucket.org/vinay.sajip/distlib/src/471427909ebbba2f4fa9f4cbc34f17bd2d31b8e3/distlib/resources.py?at=default#cl-31

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20140
___



[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2015-02-15 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Yes, the implementation of os.path is all right. There is a bug in 
distlib.resources, and the os.path documentation is lacking.

--




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2015-01-13 Thread Lin Wei

Lin Wei added the comment:

The patch (http://bugs.python.org/issue9291#msg206938) for #9291 actually helps 
with this issue, at least for me.

By the way, @Serhiy do you mean that the problem is merely documentation, while 
the implementation is alright?

--
nosy: +Lin.Wei




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2014-09-27 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

This looks to me like a documentation issue. Unfortunately, it is not 
explicitly documented that os.path.join() shouldn't mix str and unicode 
components (except ASCII-only str, such as '.').

There is a relevant note in the 3.x documentation. It should be adapted to 2.7.
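
To illustrate why ASCII-only byte strings are the exception, here is a small 
sketch. The function name mixes_safely is hypothetical; it only emulates the 
implicit ASCII upcast that 2.7 performs when str meets unicode.

```python
def mixes_safely(part):
    """Return True if a byte string survives an upcast with the ASCII codec."""
    try:
        part.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


dot_ok = mixes_safely(b".")             # ASCII-only component
sep_ok = mixes_safely(b"\\")            # the path separator
name_ok = mixes_safely(b"Jaros\xb3aw")  # contains a non-ASCII byte
```

Only the last component would make a mixed join fail.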

--
assignee:  -> docs@python
components: +Documentation -Windows
keywords: +easy
nosy: +docs@python, serhiy.storchaka
stage:  -> needs patch
type: crash -> behavior




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2014-05-27 Thread honglei jiang

honglei jiang added the comment:

Python:canopy-1.3.0.1715.win-x86_64\
OS:Win8.1 64

>>> directory
'F:\\Flask\\EmberJS\\\xd6\xd0\xce\xc4\\Prj\\static'
>>> os.path.isdir(directory)
True
>>> filename
u'todomvc/architecture-examples/angularjs/index.html'
>>> os.path.join(directory, filename)
Traceback (most recent call last):
  File "c:\Users\honglei\AppData\Local\Enthought\Canopy\User\Lib\site-packages\flask\helpers.py", line 1, in <module>
    # -*- coding: utf-8 -*-
  File "C:\Users\honglei\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.3.0.1715.win-x86_64\Lib\ntpath.py", line 108, in join
    path += "\\" + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 17: ordinal not in range(128)

>>> f = os.path.join(directory.decode(sys.getfilesystemencoding()), filename)
>>> os.path.isfile(f)
True
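
The same workaround as a standalone sketch that runs on any platform. The 
codec is an assumption: the bytes 0xd6 0xd0 0xce 0xc4 are consistent with GBK, 
so "gbk" stands in here for whatever sys.getfilesystemencoding() returned on 
the reporting machine, and ntpath is imported directly to get Windows join 
semantics.

```python
import ntpath

# Byte-string directory and unicode filename from the session above.
directory = b"F:\\Flask\\EmberJS\\\xd6\xd0\xce\xc4\\Prj\\static"
filename = u"todomvc/architecture-examples/angularjs/index.html"

# Decode explicitly before joining (assumed codec: GBK).
text_directory = directory.decode("gbk")
full_path = ntpath.join(text_directory, filename)
```

Both arguments are text by the time join sees them, so no implicit ASCII 
decoding takes place.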

--
nosy: +jhonglei




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2014-01-07 Thread Jarek Śmiejczak

Jarek Śmiejczak added the comment:

@Vinay.Sajip
After making the change you suggested, I'm getting a different error:
---
C:\Users\Jarosław>pip install virtualenv
Downloading/unpacking virtualenv
  Running setup.py (path:c:\users\jarosa~1\appdata\local\temp\pip_build_Jaros│a
\virtualenv\setup.py) egg_info for package virtualenv

warning: no previously-included files matching '*' found under directory 'docs\_templates'
warning: no previously-included files matching '*' found under directory 'docs\_build'
Cleaning up...
Exception:
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\pip\basecommand.py", line 122, in main
    status = self.run(options, args)
  File "c:\python27\lib\site-packages\pip\commands\install.py", line 270, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "c:\python27\lib\site-packages\pip\req.py", line 1211, in prepare_files
    req_to_install.assert_source_matches_version()
  File "c:\python27\lib\site-packages\pip\req.py", line 451, in assert_source_matches_version
    % (display_path(self.source_dir), version, self))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb3 in position 62: ordinal not in range(128)

Traceback (most recent call last):
  File "c:\python27\Scripts\pip-script.py", line 9, in <module>
    load_entry_point('pip==1.5', 'console_scripts', 'pip')()
  File "c:\python27\lib\site-packages\pip\__init__.py", line 185, in main
    return command.main(cmd_args)
  File "c:\python27\lib\site-packages\pip\basecommand.py", line 161, in main
    text = '\n'.join(complete_log)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb3 in position 77: ordinal not in range(128)

C:\Users\Jarosław>
---

It looks like pip needs a few more changes to solve this issue.
What's strange: in Windows 8.1, the name of the home directory is the first 
name saved in your Microsoft profile (if you log in via this profile, of 
course), so it should be a pretty common issue (I think).

Thanks for your fast reaction and support.

--




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2014-01-06 Thread Jarek Śmiejczak

New submission from Jarek Śmiejczak:

Full traceback:
https://gist.github.com/jarekps/2729ee1917ea372e6642

The error starts in pip, but after investigating the traceback it looks like a 
Python issue (version 2.7.5).
Windows version: 8.1 Enterprise x64 with Polish language pack.
Feel free to ask if any additional information is necessary.

--
components: Windows
messages: 207424
nosy: Jarek.Śmiejczak
priority: normal
severity: normal
status: open
title: UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs
type: crash
versions: Python 2.7




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2014-01-06 Thread STINNER Victor

STINNER Victor added the comment:

> https://gist.github.com/jarekps/2729ee1917ea372e6642

Copy of the output:
---
C:\Users\Jarosław>pip
Traceback (most recent call last):
  File "c:\python27\Scripts\pip-script.py", line 9, in <module>
    load_entry_point('pip==1.5', 'console_scripts', 'pip')()
  File "c:\python27\lib\site-packages\distribute-0.6.49-py2.7.egg\pkg_resources.py", line 345, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "c:\python27\lib\site-packages\distribute-0.6.49-py2.7.egg\pkg_resources.py", line 2381, in load_entry_point
    return ep.load()
  File "c:\python27\lib\site-packages\distribute-0.6.49-py2.7.egg\pkg_resources.py", line 2087, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
  File "c:\python27\lib\site-packages\pip\__init__.py", line 11, in <module>
    from pip.vcs import git, mercurial, subversion, bazaar # noqa
  File "c:\python27\lib\site-packages\pip\vcs\subversion.py", line 4, in <module>
    from pip.index import Link
  File "c:\python27\lib\site-packages\pip\index.py", line 16, in <module>
    from pip.wheel import Wheel, wheel_ext, wheel_setuptools_support
  File "c:\python27\lib\site-packages\pip\wheel.py", line 23, in <module>
    from pip._vendor.distlib.scripts import ScriptMaker
  File "c:\python27\lib\site-packages\pip\_vendor\distlib\scripts.py", line 15, in <module>
    from .resources import finder
  File "c:\python27\lib\site-packages\pip\_vendor\distlib\resources.py", line 105, in <module>
    cache = Cache()
  File "c:\python27\lib\site-packages\pip\_vendor\distlib\resources.py", line 40, in __init__
    base = os.path.join(get_cache_base(), 'resource-cache')
  File "c:\python27\lib\ntpath.py", line 108, in join
    path += "\\" + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb3 in position 14: ordinal not in range(128)
---

It looks like a bug in distlib.resources, not in Python.

os.path.join() works correctly if all arguments are byte strings (str type). It 
should work if all arguments are Unicode strings containing only ASCII 
characters. (I don't know if it works if all arguments are Unicode strings.)

In your case, it looks like os.path.join() is called with a Unicode string and 
a byte string.
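
For comparison, a short sketch of the three cases under Python 3, which is an 
assumption on my part since this report concerns 2.7. It imports ntpath 
directly so it runs on any platform; the paths are illustrative, with the byte 
version assuming code page 1250.

```python
import ntpath

# All arguments are text: works, including non-ASCII characters.
text_join = ntpath.join(u"C:\\Users\\Jaros\u0142aw", u"resource-cache")

# All arguments are byte strings: works as well.
bytes_join = ntpath.join(b"C:\\Users\\Jaros\xb3aw", b"resource-cache")

# Mixed arguments: Python 3 refuses instead of decoding implicitly.
try:
    ntpath.join(b"C:\\Users\\Jaros\xb3aw", u"resource-cache")
    mixed_error = None
except TypeError as exc:
    mixed_error = exc
```

On 2.7 the mixed call is the one that ends in the UnicodeDecodeError shown in 
the traceback, because the byte string is upcast with the ASCII codec.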

--
nosy: +haypo, ncoghlan, vinay.sajip




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2014-01-06 Thread Vinay Sajip

Vinay Sajip added the comment:

It's not failing specifically because of distlib or os.path.join functionality: 
it's failing because, given a Unicode path C:\Users\Jarosław\..., Python is 
attempting to decode it using the default, ASCII codec. I'll certainly look at 
updating distlib to handle this case, but the same problem could bite the user 
in other areas.

--




[issue20140] UnicodeDecodeError in ntpath.py when home dir contains non-ascii signs

2014-01-06 Thread Vinay Sajip

Vinay Sajip added the comment:

Jarek: I can't easily test this in my environment; perhaps you can help. Could 
you change, in the file

c:\python27\lib\site-packages\pip\_vendor\distlib\resources.py,

line 40 from

base = os.path.join(get_cache_base(), 'resource-cache')
to
base = os.path.join(get_cache_base(), str('resource-cache'))

to see if that resolves the problem? Currently, 'resource-cache' is a Unicode 
string (because of from __future__ import unicode_literals in the containing 
module) and that causes Python to try and convert the get_cache_base() result 
to Unicode using ASCII, which leads to the failure.

--
