Why is it behaving differently on the command line? What should I do to fix this?

I experimented with this a bit more and found some more confusing things. Can somebody please enlighten me?

Here is a test function:


    import hashlib
    import random

    def password_hash(self,password):
        public = bytearray([random.randint(0,255) for _ in range(5)])
        private = bytearray([random.randint(0,255)])
        pwd = bytearray(password.encode())
        digest = hashlib.sha1(public+pwd+private).digest()
        print("digest",digest,type(digest))
        print("de",digest.encode())
        # and some more stuff here...

This function was called inside a script, and gave me this:

('digest', '\xa0\x98\x8b\xff\x04\xf9V;\xbd\x1eIHzh\x10-\xc5!\x14\x1b', <type 'str'>)
Traceback (most recent call last):
  File "/home/gandalf/Python/Lib/shopzeus/scripts/yaaf_pwmgr.py", line 478, in <module>
    pwmgr.run(parser,args)
  File "/home/gandalf/Python/Lib/shopzeus/scripts/yaaf_pwmgr.py", line 241, in run
    self.authdb.user_create(name,password,propvalues)
  File "/home/gandalf/Python/Lib/shopzeus/yaaf/db/authdb.py", line 205, in user_create
    "password":(password and Binary(self.password_hash(password))) or None,
  File "/home/gandalf/Python/Lib/shopzeus/yaaf/db/authdb.py", line 134, in password_hash
    print("de",digest.encode())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 0: ordinal not in range(128)
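
The `<type 'str'>` in the output above is how Python 2 reports types (Python 3 prints `<class '...'>`), so it may be worth confirming which interpreter actually runs the script. A minimal sketch of such a check, assuming it is placed near the top of the script; `describe_interpreter` is a hypothetical helper name, not part of the original code:

```python
import sys


def describe_interpreter():
    """Return a one-line description of the interpreter running this code."""
    # sys.executable is the path of the interpreter binary;
    # sys.version_info[:3] is (major, minor, micro).
    return "{0} (Python {1}.{2}.{3})".format(
        sys.executable, *sys.version_info[:3])


# Works unchanged on Python 2 and Python 3.
print(describe_interpreter())
```

If the line printed by the script names a different major version than the `python3` shell does, the two runs are not comparable.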

Then I tried the very same thing from the interactive shell:

gandalf@gandalf-HP-G62-Notebook-PC:~/Python/Projects/appserver$ python3
Python 3.3.1 (default, Sep 25 2013, 19:29:01)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> digest = '\xa0\x98\x8b\xff\x04\xf9V;\xbd\x1eIHzh\x10-\xc5!\x14\x1b'
>>> digest.encode()
b'\xc2\xa0\xc2\x98\xc2\x8b\xc3\xbf\x04\xc3\xb9V;\xc2\xbd\x1eIHzh\x10-\xc3\x85!\x14\x1b'
>>>


WHAT??? It seems that the default value of the encoding parameter of the str.encode method is different when I run it interactively. But this contradicts its documentation:

>>> print(digest.encode.__doc__)
S.encode(encoding='utf-8', errors='strict') -> bytes

Encode S using the codec registered for encoding. Default encoding
is 'utf-8'. errors may be given to set a different error
handling scheme. Default is 'strict' meaning that encoding errors raise
a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
'xmlcharrefreplace' as well as any other name registered with
codecs.register_error that can handle UnicodeEncodeErrors.


So is the default utf-8 or not? Should the documentation be updated? Or do we have a bug in the interactive shell?
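
For comparison, here is a minimal sketch of what Python 3 itself does with these values. It assumes Python 3 only, and the input bytes and the two-character `text` are made-up sample data, not the values from the script. It shows that a `hashlib` digest is already `bytes` there (so it has no `.encode()` at all), while the string typed into the shell is a `str` of code points that gets encoded as UTF-8:

```python
import hashlib

# On Python 3, digest() returns bytes, not str.
digest = hashlib.sha1(b"example input").digest()
print(type(digest))
print(hasattr(digest, "encode"))  # bytes objects have no .encode() on Python 3

# A str literal with \xNN escapes holds code points U+00A0 and U+0098,
# not raw bytes; encode() with no arguments uses UTF-8.
text = '\xa0\x98'
encoded = text.encode()
print(encoded)  # b'\xc2\xa0\xc2\x98'
```

So the shell session above encodes a different kind of object than the one the script prints as `<type 'str'>`.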


