On May 18, 2013, at 5:15 AM, Aymeric Augustin <aymeric.augus...@polytechnique.org> wrote:
> Apologies for answering so late. I see the change discussed here was already
> committed. The change itself is fine - essentially because it's limited to
> the bcrypt password hasher - but I'd like to bring some perspective to parts
> of this discussion.
>
> Overall, I strongly advocate consistency in the Python ecosystem, and the
> standard library sets the, err, standard. Here's how it deals with this
> situation in Python 3.
>
> >>> import hashlib
>
> 1) Hash functions must reject str objects because the encoding isn't
> guaranteed:
>
> >>> hashlib.md5('foo')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: Unicode-objects must be encoded before hashing
>
> 2) Digests must be returned as bytes (quite obviously):
>
> >>> hashlib.md5(b'foo').digest()
> b'\xac\xbd\x18\xdbL\xc2\xf8\\\xed\xefeO\xcc\xc4\xa4\xd8'
>
> 3) Hex digests must be returned as str:
>
> >>> hashlib.md5(b'foo').hexdigest()
> 'acbd18db4cc2f85cedef654fccc4a4d8'
>
> Adapting this example to Python 2 is left as an exercise :)
>
> As a consequence, I agree with Claude's recommendation to use unicode strings
> whenever possible (e.g. for hex digests). However, I believe that a simple
> hash function mustn't accept unicode strings. Wrappers - say, a
> make_password_hash function - must encode unicode strings to bytes before
> passing them to hash functions.
>
> Regarding Donald's pull request, `data = force_bytes(data)` makes sense,
> because the hasher must be fed bytes. There's already a `password =
> force_bytes(password)` just above.
>
> I'm less enthusiastic about the change adding `force_text(data)`. It actually
> works around bcrypt.hashpw returning an unexpected type in these
> circumstances. But, if that's how bcrypt.hashpw works, that's fine.

Well, the Python library returns bytes (and accepts bytes for the salt)
because bcrypt fundamentally operates on bytes, and the C library reflects
that.
The force_text would need to happen either in Django or in the Python
library and I believe it's more appropriate for it to happen in Django.

> Donald, we've discussed this before and I know you have strong feelings
> against the design of the standard library in this regard. Still, Python is
> the environment we're living in, and we shouldn't fight it.
>
> --
> Aymeric.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
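
For illustration, here is a minimal sketch of the split discussed in this
thread: a primitive that only accepts and returns bytes, plus a wrapper that
encodes on the way in and decodes on the way out. All four function names are
hypothetical. to_bytes and to_text merely stand in for Django's force_bytes
and force_text, and PBKDF2 from hashlib stands in for bcrypt so that the
example needs nothing outside the standard library. It is not Django's actual
hasher code.

```python
import binascii
import hashlib


def to_bytes(value, encoding="utf-8"):
    """Hypothetical stand-in for force_bytes: return bytes unchanged, encode str."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def to_text(value, encoding="ascii"):
    """Hypothetical stand-in for force_text: return str unchanged, decode bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def bytes_only_hash(password, salt):
    """Bytes-in, bytes-out primitive.

    Assumption: PBKDF2-HMAC-SHA256 is used here purely as a stand-in for
    bcrypt. Like the bcrypt library described above, it takes bytes and
    returns ASCII-only bytes.
    """
    if not isinstance(password, bytes) or not isinstance(salt, bytes):
        raise TypeError("password and salt must be bytes")
    digest = hashlib.pbkdf2_hmac("sha256", password, salt, 1000)
    return binascii.hexlify(digest)


def make_password_hash(password, salt):
    """Wrapper: accepts str or bytes, always returns str."""
    password = to_bytes(password)
    salt = to_bytes(salt)
    return to_text(bytes_only_hash(password, salt))


if __name__ == "__main__":
    # The wrapper gives the same result for text and for pre-encoded bytes.
    print(make_password_hash("pässword", "salt"))
    print(make_password_hash("pässword".encode("utf-8"), b"salt"))
```

The primitive stays strict about types, and only the wrapper knows about
encodings, which is the division of labour both messages above seem to agree
on.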