Re: unicode and hashlib

Jeff H Sat, 29 Nov 2008 06:30:55 -0800

On Nov 28, 2:03 pm, Terry Reedy <[EMAIL PROTECTED]> wrote:
> Jeff H wrote:
> > hashlib.md5 does not appear to like unicode,
> >   UnicodeEncodeError: 'ascii' codec can't encode character u'\xa6' in
> > position 1650: ordinal not in range(128)
>
> It is the (default) ascii encoder that does not like non-ascii chars.
> I suspect that if you encode to bytes first with an encoder that does
> work (latin-???), md5 will be happy.
>
> Reports like this should include Python version.
>
> > After googling, I've found BDFL and others on Py3K talking about the
> > problems of hashing non-bytes (i.e. buffers)
> > http://www.mail-archive.com/[EMAIL PROTECTED]/msg09824.html
>
> > So what is the canonical way to hash unicode?
> >  * convert unicode to locale
> >  * hash in current locale
> > ???
> > but what if locale has ordinals outside of 128?
>
> > Is this just a problem for md5 hashes that I would not encounter using
> > a different method?  i.e. Should I just use the built-in hash function?
> > --
> >http://mail.python.org/mailman/listinfo/python-list
>
>


Python 2.5.2. However, this is not really a bug report, because your
analysis is correct. I am converting cp1252 strings to unicode before
I persist them in a database.  I am looking for advice/direction/wisdom
on how to sling these strings <g>
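
For reference, a minimal sketch of the encode-first approach suggested above:
decode the cp1252 bytes to unicode for storage, then encode explicitly to
bytes before hashing. The helper name md5_hex, the sample bytes, and the
choice of UTF-8 as the hashing encoding are assumptions for illustration,
not something settled in this thread.

```python
import hashlib

def md5_hex(text, encoding="utf-8"):
    """Hypothetical helper: encode unicode text explicitly, then hash the bytes."""
    # Encoding explicitly avoids the implicit ascii encode that raises
    # UnicodeEncodeError on characters such as u'\xa6'.
    return hashlib.md5(text.encode(encoding)).hexdigest()

# Sample cp1252 input (assumed data); 0xA6 is the character from the traceback.
raw = b"caf\xe9 \xa6"
text = raw.decode("cp1252")      # unicode value, as persisted in the database

digest = md5_hex(text)           # hash of the UTF-8 bytes
legacy = md5_hex(text, "cp1252") # matches a hash of the original raw bytes
```

The digest depends on the encoding chosen, so the same encoding has to be
used everywhere the hashes are compared.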

-Jeff
