Re: encode and decode builtins
Garrett Berg writes:
> ...
> However, there are times that I do not care what data I am working with,
> and I find myself writing something like:
>
> if isinstance(data, bytes): data = data.decode()

Apparently, below this code, you do care that "data" contains "str" (not "bytes") -- otherwise, you could simply drop this line.

Note, in addition, that the code above might only work if "data" actually contains ASCII-only bytes (at least in Python 2.x, it would fail otherwise). To safely decode bytes into unicode, you need to know the encoding precisely -- something which is not consistent with "I do not care about what data I am working with".

--
https://mail.python.org/mailman/listinfo/python-list
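The point about needing to know the encoding can be illustrated with a short sketch. It assumes Python 3, where a bare bytes.decode() defaults to UTF-8 rather than ASCII; the sample text and variable names are made up for illustration.

```python
# Illustrative only: the sample text and variable names are assumptions.
text = "caf\u00e9"                        # "café"
latin1_bytes = text.encode("latin-1")     # b'caf\xe9'
utf8_bytes = text.encode("utf-8")         # b'caf\xc3\xa9'

# Decoding with the encoding that produced the bytes round-trips cleanly.
assert latin1_bytes.decode("latin-1") == text
assert utf8_bytes.decode("utf-8") == text

# A bare .decode() assumes UTF-8 in Python 3, so the Latin-1 bytes fail.
try:
    latin1_bytes.decode()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# The wrong encoding can also "succeed" and silently return the wrong text.
mojibake = utf8_bytes.decode("latin-1")
```

The last case is arguably the worse one: no exception is raised, and the garbled text only shows up later.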
Re: encode and decode builtins
On 11/16/14 2:39 AM, Garrett Berg wrote:
> I made the switch to python 3 about two months ago, and I have to say
> I love everything about it, /especially/ the change to using only
> bytes and str (no more unicode! or... everything is unicode!) As
> someone who works with embedded devices, it is great to know what data
> I am working with.

I am glad that you are excited about Python 3. But I'm a little surprised to hear your characterization of the changes it brought. Both Python 2 and Python 3 are the same in that they have two types for representing strings: one for byte strings, and one for Unicode strings. The difference is that Python 2 called them str and unicode, with "" being a byte string; Python 3 calls them bytes and str, with "" being a unicode string. Also, Python 2 happily converted between them implicitly, while Python 3 does not.

> However, there are times that I do not care what data I am working
> with, and I find myself writing something like:
>
> if isinstance(data, bytes): data = data.decode()

This goes against a fundamental tenet of both Python 2 and 3: you should know what data you have, and deal with it properly.

> This is tedious and breaks the pythonic method of not caring about
> what your input is. If I expect that my input can always be decoded
> into valid data, then why do I have to write this?
>
> Instead, I would like to propose to add *encode* and *decode* as
> builtins. I have written simple code to demonstrate my desire:
>
> https://gist.github.com/cloudformdesign/d8065a32cdd76d1b3230

If you find these functions useful, by all means use them in your code. BTW: looks to me like you have infinite recursion on lines 9 and 20, so that must be a simple oversight.

> There may be a few edge cases I am missing, which would all the more
> prove my point -- we need a function like this!

You are free to have a function like that. Getting them added to the standard library is extremely unlikely.

> Basically, if I expect my data to be a string I can just write:
>
> data = decode(data)
>
> Which would accomplish two goals: explicitly stating what I expect of
> my data, and doing so concisely and cleanly.

--
Ned Batchelder, http://nedbatchelder.com
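The naming and the lack of implicit conversion described in this reply can be checked with a few lines. This is a small sketch for Python 3 only; the variable names are illustrative.

```python
# Python 3: a bare string literal is text (str); a b"" literal is bytes.
text = "abc"
raw = b"abc"
assert type(text) is str
assert type(raw) is bytes

# Python 3 does not convert between the two implicitly: mixing them fails.
try:
    combined = text + raw
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Conversion has to be spelled out, and it goes through an encoding.
decoded = raw.decode("ascii")
encoded = text.encode("ascii")
```

In Python 2 the equivalent concatenation would have been converted implicitly, which is the behaviour Python 3 removed.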
Re: encode and decode builtins
Garrett Berg writes:
> I made the switch to python 3 about two months ago, and I have to say
> I love everything about it, *especially* the change to using only
> bytes and str (no more unicode! or... everything is unicode!) As
> someone who works with embedded devices, it is great to know what data
> I am working with.

Thanks! It is great to hear from people directly benefiting from this clear distinction.

> However, there are times that I do not care what data I am working
> with, and I find myself writing something like:
>
> if isinstance(data, bytes): data = data.decode()

Why are you in a position where ‘data’ is not known to be bytes? If you want ‘unicode’ objects, isn't the API guaranteeing to provide them?

> This is tedious and breaks the pythonic method of not caring about
> what your input is.

I wouldn't call that Pythonic. Rather, in the face of ambiguity (“is this text or bytes?”), Pythonic code refuses the temptation to guess: you need to clarify what you have as early as possible in the process.

> If I expect that my input can always be decoded into valid data, then
> why do I have to write this?

I don't know. Why do you have to?

--
 \     “God was invented to explain mystery. God is always invented to |
  `\    explain those things that you do not understand.” —Richard P. |
_o__)                                                    Feynman, 1988 |
Ben Finney
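One way to read "clarify what you have as early as possible" is to decode exactly once, at the point where bytes enter the program, and pass only str around after that. The sketch below assumes a device that sends ASCII lines; the function names, the sample payload, and the choice of ASCII are all hypothetical.

```python
def parse_reading(raw: bytes, encoding: str = "ascii") -> str:
    """Decode device bytes once, at the boundary (hypothetical helper)."""
    if not isinstance(raw, bytes):
        # Refuse to guess: the boundary is documented as bytes-only.
        raise TypeError("expected bytes, got " + type(raw).__name__)
    return raw.decode(encoding).strip()


def format_report(reading: str) -> str:
    """Everything past the boundary works with str only."""
    return "reading=" + reading


report = format_report(parse_reading(b"42.5\r\n"))
```

With this layout there is no later point where the code has to ask whether a value is bytes or str.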
encode and decode builtins
I made the switch to python 3 about two months ago, and I have to say I love everything about it, *especially* the change to using only bytes and str (no more unicode! or... everything is unicode!) As someone who works with embedded devices, it is great to know what data I am working with.

However, there are times that I do not care what data I am working with, and I find myself writing something like:

    if isinstance(data, bytes):
        data = data.decode()

This is tedious and breaks the pythonic method of not caring about what your input is. If I expect that my input can always be decoded into valid data, then why do I have to write this?

Instead, I would like to propose to add *encode* and *decode* as builtins. I have written simple code to demonstrate my desire:

https://gist.github.com/cloudformdesign/d8065a32cdd76d1b3230

There may be a few edge cases I am missing, which would all the more prove my point -- we need a function like this!

Basically, if I expect my data to be a string I can just write:

    data = decode(data)

Which would accomplish two goals: explicitly stating what I expect of my data, and doing so concisely and cleanly.
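For readers without access to the gist, here is a rough sketch of what helpers along these lines might look like. It is not the code from the gist; the signatures, the UTF-8 default, and the decision to reject other types are assumptions made for illustration.

```python
def decode(data, encoding="utf-8", errors="strict"):
    """Return data as str, decoding it if it is bytes-like (hypothetical)."""
    if isinstance(data, str):
        return data
    if isinstance(data, (bytes, bytearray)):
        # Calls the bytes method, not this function, so there is no recursion.
        return data.decode(encoding, errors)
    raise TypeError("cannot decode " + type(data).__name__)


def encode(data, encoding="utf-8", errors="strict"):
    """Return data as bytes, encoding it if it is str (hypothetical)."""
    if isinstance(data, (bytes, bytearray)):
        return bytes(data)
    if isinstance(data, str):
        return data.encode(encoding, errors)
    raise TypeError("cannot encode " + type(data).__name__)
```

Even in this form the caller still has to supply the right encoding whenever the data is not UTF-8, so the helper shortens the isinstance check but does not remove the need to know what the bytes contain.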