[Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Steven D'Aprano
Normally I'd take a question like this to Python-List, but this question has turned out to be quite divisive, with people having strong opinions but no definitive answer. So I thought I'd ask here and hope that some of the core devs would have an idea. Why does base64 encoding in Python return

[Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Stephen J. Turnbull
Steven D'Aprano writes: > base64.b64encode takes bytes as input and returns bytes. Some people are > arguing that this is wrong behaviour, as RFC 3548 That RFC is obsolete: the replacement is RFC 4648. However, the text is essentially unchanged. > specifies that Base64 should transform byte
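For readers who want to see the behaviour under discussion, here is a minimal sketch using only the standard library. The variable names are illustrative and not taken from the thread.

```python
import base64

raw = b"\x00\xffhello"            # arbitrary binary input
encoded = base64.b64encode(raw)   # bytes in, bytes out

# In Python 3 a str argument is rejected rather than implicitly encoded.
try:
    base64.b64encode("hello")
except TypeError:
    str_rejected = True
else:
    str_rejected = False

# Decoding restores the original bytes.
decoded = base64.b64decode(encoded)
```

The TypeError for str input is what makes the "bytes to bytes" contract visible to callers.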

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Joao S. O. Bueno
On 14 June 2016 at 12:19, Steven D'Aprano wrote: > Is there > a good reason for returning bytes? What about: it returns 0-255 numeric values for each position in a stream, with no clue whatsoever to how those values map to text characters beyond the 32-128 range? Maybe base64.decode could take

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Victor Stinner
To port OpenStack to Python 3, I wrote 4 (2x2) helper functions which accept bytes *and* Unicode as input. xxx_as_bytes() functions return bytes, xxx_as_text() return Unicode: http://docs.openstack.org/developer/oslo.serialization/api.html Victor On 14 June 2016 5:21 PM, "Steven D'Aprano" wrote
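A rough sketch of what such a 2x2 set of helpers could look like follows. This is not the oslo.serialization source; the function names, the `encoding` parameter and its UTF-8 default are assumptions made for illustration.

```python
import base64

def encode_as_bytes(data, encoding="utf-8"):
    """Base64-encode bytes or str input; return bytes."""
    if isinstance(data, str):
        data = data.encode(encoding)   # assumption: text input is encoded first
    return base64.b64encode(data)

def encode_as_text(data, encoding="utf-8"):
    """Base64-encode bytes or str input; return str."""
    return encode_as_bytes(data, encoding).decode("ascii")

def decode_as_bytes(encoded):
    """Decode Base64 given as bytes or ASCII str; return bytes."""
    return base64.b64decode(encoded)

def decode_as_text(encoded, encoding="utf-8"):
    """Decode Base64 and interpret the payload as text; return str."""
    return decode_as_bytes(encoded).decode(encoding)
```

The wrappers leave the stdlib contract alone and simply make the bytes/text choice explicit at each call site.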

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Paul Moore
On 14 June 2016 at 16:19, Steven D'Aprano wrote: > Why does base64 encoding in Python return bytes? I seem to recall there was a debate about this around the time of the Python 3 move. (IIRC, it was related to the fact that there used to be a base64 "codec", that wasn't available in Python 3 beca

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Toshio Kuratomi
On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno" wrote: > > On 14 June 2016 at 12:19, Steven D'Aprano wrote: > > Is there > > a good reason for returning bytes? > > What about: it returns 0-255 numeric values for each position in a stream, with > no clue whatsoever to how those values map to text cha

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Terry Reedy
On 6/14/2016 11:19 AM, Steven D'Aprano wrote: Normally I'd take a question like this to Python-List, but this question has turned out to be quite divisive, with people having strong opinions but no definitive answer. So I thought I'd ask here and hope that some of the core devs would have an ide

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Mark Lawrence via Python-Dev
On 14/06/2016 16:51, Paul Moore wrote: On 14 June 2016 at 16:19, Steven D'Aprano wrote: Why does base64 encoding in Python return bytes? I seem to recall there was a debate about this around the time of the Python 3 move. (IIRC, it was related to the fact that there used to be a base64 "codec

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Joao S. O. Bueno
On 14 June 2016 at 13:32, Toshio Kuratomi wrote: > > On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno" wrote: >> >> On 14 June 2016 at 12:19, Steven D'Aprano wrote: >> > Is there >> > a good reason for returning bytes? >> >> What about: it returns 0-255 numeric values for each position in a >> stream

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Paul Sokolovsky
Hello, On Tue, 14 Jun 2016 16:51:44 +0100 Paul Moore wrote: > On 14 June 2016 at 16:19, Steven D'Aprano wrote: > > Why does base64 encoding in Python return bytes? > > I seem to recall there was a debate about this around the time of the > Python 3 move. (IIRC, it was related to the fact that

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Random832
On Tue, Jun 14, 2016, at 13:05, Joao S. O. Bueno wrote: > Sorry, it is 2016, and I don't think at this point anyone can consider > an ASCII string > as a representative pattern of textual data in any field of application. > Bytes are not text. Bytes with an associated, meaningful, encoding are > te

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Joao S. O. Bueno
On 14 June 2016 at 14:45, Random832 wrote: > On Tue, Jun 14, 2016, at 13:05, Joao S. O. Bueno wrote: >> Sorry, it is 2016, and I don't think at this point anyone can consider >> an ASCII string >> as a representative pattern of textual data in any field of application. >> Bytes are not text. Bytes

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Random832
On Tue, Jun 14, 2016, at 13:19, Paul Sokolovsky wrote: > Well, it's easy to remember the conclusion - it was decided to return > bytes. The reason also wouldn't be hard to imagine - regardless of the > fact that base64 uses ASCII codes for digits and letters, it's still > essentially a binary data.

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread R. David Murray
On Tue, 14 Jun 2016 14:05:19 -0300, "Joao S. O. Bueno" wrote: > On 14 June 2016 at 13:32, Toshio Kuratomi wrote: > > > > On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno" wrote: > >> > >> On 14 June 2016 at 12:19, Steven D'Aprano wrote: > >> > Is there > >> > a good reason for returning bytes? > >>

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Daniel Holth
IMO this is more a philosophical problem than a programming problem. base64 has a dual-nature. It is both text and bytes. At least it should fit in a 1-byte-per-character efficient Python 3 unicode string also.

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Paul Sokolovsky
Hello, On Tue, 14 Jun 2016 14:02:02 -0400 Random832 wrote: > On Tue, Jun 14, 2016, at 13:19, Paul Sokolovsky wrote: > > Well, it's easy to remember the conclusion - it was decided to > > return bytes. The reason also wouldn't be hard to imagine - > > regardless of the fact that base64 uses ASCII

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Terry Reedy
On 6/14/2016 12:32 PM, Toshio Kuratomi wrote: The input to encoding would have to remain bytes (that's the main purpose of base64... to turn bytes into an ascii string). The purpose is to turn arbitrary binary data (commonly images) into 'safe bytes' that will not get mangled on transmission

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Paul Sokolovsky
Hello, On Tue, 14 Jun 2016 18:13:11 +0000 Daniel Holth wrote: > IMO this is more a philosophical problem than a programming problem. > base64 has a dual-nature. It is both text and bytes. At least it > should fit in a 1-byte-per-character efficient Python 3 unicode > string also. You probably m

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Terry Reedy
On 6/14/2016 12:29 PM, Mark Lawrence via Python-Dev wrote: As I've the time to play detective I'd suggest https://mail.python.org/pipermail/python-3000/2007-July/008975.html Thank you for finding that. I reread it and still believe that bytes was the right choice. Base64 is a generic edge

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Greg Ewing
Joao S. O. Bueno wrote: The arguments about compactness and what is most likely to happen next applies (transmission through a binary network protocol), I'm not convinced that this is what is most likely to happen next *in a Python program*. How many people implement their own binary network pr

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Greg Ewing
R. David Murray wrote: The fundamental purpose of the base64 encoding is to take a series of arbitrary bytes and reversibly turn them into another series of bytes in which the eighth bit is not significant. No, it's not. If that were its only purpose, it would be called base128, and the RFC wou

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Steven D'Aprano
On Tue, Jun 14, 2016 at 05:29:12PM +0100, Mark Lawrence via Python-Dev wrote: > As I've the time to play detective I'd suggest > https://mail.python.org/pipermail/python-3000/2007-July/008975.html Thanks Mark, that's great! -- Steve

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Stephen J. Turnbull
Greg Ewing writes: > The RFC does *not* do that. It describes the output in terms of > characters, and does not specify any bit patterns for the > output. The RFC is unclear on this point, but I read it as specifying the ASCII coded character set, not the ASCII repertoire of (abstract) charact

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Random832
On Tue, Jun 14, 2016, at 22:58, Stephen J. Turnbull wrote: > The RFC is unclear on this point, but I read it as specifying the > ASCII coded character set, not the ASCII repertoire of (abstract) > characters. Therefore, it specifies an invertible mapping from a > particular set of integers to char

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Greg Ewing
Stephen J. Turnbull wrote: it does refer to *encoded* characters as the output of the encoding process: > The encoding process represents 24-bit groups of input bits > as output strings of 4 encoded characters. The "encoding" being referred to there is the encoding from input bytes
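The quoted RFC sentence about 24-bit groups can be checked directly. The sketch below is illustrative only; the helper name `encoded_length` is hypothetical.

```python
import base64

def encoded_length(n_bytes):
    """Padded Base64 output length: 4 characters per 3-byte (24-bit) group."""
    return 4 * ((n_bytes + 2) // 3)

# One full 24-bit group becomes exactly 4 output characters.
one_group = base64.b64encode(b"abc")

# Shorter or longer inputs are padded up to a multiple of 4 with '='.
lengths = {n: len(base64.b64encode(bytes(n))) for n in range(0, 8)}

# Every output byte is an ASCII code point below 128.
all_ascii = all(b < 128 for b in base64.b64encode(bytes(range(256))))
```

Whether those 4 output units are best modelled as characters or as bytes is exactly the point the thread disagrees on; the code only shows the size relationship.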

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Simon Cross
On Tue, Jun 14, 2016 at 8:42 PM, Terry Reedy wrote: > Thank you for finding that. I reread it and still believe that bytes was > the right choice. Base64 is a generic edge encoding for binary data. It > fits in with the standard paradigm as an edge encoding. I'd like to me-too Terry's sent

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Greg Ewing
Stephen J. Turnbull wrote: The RFC is unclear on this point, but I read it as specifying the ASCII coded character set, not the ASCII repertoire of (abstract) characters. Well, I think you've misread it. Or at least there is a more general reading possible that is entirely consistent with the

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Greg Ewing
Simon Cross wrote: If we only support one, I would prefer it to be bytes since (bytes -> bytes -> unicode) seems like less overhead and slightly conceptually clearer than (bytes -> unicode -> bytes), Whereas bytes -> unicode, followed if needed by unicode -> bytes, seems conceptually clearer to

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Steven D'Aprano
On Tue, Jun 14, 2016 at 09:40:51PM -0700, Guido van Rossum wrote: > I'm officially on vacation, but I was surprised that people now assume > RFCs, which specify internet protocols, would have a bearing on programming > languages. (With perhaps an exception for RFCs that specifically specify > how p

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Daniel Holth
In that case could we just add a base64_text() method somewhere? Who would like to measure whether it would be a win? On Wed, Jun 15, 2016 at 8:34 AM Steven D'Aprano wrote: > On Tue, Jun 14, 2016 at 09:40:51PM -0700, Guido van Rossum wrote: > > I'm officially on vacation, but I was surprised tha

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Paul Moore
On 15 June 2016 at 13:53, Daniel Holth wrote: > In that case could we just add a base64_text() method somewhere? Who would > like to measure whether it would be a win? "Just adding" a method in the stdlib, means we'd have to support it long term (backward compatibility). So by the time such an ex

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Isaac Morland
On Wed, 15 Jun 2016, Greg Ewing wrote: Simon Cross wrote: If we only support one, I would prefer it to be bytes since (bytes -> bytes -> unicode) seems like less overhead and slightly conceptually clearer than (bytes -> unicode -> bytes), Whereas bytes -> unicode, followed if needed by unicod

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Steven D'Aprano
On Wed, Jun 15, 2016 at 12:53:15PM +0000, Daniel Holth wrote: > In that case could we just add a base64_text() method somewhere? Who would > like to measure whether it would be a win? Just call .decode('ascii') on the output of base64.b64encode. Not every one-liner needs to be a standard function
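The one-liner suggested here, spelled out as a small sketch (variable names are illustrative):

```python
import base64

payload = b"\x00\x01\x02"

# bytes -> Base64 bytes -> str; ASCII is always sufficient for the Base64 alphabet.
text = base64.b64encode(payload).decode("ascii")

# b64decode also accepts an ASCII-only str, so the round trip needs no extra step.
restored = base64.b64decode(text)
```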

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Daniel Holth
It would be a codec. base64_text in the codecs module. Probably 1 line different than the existing codec. Very easy to use and maintain. Less surprising and less error prone for everyone who thinks base64 should convert between bytes to text. Sounds like an obvious win to me. On Wed, Jun 15, 2016
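For comparison, this is how the existing bytes-to-bytes "base64" codec behaves today. The `base64_text` codec named in the message is a proposal from the thread and, as far as this sketch assumes, does not exist in the standard library; the last line only approximates what it might return.

```python
import base64
import codecs

payload = b"hi"

# Existing codec: bytes in, bytes out, with a trailing newline.
via_codec = codecs.encode(payload, "base64")

# base64.b64encode produces the same characters without the newline.
via_module = base64.b64encode(payload)

# Approximation of a text-returning variant (hypothetical behaviour).
as_text = via_module.decode("ascii")

# The codec also decodes back to the original bytes.
round_trip = codecs.decode(via_codec, "base64")
```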

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Anders J. Munch
Paul Moore: > Finding out whether users/projects typically write such a helper > function for themselves would be a better way of getting this > information. Personally, I suspect they don't, but facts beat > speculation. Well, I did. It was necessary to get 2to3 conversion to work(*). I turned e

Re: [Python-Dev] Why does base64 return bytes?

2016-06-15 Thread Greg Ewing
Steven D'Aprano wrote: I'm satisfied that the choice made by Python is the right choice, and that it meets the spirit (if, arguably, not the letter) of the RFC. IMO it meets the letter (if you read it a certain way) but *not* the spirit. -- Greg

Re: [Python-Dev] Why does base64 return bytes?

2016-06-16 Thread R. David Murray
On Wed, 15 Jun 2016 11:51:05 +1200, Greg Ewing wrote: > R. David Murray wrote: > > The fundamental purpose of the base64 encoding is to take a series > > of arbitrary bytes and reversibly turn them into another series of > > bytes in which the eighth bit is not significant. > > No, it's not. If