[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-25 Thread R. David Murray

Changes by R. David Murray :


--
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-20 Thread Guido van Rossum

Guido van Rossum  added the comment:

Aside: I, too, at first thought this would be a bad idea because it brings back 
the Python 2 issue of accepting some but not all Unicode strings. But then I 
realized that by their nature these functions only accept a very specific set 
of characters -- so the restriction to (a subset of) ASCII is intrinsic to the 
functionality, and there is no possibility of confusion. If anything, accepting 
bytes is more likely to be confusing (they could be EBCDIC! :-). So no 
objection here. And a slight preference for ValueError.
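
For illustration only (this is not part of the original message): a minimal sketch of the behaviour described above, assuming Python 3.3 or later, where the change tracked by this issue is available. The sample string is made up for the example.

```python
import base64

# ASCII-only str input is accepted and decodes to the same bytes as bytes input.
decoded_from_str = base64.b64decode("aGVsbG8=")
decoded_from_bytes = base64.b64decode(b"aGVsbG8=")

# A str containing a non-ASCII character is rejected with ValueError.
try:
    base64.b64decode("aGVsbG8=\u00e9")
    non_ascii_rejected = False
except ValueError:
    non_ascii_rejected = True
```

Both calls at the top yield the same bytes object, so the only str inputs that behave differently are those that cannot be ASCII in the first place.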

--
nosy: +gvanrossum




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-20 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> Non-ascii binary data should not be being rejected unless validate
> is true.  So what are you going to do with non-ascii-range unicode in
> that case?  Ignore it as well?  That can't be right.

It's not ignored, it raises ValueError. Since the common case is to feed
valid (not invalid) baseXX data to these functions, that's a very benign
limitation.
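
To make the asymmetry under discussion concrete, here is a small sketch (an editorial illustration, assuming Python 3.3 or later; the input values are invented). It compares a stray non-ASCII byte in bytes input with the equivalent non-ASCII character in str input.

```python
import base64
import binascii

# b"\xe9" is a non-ASCII byte inserted into the valid encoding b"aGVsbG8=".
noisy_bytes = b"aGVs\xe9bG8="

# With validate left at its default (False), the stray byte is discarded.
lenient = base64.b64decode(noisy_bytes)

# With validate=True, the same input is rejected with binascii.Error.
try:
    base64.b64decode(noisy_bytes, validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True

# The equivalent str input is rejected even without validate, because the
# non-ASCII character cannot be encoded to ASCII first.
try:
    base64.b64decode("aGVs\u00e9bG8=")
    str_rejected = False
except ValueError:
    str_rejected = True
```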

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-20 Thread R. David Murray

R. David Murray  added the comment:

I disagree with this commit.  Reopening pending discussion on python-dev.

--
status: closed -> open




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-20 Thread R. David Murray

R. David Murray  added the comment:

Non-ascii binary data should not be being rejected unless validate
is true.  So what are you going to do with non-ascii-range unicode in
that case?  Ignore it as well?  That can't be right.

I believe this should be discussed on python-dev.

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-20 Thread Antoine Pitrou

Changes by Antoine Pitrou :


--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-20 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

I've committed issue13641-alternative-v1.patch. I really think practicality 
beats purity here and, furthermore, there's no associated danger (non-ASCII 
data is rejected both as bytes and str).

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-20 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset c760bd844222 by Antoine Pitrou in branch 'default':
Issue #13641: Decoding functions in the base64 module now accept ASCII-only 
unicode strings.
http://hg.python.org/cpython/rev/c760bd844222

--
nosy: +python-dev




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-19 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> OK, I'm back to being 100% on the side of rejecting both of these
> changes.  ASCII is not unicode, it is bytes.  You can decode it to
> unicode but it is not unicode.  Those transformations operate bytes to
> bytes, not bytes to unicode.

ASCII is just a subset of the unicode character set.

> We made the bytes unicode separation to avoid the problem where you
> have a working program that unexpectedly gets non ASCII input and
> blows up with a unicode error.

How is blowing up with a unicode error worse than blowing up with a
ValueError? Both indicate wrong input. At worst the code could catch
UnicodeError and re-raise it as ValueError, but I don't see the point.
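
As an editorial aside, the relationship between the two exception types can be sketched as follows. The helper names `ascii_only` and `classify` are hypothetical and do not come from the patch; the class hierarchy itself is standard Python.

```python
# UnicodeEncodeError derives from UnicodeError, which derives from ValueError.
is_value_error = issubclass(UnicodeEncodeError, ValueError)


def ascii_only(text):
    """Hypothetical helper: encode text to ASCII, re-raising as plain ValueError."""
    try:
        return text.encode("ascii")
    except UnicodeError as exc:
        raise ValueError("expected ASCII-only text") from exc


def classify(text):
    """Return 'ok' or 'bad input' depending on whether text is pure ASCII."""
    try:
        ascii_only(text)
        return "ok"
    except ValueError:
        return "bad input"
```

Because of the subclass relationship, a caller that already catches ValueError would handle either exception without any re-raising.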

> The programmer should have to explicitly encode to ASCII if they are
> inadvisedly working with it in a string as part of a wire protocol
> (why else would they be using these transforms).

Inadvisedly? There are many situations where you can have base64 data in
some unicode strings.

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-19 Thread R. David Murray

R. David Murray  added the comment:

OK, I'm back to being 100% on the side of rejecting both of these changes.  
ASCII is not unicode, it is bytes.  You can decode it to unicode but it is not 
unicode.  Those transformations operate bytes to bytes, not bytes to unicode.

We made the bytes unicode separation to avoid the problem where you have a 
working program that unexpectedly gets non ASCII input and blows up with a 
unicode error.  IMO these patches are reintroducing that problem.  The 
programmer should have to explicitly encode to ASCII if they are inadvisedly 
working with it in a string as part of a wire protocol (why else would they be 
using these transforms).

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-19 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

I think trying to prevent mixed argument types is complete overkill. There's 
no ambiguity since they all have to be ASCII anyway.
So I would prefer to commit issue13641-alternative-v1.patch

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-19 Thread Catalin Iacob

Catalin Iacob  added the comment:

Attached v2 of patch where mixing str and binary data for altchars or map01 
raises TypeError.
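
An editorial sketch of what such a check could look like. The function `check_same_kind` is hypothetical and is not taken from the v2 patch; per the rest of the thread, the patch that was eventually committed was v1, which has no such restriction.

```python
def check_same_kind(data, extra):
    """Hypothetical check: raise TypeError when str and binary arguments are mixed.

    `extra` stands in for altchars or map01 and may be None when not supplied.
    """
    if extra is None:
        return
    if isinstance(data, str) != isinstance(extra, str):
        raise TypeError(
            "cannot mix str and binary arguments: got %s and %s"
            % (type(data).__name__, type(extra).__name__)
        )


def mixing_is_rejected(data, extra):
    """Report whether the check above refuses this pair of arguments."""
    try:
        check_same_kind(data, extra)
        return False
    except TypeError:
        return True
```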

I also added a note for each of the changed functions that it also accepts 
strings (but didn't also update the docstrings).

When writing the docs, the new functionality seemed hard to describe; maybe 
that means this issue only complicates things and is not worth it, or maybe it 
just means I don't have experience at writing docs.

But, regardless of having worked on a patch, I have to admit that I'm also not 
100% sure this issue is a good idea. I *do* think that either both this issue 
and #13637 should be accepted or both rejected.

--
Added file: http://bugs.python.org/file24565/issue13641-alternative-v2.patch




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-17 Thread Catalin Iacob

Catalin Iacob  added the comment:

My current patch allows mixing of bytes and str for the data to be decoded and 
the altchars or map01 parameter. Given David's observation in msg153505 I'll 
update the patch to require that both the data and altchars/map01 have the same 
type.

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-16 Thread R. David Murray

R. David Murray  added the comment:

OK, I skimmed the thread I was remembering, and while it was discussing 
str->str and bytes->bytes primarily, the only pronouncement I could find was 
that functions should not accept a *mix* of bytes and string.  So I guess I 
withdraw my objection, although it still makes me a bit uncomfortable.

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-16 Thread poq

poq  added the comment:

FWIW, I was surprised by the return type of b64encode when I first used it in 
Python 3. It seems to me that b64encode turns binary data into text and thus 
intuitively should take bytes and return str.

Similarly it seems intuitive to me for b64decode to take str as input and 
return bytes, as it turns text back into binary data.
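
For readers following along, a short editorial sketch of the types involved, assuming Python 3.3 or later for the final call; the sample payload is invented.

```python
import base64

encoded = base64.b64encode(b"hello")      # bytes in, bytes out
as_text = encoded.decode("ascii")         # explicit step to obtain a str
round_trip = base64.b64decode(as_text)    # str input is accepted when decoding
```

The explicit `decode("ascii")` is the step that surprises people who expect `b64encode` to return text directly.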

--
nosy: +poq




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-16 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> However, accepting bytes or string and returning bytes is not an
> obviously good idea, and IMO at least merits some discussion.

Why? "a" in "a2b" means ASCII, and unicode is as valid a container for ASCII 
text as bytes is.

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-16 Thread R. David Murray

R. David Murray  added the comment:

Um.  I'm inclined to think that #13637 was a mistake.

Functions that accept bytes and return bytes and also accept string and return 
string seem uncontroversial.  However, accepting bytes or string and returning 
bytes is not an obviously good idea, and IMO at least merits some discussion.  
In fact, I thought it *had* been discussed, specifically in the context of the 
b2a/a2b functions, and been rejected.

--
nosy: +r.david.murray




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-02-16 Thread Catalin Iacob

Catalin Iacob  added the comment:

Attached alternative patch with a different approach: on input, strings are 
encoded as bytes and the rest of the code proceeds as before.
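
The approach can be sketched roughly as follows. This is an editorial illustration, not the contents of the attached patch: `to_ascii_bytes` and `decode_b64` are hypothetical names, and the error message is invented.

```python
import base64


def to_ascii_bytes(value):
    """Hypothetical helper: turn str into ASCII bytes, pass other input through."""
    if isinstance(value, str):
        try:
            return value.encode("ascii")
        except UnicodeEncodeError:
            raise ValueError("str input must contain only ASCII characters")
    return value


def decode_b64(data):
    """Hypothetical wrapper: normalise the input, then decode bytes as before."""
    return base64.b64decode(to_ascii_bytes(data))
```

Normalising once at the entry point means the rest of the decoding code only ever sees bytes, which is what keeps this style of change small.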

All existing tests for bytes now test for strings as well and there is a new 
test for strings with non ASCII characters.

Berker's patch was more intrusive and forgot to allow strings in _translate, 
leading to failures if altchars or map01 were used.

--
nosy: +catalin.iacob
Added file: http://bugs.python.org/file24533/issue13641-alternative-v1.patch




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-01-19 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Thanks for the updated patch!
Two comments:
- I see no tests for map01 and altchars being passed as a str, is this 
supported by the patch or am I reading it wrong?
- apparently b16decode is not tackled, is it deliberate?
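
For reference, an editorial sketch of the two cases raised above as they behave on a Python version that includes the change eventually committed for this issue (3.3 or later). The inputs are invented for the example.

```python
import base64

# altchars given as str alongside str data ("-_" stands in for "+/").
with_altchars = base64.b64decode("-_8=", altchars="-_")

# b16decode with str input (uppercase hex digits, so casefold is not needed).
from_hex = base64.b16decode("68656C6C6F")
```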

Thanks again.

--
stage:  -> patch review




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-01-16 Thread Berker Peksag

Changes by Berker Peksag :


Added file: http://bugs.python.org/file24252/issue13641_v3_with_tests.diff




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-01-04 Thread Berker Peksag

Berker Peksag  added the comment:

Hi Antoine,

I added some tests for the b64decode function.

Also, I wrote some tests for the b32decode and b16decode functions, but they 
failed. I think my patch is not working for the b32decode and b16decode 
functions. I'll dig into the code and try to find a way.

Thanks!

--
Added file: http://bugs.python.org/file24141/issue13641_v2_with_tests.diff




[issue13641] decoding functions in the base64 module could accept unicode strings

2012-01-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Thanks for the patch, Berker. It seems a bit too simple, though. You should add 
some tests in Lib/test/test_base64.py and run them (using "./python -m test -v 
test_base64"), this will allow you to see if your changes are correct.
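
An editorial sketch of the kind of test meant here, assuming the behaviour that was eventually committed for this issue (Python 3.3 or later). The class `StrInputTest` and the function `run_sketch_tests` are hypothetical and are not taken from Lib/test/test_base64.py.

```python
import base64
import unittest


class StrInputTest(unittest.TestCase):
    """Hypothetical test case for str input to b64decode."""

    def test_ascii_str_decodes_like_bytes(self):
        self.assertEqual(base64.b64decode("aGVsbG8="),
                         base64.b64decode(b"aGVsbG8="))

    def test_non_ascii_str_is_rejected(self):
        with self.assertRaises(ValueError):
            base64.b64decode("aGVsbG8=\u00e9")


def run_sketch_tests():
    """Run the test case above and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StrInputTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```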

--




[issue13641] decoding functions in the base64 module could accept unicode strings

2011-12-31 Thread Berker Peksag

Changes by Berker Peksag :


--
keywords: +patch
nosy: +berkerpeksag
Added file: http://bugs.python.org/file24119/issue13641_v1.diff




[issue13641] decoding functions in the base64 module could accept unicode strings

2011-12-30 Thread Éric Araujo

Changes by Éric Araujo :


--
nosy: +eric.araujo




[issue13641] decoding functions in the base64 module could accept unicode strings

2011-12-28 Thread Anthony Kong

Changes by Anthony Kong :


--
nosy: +Anthony.Kong




[issue13641] decoding functions in the base64 module could accept unicode strings

2011-12-27 Thread Petri Lehtinen

Changes by Petri Lehtinen :


--
nosy: +petri.lehtinen




[issue13641] decoding functions in the base64 module could accept unicode strings

2011-12-20 Thread Antoine Pitrou

New submission from Antoine Pitrou :

Similarly to #13637 for the binascii module, the decoding functions in the 
base64 module could accept ASCII-only unicode strings.

--
components: Library (Lib)
keywords: easy
messages: 149912
nosy: pitrou
priority: normal
severity: normal
status: open
title: decoding functions in the base64 module could accept unicode strings
type: enhancement
versions: Python 3.3
