[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-30 Thread Senthil Kumaran
Changes by Senthil Kumaran: nosy: +orsenthil

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-30 Thread Nick Coghlan
Nick Coghlan added the comment: Yeah, the general implementation concept I'm thinking of going with for option 2 will use a few helper functions:

    url, coerced_to_str = _coerce_to_str(url)
    if coerced_to_str:
        param = _force_to_str(param)  # as appropriate
    ...
    return _undo_coercion(result, coe
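
A minimal sketch of how the option 2 helpers could fit together. The helper names come from the comment above; their bodies, the strict ASCII codec, the mixed-type check in `_force_to_str`, and the `join_path` example function are assumptions made for illustration, not the contents of the actual patch.

```python
def _coerce_to_str(value, encoding="ascii"):
    """Return (text, coerced); coerced is True when value was bytes-like."""
    if isinstance(value, str):
        return value, False
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding, "strict"), True
    raise TypeError("expected str, bytes or bytearray")


def _force_to_str(value, encoding="ascii"):
    """Decode a secondary argument once the primary one was bytes-like."""
    if isinstance(value, str):
        # Assumption: mixing str with bytes-like arguments is an error.
        raise TypeError("cannot mix str and bytes arguments")
    return value.decode(encoding, "strict")


def _undo_coercion(result, coerced, encoding="ascii"):
    """Encode the result back to bytes only if the input was coerced."""
    return result.encode(encoding, "strict") if coerced else result


def join_path(url, segment):
    """Hypothetical example: append a path segment using str logic only."""
    url, coerced = _coerce_to_str(url)
    if coerced:
        segment = _force_to_str(segment)
    result = url.rstrip("/") + "/" + segment
    return _undo_coercion(result, coerced)


print(join_path("http://a/b/", "c"))
print(join_path(b"http://a/b/", b"c"))
```

The body of `join_path` only ever handles str, so no bytes variants of the literals are needed. In this sketch bytearray input comes back as bytes, which a real implementation might or might not want.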

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-30 Thread STINNER Victor
STINNER Victor added the comment:
> Option 2 (the alternative Antoine suggested and I'm considering):
> - "decode" ... to str ...
> - ... objects are "encoded" back to actual bytes before they are returned
In this case, you have to be very careful to not mix str and bytes decoded to str u
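
Once bytes have been decoded they look exactly like genuine str, which is why a coercion layer needs an explicit mixed-type check. As a point of comparison, `urllib.parse.urljoin` in current CPython releases rejects mixed arguments. The sketch below only demonstrates that behaviour; `try_join` is a hypothetical wrapper and `example.com` is a placeholder.

```python
from urllib.parse import urljoin


def try_join(base, url):
    """Return the joined URL, or the exception type name on failure."""
    try:
        return urljoin(base, url)
    except TypeError as exc:
        return type(exc).__name__


print(try_join("http://example.com/a/", "b"))    # str in, str out
print(try_join(b"http://example.com/a/", b"b"))  # bytes in, bytes out
print(try_join(b"http://example.com/a/", "b"))   # mixed arguments are rejected
```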

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-30 Thread Nick Coghlan
Nick Coghlan added the comment: From a function *user* perspective, the latter API (bytes->bytes, str->str) is exactly what I'm doing. Antoine's point is that there are two ways to achieve that:
Option 1 (what my patch currently does):
- provide bytes and str variants of all constants
- cho

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-29 Thread STINNER Victor
STINNER Victor added the comment: I don't understand why you would like to implicitly convert bytes to str (which is one of the worst design choices of Python 2). If you don't want to care about encodings, using bytes is fine. Decoding bytes using an arbitrary encoding is the fastest way to mojibak

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-29 Thread Nick Coghlan
Nick Coghlan added the comment:
> I think it's quite misguided. latin1 encoding and decoding is blindingly fast (on the order of 1 GB/s here). Unless you have multi-megabyte URLs, you won't notice any overhead.
Ah, I didn't know that (although it makes sense now I think about it). I'll star

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> The primary reason for supporting ASCII compatible bytes directly is specifically to avoid the encoding and decoding overhead associated with the translation to unicode text.
I think it's quite misguided. latin1 encoding and decoding is blindingly fast (o
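
The throughput claim is easy to check locally. The following is a rough sketch, not a rigorous benchmark: the payload size and run count are arbitrary choices, and the figure printed will vary by machine and Python build. It also confirms that a latin-1 round trip preserves every byte value.

```python
import timeit


def round_trip(data):
    """Decode bytes as latin-1 and encode them straight back."""
    return data.decode("latin-1").encode("latin-1")


# 1 MiB payload containing every possible byte value.
payload = bytes(range(256)) * 4096
assert round_trip(payload) == payload  # latin-1 maps all 256 byte values

runs = 20
seconds = timeit.timeit(lambda: round_trip(payload), number=runs)
megabytes = len(payload) * runs / 1e6
print(f"{megabytes / seconds:.0f} MB/s for a decode+encode round trip")
```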

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-29 Thread Nick Coghlan
Nick Coghlan added the comment: One of Antoine's review comments made me realise I hadn't explicitly noted the "why not decode with latin-1?" rationale for the bytes handling. (It did come up in one or more of the myriad python-dev discussions on the topic, I just hadn't noted it here) The p

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-29 Thread Nick Coghlan
Nick Coghlan added the comment: Added to Rietveld: http://codereview.appspot.com/2318041/

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-29 Thread Nick Coghlan
Nick Coghlan added the comment: Agreed - I think there's a non-zero chance of triggering the str-path by mistake if we try to duck-type it (I just added a similar comment to #9969 regarding a possible convenience API for tokenisation).
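
The risk of taking the str path by mistake can be illustrated by comparing the two checks directly. This is a sketch only: `FakeCodec`, `MyStr`, and the two predicate functions are hypothetical names invented for the comparison.

```python
class FakeCodec:
    """Hypothetical non-text object that happens to define encode()."""

    def encode(self, data):
        return data


class MyStr(str):
    """A str subclass, which an isinstance check accepts as text."""


def is_text_strict(x):
    return isinstance(x, str)


def is_text_duck(x):
    return hasattr(x, "encode")


# Print the verdict of each check for a few representative values.
for value in ("text", MyStr("text"), b"bytes", FakeCodec()):
    print(type(value).__name__, is_text_strict(value), is_text_duck(value))
```

The two checks agree on str, str subclasses, and bytes, but the duck-typed check also accepts `FakeCodec`, which would then be sent down the str code path.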

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> A possible duck-typing approach here would be to replace the "isinstance(x, str)" tests with "hasattr(x, 'encode')" checks instead.
Looks more ugly than useful to me. People wanting to emulate str had better subclass it anyway...
nosy: +pitrou

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-28 Thread Nick Coghlan
Nick Coghlan added the comment: A possible duck-typing approach here would be to replace the "isinstance(x, str)" tests with "hasattr(x, 'encode')" checks instead. Thoughts?

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-28 Thread Nick Coghlan
Nick Coghlan added the comment: Attached patch is a very rough first cut at this. I've gone with the basic approach of simply assigning the literals to local variables in each function that uses them. My rationale for that is: 1. Every function has to have some kind of boilerplate to switch ba
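
The approach of assigning literals to local variables might look roughly like this. It is a sketch under assumptions: the function `strip_query_and_fragment` and its variable names are invented for illustration and are not taken from the attached patch.

```python
def strip_query_and_fragment(url):
    """Hypothetical example: drop '?query' and '#fragment' from a URL.

    The literals are bound to locals up front, choosing the str or the
    bytes variant from the type of the argument, so the body works on
    either type without further branching.
    """
    if isinstance(url, str):
        question, hash_ = "?", "#"
    else:
        question, hash_ = b"?", b"#"
    url = url.partition(hash_)[0]
    return url.partition(question)[0]


print(strip_query_and_fragment("http://example.com/a?b=1#frag"))
print(strip_query_and_fragment(b"http://example.com/a?b=1#frag"))
```

The boilerplate is confined to the first few lines of each function, at the cost of repeating it in every function that uses literals.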

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-20 Thread Nick Coghlan
Nick Coghlan added the comment: The design approach (at least for urllib.parse) is to add separate *b APIs that operate on bytes rather than implicitly allowing bytes in the ordinary versions of the function. Allowing programmers to manipulate correctly encoded (and hence ASCII compatible) b

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-17 Thread Nick Coghlan
Nick Coghlan added the comment: From the python-dev thread (http://mail.python.org/pipermail/python-dev/2010-September/103780.html):
So the domain of any polymorphic text manipulation functions we define would be:
- Unicode strings
- byte sequences where the encoding is eithe

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-16 Thread Éric Araujo
Changes by Éric Araujo: nosy: +eric.araujo

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-16 Thread Eric Smith
Changes by Eric Smith: nosy: +eric.smith

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-16 Thread R. David Murray
Changes by R. David Murray: nosy: +r.david.murray

[issue9873] Allow bytes in some APIs that use string literals internally

2010-09-16 Thread Nick Coghlan
New submission from Nick Coghlan: As per python-dev discussion in June, many Py3k APIs currently gratuitously prevent the use of bytes and bytearray objects as arguments due to their use of string literals internally. Examples:
urllib.parse.urlparse
urllib.parse.urlunparse
urllib.parse.urljoi
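
For reference, in current CPython releases (3.2 and later, per the `urllib.parse` documentation) these functions accept ASCII-encoded bytes and return bytes. The snippet below is a small check of that behaviour, not part of the original report; the URL is a placeholder.

```python
from urllib.parse import urlparse, urlunparse

raw = b"http://example.com/path?q=1#frag"
parts = urlparse(raw)

# Every component comes back as bytes when the input was bytes.
print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)

# Reassembling the parsed components yields bytes again.
rebuilt = urlunparse(parts)
print(rebuilt == raw)
```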