On 01/12/2014 09:26 AM, Paul Moore wrote:
On 12 January 2014 17:03, Ethan Furman <et...@stoneleaf.us> wrote:
We know full well the difference between unicode and bytes, and we know full
well that numbers and much of the text we need has an ASCII (bytes!)
representation.  When we do b'Content Length: %d' % len(binary_data), we
expect to get back a bytes object, /not/ a unicode object.

What I am struggling to understand here is what room for compromise
there is. Clearly, for whatever reason,

b'Content Length: ' + str(len(binary_data)).encode('ascii')

is not acceptable for you. OK, fair enough. Also, apparently, writing a helper

def int_to_bytes(n):
     return str(n).encode('ascii')

b'Content Length: ' + int_to_bytes(len(binary_data))

is unacceptable. But I'm not clear why it's unacceptable. Maybe I
missed the explanation - God knows, the thread is long enough :-)

True enough! ;) It's unacceptable in the sense that the bytes type is /almost/ there; it's /almost/ what is needed to handle the boundary conditions. We have a __bytes__ method (how is it supposed to be used?) that could be made to fit the interpolation bill.
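For reference, a minimal sketch of how __bytes__ hooks into bytes() on Python 3 today -- the Length class here is purely illustrative, not anything in the stdlib:

class Length:
    def __init__(self, n):
        self.n = n
    def __bytes__(self):
        # strict ASCII encoding of the decimal representation
        return str(self.n).encode('ascii')

bytes(Length(1024))  # -> b'1024'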

It seems to me the core of Nick's refusal is the rejection (and I agree!) of bytes interpolation returning unicode -- but that's not what I'm asking for! I'm asking for it to return bytes, with the interpolated data (in the case of %d, %s, etc.) being strictly ASCII-encoded.


On the other hand, Nick has explained why b'Content Length: %d' %
len(binary_data) is unacceptable to him (you don't have to agree with
his opinion, just concede that he has explained his position in a way
that you understand).

Only because he (or Benno) finally wrote some tests and I was able to see what he thought I wanted. That does seem to leave a *tiny* bit of wiggle room if bytes interpolation always returns bytes, and never unicode (yeah, I know, snowball's chance and all that).
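To make the ask concrete, a rough sketch of the behaviour I mean -- not tests of anything that exists today, just what "bytes in, bytes out" would look like if %d and %s on a bytes format string encoded strictly to ASCII:

result = b'Content Length: %d' % (1024,)    # hypothetical today
assert isinstance(result, bytes)            # never unicode
assert result == b'Content Length: 1024'

result = b'--boundary: %s' % (b'abc',)      # bytes arguments pass through as-is
assert result == b'--boundary: abc'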


I'm not trying to argue you're wrong - I don't know your codebase, nor
do I know your application area. But surely somewhere between "we must
have % formatting including %d for bytes" and the above, there's a
middle ground that you *are* willing to accept? Can you give any
indications of what that might be? What, specifically, about the
helper function is the problem? I don't think it is any less space
efficient, it doesn't double-encode, and I don't think it's more
difficult to understand (although it is a little longer, it trades
that off against being a bit more explicit as to what's going on).
Surely you're not arguing that your code must work unchanged (not
"there's a way of writing the code so it works on Python 2 and 3", but
"the code you currently have for Python 2 must work with no changes at
all")?

I'm arguing from three PoVs (a rough sketch follows the list):

1) 2 & 3 compatible code base

2) having the bytes type /be/ the boundary type

3) readable code
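
Purely as illustration (payload and header are made-up names), here is the kind of boundary code I mean -- the first form runs unchanged on 2 and 3 today, the commented-out form is what I would like to be able to write:

payload = b'\x00\x01\x02'

# works unchanged on Python 2 and 3, with bytes as the boundary type
header = b'Content Length: ' + str(len(payload)).encode('ascii')
assert header == b'Content Length: 3'

# what I would like to write instead (bytes in, bytes out):
# header = b'Content Length: %d' % (len(payload),)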


Can you give an example of code that is *nearly* acceptable to you,
which works in Python 2 and 3 today, and explain what improvements you
would like to see to it in order to use it instead of waiting for a
core change?

I'm not trying to be difficult (just naturally good at it, I guess ;), but I don't see a lot of room for compromise: I would like % interpolation, and I'm told I have to use a helper function. I will if I have to, but first I have to try and make myself understood, and I'm not sure that has happened yet. Following Nick's example, I'm writing up some tests that clearly show what I would like to see. Then at least we can debate what I'm actually asking for, and not the (understandable) unicode-what-a-mess-we-had-in-py2k-don't-want-that-again scenario that some think I am asking for.

--
~Ethan~