Nick Coghlan wrote:
By allowing format characters that *do* assume ASCII, the entire construct is rendered unsafe - you have to look inside the format string to determine if it is assuming ASCII compatibility or not, thus the entire construct must be deemed as assuming ASCII compatibility at the level of static semantic analysis.
I don't see how any of the currently proposed formatting operations make a data-dependent ASCII assumption. When you write b"%d" % x, you're not assuming that x is ASCII, you're assuming that it's an *integer*. The %d conversion of an integer is defined to produce only ASCII characters, and it works on any integer, so there's no data-dependent assumption there.

Something that *would* involve such an assumption would be if b"%s" % 'hello' were defined to encode 'hello' as ASCII. But Guido has proposed not doing that, and instead interpolating ascii('hello'). Since ascii() is defined to return only ASCII characters, and works on any string, there is again no data-dependent assumption. My preference would be for b"%s" % 'hello' to raise an exception, but that would still be data-independent.

As for having to look inside the format string to know what types are expected, that's no different from any other formatting operation. All it means is that static type analysis in Python is hard, but we already knew that.
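A minimal sketch of the distinction above. It assumes the bytes formatting that later shipped in Python 3.5 (PEP 461), not the exact proposal under discussion here: in that version %s rejects a str operand outright, and the ascii() interpolation is spelled %a. The helper name is_ascii_only is made up for illustration.

```python
# Assumes Python 3.5+ (bytes %-formatting as eventually specified in PEP 461).

def is_ascii_only(data: bytes) -> bool:
    """Hypothetical helper: True if every byte is in the 7-bit ASCII range."""
    return all(byte < 0x80 for byte in data)

# %d depends only on the operand being an integer; the output is always ASCII.
int_results = [b"%d" % n for n in (0, -7, 2**64)]

# Arbitrary binary data passes through %b untouched, next to a %d field.
payload = b"\xff\x00\xfe"
framed = b"%d:%b" % (len(payload), payload)

# %a interpolates ascii(obj), which is pure ASCII for any string.
escaped = b"%a" % "h\u00e9llo"

# A str operand to %s is rejected because of its type, not its content.
try:
    b"%s" % "hello"
    str_rejected = False
except TypeError:
    str_rejected = True
```

In each case the outcome is decided by the operand's type, never by the bytes or characters it happens to contain.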
Allowing these ASCII assuming format codes in the core bytes interpolation introduces *exactly* the same problem as is present in the Python 2 text model: code that *appears* to support arbitrary binary data, but is in fact assuming ASCII compatibility.
Can you provide an example of code using Guido's currently approved formatting semantics that would fail when given arbitrary binary data? I don't see how it can happen.

--
Greg