Benjamin Peterson added the comment:
On Tue, Apr 24, 2018, at 04:33, Pekka Klärck wrote:
>
> Pekka Klärck added the comment:
>
> I didn't submit this as a bug report but as an enhancement request. From
> a usability point of view, saying that
Pekka Klärck added the comment:
I didn't submit this as a bug report but as an enhancement request. From a
usability point of view, saying that the results differ but you just cannot
see the difference is not very helpful.
The exact reason I didn't submit this as an
Benjamin Peterson added the comment:
As stated, the bug report is invalid: the repr _does_ differ; it's just not
presented that way by however you're viewing the two reprs. Distinct code point
sequences that look identical under certain circumstances can happen many
Pekka Klärck added the comment:
Thanks for pointing out `ascii()`. Seems to do exactly what I want.
`repr()` showing combining characters would, in my opinion, still be useful to
avoid problems like I demonstrated with unittest and pytest. I doubt it's a
good idea
Serhiy Storchaka added the comment:
Use ascii() in Python 3 if you want the behavior of repr() in Python 2. It
escapes all non-ASCII characters.
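As a minimal illustration of the suggestion above, using the two strings from the original report, `ascii()` makes the difference visible even when the terminal renders both strings identically:

```python
a = 'hyv\xe4'      # precomposed: U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
b = 'hyva\u0308'   # decomposed: 'a' followed by U+0308 COMBINING DIAERESIS

# repr() keeps printable non-ASCII characters as-is, so the two results
# may look the same on screen even though they are different strings.
print(repr(a))
print(repr(b))

# ascii() escapes every non-ASCII character, so the difference is visible.
print(ascii(a))  # 'hyv\xe4'
print(ascii(b))  # 'hyva\u0308'
```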
But escaping only combining characters, in addition to non-printable
characters, in repr() looks like an interesting idea.
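A rough sketch of what such behavior could look like, written as a standalone helper rather than a change to `repr()` itself. The name `repr_escaping_combining` is hypothetical, and using `unicodedata.combining()` to decide what counts as a combining character is an assumption; a real implementation might use a different test. Quote handling is simplified (the result is always wrapped in single quotes).

```python
import unicodedata

def repr_escaping_combining(s):
    """Hypothetical helper: like repr(), but also escapes combining characters."""
    parts = []
    for ch in s:
        if unicodedata.combining(ch):
            # Non-zero canonical combining class: emit a \u or \U escape.
            cp = ord(ch)
            parts.append('\\u%04x' % cp if cp <= 0xFFFF else '\\U%08x' % cp)
        else:
            # Reuse repr()'s own escaping for this character, dropping the
            # surrounding quotes that repr() adds.
            parts.append(repr(ch)[1:-1])
    return "'" + ''.join(parts) + "'"

print(repr_escaping_combining('hyv\xe4'))     # precomposed character is left as-is
print(repr_escaping_combining('hyva\u0308'))  # 'hyva\u0308'
```

With this approach the precomposed string stays readable while the decomposed one shows its combining mark explicitly.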
Pekka Klärck added the comment:
Forgot to mention that this doesn't affect Python 2:
>>> a = u'hyv\xe4'
>>> b = u'hyva\u0308'
>>> print(repr(a))
u'hyv\xe4'
>>> print(repr(b))
u'hyva\u0308'
In addition to hoping `repr()` would be enhanced in future Python 3 versions,
New submission from Pekka Klärck:
If I have two strings that look the same but use different Unicode
normalization forms, it's very hard to see where the problem actually is:
>>> a = 'hyv\xe4'
>>> b = 'hyva\u0308'
>>> print(a)
hyvä
>>> print(b)
hyvä
>>> a == b
False
>>>
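For readers who hit the same situation, one way to inspect and compare such strings is the standard `unicodedata` module. This is only a sketch of a workaround, not something proposed in the report itself; the variable names follow the session above.

```python
import unicodedata

a = 'hyv\xe4'      # NFC form: one precomposed code point for the last letter
b = 'hyva\u0308'   # NFD form: base letter plus a combining mark

print(len(a), len(b))  # 4 5

# Listing the code point names shows where the two strings differ.
print([unicodedata.name(ch) for ch in a])
print([unicodedata.name(ch) for ch in b])

# Normalizing both sides to the same form makes the comparison succeed.
same = unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)
print(same)  # True
```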