New submission from Diego Argueta <diego.argu...@gmail.com>:

The way Python 3 handles identifiers containing mathematical characters appears to be broken. I didn't test the entire range of U+1D400 through U+1D59F but I spot-checked them and the bug manifests itself there:

Python 3.9.7 (default, Sep 10 2021, 14:59:43)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> foo = 1234567890
>>> bar = 1234567890
>>> foo is bar
False
>>> 𝖇𝖆𝖗 = 1234567890
>>> foo is 𝖇𝖆𝖗
False
>>> bar is 𝖇𝖆𝖗
True
>>> 𝖇𝖆𝖗 = 0
>>> bar
0

This differs from the behavior with other non-ASCII characters. For example, ASCII 'a' and Cyrillic 'a' are properly treated as different identifiers:

>>> а = 987654321 # Cyrillic lowercase 'a', U+0430
>>> a = 123456789 # ASCII 'a'
>>> а # Cyrillic
987654321
>>> a # ASCII
123456789

While a bit of a pathological case, it is a nasty surprise. It's possible this is a symptom of a larger bug in the way identifiers are resolved.

This is similar but not identical to https://bugs.python.org/issue46555

Note: I did not find this myself; I give credit to Cooper Stimson (https://github.com/6C1) for finding this bug. I merely reported it.

----------
components: Parser, Unicode
messages: 412084
nosy: da, ezio.melotti, lys.nikolaou, pablogsal, vstinner
priority: normal
severity: normal
status: open
title: Unicode identifiers not necessarily unique
type: behavior
versions: Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46572>
_______________________________________
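
A possible way to reproduce the observation outside the interactive prompt is sketched below. It rests on an assumption the report itself does not make: that the parser converts identifiers to Unicode normal form NFKC, as PEP 3131 describes. Under that assumption the mathematical letters fold to ASCII "bar", while Cyrillic U+0430 has no compatibility mapping and stays distinct. The helper name normalized_identifier is hypothetical and exists only for illustration.

```python
import unicodedata


def normalized_identifier(name: str) -> str:
    """Return the NFKC form of `name` (assumed to match what the parser applies)."""
    return unicodedata.normalize("NFKC", name)


math_bar = "𝖇𝖆𝖗"       # mathematical letters copied from the report
cyrillic_a = "\u0430"  # Cyrillic lowercase 'a'

# The mathematical letters carry compatibility mappings to ASCII letters.
print(normalized_identifier(math_bar) == "bar")

# The Cyrillic letter is unchanged by NFKC, so it remains a separate name.
print(normalized_identifier(cyrillic_a) == "a")

# Check against the parser itself: assign through both spellings and
# inspect which names end up in the namespace.
namespace = {}
exec("bar = 1234567890\n𝖇𝖆𝖗 = 0", namespace)
print(namespace["bar"])
print(math_bar in namespace)
```

If the assumption holds, the last two lines show that the second assignment rebinds "bar" and that no separate entry exists under the mathematical spelling, which matches the session above.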