[issue33108] Unicode char 304 in lowercase has len = 2

2018-03-20 Thread Kiril Dimitrov
Kiril Dimitrov added the comment:

This is roughly my use case:

zip("ßx", [0.5, 0.3]) is [('ß', 0.5), ('x', 0.3)]
zip("ßx".upper(), [0.5, 0.3]) will be [('S', 0.5), ('S', 0.3)]

In the latter case you never get to see the value for &
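One possible way to keep characters and values aligned in this use case is to convert each character separately, so that an expanding mapping such as 'ß' -> 'SS' stays attached to a single value. This is only a sketch: the helper name `upper_pairs` is hypothetical and is not part of the tracker discussion, and per-character conversion can differ from whole-string conversion where the mapping depends on context (for example Greek final sigma in `str.lower()`).

```python
def upper_pairs(text, weights):
    """Pair the uppercase form of each original character with its weight.

    Hypothetical helper for illustration; not taken from the issue.
    """
    return [(ch.upper(), w) for ch, w in zip(text, weights)]


# Whole-string conversion changes the length, so zip() misaligns the values.
naive = list(zip("ßx".upper(), [0.5, 0.3]))
print(naive)    # [('S', 0.5), ('S', 0.3)]

# Per-character conversion keeps one (possibly multi-character) item per value.
aligned = upper_pairs("ßx", [0.5, 0.3])
print(aligned)  # [('SS', 0.5), ('X', 0.3)]
```

The trade-off is that each item may now hold more than one code point, so later code should not assume single-character strings.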

[issue33108] Unicode char 304 in lowercase has len = 2

2018-03-20 Thread Kiril Dimitrov
Change by Kiril Dimitrov:

--
title: Unicode char 304 in lowercase has len 2 -> Unicode char 304 in lowercase has len = 2

___
Python tracker <https://bugs.python.org/issu

[issue33108] Unicode char 304 in lowercase has len 2

2018-03-20 Thread Kiril Dimitrov
New submission from Kiril Dimitrov:

>>> chr(304)
'İ'
>>> chr(304).lower()
'i̇'
>>> len(chr(304).lower())
2

This breaks Unicode text matching. There is no other Unicode character with the same behaviour (in 3.6.2 and 3.6.4).

------ components
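The report can be explored with a short script that names the two code points produced by `chr(304).lower()` and scans the whole code point range for case conversions that change length. This is a sketch for inspection, not a statement about how CPython implements case mapping; the helper name `expands` is hypothetical, and the scan results depend on the Unicode database shipped with the interpreter.

```python
import sys
import unicodedata


def expands(convert):
    """Return code points whose converted form is longer than one code point.

    Hypothetical helper for illustration; `convert` is e.g. str.lower.
    """
    return [cp for cp in range(sys.maxunicode + 1) if len(convert(chr(cp))) > 1]


lowered = chr(304).lower()  # U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE

# Name each code point in the lowercased result.
names = [unicodedata.name(ch) for ch in lowered]
print(names)

# Code points that grow under lower() and under upper().
lower_expanding = expands(str.lower)
upper_expanding = expands(str.upper)
print([hex(cp) for cp in lower_expanding])
print(len(upper_expanding))
```

The second code point is a combining mark, which is why the lowercased string has length 2 even though it renders as a single glyph. Expansion under `upper()` is more common, with 'ß' as one example.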