New submission from Wonsup Yoon:
unicodedata.normalize() returns an incorrect NFC result for Hangul strings
that contain \u1176 (HANGUL JUNGSEONG A-O).
>>> from unicodedata import normalize
>>> normalize("NFC", "\u1100\u1176\u11a8")
'깍'
=> The result should be "\u1100\u1176\u11a8" (unchanged), not '깍' (\uae4d).
I have attached a patch for this issue. It fixes the boundary check for
modern medial vowels.
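
For illustration, here is a minimal pure-Python sketch of Hangul syllable
composition as described in the Unicode Standard. It is not the C code in
unicodedata and not the contents of u1176.patch; the function name
compose_hangul and the constant names are hypothetical. Modern medial vowels
span \u1161 through \u1175 (21 code points), so \u1176 has vowel index 21 and
must fall outside the composable range.

```python
# Constants from the Unicode Standard's Hangul composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_COUNT = L_COUNT * V_COUNT * T_COUNT


def compose_hangul(text):
    """Compose conjoining jamo into precomposed syllables (sketch only)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if out:
            last = ord(out[-1])
            l_index = last - L_BASE
            v_index = cp - V_BASE
            # L + V -> LV. The upper bound is exclusive: \u1176 gives
            # v_index == 21 == V_COUNT and therefore must not compose.
            if 0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT:
                out[-1] = chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT)
                continue
            s_index = last - S_BASE
            t_index = cp - T_BASE
            # LV + T -> LVT. t_index 0 means "no trailing consonant".
            if (0 <= s_index < S_COUNT and s_index % T_COUNT == 0
                    and 0 < t_index < T_COUNT):
                out[-1] = chr(last + t_index)
                continue
        out.append(ch)
    return "".join(out)
```

With an exclusive upper bound, compose_hangul("\u1100\u1176\u11a8") leaves the
string unchanged. If \u1176 were accepted as a medial vowel, the arithmetic
would give 0xAC00 + (0 * 21 + 21) * 28 + 1 = 0xAE4D, which is the '깍' seen
in the report.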
----------
components: Unicode
files: u1176.patch
keywords: patch
messages: 287077
nosy: ezio.melotti, haypo, pusnow
priority: normal
severity: normal
status: open
title: bug in unicodedata.normalize: u1176
versions: Python 2.7, Python 3.6
Added file: http://bugs.python.org/file46535/u1176.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue29456>
_______________________________________