New submission from Wonsup Yoon:

unicodedata.normalize("NFC", ...) incorrectly composes Hangul strings that
contain \u1176 (HANGUL JUNGSEONG A-O).

>>> from unicodedata import normalize
>>> normalize("NFC", "\u1100\u1176\u11a8")
'깍'

=> The result should be "\u1100\u1176\u11a8" (unchanged), not '깍' (\uae4d),
because \u1176 lies outside the modern medial vowel range \u1161..\u1175.

I have attached a patch for this issue. It fixes the boundary check for
modern medial vowels.
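
For reference, here is a minimal sketch of the Hangul syllable composition
arithmetic from the Unicode Standard, written in pure Python. It is not the
CPython code and not the attached patch. The helper names (compose_lvt,
compose_lvt_inclusive) are hypothetical, and the inclusive upper bound in the
second helper is only an assumption about the kind of boundary error involved.
It happens to reproduce the wrong result shown above.

```python
# Constants from the Unicode Standard's Hangul composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_lvt(lead, vowel, trail):
    """Compose three jamo code points into a syllable, or return None.

    Hypothetical helper for illustration only.
    """
    l_index = lead - L_BASE
    v_index = vowel - V_BASE
    t_index = trail - T_BASE
    # Half-open ranges: V_BASE + V_COUNT is U+1176, which is NOT a modern vowel.
    if not 0 <= l_index < L_COUNT:
        return None
    if not 0 <= v_index < V_COUNT:
        return None
    # T_BASE itself is not a trailing consonant, so t_index must be > 0.
    if not 0 < t_index < T_COUNT:
        return None
    return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index


def compose_lvt_inclusive(lead, vowel, trail):
    """Same arithmetic, but with an inclusive vowel upper bound.

    Assumed form of the boundary error; shown only to illustrate the symptom.
    """
    l_index = lead - L_BASE
    v_index = vowel - V_BASE
    t_index = trail - T_BASE
    if not 0 <= l_index < L_COUNT:
        return None
    if not 0 <= v_index <= V_COUNT:  # off by one: lets U+1176 through
        return None
    if not 0 < t_index < T_COUNT:
        return None
    return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index


# U+1100 U+1176 U+11A8 must not compose.
correct = compose_lvt(0x1100, 0x1176, 0x11A8)
# With the inclusive bound, the vowel index 21 spills over into the next
# leading consonant's block and yields U+AE4D.
wrong = compose_lvt_inclusive(0x1100, 0x1176, 0x11A8)
```

With the half-open check the sequence is left alone (correct is None). With
the inclusive check, wrong equals 0xAE4D, the same code point that
\u1101\u1161\u11a8 legitimately composes to.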

----------
components: Unicode
files: u1176.patch
keywords: patch
messages: 287077
nosy: ezio.melotti, haypo, pusnow
priority: normal
severity: normal
status: open
title: bug in unicodedata.normalize: u1176
versions: Python 2.7, Python 3.6
Added file: http://bugs.python.org/file46535/u1176.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29456>
_______________________________________