[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-18 Thread Xiang Zhang


Xiang Zhang  added the comment:

Thanks for the confirmation, Ma Lin. Thanks also to Wonsup!

--
components: +Unicode -Library (Lib)

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-16 Thread Ma Lin


Ma Lin  added the comment:

You are right.

I found a Normalization Test Suite for Unicode 3.2
http://www.unicode.org/Public/3.2-Update/NormalizationTest-3.2.0.txt

\u1176 is outside the valid range for the second character (the vowel,
U+1161..U+1175).
\u11a7 and \u11c3 are outside the valid range for the third character (the
trailing consonant, U+11A8..U+11C2).
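
To make the ranges concrete, here is a minimal sketch of the two range checks,
using the Hangul constants from the Unicode Standard. The function names are
made up for illustration and are not taken from the CPython patch.

```python
# Hangul jamo constants from the Unicode Standard's Hangul composition algorithm.
V_BASE, T_BASE = 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28


def is_composable_vowel(ch):
    """True if ch can be the second (vowel) character of a syllable."""
    return 0 <= ord(ch) - V_BASE < V_COUNT


def is_composable_trailing(ch):
    """True if ch can be the third (trailing consonant) character."""
    # TIndex 0 stands for "no trailing consonant", so it is excluded.
    return 0 < ord(ch) - T_BASE < T_COUNT


for ch in "\u1175\u1176":
    print(f"U+{ord(ch):04X} vowel: {is_composable_vowel(ch)}")
for ch in "\u11a7\u11a8\u11c2\u11c3":
    print(f"U+{ord(ch):04X} trailing: {is_composable_trailing(ch)}")
```

U+1176 is V_BASE + V_COUNT, U+11A7 is T_BASE itself, and U+11C3 is
T_BASE + T_COUNT, so all three sit exactly on the excluded boundaries.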

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread Xiang Zhang


Xiang Zhang  added the comment:

As I said, I checked the Hangul composition algorithm in Unicode 3.0. It looks
consistent with Unicode 4.1+. Unicode 3.0 only has a description, with no
sample implementation. So I think the changed code also applies to Unicode 3.0+.

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread Ma Lin


Ma Lin  added the comment:

> We have a ucd_3_2_0 in unicodedata.

This 3.2 database in unicodedata is probably there for IDNA2003.
IDNA2003 has a step that normalizes the domain name string to Unicode
Normalization Form C.

We have now changed the Hangul composition code to follow Unicode Standard
4.1+, and the fix also applies to versions before Unicode Standard 4.1.
Could this (the Unicode Standard 4.1+ behavior) cause a security vulnerability
for someone who is using IDNA2003 via ucd_3_2_0?
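
For anyone who wants to check this on their own build, here is a small sketch
that runs the same inputs through the current database and through
unicodedata.ucd_3_2_0. The sample strings and the describe() helper are my own
choices for illustration. The sketch only prints what each database returns;
it does not answer the security question.

```python
import unicodedata

legacy = unicodedata.ucd_3_2_0  # the Unicode 3.2.0 database object

# Hypothetical sample inputs: one valid sequence and the three boundary cases.
samples = {
    "valid L+V+T": "\u1100\u1161\u11a8",
    "L + U+1176": "\u1100\u1176",
    "LV + U+11A7": "\uac00\u11a7",
    "LV + U+11C3": "\uac00\u11c3",
}


def describe(text):
    """Render a string as a space-separated list of code points."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


for label, text in samples.items():
    current_nfc = unicodedata.normalize("NFC", text)
    legacy_nfc = legacy.normalize("NFC", text)
    print(f"{label}: current={describe(current_nfc)} legacy={describe(legacy_nfc)}")
```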

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread Xiang Zhang


Change by Xiang Zhang :


--
components: +Library (Lib) -Unicode
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread Xiang Zhang


Xiang Zhang  added the comment:


New changeset 1889c4cbd62e200fa4cde3d6219e0aadf9bd8149 by Xiang Zhang in branch 
'2.7':
bpo-29456: Fix bugs in unicodedata.normalize: u1176, u11a7 and u11c3 (GH-1958) 
(GH-7704)
https://github.com/python/cpython/commit/1889c4cbd62e200fa4cde3d6219e0aadf9bd8149


--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread miss-islington


miss-islington  added the comment:


New changeset e2e7ff0d0378ba44f10a1aae10e4bee957fb44d2 by Miss Islington (bot) 
in branch '3.6':
bpo-29456: Fix bugs in unicodedata.normalize: u1176, u11a7 and u11c3 (GH-1958)
https://github.com/python/cpython/commit/e2e7ff0d0378ba44f10a1aae10e4bee957fb44d2


--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread Xiang Zhang


Change by Xiang Zhang :


--
pull_requests: +7320




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread miss-islington


miss-islington  added the comment:


New changeset 0e2b76ea4e48d0fc1ca34ae4ffbb2fd6c19664bb by Miss Islington (bot) 
in branch '3.7':
bpo-29456: Fix bugs in unicodedata.normalize: u1176, u11a7 and u11c3 (GH-1958)
https://github.com/python/cpython/commit/0e2b76ea4e48d0fc1ca34ae4ffbb2fd6c19664bb


--
nosy: +miss-islington




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread miss-islington


Change by miss-islington :


--
pull_requests: +7318




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread miss-islington


Change by miss-islington :


--
pull_requests: +7319




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread Xiang Zhang


Xiang Zhang  added the comment:


New changeset d134809cd3764c6a634eab7bb8995e3e2eff14d5 by Xiang Zhang (Wonsup 
Yoon) in branch 'master':
bpo-29456: Fix bugs in unicodedata.normalize: u1176, u11a7 and u11c3 (GH-1958)
https://github.com/python/cpython/commit/d134809cd3764c6a634eab7bb8995e3e2eff14d5


--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-06-15 Thread Xiang Zhang


Xiang Zhang  added the comment:

Sorry for the absence and the late response. I just reviewed it and think it's
ready. I think the change in the Unicode standard is more a fix for a bug in
the sample implementation than an intentional change. Unicode 3.0 already says
the third character is out of bounds when TIndex <= 0 or TIndex >= TCount. We
have a ucd_3_2_0 in unicodedata.

I'll merge it once the CI bot issue is resolved.

--
versions: +Python 3.8 -Python 3.5




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-04-12 Thread Wonsup Yoon

Wonsup Yoon  added the comment:

Hello!

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2018-02-28 Thread Ma Lin

Ma Lin  added the comment:

ping, this was forgotten.

--
nosy: +Ma Lin




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-08-27 Thread Wonsup Yoon

Wonsup Yoon added the comment:

Hello?

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-08-19 Thread Wonsup Yoon

Wonsup Yoon added the comment:

This patch applies the changes made in Unicode 4.1.0.
I think it is well reviewed and it is time to merge.
Who can commit this patch?

@animalize says:
Let me give a supplement:

Before Unicode 4.1.0 (draft), the check was: TBase <= code <= TBase+TCount
see: http://www.unicode.org/reports/tr15/tr15-24.html#hangul_composition

Since Unicode 4.1.0, the check is: TBase < code < TBase+TCount, which is in
line with the latest version (Unicode 10.0)
see: http://www.unicode.org/reports/tr15/tr15-25.html#hangul_composition

This change happened in 2005.
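
To show why the two boundary values matter, here is a minimal sketch of the
third-character step of Hangul composition with both versions of the check.
The compose_lvt function and its strict flag are hypothetical names for
illustration; this is not the code from the patch or from the Unicode sample
implementation.

```python
T_BASE, T_COUNT = 0x11A7, 28


def compose_lvt(lv, t, strict=True):
    """Combine an LV syllable with a trailing jamo t.

    Returns the composed syllable, or None if t is rejected by the check.
    strict=True  uses TBase <  code <  TBase+TCount (Unicode 4.1.0 and later).
    strict=False uses TBase <= code <= TBase+TCount (the older wording).
    """
    t_index = ord(t) - T_BASE
    if strict:
        in_range = 0 < t_index < T_COUNT
    else:
        in_range = 0 <= t_index <= T_COUNT
    return chr(ord(lv) + t_index) if in_range else None


GA = "\uac00"  # an LV syllable with no trailing consonant

for t in "\u11a7\u11a8\u11c3":
    for strict in (False, True):
        result = compose_lvt(GA, t, strict=strict)
        shown = "None" if result is None else f"U+{ord(result):04X}"
        print(f"t=U+{ord(t):04X} strict={strict}: {shown}")
```

With the inclusive check, U+11A7 is silently absorbed (the result is the
unchanged U+AC00) and U+11C3 yields U+AC1C, which is a different LV syllable.
With the strict check, both are left uncomposed.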

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-08-09 Thread Xiang Zhang

Xiang Zhang added the comment:

Hi Wonsup, sorry for the delay. I have been really busy with work these days.
If no one else gets involved, I'll try to find time to review your patch this
week.

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-08-09 Thread Xiang Zhang

Changes by Xiang Zhang :


Removed file: http://bugs.python.org/file47070/800.jpg




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-08-09 Thread 高可爱

Changes by 高可爱 :


Added file: http://bugs.python.org/file47070/800.jpg




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-08-09 Thread Wonsup Yoon

Wonsup Yoon added the comment:

I think it can be merged. Is there anything I need to do?

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-08-02 Thread Wonsup Yoon

Wonsup Yoon added the comment:

I added some test cases for this issue. Could someone please check them?

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-07-26 Thread Wonsup Yoon

Wonsup Yoon added the comment:

Any updates? I need this fix for my project.

--




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-06-05 Thread Wonsup Yoon

Changes by Wonsup Yoon :


--
pull_requests: +2029




[issue29456] bugs in unicodedata.normalize: u1176, u11a7 and u11c3

2017-06-05 Thread Wonsup Yoon

Changes by Wonsup Yoon :


--
title: bug in unicodedata.normalize: u1176, u11a7 and u11c3 -> bugs in 
unicodedata.normalize: u1176, u11a7 and u11c3
