Victor Ruiz vic...@ninibe.com added the comment:
Hi,
I think I've come across what seems to be another flavor of this issue. The
following string will cause a crash in some interpreters.
text =
u\u062d\u064e\u064a\u0651\u064b\u0627\u060c\u0648\u064e\u064a\u064e\u062d\u0650\u0642\u0651\u064e
Changes by Alexander Belopolsky alexander.belopol...@gmail.com:
--
status: closed -> open
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10254
___
Alexander Belopolsky alexander.belopol...@gmail.com added the comment:
This new data does not crash Python 2.7.2, so I assume the issue has been
fixed. Re-closing.
--
status: open -> closed
STINNER Victor victor.stin...@haypocalc.com added the comment:
> This new data does not crash Python 2.7.2, so I assume the issue has been
> fixed.
Yes, the bug was already fixed in branch 2.7 by the SVN commit r87541:
changeset: 67185:54f1d5651555
branch: 2.7
parent:
STINNER Victor victor.stin...@haypocalc.com added the comment:
> This fix is part of Python 2.7.2, but not of 2.7.2.
... but not of 2.7.1.
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Committed backports:
r87540 (3.1)
r87541 (2.7)
r87546 (2.6)
--
resolution: -> fixed
stage: commit review -> committed/rejected
status: open -> closed
versions: +Python 3.2
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Committed to py3k in revision 87442.
--
versions: -Python 3.2
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
In the new patch, issue10254b.diff, I've added a test that would crash
unpatched code:
unicodedata.normalize('NFC', 'C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸Ç')
Segmentation fault
Martin, I still feel uneasy about the
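
A minimal sketch of such a regression check follows. The exact code points are an assumption, since the thread only shows the rendered string: the input is taken to be twenty repetitions of C + U+0338 (COMBINING LONG SOLIDUS OVERLAY) followed by C + U+0327 (COMBINING CEDILLA). On a patched interpreter the call should return normally, with only the final pair composed into U+00C7.

```python
import unicodedata

# Assumed reconstruction of the crashing input (not confirmed by the thread):
# 20 x (C + COMBINING LONG SOLIDUS OVERLAY), then C + COMBINING CEDILLA.
decomposed = 'C\u0338' * 20 + 'C\u0327'

# C + U+0338 has no precomposed form, so only the last pair composes.
expected = 'C\u0338' * 20 + '\u00c7'

result = unicodedata.normalize('NFC', decomposed)
print(result == expected)
```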
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Attached patch, issue10254a.diff, adds the OP's cases to test_unicodedata and
changes the code as I suggested in msg124173 because ISTM that comb >= comb1
matches the PR-29 definition:
D2'. In any character sequence
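
For illustration only, the blocking rule from PR-29 can be sketched in Python. This is not the C code from unicodedata.c; the function name `is_blocked` and its index-based signature are hypothetical. It applies the rule as stated in the Unicode standard: a character C is blocked from a starter S if some character B between them is itself a starter or has a combining class greater than or equal to that of C.

```python
import unicodedata

def is_blocked(text, starter_index, candidate_index):
    """Return True if text[candidate_index] is blocked from text[starter_index].

    Hypothetical helper: a character between the two blocks the candidate
    when it is a starter (class 0) or its canonical combining class is
    greater than or equal to the candidate's class.
    """
    candidate_class = unicodedata.combining(text[candidate_index])
    for between in text[starter_index + 1:candidate_index]:
        between_class = unicodedata.combining(between)
        if between_class == 0 or between_class >= candidate_class:
            return True
    return False

# PR-29 example: U+0B47, U+0300, U+0B3E. The vowel sign U+0B3E (class 0)
# is blocked from U+0B47 by the intervening U+0300 (class 230).
pr29_sequence = '\u0b47\u0300\u0b3e'
print(is_blocked(pr29_sequence, 0, 2))
```

Note that the comparison uses `>=`, not `==`: a mark of higher class between the starter and the candidate blocks composition just as a mark of equal class does.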
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Mon, Dec 20, 2010 at 2:50 PM, Alexander Belopolsky
rep...@bugs.python.org wrote:
..
Unfortunately, all tests pass with either comb >= comb1 or comb == comb1, so
before I commit, I would like to figure out the test
Martin v. Löwis mar...@v.loewis.de added the comment:
On 17.12.2010 at 01:56, STINNER Victor wrote:
> STINNER Victor victor.stin...@haypocalc.com added the comment:
>
> Ooops, sorry. I just applied the patch suggested by Marc-Andre
> Lemburg in msg22885 (#1054943). As the patch worked for the
Martin v. Löwis mar...@v.loewis.de added the comment:
> So lacking a new patch, I think we should revert the existing change
> for now.
Oops, I missed that Alexander has proposed a patch.
Martin v. Löwis mar...@v.loewis.de added the comment:
> The logic suggested by Martin in msg120018 looks right to me, but the
> whole code seems to be unnecessarily complex. (And comb1==comb may
> need to be changed to comb1>=comb.) I don't understand why linear
> search through skipped array is
Martin v. Löwis mar...@v.loewis.de added the comment:
> Passing Part3 tests and not crashing on crash.py is probably good
> enough for a commit, but I don't have a proof that length 20 skipped
> buffer is always enough.
I would agree with that. I still didn't have time to fully review the
patch,
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Fri, Dec 17, 2010 at 3:47 AM, Martin v. Löwis rep...@bugs.python.org wrote:
..
The worst case (wrt. cskipped) is the maximum number of characters that
can get combined into a single base character. It used to be (and I
Martin v. Löwis mar...@v.loewis.de added the comment:
The C forms (NFC and NFKC) do canonical composition and U+FDFA is a
compatibility composite. (BTW, makeunicodedata.py checks that maximum
decomposed length of a character is 19, but it would be better if it
would compute and define a
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Fri, Dec 17, 2010 at 2:08 PM, Martin v. Löwis rep...@bugs.python.org wrote:
..
As far as I (and a two-line script) can tell
the maximum length of a canonical decomposition of a character is 4.
Even better - so
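
The "two-line script" itself is not shown in the thread. One way to measure the same quantity, as a rough sketch, is to NFD-normalize every code point and take the longest result; the names `canonical_length`, `longest` and `max_length` are illustrative.

```python
import sys
import unicodedata

def canonical_length(code_point):
    """Length in code points of the full canonical (NFD) decomposition."""
    return len(unicodedata.normalize('NFD', chr(code_point)))

# Scan the whole code space; the result depends on the Unicode database
# version bundled with the running interpreter.
longest = max(range(sys.maxunicode + 1), key=canonical_length)
max_length = canonical_length(longest)
print(hex(longest), max_length)
```

This measures the full recursive decomposition rather than a single decomposition mapping, which is the quantity that matters for sizing a buffer.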
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Adding an assert as shown in the diff below, makes it easy to reproduce the
crash in py3k branch:
$ ./python.exe crash.py
Assertion failed: (cskipped < 20), function nfc_nfkc, file
Modules/unicodedata.c, line 714.
Abort
STINNER Victor victor.stin...@haypocalc.com added the comment:
Ooops, sorry. I just applied the patch suggested by Marc-Andre Lemburg in
msg22885 (#1054943). As the patch worked for the examples given in Unicode PRI
29 and the test suite passed, it was enough for me. I don't understand the
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
The logic suggested by Martin in msg120018 looks right to me, but the whole
code seems to be unnecessarily complex. (And comb1==comb may need to be
changed to comb1>=comb.) I don't understand why linear search through
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Attached patch, issue10254.diff, is essentially Martin's code from msg120018
and Part3 tests from NormalizationTest.txt.
Since this bug exposes a buffer overflow condition, I think it qualifies as a
security issue, so I
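
As a rough illustration of how entries in NormalizationTest.txt can be checked (this is not the code from the patch): each data line holds five semicolon-separated fields c1..c5 of space-separated hex code points, followed by a comment. For NFC, the file's documented invariants are c2 == NFC(c1) == NFC(c2) == NFC(c3) and c4 == NFC(c4) == NFC(c5). The helper names and the sample line below are hand-written assumptions, not copied from the file.

```python
import unicodedata

def parse_fields(line):
    """Decode the five hex-encoded columns of one test line into strings."""
    data = line.split('#', 1)[0]          # drop the trailing comment
    columns = data.split(';')[:5]         # c1..c5
    return [''.join(chr(int(cp, 16)) for cp in column.split())
            for column in columns]

def nfc_invariants_hold(line):
    """Check the NFC conformance invariants for a single test line."""
    c1, c2, c3, c4, c5 = parse_fields(line)

    def nfc(text):
        return unicodedata.normalize('NFC', text)

    return (c2 == nfc(c1) == nfc(c2) == nfc(c3)) and (c4 == nfc(c4) == nfc(c5))

# Hand-written sample in the file's format (A WITH RING ABOVE).
sample_line = '00C5;00C5;0041 030A;00C5;0041 030A; # hand-written sample'
print(nfc_invariants_hold(sample_line))
```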
Jonathan Halcrow jonathan.halc...@gmail.com added the comment:
I think I've come across a related problem. I am experiencing a segfault when
NFC-normalizing a certain string [1].
The crash occurs with 2.7.1 in OS X (built from source with homebrew).
Here is the backtrace:
#0 0x0025a96e in
Antoine Pitrou pit...@free.fr added the comment:
I can reproduce the crash under 2.7, but not 2.6 or 3.x here. So it might be a
separate issue.
Antoine Pitrou pit...@free.fr added the comment:
After a bit of debugging, the crash is due to the skipped array being
overflowed in nfc_nfkc() in unicodedata.c. cskipped goes up to 21 while the
array only has 20 entries. This happens in all branches (but only crashes in
2.7 right now for
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever
Changes by Ezio Melotti ezio.melo...@gmail.com:
--
nosy: +ezio.melotti
New submission from Merlijn van Deen valhall...@gmail.com:
Summary: Somewhere between 2.6.5 r79063 and 3.1 r79147 a regression in the
Unicode NFC normalization has been introduced. This regression leads to bot
edit wars on Wikipedia [1]. It is reproducible with a simple script [2].
Merlijn van Deen valhall...@gmail.com added the comment:
Please note: The bug might very well be present in Python 3.2 and 3.3. However,
I do not have these versions installed, so I cannot confirm this.
Antoine Pitrou pit...@free.fr added the comment:
Confirmed on Python 3.2.
--
nosy: +haypo, loewis, pitrou
versions: +Python 3.2
Martin v. Löwis mar...@v.loewis.de added the comment:
The change from issue1054943 is indeed bogus. As written, the code will happily
run over starters, even though a blocked starter means that subsequent characters
can't possibly be combinable. That way, the code manages to combine, in
Marc-Andre Lemburg m...@egenix.com added the comment:
Martin v. Löwis wrote:
> It's unfortunate that the patch had been backported to 2.6.6; we can't fix it
> there anymore.
Why not? It looks a lot like a security fix.
--
nosy: +lemburg
Martin v. Löwis mar...@v.loewis.de added the comment:
>> It's unfortunate that the patch had been backported to 2.6.6; we can't fix
>> it there anymore.
> Why not? It looks a lot like a security fix.
Indeed, you could argue that. It's up to the 2.6 release manager, I guess.
--
Changes by R. David Murray rdmur...@bitdance.com:
--
nosy: +barry