[issue22362] Warn about octal escapes 0o377 in re

2014-09-23 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 3b32f495fb38 by Serhiy Storchaka in branch 'default':
Issue #22362: Forbidden ambiguous octal escapes out of range 0-0o377 in
https://hg.python.org/cpython/rev/3b32f495fb38

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-23 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Thanks Antoine and Victor for the review.

--
resolution:  - fixed
stage: patch review - resolved
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-21 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

If this is error, should the patch be applied to maintained releases?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-18 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Warning or exception? This is a question.

--
assignee:  - serhiy.storchaka

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-18 Thread STINNER Victor

STINNER Victor added the comment:

 Warning or exception? This is a question.

Using -Werror, warnings raise exceptions :-)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-18 Thread Antoine Pitrou

Antoine Pitrou added the comment:

This is an error, so it should really be an exception. There's no use case for 
being lenient, IMO.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-12 Thread STINNER Victor

STINNER Victor added the comment:

re_octal_escape_overflow_raise.patch: you should write a subfunction to not 
repeat the error message 3 times.

+if c  0o377:

Hum, I never use octal. 255 instead of 0o377 would be less surprising :-p By 
the way, you should also check for negative numbers.

 -3  0xff
253

Before,  0xff also converted negative numbers to positive in range 0..255.

--
nosy: +haypo

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-12 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 By the way, you should also check for negative numbers.

Not in this case. You can't construct negative number from three octal digits.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-11 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I think we should simply raise ValueError in 3.5. There's no reason to accept 
such invalid escapes.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22362] Warn about octal escapes 0o377 in re

2014-09-11 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Well, here is a patch which makes re raise an exception on suspicious octals.

--
Added file: 
http://bugs.python.org/file36602/re_octal_escape_overflow_raise.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___diff -r 180f5bf7d1b9 Lib/sre_parse.py
--- a/Lib/sre_parse.py  Thu Sep 11 14:33:02 2014 +0300
+++ b/Lib/sre_parse.py  Thu Sep 11 23:31:31 2014 +0300
@@ -283,7 +283,11 @@ def _class_escape(source, escape):
 elif c in OCTDIGITS:
 # octal escape (up to three digits)
 escape += source.getwhile(2, OCTDIGITS)
-return LITERAL, int(escape[1:], 8)  0xff
+c = int(escape[1:], 8)
+if c  0o377:
+raise error('octal escape value %r outside of '
+'range 0-0o377' % escape)
+return LITERAL, c
 elif c in DIGITS:
 raise ValueError
 if len(escape) == 2:
@@ -325,7 +329,7 @@ def _escape(source, escape, state):
 elif c == 0:
 # octal escape
 escape += source.getwhile(2, OCTDIGITS)
-return LITERAL, int(escape[1:], 8)  0xff
+return LITERAL, int(escape[1:], 8)
 elif c in DIGITS:
 # octal escape *or* decimal group reference (sigh)
 if source.next in DIGITS:
@@ -334,7 +338,11 @@ def _escape(source, escape, state):
 source.next in OCTDIGITS):
 # got three octal digits; this is an octal escape
 escape = escape + source.get()
-return LITERAL, int(escape[1:], 8)  0xff
+c = int(escape[1:], 8)
+if c  0o377:
+raise error('octal escape value %r outside of '
+'range 0-0o377' % escape)
+return LITERAL, c
 # not an octal escape, so this is a group reference
 group = int(escape[1:])
 if group  state.groups:
@@ -825,7 +833,11 @@ def parse_template(source, pattern):
 s.next in OCTDIGITS):
 this += sget()
 isoctal = True
-lappend(chr(int(this[1:], 8)  0xff))
+c = int(this[1:], 8)
+if c  0o377:
+raise error('octal escape value %r outside of '
+'range 0-0o377' % this)
+lappend(chr(c))
 if not isoctal:
 addgroup(int(this[1:]))
 else:
diff -r 180f5bf7d1b9 Lib/test/test_re.py
--- a/Lib/test/test_re.py   Thu Sep 11 14:33:02 2014 +0300
+++ b/Lib/test/test_re.py   Thu Sep 11 23:31:31 2014 +0300
@@ -154,8 +154,8 @@ class ReTests(unittest.TestCase):
 self.assertEqual(re.sub('x', r'\09', 'x'), '\0' + '9')
 self.assertEqual(re.sub('x', r'\0a', 'x'), '\0' + 'a')
 
-self.assertEqual(re.sub('x', r'\400', 'x'), '\0')
-self.assertEqual(re.sub('x', r'\777', 'x'), '\377')
+self.assertRaises(re.error, re.sub, 'x', r'\400', 'x')
+self.assertRaises(re.error, re.sub, 'x', r'\777', 'x')
 
 self.assertRaises(re.error, re.sub, 'x', r'\1', 'x')
 self.assertRaises(re.error, re.sub, 'x', r'\8', 'x')
@@ -691,7 +691,7 @@ class ReTests(unittest.TestCase):
 self.assertIsNotNone(re.match(r\08, \0008))
 self.assertIsNotNone(re.match(r\01, \001))
 self.assertIsNotNone(re.match(r\018, \0018))
-self.assertIsNotNone(re.match(r\567, chr(0o167)))
+self.assertRaises(re.error, re.match, r\567, )
 self.assertRaises(re.error, re.match, r\911, )
 self.assertRaises(re.error, re.match, r\x1, )
 self.assertRaises(re.error, re.match, r\x1z, )
@@ -719,6 +719,7 @@ class ReTests(unittest.TestCase):
 self.assertIsNotNone(re.match(r[\U%08x] % i, chr(i)))
 self.assertIsNotNone(re.match(r[\U%08x0] % i, chr(i)+0))
 self.assertIsNotNone(re.match(r[\U%08xz] % i, chr(i)+z))
+self.assertRaises(re.error, re.match, r[\567], )
 self.assertIsNotNone(re.match(r[\U0001d49c-\U0001d4b5], 
\U0001d49e))
 self.assertRaises(re.error, re.match, r[\911], )
 self.assertRaises(re.error, re.match, r[\x1z], )
@@ -740,7 +741,7 @@ class ReTests(unittest.TestCase):
 self.assertIsNotNone(re.match(br\08, b\0008))
 self.assertIsNotNone(re.match(br\01, b\001))
 self.assertIsNotNone(re.match(br\018, b\0018))
-self.assertIsNotNone(re.match(br\567, bytes([0o167])))
+self.assertRaises(re.error, re.match, br\567, b)
 self.assertRaises(re.error, re.match, br\911, b)
 self.assertRaises(re.error, re.match, br\x1, b)
 self.assertRaises(re.error, re.match, br\x1z, b)
@@ -755,6 

[issue22362] Warn about octal escapes 0o377 in re

2014-09-08 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

Currently the re module accepts octal escapes from \400 to \777, but ignore 
highest bit.

 re.search(r'\542', 'abc')
_sre.SRE_Match object; span=(1, 2), match='b'

This behavior looks surprising and is inconsistent with the regex module which 
preserve highest bit. Such escaping is not portable across different regular 
exception engines.

I propose to add a warning when octal escape value is larger than 0o377. Here 
is preliminary patch which adds UserWarning. Or may be better to emit 
DeprecationWarning and then replace it by ValueError in future releases?

--
components: Library (Lib), Regular Expressions
files: re_octal_escape_overflow.patch
keywords: patch
messages: 226570
nosy: ezio.melotti, mrabarnett, pitrou, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Warn about octal escapes  0o377 in re
type: enhancement
versions: Python 3.5
Added file: http://bugs.python.org/file36571/re_octal_escape_overflow.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22362
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com