[issue22817] re.split fails with lookahead/behind
Serhiy Storchaka added the comment:

It is possible to change this behavior (see example patch). With this patch:

>>> re.split(r'(?<=CA)|(?=GCTG)', 'ACGTCAGCTGAAAAGCTGACGTACGT')
['ACGTCA', 'GCTGAAAA', 'GCTGACGTACGT']
>>> re.split(r'\b', 'the quick, brown fox')
['', 'the', ' ', 'quick', ', ', 'brown', ' ', 'fox', '']

But unfortunately this is a backward incompatible change and will likely break existing code (and it breaks tests). Consider the following example: re.split('(:*)', 'ab'). Currently the result is ['ab'], but with the patch it is ['', '', 'a', '', 'b', '', ''].

The third-party regex module [1] has the V1 flag, which switches on the incompatible behavior change.

>>> regex.split('(:*)', 'ab')
['ab']
>>> regex.split('(?V1)(:*)', 'ab')
['', '', 'a', '', 'b', '', '']
>>> regex.split(r'(?<=CA)|(?=GCTG)', 'ACGTCAGCTGAAAAGCTGACGTACGT')
['ACGTCAGCTGAAAAGCTGACGTACGT']
>>> regex.split(r'(?V1)(?<=CA)|(?=GCTG)', 'ACGTCAGCTGAAAAGCTGACGTACGT')
['ACGTCA', 'GCTGAAAA', 'GCTGACGTACGT']
>>> regex.split(r'\b', 'the quick, brown fox')
['the quick, brown fox']
>>> regex.split(r'(?V1)\b', 'the quick, brown fox')
['', 'the', ' ', 'quick', ', ', 'brown', ' ', 'fox', '']

I don't know how to solve this issue without introducing such a flag (or adding a special boolean argument to re.split()). As a workaround I suggest you use the regex module.

[1] https://pypi.python.org/pypi/regex

--
keywords: +patch
Added file: http://bugs.python.org/file37147/re_split_zero_width.patch
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22817
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
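As an alternative to the regex workaround, splitting at zero-width matches can be sketched with re.finditer, which reports empty matches regardless of how re.split treats them. This is only a minimal sketch: the helper name split_at_matches is hypothetical (it is not part of re, regex, or the attached patch), and unlike re.split it does not insert captured groups into the result.

```python
import re


def split_at_matches(pattern, text):
    """Split text at every match of pattern, zero-width matches included.

    Hypothetical helper for illustration; captured groups are ignored.
    """
    pieces = []
    start = 0
    for match in re.finditer(pattern, text):
        # Keep the text between the previous match and this one.
        pieces.append(text[start:match.start()])
        start = match.end()
    # Keep whatever follows the last match.
    pieces.append(text[start:])
    return pieces


words = split_at_matches(r'\b', 'the quick, brown fox')
dna = split_at_matches(r'(?<=CA)|(?=GCTG)', 'ACGTCAGCTGAAAAGCTGACGTACGT')
print(words)
print(dna)
```

Because the helper never calls re.split, it should behave the same whether or not the interpreter's re.split honours empty matches.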
[issue22789] Compress the marshalled data in PYC files
Serhiy Storchaka added the comment: Compressing pyc files one by one wouldn't save much space because disk space is allocated in blocks (up to 32 KiB on FAT32). If the size of a pyc file is less than the block size, we will not gain anything. A ZIP file has an advantage due to more compact packing of files. In addition, it can have lower access time due to less fragmentation. Unfortunately it doesn't support LZ4 compression, but we can store LZ4-compressed files in a ZIP file without additional compression. An uncompressed TAR file has the same advantages but needs a longer initialization time (for building the index). -- nosy: +serhiy.storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22789 ___
[issue22817] re.split fails with lookahead/behind
Serhiy Storchaka added the comment: Previous attempts to solve this issue: issue852532, issue988761, issue3262. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22817 ___
[issue22789] Compress the marshalled data in PYC files
Marc-Andre Lemburg added the comment:

On 08.11.2014 10:28, Serhiy Storchaka wrote:
> Compressing pyc files one by one wouldn't save much space because disk space is allocated by blocks (up to 32 KiB on FAT32). If the size of pyc file is less than block size, we will not gain anything. ZIP file has advantage due more compact packing of files. In additional it can has less access time due to less fragmentation. Unfortunately it doesn't support the LZ4 compression, but we can store LZ4 compressed files in ZIP file without additional compression. Uncompressed TAR file has same advantages but needs longer initialization time (for building the index).

The aim is to reduce file load time, not really to save disk space. By having less data to read from the disk, it may be possible to achieve a small startup speedup.

However, you're right in that using a single archive with many PYC files would be more efficient, since it lowers the number of stat() calls. The trick of storing LZ4-compressed data in a ZIP file would enable this.

BTW: We could add optional LZ4 compression to the marshal format to make all this work transparently and without having to change the import mechanism itself: we'd just need to add a new flag or type code indicating that the rest of the stream is LZ4 compressed. The PYC writer could then enable this flag or type code by default (or perhaps have it enabled via some env var or command line flag), and everything would then just work with both LZ4-compressed byte code and non-compressed byte code.

--
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22789
___
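The flag idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed marshal change: LZ4 is not in the standard library, so zlib stands in for it, and the one-byte markers FLAG_RAW and FLAG_COMPRESSED as well as the helpers dump_code and load_code are hypothetical names that wrap marshal from the outside rather than extending its format.

```python
import marshal
import zlib

# Hypothetical one-byte markers placed in front of the marshalled payload.
FLAG_RAW = b"\x00"         # payload is plain marshal data
FLAG_COMPRESSED = b"\x01"  # payload is compressed (zlib here, standing in for LZ4)


def dump_code(code, compress=True):
    """Serialize a code object, optionally compressing the marshal stream."""
    payload = marshal.dumps(code)
    if compress:
        return FLAG_COMPRESSED + zlib.compress(payload)
    return FLAG_RAW + payload


def load_code(data):
    """Load a code object written by dump_code, whichever flag it carries."""
    flag, payload = data[:1], data[1:]
    if flag == FLAG_COMPRESSED:
        payload = zlib.decompress(payload)
    return marshal.loads(payload)


source = "def square(x):\n    return x * x\n"
code = compile(source, "<example>", "exec")

for blob in (dump_code(code, compress=False), dump_code(code, compress=True)):
    namespace = {}
    exec(load_code(blob), namespace)
    print(len(blob), namespace["square"](7))
```

The reader handles both variants through the same entry point, which is the property the message relies on: the import mechanism itself would not need to know which form it was given.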
[issue22818] Deprecate splitting on possible zero-width re patterns
New submission from Serhiy Storchaka: For now re.split doesn't split on zero-width matches. There are a number of issues about this (issue852532, issue988761, issue3262, issue22817). This is definitely a bug, but fixing it will likely break existing code that uses regular expressions which can match zero-width (e.g. re.split('(:*)', 'ab')). I propose to deprecate splitting on regular expressions that can possibly match zero-width. Such expressions either do not work as expected at all (r'\b' never splits) or can be rewritten so as not to match an empty string ('(:*)' to '(:+)'). In the next release (3.6) we can convert the deprecation warning to an exception, and then after a transitional period change the behavior to more correct handling of zero-width matches without breaking backward compatibility. -- components: Extension Modules, Regular Expressions files: re_deprecate_split_zero_width.patch keywords: patch messages: 230843 nosy: ezio.melotti, mrabarnett, pitrou, serhiy.storchaka priority: normal severity: normal stage: patch review status: open title: Deprecate splitting on possible zero-width re patterns type: behavior versions: Python 3.5 Added file: http://bugs.python.org/file37148/re_deprecate_split_zero_width.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22818 ___
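The suggested rewrite from '(:*)' to '(:+)' can be illustrated with a short sketch. The input strings are made up for the example; the point is only that a pattern which cannot match the empty string gives an unambiguous split.

```python
import re

# '(:+)' needs at least one colon, so it can never match zero-width.
with_colons = re.split(r'(:+)', 'a::b:c')
without_colons = re.split(r'(:+)', 'ab')

print(with_colons)
print(without_colons)
```

Because the separator is captured in a group, the runs of colons are kept in the result alongside the pieces between them.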
[issue22819] Python3.4: xml.sax.saxutils.XMLGenerator.__init__ fails with pythonw.exe
New submission from Edward K. Ream: In Python3.2 xml.sax.saxutils.XMLGenerator.__init__ succeeds if the out keyword argument is not given and sys.stdout is None, which will typically be the case when using pythonw.exe. Alas, on Python3.4, the ctor throws an exception in this case. This is a major compatibility issue, and is completely unnecessary: the ctor should work as before. An easy fix: allocate a file-like object as the out stream, or just do what is done in Python 3.2 ;-) -- components: Library (Lib) messages: 230844 nosy: Edward.K..Ream priority: normal severity: normal status: open title: Python3.4: xml.sax.saxutils.XMLGenerator.__init__ fails with pythonw.exe type: crash versions: Python 3.4 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22819 ___
[issue22820] RESTART line with no output
New submission from Edward Alexander:

Whenever I run my code in the Python IDLE editor, the output is as follows:

== RESTART

I am a newbie, and it seems I cannot move on from this point. This is my code:

def convert_to_celsius(fahrenheit):
    return (fahrenheit - 32) * 5.0 / 9.0

convert_to_celsius(80)

--
components: IDLE messages: 230845 nosy: sukari priority: normal severity: normal status: open title: RESTART line with no output type: crash versions: Python 3.3
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22820
___
[issue2636] Adding a new regex module (compatible with re)
Serhiy Storchaka added the comment:

Here is my (slowly implemented) plan:

0. Recommend regex as an advanced replacement for re (issue22594).
1. Fix all obvious bugs in the re module if this doesn't break backward compatibility (issue12728, issue14260, and many already closed issues).
2. Deprecate and then forbid behavior which looks like a bug, doesn't match regex in V1 mode and can't be fixed without breaking backward compatibility (issue22407, issue22493, issue22818).
3. Unify minor details with regex (issue22364, issue22578).
4. Fork regex and drop all advanced nonstandard features (such as fuzzy matching). Too many features make learning and using the module harder. They should be in an advanced module (regex).
5. Write benchmarks which cover all corner cases and compare re with regex case by case. Optimize the slower module. Currently re is faster than regex for all simple examples which I tried (maybe as a result of issue18685), but in the total results of benchmarks (msg109447) re is slower.
6. Maybe implement some standard features which were rejected in favor of this issue (issue433028, issue433030). re should conform to at least Level 1 of UTS #18 (http://www.unicode.org/reports/tr18/#Basic_Unicode_Support).

In the best case, in 3.7 or 3.8 we could replace re with a simplified regex. Or by that time re will be free from bugs and warts.

--
nosy: +serhiy.storchaka
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636
___
[issue22819] Python3.4: xml.sax.saxutils.XMLGenerator.__init__ fails with pythonw.exe
Serhiy Storchaka added the comment: In any case XMLGenerator is not usable if the out keyword argument is not given and sys.stdout is None. The exception would just be raised later. I consider early failure a feature, not a bug. -- nosy: +serhiy.storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22819 ___
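For code that may run under pythonw.exe, one way to avoid depending on sys.stdout is to always hand XMLGenerator an explicit stream. The sketch below assumes an in-memory io.StringIO buffer is an acceptable fallback; the helper name make_generator is hypothetical and not part of xml.sax.

```python
import io
import sys
from xml.sax.saxutils import XMLGenerator


def make_generator(out=None):
    """Return (generator, stream), never relying on a missing sys.stdout.

    Hypothetical helper: falls back to an in-memory buffer when no
    stream is given and sys.stdout is None (as under pythonw.exe).
    """
    if out is None:
        out = sys.stdout if sys.stdout is not None else io.StringIO()
    return XMLGenerator(out, encoding="utf-8"), out


buffer = io.StringIO()
generator, stream = make_generator(buffer)
generator.startDocument()
generator.startElement("root", {})
generator.characters("hi")
generator.endElement("root")
generator.endDocument()
print(stream.getvalue())
```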
[issue3511] Incorrect charset range handling with ignore case flag?
Serhiy Storchaka added the comment: Fixed in issue17381 (which has a more realistic example than [9-A]). -- nosy: +serhiy.storchaka resolution: wont fix -> duplicate superseder: -> IGNORECASE breaks unicode literal range matching ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3511 ___
[issue433028] SRE: (?flag:...) is not supported
Serhiy Storchaka added the comment: I agree that they'd be nice. The regex module is too advanced and needs much work and a transitional period before it can be included in the stdlib, but this feature can be implemented right now. -- assignee: effbot -> serhiy.storchaka nosy: +serhiy.storchaka resolution: duplicate -> stage: -> needs patch superseder: Major reworking of Python 2.5.2 re module -> versions: +Python 3.5 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue433028 ___
[issue433027] SRE: (?-flag) is not supported.
Serhiy Storchaka added the comment: I think issue433028 supersedes this and looks preferable. There is no need to implement several ways to do the same thing. -- nosy: +serhiy.storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue433027 ___
[issue433024] SRE: (?flag) isn't properly scoped
Serhiy Storchaka added the comment: See also issue22493. -- nosy: +serhiy.storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue433024 ___
[issue22819] Python3.4: xml.sax.saxutils.XMLGenerator.__init__ fails with pythonw.exe
R. David Murray added the comment: Agreed. I was going to ask what it was that 3.2 did that was useful. -- nosy: +r.david.murray ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22819 ___
[issue22820] RESTART line with no output
R. David Murray added the comment: Your code doesn't produce any output (i.e. there are no print calls). I don't use Idle myself, but I'm guessing that is why you don't see anything after the restart line. If that is the case, do you see a place in whatever help or documentation you have read where it would be appropriate to mention this? (The distinction is that the REPL window automatically prints the return value of whatever you type in, whereas in a program you get output only if you do the print yourself.) -- assignee: -> docs@python components: +Documentation nosy: +docs@python, r.david.murray, terry.reedy type: crash -> behavior versions: +Python 3.4, Python 3.5 -Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22820 ___
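A minimal sketch of the reporter's program with the missing print call added. The function is taken from the original report; the only change is that the return value is printed explicitly, which is what a script run from the editor window needs in order to show anything after the RESTART line.

```python
def convert_to_celsius(fahrenheit):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5.0 / 9.0


# In a script the result is discarded unless it is printed explicitly.
celsius = convert_to_celsius(80)
print(celsius)
```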
[issue22819] Python3.4: xml.sax.saxutils.XMLGenerator.__init__ fails with pythonw.exe
Serhiy Storchaka added the comment: FYI the code was changed in issue1470548. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22819 ___
[issue1282] re module needs to support bytes / memoryview well
Serhiy Storchaka added the comment:

Actually non-hashable patterns are not supported.

>>> re.match(bytearray(b'.'), b'x')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython/Lib/re.py", line 163, in match
    return _compile(pattern, flags).match(string)
  File "/home/serhiy/py/cpython/Lib/re.py", line 281, in _compile
    p, loc = _cache[type(pattern), pattern, flags]
TypeError: unhashable type: 'bytearray'

Should it be considered a bug?

--
nosy: +serhiy.storchaka
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue1282
___
[issue1708652] Exact matching
Serhiy Storchaka added the comment: Was implemented as fullmatch() in issue16203. -- nosy: +serhiy.storchaka resolution: rejected -> duplicate superseder: -> Proposal: add re.fullmatch() method ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1708652 ___
[issue20152] Derby #15: Convert 50 sites to Argument Clinic across 9 files
Brett Cannon added the comment: So I disagree that the code needs to be tweaked before converting to Argument Clinic. If the Clinic conversion is not adding to the problem then the code churn is just going to make applying this patch that much harder. Thanks for the code review regardless, though! -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue20152 ___
[issue20152] Derby #15: Convert 50 sites to Argument Clinic across 9 files
Serhiy Storchaka added the comment: If we first convert to Argument Clinic, then fixing bugs will be much harder. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue20152 ___
[issue20152] Derby #15: Convert 50 sites to Argument Clinic across 9 files
Serhiy Storchaka added the comment: About argument names. You have changed argument names and docstrings in any case (e.g. it was op, now it is code). Why not conform to the standard documentation? This wouldn't add any code churn if it is changed now, but it will if it is changed later. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue20152 ___
[issue1610654] cgi.py multipart/form-data
Rishi added the comment: Hi, I have created a new patch with a small design change. The change is that in situations where I don't find the boundary, instead of keeping the last x bytes in the buffer I simply drain the whole data and call readline(). This also seems like the right thing to do. I managed to get rid of the two obfuscated helper functions keep_x_buffer and remove_x_buffer that I had, and the code should look familiar (I hope) to a module owner. This also helped me get rid of quite a few class member variables that I could move to locals of my main function (multi_read). I still need to maintain an overlap, but only for the trailing CRLF boundary. I ran all the new and old tests and tested on Apache with the Ubuntu ISO server image. Without the patch the Ubuntu ISO server image took 93 seconds; with the patch it took 25 seconds. -- Added file: http://bugs.python.org/file37149/issue1610654_3.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1610654 ___
[issue22821] Argument of wrong type is passed to fcntl()
New submission from Serhiy Storchaka: An argument of the wrong type is passed to the C function fcntl() in the fcntl module. The third argument of fcntl() should be either a pointer to a binary structure or a C int, but a C long is passed instead. This works on platforms where sizeof(long) == sizeof(int) or on little-endian platforms, but on a big-endian platform with sizeof(long) != sizeof(int) it will pass the wrong value. -- components: Extension Modules messages: 230861 nosy: brett.cannon, serhiy.storchaka priority: normal severity: normal stage: needs patch status: open title: Argument of wrong type is passed to fcntl() type: behavior versions: Python 2.7, Python 3.4, Python 3.5 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22821 ___
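Why byte order matters here can be modelled in Python with struct. This is only a model of the memory layout, assuming an 8-byte long and a 4-byte int; it does not call fcntl(), and the helper name read_int_from_long is hypothetical. It shows what a reader sees when it takes the first four bytes of a stored long as if they were an int.

```python
import struct


def read_int_from_long(value, byteorder):
    """Store value as an 8-byte long, then read the first 4 bytes as an int.

    Hypothetical helper modelling a callee that expects an int where the
    caller stored a long; byteorder is '<' (little) or '>' (big).
    """
    raw = struct.pack(byteorder + "q", value)      # 8-byte signed long
    (seen,) = struct.unpack(byteorder + "i", raw[:4])  # 4-byte signed int
    return seen


little = read_int_from_long(1, "<")
big = read_int_from_long(1, ">")
print("little-endian reader sees", little)
print("big-endian reader sees", big)
```

On the little-endian layout the low-order bytes come first, so the small value survives; on the big-endian layout the first four bytes are the high-order half, which is zero for small values.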
[issue2636] Adding a new regex module (compatible with re)
Antoine Pitrou added the comment:

> Here is my (slowly implemented) plan:

Exciting. Perhaps you should post your plan on python-dev. In any case, huge thanks for your work on the re module.

--
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636
___
[issue22821] Argument of wrong type is passed to fcntl()
Serhiy Storchaka added the comment: Here is a patch. It is much easier than I expected. -- keywords: +patch stage: needs patch -> patch review Added file: http://bugs.python.org/file37150/fcntl_arg_type.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22821 ___
[issue22687] horrible performance of textwrap.wrap() with a long word
Serhiy Storchaka added the comment: Maybe atomic grouping or possessive quantifiers (issue433030) will help with this issue. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22687 ___
[issue22800] IPv6Network constructor sometimes does not recognize legitimate netmask
Changes by Antoine Pitrou pit...@free.fr: -- versions: +Python 3.5 -Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22800 ___
[issue22434] Use named constants internally in the re module
Serhiy Storchaka added the comment: Could you please review any of the patches, Antoine? This would help me to debug the re engine. It doesn't matter which patch is applied; there is a good chance all this will be changed before the 3.5 release, and maybe more than once. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22434 ___
[issue2636] Adding a new regex module (compatible with re)
Serhiy Storchaka added the comment:

> Exciting. Perhaps you should post your plan on python-dev.

Thank you Antoine. I think all interested core developers are already aware of this issue. A disadvantage of posting on python-dev is that it would require manually copying links and maybe titles of all mentioned issues, while here they are available automatically. Oh, I'm lazy.

--
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636
___
[issue2636] Adding a new regex module (compatible with re)
Ezio Melotti added the comment: So you are suggesting to fix bugs in re to make it closer to regex, and then replace re with a forked subset of regex that doesn't include advanced features, or just to fix/improve re until it matches the behavior of regex? If you are suggesting the former, I would also suggest checking the coverage and bringing it as close as possible to 100%. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636 ___
[issue22800] IPv6Network constructor sometimes does not recognize legitimate netmask
Antoine Pitrou added the comment: The doc is unhelpful on this, but looking at the implementation and tests, only a prefix length is allowed, not an expanded netmask. This would therefore be a feature request. -- type: behavior -> enhancement versions: -Python 3.4 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22800 ___
[issue22800] IPv6Network constructor sometimes does not recognize legitimate netmask
Chris PeBenito added the comment:

That's unfortunate. The library provides factory functions so v4 and v6 addresses/networks are easily handled together, and yet it seems to have been overlooked that you can do this:

ipaddress.ip_network('192.168.1.0/255.255.255.0')

but not this:

ipaddress.ip_network('ff00::/ff00::')

I'll open up another issue for the docs, as they are not simply unhelpful, they're misleading, as the IPv6Network docs clearly say that you should be able to use an expanded netmask.

--
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22800
___
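The asymmetry can be sketched as follows. The IPv4 and prefix-length calls are standard ipaddress usage; the expanded IPv6 netmask is wrapped in try/except because, per this thread, it was rejected at the time, and the sketch makes no claim about how any particular later release treats it.

```python
import ipaddress

# IPv4 accepts both a prefix length and an expanded netmask.
v4_prefix = ipaddress.ip_network('192.168.1.0/24')
v4_mask = ipaddress.ip_network('192.168.1.0/255.255.255.0')
print(v4_prefix == v4_mask)

# IPv6 with a prefix length (CIDR) works and exposes the netmask.
v6_prefix = ipaddress.ip_network('ff00::/8')
print(v6_prefix.netmask)

# The expanded IPv6 netmask form discussed in this issue.
try:
    print(ipaddress.ip_network('ff00::/ff00::'))
except ValueError as exc:
    print('rejected:', exc)
```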
[issue22800] IPv6Network constructor sometimes does not recognize legitimate netmask
Antoine Pitrou added the comment:

I don't know enough about IPv6 to give more insight (perhaps Peter Moody can answer), but the tests have this comment:

# We only support CIDR for IPv6, because expanded netmasks are not
# standard notation.

--
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22800
___
[issue22822] IPv6Network constructor docs incorrect about valid input
New submission from Chris PeBenito: Here: https://docs.python.org/3/library/ipaddress.html#ipaddress.IPv6Network In the constructor documentation, item 1 says: A string consisting of an IP address and an optional mask, separated by a slash (/). The IP address is the network address, and the mask can be either a single number, which means it’s a prefix, or a string representation of an IPv6 address. If it’s the latter, the mask is interpreted as a net mask. If no mask is provided, it’s considered to be /128. For example, the following address specifications are equivalent: 2001:db00::0/24 and 2001:db00::0/:ff00::. However in issue22800 it has been identified that using the expanded netmask (e.g. ff00::/ff00::) is not supported. -- assignee: docs@python components: Documentation messages: 230871 nosy: docs@python, pebenito priority: normal severity: normal status: open title: IPv6Network constructor docs incorrect about valid input type: behavior versions: Python 3.3, Python 3.4 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22822 ___
[issue1282] re module needs to support bytes / memoryview well
Guido van Rossum added the comment: Hm, I don't see a reason why the *pattern* should be a bytearray or memoryview, only the string it is searching. But if you fixed it by casting it to bytes I won't stop you. :-) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1282 ___
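The cast mentioned above can also be done on the caller's side. A minimal sketch, assuming the pattern is some bytes-like object: the helper name match_bytes_like is hypothetical, and only the pattern is converted, since the searched data may already be a bytearray or memoryview.

```python
import re


def match_bytes_like(pattern, data):
    """Match pattern against data, converting an unhashable pattern to bytes.

    Hypothetical helper: bytearray and memoryview patterns are copied
    into an immutable bytes object so that re can cache the compiled form.
    """
    if not isinstance(pattern, (bytes, str)):
        pattern = bytes(pattern)
    return re.match(pattern, data)


from_bytearray_pattern = match_bytes_like(bytearray(b'.'), b'x')
in_bytearray = match_bytes_like(b'x+', bytearray(b'xxy'))
in_memoryview = match_bytes_like(b'x+', memoryview(b'xxy'))
print(from_bytearray_pattern.group(), in_bytearray.group(), in_memoryview.group())
```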
[issue2636] Adding a new regex module (compatible with re)
Serhiy Storchaka added the comment:

> So you are suggesting to fix bugs in re to make it closer to regex, and then replace re with a forked subset of regex that doesn't include advanced features, or just to fix/improve re until it matches the behavior of regex?

Depends on what will be easier. Maybe some bugs are so hard to fix that replacing re with regex is the only solution. But if a fixed re will be simpler and faster than a lightened regex and will contain all necessary features, there will be no need for the replacement. Currently the code of regex looks more high-level and better structured, but the code of re looks simpler and is much smaller. In any case, the closer re and regex are, the easier the migration will be.

--
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636
___
[issue1282] re module needs to support bytes / memoryview well
Serhiy Storchaka added the comment: It is easy to fix with a small (but non-zero) cost, but I don't see a reason either. So I won't reopen this issue. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1282 ___
[issue22695] open() declared deprecated in python 3 docs
Roundup Robot added the comment: New changeset 9001298e3094 by Berker Peksag in branch '3.4': Issue #22695: Fix rendering of the deprecated-removed role in HTML. https://hg.python.org/cpython/rev/9001298e3094 New changeset ec81edc30221 by Berker Peksag in branch 'default': Issue #22695: Fix rendering of the deprecated-removed role in HTML. https://hg.python.org/cpython/rev/ec81edc30221 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22695 ___
[issue22695] open() declared deprecated in python 3 docs
Berker Peksag added the comment: Fixed. Thanks for the reviews. -- resolution: -> fixed stage: commit review -> resolved status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22695 ___
[issue2636] Adding a new regex module (compatible with re)
Ezio Melotti added the comment: Ok, regardless of what will happen, increasing test coverage is a worthy goal. We might start by looking at the regex test suite to see if we can import some tests from there. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636 ___
[issue22434] Use named constants internally in the re module
Raymond Hettinger added the comment:

I reviewed re_named_consts.patch and it looks great (I especially like the removal of superfluous OPCODES dictionary lookups and improved repr for the integer codes).

Since the op codes are singletons, you can use identity tests instead of equality checks in sre_parse.py:

- if op == in:
+ if op == IN:

Also, I'll echo the suggestion to make NamedIntConstant private with a leading underscore.

Nice work.

--
nosy: +rhettinger
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22434
___
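The idea of named integer constants compared by identity can be sketched as below. This is an illustration, not the contents of re_named_consts.patch: the class body, the numeric values, and the describe function are assumptions made for the example.

```python
class _NamedIntConstant(int):
    """An int that remembers its name and shows it in repr().

    Illustrative sketch only; the values used below are arbitrary.
    """

    def __new__(cls, value, name):
        self = super().__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        return self.name


# Each constant is created exactly once, so identity tests are safe.
IN = _NamedIntConstant(1, 'IN')
LITERAL = _NamedIntConstant(2, 'LITERAL')


def describe(op):
    """Dispatch on the op code using identity rather than equality."""
    if op is IN:
        return 'set membership'
    if op is LITERAL:
        return 'literal character'
    return 'other'


print(repr(IN), describe(IN), describe(LITERAL))
```

Identity tests only work as long as every use site refers to the same module-level objects; a plain int with the same value would compare equal but fail the `is` check.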
[issue22434] Use named constants internally in the re module
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- nosy: +effbot ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22434 ___
[issue22823] Use set literals instead of creating a set from a list
New submission from Raymond Hettinger:

There are many places where the old style of creating a set from a list still persists. The literal notation is idiomatic, cleaner looking, and faster.

Here's a typical change:

diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -22,10 +22,10 @@ else:
 MAXCODE = 0x

-_LITERAL_CODES = set([LITERAL, NOT_LITERAL])
-_REPEATING_CODES = set([REPEAT, MIN_REPEAT, MAX_REPEAT])
-_SUCCESS_CODES = set([SUCCESS, FAILURE])
-_ASSERT_CODES = set([ASSERT, ASSERT_NOT])
+_LITERAL_CODES = {LITERAL, NOT_LITERAL}
+_REPEATING_CODES = {REPEAT, MIN_REPEAT, MAX_REPEAT}
+_SUCCESS_CODES = {SUCCESS, FAILURE}
+_ASSERT_CODES = {ASSERT, ASSERT_NOT}

Here are typical timings:

$ py34 -m timeit '{10, 20, 30}'
1000 loops, best of 3: 0.145 usec per loop
$ py34 -m timeit 'set([10, 20, 30])'
100 loops, best of 3: 0.477 usec per loop

--
components: Library (Lib) keywords: easy messages: 230879 nosy: rhettinger priority: normal severity: normal status: open title: Use set literals instead of creating a set from a list type: enhancement versions: Python 3.5
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22823
___
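The comparison can be reproduced from within Python using the timeit module. This is a sketch for trying it locally; the repetition count is an arbitrary choice, and the absolute numbers will differ by machine and interpreter version.

```python
import timeit

REPEATS = 200000  # arbitrary number of executions per measurement

literal_time = timeit.timeit('{10, 20, 30}', number=REPEATS)
from_list_time = timeit.timeit('set([10, 20, 30])', number=REPEATS)

# Both spellings build the same set; only the construction path differs.
same_result = {10, 20, 30} == set([10, 20, 30])

print('literal:   %.4f s' % literal_time)
print('from list: %.4f s' % from_list_time)
print('equal results:', same_result)
```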
[issue22824] Update reprlib to use set literals
New submission from Raymond Hettinger:

Currently reprlib outputs:

>>> reprlib.repr(set('supercalifragilisticexpialidocious'))
set(['a', 'c', 'd', 'e', 'f', 'g', ...])

This should be:

{'a', 'c', 'd', 'e', 'f', 'g', ...}

--
keywords: easy messages: 230880 nosy: rhettinger priority: normal severity: normal status: open title: Update reprlib to use set literals type: enhancement versions: Python 3.5
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22824
___
[issue22824] Update reprlib to use set literals
Changes by Berker Peksag berker.pek...@gmail.com: -- nosy: +berker.peksag ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22824 ___
[issue22824] Update reprlib to use set literals
Changes by Ezio Melotti ezio.melo...@gmail.com: -- components: +Library (Lib) nosy: +ezio.melotti stage: -> needs patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22824 ___
[issue22823] Use set literals instead of creating a set from a list
Raymond Hettinger added the comment: Note, to keep the tests stable, nothing in Lib/test should be changed. Any update should target the rest of Lib and Doc. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22823 ___
[issue22791] datetime.utcfromtimestamp() shoud have option for create tz aware datetime
Akira Li added the comment:

>>> from datetime import datetime, timedelta, timezone
>>> datetime.fromtimestamp(0, timezone.utc)
datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)

already works and it is documented [1].

[1] https://docs.python.org/3/library/datetime.html#datetime.datetime.fromtimestamp

Or it can be written as:

>>> epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
>>> aware_utc = epoch + timedelta(seconds=posix_timestamp)

--
nosy: +akira
___
Python tracker rep...@bugs.python.org http://bugs.python.org/issue22791
___
[issue22791] datetime.utcfromtimestamp() shoud have option for create tz aware datetime
Alexander Belopolsky added the comment: I personally wish we could deprecate utcfromtimestamp. With timezone.utc in stdlib and being a singleton there is no reason to put UTC time in naive datetime instances. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22791 ___
[issue22823] Use set literals instead of creating a set from a list
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti stage: -> needs patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22823 ___
[issue22791] datetime.utcfromtimestamp() shoud have option for create tz aware datetime
INADA Naoki added the comment: akira: It seems cleaner than utcfromtimestamp().replace(). I think utcfromtimestamp() should have a note about it: Note that it returns a **naive** (tz=None) datetime. A naive datetime is treated as local time in most functions. If you want to create an aware datetime, use `.fromtimestamp(ts, tz=timezone.utc)` instead. And I want to add this note to the docstring too, since I read the docstring before the documentation. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22791 ___
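The two spellings mentioned in this thread can be checked against each other with a short sketch. The timestamp value is an arbitrary example chosen for illustration; everything else is standard datetime API.

```python
from datetime import datetime, timedelta, timezone

posix_timestamp = 1415440080  # arbitrary example value (an assumption)

# Spelling 1: ask fromtimestamp() for an aware datetime directly.
aware = datetime.fromtimestamp(posix_timestamp, tz=timezone.utc)

# Spelling 2: add the offset to an aware epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
from_epoch = epoch + timedelta(seconds=posix_timestamp)

print(aware.isoformat())
print(aware == from_epoch)
print(aware.tzinfo is timezone.utc)

# An aware datetime round-trips to the same POSIX timestamp.
print(aware.timestamp() == posix_timestamp)
```

Because the result carries tzinfo, later calls such as timestamp() do not fall back to interpreting it as local time, which is the pitfall the proposed note describes for naive values.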