[issue25472] Typing: Specialized subclasses of generics cannot be unpickled
New submission from Matt Chaput:

If I try to pickle and unpickle an object of a class that has specialized a generic superclass, when I try to unpickle I get this error:

    TypeError: descriptor '__dict__' for 'A' objects doesn't apply to 'B' object

Test case:

    from typing import Generic, TypeVar
    import pickle

    T = TypeVar("T")

    class A(Generic[T]):
        def __init__(self, x: T):
            self.x = x

    class B(A[str]):
        def __init__(self, x: str):
            self.x = x

    b = B("hello")
    z = pickle.dumps(b)
    print(z)
    _ = pickle.loads(z)

--
messages: 253421
nosy: maatt
priority: normal
severity: normal
status: open
title: Typing: Specialized subclasses of generics cannot be unpickled
versions: Python 3.5

___ Python tracker <http://bugs.python.org/issue25472> ___
___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21198] Minor tarfile documentation bug
Matt Chaput added the comment: Oops! Yes, I accidentally included a bunch of other crap. -- ___ Python tracker <http://bugs.python.org/issue21198> ___
[issue21198] Minor tarfile documentation bug
Changes by Matt Chaput : Removed file: http://bugs.python.org/file34824/issue21198.patch ___ Python tracker <http://bugs.python.org/issue21198> ___
[issue20491] textwrap: Non-breaking space not honored
Matt Chaput added the comment: Patch on top of dbudinova's that attempts to replace the concatenation of strings with a verbose regex. -- nosy: +maatt Added file: http://bugs.python.org/file34827/issue20491_verbose.patch ___ Python tracker <http://bugs.python.org/issue20491> ___
[issue21146] update gzip usage examples in docs
Matt Chaput added the comment: The patch looks good to me. -- nosy: +maatt ___ Python tracker <http://bugs.python.org/issue21146> ___
[issue21198] Minor tarfile documentation bug
Matt Chaput added the comment: Simple patch to remove the underscore in tarfile.rst. -- keywords: +patch nosy: +maatt Added file: http://bugs.python.org/file34824/issue21198.patch ___ Python tracker <http://bugs.python.org/issue21198> ___
[issue6623] Lib/ftplib.py Netrc class should be removed.
Matt Chaput added the comment: This patch is the same as my previous one, except instead of removing Netrc usage from the ftplib.test() function, it replaces it with the netrc.netrc object. Note that there are no existing tests for the ftplib.test() function. Also did some very minor cleanups (bare raise is no longer valid) to get rid of warnings/errors in static analyzer. -- Added file: http://bugs.python.org/file34818/remove_Netrc_class2.patch ___ Python tracker <http://bugs.python.org/issue6623> ___
[issue6623] Lib/ftplib.py Netrc class should be removed.
Matt Chaput added the comment: Created patch to remove the Netrc class and its unit tests (for Python 3.5). -- nosy: +maatt Added file: http://bugs.python.org/file34806/remove_Netrc_class.patch ___ Python tracker <http://bugs.python.org/issue6623> ___
[issue11240] Running unit tests in a command line tool leads to infinite loop with multiprocessing on Windows
Matt Chaput added the comment: IIRC the root issue turned out to be that when you execute any multiprocessing statements at the module/script level on Windows, you need to put it under if __name__ == "__main__", otherwise it will cause infinite spawning. I think this is mentioned in the multiprocessing docs but should probably be in giant blinking red letters ;) -- ___ Python tracker <http://bugs.python.org/issue11240> ___
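[Editor's note] The guard described in the comment above can be sketched as follows. The worker function and pool size are hypothetical, chosen only to illustrate the pattern; this is not code from the issue itself:

```python
# Minimal sketch of the __main__ guard discussed above. On Windows,
# multiprocessing starts each child process by re-importing the main
# module; without the guard, the Pool creation below would re-run in
# every child, which would spawn children of its own indefinitely.
import multiprocessing

def square(n):
    # Hypothetical worker function, for illustration only.
    return n * n

if __name__ == "__main__":
    # Safe: this block does not re-execute when a child re-imports
    # the module, so process creation happens exactly once.
    pool = multiprocessing.Pool(processes=2)
    print(pool.map(square, [1, 2, 3]))  # prints [1, 4, 9]
    pool.close()
    pool.join()
```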
[issue14447] marshal.load() reads entire remaining file instead of just next value
New submission from Matt Chaput:

In Python 3.2, if you write several values to a file with multiple calls to marshal.dump(), and then try to read them back, the first marshal.load() returns the first value, but reads to the end of the file, so subsequent calls to marshal.load() raise an EOFError. E.g.:

    import marshal

    f = open("test", "wb")
    marshal.dump(("hello", 1), f)
    marshal.dump(("there", 2), f)
    marshal.dump(("friend", 3), f)
    f.close()

    f = open("test", "rb")
    print(marshal.load(f))  # ('hello', 1)
    print(marshal.load(f))  # ERROR

This page seems to indicate this was also a bug in Python 3.1: http://www.velocityreviews.com/forums/t728526-python-3-1-2-and-marshal.html

--
components: IO
messages: 157093
nosy: mattchaput
priority: normal
severity: normal
status: open
title: marshal.load() reads entire remaining file instead of just next value
type: behavior
versions: Python 3.1, Python 3.2

___ Python tracker <http://bugs.python.org/issue14447> ___
[issue12986] Using getrandbits() in uuid.uuid4() is faster and more readable
Matt Chaput added the comment: Passed all tests OK. -- ___ Python tracker <http://bugs.python.org/issue12986> ___
[issue2636] Adding a new regex module (compatible with re)
Matt Chaput added the comment:

Not sure if this is better as a separate feature request or a comment here, but... the new version of .NET includes an option to specify a time limit on evaluation of regexes (not sure if this is a feature in other regex libs). This would be useful especially when you're executing regexes configured by the user and you don't know if/when they might go exponential. Something like this maybe:

    # Raises an re.Timeout if not complete within 60 seconds
    match = myregex.match(mystring, maxseconds=60.0)

--
nosy: +mattchaput

___ Python tracker <http://bugs.python.org/issue2636> ___
[issue12986] Using getrandbits() in uuid.uuid4() is faster and more readable
New submission from Matt Chaput:

Currently the 'uuid' module uses os.urandom (in the absence of a system UUID generation function) to generate random UUIDs in the uuid.uuid4() function. This patch changes the implementation of uuid4() to use random.getrandbits() as the source of randomness instead, for the following reasons:

* In my quick tests, using getrandbits() is much faster on Windows and Linux. Some applications do need to generate UUIDs quickly.

    >>> setup = "import uuid, os, random"
    >>> ur = "uuid.UUID(bytes=os.urandom(16), version=4)"
    >>> grb = "uuid.UUID(int=random.getrandbits(128), version=4)"
    >>> # Windows
    >>> timeit.Timer(ur, setup).timeit()
    22.861042160383903
    >>> timeit.Timer(grb, setup).timeit()
    3.8689128309085135
    >>> # Linux
    >>> timeit.Timer(ur, setup).timeit()
    29.32686185836792
    >>> timeit.Timer(grb, setup).timeit()
    3.7429409027099609

* The patched code is cleaner. It avoids the try...finally required by the possibly unavailable os.urandom function, and the fallback to generating random bytes.

--
components: Library (Lib)
files: fastuuid4.patch
keywords: patch
messages: 144087
nosy: mattchaput
priority: normal
severity: normal
status: open
title: Using getrandbits() in uuid.uuid4() is faster and more readable
type: performance
Added file: http://bugs.python.org/file23163/fastuuid4.patch

___ Python tracker <http://bugs.python.org/issue12986> ___
[issue12870] Regex object should have introspection methods
Matt Chaput added the comment: Yes, it's an optimization of my code, not the regex, as I said. Believe me, it's not premature. I've listed two general use cases for the two methods. To me it seems obvious that having to test a large number of regexes against a string, and having to test a single regex against a large number of strings, are two very common programming tasks, and they could both be speeded up quite a bit using these methods. As of now my parsing code and other code such as PyParsing are resorting to hacks like requiring the user to manually specify the possible first chars of a regex at configuration. With the hacks, the code can be hundreds of times faster. But the hacks are error-prone and should be unnecessary. The PCRE library implements at least the "first char" functionality, and a lot more regex introspection that would be useful, through its pcre_fullinfo() function. -- ___ Python tracker <http://bugs.python.org/issue12870> ___
[issue12870] Regex object should have introspection methods
Matt Chaput added the comment: Ezio, no offense, but I think it's safe to say you've completely misunderstood this bug. It is not about "explaining what a regex matches" or optimizing the regex. Read the last sentences of the two paragraphs explaining the proposed methods for the use cases. This is about allowing MY CODE to programmatically get certain information about a regex object to allow it to limit the number of times it has to call regex.match(). AFAIK there's no good way to get this information about a regex object without adding these methods or building my own pure-Python regex interpreter, which would be both Herculean and pointless. -- ___ Python tracker <http://bugs.python.org/issue12870> ___
[issue12870] Regex object should have introspection methods
New submission from Matt Chaput:

Several times in the recent past I've wished for the following methods on the regular expression object. These would allow me to speed up search and parsing code, by limiting the number of regex matches I need to try.

literal_prefix(): Returns any literal string at the start of the pattern (before any "special" parts). E.g., for the pattern "ab(c|d)ef" the method would return "ab". For the pattern "abc|def" the method would return "". When matching a regex against keys in a btree, this would let me limit the search to just the range of keys with the prefix.

first_chars(): Returns a string/list/set/whatever of the possible first characters that could appear at the start of a matching string. E.g. for the pattern "ab(c|d)ef" the method would return "a". For the pattern "[a-d]ef" the method would return "abcd". When parsing a string with regexes, this would let me only have to test the regexes that could match at the current character.

As long as you're making a new regex package, I thought I'd put in a request for these :)

--
components: Regular Expressions
messages: 143266
nosy: mattchaput
priority: normal
severity: normal
status: open
title: Regex object should have introspection methods
type: feature request
versions: Python 3.3

___ Python tracker <http://bugs.python.org/issue12870> ___
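[Editor's note] Something close to the proposed literal_prefix() can be approximated today by walking the parse tree produced by sre_parse. This is an illustrative sketch only, not the proposed implementation; sre_parse is an undocumented internal module and could change between Python versions:

```python
# Illustrative only: approximate literal_prefix() by walking the
# parse tree from sre_parse (an undocumented internal module).
import sre_parse

def literal_prefix(pattern):
    """Return the literal string at the start of the given pattern."""
    chars = []
    for op, arg in sre_parse.parse(pattern):
        if op == sre_parse.LITERAL:
            chars.append(chr(arg))  # arg is the character's code point
        else:
            break  # stop at the first "special" element
    return "".join(chars)

print(literal_prefix("ab(c|d)ef"))  # prints "ab"
print(literal_prefix("abc|def"))    # prints "" (top level is a branch)
```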
[issue11240] Running unit tests in a command line tool leads to infinite loop with multiprocessing on Windows
Matt Chaput added the comment: If I do "c:\python27\python run_nose.py" it works correctly. If I do "nosetests" I get the process explosion. Maybe the bug is in how distutils and nose work from the command line? I'm confused. -- ___ Python tracker <http://bugs.python.org/issue11240> ___
[issue11240] Running unit tests in a command line tool leads to infinite loop with multiprocessing on Windows
Matt Chaput added the comment: I don't know what to tell you... to the best of my knowledge there's absolutely no way for my code to kick off the entire test suite -- I always do that through PyDev (which doesn't cause the bug, by the way). The closest thing is the boilerplate at the bottom of every test file: if __name__ == "__main__": unittest.main() ...but even that would only start the tests in that file, not the entire suite. Another thing that makes me think multiprocessing is re-running the original command line is that if I use "python setup.py test" to start the tests, when it gets to the MP tests it seems to run that command for each Process that gets started, but if I use "nosetests", it seems to run "nosetests" for each started Process. -- ___ Python tracker <http://bugs.python.org/issue11240> ___
[issue11240] Running unit tests in a command line tool leads to infinite loop with multiprocessing on Windows
Matt Chaput added the comment: Thank you, I understand all that, but I don't think you understand the issue. My code is not __main__. I am not starting the test suite. It's the distutils/nose code that's doing that. It seems as if the multiprocessing module is starting new Windows processes by duplicating the command line of the original process. That doesn't seem to work very well, given the example of running test suites, hence the bug. -- ___ Python tracker <http://bugs.python.org/issue11240> ___
[issue11240] Running unit tests in a command line tool leads to infinite loop with multiprocessing on Windows
New submission from Matt Chaput:

If you start unit tests with a command line such as "python setup.py test" or "nosetests", if the tested code starts a multiprocessing.Process on Windows, each new process will act as if it was started as "python setup.py test"/"nosetests", leading to an infinite explosion of processes that eventually locks up the entire machine.

--
components: Windows
messages: 128768
nosy: mattchaput
priority: normal
severity: normal
status: open
title: Running unit tests in a command line tool leads to infinite loop with multiprocessing on Windows
type: behavior
versions: Python 2.7

___ Python tracker <http://bugs.python.org/issue11240> ___
[issue1172711] long long support for array module
Matt Chaput added the comment: This is an important feature to me. Now I get to redo a bunch of code to have two completely different code paths to do the same thing because nobody could be bothered to keep array up-to-date. -- nosy: +mattchaput ___ Python tracker <http://bugs.python.org/issue1172711> ___
[issue2027] Module containing C implementations of common text algorithms
Matt Chaput added the comment: The Porter stemming and Levenshtein edit-distance algorithms are not "fast-moving" nor are they fusion reactors... they've been around forever, and are simple to implement, but are still useful in various common scenarios. I'd say this is similar to Python including an implementation of digest functions such as SHA: it's useful enough, and compute-intensive enough, to warrant a C implementation. Shipping C extensions is not an option for everyone; it's especially a pain with Windows. __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2027> __
[issue2027] Module containing C implementations of common text algorithms
New submission from Matt Chaput:

Add a module to the standard library containing fast (C) implementations of common text/language related algorithms, to begin specifically Porter (and perhaps other) stemming and Levenshtein (and perhaps other) edit distance. Both these algorithms are useful in multiple domains, well known and understood, and have sample implementations all over the Web, but are compute-intensive and prohibitively expensive when implemented in pure Python.

--
components: Library (Lib)
messages: 62134
nosy: mchaput
severity: normal
status: open
title: Module containing C implementations of common text algorithms
type: rfe

__ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2027> __
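[Editor's note] For reference, the Levenshtein edit distance mentioned above is easy to write in pure Python; a sketch like the following shows the O(len(a) * len(b)) inner loop that makes a C implementation attractive:

```python
# Pure-Python Levenshtein edit distance using two-row dynamic
# programming. Every cell does three additions and a min() in
# interpreted bytecode, which is exactly the kind of hot loop the
# request proposes moving to C.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # prints 3
```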