Re: [Python-Dev] PEP 380 (yield from) is now Final
Finally, a reason to use Python 3 ;-)

Chris

On 13/01/2012 16:00, Guido van Rossum wrote:
> AWESOME!!!
>
> On Fri, Jan 13, 2012 at 4:14 AM, Nick Coghlan ncogh...@gmail.com wrote:
>> I marked PEP 380 as Final this evening, after pushing the tested and documented implementation to hg.python.org: http://hg.python.org/cpython/rev/d64ac9ab4cd0
>>
>> As the list of names in the NEWS and What's New entries suggests, it was quite a collaborative effort to get this one over the line, and that's without even listing all the people that offered helpful suggestions and comments along the way :)
>>
>> print("\n".join(list((lambda:(yield from ("Cheers,", "Nick")))())))
>
> --
> --Guido van Rossum (python.org/~guido)

--
Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk
___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
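For readers who have not used the new syntax yet, here is a minimal sketch of what `yield from` does, assuming Python 3.3 or later. The function names `inner`, `outer` and `greeting` are made up for illustration and do not come from the PEP or the patch.

```python
def inner():
    # An ordinary generator that also returns a value when it finishes.
    yield 1
    yield 2
    return "done"


def outer():
    # 'yield from' passes every item from inner() through to the caller
    # and evaluates to inner()'s return value once it is exhausted.
    result = yield from inner()
    yield result


def greeting():
    # Delegating to a plain iterable works the same way.
    yield from ("Cheers,", "Nick")


delegated = list(outer())
sign_off = "\n".join(greeting())
print(delegated)
print(sign_off)
```

The sign-off in the quoted announcement is the same delegation squeezed into a lambda.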
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
Jack Diederich writes:
> On Thu, Jan 12, 2012 at 9:57 PM, Guido van Rossum gu...@python.org wrote:
>> Hm... I started out as a big fan of the randomized hash, but thinking more about it, I actually believe that the chances of some legitimate app having 1000 collisions are way smaller than the chances that somebody's code will break due to the variable hashing.
>
> Python's dicts are designed to avoid hash conflicts by resizing and keeping the available slots bountiful. 1000 conflicts sounds like a number that couldn't be hit accidentally

I may be missing something, but AIUI, with the resize, the search for an unused slot after collision will be looking in a different series of slots, so the N counter for the N^2 behavior resets on resize. If not, you can delete this message now.

If so, since (a) in the error-on-many-collisions approach we're adding a test here for collision count anyway and (b) we think this is almost never gonna happen, can't we defuse the exploit by just resizing the dict after 1000 collisions, with strictly better performance than the error approach, and almost current performance for normal input?

In order to prevent attackers from exploiting every 1000th collision to force out-of-memory, the expansion factor for collision-induced resizing could be very small. (I don't know if that's possible in the Python dict implementation; if the algorithm requires something like doubling the dict size on every resize this is right out, of course.)

Or, since this is an error/rare path anyway, offer the user a choice of an error or a resize on hitting 1000 collisions?
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
On Sat, 14 Jan 2012 04:45:57 +0100 mar...@v.loewis.de wrote:
>> What an implementation looks like: http://pastebin.com/9ydETTag
>> some stuff to be filled in, but this is all that is really required.
>
> I think this statement (and the patch) is wrong. You also need to change the byte string hashing, at least for 2.x. This I consider the biggest flaw in that approach - other people may have written string-like objects which continue to compare equal to a string but now hash different.

They're unlikely to have rewritten the hash algorithm by hand - especially given the caveats wrt. differences between Python integers and C integers. Rather, they would have returned the hash() of the equivalent str or unicode object.

Regards

Antoine.
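As a rough illustration of the delegation pattern described above, a string-like class can return the hash() of its equivalent str rather than reimplementing the algorithm, which would keep it consistent with str whatever the interpreter does to string hashing. The class `Name` below is hypothetical and only a sketch.

```python
class Name:
    """Hypothetical string-like wrapper that compares equal to a plain str."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if isinstance(other, Name):
            return self.text == other.text
        if isinstance(other, str):
            return self.text == other
        return NotImplemented

    def __hash__(self):
        # Delegate to the built-in string hash instead of copying its
        # algorithm, so equal objects keep equal hashes.
        return hash(self.text)


lookup = {"spam": 1}
found = lookup[Name("spam")]
print(found)
```

A hand-written copy of the old string hash, by contrast, would silently stop matching once the built-in hash is seeded, and lookups like the one above would start failing.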
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
On Sat, 14 Jan 2012 13:55:22 +1100 Steven D'Aprano st...@pearwood.info wrote:
> On 14/01/12 12:58, Gregory P. Smith wrote:
>> I do like *randomly seeding the hash*. *+1*. This is easy. It can easily be back ported to any Python version.
>>
>> It is perfectly okay to break existing users who had anything depending on ordering of internal hash tables. Their code was already broken.
>
> For the record:
>
> steve@runes:~$ python -c "print(hash('spam ham'))"
> -376510515
> steve@runes:~$ jython -c "print(hash('spam ham'))"
> 2054637885

Not to mention:

$ ./python -c "print(hash('spam ham'))"
-6071355389066156083

(64-bit CPython)

Regards

Antoine.
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
>> I think this statement (and the patch) is wrong. You also need to change the byte string hashing, at least for 2.x. This I consider the biggest flaw in that approach - other people may have written string-like objects which continue to compare equal to a string but now hash different.
>
> They're unlikely to have rewritten the hash algorithm by hand - especially given the caveats wrt. differences between Python integers and C integers.

See the CHAR_HASH macro in http://hg.python.org/cpython/file/e78f00dbd7ae/Modules/expat/xmlparse.c

It's not *that* unlikely that more copies of that algorithm exist.

Regards, Martin
Re: [Python-Dev] cpython: Implement PEP 380 - 'yield from' (closes #11682)
On Jan 14, 2012 5:56 PM, Georg Brandl g.bra...@gmx.net wrote:
> On 01/14/2012 07:53 AM, Nick Coghlan wrote:
>> I agree, but it's one of the challenges of a long-lived branch like the PEP 380 one (I believe some of these cosmetic changes started life in Greg's original patch and separating them out would have been quite a pain). Anyone that wants to see the gory details of the branch history can take a look at my bitbucket repo: https://bitbucket.org/ncoghlan/cpython_sandbox/changesets/tip/branch%28%22pep380%22%29
>
> I see. I hadn't followed the development of PEP 380 closely before. In any case, it is probably a good idea to mention this branch URL in the commit message in case it is meant to be kept permanently (it would also be possible to put only that branch of your sandbox into another clone at hg.python.org).

You're right, we should have a PSF-controlled copy of the entire branch history in cases like this. I actually still keep an irregularly updated clone of my entire sandbox repo on hg.python.org (that's actually where it started), so I'll refresh that and add a link to the pep380 branch history into the tracker item that covered the PEP 380 integration into 3.3.

>> (Vaguely related tangent: the new code added by the patch probably has a few parts that could benefit from the new GetAttrId private API)
>
> Maybe another candidate for an issue, so that we don't forget?

I just added a note about it to the C API cleanup tracker item.
Re: [Python-Dev] Sphinx version for Python 2.x docs
On Sat, Jan 14, 2012 at 04:24, Éric Araujo mer...@netwok.org wrote:
> Hi Sandro,
>
> Thanks for getting the ball rolling on this. One style for markup, one Sphinx version to code our extensions against and one location for the documenting guidelines will make our work a bit easier.

thanks :) I'm happy to help!

>> During the build process, there are some warnings that I can understand:
>
> I assume you mean “can’t”, as you later ask how to fix them. As a

yes, indeed

> general rule, they’re only warnings, so they don’t break the build, only some links or stylings, so I think it’s okay to ignore them *right now*.

but I like to get them fixed nonetheless: after all, the current build doesn't show warnings - but I agree it's a non-blocking issue.

>> Doc/glossary.rst:520: WARNING: unknown keyword: nonlocal
>
> That’s a mistake I did in cefe4f38fa0e. This sentence should be removed.

Do you mean revert this whole hunk:

@@ -480,10 +516,11 @@
    nested scope
       The ability to refer to a variable in an enclosing definition. For
       instance, a function defined inside another function can refer to
-      variables in the outer function. Note that nested scopes work only for
-      reference and not for assignment which will always write to the innermost
-      scope. In contrast, local variables both read and write in the innermost
-      scope. Likewise, global variables read and write to the global namespace.
+      variables in the outer function. Note that nested scopes by default work
+      only for reference and not for assignment. Local variables both read and
+      write in the innermost scope. Likewise, global variables read and write
+      to the global namespace. The :keyword:`nonlocal` allows writing to outer
+      scopes.

    new-style class
       Any class which inherits from :class:`object`. This includes all built-in

or just "The :keyword:`nonlocal` allows writing to outer scopes."?

>> Doc/library/stdtypes.rst:2372: WARNING: more than one target found for cross-reference u'next':
>
> Need to use :meth:`.next` to let Sphinx find the right target (more info on request :)

it seems what it needed was :meth:`next` (without the dot). The current page links all 'next' in file.next() to functions.html#next, and using :meth:`next` does that.

>> Doc/library/sys.rst:651: WARNING: unknown keyword: None
>
> Should use ``None``.

fixed

>> Doc/reference/datamodel.rst:1942: WARNING: unknown keyword: not in
>> Doc/reference/expressions.rst:1184: WARNING: unknown keyword: is not
>
> I don’t know if these should work (i.e. create a link to the appropriate language reference section) or abuse the markup (there are “not” and “in” keywords, but no “not in” keyword → use ``not in``). I’d say ignore them.

ACK, but I'm willing to fix them if someone tells me how to :)

I'm going to prepare the patches and then push - i'll send a heads-up afterward.

Cheers,
--
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
On 13.01.2012 18:08, Mark Dickinson wrote:
> On Fri, Jan 13, 2012 at 2:57 AM, Guido van Rossum gu...@python.org wrote:
>> How pathological the data needs to be before the collision counter triggers? I'd expect *very* pathological.
>
> How pathological do you consider the set
>
> {1 << n for n in range(2000)}
>
> to be?

I think this is not a counter-example for the proposed algorithm (at least not in the way I think it should be implemented). Those values may collide on the slot in the set, but they don't collide on the actual hash value.

So in order to determine whether the collision limit is exceeded, we shouldn't count colliding slots, but colliding hash values (which we will all encounter during an insert).

> though admittedly only around 30 collisions per hash value.

I do consider the case of hashing integers with only one bit set pathological. However, this can be overcome by factoring the magnitude of the number into the hash as well.

Regards, Martin
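To put numbers on the powers-of-two example, the following sketch counts how many distinct hash values the set produces. It assumes a CPython build with the unified numeric hash (3.2 or later), where integers are reduced modulo `sys.hash_info.modulus`. The variable names are illustrative only.

```python
import sys
from collections import Counter

values = {1 << n for n in range(2000)}

# Group the 2000 integers by their full hash value, not by table slot.
per_hash = Counter(hash(v) for v in values)

distinct_hashes = len(per_hash)
largest_group = max(per_hash.values())

print("hash modulus bits:", sys.hash_info.modulus.bit_length())
print("distinct hash values:", distinct_hashes)
print("largest group sharing one hash:", largest_group)
```

On a 64-bit build the modulus is 2**61 - 1, so the 2000 values should fall into 61 groups of about 33 each, which is in line with the "around 30 collisions per hash value" figure quoted above and far below a limit of 1000 identical hash values.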
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
On 14.01.2012 01:37, Benjamin Peterson wrote:
> 2012/1/13 Guido van Rossum gu...@python.org:
>> Really? Even though you came up with specifically to prove me wrong?
>
> Coming up with a counterexample now invalidates it?

There are two concerns here:

- is it possible to come up with an example of constructed values that show many collisions in a way that poses a threat? To this, the answer is apparently yes, and the proposed reaction is to hard-limit the number of collisions accepted by the implementation.
- then, *assuming* such a limitation is in place: is it possible to come up with a realistic application that would break under this limitation?

Mark's example is no such realistic application; instead, it is yet another example demonstrating collisions using constructed values (although the specific example would continue to work fine even under the limitation).

A valid counterexample would have to come from a real application, or at least from a scenario that is plausible for a real application.

Regards, Martin
[Python-Dev] 2.7 now uses Sphinx 1.0
Hello,

just a heads-up: documentation for the 2.7 branch has been ported to use sphinx 1.0, so now the same syntax can be used for 2.x and 3.x patches, hopefully easing working on both python stacks.

Cheers,
--
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi
[Python-Dev] Documenting Python is moving to devguide
Hi all,

(another) heads-up about my current work: I've just pushed the "Documenting Python" doc section (ftr: http://docs.python.org/documenting/index.html) to devguide. That was possible now that we use the same sphinx version on all the active branches.

It was not a re-editing of the content, that might still be outdated and in need of work, but just a brutal cut & paste of the current files. Now that we have a central place, additional editing will be much easier.

The section is still available in the cpython repo, and I'm waiting to remove it because it's better to have some redirections in place from the current urls to the new ones. I've prepared a small set of RewriteRules (attached): I don't know the actual setup of apache for docs.p.o but at least they are a start :) whoever has root access, could you please review & apply those rules?

Once the rewrites are in place, i'll take care of removing the Doc/documenting dir from the active branches.

Cheers,
--
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi

RewriteEngine On
RewriteRule /documenting/$ /devguide/documenting.html [NE,R=permanent,L]
RewriteRule /documenting/index.html /devguide/documenting.html [NE,R=permanent,L]
RewriteRule /documenting/intro.html /devguide/documenting.html#introduction [NE,R=permanent,L]
RewriteRule /documenting/style.html /devguide/documenting.html#style-guide [NE,R=permanent,L]
RewriteRule /documenting/rest.html /devguide/documenting.html#restructuredtext-primer [NE,R=permanent,L]
RewriteRule /documenting/markup.html /devguide/documenting.html#additional-markup-constructs [NE,R=permanent,L]
RewriteRule /documenting/fromlatex.html /devguide/documenting.html#differences-to-the-latex-markup [NE,R=permanent,L]
RewriteRule /documenting/building.html /devguide/documenting.html#building-the-documentation [NE,R=permanent,L]
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
My patch example does change the bytes object hash as well as Unicode.

On Jan 13, 2012 7:46 PM, mar...@v.loewis.de wrote:
>> What an implementation looks like: http://pastebin.com/9ydETTag
>> some stuff to be filled in, but this is all that is really required.
>
> I think this statement (and the patch) is wrong. You also need to change the byte string hashing, at least for 2.x. This I consider the biggest flaw in that approach - other people may have written string-like objects which continue to compare equal to a string but now hash different.
>
> Regards, Martin
Re: [Python-Dev] Documenting Python is moving to devguide
Hi again,

On Sat, Jan 14, 2012 at 19:09, Sandro Tosi sandro.t...@gmail.com wrote:
> Hi all,
> (another) heads-up about my current work: I've just pushed the "Documenting Python" doc section (ftr: http://docs.python.org/documenting/index.html) to devguide. That was possible now that we use the same sphinx version on all the active branches.
>
> It was not a re-editing of the content, that might still be outdated and in need of work, but just a brutal cut & paste of the current files. Now that we have a central place, additional editing will be much easier.
>
> The section is still available in the cpython repo, and I'm waiting to remove it because it's better to have some redirections in place from the current urls to the new ones. I've prepared a small set of RewriteRules (attached): I don't know the actual setup of apache for docs.p.o but at least they are a start :) whoever has root access, could you please review & apply those rules?

Thanks to Georg, who applied the rewrites for both 2.7 and 3.2.

> Once the rewrites are in place, i'll take care of removing the Doc/documenting dir from the active branches.

and so Doc/documenting is gone on all the active branches.

Cheers,
--
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
FWIW the quick change i pastebin'ed is basically covered by the change already under review in http://bugs.python.org/review/13704/show. I've made my comments and suggestions there.

I looked into Modules/expat/xmlparse.c and it has an odd copy of the old string hash algorithm entirely for its own internal use and its own internal hash table implementations. That module is likely vulnerable to creatively crafted documents for the same reason. With 13704 and the public API it provides to get the random hash seed, that module could simply be updated to use that in its own hash implementation.

As for when to enable it or not, I unfortunately have to agree: despite my wild desires, we can't turn on the hash randomization change by default in anything prior to 3.3.

-gps

On Sat, Jan 14, 2012 at 11:17 AM, Gregory P. Smith g...@krypto.org wrote:
> My patch example does change the bytes object hash as well as Unicode.
>
> On Jan 13, 2012 7:46 PM, mar...@v.loewis.de wrote:
>>> What an implementation looks like: http://pastebin.com/9ydETTag
>>> some stuff to be filled in, but this is all that is really required.
>>
>> I think this statement (and the patch) is wrong. You also need to change the byte string hashing, at least for 2.x. This I consider the biggest flaw in that approach - other people may have written string-like objects which continue to compare equal to a string but now hash different.
>>
>> Regards, Martin
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
Victor Stinner wrote:
> - Marc Andre Lemburg proposes to fix the vulnerability directly in dict (for any key type). The patch raises an exception if a lookup causes more than 1000 collisions.

Am I missing something? How does this fix the vulnerability? It seems to me that the only thing this does is turn one sort of DOS attack into another sort of DOS attack: hostile users will just cause hash collisions until an exception is raised and the application falls over.

Catching these exceptions, and recovering from them (how?), would be the responsibility of the application author. Given that developers are unlikely to ever see 1000 collisions by accident, or even realise that it could happen, I don't expect that many people will do that -- until they personally get bitten.

--
Steven
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
Guido van Rossum wrote:
> On Fri, Jan 13, 2012 at 5:58 PM, Gregory P. Smith g...@krypto.org wrote:
>> It is perfectly okay to break existing users who had anything depending on ordering of internal hash tables. Their code was already broken. We *will* provide a flag and/or environment variable that can be set to turn the feature off at their own peril which they can use in their test harnesses that are stupid enough to use doctests with order dependencies.
>
> No, that is not how we usually take compatibility between bugfix releases. "Your code is already broken" is not an argument to break forcefully what worked (even if by happenstance) before. The difference between CPython and Jython (or between different CPython feature releases) also isn't relevant -- historically we have often bent over backwards to avoid changing behavior that was technically undefined, if we believed it would affect a significant fraction of users.
>
> I don't think anyone doubts that this will break lots of code (at least, the arguments I've heard have been "their code is broken", not "nobody does that").

I don't know about "lots" of code, but it will break at least one library (or so I'm told):

http://mail.python.org/pipermail/python-list/2012-January/1286535.html

--
Steven
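For anyone unsure what "depending on ordering of internal hash tables" looks like in practice, here is a small sketch of the fragile pattern next to an order-independent alternative. The names `menu`, `fragile` and `stable` are made up for the example.

```python
menu = {"spam", "ham", "eggs"}

# Fragile: the iteration order of a set follows the hash table layout, so
# this string can differ between interpreters, builds, or hash seeds.
fragile = ", ".join(menu)

# Stable: sorting first makes the output independent of hash values, which
# is what a doctest or golden-file comparison would need.
stable = ", ".join(sorted(menu))

print(stable)
```

Code written like the second form would keep working whichever of the proposed fixes is adopted; code like the first only works as long as the hash function never changes.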
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
On Sun, Jan 15, 2012 at 2:42 PM, Steven D'Aprano st...@pearwood.info wrote:
> Victor Stinner wrote:
>> - Marc Andre Lemburg proposes to fix the vulnerability directly in dict (for any key type). The patch raises an exception if a lookup causes more than 1000 collisions.
>
> Am I missing something? How does this fix the vulnerability? It seems to me that the only thing this does is turn one sort of DOS attack into another sort of DOS attack: hostile users will just cause hash collisions until an exception is raised and the application falls over.
>
> Catching these exceptions, and recovering from them (how?), would be the responsibility of the application author. Given that developers are unlikely to ever see 1000 collisions by accident, or even realise that it could happen, I don't expect that many people will do that -- until they personally get bitten.

As I understand it, the way the attack works is that a *single* malicious request from the attacker can DoS the server by eating CPU resources while evaluating a massive collision chain induced in a dict by attacker supplied data. Explicitly truncating the collision chain boots them out almost immediately (likely with a 500 response for an internal server error), so they no longer affect other events, threads and processes on the same machine.

In some ways, the idea is analogous to the way we implement explicit recursion limiting in an attempt to avoid actually blowing the C stack - we take a hard-to-detect-and-hard-to-handle situation (i.e. blowing the C stack or malicious generation of long collision chains in a dict) and replace it with something that is easy to detect and can be handled by normal exception processing (i.e. a recursion depth exception or one reporting an excessive number of slot collisions in a dict lookup).

That then makes the default dict implementation safe from this kind of attack by default, and use cases that are getting that many collisions legitimately can be handled in one of two ways:

- switch to a more appropriate data type (if you're getting that many collisions with benign data, a dict is probably the wrong container to be using)
- offer a mechanism (command line switch or environment variable) to turn the collision limiting off

Now, where you can still potentially run into problems is if a single shared dict is used to store both benign and malicious data - if the malicious data makes it into the destination dict before the exception finally gets triggered, and then benign data also happens to trigger the same collision chain, then yes, the entire app may fall over. However, such an app would have been crippled by the original DoS anyway, since its performance would have been gutted - the collision chain limiting just means it will trigger exceptions for the cases that would have been insanely slow.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
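The collision-limiting idea can be sketched with a toy table. This is only an illustration under simplifying assumptions: it uses linear probing and a fixed size, whereas the real dict uses a perturbed probe sequence and resizes, and the names `ToyTable`, `CollisionLimitExceeded` and `Colliding` are hypothetical and not taken from the proposed patch.

```python
class CollisionLimitExceeded(Exception):
    """Raised when one insert walks past too many occupied slots."""


class ToyTable:
    """Fixed-size open-addressing table with a probe limit (toy model)."""

    def __init__(self, size=64, limit=10):
        self.slots = [None] * size
        self.limit = limit

    def insert(self, key, value):
        size = len(self.slots)
        index = hash(key) % size
        for collisions in range(size):
            entry = self.slots[index]
            if entry is None or entry[0] == key:
                self.slots[index] = (key, value)
                return collisions
            if collisions >= self.limit:
                # Give up instead of walking an ever longer chain.
                raise CollisionLimitExceeded(
                    "more than %d collisions" % self.limit)
            index = (index + 1) % size  # linear probing, for simplicity
        raise RuntimeError("table is full")


class Colliding:
    """Hypothetical attacker-style key: every instance hashes to 42."""

    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 42

    def __eq__(self, other):
        return isinstance(other, Colliding) and self.n == other.n


table = ToyTable(size=64, limit=10)
rejected_at = None
try:
    for n in range(100):
        table.insert(Colliding(n), n)
except CollisionLimitExceeded:
    rejected_at = n  # first key whose insert exceeded the limit

print("insert rejected at key number", rejected_at)
```

The cost of each hostile insert is capped by the limit, so the request fails quickly with an ordinary exception instead of consuming quadratic time, which is the trade-off discussed in this thread.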
Re: [Python-Dev] Documenting Python is moving to devguide
>> The section is still available in the cpython repo, and I'm waiting to remove it because it's better to have some redirections in place from the current urls to the new ones. I've prepared a small set of RewriteRules (attached): I don't know the actual setup of apache for docs.p.o but at least they are a start :) whoever has root access, could you please review & apply those rules?
>
> Thanks to Georg, who applied the rewrites for both 2.7 and 3.2.
>
>> Once the rewrites are in place, i'll take care of removing the Doc/documenting dir from the active branches.
>
> and so Doc/documenting is gone on all the active branches.

Good work Sandro, thanks! "Documenting Python" definitely belongs in the devguide

Eli