On Wednesday, January 20, 2016 4:10 PM, Brett Cannon <br...@python.org> wrote:


>I think Glenn was assuming we had a single, global version # that all dicts 
>shared without having a per-dict version ID. The key thing here is that we 
>have a global counter that tracks the number of mutations for all dictionaries 
>but whose value we store as a per-dictionary value. That ends up making the 
>version ID inherently both a token representing the state of any dict but also 
>the uniqueness of the dict since no two dictionaries will ever have the same 
>version ID.

This idea worries me. I'm not entirely sure why, but I think it's because of
threading. After all, it's pretty rare for two threads to both want to work on
the same dict,
but very, very common for two threads to both want to work on _any_ dict. So, 
imagine someone manages to remove the GIL from CPython by using STM: now most 
transactions are bumping that global counter, meaning most transactions fail 
and have to be retried, so you end up with 8 cores each running at 1/64th the 
speed of a single core but burning 100% CPU. Obviously a real-life 
implementation wouldn't be _that_ stupid; you'd special-case the 
version-bumping (maybe unconditionally bump it N times before starting the 
transaction, and then as long as you don't bump more than N times during the 
transaction, you can commit without touching it), but there's still going to be 
a lot of contention.
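
To make the "bump it N times up front" idea concrete, here is a rough sketch in plain Python. It is not STM and not anything CPython does; `GlobalVersionCounter`, `LocalVersionSource`, and `worker` are hypothetical names made up for illustration, and a lock stands in for the contended shared counter. Each thread claims a block of N version numbers in one shared update and then hands them out locally:

```python
import threading


class GlobalVersionCounter:
    """Hypothetical shared counter; the lock stands in for the contended state."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = 0
        self.reservations = 0  # how many times the shared counter was touched

    def reserve(self, n):
        """Claim n consecutive version numbers with a single shared update."""
        with self._lock:
            start = self._next
            self._next += n
            self.reservations += 1
        return start, start + n


class LocalVersionSource:
    """Per-thread source that hands out versions from a reserved block."""

    def __init__(self, counter, block_size):
        self._counter = counter
        self._block_size = block_size
        self._cur, self._end = counter.reserve(block_size)

    def bump(self):
        if self._cur == self._end:
            # Block exhausted: only now do we touch the shared counter again.
            self._cur, self._end = self._counter.reserve(self._block_size)
        version = self._cur
        self._cur += 1
        return version


def worker(counter, out, bumps, block_size):
    source = LocalVersionSource(counter, block_size)
    out.extend(source.bump() for _ in range(bumps))


counter = GlobalVersionCounter()
results = [[] for _ in range(4)]
threads = [
    threading.Thread(target=worker, args=(counter, results[i], 1000, 100))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
all_versions = [v for r in results for v in r]
```

With these numbers the shared counter is touched 40 times rather than 4000, and every version handed out is still unique. The trade-off is that versions are no longer ordered across threads, so this only helps if consumers compare versions for equality and never for "newer than".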

And a global counter also affects something like PyPy being able to use FAT-Python-style
AoT optimizations via cpyext. At first glance that sounds like a stupid 
idea--why would you want to run an optimizer through a slow emulator? But the 
optimizer only runs once and transforms the function code, which runs a zillion 
times, so who cares how slow the optimizer is? Of course it may still be true 
that many of the AoT optimizations that FAT makes don't apply very well to 
PyPy, in which case it doesn't matter. But I don't think we can assume that a 
priori.

Is there a way to define this loosely enough so that the implementation _can_ 
be a single global counter, if that turns out to be most efficient, but can 
also be a counter per dictionary and a globally-unique ID per dictionary?
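
As a sketch of what "loosely enough" could mean: if the only thing callers may do with a version is compare it for equality against one they saved earlier, then both designs fit behind the same interface. The classes below and the `version_token()` / `make_guard()` names are hypothetical, invented here for illustration, and only `__setitem__` is instrumented (a real implementation would have to cover every mutating operation):

```python
import itertools


class GlobalCounterDict(dict):
    """Variant A: one global counter; each dict stores the last value it drew."""

    _counter = itertools.count(1)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._version = next(GlobalCounterDict._counter)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._version = next(GlobalCounterDict._counter)

    def version_token(self):
        # Opaque token: unique across all dicts and all states.
        return self._version


class PerDictCounterDict(dict):
    """Variant B: a unique ID per dict plus a private per-dict counter."""

    _ids = itertools.count(1)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._uid = next(PerDictCounterDict._ids)
        self._version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._version += 1  # no shared state touched on mutation

    def version_token(self):
        # Opaque token: (which dict, which state of that dict).
        return (self._uid, self._version)


def make_guard(d):
    """Return a check that passes while d is still in the state seen now."""
    token = d.version_token()
    return lambda: d.version_token() == token
```

A guard built with `make_guard()` behaves the same way with either class: it keeps passing while other dicts are modified and fails once its own dict is. Code that relies only on that property would not care which variant an implementation picked.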