2017-08-16 1:55 GMT+02:00 Yury Selivanov <yselivanov...@gmail.com>: > Hi, > > Here's the PEP 550 version 2. Thanks to a very active and insightful > discussion here on Python-ideas, we've discovered a number of > problems with the first version of the PEP. This version is a complete > rewrite (only Abstract, Rationale, and Goals sections were not updated). > > The updated PEP is live on python.org: > https://www.python.org/dev/peps/pep-0550/ > > There is no reference implementation at this point, but I'm confident > that this version of the spec will have the same extremely low > runtime overhead as the first version. Thanks to the new ContextItem > design, accessing values in the context is even faster now. > > Thank you! > > > PEP: 550 > Title: Execution Context > Version: $Revision$ > Last-Modified: $Date$ > Author: Yury Selivanov <y...@magic.io> > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 11-Aug-2017 > Python-Version: 3.7 > Post-History: 11-Aug-2017, 15-Aug-2017 > > > Abstract > ======== > > This PEP proposes a new mechanism to manage execution state--the > logical environment in which a function, a thread, a generator, > or a coroutine executes in. > > A few examples of where having a reliable state storage is required: > > * Context managers like decimal contexts, ``numpy.errstate``, > and ``warnings.catch_warnings``; > > * Storing request-related data such as security tokens and request > data in web applications, implementing i18n; > > * Profiling, tracing, and logging in complex and large code bases. > > The usual solution for storing state is to use a Thread-local Storage > (TLS), implemented in the standard library as ``threading.local()``. > Unfortunately, TLS does not work for the purpose of state isolation > for generators or asynchronous code, because such code executes > concurrently in a single thread. > > > Rationale > ========= > > Traditionally, a Thread-local Storage (TLS) is used for storing the > state. However, the major flaw of using the TLS is that it works only > for multi-threaded code. It is not possible to reliably contain the > state within a generator or a coroutine. For example, consider > the following generator:: > > def calculate(precision, ...): > with decimal.localcontext() as ctx: > # Set the precision for decimal calculations > # inside this block > ctx.prec = precision > > yield calculate_something() > yield calculate_something_else() > > Decimal context is using a TLS to store the state, and because TLS is > not aware of generators, the state can leak. If a user iterates over > the ``calculate()`` generator with different precisions one by one > using a ``zip()`` built-in, the above code will not work correctly. > For example:: > > g1 = calculate(precision=100) > g2 = calculate(precision=50) > > items = list(zip(g1, g2)) > > # items[0] will be a tuple of: > # first value from g1 calculated with 100 precision, > # first value from g2 calculated with 50 precision. > # > # items[1] will be a tuple of: > # second value from g1 calculated with 50 precision (!!!), > # second value from g2 calculated with 50 precision. > > An even scarier example would be using decimals to represent money > in an async/await application: decimal calculations can suddenly > lose precision in the middle of processing a request. Currently, > bugs like this are extremely hard to find and fix. > > Another common need for web applications is to have access to the > current request object, or security context, or, simply, the request > URL for logging or submitting performance tracing data:: > > async def handle_http_request(request): > context.current_http_request = request > > await ... > # Invoke your framework code, render templates, > # make DB queries, etc, and use the global > # 'current_http_request' in that code. > > # This isn't currently possible to do reliably > # in asyncio out of the box. > > These examples are just a few out of many, where a reliable way to > store context data is absolutely needed. > > The inability to use TLS for asynchronous code has lead to > proliferation of ad-hoc solutions, which are limited in scope and > do not support all required use cases. > > Current status quo is that any library, including the standard > library, that uses a TLS, will likely not work as expected in > asynchronous code or with generators (see [3]_ as an example issue.) > > Some languages that have coroutines or generators recommend to > manually pass a ``context`` object to every function, see [1]_ > describing the pattern for Go. This approach, however, has limited > use for Python, where we have a huge ecosystem that was built to work > with a TLS-like context. Moreover, passing the context explicitly > does not work at all for libraries like ``decimal`` or ``numpy``, > which use operator overloading. > > .NET runtime, which has support for async/await, has a generic > solution of this problem, called ``ExecutionContext`` (see [2]_). > On the surface, working with it is very similar to working with a TLS, > but the former explicitly supports asynchronous code. > > > Goals > ===== > > The goal of this PEP is to provide a more reliable alternative to > ``threading.local()``. It should be explicitly designed to work with > Python execution model, equally supporting threads, generators, and > coroutines. > > An acceptable solution for Python should meet the following > requirements: > > * Transparent support for code executing in threads, coroutines, > and generators with an easy to use API. > > * Negligible impact on the performance of the existing code or the > code that will be using the new mechanism. > > * Fast C API for packages like ``decimal`` and ``numpy``. > > Explicit is still better than implicit, hence the new APIs should only > be used when there is no acceptable way of passing the state > explicitly. > > > Specification > ============= > > Execution Context is a mechanism of storing and accessing data specific > to a logical thread of execution. We consider OS threads, > generators, and chains of coroutines (such as ``asyncio.Task``) > to be variants of a logical thread. > > In this specification, we will use the following terminology: > > * **Local Context**, or LC, is a key/value mapping that stores the > context of a logical thread. > > * **Execution Context**, or EC, is an OS-thread-specific dynamic > stack of Local Contexts. > > * **Context Item**, or CI, is an object used to set and get values > from the Execution Context. > > Please note that throughout the specification we use simple > pseudo-code to illustrate how the EC machinery works. The actual > algorithms and data structures that we will use to implement the PEP > are discussed in the `Implementation Strategy`_ section. > > > Context Item Object > ------------------- > > The ``sys.new_context_item(description)`` function creates a > new ``ContextItem`` object. The ``description`` parameter is a > ``str``, explaining the nature of the context key for introspection > and debugging purposes. > > ``ContextItem`` objects have the following methods and attributes: > > * ``.description``: read-only description; > > * ``.set(o)`` method: set the value to ``o`` for the context item > in the execution context. > > * ``.get()`` method: return the current EC value for the context item. > Context items are initialized with ``None`` when created, so > this method call never fails. > > The below is an example of how context items can be used:: > > my_context = sys.new_context_item(description='mylib.context') > my_context.set('spam') >
Minor suggestion: Could we allow something like `sys.set_new_context_item(description='mylib.context', initial_value='spam')`? That would make it easier for type checkers to infer the type of a ContextItem, and it would save a line of code in the common case. With this modification, the type of new_context_item would be @overload def new_context_item(*, description: str, initial_value: T) -> ContextItem[T]: ... @overload def new_context_item(*, description: str) -> ContextItem[Any]: ... If we only allow the second variant, type checkers would need some sort of special casing to figure out that after .set(), .get() will return the same type. > # Later, to access the value of my_context: > print(my_context.get()) > > > Thread State and Multi-threaded code > ------------------------------------ > > Execution Context is implemented on top of Thread-local Storage. > For every thread there is a separate stack of Local Contexts -- > mappings of ``ContextItem`` objects to their values in the LC. > New threads always start with an empty EC. > > For CPython:: > > PyThreadState: > execution_context: ExecutionContext([ > LocalContext({ci1: val1, ci2: val2, ...}), > ... > ]) > > The ``ContextItem.get()`` and ``.set()`` methods are defined as > follows (in pseudo-code):: > > class ContextItem: > > def get(self): > tstate = PyThreadState_Get() > > for local_context in reversed(tstate.execution_context): > if self in local_context: > return local_context[self] > > def set(self, value): > tstate = PyThreadState_Get() > > if not tstate.execution_context: > tstate.execution_context = [LocalContext()] > > tstate.execution_context[-1][self] = value > > With the semantics defined so far, the Execution Context can already > be used as an alternative to ``threading.local()``:: > > def print_foo(): > print(ci.get() or 'nothing') > > ci = sys.new_context_item(description='test') > ci.set('foo') > > # Will print "foo": > print_foo() > > # Will print "nothing": > threading.Thread(target=print_foo).start() > > > Manual Context Management > ------------------------- > > Execution Context is generally managed by the Python interpreter, > but sometimes it is desirable for the user to take the control > over it. A few examples when this is needed: > > * running a computation in ``concurrent.futures.ThreadPoolExecutor`` > with the current EC; > > * reimplementing generators with iterators (more on that later); > > * managing contexts in asynchronous frameworks (implement proper > EC support in ``asyncio.Task`` and ``asyncio.loop.call_soon``.) > > For these purposes we add a set of new APIs (they will be used in > later sections of this specification): > > * ``sys.new_local_context()``: create an empty ``LocalContext`` > object. > > * ``sys.new_execution_context()``: create an empty > ``ExecutionContext`` object. > > * Both ``LocalContext`` and ``ExecutionContext`` objects are opaque > to Python code, and there are no APIs to modify them. > > * ``sys.get_execution_context()`` function. The function returns a > copy of the current EC: an ``ExecutionContext`` instance. > > The runtime complexity of the actual implementation of this function > can be O(1), but for the purposes of this section it is equivalent > to:: > > def get_execution_context(): > tstate = PyThreadState_Get() > return copy(tstate.execution_context) > > * ``sys.run_with_execution_context(ec: ExecutionContext, func, *args, > **kwargs)`` runs ``func(*args, **kwargs)`` in the provided execution > context:: > > def run_with_execution_context(ec, func, *args, **kwargs): > tstate = PyThreadState_Get() > > old_ec = tstate.execution_context > > tstate.execution_context = ExecutionContext( > ec.local_contexts + [LocalContext()] > ) > > try: > return func(*args, **kwargs) > finally: > tstate.execution_context = old_ec > > Any changes to Local Context by ``func`` will be ignored. > This allows to reuse one ``ExecutionContext`` object for multiple > invocations of different functions, without them being able to > affect each other's environment:: > > ci = sys.new_context_item('example') > ci.set('spam') > > def func(): > print(ci.get()) > ci.set('ham') > > ec = sys.get_execution_context() > > sys.run_with_execution_context(ec, func) > sys.run_with_execution_context(ec, func) > > # Will print: > # spam > # spam > > * ``sys.run_with_local_context(lc: LocalContext, func, *args, > **kwargs)`` runs ``func(*args, **kwargs)`` in the current execution > context using the specified local context. > > Any changes that ``func`` does to the local context will be > persisted in ``lc``. This behaviour is different from the > ``run_with_execution_context()`` function, which always creates > a new throw-away local context. > > In pseudo-code:: > > def run_with_local_context(lc, func, *args, **kwargs): > tstate = PyThreadState_Get() > > old_ec = tstate.execution_context > > tstate.execution_context = ExecutionContext( > old_ec.local_contexts + [lc] > ) > > try: > return func(*args, **kwargs) > finally: > tstate.execution_context = old_ec > > Using the previous example:: > > ci = sys.new_context_item('example') > ci.set('spam') > > def func(): > print(ci.get()) > ci.set('ham') > > ec = sys.get_execution_context() > lc = sys.new_local_context() > > sys.run_with_local_context(lc, func) > sys.run_with_local_context(lc, func) > > # Will print: > # spam > # ham > > As an example, let's make a subclass of > ``concurrent.futures.ThreadPoolExecutor`` that preserves the execution > context for scheduled functions:: > > class Executor(concurrent.futures.ThreadPoolExecutor): > > def submit(self, fn, *args, **kwargs): > context = sys.get_execution_context() > > fn = functools.partial( > sys.run_with_execution_context, context, > fn, *args, **kwargs) > > return super().submit(fn) > > > EC Semantics for Coroutines > --------------------------- > > Python :pep:`492` coroutines are used to implement cooperative > multitasking. For a Python end-user they are similar to threads, > especially when it comes to sharing resources or modifying > the global state. > > An event loop is needed to schedule coroutines. Coroutines that > are explicitly scheduled by the user are usually called Tasks. > When a coroutine is scheduled, it can schedule other coroutines using > an ``await`` expression. In async/await world, awaiting a coroutine > is equivalent to a regular function call in synchronous code. Thus, > Tasks are similar to threads. > > By drawing a parallel between regular multithreaded code and > async/await, it becomes apparent that any modification of the > execution context within one Task should be visible to all coroutines > scheduled within it. Any execution context modifications, however, > must not be visible to other Tasks executing within the same OS > thread. > > > Coroutine Object Modifications > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > To achieve this, a small set of modifications to the coroutine object > is needed: > > * New ``cr_local_context`` attribute. This attribute is readable > and writable for Python code. > > * When a coroutine object is instantiated, its ``cr_local_context`` > is initialized with an empty Local Context. > > * Coroutine's ``.send()`` and ``.throw()`` methods are modified as > follows (in pseudo-C):: > > if coro.cr_local_context is not None: > tstate = PyThreadState_Get() > > tstate.execution_context.push(coro.cr_local_context) > > try: > # Perform the actual `Coroutine.send()` or > # `Coroutine.throw()` call. > return coro.send(...) > finally: > coro.cr_local_context = tstate.execution_context.pop() > else: > # Perform the actual `Coroutine.send()` or > # `Coroutine.throw()` call. > return coro.send(...) > > * When Python interpreter sees an ``await`` instruction, it inspects > the ``cr_local_context`` attribute of the coroutine that is about > to be awaited. For ``await coro``: > > * If ``coro.cr_local_context`` is an empty ``LocalContext`` object > that ``coro`` was created with, the interpreter will set > ``coro.cr_local_context`` to ``None``. > > * If ``coro.cr_local_context`` was modified by Python code, the > interpreter will leave it as is. > > This makes any changes to execution context made by nested coroutine > calls within a Task to be visible throughout the Task:: > > ci = sys.new_context_item('example') > > async def nested(): > ci.set('nested') > > asynd def main(): > ci.set('main') > print('before:', ci.get()) > await nested() > print('after:', ci.get()) > > # Will print: > # before: main > # after: nested > > Essentially, coroutines work with Execution Context items similarly > to threads, and ``await`` expression acts like a function call. > > This mechanism also works for ``yield from`` in generators decorated > with ``@types.coroutine`` or ``@asyncio.coroutine``, which are > called "generator-based coroutines" according to :pep:`492`, > and should be fully compatible with native async/await coroutines. > > > Tasks > ^^^^^ > > In asynchronous frameworks like asyncio, coroutines are run by > an event loop, and need to be explicitly scheduled (in asyncio > coroutines are run by ``asyncio.Task``.) > > With the currently defined semantics, the interpreter makes > coroutines linked by an ``await`` expression share the same > Local Context. > > The interpreter, however, is not aware of the Task concept, and > cannot help with ensuring that new Tasks started in coroutines, > use the correct EC:: > > current_request = sys.new_context_item(description='request') > > async def child(): > print('current request:', repr(current_request.get())) > > async def handle_request(request): > current_request.set(request) > event_loop.create_task(child) > > run(top_coro()) > > # Will print: > # current_request: None > > To enable correct Execution Context propagation into Tasks, the > asynchronous framework needs to assist the interpreter: > > * When ``create_task`` is called, it should capture the current > execution context with ``sys.get_execution_context()`` and save it > on the Task object. > > * When the Task object runs its coroutine object, it should execute > ``.send()`` and ``.throw()`` methods within the captured > execution context, using the ``sys.run_with_execution_context()`` > function. > > With help from the asynchronous framework, the above snippet will > run correctly, and the ``child()`` coroutine will be able to access > the current request object through the ``current_request`` > Context Item. > > > Event Loop Callbacks > ^^^^^^^^^^^^^^^^^^^^ > > Similarly to Tasks, functions like asyncio's ``loop.call_soon()`` > should capture the current execution context with > ``sys.get_execution_context()`` and execute callbacks > within it with ``sys.run_with_execution_context()``. > > This way the following code will work:: > > current_request = sys.new_context_item(description='request') > > def log(): > request = current_request.get() > print(request) > > async def request_handler(request): > current_request.set(request) > get_event_loop.call_soon(log) > > > Generators > ---------- > > Generators in Python, while similar to Coroutines, are used in a > fundamentally different way. They are producers of data, and > they use ``yield`` expression to suspend/resume their execution. > > A crucial difference between ``await coro`` and ``yield value`` is > that the former expression guarantees that the ``coro`` will be > executed fully, while the latter is producing ``value`` and > suspending the generator until it gets iterated again. > > Generators, similarly to coroutines, have a ``gi_local_context`` > attribute, which is set to an empty Local Context when created. > > Contrary to coroutines though, ``yield from o`` expression in > generators (that are not generator-based coroutines) is semantically > equivalent to ``for v in o: yield v``, therefore the interpreter does > not attempt to control their ``gi_local_context``. > > > EC Semantics for Generators > ^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Every generator object has its own Local Context that stores > only its own local modifications of the context. When a generator > is being iterated, its local context will be put in the EC stack > of the current thread. This means that the generator will be able > to see access items from the surrounding context:: > > local = sys.new_context_item("local") > global = sys.new_context_item("global") > > def generator(): > local.set('inside gen:') > while True: > print(local.get(), global.get()) > yield > > g = gen() > > local.set('hello') > global.set('spam') > next(g) > > local.set('world') > global.set('ham') > next(g) > > # Will print: > # inside gen: spam > # inside gen: ham > > Any changes to the EC in nested generators are invisible to the outer > generator:: > > local = sys.new_context_item("local") > > def inner_gen(): > local.set('spam') > yield > > def outer_gen(): > local.set('ham') > yield from gen() > print(local.get()) > > list(outer_gen()) > > # Will print: > # ham > > > Running generators without LC > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Similarly to coroutines, generators with ``gi_local_context`` > set to ``None`` simply use the outer Local Context. > > The ``@contextlib.contextmanager`` decorator uses this mechanism to > allow its generator to affect the EC:: > > item = sys.new_context_item('test') > > @contextmanager > def context(x): > old = item.get() > item.set('x') > try: > yield > finally: > item.set(old) > > with context('spam'): > > with context('ham'): > print(1, item.get()) > > print(2, item.get()) > > # Will print: > # 1 ham > # 2 spam > > > Implementing Generators with Iterators > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > The Execution Context API allows to fully replicate EC behaviour > imposed on generators with a regular Python iterator class:: > > class Gen: > > def __init__(self): > self.local_context = sys.new_local_context() > > def __iter__(self): > return self > > def __next__(self): > return sys.run_with_local_context( > self.local_context, self._next_impl) > > def _next_impl(self): > # Actual __next__ implementation. > ... > > > Asynchronous Generators > ----------------------- > > Asynchronous Generators (AG) interact with the Execution Context > similarly to regular generators. > > They have an ``ag_local_context`` attribute, which, similarly to > regular generators, can be set to ``None`` to make them use the outer > Local Context. This is used by the new > ``contextlib.asynccontextmanager`` decorator. > > The EC support of ``await`` expression is implemented using the same > approach as in coroutines, see the `Coroutine Object Modifications`_ > section. > > > Greenlets > --------- > > Greenlet is an alternative implementation of cooperative > scheduling for Python. Although greenlet package is not part of > CPython, popular frameworks like gevent rely on it, and it is > important that greenlet can be modified to support execution > contexts. > > In a nutshell, greenlet design is very similar to design of > generators. The main difference is that for generators, the stack > is managed by the Python interpreter. Greenlet works outside of the > Python interpreter, and manually saves some ``PyThreadState`` > fields and pushes/pops the C-stack. Thus the ``greenlet`` package > can be easily updated to use the new low-level `C API`_ to enable > full support of EC. > > > New APIs > ======== > > Python > ------ > > Python APIs were designed to completely hide the internal > implementation details, but at the same time provide enough control > over EC and LC to re-implement all of Python built-in objects > in pure Python. > > 1. ``sys.new_context_item(description='...')``: create a > ``ContextItem`` object used to access/set values in EC. > > 2. ``ContextItem``: > > * ``.description``: read-only attribute. > * ``.get()``: return the current value for the item. > * ``.set(o)``: set the current value in the EC for the item. > > 3. ``sys.get_execution_context()``: return the current > ``ExecutionContext``. > > 4. ``sys.new_execution_context()``: create a new empty > ``ExecutionContext``. > > 5. ``sys.new_local_context()``: create a new empty ``LocalContext``. > > 6. ``sys.run_with_execution_context(ec: ExecutionContext, > func, *args, **kwargs)``. > > 7. ``sys.run_with_local_context(lc:LocalContext, > func, *args, **kwargs)``. > > > C API > ----- > > 1. ``PyContextItem * PyContext_NewItem(char *desc)``: create a > ``PyContextItem`` object. > > 2. ``PyObject * PyContext_GetItem(PyContextItem *)``: get the > current value for the context item. > > 3. ``int PyContext_SetItem(PyContextItem *, PyObject *)``: set > the current value for the context item. > > 4. ``PyLocalContext * PyLocalContext_New()``: create a new empty > ``PyLocalContext``. > > 5. ``PyLocalContext * PyExecutionContext_New()``: create a new empty > ``PyExecutionContext``. > > 6. ``PyExecutionContext * PyExecutionContext_Get()``: get the > EC for the active thread state. > > 7. ``int PyExecutionContext_Set(PyExecutionContext *)``: set the > passed EC object as the current for the active thread state. > > 8. ``int PyExecutionContext_SetWithLocalContext(PyExecutionContext *, > PyLocalContext *)``: allows to implement > ``sys.run_with_local_context`` Python API. > > > Implementation Strategy > ======================= > > LocalContext is a Weak Key Mapping > ---------------------------------- > > Using a weak key mapping for ``LocalContext`` implementation > enables the following properties with regards to garbage > collection: > > * ``ContextItem`` objects are strongly-referenced only from the > application code, not from any of the Execution Context > machinery or values they point to. This means that there > are no reference cycles that could extend their lifespan > longer than necessary, or prevent their garbage collection. > > * Values put in the Execution Context are guaranteed to be kept > alive while there is a ``ContextItem`` key referencing them in > the thread. > > * If a ``ContextItem`` is garbage collected, all of its values will > be removed from all contexts, allowing them to be GCed if needed. > > * If a thread has ended its execution, its thread state will be > cleaned up along with its ``ExecutionContext``, cleaning > up all values bound to all Context Items in the thread. > > > ContextItem.get() Cache > ----------------------- > > We can add three new fields to ``PyThreadState`` and > ``PyInterpreterState`` structs: > > * ``uint64_t PyThreadState->unique_id``: a globally unique > thread state identifier (we can add a counter to > ``PyInterpreterState`` and increment it when a new thread state is > created.) > > * ``uint64_t PyInterpreterState->context_item_deallocs``: every time > a ``ContextItem`` is GCed, all Execution Contexts in all threads > will lose track of it. ``context_item_deallocs`` will simply > count all ``ContextItem`` deallocations. > > * ``uint64_t PyThreadState->execution_context_ver``: every time > a new item is set, or an existing item is updated, or the stack > of execution contexts is changed in the thread, we increment this > counter. > > The above two fields allow implementing a fast cache path in > ``ContextItem.get()``, in pseudo-code:: > > class ContextItem: > > def get(self): > tstate = PyThreadState_Get() > > if (self.last_tstate_id == tstate.unique_id and > self.last_ver == tstate.execution_context_ver > self.last_deallocs == > tstate.iterp.context_item_deallocs): > return self.last_value > > value = None > for mapping in reversed(tstate.execution_context): > if self in mapping: > value = mapping[self] > break > > self.last_value = value > self.last_tstate_id = tstate.unique_id > self.last_ver = tstate.execution_context_ver > self.last_deallocs = tstate.interp.context_item_deallocs > > return value > > This is similar to the trick that decimal C implementation uses > for caching the current decimal context, and will have the same > performance characteristics, but available to all > Execution Context users. > > > Approach #1: Use a dict for LocalContext > ---------------------------------------- > > The straightforward way of implementing the proposed EC > mechanisms is to create a ``WeakKeyDict`` on top of Python > ``dict`` type. > > To implement the ``ExecutionContext`` type we can use Python > ``list`` (or a custom stack implementation with some > pre-allocation optimizations). > > This approach will have the following runtime complexity: > > * O(M) for ``ContextItem.get()``, where ``M`` is the number of > Local Contexts in the stack. > > It is important to note that ``ContextItem.get()`` will implement > a cache making the operation O(1) for packages like ``decimal`` > and ``numpy``. > > * O(1) for ``ContextItem.set()``. > > * O(N) for ``sys.get_execution_context()``, where ``N`` is the > total number of items in the current **execution** context. > > > Approach #2: Use HAMT for LocalContext > -------------------------------------- > > Languages like Clojure and Scala use Hash Array Mapped Tries (HAMT) > to implement high performance immutable collections [5]_, [6]_. > > Immutable mappings implemented with HAMT have O(log\ :sub:`32`\ N) > performance for both ``set()``, ``get()``, and ``merge()`` operations, > which is essentially O(1) for relatively small mappings > (read about HAMT performance in CPython in the > `Appendix: HAMT Performance`_ section.) > > In this approach we use the same design of the ``ExecutionContext`` > as in Approach #1, but we will use HAMT backed weak key Local Context > implementation. With that we will have the following runtime > complexity: > > * O(M * log\ :sub:`32`\ N) for ``ContextItem.get()``, > where ``M`` is the number of Local Contexts in the stack, > and ``N`` is the number of items in the EC. The operation will > essentially be O(M), because execution contexts are normally not > expected to have more than a few dozen of items. > > (``ContextItem.get()`` will have the same caching mechanism as in > Approach #1.) > > * O(log\ :sub:`32`\ N) for ``ContextItem.set()`` where ``N`` is the > number of items in the current **local** context. This will > essentially be an O(1) operation most of the time. > > * O(log\ :sub:`32`\ N) for ``sys.get_execution_context()``, where > ``N`` is the total number of items in the current **execution** > context. > > Essentially, using HAMT for Local Contexts instead of Python dicts, > allows to bring down the complexity of ``sys.get_execution_context()`` > from O(N) to O(log\ :sub:`32`\ N) because of the more efficient > merge algorithm. > > > Approach #3: Use HAMT and Immutable Linked List > ----------------------------------------------- > > We can make an alternative ``ExecutionContext`` design by using > a linked list. Each ``LocalContext`` in the ``ExecutionContext`` > object will be wrapped in a linked-list node. > > ``LocalContext`` objects will use an HAMT backed weak key > implementation described in the Approach #2. > > Every modification to the current ``LocalContext`` will produce a > new version of it, which will be wrapped in a **new linked list > node**. Essentially this means, that ``ExecutionContext`` is an > immutable forest of ``LocalContext`` objects, and can be safely > copied by reference in ``sys.get_execution_context()`` (eliminating > the expensive "merge" operation.) > > With this approach, ``sys.get_execution_context()`` will be an > **O(1) operation**. > > > Summary > ------- > > We believe that approach #3 enables an efficient and complete > Execution Context implementation, with excellent runtime performance. > > `ContextItem.get() Cache`_ enables fast retrieval of context items > for performance critical libraries like decimal and numpy. > > Fast ``sys.get_execution_context()`` enables efficient management > of execution contexts in asynchronous libraries like asyncio. > > > Design Considerations > ===================== > > Can we fix ``PyThreadState_GetDict()``? > --------------------------------------- > > ``PyThreadState_GetDict`` is a TLS, and some of its existing users > might depend on it being just a TLS. Changing its behaviour to follow > the Execution Context semantics would break backwards compatibility. > > > PEP 521 > ------- > > :pep:`521` proposes an alternative solution to the problem: > enhance Context Manager Protocol with two new methods: ``__suspend__`` > and ``__resume__``. To make it compatible with async/await, > the Asynchronous Context Manager Protocol will also need to be > extended with ``__asuspend__`` and ``__aresume__``. > > This allows to implement context managers like decimal context and > ``numpy.errstate`` for generators and coroutines. > > The following code:: > > class Context: > > def __enter__(self): > self.old_x = get_execution_context_item('x') > set_execution_context_item('x', 'something') > > def __exit__(self, *err): > set_execution_context_item('x', self.old_x) > > would become this:: > > local = threading.local() > > class Context: > > def __enter__(self): > self.old_x = getattr(local, 'x', None) > local.x = 'something' > > def __suspend__(self): > local.x = self.old_x > > def __resume__(self): > local.x = 'something' > > def __exit__(self, *err): > local.x = self.old_x > > Besides complicating the protocol, the implementation will likely > negatively impact performance of coroutines, generators, and any code > that uses context managers, and will notably complicate the > interpreter implementation. > > :pep:`521` also does not provide any mechanism to propagate state > in a local context, like storing a request object in an HTTP request > handler to have better logging. Nor does it solve the leaking state > problem for greenlet/gevent. > > > Can Execution Context be implemented outside of CPython? > -------------------------------------------------------- > > Because async/await code needs an event loop to run it, an EC-like > solution can be implemented in a limited way for coroutines. > > Generators, on the other hand, do not have an event loop or > trampoline, making it impossible to intercept their ``yield`` points > outside of the Python interpreter. > > > Backwards Compatibility > ======================= > > This proposal preserves 100% backwards compatibility. > > > Appendix: HAMT Performance > ========================== > > To assess if HAMT can be used for Execution Context, we implemented > it in CPython [7]_. > > .. figure:: pep-0550-hamt_vs_dict.png > :align: center > :width: 100% > > Figure 1. Benchmark code can be found here: [9]_. > > Figure 1 shows that HAMT indeed displays O(1) performance for all > benchmarked dictionary sizes. For dictionaries with less than 100 > items, HAMT is a bit slower than Python dict/shallow copy. > > .. figure:: pep-0550-lookup_hamt.png > :align: center > :width: 100% > > Figure 2. Benchmark code can be found here: [10]_. > > Figure 2 shows comparison of lookup costs between Python dict > and an HAMT immutable mapping. HAMT lookup time is 30-40% worse > than Python dict lookups on average, which is a very good result, > considering how well Python dicts are optimized. > > Note, that according to [8]_, HAMT design can be further improved. > > > Acknowledgments > =============== > > I thank Elvis Pranskevichus and Victor Petrovykh for countless > discussions around the topic and PEP proof reading and edits. > > Thanks to Nathaniel Smith for proposing the ``ContextItem`` design > [17]_ [18]_, for pushing the PEP towards a more complete design, and > coming up with the idea of having a stack of contexts in the thread > state. > > Thanks to Nick Coghlan for numerous suggestions and ideas on the > mailing list, and for coming up with a case that cause the complete > rewrite of the initial PEP version [19]_. > > > References > ========== > > .. [1] https://blog.golang.org/context > > .. [2] https://msdn.microsoft.com/en-us/library/system.threading. > executioncontext.aspx > > .. [3] https://github.com/numpy/numpy/issues/9444 > > .. [4] http://bugs.python.org/issue31179 > > .. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie > > .. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures- > persistenthashmap-part-ii.html > > .. [7] https://github.com/1st1/cpython/tree/hamt > > .. [8] https://michael.steindorfer.name/publications/oopsla15.pdf > > .. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd > > .. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e > > .. [11] https://github.com/1st1/cpython/tree/pep550 > > .. [12] https://www.python.org/dev/peps/pep-0492/#async-await > > .. [13] https://github.com/MagicStack/uvloop/blob/master/examples/ > bench/echoserver.py > > .. [14] https://github.com/MagicStack/pgbench > > .. [15] https://github.com/python/performance > > .. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c > > .. [17] https://mail.python.org/pipermail/python-ideas/2017- > August/046752.html > > .. [18] https://mail.python.org/pipermail/python-ideas/2017- > August/046772.html > > .. [19] https://mail.python.org/pipermail/python-ideas/2017- > August/046780.html > > > Copyright > ========= > > This document has been placed in the public domain. > _______________________________________________ > Python-ideas mailing list > Python-ideas@python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ >
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/