Author: Hakan Ardo <ha...@debian.org> Branch: extradoc Changeset: r4663:04439fef5415 Date: 2012-08-17 12:20 +0200 http://bitbucket.org/pypy/extradoc/changeset/04439fef5415/
Log: merge diff --git a/talk/stm2012/stmimpl.rst b/talk/stm2012/stmimpl.rst --- a/talk/stm2012/stmimpl.rst +++ b/talk/stm2012/stmimpl.rst @@ -19,7 +19,8 @@ done in a local copy. If this transaction successfully commits, the original global object is *not* changed --- it is really immutable. But the copy becomes global, and the old global object's header is updated -with a pointer to the new global object. +with a pointer to the new global object. We thus make a chained list +of global versions. CPUs model @@ -31,9 +32,9 @@ be delayed and only show up later in main memory. The delayed stores are always flushed to main memory in program order. -Of course if the same CPU loads a value just stored, it will see the +Of course if the same CPU loads a value it just stored, it will see the value as modified (self-consistency); but other CPUs might temporarily -still see the old value. +see the old value. The MFENCE instruction waits until all delayed stores from this CPU have been flushed. (A CPU has no built-in way to wait until *other* CPU's @@ -49,12 +50,23 @@ Every object starts with three fields: - h_global (boolean) -- h_nonmodified (boolean) +- h_possibly_outdated (boolean) +- h_written (boolean) - h_version (unsigned integer) The h_version is an unsigned "version number". More about it below. -The other two fields are flags. (In practice they are just two bits -of the GC h_tid field.) +The other fields are flags. (In practice they are just bits inside the +GC h_tid field.) + +- ``h_global`` means that the object is a global object. + +- ``h_possibly_outdated`` is used as an optimization: it means that the + object is possibly outdated. It is False for all local objects. It + is also False if the object is a global object, is the most recent of + its chained list of versions, and is known to have no ``global2local`` + entry in any transaction. + +- ``h_written`` is set on local objects that have been written to. Transaction details @@ -65,6 +77,8 @@ - start_time - global2local +- list_of_read_objects +- recent_reads_cache The ``start_time`` is the "time" at which the transaction started. All reads and writes done so far in the transaction appear consistent with @@ -74,8 +88,17 @@ ``global2local`` is a dictionary-like mapping of global objects to their corresponding local objects. +``list_of_read_objects`` is a set of all global objects read from, in +the version that was used for reading. It is actually implemented as a +list, but the order or repeated presence of elements in the list is +irrelevant. -Pseudo-code during transactions +``recent_reads_cache`` is a fixed-size cache that remembers recent +additions to the preceeding list, in order to avoid inserting too much +repeated entries into the list, as well as keep lightweight statistics. + + +Pseudo-code: read/write barriers --------------------------------------- Variable names: @@ -87,19 +110,19 @@ * ``R`` is a pointer to an object that was checked for being *read-ready*: reading its fields is ok. -* ``L`` is a pointer to a *local* object. Reading its fields is - always ok, but not necessarily writing. +* ``L`` is a pointer to a *local* object. We can always read from + but not necessarily write to local objects. -* ``W`` is a pointer to a local object ready to *write*. +* ``W`` is a pointer to a *writable* local object. -``W = Allocate(size)`` allocates a local object, and as the name of -the variable suggests, returns it ready to write:: +``W = Allocate(size)`` allocates a local object:: def Allocate(size): W = malloc(size) W->h_global = False - W->h_nonmodified = False + W->h_possibly_outdated = False + W->h_written = True W->h_version = 0 return W @@ -115,7 +138,7 @@ while (v := R->h_version) & 1: # "has a more recent version" R = v & ~ 1 if v > start_time: # object too recent? - validate_fast() # try to move start_time forward + ValidateFast() # try to move start_time forward return LatestGlobalVersion(G) # restart searching from G PossiblyUpdateChain(G) return R @@ -125,107 +148,113 @@ It takes a random pointer ``P`` and returns a possibly different pointer ``R`` out of which we can read from the object. The result ``R`` remains valid for read access until either the current transaction ends, -or until a write into the same object is done. - -:: +or until a write into the same object is done. Pseudo-code:: def DirectReadBarrier(P): if not P->h_global: # fast-path return P - R = LatestGlobalVersion(P) + if not P->h_possibly_outdated: + R = P + else: + R = LatestGlobalVersion(P) + if R->h_possibly_outdated and R in global2local: + L = global2local[R] + return L + R = AddInReadSet(R) # see below + return R + + +A simple optimization is possible. If ``R`` is returned by a previous +call to ``DirectReadBarrier`` and the current transaction is still +running, but we could have written to ``R`` in the meantime, then we +need to repeat only part of the logic, because we don't need +``AddInReadSet`` again. It gives this:: + + def RepeatReadBarrier(R): + if not R->h_possibly_outdated: # fast-path + return R + # LatestGlobalVersion(R) would either return R or abort + # the whole transaction, so omitting it is not wrong if R in global2local: L = global2local[R] return L - else: - AddInReadSet(R) # see below - return R + return R -``L = Localize(R)`` is an operation that takes a read-ready pointer to -a global object and returns a corresponding pointer to a local object. - -:: +``L = Localize(R)`` is an operation that takes a read-ready pointer to a +global object and returns a corresponding pointer to a local object:: def Localize(R): - if P in global2local: - return global2local[P] + if R in global2local: + return global2local[R] L = malloc(sizeof R) - L->h_nonmodified = True - L->h_version = P + L->h_global = False + L->h_possibly_outdated = False + L->h_written = False + L->h_version = R # back-reference to the original L->objectbody... = R->objectbody... global2local[R] = L return L -``L = LocalizeReadBarrier(P)`` is a different version of the read -barrier that works by returning a local object. +``W = WriteBarrier(P)`` and ``W = WriteBarrierFromReadReady(R)`` are +two versions of the write barrier:: -:: - - def LocalizeReadBarrier(P): + def WriteBarrier(P): if not P->h_global: # fast-path return P - R = LatestGlobalVersion(P) - L = Localize(R) - return L - - -``W = WriteBarrier(P)`` is the write barrier. - -:: - - def WriteBarrier(P): - W = LocalizeReadBarrier(P) - W->h_nonmodified = False + if P->h_possibly_outdated: + R = LatestGlobalVersion(P) + else: + R = P + W = Localize(R) + W->h_written = True + R->h_possibly_outdated = True return W + def WriteBarrierFromReadReady(P): + if not R->h_global: # fast-path + return R + W = Localize(R) + W->h_written = True + R->h_possibly_outdated = True + return W -``R = AdaptiveReadBarrier(P)`` is the adaptive read barrier. It can use -the technique of either ``DirectReadBarrier`` or -``LocalizeReadBarrier``, based on heuristics for better performance:: - def AdaptiveReadBarrier(P): - if not P->h_global: # fast-path - return P - R = LatestGlobalVersion(P) - if R in global2local: - return global2local[R] - if R seen often enough in readset: - L = Localize(R) # LocalizeReadBarrier - return L +Auto-localization of some objects +---------------------------------------- + +The "fast-path" markers above are quick checks that are supposed to be +inlined in the caller, so that we only have to pay for a full call to a +barrier implementation when the fast-path fails. + +However, even the fast-path of ``DirectReadBarrier`` fails repeatedly +when the ``DirectReadBarrier`` is invoked repeatedly on the same set of +global objects. This occurs in example of code that repeatedly +traverses the same data structure, visiting the same objects over and +over again. + +If the objects that make up the data structure were local, then we would +completely avoid triggering the read barrier's implementation. So +occasionally, it is better to *localize* global objects even when they +are only read from. + +This is done by tweaking ``AddInReadSet``, whose main purpose is to +record the read object in a set (actually a list):: + + def AddInReadSet(R): + if R not in recent_reads_cache: + list_of_read_objects.append(R) + recent_reads_cache[R] = 1 + # the cache is fixed-size, so the line above + # possibly evinces another older entry + return R else: - AddInReadSet(R) # DirectReadBarrier - return R - - -This adaptive localization of read-only objects is useful for example in -the following situation: we have a pointer ``P1`` to some parent object, -out of which we repeatedly try to read the same field ``Field`` and use -the result ``P`` in some call. Because the call may possibly have write -effects to the parent object, we normally need to redo -``DirectReadBarrier`` on ``P1`` every time. If instead we do -``AdaptiveReadBarrier`` then after a few iterations it will localize the -object and return ``L1``. On ``L1`` no read barrier is needed any more. - -Moreover, if we also need to read the subobject ``P``, we also need to -call a read barrier on it every time. It may return ``L`` after a few -iterations, but this time we win less, because during the next iteration -we again read ``P`` out of ``L1``. The trick is that when we read a -field out of a local object ``L1``, and it is a pointer on which we -subsequently do a read barrier, then afterwards we can update the -original pointer directly in ``L1``. - -Similarily, if we start with a global ``R1`` and read a pointer ``P`` -which is updated to its latest global version ``R``, then we can update -the original pointer in-place. - -The only case in which it is not permitted xxx - -:: - - def DependentUpdate(R1, Field, R): - if R1->h_global: # can't modify R1 unless it is local - return - R1->Field = R # possibly update the pointer - - + count = recent_reads_cache[R] + count += 1 + recent_reads_cache[R] = count + if count < THRESHOLD: + return R + else: + L = Localize(R) + return L _______________________________________________ pypy-commit mailing list pypy-commit@python.org http://mail.python.org/mailman/listinfo/pypy-commit