Re: User model stacking: current status

Brian Goetz Thu, 05 May 2022 12:21:50 -0700

There are lots of other things to discuss here, including what non-atomic B2 really means, and whether there are additional risks that come from tearing _between the null and the fields_.

So, let's discuss non-atomic B2s. (First, note that atomicity is only relevant in the heap; on the stack, everything is thread-confined, so there will be no tearing.)


If we have:

    non-atomic __b2 class DateTime {
        long date;
        long time;
    }

then the layout of a B2 (or a B3.ref) is really (long, long, boolean), not just (long, long), because of the null channel. (We may be able to hide the null channel elsewhere, but that's an optimization.)

If two threads racily write (d1, t1) and (d2, t2) to a shared mutable DateTime, it is possible for an observer to observe (d1, t2) or (d2, t1). Saying non-atomic says "this is the cost of data races". But additionally, if we have a race between writing null and (d, t), there is another possible form of tearing.
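To make the (d1, t2) / (d2, t1) outcomes concrete, here is a small sketch in plain Java that models the DateTime slot as two independently stored longs and enumerates every interleaving of two writers, each storing date and then time. The class `TearingSketch` and its methods are hypothetical names for illustration; this is a model of field-by-field interleaving, not a claim about how the VM would actually lay out or write the value.

```java
import java.util.Set;
import java.util.TreeSet;

class TearingSketch {
    // Recursively interleave the stores of writer A and writer B.
    // Each writer performs two stores in program order: index 0 (date), then index 1 (time).
    static void interleave(long[] a, long[] b, int ai, int bi, long[] slot, Set<String> out) {
        if (ai == 2 && bi == 2) {
            // Both writers are done; record what an observer would now read.
            out.add("(" + slot[0] + ", " + slot[1] + ")");
            return;
        }
        if (ai < 2) {
            long saved = slot[ai];
            slot[ai] = a[ai];               // writer A's next store
            interleave(a, b, ai + 1, bi, slot, out);
            slot[ai] = saved;               // undo before trying the other branch
        }
        if (bi < 2) {
            long saved = slot[bi];
            slot[bi] = b[bi];               // writer B's next store
            interleave(a, b, ai, bi + 1, slot, out);
            slot[bi] = saved;
        }
    }

    // All final (date, time) pairs reachable when (d1, t1) and (d2, t2) race.
    static Set<String> finalStates(long d1, long t1, long d2, long t2) {
        Set<String> out = new TreeSet<>();
        interleave(new long[] {d1, t1}, new long[] {d2, t2}, 0, 0, new long[2], out);
        return out;
    }

    public static void main(String[] args) {
        // prints [(1, 10), (1, 20), (2, 10), (2, 20)]
        System.out.println(finalStates(1, 10, 2, 20));
    }
}
```

Two of the four results are the torn pairs, even though each writer stores its own fields in program order.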

Let's write this out more explicitly. Suppose that T1 writes a non-null value (d, t, true), and T2 writes null as (0, 0, false). Then it would be possible to observe (0, 0, true), which means that we would conceivably be exposing the zero value to the user, even though a B2 class might want to hide its zero.

So, suppose instead that we implemented writing a null as simply storing false to the synthetic boolean field. Then, in the event of a race between reader and writer, we could only see values for date and time that were previously put there by some thread. This satisfies the OOTA (out of thin air) safety requirements of the JMM.
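The difference between the two ways of writing null can be sketched by enumerating interleavings over a modeled (long, long, boolean) slot. Everything here is an illustrative assumption: the class `NullWriteSketch`, the field indices, the initial slot contents (3, 4, true), and the idea that each writer's stores land in program order. It is a model of the argument above, not the VM's implementation.

```java
import java.util.Set;
import java.util.TreeSet;

class NullWriteSketch {
    static final int DATE = 0, TIME = 1, NON_NULL = 2;

    // Enumerate every interleaving of two writers' stores (each writer in program
    // order) and record the final contents of the slot. A store is {fieldIndex, value}.
    static void interleave(long[][] a, long[][] b, int ai, int bi, long[] slot, Set<String> out) {
        if (ai == a.length && bi == b.length) {
            out.add("(" + slot[DATE] + ", " + slot[TIME] + ", " + (slot[NON_NULL] != 0) + ")");
            return;
        }
        if (ai < a.length) {
            int field = (int) a[ai][0];
            long saved = slot[field];
            slot[field] = a[ai][1];
            interleave(a, b, ai + 1, bi, slot, out);
            slot[field] = saved;
        }
        if (bi < b.length) {
            int field = (int) b[bi][0];
            long saved = slot[field];
            slot[field] = b[bi][1];
            interleave(a, b, ai, bi + 1, slot, out);
            slot[field] = saved;
        }
    }

    // T1 writes (d, t, true); T2 writes null, either wide (0, 0, false) or narrow (false only).
    static Set<String> finalStates(long d, long t, boolean wideNullWrite) {
        long[][] t1 = {{DATE, d}, {TIME, t}, {NON_NULL, 1}};
        long[][] t2 = wideNullWrite
                ? new long[][] {{DATE, 0}, {TIME, 0}, {NON_NULL, 0}}
                : new long[][] {{NON_NULL, 0}};
        long[] slot = {3, 4, 1};  // assumed earlier value already stored by some thread
        Set<String> out = new TreeSet<>();
        interleave(t1, t2, 0, 0, slot, out);
        return out;
    }

    public static void main(String[] args) {
        // Wide null write: the zero value can leak out as non-null. prints true
        System.out.println(finalStates(7, 9, true).contains("(0, 0, true)"));
        // Narrow null write: prints [(7, 9, false), (7, 9, true)]
        System.out.println(finalStates(7, 9, false));
    }
}
```

In this model the narrow write never produces a date or time that some thread did not store, which is the OOTA property described above; the wide write can produce (0, 0, true).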

The other consequence we might have from this sort of tearing is if one of the other fields is an OOP. If the GC is unaware of the significance of the null field (and we'd like for the GC to stay unaware of this), then it is possible to have a null value where one of the oop fields (from a previous write) is non-null, keeping that object reachable even when it is logically not reachable. (As an interesting connection, the boolean here is "special" in the same way as the synthetic boolean channel is in pattern matching -- it dictates whether the _other_ channels are valid. Which makes nullable values a good implementation strategy for pattern carriers.)
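As a rough illustration of the reachability concern, the sketch below models a flattened nullable value with one reference field and a null channel as an ordinary Java object. The names `PinnedOopSketch`, `Slot`, `payload`, and `gcStillSeesPayload` are all hypothetical, and the "GC view" is just a null check standing in for a collector that traces oop fields without consulting the null channel.

```java
class PinnedOopSketch {
    // Hypothetical stand-in for a flattened slot holding a nullable value.
    static final class Slot {
        Object payload;   // models an oop field inside the flattened value
        boolean nonNull;  // models the synthetic null channel

        void writeValue(Object p) {
            payload = p;
            nonNull = true;
        }

        // Narrow null write: only the null channel is touched.
        void writeNullNarrow() {
            nonNull = false;
        }

        // Wide null write: every field is zeroed along with the null channel.
        void writeNullWide() {
            payload = null;
            nonNull = false;
        }

        boolean isLogicallyNull() {
            return !nonNull;
        }

        // What a collector unaware of the null channel would still trace.
        boolean gcStillSeesPayload() {
            return payload != null;
        }
    }

    public static void main(String[] args) {
        Slot slot = new Slot();
        slot.writeValue(new Object());
        slot.writeNullNarrow();
        // Logically null, yet the old payload is still referenced: prints true true
        System.out.println(slot.isLogicallyNull() + " " + slot.gcStillSeesPayload());
    }
}
```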

So we have a choice for how we implement writing nulls, with a pick-your-poison consequence:

 - If we do a wide write, and write all the fields to zero, we risk exposing a zero value even when the zero is a bad value;
 - If we do a narrow write, and only write the null field, we risk pinning other OOPs in memory.
