On 20 Nov, 2014, at 11:07 , Masao Uebayashi <uebay...@gmail.com> wrote: > On Wed, Nov 19, 2014 at 2:49 PM, Dennis Ferguson > <dennis.c.fergu...@gmail.com> wrote: >> That's very true, but the code is only correct if the writes occur >> in the order the C code has them written (i.e. tqe_next in the >> new element is written before writing the tqe_next of the existing >> list element to point at it). To ensure the compiler or the >> cache hardware doesn't reorder those writes it needs a write barrier >> inserted to enforce the order as written. > > Yes, in the pserialize(9) + TAILQ() case, everything except tqe_next > is only written by writers, and protected with a mutex, which ensures > (implies) ordering. What is ordered by what has to be clearly > documented.
We might be on the same page, but your response is confusing me enough that it might be useful to make this explicit. The guts of the current TAILQ_INSERT_TAIL() macro look like this: (elm)->field.tqe_next = TAILQ_END(head); (elm)->field.tqe_prev = (head)->tqh_last; *(head)->tqh_last = (elm); (head)->tqh_last = &(elm)->field.tqe_next; In standard C (assuming no volatile variable declarations) this replacement sequence is exactly equivalent to the above *(head)->tqh_last = (elm); (elm)->field.tqe_prev = (head)->tqh_last; (head)->tqh_last = &(elm)->field.tqe_next; (elm)->field.tqe_next = TAILQ_END(head); because the end result of executing the 4 lines in either order is identical and, in a single-threaded process, that is all that matters. Since the sequences are equivalent the compiler's optimizer and the machine's cache hardware are free to turn one sequence into the other if there is a performance advantage gained by doing so. If (and only if) there may be a concurrent reader of the structure it is pretty clear that the second sequence can't work; a concurrent reader may see (elm)->field.tqe_next uninitialized. What is also true, but not so clear, is that the while the first sequence looks right it is equally incorrect since it is equivalent to the second. The code is only correct if the required ordering is enforced by a barrier, e.g. (elm)->field.tqe_next = TAILQ_END(head); (elm)->field.tqe_prev = (head)->tqh_last; membar_producer(); *(head)->tqh_last = (elm); (head)->tqh_last = &(elm)->field.tqe_next; That tells the compiler to avoid reordering the stores done in code before the barrier to after the barrier, and vice versa, and to do what is needed (if anything) to tell the cache hardware to do the same (on an x86 only the compiler barrier is necessary). It makes the ordering you need explicit. The conclusion is that while the current TAILQ_REMOVE() macro will work unchanged to maintain a list with concurrent readers (given no requirement to run on an MP DEC Alpha) the TAILQ_INSERT_*() macros will not, even though they look correct. The mutex and/or the use of pserialize() does nothing to change this. If you want to insert things into a TAILQ list without blocking concurrent readers you must use alternative macros that have membar_producer() calls in the right spots. Are we on the same page? Dennis Ferguson