----- On Aug 7, 2020, at 2:47 PM, Peter Oskolkov <[email protected]> wrote:

> On Fri, Aug 7, 2020 at 11:25 AM Mathieu Desnoyers
> <[email protected]> wrote:
>>
>> ----- On Aug 7, 2020, at 1:55 PM, Peter Oskolkov <[email protected]> wrote:
>>
>> > On Thu, Aug 6, 2020 at 5:27 PM Boqun Feng <[email protected]> wrote:
>> [...]
>> >> What if the manager thread update ->percpu_list_ptr and call
>> >> membarrier() here? I.e.
>> >>
>> >>         CPU0                    CPU1
>> >>                                 list_ptr = atomic_load(&args->percpu_list_ptr); // read list_b
>> >>
>> >>         atomic_store(&args->percpu_list_ptr, list_a);
>> >>         sys_membarrier(MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU, 1);
>> >>         // send ipi to restart rseq.cs on CPU1
>> >>
>> >>                                 <got IPI, but not in a rseq.cs, so nothing to do>
>> >>                                 cpu = rseq_cpu_start(); // start a rseq.cs and access list_b!
>> >>
>> >> The thing is, the atomic_load() is a reference to ->percpu_list_ptr taken
>> >> outside the rseq.cs; simply restarting the rseq.cs doesn't invalidate this
>> >> reference.
>> >>
>> >> Am I missing something subtle?
>> >
>> > rseq_cmpeqv_cmpeqv_store is used below to make sure the reference is
>> > the one that should be used; if it is no longer "active", the
>> > iteration is restarted.
>>
>> I suspect it "works" because the manager thread does not free and
>> repurpose the memory where list_a is allocated, nor does it store to
>> its list head (which would corrupt the pointer dereferenced by CPU 1
>> in the scenario above). This shares similarities with type-safe memory
>> allocation (see SLAB_TYPESAFE_BY_RCU).
>>
>> Even though it is not documented as such (or otherwise) in the example code,
>> I feel this example reads as if it guarantees that the manager thread "owns"
>> list_a after the rseq-fence, when in fact list_a can still be read by the rseq
>> critical sections.
>>
>> AFAIU moving the atomic_load(&args->percpu_list_ptr) into the critical section
>> should entirely solve this and guarantee exclusive access to the old list
>> after the manager's rseq-fence. I wonder why this simpler approach is not
>> favored?
> 
> I think the test code mimics our actual production code, where the concerns
> you outlined are not particularly relevant. I'll see if the test can be
> simplified in v3 along the lines you suggested.

In order to implement that, you'll need to extend the rseq per-arch
macros. Here is one I did for x86 (but not yet for the other architectures) which
dereferences a pointer, adds an offset to the resulting address, and loads the
contents of that memory location, all within a rseq critical section. See
https://git.kernel.org/pub/scm/libs/librseq/librseq.git/tree/include/rseq/rseq-x86.h#n1292

int rseq_deref_loadoffp(intptr_t *p, off_t voffp, intptr_t *load, int cpu)

I did that following a discussion with Paul Turner about the requirements for
the rseq fence.
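
For illustration, here is a rough sketch (not the actual test code) of how such
a helper could be used to load the current CPU's list head entirely within the
critical section. The struct layout and the wrapper function are made up for the
example; only rseq_deref_loadoffp(), rseq_cpu_start() and rseq_unlikely() come
from librseq, and I assume the usual librseq return convention (0 on success,
non-zero when the critical section aborts or the cpu does not match):

#include <sched.h>      /* CPU_SETSIZE */
#include <stddef.h>     /* offsetof */
#include <rseq/rseq.h>

struct percpu_list_node;                /* opaque for this sketch */

struct percpu_list_entry {
        struct percpu_list_node *head;
        intptr_t count;
} __attribute__((aligned(128)));        /* avoid false sharing */

struct percpu_list {
        struct percpu_list_entry c[CPU_SETSIZE];
};

/* Load (*list_ptr)->c[cpu].head within a single rseq critical section. */
static struct percpu_list_node *percpu_list_head(struct percpu_list **list_ptr)
{
        intptr_t head;
        int cpu, ret;

        do {
                cpu = rseq_cpu_start();
                ret = rseq_deref_loadoffp((intptr_t *)list_ptr,
                                cpu * sizeof(struct percpu_list_entry)
                                + offsetof(struct percpu_list_entry, head),
                                &head, cpu);
        } while (rseq_unlikely(ret));   /* aborted or migrated: retry */
        return (struct percpu_list_node *)head;
}

Because the dereference of *list_ptr happens inside the critical section, once
the manager thread has published the new pointer and issued the rseq fence, it
knows no further reads of the old list can happen.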

For the use-case you have in this example, you will probably want to create a
new

int rseq_deref_offset_addv(intptr_t *p, off_t voffp, intptr_t count, int cpu)

which dereferences the list pointer, adds an offset to the resulting address
within the critical section, and then increments the value at that memory
location as the commit.

offsetof() is very useful to generate the voffp argument.
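
For instance, here is a sketch of how this yet-to-be-written helper could commit
a per-cpu counter increment, reusing the made-up struct layout from the sketch
above (again, rseq_deref_offset_addv() does not exist yet, and the wrapper is
purely illustrative):

/* Add 'count' to (*list_ptr)->c[cpu].count as the commit of one rseq cs. */
static int percpu_count_add(struct percpu_list **list_ptr, intptr_t count)
{
        int cpu, ret;

        do {
                cpu = rseq_cpu_start();
                ret = rseq_deref_offset_addv((intptr_t *)list_ptr,
                                cpu * sizeof(struct percpu_list_entry)
                                + offsetof(struct percpu_list_entry, count),
                                count, cpu);
        } while (rseq_unlikely(ret));   /* aborted or migrated: retry */
        return cpu;
}

Since both the dereference of *list_ptr and the increment happen within the
critical section, the manager's rseq fence then guarantees exclusive access to
the old list after swapping the pointer.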

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
