On 2016-10-01 14:11, Mark Rutland wrote:
Hi Brent,
Evidently my questions weren't sufficiently clear; even with your answers it's not clear to me what precise issue you're attempting to solve. I've tried to be more specific this time.
At a high level, can you clarify whether what you're attempting to solve is:
(a) a functional correctness issue (e.g. data corruption)
(b) a performance issue
And whether this was seen in practice, or found through code inspection?
On Sat, Oct 01, 2016 at 12:11:36PM -0400, bdegr...@codeaurora.org wrote:
On 2016-09-30 15:32, Mark Rutland wrote:
>On Fri, Sep 30, 2016 at 01:40:57PM -0400, Brent DeGraaf wrote:
>>+ * so LSE needs an explicit barrier here as well. Without this, the
>>+ * changed contents of the area protected by the spinlock could be
>>+ * observed prior to the lock.
>>+ */
>
>By whom? We generally expect that if data is protected by a lock, you take
>the lock before accessing it. If you expect concurrent lockless readers,
>then there's a requirement on the writer side to explicitly provide the
>ordering it requires -- spinlocks are not expected to provide that.
More details are in my response to Robin, but there is an API arm64 supports in spinlock.h which is used by lockref to determine whether a lock is free or not. For that code to work properly without adding these barriers, that API needs to take the lock.
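For reference, the check in question is arch_spin_value_unlocked(), and lockref's fast path uses it roughly as follows (heavily simplified from the arm64 spinlock.h ticket-lock code and lib/lockref.c of that era; treat this as a sketch rather than the exact sources):

/*
 * arm64 spinlock.h (simplified): a ticket lock value is "unlocked" when
 * the owner and next tickets are equal.  Note that this only inspects a
 * copy of the lock value -- it never takes the lock.
 */
static inline int arch_spin_value_unlocked(arch_spinlock_t lock)
{
	return lock.owner == lock.next;
}

/*
 * lib/lockref.c fast path (simplified): read the combined lock+count
 * word, check that the lock *value* looks free, and if so bump the
 * count with a cmpxchg on the whole word instead of taking the lock.
 */
void lockref_get(struct lockref *lockref)
{
	struct lockref old, new, prev;

	old.lock_count = READ_ONCE(lockref->lock_count);
	while (arch_spin_value_unlocked(old.lock.rlock.raw_lock)) {
		prev = old;
		new = old;
		new.count++;
		old.lock_count = cmpxchg64_relaxed(&lockref->lock_count,
						   old.lock_count,
						   new.lock_count);
		if (old.lock_count == prev.lock_count)
			return;		/* count updated, lock never taken */
		cpu_relax();
	}

	/* Slow path: the lock value wasn't free, so take the lock. */
	spin_lock(&lockref->lock);
	lockref->count++;
	spin_unlock(&lockref->lock);
}

The point being that the count update can succeed based purely on the observed lock value, without the lock ever being taken.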
Can you please provide a concrete example of a case where things go wrong, citing the code (or access pattern) in question? e.g. as in the commit messages for:
8e86f0b409a44193 ("arm64: atomics: fix use of acquire + release for full barrier semantics")
859b7a0e89120505 ("mm/slub: fix lockups on PREEMPT && !SMP kernels")
(for the latter, I messed up the register -> var mapping in one paragraph, but the style and reasoning are otherwise sound).
In the absence of a concrete example as above, it's very difficult to reason about what the underlying issue is, and what a valid fix would be for said issue.
>What pattern of accesses are made by readers and writers such that there is
>a problem?
Please note that here I was asking specifically w.r.t. the lockref code, e.g. which loads could see stale data, and what the code does based on that value such that there is a problem.
I added the barriers to the readers/writers because I do not know that they are not similarly abused. There is a lot of driver code out there, and ensuring the ordering is the safest way to be sure we don't get burned by something similar to the lockref access.
Making the architecture-provided primitives overly strong harms performance and efficiency (in general), makes the code harder to maintain and optimise in future, and only masks the issue (which could crop up on other architectures, for instance).
Thanks,
Mark.
Thinking about this, since the reader/writer code has no known "abuse" case, I'll remove it from the patchset, then provide a v2 patchset with a detailed explanation of the lockref problem, using the commits you provided as examples, as well as the performance considerations.
Brent