Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-21 Thread Byungchul Park
On Mon, Sep 19, 2016 at 10:50:09AM +0200, Peter Zijlstra wrote:
> Clearly I'm still missing stuff...

By the way.. do I have to explain more? Lack of explanation?

It would be the best to consider 'all valid acquires', which can occur
deadlock, but it looks impossible without parsing all code in head.

So it would be the safest to rely on 'acquires which actually happened',
even though it might be 'random acquires' among all valid acquires.

This conservative appoach is exactly same as how original lockdep is doing.
Let me explain more if you doubt it.



Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-21 Thread Byungchul Park
On Mon, Sep 19, 2016 at 10:50:09AM +0200, Peter Zijlstra wrote:
> Clearly I'm still missing stuff...

By the way.. do I have to explain more? Lack of explanation?

It would be the best to consider 'all valid acquires', which can occur
deadlock, but it looks impossible without parsing all code in head.

So it would be the safest to rely on 'acquires which actually happened',
even though it might be 'random acquires' among all valid acquires.

This conservative appoach is exactly same as how original lockdep is doing.
Let me explain more if you doubt it.



Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-20 Thread Byungchul Park
On Tue, Sep 20, 2016 at 02:50:38PM +0900, Byungchul Park wrote:
> On Mon, Sep 19, 2016 at 10:50:09AM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 19, 2016 at 11:41:02AM +0900, Byungchul Park wrote:
> > 
> > > > But since these threads are independently scheduled there is no point in
> > > > transferring the point in time thread A does W to thread B. There is no
> > > > relation there.
> > > > 
> > > > B could have already executed the complete or it could not yet have
> > > > started execution at all or anything in between, entirely random.
> > > 
> > > Of course B could have already executed the complete or it could not yet
> > > have started execution at all or anything in between. But it's not 
> > > entirely
> > > random.
> > > 
> > > It might be a random point since they are independently scheduled, but 
> > > it's
> > > not entirely random. And it's a random point among valid points which 
> > > lockdep
> > > needs to consider. For example,
> > > 
> > > 
> > > CONTEXT 1 CONTEXT 2(forked one)
> > > = =
> > > (a)   acquire F
> > > acquire A acquire G
> > > acquire B wait_for_completion Z
> > > acquire C
> > > (b)   acquire H
> > > fork 2acquire I
> > > acquire D acquire J
> > > complete Zacquire K
> > > 
> > 
> > I'm hoping you left out the releases for brevity? Because calling fork()
> > with locks held is _really_ poor form.
> 
> Exactly. Sorry. I shouldn't have omitted releases.
> 
> > 
> > > I can provide countless examples with which I can say you're wrong.
> > > In this case, all acquires between (a) and (b) must be ignored when
> > > generating dependencies with complete operation of Z.
> > 
> > I still don't get the point. Why does this matter?
> > 
> > Sure, A-C are irrelevant in this example, but I don't see how they're
> > differently irrelevant from a whole bunch of other prior state action.
> > 
> > 
> > Earlier you said the algorithm for selecting the dependency is the first
> > acquire observed in the completing thread after the
> > wait_for_completion(). Is this correct?
> 
> Sorry for insufficient description.
> 
> held_locks of left context will be,
> 
> time 1: a
> time 2: a, x[0]
> time 3: a, x[1]
> ...
> time n: b
> 
> Between time 1 and time (n-1), 'a' will be the first among held_locks. At
> time n, 'b' will be the fist among held_locks. So 'a' and 'b' should be
> connected to 'z' if we ignore IRQ context. (I will explain it soon.)
> 
> Acquire x[i] is also valid one but crossrelease doesn't take it into
> account since original lockdep will cover using 'a -> x[i]'. So only

... since lockdep will cover 'z -> x[i]' using 'z -> a' and 'a -> x[i]' ...

> connections we need are 'z -> a' and 'z -> b'.
> 
> > 
> > 
> > W z
> > 
> > A a
> > for (i<0;i >   A x[i]
> >   R x[i]
> > }
> > R a
> > 
> > 
> >   A b
> >   R b
> >   C z
> > 
> 
> My crossrelease implementation distinguishes each IRQ from normal context
> or other IRQs in different timeline, even though they might share
> held_locks. So in this example, precisely speaking, there are two different
> contexts. One is normal context and the other is IRQ context. So only 'A b'
> is related with 'W z' in this example.
> 
> > 
> > That would be 'a' in this case, but that isn't at all related. Its just
> > as irrelevant as your A-C. And we can pick @many as big as needed to
> > flush the prev held cyclic buffer (although I've no idea how that
> > matters either).
> 
> I designed crossrelease so that x[i] is not added into ring buffer because
> adding 'z -> a' is sufficient and x[i] doesn't need to be taken into
> account in this case.
> 
> > 
> > What we want here is to link z to b, no? That is the last, not the first
> 
> Exactly right. Only 'z -> b' must be added under considering IRQ context.
> That is the first among held_locks in the IRQ context.
> 
> > acquire, it also is independent of when W happened.
> 
> If the IRQ is really random, then it can happen before W z and it can also
> happen after W z. We cannot determine the time. Then we need to consider all
> combination and possibility. It's a key point. We have to consider
> dependencies for all possibility.

   possibilities.

> 
> However, we don't know what synchronizes the flow. So it must be based on
   ^^^
 generally

> what actually happened, to identify true dependencies.
> 
> > 
> > At the same time, picking the last is no guarantee either, since that
> > can equally miss dependencies. Suppose the IRQ handler did:
> > 
> > 
> >   A c
> >   R c
> >   A b
> >   R b
> >   C z
> > 
> > 
> 
> time 1: c (in held_locks)
> time 2: b (in held_locks)
> 
> So 'c' and 'b' can be the first among 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-20 Thread Byungchul Park
On Tue, Sep 20, 2016 at 02:50:38PM +0900, Byungchul Park wrote:
> On Mon, Sep 19, 2016 at 10:50:09AM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 19, 2016 at 11:41:02AM +0900, Byungchul Park wrote:
> > 
> > > > But since these threads are independently scheduled there is no point in
> > > > transferring the point in time thread A does W to thread B. There is no
> > > > relation there.
> > > > 
> > > > B could have already executed the complete or it could not yet have
> > > > started execution at all or anything in between, entirely random.
> > > 
> > > Of course B could have already executed the complete or it could not yet
> > > have started execution at all or anything in between. But it's not 
> > > entirely
> > > random.
> > > 
> > > It might be a random point since they are independently scheduled, but 
> > > it's
> > > not entirely random. And it's a random point among valid points which 
> > > lockdep
> > > needs to consider. For example,
> > > 
> > > 
> > > CONTEXT 1 CONTEXT 2(forked one)
> > > = =
> > > (a)   acquire F
> > > acquire A acquire G
> > > acquire B wait_for_completion Z
> > > acquire C
> > > (b)   acquire H
> > > fork 2acquire I
> > > acquire D acquire J
> > > complete Zacquire K
> > > 
> > 
> > I'm hoping you left out the releases for brevity? Because calling fork()
> > with locks held is _really_ poor form.
> 
> Exactly. Sorry. I shouldn't have omitted releases.
> 
> > 
> > > I can provide countless examples with which I can say you're wrong.
> > > In this case, all acquires between (a) and (b) must be ignored when
> > > generating dependencies with complete operation of Z.
> > 
> > I still don't get the point. Why does this matter?
> > 
> > Sure, A-C are irrelevant in this example, but I don't see how they're
> > differently irrelevant from a whole bunch of other prior state action.
> > 
> > 
> > Earlier you said the algorithm for selecting the dependency is the first
> > acquire observed in the completing thread after the
> > wait_for_completion(). Is this correct?
> 
> Sorry for insufficient description.
> 
> held_locks of left context will be,
> 
> time 1: a
> time 2: a, x[0]
> time 3: a, x[1]
> ...
> time n: b
> 
> Between time 1 and time (n-1), 'a' will be the first among held_locks. At
> time n, 'b' will be the fist among held_locks. So 'a' and 'b' should be
> connected to 'z' if we ignore IRQ context. (I will explain it soon.)
> 
> Acquire x[i] is also valid one but crossrelease doesn't take it into
> account since original lockdep will cover using 'a -> x[i]'. So only

... since lockdep will cover 'z -> x[i]' using 'z -> a' and 'a -> x[i]' ...

> connections we need are 'z -> a' and 'z -> b'.
> 
> > 
> > 
> > W z
> > 
> > A a
> > for (i<0;i >   A x[i]
> >   R x[i]
> > }
> > R a
> > 
> > 
> >   A b
> >   R b
> >   C z
> > 
> 
> My crossrelease implementation distinguishes each IRQ from normal context
> or other IRQs in different timeline, even though they might share
> held_locks. So in this example, precisely speaking, there are two different
> contexts. One is normal context and the other is IRQ context. So only 'A b'
> is related with 'W z' in this example.
> 
> > 
> > That would be 'a' in this case, but that isn't at all related. Its just
> > as irrelevant as your A-C. And we can pick @many as big as needed to
> > flush the prev held cyclic buffer (although I've no idea how that
> > matters either).
> 
> I designed crossrelease so that x[i] is not added into ring buffer because
> adding 'z -> a' is sufficient and x[i] doesn't need to be taken into
> account in this case.
> 
> > 
> > What we want here is to link z to b, no? That is the last, not the first
> 
> Exactly right. Only 'z -> b' must be added under considering IRQ context.
> That is the first among held_locks in the IRQ context.
> 
> > acquire, it also is independent of when W happened.
> 
> If the IRQ is really random, then it can happen before W z and it can also
> happen after W z. We cannot determine the time. Then we need to consider all
> combination and possibility. It's a key point. We have to consider
> dependencies for all possibility.

   possibilities.

> 
> However, we don't know what synchronizes the flow. So it must be based on
   ^^^
 generally

> what actually happened, to identify true dependencies.
> 
> > 
> > At the same time, picking the last is no guarantee either, since that
> > can equally miss dependencies. Suppose the IRQ handler did:
> > 
> > 
> >   A c
> >   R c
> >   A b
> >   R b
> >   C z
> > 
> > 
> 
> time 1: c (in held_locks)
> time 2: b (in held_locks)
> 
> So 'c' and 'b' can be the first among held_locks at each 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-20 Thread Byungchul Park
On Tue, Sep 20, 2016 at 02:50:38PM +0900, Byungchul Park wrote:
> On Mon, Sep 19, 2016 at 10:50:09AM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 19, 2016 at 11:41:02AM +0900, Byungchul Park wrote:
> > 
> > > > But since these threads are independently scheduled there is no point in
> > > > transferring the point in time thread A does W to thread B. There is no
> > > > relation there.
> > > > 
> > > > B could have already executed the complete or it could not yet have
> > > > started execution at all or anything in between, entirely random.
> > > 
> > > Of course B could have already executed the complete or it could not yet
> > > have started execution at all or anything in between. But it's not 
> > > entirely
> > > random.
> > > 
> > > It might be a random point since they are independently scheduled, but 
> > > it's
> > > not entirely random. And it's a random point among valid points which 
> > > lockdep
> > > needs to consider. For example,
> > > 
> > > 
> > > CONTEXT 1 CONTEXT 2(forked one)
> > > = =
> > > (a)   acquire F
> > > acquire A acquire G
> > > acquire B wait_for_completion Z
> > > acquire C
> > > (b)   acquire H
> > > fork 2acquire I
> > > acquire D acquire J
> > > complete Zacquire K
> > > 
> > 
> > I'm hoping you left out the releases for brevity? Because calling fork()
> > with locks held is _really_ poor form.
> 
> Exactly. Sorry. I shouldn't have omitted releases.
> 
> > 
> > > I can provide countless examples with which I can say you're wrong.
> > > In this case, all acquires between (a) and (b) must be ignored when
> > > generating dependencies with complete operation of Z.
> > 
> > I still don't get the point. Why does this matter?
> > 
> > Sure, A-C are irrelevant in this example, but I don't see how they're
> > differently irrelevant from a whole bunch of other prior state action.
> > 
> > 
> > Earlier you said the algorithm for selecting the dependency is the first
> > acquire observed in the completing thread after the
> > wait_for_completion(). Is this correct?
> 
> Sorry for insufficient description.
> 
> held_locks of left context will be,
> 
> time 1: a
> time 2: a, x[0]
> time 3: a, x[1]
> ...
> time n: b
> 
> Between time 1 and time (n-1), 'a' will be the first among held_locks. At
> time n, 'b' will be the fist among held_locks. So 'a' and 'b' should be
> connected to 'z' if we ignore IRQ context. (I will explain it soon.)
> 
> Acquire x[i] is also valid one but crossrelease doesn't take it into
> account since original lockdep will cover using 'a -> x[i]'. So only
> connections we need are 'z -> a' and 'z -> b'.
> 
> > 
> > 
> > W z
> > 
> > A a
> > for (i<0;i >   A x[i]
> >   R x[i]
> > }
> > R a
> > 
> > 
> >   A b
> >   R b
> >   C z
> > 
> 
> My crossrelease implementation distinguishes each IRQ from normal context
> or other IRQs in different timeline, even though they might share
> held_locks. So in this example, precisely speaking, there are two different
> contexts. One is normal context and the other is IRQ context. So only 'A b'
> is related with 'W z' in this example.
> 
> > 
> > That would be 'a' in this case, but that isn't at all related. Its just
> > as irrelevant as your A-C. And we can pick @many as big as needed to
> > flush the prev held cyclic buffer (although I've no idea how that
> > matters either).
> 
> I designed crossrelease so that x[i] is not added into ring buffer because
> adding 'z -> a' is sufficient and x[i] doesn't need to be taken into
> account in this case.
> 
> > 
> > What we want here is to link z to b, no? That is the last, not the first
> 
> Exactly right. Only 'z -> b' must be added under considering IRQ context.
> That is the first among held_locks in the IRQ context.
> 
> > acquire, it also is independent of when W happened.
> 
> If the IRQ is really random, then it can happen before W z and it can also
> happen after W z. We cannot determine the time. Then we need to consider all
> combination and possibility. It's a key point. We have to consider
> dependencies for all possibility.
> 
> However, we don't know what synchronizes the flow. So it must be based on
> what actually happened, to identify true dependencies.
> 
> > 
> > At the same time, picking the last is no guarantee either, since that
> > can equally miss dependencies. Suppose the IRQ handler did:
> > 
> > 
> >   A c
> >   R c
> >   A b
> >   R b
> >   C z
> > 
> > 
> 
> time 1: c (in held_locks)
> time 2: b (in held_locks)
> 
> So 'c' and 'b' can be the first among held_locks at each moment.
> So 'z -> b' and 'z -> c' will be added.
> 
> > instead. We'd miss the z depends on c relation, and since they're
> > independent lock sections, 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-20 Thread Byungchul Park
On Tue, Sep 20, 2016 at 02:50:38PM +0900, Byungchul Park wrote:
> On Mon, Sep 19, 2016 at 10:50:09AM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 19, 2016 at 11:41:02AM +0900, Byungchul Park wrote:
> > 
> > > > But since these threads are independently scheduled there is no point in
> > > > transferring the point in time thread A does W to thread B. There is no
> > > > relation there.
> > > > 
> > > > B could have already executed the complete or it could not yet have
> > > > started execution at all or anything in between, entirely random.
> > > 
> > > Of course B could have already executed the complete or it could not yet
> > > have started execution at all or anything in between. But it's not 
> > > entirely
> > > random.
> > > 
> > > It might be a random point since they are independently scheduled, but 
> > > it's
> > > not entirely random. And it's a random point among valid points which 
> > > lockdep
> > > needs to consider. For example,
> > > 
> > > 
> > > CONTEXT 1 CONTEXT 2(forked one)
> > > = =
> > > (a)   acquire F
> > > acquire A acquire G
> > > acquire B wait_for_completion Z
> > > acquire C
> > > (b)   acquire H
> > > fork 2acquire I
> > > acquire D acquire J
> > > complete Zacquire K
> > > 
> > 
> > I'm hoping you left out the releases for brevity? Because calling fork()
> > with locks held is _really_ poor form.
> 
> Exactly. Sorry. I shouldn't have omitted releases.
> 
> > 
> > > I can provide countless examples with which I can say you're wrong.
> > > In this case, all acquires between (a) and (b) must be ignored when
> > > generating dependencies with complete operation of Z.
> > 
> > I still don't get the point. Why does this matter?
> > 
> > Sure, A-C are irrelevant in this example, but I don't see how they're
> > differently irrelevant from a whole bunch of other prior state action.
> > 
> > 
> > Earlier you said the algorithm for selecting the dependency is the first
> > acquire observed in the completing thread after the
> > wait_for_completion(). Is this correct?
> 
> Sorry for insufficient description.
> 
> held_locks of left context will be,
> 
> time 1: a
> time 2: a, x[0]
> time 3: a, x[1]
> ...
> time n: b
> 
> Between time 1 and time (n-1), 'a' will be the first among held_locks. At
> time n, 'b' will be the fist among held_locks. So 'a' and 'b' should be
> connected to 'z' if we ignore IRQ context. (I will explain it soon.)
> 
> Acquire x[i] is also valid one but crossrelease doesn't take it into
> account since original lockdep will cover using 'a -> x[i]'. So only
> connections we need are 'z -> a' and 'z -> b'.
> 
> > 
> > 
> > W z
> > 
> > A a
> > for (i<0;i >   A x[i]
> >   R x[i]
> > }
> > R a
> > 
> > 
> >   A b
> >   R b
> >   C z
> > 
> 
> My crossrelease implementation distinguishes each IRQ from normal context
> or other IRQs in different timeline, even though they might share
> held_locks. So in this example, precisely speaking, there are two different
> contexts. One is normal context and the other is IRQ context. So only 'A b'
> is related with 'W z' in this example.
> 
> > 
> > That would be 'a' in this case, but that isn't at all related. Its just
> > as irrelevant as your A-C. And we can pick @many as big as needed to
> > flush the prev held cyclic buffer (although I've no idea how that
> > matters either).
> 
> I designed crossrelease so that x[i] is not added into ring buffer because
> adding 'z -> a' is sufficient and x[i] doesn't need to be taken into
> account in this case.
> 
> > 
> > What we want here is to link z to b, no? That is the last, not the first
> 
> Exactly right. Only 'z -> b' must be added under considering IRQ context.
> That is the first among held_locks in the IRQ context.
> 
> > acquire, it also is independent of when W happened.
> 
> If the IRQ is really random, then it can happen before W z and it can also
> happen after W z. We cannot determine the time. Then we need to consider all
> combination and possibility. It's a key point. We have to consider
> dependencies for all possibility.
> 
> However, we don't know what synchronizes the flow. So it must be based on
> what actually happened, to identify true dependencies.
> 
> > 
> > At the same time, picking the last is no guarantee either, since that
> > can equally miss dependencies. Suppose the IRQ handler did:
> > 
> > 
> >   A c
> >   R c
> >   A b
> >   R b
> >   C z
> > 
> > 
> 
> time 1: c (in held_locks)
> time 2: b (in held_locks)
> 
> So 'c' and 'b' can be the first among held_locks at each moment.
> So 'z -> b' and 'z -> c' will be added.
> 
> > instead. We'd miss the z depends on c relation, and since they're
> > independent lock sections, lockdep wouldn't 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-20 Thread Byungchul Park
On Mon, Sep 19, 2016 at 10:50:09AM +0200, Peter Zijlstra wrote:
> On Mon, Sep 19, 2016 at 11:41:02AM +0900, Byungchul Park wrote:
> 
> > > But since these threads are independently scheduled there is no point in
> > > transferring the point in time thread A does W to thread B. There is no
> > > relation there.
> > > 
> > > B could have already executed the complete or it could not yet have
> > > started execution at all or anything in between, entirely random.
> > 
> > Of course B could have already executed the complete or it could not yet
> > have started execution at all or anything in between. But it's not entirely
> > random.
> > 
> > It might be a random point since they are independently scheduled, but it's
> > not entirely random. And it's a random point among valid points which 
> > lockdep
> > needs to consider. For example,
> > 
> > 
> > CONTEXT 1   CONTEXT 2(forked one)
> > =   =
> > (a) acquire F
> > acquire A   acquire G
> > acquire B   wait_for_completion Z
> > acquire C
> > (b) acquire H
> > fork 2  acquire I
> > acquire D   acquire J
> > complete Z  acquire K
> > 
> 
> I'm hoping you left out the releases for brevity? Because calling fork()
> with locks held is _really_ poor form.

Exactly. Sorry. I shouldn't have omitted releases.

> 
> > I can provide countless examples with which I can say you're wrong.
> > In this case, all acquires between (a) and (b) must be ignored when
> > generating dependencies with complete operation of Z.
> 
> I still don't get the point. Why does this matter?
> 
> Sure, A-C are irrelevant in this example, but I don't see how they're
> differently irrelevant from a whole bunch of other prior state action.
> 
> 
> Earlier you said the algorithm for selecting the dependency is the first
> acquire observed in the completing thread after the
> wait_for_completion(). Is this correct?

Sorry for insufficient description.

held_locks of left context will be,

time 1: a
time 2: a, x[0]
time 3: a, x[1]
...
time n: b

Between time 1 and time (n-1), 'a' will be the first among held_locks. At
time n, 'b' will be the fist among held_locks. So 'a' and 'b' should be
connected to 'z' if we ignore IRQ context. (I will explain it soon.)

Acquire x[i] is also valid one but crossrelease doesn't take it into
account since original lockdep will cover using 'a -> x[i]'. So only
connections we need are 'z -> a' and 'z -> b'.

> 
> 
>   W z
> 
>   A a
>   for (i<0;i A x[i]
> R x[i]
>   }
>   R a
> 
>   
> A b
> R b
> C z
>   

My crossrelease implementation distinguishes each IRQ from normal context
or other IRQs in different timeline, even though they might share
held_locks. So in this example, precisely speaking, there are two different
contexts. One is normal context and the other is IRQ context. So only 'A b'
is related with 'W z' in this example.

> 
> That would be 'a' in this case, but that isn't at all related. Its just
> as irrelevant as your A-C. And we can pick @many as big as needed to
> flush the prev held cyclic buffer (although I've no idea how that
> matters either).

I designed crossrelease so that x[i] is not added into ring buffer because
adding 'z -> a' is sufficient and x[i] doesn't need to be taken into
account in this case.

> 
> What we want here is to link z to b, no? That is the last, not the first

Exactly right. Only 'z -> b' must be added under considering IRQ context.
That is the first among held_locks in the IRQ context.

> acquire, it also is independent of when W happened.

If the IRQ is really random, then it can happen before W z and it can also
happen after W z. We cannot determine the time. Then we need to consider all
combination and possibility. It's a key point. We have to consider
dependencies for all possibility.

However, we don't know what synchronizes the flow. So it must be based on
what actually happened, to identify true dependencies.

> 
> At the same time, picking the last is no guarantee either, since that
> can equally miss dependencies. Suppose the IRQ handler did:
> 
>   
> A c
> R c
> A b
> R b
> C z
>   
> 

time 1: c (in held_locks)
time 2: b (in held_locks)

So 'c' and 'b' can be the first among held_locks at each moment.
So 'z -> b' and 'z -> c' will be added.

> instead. We'd miss the z depends on c relation, and since they're
> independent lock sections, lockdep wouldn't make a b-c relation either.
> 
> 
> Clearly I'm still missing stuff...

Sorry for insufficient description. I tried to describ crossrelease in as
much detail as possible, really.

The reason why I consider only the first among valid locks in held_locks is
simple. For example,

Context 1
A a -> A b -> A 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-20 Thread Byungchul Park
On Mon, Sep 19, 2016 at 10:50:09AM +0200, Peter Zijlstra wrote:
> On Mon, Sep 19, 2016 at 11:41:02AM +0900, Byungchul Park wrote:
> 
> > > But since these threads are independently scheduled there is no point in
> > > transferring the point in time thread A does W to thread B. There is no
> > > relation there.
> > > 
> > > B could have already executed the complete or it could not yet have
> > > started execution at all or anything in between, entirely random.
> > 
> > Of course B could have already executed the complete or it could not yet
> > have started execution at all or anything in between. But it's not entirely
> > random.
> > 
> > It might be a random point since they are independently scheduled, but it's
> > not entirely random. And it's a random point among valid points which 
> > lockdep
> > needs to consider. For example,
> > 
> > 
> > CONTEXT 1   CONTEXT 2(forked one)
> > =   =
> > (a) acquire F
> > acquire A   acquire G
> > acquire B   wait_for_completion Z
> > acquire C
> > (b) acquire H
> > fork 2  acquire I
> > acquire D   acquire J
> > complete Z  acquire K
> > 
> 
> I'm hoping you left out the releases for brevity? Because calling fork()
> with locks held is _really_ poor form.

Exactly. Sorry. I shouldn't have omitted releases.

> 
> > I can provide countless examples with which I can say you're wrong.
> > In this case, all acquires between (a) and (b) must be ignored when
> > generating dependencies with complete operation of Z.
> 
> I still don't get the point. Why does this matter?
> 
> Sure, A-C are irrelevant in this example, but I don't see how they're
> differently irrelevant from a whole bunch of other prior state action.
> 
> 
> Earlier you said the algorithm for selecting the dependency is the first
> acquire observed in the completing thread after the
> wait_for_completion(). Is this correct?

Sorry for insufficient description.

held_locks of left context will be,

time 1: a
time 2: a, x[0]
time 3: a, x[1]
...
time n: b

Between time 1 and time (n-1), 'a' will be the first among held_locks. At
time n, 'b' will be the fist among held_locks. So 'a' and 'b' should be
connected to 'z' if we ignore IRQ context. (I will explain it soon.)

Acquire x[i] is also valid one but crossrelease doesn't take it into
account since original lockdep will cover using 'a -> x[i]'. So only
connections we need are 'z -> a' and 'z -> b'.

> 
> 
>   W z
> 
>   A a
>   for (i<0;i A x[i]
> R x[i]
>   }
>   R a
> 
>   
> A b
> R b
> C z
>   

My crossrelease implementation distinguishes each IRQ from normal context
or other IRQs in different timeline, even though they might share
held_locks. So in this example, precisely speaking, there are two different
contexts. One is normal context and the other is IRQ context. So only 'A b'
is related with 'W z' in this example.

> 
> That would be 'a' in this case, but that isn't at all related. Its just
> as irrelevant as your A-C. And we can pick @many as big as needed to
> flush the prev held cyclic buffer (although I've no idea how that
> matters either).

I designed crossrelease so that x[i] is not added into ring buffer because
adding 'z -> a' is sufficient and x[i] doesn't need to be taken into
account in this case.

> 
> What we want here is to link z to b, no? That is the last, not the first

Exactly right. Only 'z -> b' must be added under considering IRQ context.
That is the first among held_locks in the IRQ context.

> acquire, it also is independent of when W happened.

If the IRQ is really random, then it can happen before W z and it can also
happen after W z. We cannot determine the time. Then we need to consider all
combination and possibility. It's a key point. We have to consider
dependencies for all possibility.

However, we don't know what synchronizes the flow. So it must be based on
what actually happened, to identify true dependencies.

> 
> At the same time, picking the last is no guarantee either, since that
> can equally miss dependencies. Suppose the IRQ handler did:
> 
>   
> A c
> R c
> A b
> R b
> C z
>   
> 

time 1: c (in held_locks)
time 2: b (in held_locks)

So 'c' and 'b' can be the first among held_locks at each moment.
So 'z -> b' and 'z -> c' will be added.

> instead. We'd miss the z depends on c relation, and since they're
> independent lock sections, lockdep wouldn't make a b-c relation either.
> 
> 
> Clearly I'm still missing stuff...

Sorry for insufficient description. I tried to describ crossrelease in as
much detail as possible, really.

The reason why I consider only the first among valid locks in held_locks is
simple. For example,

Context 1
A a -> A b -> A crosslock -> R a -> R 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-19 Thread Peter Zijlstra
On Mon, Sep 19, 2016 at 11:41:02AM +0900, Byungchul Park wrote:

> > But since these threads are independently scheduled there is no point in
> > transferring the point in time thread A does W to thread B. There is no
> > relation there.
> > 
> > B could have already executed the complete or it could not yet have
> > started execution at all or anything in between, entirely random.
> 
> Of course B could have already executed the complete or it could not yet
> have started execution at all or anything in between. But it's not entirely
> random.
> 
> It might be a random point since they are independently scheduled, but it's
> not entirely random. And it's a random point among valid points which lockdep
> needs to consider. For example,
> 
> 
> CONTEXT 1 CONTEXT 2(forked one)
> = =
> (a)   acquire F
> acquire A acquire G
> acquire B wait_for_completion Z
> acquire C
> (b)   acquire H
> fork 2acquire I
> acquire D acquire J
> complete Zacquire K
> 

I'm hoping you left out the releases for brevity? Because calling fork()
with locks held is _really_ poor form.

> I can provide countless examples with which I can say you're wrong.
> In this case, all acquires between (a) and (b) must be ignored when
> generating dependencies with complete operation of Z.

I still don't get the point. Why does this matter?

Sure, A-C are irrelevant in this example, but I don't see how they're
differently irrelevant from a whole bunch of other prior state action.


Earlier you said the algorithm for selecting the dependency is the first
acquire observed in the completing thread after the
wait_for_completion(). Is this correct?


W z

A a
for (i<0;i

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-19 Thread Peter Zijlstra
On Mon, Sep 19, 2016 at 11:41:02AM +0900, Byungchul Park wrote:

> > But since these threads are independently scheduled there is no point in
> > transferring the point in time thread A does W to thread B. There is no
> > relation there.
> > 
> > B could have already executed the complete or it could not yet have
> > started execution at all or anything in between, entirely random.
> 
> Of course B could have already executed the complete or it could not yet
> have started execution at all or anything in between. But it's not entirely
> random.
> 
> It might be a random point since they are independently scheduled, but it's
> not entirely random. And it's a random point among valid points which lockdep
> needs to consider. For example,
> 
> 
> CONTEXT 1 CONTEXT 2(forked one)
> = =
> (a)   acquire F
> acquire A acquire G
> acquire B wait_for_completion Z
> acquire C
> (b)   acquire H
> fork 2acquire I
> acquire D acquire J
> complete Zacquire K
> 

I'm hoping you left out the releases for brevity? Because calling fork()
with locks held is _really_ poor form.

> I can provide countless examples with which I can say you're wrong.
> In this case, all acquires between (a) and (b) must be ignored when
> generating dependencies with complete operation of Z.

I still don't get the point. Why does this matter?

Sure, A-C are irrelevant in this example, but I don't see how they're
differently irrelevant from a whole bunch of other prior state action.


Earlier you said the algorithm for selecting the dependency is the first
acquire observed in the completing thread after the
wait_for_completion(). Is this correct?


W z

A a
for (i<0;i
  A b
  R b
  C z


That would be 'a' in this case, but that isn't at all related. Its just
as irrelevant as your A-C. And we can pick @many as big as needed to
flush the prev held cyclic buffer (although I've no idea how that
matters either).

What we want here is to link z to b, no? That is the last, not the first
acquire, it also is independent of when W happened.

At the same time, picking the last is no guarantee either, since that
can equally miss dependencies. Suppose the IRQ handler did:


  A c
  R c
  A b
  R b
  C z


instead. We'd miss the z depends on c relation, and since they're
independent lock sections, lockdep wouldn't make a b-c relation either.


Clearly I'm still missing stuff...


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-18 Thread Byungchul Park
On Wed, Sep 14, 2016 at 10:11:17AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 14, 2016 at 11:27:22AM +0900, Byungchul Park wrote:
> > > Well, there is, its just not trivially observable. We must be able to
> > > acquire a in order to complete b, therefore there is a dependency.
> > 
> > No. We cannot say there is a dependency unconditionally. There can
> > be a dependency or not.
> > 
> > L a L a
> > U a
> > ~ what if serialized by something?
> 
> Well, there's no serialization in the example, so no what if.

It was a korean traditional holliday for a week so I'm late.

I mean we cannot _ensure_ there's no serialization while lockdep works.
In _the_ case you suggested, you'are right if only those code exists.
But it's meaningless.

> > W b C b
> > 
> > If something we don't recognize serializes locks, which ensures
> > 'W b' happens after 'L a , U a' in the other context, then there's
> > no dependency here.
> 
> Its not there.
> 
> > We should say 'b depends on a' in only case that the sequence
> > 'W b and then L a and then C b, where last two ops are in same
> > context' _actually_ happened at least once. Otherwise, it might
> > add a false dependency.
> > 
> > It's same as how original lockdep works with typical locks. It adds
> > a dependency only when a lock is actually hit.
> 
> But since these threads are independently scheduled there is no point in
> transferring the point in time thread A does W to thread B. There is no
> relation there.
> 
> B could have already executed the complete or it could not yet have
> started execution at all or anything in between, entirely random.

Of course B could have already executed the complete or it could not yet
have started execution at all or anything in between. But it's not entirely
random.

It might be a random point since they are independently scheduled, but it's
not entirely random. And it's a random point among valid points which lockdep
needs to consider. For example,


CONTEXT 1   CONTEXT 2(forked one)
=   =
(a) acquire F
acquire A   acquire G
acquire B   wait_for_completion Z
acquire C
(b) acquire H
fork 2  acquire I
acquire D   acquire J
complete Z  acquire K


I can provide countless examples with which I can say you're wrong.
In this case, all acquires between (a) and (b) must be ignored when
generating dependencies with complete operation of Z. It's never random.

Ideally, it would be of course the best to consider all points (not random
points) after (b) which are valid points which lockdep needs to work with.
But I think it's impossible to parse and identify all synchronizations and
forks in kernel code, furthermore, new synchronization interface can be
introduced in future.

So IMHO it would be the second best to consider random points among valid
points, which anyway actually happened so it's guarrented that it has a
depenency with Z.

It's similar to how lockdep works for typical lock e.g. spin lock. Current
lockdep builds dependecy graph based on call paths which actually happened
in each context, which might be different from each run. Even current
lockdep doesn't parse all code and identify dependencies but works based on
actual call paths in runtime which can be random but will eventually cover
it almost (not perfect).

> > > What does that mean? Any why? This is a random point in time without
> > > actual meaning.
> > 
> > It's not random point. We have to consider meaningful sequences among
> > those which are globally observable. That's why we need to serialize
> > those locks.
> 
> Serialize how? there is no serialization.

I mean I did it in my crossrelease implementation.

> 
> > For example,
> > 
> > W b
> > L a
> > U a
> > C b
> > 
> > Once this sequence is observable globally, we can say 'It's possible to
> > run in this sequence. Is this sequence problematic or not?'.
> > 
> > L a
> > U a
> > W b
> > C b
> > 
> > If only this sequence can be observable, we should not assume
> > this sequence can be changed. However once the former sequence
> > happens, it has a possibility to hit the same sequence again later.
> > So we can check deadlock possibility with the sequence,
> > 
> > _not randomly_.
> 
> I still don't get it.
> 
> > We need to connect between the crosslock and the first lock among
> > locks having been acquired since the crosslock was held.
> 
> Which can be _any_ lock in the history of that thread. It could be
> rq->lock from getting the thread scheduled.

I think I already answered it. Right?

> 
> > Others will be
> > connected each other by original lockdep.
> > 
> > By the way, does my document miss this description? If so, sorry.
> > I will check and update it.
> 
> I couldn't find anything useful, but then I could not understand most of
> what was written, and 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-18 Thread Byungchul Park
On Wed, Sep 14, 2016 at 10:11:17AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 14, 2016 at 11:27:22AM +0900, Byungchul Park wrote:
> > > Well, there is, its just not trivially observable. We must be able to
> > > acquire a in order to complete b, therefore there is a dependency.
> > 
> > No. We cannot say there is a dependency unconditionally. There can
> > be a dependency or not.
> > 
> > L a L a
> > U a
> > ~ what if serialized by something?
> 
> Well, there's no serialization in the example, so no what if.

It was a korean traditional holliday for a week so I'm late.

I mean we cannot _ensure_ there's no serialization while lockdep works.
In _the_ case you suggested, you'are right if only those code exists.
But it's meaningless.

> > W b C b
> > 
> > If something we don't recognize serializes locks, which ensures
> > 'W b' happens after 'L a , U a' in the other context, then there's
> > no dependency here.
> 
> Its not there.
> 
> > We should say 'b depends on a' in only case that the sequence
> > 'W b and then L a and then C b, where last two ops are in same
> > context' _actually_ happened at least once. Otherwise, it might
> > add a false dependency.
> > 
> > It's same as how original lockdep works with typical locks. It adds
> > a dependency only when a lock is actually hit.
> 
> But since these threads are independently scheduled there is no point in
> transferring the point in time thread A does W to thread B. There is no
> relation there.
> 
> B could have already executed the complete or it could not yet have
> started execution at all or anything in between, entirely random.

Of course B could have already executed the complete or it could not yet
have started execution at all or anything in between. But it's not entirely
random.

It might be a random point since they are independently scheduled, but it's
not entirely random. And it's a random point among valid points which lockdep
needs to consider. For example,


CONTEXT 1   CONTEXT 2(forked one)
=   =
(a) acquire F
acquire A   acquire G
acquire B   wait_for_completion Z
acquire C
(b) acquire H
fork 2  acquire I
acquire D   acquire J
complete Z  acquire K


I can provide countless examples with which I can say you're wrong.
In this case, all acquires between (a) and (b) must be ignored when
generating dependencies with complete operation of Z. It's never random.

Ideally, it would be of course the best to consider all points (not random
points) after (b) which are valid points which lockdep needs to work with.
But I think it's impossible to parse and identify all synchronizations and
forks in kernel code, furthermore, new synchronization interface can be
introduced in future.

So IMHO it would be the second best to consider random points among valid
points, which anyway actually happened so it's guarrented that it has a
depenency with Z.

It's similar to how lockdep works for typical lock e.g. spin lock. Current
lockdep builds dependecy graph based on call paths which actually happened
in each context, which might be different from each run. Even current
lockdep doesn't parse all code and identify dependencies but works based on
actual call paths in runtime which can be random but will eventually cover
it almost (not perfect).

> > > What does that mean? Any why? This is a random point in time without
> > > actual meaning.
> > 
> > It's not random point. We have to consider meaningful sequences among
> > those which are globally observable. That's why we need to serialize
> > those locks.
> 
> Serialize how? there is no serialization.

I mean I did it in my crossrelease implementation.

> 
> > For example,
> > 
> > W b
> > L a
> > U a
> > C b
> > 
> > Once this sequence is observable globally, we can say 'It's possible to
> > run in this sequence. Is this sequence problematic or not?'.
> > 
> > L a
> > U a
> > W b
> > C b
> > 
> > If only this sequence can be observable, we should not assume
> > this sequence can be changed. However once the former sequence
> > happens, it has a possibility to hit the same sequence again later.
> > So we can check deadlock possibility with the sequence,
> > 
> > _not randomly_.
> 
> I still don't get it.
> 
> > We need to connect between the crosslock and the first lock among
> > locks having been acquired since the crosslock was held.
> 
> Which can be _any_ lock in the history of that thread. It could be
> rq->lock from getting the thread scheduled.

I think I already answered it. Right?

> 
> > Others will be
> > connected each other by original lockdep.
> > 
> > By the way, does my document miss this description? If so, sorry.
> > I will check and update it.
> 
> I couldn't find anything useful, but then I could not understand most of
> what was written, and 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-14 Thread Peter Zijlstra
On Wed, Sep 14, 2016 at 11:27:22AM +0900, Byungchul Park wrote:
> > Well, there is, its just not trivially observable. We must be able to
> > acquire a in order to complete b, therefore there is a dependency.
> 
> No. We cannot say there is a dependency unconditionally. There can
> be a dependency or not.
> 
> L a L a
> U a
> ~ what if serialized by something?

Well, there's no serialization in the example, so no what if.

> W b C b
> 
> If something we don't recognize serializes locks, which ensures
> 'W b' happens after 'L a , U a' in the other context, then there's
> no dependency here.

Its not there.

> We should say 'b depends on a' in only case that the sequence
> 'W b and then L a and then C b, where last two ops are in same
> context' _actually_ happened at least once. Otherwise, it might
> add a false dependency.
> 
> It's same as how original lockdep works with typical locks. It adds
> a dependency only when a lock is actually hit.

But since these threads are independently scheduled there is no point in
transferring the point in time thread A does W to thread B. There is no
relation there.

B could have already executed the complete or it could not yet have
started execution at all or anything in between, entirely random.

> > What does that mean? Any why? This is a random point in time without
> > actual meaning.
> 
> It's not random point. We have to consider meaningful sequences among
> those which are globally observable. That's why we need to serialize
> those locks.

Serialize how? there is no serialization.

> For example,
> 
> W b
> L a
> U a
> C b
> 
> Once this sequence is observable globally, we can say 'It's possible to
> run in this sequence. Is this sequence problematic or not?'.
> 
> L a
> U a
> W b
> C b
> 
> If only this sequence can be observable, we should not assume
> this sequence can be changed. However once the former sequence
> happens, it has a possibility to hit the same sequence again later.
> So we can check deadlock possibility with the sequence,
> 
> _not randomly_.

I still don't get it.

> We need to connect between the crosslock and the first lock among
> locks having been acquired since the crosslock was held.

Which can be _any_ lock in the history of that thread. It could be
rq->lock from getting the thread scheduled.

> Others will be
> connected each other by original lockdep.
> 
> By the way, does my document miss this description? If so, sorry.
> I will check and update it.

I couldn't find anything useful, but then I could not understand most of
what was written, and I tried hard :-(


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-14 Thread Peter Zijlstra
On Wed, Sep 14, 2016 at 11:27:22AM +0900, Byungchul Park wrote:
> > Well, there is, its just not trivially observable. We must be able to
> > acquire a in order to complete b, therefore there is a dependency.
> 
> No. We cannot say there is a dependency unconditionally. There can
> be a dependency or not.
> 
> L a L a
> U a
> ~ what if serialized by something?

Well, there's no serialization in the example, so no what if.

> W b C b
> 
> If something we don't recognize serializes locks, which ensures
> 'W b' happens after 'L a , U a' in the other context, then there's
> no dependency here.

Its not there.

> We should say 'b depends on a' in only case that the sequence
> 'W b and then L a and then C b, where last two ops are in same
> context' _actually_ happened at least once. Otherwise, it might
> add a false dependency.
> 
> It's same as how original lockdep works with typical locks. It adds
> a dependency only when a lock is actually hit.

But since these threads are independently scheduled there is no point in
transferring the point in time thread A does W to thread B. There is no
relation there.

B could have already executed the complete or it could not yet have
started execution at all or anything in between, entirely random.

> > What does that mean? Any why? This is a random point in time without
> > actual meaning.
> 
> It's not random point. We have to consider meaningful sequences among
> those which are globally observable. That's why we need to serialize
> those locks.

Serialize how? there is no serialization.

> For example,
> 
> W b
> L a
> U a
> C b
> 
> Once this sequence is observable globally, we can say 'It's possible to
> run in this sequence. Is this sequence problematic or not?'.
> 
> L a
> U a
> W b
> C b
> 
> If only this sequence can be observable, we should not assume
> this sequence can be changed. However once the former sequence
> happens, it has a possibility to hit the same sequence again later.
> So we can check deadlock possibility with the sequence,
> 
> _not randomly_.

I still don't get it.

> We need to connect between the crosslock and the first lock among
> locks having been acquired since the crosslock was held.

Which can be _any_ lock in the history of that thread. It could be
rq->lock from getting the thread scheduled.

> Others will be
> connected each other by original lockdep.
> 
> By the way, does my document miss this description? If so, sorry.
> I will check and update it.

I couldn't find anything useful, but then I could not understand most of
what was written, and I tried hard :-(


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Wed, Sep 14, 2016 at 11:27 AM, Byungchul Park
 wrote:
> On Wed, Sep 14, 2016 at 4:38 AM, Peter Zijlstra  wrote:
>> On Wed, Sep 14, 2016 at 02:12:51AM +0900, Byungchul Park wrote:
>>> On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra  
>>> wrote:
>>> >
>>> >
>>> > So the idea is to add support for non-owner serialization primitives,
>>> > like completions/semaphores, and so far I've not looked at the code yet.
>>> > I did spend 2+ hours trying to decipher your documentation thing, but am
>>> > still confused, that thing is exceedingly hard to parse/read.
>>> >
>>> > So the typical scenario would be something like:
>>> >
>>> > L a L a
>>> > U a
>>> > W b C b
>>> >
>>> > where L := lock, U := unlock, W := wait_for_completion and C :=
>>> > complete.
>>> >
>>> > On the left, the blocking thread we can easily observe the 'b depends on
>>> > a' relation, since we block while holding a.
>>>
>>> I think 'a depends on b' relation.
>>>
>>> Why does b depend on a? b depends on a's what?
>>
>> b blocks while holding a. In any case, for the graph it doesn't matter,
>> its not a directed graph as such, all we care about is acyclic.
>
> It's a directed graph.The meaning of backward traverse is different
> from the meaning of forward traverse, isn't it?
>
>>> > On the right however that same relation is hard to see, since by the
>>>
>>> Yes, there's no dependency on the right side of your example.
>>
>> Well, there is, its just not trivially observable. We must be able to
>> acquire a in order to complete b, therefore there is a dependency.
>
> No. We cannot say there is a dependency unconditionally. There can
> be a dependency or not.
>
> L a L a
> U a
> ~ what if serialized by something?
> W b C b
>
> If something we don't recognize serializes locks, which ensures
> 'W b' happens after 'L a , U a' in the other context, then there's
> no dependency here.
>
> We should say 'b depends on a' in only case that the sequence
> 'W b and then L a and then C b, where last two ops are in same
> context' _actually_ happened at least once. Otherwise, it might
> add a false dependency.
>
> It's same as how original lockdep works with typical locks. It adds
> a dependency only when a lock is actually hit.
>
>>> > time we would run complete, a has already been released.
>>>
>>> I will change your example a little bit.
>>>
>>> W b
>>> ~~~ <- serialized
>>> L a
>>> U a
>>> C b
>>>
>>> (Remind crossrelease considers dependencies in only case that
>>> lock sequence serialized is observable.)
>>
>> What does that mean? Any why? This is a random point in time without
>> actual meaning.
>
> It's not random point. We have to consider meaningful sequences among
> those which are globally observable. That's why we need to serialize
> those locks. For example,
>
> W b
> L a
> U a
> C b
>
> Once this sequence is observable globally, we can say 'It's possible to
> run in this sequence. Is this sequence problematic or not?'.
>
> L a
> U a
> W b
> C b
>
> If only this sequence can be observable, we should not assume
> this sequence can be changed. However once the former sequence
> happens, it has a possibility to hit the same sequence again later.
> So we can check deadlock possibility with the sequence,
>
> _not randomly_.
>
>>> There's a dependency in this case, like 'b depends on a' relation.
>>> 'C b' may not be hit, depending on the result of 'L a', that is,
>>> acquired or not.
>>
>> With or without a random reference point, the dependency is there.
>
> The dependency might not be there.
>
>>> > I _think_ you propose to keep track of all prior held locks and then use
>>> > the union of the held list on the block-chain with the prior held list
>>> > from the complete context.
>>>
>>> Almost right. Only thing we need to do to consider the union is to
>>> connect two chains of two contexts by adding one dependency 'b -> a'.
>>
>> Sure, but how do you arrive at which connection to make. The document is
>> entirely silent on this crucial point.
>
> We need to connect between the crosslock and the first lock among
> locks having been acquired since the crosslock was held. Others will be
> connected each other by original lockdep.
>
> By the way, does my document miss this description? If so, sorry.
> I will check and update it.
>
>> The union between the held-locks of the blocked and prev-held-locks of
>> the release should give a fair indication I think, but then, I've not
>> thought too hard on this yet.
>
> I make the union only If actually the lock sequence happened once so
> that it's globally visible.
>
>>> > The document describes you have a prior held list, and that its a
>>> > circular thing, but it then completely fails to specify what gets added
>>> > and how its used.
>>>
>>> Why does it fail? It keeps them in the form of hlock. This data is enough
>>> to generate dependencies.
>>
>> It fails to 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Wed, Sep 14, 2016 at 11:27 AM, Byungchul Park
 wrote:
> On Wed, Sep 14, 2016 at 4:38 AM, Peter Zijlstra  wrote:
>> On Wed, Sep 14, 2016 at 02:12:51AM +0900, Byungchul Park wrote:
>>> On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra  
>>> wrote:
>>> >
>>> >
>>> > So the idea is to add support for non-owner serialization primitives,
>>> > like completions/semaphores, and so far I've not looked at the code yet.
>>> > I did spend 2+ hours trying to decipher your documentation thing, but am
>>> > still confused, that thing is exceedingly hard to parse/read.
>>> >
>>> > So the typical scenario would be something like:
>>> >
>>> > L a L a
>>> > U a
>>> > W b C b
>>> >
>>> > where L := lock, U := unlock, W := wait_for_completion and C :=
>>> > complete.
>>> >
>>> > On the left, the blocking thread we can easily observe the 'b depends on
>>> > a' relation, since we block while holding a.
>>>
>>> I think 'a depends on b' relation.
>>>
>>> Why does b depend on a? b depends on a's what?
>>
>> b blocks while holding a. In any case, for the graph it doesn't matter,
>> its not a directed graph as such, all we care about is acyclic.
>
> It's a directed graph.The meaning of backward traverse is different
> from the meaning of forward traverse, isn't it?
>
>>> > On the right however that same relation is hard to see, since by the
>>>
>>> Yes, there's no dependency on the right side of your example.
>>
>> Well, there is, its just not trivially observable. We must be able to
>> acquire a in order to complete b, therefore there is a dependency.
>
> No. We cannot say there is a dependency unconditionally. There can
> be a dependency or not.
>
> L a L a
> U a
> ~ what if serialized by something?
> W b C b
>
> If something we don't recognize serializes locks, which ensures
> 'W b' happens after 'L a , U a' in the other context, then there's
> no dependency here.
>
> We should say 'b depends on a' in only case that the sequence
> 'W b and then L a and then C b, where last two ops are in same
> context' _actually_ happened at least once. Otherwise, it might
> add a false dependency.
>
> It's same as how original lockdep works with typical locks. It adds
> a dependency only when a lock is actually hit.
>
>>> > time we would run complete, a has already been released.
>>>
>>> I will change your example a little bit.
>>>
>>> W b
>>> ~~~ <- serialized
>>> L a
>>> U a
>>> C b
>>>
>>> (Remind crossrelease considers dependencies in only case that
>>> lock sequence serialized is observable.)
>>
>> What does that mean? Any why? This is a random point in time without
>> actual meaning.
>
> It's not random point. We have to consider meaningful sequences among
> those which are globally observable. That's why we need to serialize
> those locks. For example,
>
> W b
> L a
> U a
> C b
>
> Once this sequence is observable globally, we can say 'It's possible to
> run in this sequence. Is this sequence problematic or not?'.
>
> L a
> U a
> W b
> C b
>
> If only this sequence can be observable, we should not assume
> this sequence can be changed. However once the former sequence
> happens, it has a possibility to hit the same sequence again later.
> So we can check deadlock possibility with the sequence,
>
> _not randomly_.
>
>>> There's a dependency in this case, like 'b depends on a' relation.
>>> 'C b' may not be hit, depending on the result of 'L a', that is,
>>> acquired or not.
>>
>> With or without a random reference point, the dependency is there.
>
> The dependency might not be there.
>
>>> > I _think_ you propose to keep track of all prior held locks and then use
>>> > the union of the held list on the block-chain with the prior held list
>>> > from the complete context.
>>>
>>> Almost right. Only thing we need to do to consider the union is to
>>> connect two chains of two contexts by adding one dependency 'b -> a'.
>>
>> Sure, but how do you arrive at which connection to make. The document is
>> entirely silent on this crucial point.
>
> We need to connect between the crosslock and the first lock among
> locks having been acquired since the crosslock was held. Others will be
> connected each other by original lockdep.
>
> By the way, does my document miss this description? If so, sorry.
> I will check and update it.
>
>> The union between the held-locks of the blocked and prev-held-locks of
>> the release should give a fair indication I think, but then, I've not
>> thought too hard on this yet.
>
> I make the union only If actually the lock sequence happened once so
> that it's globally visible.
>
>>> > The document describes you have a prior held list, and that its a
>>> > circular thing, but it then completely fails to specify what gets added
>>> > and how its used.
>>>
>>> Why does it fail? It keeps them in the form of hlock. This data is enough
>>> to generate dependencies.
>>
>> It fails to explain. It just barely mentions you keep them, it doesn't
>> mention how 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Wed, Sep 14, 2016 at 4:38 AM, Peter Zijlstra  wrote:
> On Wed, Sep 14, 2016 at 02:12:51AM +0900, Byungchul Park wrote:
>> On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra  
>> wrote:
>> >
>> >
>> > So the idea is to add support for non-owner serialization primitives,
>> > like completions/semaphores, and so far I've not looked at the code yet.
>> > I did spend 2+ hours trying to decipher your documentation thing, but am
>> > still confused, that thing is exceedingly hard to parse/read.
>> >
>> > So the typical scenario would be something like:
>> >
>> > L a L a
>> > U a
>> > W b C b
>> >
>> > where L := lock, U := unlock, W := wait_for_completion and C :=
>> > complete.
>> >
>> > On the left, the blocking thread we can easily observe the 'b depends on
>> > a' relation, since we block while holding a.
>>
>> I think 'a depends on b' relation.
>>
>> Why does b depend on a? b depends on a's what?
>
> b blocks while holding a. In any case, for the graph it doesn't matter,
> its not a directed graph as such, all we care about is acyclic.

It's a directed graph.The meaning of backward traverse is different
from the meaning of forward traverse, isn't it?

>> > On the right however that same relation is hard to see, since by the
>>
>> Yes, there's no dependency on the right side of your example.
>
> Well, there is, its just not trivially observable. We must be able to
> acquire a in order to complete b, therefore there is a dependency.

No. We cannot say there is a dependency unconditionally. There can
be a dependency or not.

L a L a
U a
~ what if serialized by something?
W b C b

If something we don't recognize serializes locks, which ensures
'W b' happens after 'L a , U a' in the other context, then there's
no dependency here.

We should say 'b depends on a' in only case that the sequence
'W b and then L a and then C b, where last two ops are in same
context' _actually_ happened at least once. Otherwise, it might
add a false dependency.

It's same as how original lockdep works with typical locks. It adds
a dependency only when a lock is actually hit.

>> > time we would run complete, a has already been released.
>>
>> I will change your example a little bit.
>>
>> W b
>> ~~~ <- serialized
>> L a
>> U a
>> C b
>>
>> (Remind crossrelease considers dependencies in only case that
>> lock sequence serialized is observable.)
>
> What does that mean? Any why? This is a random point in time without
> actual meaning.

It's not random point. We have to consider meaningful sequences among
those which are globally observable. That's why we need to serialize
those locks. For example,

W b
L a
U a
C b

Once this sequence is observable globally, we can say 'It's possible to
run in this sequence. Is this sequence problematic or not?'.

L a
U a
W b
C b

If only this sequence can be observable, we should not assume
this sequence can be changed. However once the former sequence
happens, it has a possibility to hit the same sequence again later.
So we can check deadlock possibility with the sequence,

_not randomly_.

>> There's a dependency in this case, like 'b depends on a' relation.
>> 'C b' may not be hit, depending on the result of 'L a', that is,
>> acquired or not.
>
> With or without a random reference point, the dependency is there.

The dependency might not be there.

>> > I _think_ you propose to keep track of all prior held locks and then use
>> > the union of the held list on the block-chain with the prior held list
>> > from the complete context.
>>
>> Almost right. Only thing we need to do to consider the union is to
>> connect two chains of two contexts by adding one dependency 'b -> a'.
>
> Sure, but how do you arrive at which connection to make. The document is
> entirely silent on this crucial point.

We need to connect between the crosslock and the first lock among
locks having been acquired since the crosslock was held. Others will be
connected each other by original lockdep.

By the way, does my document miss this description? If so, sorry.
I will check and update it.

> The union between the held-locks of the blocked and prev-held-locks of
> the release should give a fair indication I think, but then, I've not
> thought too hard on this yet.

I make the union only If actually the lock sequence happened once so
that it's globally visible.

>> > The document describes you have a prior held list, and that its a
>> > circular thing, but it then completely fails to specify what gets added
>> > and how its used.
>>
>> Why does it fail? It keeps them in the form of hlock. This data is enough
>> to generate dependencies.
>
> It fails to explain. It just barely mentions you keep them, it doesn't
> mention how they're used or why.

Okay. I need to explain more about that.

>> > Also, by it being a circular list (of indeterminate, but finite size),
>> > there is the possibility of missing dependencies if 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Wed, Sep 14, 2016 at 4:38 AM, Peter Zijlstra  wrote:
> On Wed, Sep 14, 2016 at 02:12:51AM +0900, Byungchul Park wrote:
>> On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra  
>> wrote:
>> >
>> >
>> > So the idea is to add support for non-owner serialization primitives,
>> > like completions/semaphores, and so far I've not looked at the code yet.
>> > I did spend 2+ hours trying to decipher your documentation thing, but am
>> > still confused, that thing is exceedingly hard to parse/read.
>> >
>> > So the typical scenario would be something like:
>> >
>> > L a L a
>> > U a
>> > W b C b
>> >
>> > where L := lock, U := unlock, W := wait_for_completion and C :=
>> > complete.
>> >
>> > On the left, the blocking thread we can easily observe the 'b depends on
>> > a' relation, since we block while holding a.
>>
>> I think 'a depends on b' relation.
>>
>> Why does b depend on a? b depends on a's what?
>
> b blocks while holding a. In any case, for the graph it doesn't matter,
> its not a directed graph as such, all we care about is acyclic.

It's a directed graph.The meaning of backward traverse is different
from the meaning of forward traverse, isn't it?

>> > On the right however that same relation is hard to see, since by the
>>
>> Yes, there's no dependency on the right side of your example.
>
> Well, there is, its just not trivially observable. We must be able to
> acquire a in order to complete b, therefore there is a dependency.

No. We cannot say there is a dependency unconditionally. There can
be a dependency or not.

L a L a
U a
~ what if serialized by something?
W b C b

If something we don't recognize serializes locks, which ensures
'W b' happens after 'L a , U a' in the other context, then there's
no dependency here.

We should say 'b depends on a' in only case that the sequence
'W b and then L a and then C b, where last two ops are in same
context' _actually_ happened at least once. Otherwise, it might
add a false dependency.

It's same as how original lockdep works with typical locks. It adds
a dependency only when a lock is actually hit.

>> > time we would run complete, a has already been released.
>>
>> I will change your example a little bit.
>>
>> W b
>> ~~~ <- serialized
>> L a
>> U a
>> C b
>>
>> (Remind crossrelease considers dependencies in only case that
>> lock sequence serialized is observable.)
>
> What does that mean? Any why? This is a random point in time without
> actual meaning.

It's not random point. We have to consider meaningful sequences among
those which are globally observable. That's why we need to serialize
those locks. For example,

W b
L a
U a
C b

Once this sequence is observable globally, we can say 'It's possible to
run in this sequence. Is this sequence problematic or not?'.

L a
U a
W b
C b

If only this sequence can be observable, we should not assume
this sequence can be changed. However once the former sequence
happens, it has a possibility to hit the same sequence again later.
So we can check deadlock possibility with the sequence,

_not randomly_.

>> There's a dependency in this case, like 'b depends on a' relation.
>> 'C b' may not be hit, depending on the result of 'L a', that is,
>> acquired or not.
>
> With or without a random reference point, the dependency is there.

The dependency might not be there.

>> > I _think_ you propose to keep track of all prior held locks and then use
>> > the union of the held list on the block-chain with the prior held list
>> > from the complete context.
>>
>> Almost right. Only thing we need to do to consider the union is to
>> connect two chains of two contexts by adding one dependency 'b -> a'.
>
> Sure, but how do you arrive at which connection to make. The document is
> entirely silent on this crucial point.

We need to connect between the crosslock and the first lock among
locks having been acquired since the crosslock was held. Others will be
connected each other by original lockdep.

By the way, does my document miss this description? If so, sorry.
I will check and update it.

> The union between the held-locks of the blocked and prev-held-locks of
> the release should give a fair indication I think, but then, I've not
> thought too hard on this yet.

I make the union only If actually the lock sequence happened once so
that it's globally visible.

>> > The document describes you have a prior held list, and that its a
>> > circular thing, but it then completely fails to specify what gets added
>> > and how its used.
>>
>> Why does it fail? It keeps them in the form of hlock. This data is enough
>> to generate dependencies.
>
> It fails to explain. It just barely mentions you keep them, it doesn't
> mention how they're used or why.

Okay. I need to explain more about that.

>> > Also, by it being a circular list (of indeterminate, but finite size),
>> > there is the possibility of missing dependencies if they got spooled out
>> > by recent 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Wed, Sep 14, 2016 at 6:42 AM, Peter Zijlstra  wrote:
> On Tue, Sep 13, 2016 at 09:38:29PM +0200, Peter Zijlstra wrote:
>
>> > > I _think_ you propose to keep track of all prior held locks and then use
>> > > the union of the held list on the block-chain with the prior held list
>> > > from the complete context.
>> >
>> > Almost right. Only thing we need to do to consider the union is to
>> > connect two chains of two contexts by adding one dependency 'b -> a'.
>>
>> Sure, but how do you arrive at which connection to make. The document is
>> entirely silent on this crucial point.
>>
>> The union between the held-locks of the blocked and prev-held-locks of
>> the release should give a fair indication I think, but then, I've not
>> thought too hard on this yet.
>
> s/union/intersection/
>
> those that are in both sets.

Precisely speaking, I introduces separate chains.

For example,

1. Held-locks of the blocked,
A -> B -> C (which original lockdep builds)

2. Prev-held-locks of the release
G -> H -> I (which original lockdep builds, too)

3. Cross chain (which I introduced newly)
C -> G

Then the 'A -> B -> C -> G -> H -> I' can be traversed
when bfs is performed.

-- 
Thanks,
Byungchul


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Wed, Sep 14, 2016 at 6:42 AM, Peter Zijlstra  wrote:
> On Tue, Sep 13, 2016 at 09:38:29PM +0200, Peter Zijlstra wrote:
>
>> > > I _think_ you propose to keep track of all prior held locks and then use
>> > > the union of the held list on the block-chain with the prior held list
>> > > from the complete context.
>> >
>> > Almost right. Only thing we need to do to consider the union is to
>> > connect two chains of two contexts by adding one dependency 'b -> a'.
>>
>> Sure, but how do you arrive at which connection to make. The document is
>> entirely silent on this crucial point.
>>
>> The union between the held-locks of the blocked and prev-held-locks of
>> the release should give a fair indication I think, but then, I've not
>> thought too hard on this yet.
>
> s/union/intersection/
>
> those that are in both sets.

Precisely speaking, I introduces separate chains.

For example,

1. Held-locks of the blocked,
A -> B -> C (which original lockdep builds)

2. Prev-held-locks of the release
G -> H -> I (which original lockdep builds, too)

3. Cross chain (which I introduced newly)
C -> G

Then the 'A -> B -> C -> G -> H -> I' can be traversed
when bfs is performed.

-- 
Thanks,
Byungchul


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra
On Tue, Sep 13, 2016 at 09:38:29PM +0200, Peter Zijlstra wrote:

> > > I _think_ you propose to keep track of all prior held locks and then use
> > > the union of the held list on the block-chain with the prior held list
> > > from the complete context.
> > 
> > Almost right. Only thing we need to do to consider the union is to
> > connect two chains of two contexts by adding one dependency 'b -> a'.
> 
> Sure, but how do you arrive at which connection to make. The document is
> entirely silent on this crucial point.
> 
> The union between the held-locks of the blocked and prev-held-locks of
> the release should give a fair indication I think, but then, I've not
> thought too hard on this yet.

s/union/intersection/

those that are in both sets.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra
On Tue, Sep 13, 2016 at 09:38:29PM +0200, Peter Zijlstra wrote:

> > > I _think_ you propose to keep track of all prior held locks and then use
> > > the union of the held list on the block-chain with the prior held list
> > > from the complete context.
> > 
> > Almost right. Only thing we need to do to consider the union is to
> > connect two chains of two contexts by adding one dependency 'b -> a'.
> 
> Sure, but how do you arrive at which connection to make. The document is
> entirely silent on this crucial point.
> 
> The union between the held-locks of the blocked and prev-held-locks of
> the release should give a fair indication I think, but then, I've not
> thought too hard on this yet.

s/union/intersection/

those that are in both sets.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra
On Wed, Sep 14, 2016 at 02:12:51AM +0900, Byungchul Park wrote:
> On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra  wrote:
> >
> >
> > So the idea is to add support for non-owner serialization primitives,
> > like completions/semaphores, and so far I've not looked at the code yet.
> > I did spend 2+ hours trying to decipher your documentation thing, but am
> > still confused, that thing is exceedingly hard to parse/read.
> >
> > So the typical scenario would be something like:
> >
> > L a L a
> > U a
> > W b C b
> >
> > where L := lock, U := unlock, W := wait_for_completion and C :=
> > complete.
> >
> > On the left, the blocking thread we can easily observe the 'b depends on
> > a' relation, since we block while holding a.
> 
> I think 'a depends on b' relation.
> 
> Why does b depend on a? b depends on a's what?

b blocks while holding a. In any case, for the graph it doesn't matter,
its not a directed graph as such, all we care about is acyclic.

> > On the right however that same relation is hard to see, since by the
> 
> Yes, there's no dependency on the right side of your example.

Well, there is, its just not trivially observable. We must be able to
acquire a in order to complete b, therefore there is a dependency.

> > time we would run complete, a has already been released.
> 
> I will change your example a little bit.
> 
> W b
> ~~~ <- serialized
> L a
> U a
> C b
> 
> (Remind crossrelease considers dependencies in only case that
> lock sequence serialized is observable.)

What does that mean? Any why? This is a random point in time without
actual meaning.

> There's a dependency in this case, like 'b depends on a' relation.
> 'C b' may not be hit, depending on the result of 'L a', that is,
> acquired or not.

With or without a random reference point, the dependency is there.

> > I _think_ you propose to keep track of all prior held locks and then use
> > the union of the held list on the block-chain with the prior held list
> > from the complete context.
> 
> Almost right. Only thing we need to do to consider the union is to
> connect two chains of two contexts by adding one dependency 'b -> a'.

Sure, but how do you arrive at which connection to make. The document is
entirely silent on this crucial point.

The union between the held-locks of the blocked and prev-held-locks of
the release should give a fair indication I think, but then, I've not
thought too hard on this yet.

> > The document describes you have a prior held list, and that its a
> > circular thing, but it then completely fails to specify what gets added
> > and how its used.
> 
> Why does it fail? It keeps them in the form of hlock. This data is enough
> to generate dependencies.

It fails to explain. It just barely mentions you keep them, it doesn't
mention how they're used or why.

> > Also, by it being a circular list (of indeterminate, but finite size),
> > there is the possibility of missing dependencies if they got spooled out
> > by recent activity.
> 
> Yes, right. They might be missed. It means just missing some chances to
> check a deadlock. It's all. Better than do nothing.

Sure, but you didn't specify. Again, the document is very ambiguous and
ill specified.

> > The document has an example in the section 'How cross release works'
> > (the 3rd) that simply cannot be. It lists lock action _after_ acquire,
> 
> Could you tell me more? Why cannot it be?

Once you block you cannot take more locks, that simply doesnt make
sense.

> > but you place the acquire in wait_for_completion. We _block_ there.
> >
> > Is this the general idea?
> >
> > If so, I cannot see how something like:
> >
> > W a W b
> > C b C a
> 
> I didn't tell it works in this case. But it can be a future work. I'm not
> sure but I don't think making it work is impossible at all. But anyway
> current implementation cannot deal with this case.

+4. 'crosslock a -> crosslock b' dependency
+
+   Creating this kind of dependency directly is unnecessary since it can
+   be covered by other kinds of dependencies.

Says something different, doesn't it?

> > On the whole 'release context' thing you want to cast lockdep in, please
> > consider something like:
> >
> > L a L b
> > L b L a
> > U a U a
> > U b U b
> >
> > Note that the release order of locks is immaterial and can generate
> > 'wrong' dependencies. Note how on both sides a is released under b,
> 
> Who said the release order is important? Wrong for what?

Well, again, you mention 'release context' without actually specifying
what you mean with that.

L a
L b
U b

and

L a
L b
U a

Here the unlocks obviously have different context. Without further
specification its simply impossible to tell what you mean.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra
On Wed, Sep 14, 2016 at 02:12:51AM +0900, Byungchul Park wrote:
> On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra  wrote:
> >
> >
> > So the idea is to add support for non-owner serialization primitives,
> > like completions/semaphores, and so far I've not looked at the code yet.
> > I did spend 2+ hours trying to decipher your documentation thing, but am
> > still confused, that thing is exceedingly hard to parse/read.
> >
> > So the typical scenario would be something like:
> >
> > L a L a
> > U a
> > W b C b
> >
> > where L := lock, U := unlock, W := wait_for_completion and C :=
> > complete.
> >
> > On the left, the blocking thread we can easily observe the 'b depends on
> > a' relation, since we block while holding a.
> 
> I think 'a depends on b' relation.
> 
> Why does b depend on a? b depends on a's what?

b blocks while holding a. In any case, for the graph it doesn't matter,
its not a directed graph as such, all we care about is acyclic.

> > On the right however that same relation is hard to see, since by the
> 
> Yes, there's no dependency on the right side of your example.

Well, there is, its just not trivially observable. We must be able to
acquire a in order to complete b, therefore there is a dependency.

> > time we would run complete, a has already been released.
> 
> I will change your example a little bit.
> 
> W b
> ~~~ <- serialized
> L a
> U a
> C b
> 
> (Remind crossrelease considers dependencies in only case that
> lock sequence serialized is observable.)

What does that mean? Any why? This is a random point in time without
actual meaning.

> There's a dependency in this case, like 'b depends on a' relation.
> 'C b' may not be hit, depending on the result of 'L a', that is,
> acquired or not.

With or without a random reference point, the dependency is there.

> > I _think_ you propose to keep track of all prior held locks and then use
> > the union of the held list on the block-chain with the prior held list
> > from the complete context.
> 
> Almost right. Only thing we need to do to consider the union is to
> connect two chains of two contexts by adding one dependency 'b -> a'.

Sure, but how do you arrive at which connection to make. The document is
entirely silent on this crucial point.

The union between the held-locks of the blocked and prev-held-locks of
the release should give a fair indication I think, but then, I've not
thought too hard on this yet.

> > The document describes you have a prior held list, and that its a
> > circular thing, but it then completely fails to specify what gets added
> > and how its used.
> 
> Why does it fail? It keeps them in the form of hlock. This data is enough
> to generate dependencies.

It fails to explain. It just barely mentions you keep them, it doesn't
mention how they're used or why.

> > Also, by it being a circular list (of indeterminate, but finite size),
> > there is the possibility of missing dependencies if they got spooled out
> > by recent activity.
> 
> Yes, right. They might be missed. It means just missing some chances to
> check a deadlock. It's all. Better than do nothing.

Sure, but you didn't specify. Again, the document is very ambiguous and
ill specified.

> > The document has an example in the section 'How cross release works'
> > (the 3rd) that simply cannot be. It lists lock action _after_ acquire,
> 
> Could you tell me more? Why cannot it be?

Once you block you cannot take more locks, that simply doesnt make
sense.

> > but you place the acquire in wait_for_completion. We _block_ there.
> >
> > Is this the general idea?
> >
> > If so, I cannot see how something like:
> >
> > W a W b
> > C b C a
> 
> I didn't tell it works in this case. But it can be a future work. I'm not
> sure but I don't think making it work is impossible at all. But anyway
> current implementation cannot deal with this case.

+4. 'crosslock a -> crosslock b' dependency
+
+   Creating this kind of dependency directly is unnecessary since it can
+   be covered by other kinds of dependencies.

Says something different, doesn't it?

> > On the whole 'release context' thing you want to cast lockdep in, please
> > consider something like:
> >
> > L a L b
> > L b L a
> > U a U a
> > U b U b
> >
> > Note that the release order of locks is immaterial and can generate
> > 'wrong' dependencies. Note how on both sides a is released under b,
> 
> Who said the release order is important? Wrong for what?

Well, again, you mention 'release context' without actually specifying
what you mean with that.

L a
L b
U b

and

L a
L b
U a

Here the unlocks obviously have different context. Without further
specification its simply impossible to tell what you mean.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra  wrote:
>
>
> So the idea is to add support for non-owner serialization primitives,
> like completions/semaphores, and so far I've not looked at the code yet.
> I did spend 2+ hours trying to decipher your documentation thing, but am
> still confused, that thing is exceedingly hard to parse/read.
>
> So the typical scenario would be something like:
>
> L a L a
> U a
> W b C b
>
> where L := lock, U := unlock, W := wait_for_completion and C :=
> complete.
>
> On the left, the blocking thread we can easily observe the 'b depends on
> a' relation, since we block while holding a.

I think 'a depends on b' relation.

Why does b depend on a? b depends on a's what?

> On the right however that same relation is hard to see, since by the

Yes, there's no dependency on the right side of your example.

> time we would run complete, a has already been released.

I will change your example a little bit.

W b
~~~ <- serialized
L a
U a
C b

(Remind crossrelease considers dependencies in only case that
lock sequence serialized is observable.)

There's a dependency in this case, like 'b depends on a' relation.
'C b' may not be hit, depending on the result of 'L a', that is,
acquired or not.

> I _think_ you propose to keep track of all prior held locks and then use
> the union of the held list on the block-chain with the prior held list
> from the complete context.

Almost right. Only thing we need to do to consider the union is to
connect two chains of two contexts by adding one dependency 'b -> a'.

> The document describes you have a prior held list, and that its a
> circular thing, but it then completely fails to specify what gets added
> and how its used.

Why does it fail? It keeps them in the form of hlock. This data is enough
to generate dependencies.

> Also, by it being a circular list (of indeterminate, but finite size),
> there is the possibility of missing dependencies if they got spooled out
> by recent activity.

Yes, right. They might be missed. It means just missing some chances to
check a deadlock. It's all. Better than do nothing.

Furthermore, It just only needs 10~50 entries on qemu-i386 4core since
it's optimized as far as possible so that it only considers essential ones
instead of all the prior held list. The number of entries might need to be
changed on a large system. It's future work.

> The document has an example in the section 'How cross release works'
> (the 3rd) that simply cannot be. It lists lock action _after_ acquire,

Could you tell me more? Why cannot it be?

> but you place the acquire in wait_for_completion. We _block_ there.
>
> Is this the general idea?
>
> If so, I cannot see how something like:
>
> W a W b
> C b C a

I didn't tell it works in this case. But it can be a future work. I'm not
sure but I don't think making it work is impossible at all. But anyway
current implementation cannot deal with this case.

> would work, somewhere in that document it states that this would be
> handled by the existing dependencies, but I cannot see how. The blocking

Nowhere I mentioned it can be handled.

However it will work if we can identify and add 'a -> b' and 'b -> a'
dependencies. It's fairly possible.

> thread (either one) has no held context, therefore the previously
> mentioned union of held and prev-held is empty.

Right, in current implementation.

> The alternative is not doing the union, but then you end up with endless
> pointless dependencies afaict.

That's a matter of whether or not we can identify additional dependencies.
I proposed the way to find certain and additional dependencies which original
lockdep missed. Of course there might be other dependencies which cannot
be identified with current implementation. It must be a future work.

> On the whole 'release context' thing you want to cast lockdep in, please
> consider something like:
>
> L a L b
> L b L a
> U a U a
> U b U b
>
> Note that the release order of locks is immaterial and can generate
> 'wrong' dependencies. Note how on both sides a is released under b,

Who said the release order is important? Wrong for what?

Using the way crossrelease works, AB-BA deadlock of course can be
detected, even though typical lock like 'a' and 'b' does not need to use
crossrelease feature at all.

On left side,
L a : queue a
L b : queue b
U a : add 'a -> b' (consider only locks having been tried since L a was held.)
U b : nop

On right side,
L b : queue b
L a : queue a
U a : nop
U b : add 'b -> a' (consider only locks having been tried since L b was held.)

AB-BA deadlock can be detected with dependencies 'a -> b' and 'b -> a'.
Fairly true dependencies can be generated with this concept, and never fail
to observe this kind of deadlock.

> failing to observe the stil obvious AB-BA deadlock.
>
> Note that lockdep doesn't model release or anything, it models
> blocked-on relations, like PI 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra  wrote:
>
>
> So the idea is to add support for non-owner serialization primitives,
> like completions/semaphores, and so far I've not looked at the code yet.
> I did spend 2+ hours trying to decipher your documentation thing, but am
> still confused, that thing is exceedingly hard to parse/read.
>
> So the typical scenario would be something like:
>
> L a L a
> U a
> W b C b
>
> where L := lock, U := unlock, W := wait_for_completion and C :=
> complete.
>
> On the left, the blocking thread we can easily observe the 'b depends on
> a' relation, since we block while holding a.

I think 'a depends on b' relation.

Why does b depend on a? b depends on a's what?

> On the right however that same relation is hard to see, since by the

Yes, there's no dependency on the right side of your example.

> time we would run complete, a has already been released.

I will change your example a little bit.

W b
~~~ <- serialized
L a
U a
C b

(Remind crossrelease considers dependencies in only case that
lock sequence serialized is observable.)

There's a dependency in this case, like 'b depends on a' relation.
'C b' may not be hit, depending on the result of 'L a', that is,
acquired or not.

> I _think_ you propose to keep track of all prior held locks and then use
> the union of the held list on the block-chain with the prior held list
> from the complete context.

Almost right. Only thing we need to do to consider the union is to
connect two chains of two contexts by adding one dependency 'b -> a'.

> The document describes you have a prior held list, and that its a
> circular thing, but it then completely fails to specify what gets added
> and how its used.

Why does it fail? It keeps them in the form of hlock. This data is enough
to generate dependencies.

> Also, by it being a circular list (of indeterminate, but finite size),
> there is the possibility of missing dependencies if they got spooled out
> by recent activity.

Yes, right. They might be missed. It means just missing some chances to
check a deadlock. It's all. Better than do nothing.

Furthermore, It just only needs 10~50 entries on qemu-i386 4core since
it's optimized as far as possible so that it only considers essential ones
instead of all the prior held list. The number of entries might need to be
changed on a large system. It's future work.

> The document has an example in the section 'How cross release works'
> (the 3rd) that simply cannot be. It lists lock action _after_ acquire,

Could you tell me more? Why cannot it be?

> but you place the acquire in wait_for_completion. We _block_ there.
>
> Is this the general idea?
>
> If so, I cannot see how something like:
>
> W a W b
> C b C a

I didn't tell it works in this case. But it can be a future work. I'm not
sure but I don't think making it work is impossible at all. But anyway
current implementation cannot deal with this case.

> would work, somewhere in that document it states that this would be
> handled by the existing dependencies, but I cannot see how. The blocking

Nowhere I mentioned it can be handled.

However it will work if we can identify and add 'a -> b' and 'b -> a'
dependencies. It's fairly possible.

> thread (either one) has no held context, therefore the previously
> mentioned union of held and prev-held is empty.

Right, in current implementation.

> The alternative is not doing the union, but then you end up with endless
> pointless dependencies afaict.

That's a matter of whether or not we can identify additional dependencies.
I proposed the way to find certain and additional dependencies which original
lockdep missed. Of course there might be other dependencies which cannot
be identified with current implementation. It must be a future work.

> On the whole 'release context' thing you want to cast lockdep in, please
> consider something like:
>
> L a L b
> L b L a
> U a U a
> U b U b
>
> Note that the release order of locks is immaterial and can generate
> 'wrong' dependencies. Note how on both sides a is released under b,

Who said the release order is important? Wrong for what?

Using the way crossrelease works, AB-BA deadlock of course can be
detected, even though typical lock like 'a' and 'b' does not need to use
crossrelease feature at all.

On left side,
L a : queue a
L b : queue b
U a : add 'a -> b' (consider only locks having been tried since L a was held.)
U b : nop

On right side,
L b : queue b
L a : queue a
U a : nop
U b : add 'b -> a' (consider only locks having been tried since L b was held.)

AB-BA deadlock can be detected with dependencies 'a -> b' and 'b -> a'.
Fairly true dependencies can be generated with this concept, and never fail
to observe this kind of deadlock.

> failing to observe the stil obvious AB-BA deadlock.
>
> Note that lockdep doesn't model release or anything, it models
> blocked-on relations, like PI does.

As you said, 

Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Tue, Sep 13, 2016 at 7:05 PM, Peter Zijlstra  wrote:
> On Tue, Sep 13, 2016 at 06:45:06PM +0900, Byungchul Park wrote:
>> Crossrelease feature calls a lock 'crosslock' if it is releasable
>> in any context. For crosslock, all locks having been held in the
>> release context of the crosslock, until eventually the crosslock
>> will be released, have dependency with the crosslock.
>>
>> Using crossrelease feature, we can detect deadlock possibility even
>> for lock_page(), wait_for_complete() and so on.
>>
>
> Completely inadequate.
>
> Please explain how cross-release does what it does. Talk about lock
> graphs and such.

Hello,

Could you tell me about what you intend, in detail?
I'm now asking you since I really don't know it.

The reason I reworked on documentation was to
answer your requests like "explain mathematically",
"tell how it works with graph" and so on. Should I
do anything else? I really don't know it.

If I missed something, please let me know. Then I
can do whatever you want if it's necessary.

> I do not have time to reverse engineer this stuff.

Why don't you read the document in the last patch
first? The document is my answer for your requests
you asked in version 1 thread. Insufficient?


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
On Tue, Sep 13, 2016 at 7:05 PM, Peter Zijlstra  wrote:
> On Tue, Sep 13, 2016 at 06:45:06PM +0900, Byungchul Park wrote:
>> Crossrelease feature calls a lock 'crosslock' if it is releasable
>> in any context. For crosslock, all locks having been held in the
>> release context of the crosslock, until eventually the crosslock
>> will be released, have dependency with the crosslock.
>>
>> Using crossrelease feature, we can detect deadlock possibility even
>> for lock_page(), wait_for_complete() and so on.
>>
>
> Completely inadequate.
>
> Please explain how cross-release does what it does. Talk about lock
> graphs and such.

Hello,

Could you tell me about what you intend, in detail?
I'm now asking you since I really don't know it.

The reason I reworked on documentation was to
answer your requests like "explain mathematically",
"tell how it works with graph" and so on. Should I
do anything else? I really don't know it.

If I missed something, please let me know. Then I
can do whatever you want if it's necessary.

> I do not have time to reverse engineer this stuff.

Why don't you read the document in the last patch
first? The document is my answer for your requests
you asked in version 1 thread. Insufficient?


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra


So the idea is to add support for non-owner serialization primitives,
like completions/semaphores, and so far I've not looked at the code yet.
I did spend 2+ hours trying to decipher your documentation thing, but am
still confused, that thing is exceedingly hard to parse/read.

So the typical scenario would be something like:

L a L a
U a
W b C b

where L := lock, U := unlock, W := wait_for_completion and C :=
complete.

On the left, the blocking thread we can easily observe the 'b depends on
a' relation, since we block while holding a.

On the right however that same relation is hard to see, since by the
time we would run complete, a has already been released.

I _think_ you propose to keep track of all prior held locks and then use
the union of the held list on the block-chain with the prior held list
from the complete context.

The document describes you have a prior held list, and that its a
circular thing, but it then completely fails to specify what gets added
and how its used.

Also, by it being a circular list (of indeterminate, but finite size),
there is the possibility of missing dependencies if they got spooled out
by recent activity.

The document has an example in the section 'How cross release works'
(the 3rd) that simply cannot be. It lists lock action _after_ acquire,
but you place the acquire in wait_for_completion. We _block_ there.

Is this the general idea?

If so, I cannot see how something like:

W a W b
C b C a

would work, somewhere in that document it states that this would be
handled by the existing dependencies, but I cannot see how. The blocking
thread (either one) has no held context, therefore the previously
mentioned union of held and prev-held is empty.

The alternative is not doing the union, but then you end up with endless
pointless dependencies afaict.


On the whole 'release context' thing you want to cast lockdep in, please
consider something like:

L a L b
L b L a
U a U a
U b U b

Note that the release order of locks is immaterial and can generate
'wrong' dependencies. Note how on both sides a is released under b,
failing to observe the stil obvious AB-BA deadlock.

Note that lockdep doesn't model release or anything, it models
blocked-on relations, like PI does.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra


So the idea is to add support for non-owner serialization primitives,
like completions/semaphores, and so far I've not looked at the code yet.
I did spend 2+ hours trying to decipher your documentation thing, but am
still confused, that thing is exceedingly hard to parse/read.

So the typical scenario would be something like:

L a L a
U a
W b C b

where L := lock, U := unlock, W := wait_for_completion and C :=
complete.

On the left, the blocking thread we can easily observe the 'b depends on
a' relation, since we block while holding a.

On the right however that same relation is hard to see, since by the
time we would run complete, a has already been released.

I _think_ you propose to keep track of all prior held locks and then use
the union of the held list on the block-chain with the prior held list
from the complete context.

The document describes you have a prior held list, and that its a
circular thing, but it then completely fails to specify what gets added
and how its used.

Also, by it being a circular list (of indeterminate, but finite size),
there is the possibility of missing dependencies if they got spooled out
by recent activity.

The document has an example in the section 'How cross release works'
(the 3rd) that simply cannot be. It lists lock action _after_ acquire,
but you place the acquire in wait_for_completion. We _block_ there.

Is this the general idea?

If so, I cannot see how something like:

W a W b
C b C a

would work, somewhere in that document it states that this would be
handled by the existing dependencies, but I cannot see how. The blocking
thread (either one) has no held context, therefore the previously
mentioned union of held and prev-held is empty.

The alternative is not doing the union, but then you end up with endless
pointless dependencies afaict.


On the whole 'release context' thing you want to cast lockdep in, please
consider something like:

L a L b
L b L a
U a U a
U b U b

Note that the release order of locks is immaterial and can generate
'wrong' dependencies. Note how on both sides a is released under b,
failing to observe the stil obvious AB-BA deadlock.

Note that lockdep doesn't model release or anything, it models
blocked-on relations, like PI does.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra
On Tue, Sep 13, 2016 at 12:05:20PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 13, 2016 at 06:45:06PM +0900, Byungchul Park wrote:
> > Crossrelease feature calls a lock 'crosslock' if it is releasable
> > in any context. For crosslock, all locks having been held in the
> > release context of the crosslock, until eventually the crosslock
> > will be released, have dependency with the crosslock.
> > 
> > Using crossrelease feature, we can detect deadlock possibility even
> > for lock_page(), wait_for_complete() and so on.
> > 
> 
> Completely inadequate.
> 
> Please explain how cross-release does what it does. Talk about lock
> graphs and such.
> 
> I do not have time to reverse engineer this stuff.

Blergh, I'll do it anyway.. hold on for an email asking specific
questions.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra
On Tue, Sep 13, 2016 at 12:05:20PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 13, 2016 at 06:45:06PM +0900, Byungchul Park wrote:
> > Crossrelease feature calls a lock 'crosslock' if it is releasable
> > in any context. For crosslock, all locks having been held in the
> > release context of the crosslock, until eventually the crosslock
> > will be released, have dependency with the crosslock.
> > 
> > Using crossrelease feature, we can detect deadlock possibility even
> > for lock_page(), wait_for_complete() and so on.
> > 
> 
> Completely inadequate.
> 
> Please explain how cross-release does what it does. Talk about lock
> graphs and such.
> 
> I do not have time to reverse engineer this stuff.

Blergh, I'll do it anyway.. hold on for an email asking specific
questions.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra
On Tue, Sep 13, 2016 at 06:45:06PM +0900, Byungchul Park wrote:
> Crossrelease feature calls a lock 'crosslock' if it is releasable
> in any context. For crosslock, all locks having been held in the
> release context of the crosslock, until eventually the crosslock
> will be released, have dependency with the crosslock.
> 
> Using crossrelease feature, we can detect deadlock possibility even
> for lock_page(), wait_for_complete() and so on.
> 

Completely inadequate.

Please explain how cross-release does what it does. Talk about lock
graphs and such.

I do not have time to reverse engineer this stuff.


Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Peter Zijlstra
On Tue, Sep 13, 2016 at 06:45:06PM +0900, Byungchul Park wrote:
> Crossrelease feature calls a lock 'crosslock' if it is releasable
> in any context. For crosslock, all locks having been held in the
> release context of the crosslock, until eventually the crosslock
> will be released, have dependency with the crosslock.
> 
> Using crossrelease feature, we can detect deadlock possibility even
> for lock_page(), wait_for_complete() and so on.
> 

Completely inadequate.

Please explain how cross-release does what it does. Talk about lock
graphs and such.

I do not have time to reverse engineer this stuff.


[PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
Crossrelease feature calls a lock 'crosslock' if it is releasable
in any context. For crosslock, all locks having been held in the
release context of the crosslock, until eventually the crosslock
will be released, have dependency with the crosslock.

Using crossrelease feature, we can detect deadlock possibility even
for lock_page(), wait_for_complete() and so on.

Signed-off-by: Byungchul Park 
---
 include/linux/irqflags.h |  12 +-
 include/linux/lockdep.h  | 122 +++
 include/linux/sched.h|   5 +
 kernel/exit.c|   9 +
 kernel/fork.c|  20 ++
 kernel/locking/lockdep.c | 517 +--
 lib/Kconfig.debug|  13 ++
 7 files changed, 682 insertions(+), 16 deletions(-)

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 5dd1272..b1854fa 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -23,9 +23,17 @@
 # define trace_softirq_context(p)  ((p)->softirq_context)
 # define trace_hardirqs_enabled(p) ((p)->hardirqs_enabled)
 # define trace_softirqs_enabled(p) ((p)->softirqs_enabled)
-# define trace_hardirq_enter() do { current->hardirq_context++; } while (0)
+# define trace_hardirq_enter() \
+do {   \
+   current->hardirq_context++; \
+   crossrelease_hardirq_start();   \
+} while (0)
 # define trace_hardirq_exit()  do { current->hardirq_context--; } while (0)
-# define lockdep_softirq_enter()   do { current->softirq_context++; } 
while (0)
+# define lockdep_softirq_enter()   \
+do {   \
+   current->softirq_context++; \
+   crossrelease_softirq_start();   \
+} while (0)
 # define lockdep_softirq_exit()do { current->softirq_context--; } 
while (0)
 # define INIT_TRACE_IRQFLAGS   .softirqs_enabled = 1,
 #else
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index eabe013..6b3708b 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -108,6 +108,12 @@ struct lock_class {
unsigned long   contention_point[LOCKSTAT_POINTS];
unsigned long   contending_point[LOCKSTAT_POINTS];
 #endif
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+   /*
+* Flag to indicate whether it's a crosslock or normal.
+*/
+   int cross;
+#endif
 };
 
 #ifdef CONFIG_LOCK_STAT
@@ -143,6 +149,9 @@ struct lock_class_stats lock_stats(struct lock_class 
*class);
 void clear_lock_stats(struct lock_class *class);
 #endif
 
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+struct cross_lock;
+#endif
 /*
  * Map the lock object (the lock instance) to the lock-class object.
  * This is embedded into specific lock instances:
@@ -155,6 +164,9 @@ struct lockdep_map {
int cpu;
unsigned long   ip;
 #endif
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+   struct cross_lock   *xlock;
+#endif
 };
 
 static inline void lockdep_copy_map(struct lockdep_map *to,
@@ -258,7 +270,82 @@ struct held_lock {
unsigned int hardirqs_off:1;
unsigned int references:12; /* 32 
bits */
unsigned int pin_count;
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+   /*
+* This is used to find out the first plock among plocks having
+* been acquired since a crosslock was held. Crossrelease feature
+* uses chain cache between the crosslock and the first plock to
+* avoid building unnecessary dependencies, like how lockdep uses
+* a sort of chain cache for normal locks.
+*/
+   unsigned int gen_id;
+#endif
+};
+
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+#define MAX_PLOCK_TRACE_ENTRIES5
+
+/*
+ * This is for keeping locks waiting for commit to happen so that
+ * dependencies are actually built later at commit step.
+ *
+ * Every task_struct has an array of pend_lock. Each entiry will be
+ * added with a lock whenever lock_acquire() is called for normal lock.
+ */
+struct pend_lock {
+   /*
+* prev_gen_id is used to check whether any other hlock in the
+* current is already dealing with the xlock, with which commit
+* is performed. If so, this plock can be skipped.
+*/
+   unsigned intprev_gen_id;
+   /*
+* A kind of global timestamp increased and set when this plock
+* is inserted.
+*/
+   unsigned intgen_id;
+
+   int hardirq_context;
+   int softirq_context;
+
+   /*
+* Whenever irq happens, these are updated so that we can
+* distinguish each irq context uniquely.
+*/
+   unsigned inthardirq_id;
+   unsigned intsoftirq_id;
+
+   /*
+* Seperate stack_trace data. This will be used at commit step.
+*/
+   struct stack_trace   

[PATCH v3 07/15] lockdep: Implement crossrelease feature

2016-09-13 Thread Byungchul Park
Crossrelease feature calls a lock 'crosslock' if it is releasable
in any context. For crosslock, all locks having been held in the
release context of the crosslock, until eventually the crosslock
will be released, have dependency with the crosslock.

Using crossrelease feature, we can detect deadlock possibility even
for lock_page(), wait_for_complete() and so on.

Signed-off-by: Byungchul Park 
---
 include/linux/irqflags.h |  12 +-
 include/linux/lockdep.h  | 122 +++
 include/linux/sched.h|   5 +
 kernel/exit.c|   9 +
 kernel/fork.c|  20 ++
 kernel/locking/lockdep.c | 517 +--
 lib/Kconfig.debug|  13 ++
 7 files changed, 682 insertions(+), 16 deletions(-)

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 5dd1272..b1854fa 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -23,9 +23,17 @@
 # define trace_softirq_context(p)  ((p)->softirq_context)
 # define trace_hardirqs_enabled(p) ((p)->hardirqs_enabled)
 # define trace_softirqs_enabled(p) ((p)->softirqs_enabled)
-# define trace_hardirq_enter() do { current->hardirq_context++; } while (0)
+# define trace_hardirq_enter() \
+do {   \
+   current->hardirq_context++; \
+   crossrelease_hardirq_start();   \
+} while (0)
 # define trace_hardirq_exit()  do { current->hardirq_context--; } while (0)
-# define lockdep_softirq_enter()   do { current->softirq_context++; } 
while (0)
+# define lockdep_softirq_enter()   \
+do {   \
+   current->softirq_context++; \
+   crossrelease_softirq_start();   \
+} while (0)
 # define lockdep_softirq_exit()do { current->softirq_context--; } 
while (0)
 # define INIT_TRACE_IRQFLAGS   .softirqs_enabled = 1,
 #else
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index eabe013..6b3708b 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -108,6 +108,12 @@ struct lock_class {
unsigned long   contention_point[LOCKSTAT_POINTS];
unsigned long   contending_point[LOCKSTAT_POINTS];
 #endif
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+   /*
+* Flag to indicate whether it's a crosslock or normal.
+*/
+   int cross;
+#endif
 };
 
 #ifdef CONFIG_LOCK_STAT
@@ -143,6 +149,9 @@ struct lock_class_stats lock_stats(struct lock_class 
*class);
 void clear_lock_stats(struct lock_class *class);
 #endif
 
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+struct cross_lock;
+#endif
 /*
  * Map the lock object (the lock instance) to the lock-class object.
  * This is embedded into specific lock instances:
@@ -155,6 +164,9 @@ struct lockdep_map {
int cpu;
unsigned long   ip;
 #endif
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+   struct cross_lock   *xlock;
+#endif
 };
 
 static inline void lockdep_copy_map(struct lockdep_map *to,
@@ -258,7 +270,82 @@ struct held_lock {
unsigned int hardirqs_off:1;
unsigned int references:12; /* 32 
bits */
unsigned int pin_count;
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+   /*
+* This is used to find out the first plock among plocks having
+* been acquired since a crosslock was held. Crossrelease feature
+* uses chain cache between the crosslock and the first plock to
+* avoid building unnecessary dependencies, like how lockdep uses
+* a sort of chain cache for normal locks.
+*/
+   unsigned int gen_id;
+#endif
+};
+
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+#define MAX_PLOCK_TRACE_ENTRIES5
+
+/*
+ * This is for keeping locks waiting for commit to happen so that
+ * dependencies are actually built later at commit step.
+ *
+ * Every task_struct has an array of pend_lock. Each entiry will be
+ * added with a lock whenever lock_acquire() is called for normal lock.
+ */
+struct pend_lock {
+   /*
+* prev_gen_id is used to check whether any other hlock in the
+* current is already dealing with the xlock, with which commit
+* is performed. If so, this plock can be skipped.
+*/
+   unsigned intprev_gen_id;
+   /*
+* A kind of global timestamp increased and set when this plock
+* is inserted.
+*/
+   unsigned intgen_id;
+
+   int hardirq_context;
+   int softirq_context;
+
+   /*
+* Whenever irq happens, these are updated so that we can
+* distinguish each irq context uniquely.
+*/
+   unsigned inthardirq_id;
+   unsigned intsoftirq_id;
+
+   /*
+* Seperate stack_trace data. This will be used at commit step.
+*/
+   struct stack_trace  trace;
+