Re: [PATCH] Priority Lists for the RT mutex

2005-04-12 Thread Daniel Walker
On Sun, 2005-04-10 at 04:09, Ingo Molnar wrote:
> Unless i'm missing something, this could be implemented by detaching
> lock->owner_prio from lock->owner - via e.g. negative values. Thus some
> minimal code would check whether we need the owner's priority in the PI
> logic, or the semaphore's "own" priority level.

The owner's priority should be set to the semaphore's priority, or to
the highest priority of all the semaphores that it has locked.

Daniel
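
A minimal sketch of the idea discussed above, under stated assumptions:
the struct layouts, the LOCK_PRIO_FROM_OWNER sentinel and both function
names are hypothetical and are not taken from the PREEMPT_RT patch. A
negative owner_prio means "use the owner's task priority"; any other
value is the semaphore's own priority level. Lower numbers mean higher
priority, as in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the PREEMPT_RT structures. */
struct task {
	int prio;		/* lower value = higher priority */
};

#define LOCK_PRIO_FROM_OWNER (-1)	/* assumed sentinel: no own priority set */

struct lock {
	struct task *owner;
	int owner_prio;		/* >= 0: the lock's own priority level */
};

/* Priority the PI logic would use for this lock. */
int lock_effective_prio(const struct lock *l)
{
	if (l->owner_prio >= 0)
		return l->owner_prio;	/* the semaphore's "own" priority */
	assert(l->owner != NULL);	/* the fallback needs an owner */
	return l->owner->prio;		/* the owner's task priority */
}

/* Highest priority (lowest value) over the task's base priority and all
 * the locks it currently holds. */
int highest_held_prio(struct lock *const *held, size_t n, int base_prio)
{
	int best = base_prio;

	for (size_t i = 0; i < n; i++) {
		int p = lock_effective_prio(held[i]);

		if (p < best)
			best = p;
	}
	return best;
}
```

The second function corresponds to "the highest priority of all the
semaphores that it has locked"; how the set of held locks is tracked is
left open here.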

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Priority Lists for the RT mutex

2005-04-12 Thread Esben Nielsen
I looked at the PI-code to see what priority the task (old_owner below)
would end up with when it released a lock. From rt.c:

	prio = mutex_getprio(old_owner);
	if (new_owner && !plist_empty(&new_owner->pi_waiters)) {
		w = plist_entry(&new_owner->pi_waiters,
				struct rt_mutex_waiter, pi_list);
		prio = w->task->prio;
	}
	if (prio != old_owner->prio)
		pi_setprio(lock, old_owner, prio);

What has new_owner to do with it? Shouldn't it be old_owner in these
lines? I.e. the prio we want to set old_owner to should be the prio of the
head of the old_owner->pi_waiters, not the new_owner!

Esben
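
A minimal sketch of the computation argued for above, assuming
simplified stand-in types (struct waiter, struct task and
prio_after_release() are hypothetical, not the rt.c definitions). The
waiter list is assumed to be sorted with the highest priority (lowest
value) first, and the sketch takes the more urgent of the task's normal
priority and the head waiter's priority, which is a slight variation on
the quoted code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the structures in rt.c. */
struct waiter {
	int prio;		/* lower value = higher priority */
	struct waiter *next;	/* list is sorted, most urgent first */
};

struct task {
	int normal_prio;	/* priority without any PI boost */
	int prio;		/* current (possibly boosted) priority */
	struct waiter *pi_waiters; /* waiters on locks this task still owns */
};

/* Priority old_owner should run at after it has released a lock:
 * derived from old_owner's own remaining waiters, not new_owner's. */
int prio_after_release(const struct task *old_owner)
{
	int prio;

	assert(old_owner != NULL);
	prio = old_owner->normal_prio;
	if (old_owner->pi_waiters && old_owner->pi_waiters->prio < prio)
		prio = old_owner->pi_waiters->prio;
	return prio;
}
```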


On Mon, 11 Apr 2005, Ingo Molnar wrote:

> 
> * Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
> 
> > Let me re-phrase then: it is a must have only on PI, to make sure you 
> > don't have a loop when doing it. Maybe is a consequence of the 
> > algorithm I chose. -However- it should be possible to disable it in 
> > cases where you are reasonably sure it won't happen (such as kernel 
> > code). In any case, AFAIR, I still did not implement it.
> 
> are there cases where userspace wants to disable deadlock-detection for 
> its own locks?
> 
> the deadlock detector in PREEMPT_RT is pretty much specialized for 
> debugging (it does all sorts of weird locking tricks to get the first 
> deadlock out, and to really report it on the console), but it ought to 
> be possible to make it usable for userspace-controlled locks as well.
> 
>   Ingo
> 



Re: [PATCH] Priority Lists for the RT mutex

2005-04-12 Thread Paul E. McKenney
On Thu, Apr 07, 2005 at 10:52:25AM -0700, Daniel Walker wrote:
> 
> Source: Daniel Walker <[EMAIL PROTECTED]> MontaVista Software, Inc
> Description:
>   This patch adds the priority list data structure from Inaky
> Perez-Gonzalez to the Preempt Real-Time mutex.
> 
> the patch order is (starting with a 2.6.11 kernel tree),
> 
> patch-2.6.12-rc2
> realtime-preempt-2.6.12-rc2-V0.7.44-01
>   
> Signed-off-by: Daniel Walker <[EMAIL PROTECTED]>
> 
> Index: linux-2.6.11/include/linux/plist.h
> ===
> --- linux-2.6.11.orig/include/linux/plist.h   1970-01-01 00:00:00.0 
> +
> +++ linux-2.6.11/include/linux/plist.h2005-04-07 17:47:42.0 
> +
> @@ -0,0 +1,310 @@

[ . . . ]

> +/* Grunt to do the real removal work of @pl from the plist. */
> +static inline
> +void __plist_del (struct plist *pl)
> +{
> +	struct list_head *victim;
> +	if (list_empty (&pl->dp_node))		/* SP-node, not head */
> +		victim = &pl->sp_node;
> +	else if (list_empty (&pl->sp_node))	/* DP-node, empty SP list */
> +		victim = &pl->dp_node;
> +	else {					/* SP list head, not empty */
> +		struct plist *pl_new = container_of (pl->sp_node.next,
> +						     struct plist, sp_node);
> +		victim = &pl->sp_node;
> +		list_replace_rcu (&pl->dp_node, &pl_new->dp_node);

If you are protecting this list with RCU...

> +	}
> +	list_del_init (victim);

... you need to wait for a grace period before deleting the element
removed from the list.

Or are you just using list_replace_rcu() for its replacement capability?
If so, seems like it might be worthwhile to make a list_replace().
This would get rid of the memory barrier, and also keep from confusing
people like myself.  ;-)

Thanx, Paul
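
A minimal sketch of what a plain list_replace() could look like,
assuming a simplified circular doubly-linked list_head (the list_init()
and list_add_tail() helpers here are local stand-ins written for this
example, not the kernel's list.h). It swaps one node for another with
ordinary stores and no memory barrier, so it is only suitable when the
list is protected by a lock rather than by RCU.

```c
#include <assert.h>

struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert node n just before head, i.e. at the tail of the list. */
void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Put repl where old used to be. No barriers: readers must be excluded
 * by a lock. old must be linked to at least one other node, and its own
 * pointers are left untouched (stale) afterwards. */
void list_replace(struct list_head *old, struct list_head *repl)
{
	assert(old->next != old);
	repl->next = old->next;
	repl->next->prev = repl;
	repl->prev = old->prev;
	repl->prev->next = repl;
}
```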


RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Bill Huey (hui) [mailto:[EMAIL PROTECTED]
>
>> Quick fix: the usual. Enable deadlock detection and if it
>> returns deadlock, assume it is locked already and proceed (or
>> do a recursive mutex, or a trylock).
>
>You have to be joking me ? geez.
>...

This is way *more* common than you think--I've seen it around
some big multithreaded OSS packages [can't remember which now]. 

>> Sure--and because most was for legacy reasons that adhered to
>> POSIX strictly, it was very simple: we need POSIX this, that and
>> that (PI, proper adherence to scheduler policy wake up/rt-behaviour,
>> deadlock detection, etc).
>
>Some of this stuff sounds like recursive locking. Would this be a
>better expression to solve the "top level API locking" problem
>you're referring to ?

Bingo. That's another way to "fix" it. Luckily, recursive locking
can be safely and quickly done in user space (I own this lock,
ergo I just inc the lock count).
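
A minimal sketch of the "I own this lock, ergo I just inc the lock
count" bookkeeping, under loud assumptions: struct rec_lock,
rec_trylock() and rec_unlock() are hypothetical names, thread ids are
plain integers, and the sketch models only the owner/count logic. It is
not thread-safe as written; a real implementation would take the
underlying mutex or futex atomically in the "lock is free" path.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER 0	/* assumed: valid thread ids are non-zero */

struct rec_lock {
	int owner;	/* id of the owning thread, NO_OWNER if free */
	unsigned count;	/* recursion depth while owned */
};

/* Returns true if the caller now holds the lock (first or nested
 * acquisition), false if another thread owns it and the caller would
 * have to block. */
bool rec_trylock(struct rec_lock *l, int self)
{
	if (l->owner == self) {		/* already mine: just count */
		l->count++;
		return true;
	}
	if (l->owner != NO_OWNER)
		return false;
	l->owner = self;		/* real code: atomic acquire here */
	l->count = 1;
	return true;
}

/* Returns false if the caller does not own the lock. */
bool rec_unlock(struct rec_lock *l, int self)
{
	if (l->owner != self)
		return false;
	assert(l->count > 0);
	if (--l->count == 0)
		l->owner = NO_OWNER;	/* real code: atomic release + wake */
	return true;
}
```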

The problem with deadlocks arises when the scenario gets more complex:
you are trying to lock a mutex and the owner is waiting for a mutex
whose owner is waiting for a mutex that you own... This more commonly
happens when you don't know what the heck is going on in the code,
which unfortunately is very common among people who inherit big pieces
of stacks to maintain.

>> Fortunately in those areas POSIX is not too gray; code to the book.
>> Deal.
>
>I would think that there will have to be a graph discontinuity
>between user/kernel spaces at kernel entry and exit for the deadlock
>detector. Can't say about issues at fork time, but I would expect
>that those objects would have to be destroyed when the process exits.

fork time is not an issue, as POSIX makes forks and thread incompatible
(in a nutshell, only the thread calling fork() survives, all the mutexes
are [IIRC] reinitialized or something like that...).

>The current RT (Ingo's) lock isn't recursive nor is the deadlock
>detector the last time I looked. Do think that this is a problem
>for legacy apps if it gets overload for being the userspace futex
>as well ? (assuming I'm understanding all of this correctly)

It should not be, on the recursive side; as I said, that is easy to do
[currently NPTL does it with the futexes]. The deadlock stuff gets
hairier, but it's not such a big deal when you have your data
structures set up. It takes time, though.

>> Of course, selling it to the lkml is another story.
>
>I would think that pushing as much of this into userspace would
>make the kernel hooks for it more acceptable. Don't know.

Agreed. Deadlock checking though, has to be done in the kernel. For
the generic case it is the only way to do it sanely.

-- Inaky 



Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread hui
On Mon, Apr 11, 2005 at 04:28:25PM -0700, Perez-Gonzalez, Inaky wrote:
> >From: Bill Huey (hui) [mailto:[EMAIL PROTECTED]
...
> API than once upon a time was made multithreaded by just adding
> a bunch of pthread_mutex_[un]lock() at the API entry point...
> without realizing that some of the top level API calls also 
> called other top level API calls, so they'd deadlock.

Oh crap.

> Quick fix: the usual. Enable deadlock detection and if it
> returns deadlock, assume it is locked already and proceed (or
> do a recursive mutex, or a trylock).

You have to be joking me ? geez.
... 
> It is certainly something to explore, but I'd better drive your
> way than do it. It's cleaner. Hides implementation details.
>
> I agree, but it doesn't work that well when talking about legacy 
> systems...that's the problem.

Yeah, ok, I understand what's going on now. There isn't a notion
of projecting priority across into the Unix/Linux kernel traditionally
which is why it seemed so bizarre.

> Sure--and because most was for legacy reasons that adhered to 
> POSIX strictly, it was very simple: we need POSIX this, that and
> that (PI, proper adherence to scheduler policy wake up/rt-behaviour,
> deadlock detection, etc). 

Some of this stuff sounds like recursive locking. Would this be a
better expression to solve the "top level API locking" problem
you're referring to ?

> Fortunately in those areas POSIX is not too gray; code to the book.
> Deal. 

I would think that there will have to be a graph discontinuity
between user/kernel spaces at kernel entry and exit for the deadlock
detector. Can't say about issues at fork time, but I would expect
that those objects would have to be destroyed when the process exits.

The current RT (Ingo's) lock isn't recursive nor is the deadlock
detector the last time I looked. Do you think that this is a problem
for legacy apps if it gets overloaded for being the userspace futex
as well ? (assuming I'm understanding all of this correctly)

> Of course, selling it to the lkml is another story.

I would think that pushing as much of this into userspace would
make the kernel hooks for it more acceptable. Don't know.

/me thinks more

bill



RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Bill Huey (hui) [mailto:[EMAIL PROTECTED]
>On Mon, Apr 11, 2005 at 03:31:41PM -0700, Perez-Gonzalez, Inaky wrote:
>> If you are exposing the kernel locks to userspace to implement
>> mutexes (eg POSIX mutexes), deadlock checking is a feature you want
>> to have to complain with POSIX. According to some off the record
>> requirements I've been given, some applications badly need it (I have
>> a hard time believing that they are so broken, but heck...).
>
>I'd like to read about those requirements, but, IMO a lot of the value

More than a formal requirement, it is a "practical" one. Some
company (leader in their field, of course) has this huge
blob of code they want to use in Linux and it has the typical
API that once upon a time was made multithreaded by just adding
a bunch of pthread_mutex_[un]lock() at the API entry point...
without realizing that some of the top level API calls also
called other top level API calls, so they'd deadlock.

Quick fix: the usual. Enable deadlock detection and if it
returns deadlock, assume it is locked already and proceed (or
do a recursive mutex, or a trylock).

And so on, and so forth...

>of various priority protocols varies greatly on the context and size (N
>threads) of the application using it. If user/kernel space have to be
>coupled via some thread of execution, (IMO) then it's better to separate
>them with some event notification queues like signals (wake a thread
>via an queue event) than to mix locks across the user/kernel space
> ...

I wonder if we are confusing apples and oranges here--I don't think
we were considering cases about mixing kernel locks that are accessible
both from kernel and user space. 

The focus is to have a kernel lock entity and that user space can
use it for implementing the user space mutexes, but *not* mix
them (like in user and kernel space sharing this lock for doing 
whatever).

It is certainly something to explore, but I'd better drive your
way than do it. It's cleaner. Hides implementation details.

>It's important to outline the requirements of the applications and then
>see what you can do using minimal synchronization objects before
>exploding that divide.

I agree, but it doesn't work that well when talking about legacy 
systems...that's the problem.

>Also, Posix isn't always politically neutral nor complete regarding
>various things. You have to consider the context of these things.
>I'll have to think about this a bit more and review your patch more
>carefully.

Sure--and because most was for legacy reasons that adhered to 
POSIX strictly, it was very simple: we need POSIX this, that and
that (PI, proper adherence to scheduler policy wake up/rt-behaviour,
deadlock detection, etc). 

Fortunately in those areas POSIX is not too gray; code to the book.
Deal. 

Of course, selling it to the lkml is another story.

-- Inaky


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread hui
On Mon, Apr 11, 2005 at 03:31:41PM -0700, Perez-Gonzalez, Inaky wrote:
> If you are exposing the kernel locks to userspace to implement
> mutexes (eg POSIX mutexes), deadlock checking is a feature you want
> to have to complain with POSIX. According to some off the record
> requirements I've been given, some applications badly need it (I have 
> a hard time believing that they are so broken, but heck...).

I'd like to read about those requirements, but, IMO a lot of the value
of various priority protocols varies greatly on the context and size (N
threads) of the application using it. If user/kernel space have to be
coupled via some thread of execution, (IMO) then it's better to separate
them with some event notification queues like signals (wake a thread
via a queue event) than to mix locks across the user/kernel space
boundary. There's tons of abuse that can be opened up with various
priority protocols with regard to RT apps and giving it a first class
entry way without consideration is kind of scary.

It's important to outline the requirements of the applications and then
see what you can do using minimal synchronization objects before
exploding that divide.

Also, Posix isn't always politically neutral nor complete regarding
various things. You have to consider the context of these things.
I'll have to think about this a bit more and review your patch more
carefully.

I'm all ears if you think I'm wrong.

bill



RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Bill Huey (hui) [mailto:[EMAIL PROTECTED]
>
>On Mon, Apr 11, 2005 at 10:57:37AM +0200, Ingo Molnar wrote:
>>
>> * Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
>>
>> > Let me re-phrase then: it is a must have only on PI, to make sure you
>> > don't have a loop when doing it. Maybe is a consequence of the
>> > algorithm I chose. -However- it should be possible to disable it in
>> > cases where you are reasonably sure it won't happen (such as kernel
>> > code). In any case, AFAIR, I still did not implement it.
>>
>> are there cases where userspace wants to disable deadlock-detection for
>> its own locks?
>
>I'd disable it for userspace locks. There might be folks that want to
>implement userspace drivers, but I can't imagine it being 'ok' to have
>the kernel call out to userspace and have it block correctly. I would
>expect them to do something else that's less drastic.

If you are exposing the kernel locks to userspace to implement
mutexes (eg POSIX mutexes), deadlock checking is a feature you want
to have to comply with POSIX. According to some off the record
requirements I've been given, some applications badly need it (I have 
a hard time believing that they are so broken, but heck...).

-- Inaky


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread hui
On Mon, Apr 11, 2005 at 10:57:37AM +0200, Ingo Molnar wrote:
> 
> * Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
> 
> > Let me re-phrase then: it is a must have only on PI, to make sure you 
> > don't have a loop when doing it. Maybe is a consequence of the 
> > algorithm I chose. -However- it should be possible to disable it in 
> > cases where you are reasonably sure it won't happen (such as kernel 
> > code). In any case, AFAIR, I still did not implement it.
> 
> are there cases where userspace wants to disable deadlock-detection for 
> its own locks?

I'd disable it for userspace locks. There might be folks that want to
implement userspace drivers, but I can't imagine it being 'ok' to have
the kernel call out to userspace and have it block correctly. I would
expect them to do something else that's less drastic.
 
> the deadlock detector in PREEMPT_RT is pretty much specialized for 
> debugging (it does all sorts of weird locking tricks to get the first 
> deadlock out, and to really report it on the console), but it ought to 
> be possible to make it usable for userspace-controlled locks as well.

If I understand things correctly, I'd let that be an RT app issue and
the app folks should decide what is appropriate for their setup. If
they need a deadlock detector they should decide on their own protocol.
The kernel debugging issues are completely different.

That's my two cents.

bill



Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Daniel Walker
On Sun, 2005-04-10 at 03:53, Ingo Molnar wrote:

> ok, i've added this patch to the -45-00 release. It's looking good on my 
> testsystems so far, but it will need some more testing i guess.


Yes, I ran the PI test, and just let the system run .. So it could use
more extensive testing..

Daniel 



RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Ingo Molnar [mailto:[EMAIL PROTECTED]
>
>* Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
>
>> Let me re-phrase then: it is a must have only on PI, to make sure you
>> don't have a loop when doing it. Maybe is a consequence of the
>> algorithm I chose. -However- it should be possible to disable it in
>> cases where you are reasonably sure it won't happen (such as kernel
>> code). In any case, AFAIR, I still did not implement it.
>
>are there cases where userspace wants to disable deadlock-detection for
>its own locks?

I would guess--if I know I have coded my application properly
(cough) or I am using locks that by design are completely orthogonal,
I would say deadlock checking is getting in the way.

>the deadlock detector in PREEMPT_RT is pretty much specialized for
>debugging (it does all sorts of weird locking tricks to get the first
>deadlock out, and to really report it on the console), but it ought to
>be possible to make it usable for userspace-controlled locks as well.

fusyn's is as simple as it can get: when you are about to lock(), it
checks that you don't own the lock already, but it generalizes it
(it checks that the owner of the lock is not waiting for a lock 
whose owner is waiting for a lock whose owner...is waiting for a lock
that you own).
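
The chain walk described above can be sketched as follows. This is a
minimal model under stated assumptions, not the fusyn code: struct task,
struct lock and would_deadlock() are hypothetical, each task waits on at
most one lock, and the existing ownership chain is assumed to be free of
cycles (which holds if every lock() runs this check before blocking).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct lock;

struct task {
	struct lock *blocked_on;	/* lock this task waits for, or NULL */
};

struct lock {
	struct task *owner;		/* current owner, or NULL if free */
};

/* Would 'self' deadlock by blocking on 'want'? Follow owner ->
 * blocked_on -> owner ... and report a deadlock if the chain leads back
 * to 'self'. The first step covers the simple "I already own it" case. */
bool would_deadlock(const struct task *self, const struct lock *want)
{
	const struct lock *l = want;

	assert(self != NULL);
	while (l != NULL && l->owner != NULL) {
		if (l->owner == self)
			return true;
		l = l->owner->blocked_on;
	}
	return false;
}
```

Since a PI implementation has to walk the same chain to propagate
priorities, the check can ride along with that walk at little extra
cost.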

-- Inaky


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Ingo Molnar

* Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:

> Let me re-phrase then: it is a must have only on PI, to make sure you 
> don't have a loop when doing it. Maybe is a consequence of the 
> algorithm I chose. -However- it should be possible to disable it in 
> cases where you are reasonably sure it won't happen (such as kernel 
> code). In any case, AFAIR, I still did not implement it.

are there cases where userspace wants to disable deadlock-detection for 
its own locks?

the deadlock detector in PREEMPT_RT is pretty much specialized for 
debugging (it does all sorts of weird locking tricks to get the first 
deadlock out, and to really report it on the console), but it ought to 
be possible to make it usable for userspace-controlled locks as well.

Ingo


RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Ingo Molnar [mailto:[EMAIL PROTECTED]
>
>* Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
>
>> >OTOH, deadlock detection is another issue. It's quite expensive and i'm
>> >not sure we want to make it a runtime thing. But for fusyn's deadlock
>> >detection and safe teardown on owner-exit is a must-have i suspect?
>>
>> Not really. Deadlock check is needed on PI, so it can be done at the
>> same time (you have to walk the chain anyway). In any other case, it
>> is an option you can request (or not).
>
>well, i was talking about the mutex code in PREEMPT_RT. There deadlock
>detection is an optional debug feature. You dont _have_ to do deadlock
>detection for the kernel's locks, and there's a difference in
>performance.

Big mouth'o mine :-| 

Let me re-phrase then: it is a must have only on PI, to make sure
you don't have a loop when doing it. Maybe it is a consequence of the
algorithm I chose. -However- it should be possible to disable it
in cases where you are reasonably sure it won't happen (such as
kernel code). In any case, AFAIR, I still have not implemented it.

Was this more useful?

-- Inaky 


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Ingo Molnar

* Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:

> >OTOH, deadlock detection is another issue. It's quite expensive and i'm
> >not sure we want to make it a runtime thing. But for fusyn's deadlock
> >detection and safe teardown on owner-exit is a must-have i suspect?
> 
> Not really. Deadlock check is needed on PI, so it can be done at the 
> same time (you have to walk the chain anyway). In any other case, it 
> is an option you can request (or not).

well, i was talking about the mutex code in PREEMPT_RT. There deadlock 
detection is an optional debug feature. You dont _have_ to do deadlock 
detection for the kernel's locks, and there's a difference in 
performance.

Ingo


RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Ingo Molnar [mailto:[EMAIL PROTECTED]
>
>
>i'd not mind merging the extra bits to PREEMPT_RT to enable fusyn's, if
>they come in small, clean steps. E.g. Daniel's plist.h stuff was nice
>and clean.

I am finishing breaking it up in small bits so you can take a look
at it. Should be finished tomorrow noon (PST).

>is priority ceiling coming in via some POSIX feature that fusyn's need
>to address? What would be needed precisely - a way to set a priority
for
>a lock (independently of the owner's task priority), and let that
>control the PI mechanism?

Yep. It is kind of easy to do (at least in fusyns)--it is just a
matter of setting the priority of the lock, that sets the priority
of its list node.

Because the promotion code only cares about the priority of the
list node, it blends automatically in the whole scheme. The PI
code will modify the list node's priority while promoting all the
tasks affected in the ownership chain, only when the fulocks/mutexes
are PI. The PP code will modify the priority of the fulock/mutex's
list node with a special call.

[you can check for my 2004 OLS paper for a deeper explanation, or
I can extend this one, if you want]. 
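
A minimal sketch of how PI and PP can share one promotion path when the
promotion code only looks at list node priorities. Everything here is an
assumption made for illustration (struct lock_node, struct task,
recompute_prio(), pi_boost() and pp_set_ceiling() are hypothetical names,
not the fusyn API); lower values mean higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* One node per lock a task owns. */
struct lock_node {
	int prio;			/* priority carried by this lock */
	struct lock_node *next;		/* next lock owned by the same task */
};

struct task {
	int normal_prio;		/* unboosted priority */
	int prio;			/* effective priority */
	struct lock_node *owned;	/* locks currently held */
};

/* Promotion code: cares only about the node priorities. */
void recompute_prio(struct task *t)
{
	int prio = t->normal_prio;

	for (struct lock_node *n = t->owned; n != NULL; n = n->next)
		if (n->prio < prio)
			prio = n->prio;
	t->prio = prio;
}

/* PI path: a waiter with priority waiter_prio blocks on the lock. */
void pi_boost(struct task *owner, struct lock_node *n, int waiter_prio)
{
	assert(owner != NULL && n != NULL);
	if (waiter_prio < n->prio)
		n->prio = waiter_prio;
	recompute_prio(owner);
}

/* PP path: the "special call" that sets the lock's own priority. */
void pp_set_ceiling(struct task *owner, struct lock_node *n, int ceiling)
{
	assert(owner != NULL && n != NULL);
	n->prio = ceiling;
	recompute_prio(owner);
}
```

Both paths end in the same recompute_prio() call, which is why a
priority-protect lock would not need changes to the promotion logic in
this model.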

>i.e. this doesnt seem to really affect the core design of RT mutexes.

Nope it doesn't. As I said, it is done in such a way that no 
modifications are needed.

>OTOH, deadlock detection is another issue. It's quite expensive and i'm
>not sure we want to make it a runtime thing. But for fusyn's deadlock
>detection and safe teardown on owner-exit is a must-have i suspect?

Not really. Deadlock check is needed on PI, so it can be done at the
same time (you have to walk the chain anyway). In any other case, it
is an option you can request (or not).

-- Inaky


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Ingo Molnar

* Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:

> > OTOH, deadlock detection is another issue. It's quite expensive and i'm
> > not sure we want to make it a runtime thing. But for fusyn's deadlock
> > detection and safe teardown on owner-exit is a must-have i suspect?
>
> Not really. Deadlock check is needed on PI, so it can be done at the
> same time (you have to walk the chain anyway). In any other case, it
> is an option you can request (or not).

well, i was talking about the mutex code in PREEMPT_RT. There deadlock 
detection is an optional debug feature. You dont _have_ to do deadlock 
detection for the kernel's locks, and there's a difference in 
performance.

Ingo


RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Ingo Molnar [mailto:[EMAIL PROTECTED]
>
>* Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
>
>> > OTOH, deadlock detection is another issue. It's quite expensive and i'm
>> > not sure we want to make it a runtime thing. But for fusyn's deadlock
>> > detection and safe teardown on owner-exit is a must-have i suspect?
>>
>> Not really. Deadlock check is needed on PI, so it can be done at the
>> same time (you have to walk the chain anyway). In any other case, it
>> is an option you can request (or not).
>
>well, i was talking about the mutex code in PREEMPT_RT. There deadlock
>detection is an optional debug feature. You dont _have_ to do deadlock
>detection for the kernel's locks, and there's a difference in
>performance.

Big mouth'o mine :-| 

Let me re-phrase then: it is a must have only on PI, to make sure 
you don't have a loop when doing it. Maybe is a consequence of the
algorithm I chose. -However- it should be possible to disable it
in cases where you are reasonably sure it won't happen (such as
kernel code). In any case, AFAIR, I still did not implement it.
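
A minimal sketch of why the loop check rides along with PI: boosting
walks the chain of lock owners, and a cycle in that chain would make the
walk endless. The types and names here (pi_task, pi_lock, pi_block_on)
are hypothetical, not fusyn or PREEMPT_RT code; a lower prio value means
a higher priority, as in the kernel. The sketch uses two walks for
clarity, although the check could be folded into a single walk.

```c
#include <assert.h>
#include <stddef.h>

struct pi_lock;

/* Hypothetical task; lower prio value = higher priority. */
struct pi_task {
    int prio;
    struct pi_lock *blocked_on;   /* lock this task waits for, or NULL */
};

/* Hypothetical lock with a single owner (NULL when free). */
struct pi_lock {
    struct pi_task *owner;
};

/*
 * Block 'waiter' on 'want' and propagate its priority down the chain
 * want -> owner -> lock the owner is blocked on -> its owner -> ...
 * Returns 0 on success, -1 if the chain leads back to 'waiter'
 * (a deadlock); in that case nothing is modified.
 */
int pi_block_on(struct pi_task *waiter, struct pi_lock *want)
{
    struct pi_lock *l;

    assert(waiter != NULL && want != NULL);

    /* Walk 1: refuse to create a loop in the ownership chain. */
    for (l = want; l != NULL && l->owner != NULL; l = l->owner->blocked_on)
        if (l->owner == waiter)
            return -1;

    /* Walk 2: boost every owner that runs below the waiter's priority. */
    for (l = want; l != NULL && l->owner != NULL; l = l->owner->blocked_on)
        if (l->owner->prio > waiter->prio)
            l->owner->prio = waiter->prio;

    waiter->blocked_on = want;
    return 0;
}
```

Because the first walk already guarantees the chain is loop-free, the
boosting walk is certain to terminate.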

Was this more useful?

-- Inaky 


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Ingo Molnar

* Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:

> Let me re-phrase then: it is a must have only on PI, to make sure you
> don't have a loop when doing it. Maybe is a consequence of the
> algorithm I chose. -However- it should be possible to disable it in
> cases where you are reasonably sure it won't happen (such as kernel
> code). In any case, AFAIR, I still did not implement it.

are there cases where userspace wants to disable deadlock-detection for 
its own locks?

the deadlock detector in PREEMPT_RT is pretty much specialized for 
debugging (it does all sorts of weird locking tricks to get the first 
deadlock out, and to really report it on the console), but it ought to 
be possible to make it usable for userspace-controlled locks as well.

Ingo


RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Ingo Molnar [mailto:[EMAIL PROTECTED]
>
>* Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
>
>> Let me re-phrase then: it is a must have only on PI, to make sure you
>> don't have a loop when doing it. Maybe is a consequence of the
>> algorithm I chose. -However- it should be possible to disable it in
>> cases where you are reasonably sure it won't happen (such as kernel
>> code). In any case, AFAIR, I still did not implement it.
>
>are there cases where userspace wants to disable deadlock-detection for
>its own locks?

I would guess--if I know I have coded my application properly
(cough) or I am using locks that by design are completely orthogonal,
I would say deadlock checking is getting in the way.

>the deadlock detector in PREEMPT_RT is pretty much specialized for
>debugging (it does all sorts of weird locking tricks to get the first
>deadlock out, and to really report it on the console), but it ought to
>be possible to make it usable for userspace-controlled locks as well.

fusyn's is as simple as it can get: when you are about to lock(), it
checks that you don't own the lock already, but it generalizes it
(it checks that the owner of the lock is not waiting for a lock 
whose owner is waiting for a lock whose owner...is waiting for a lock
that you own).
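
The generalized check described above can be sketched as a walk along
the ownership chain. This is only an illustration under stated
assumptions: dl_task, dl_lock and dl_would_deadlock are hypothetical
names, each task waits on at most one lock, and each lock has at most
one owner. It is not the fusyn source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dl_lock;

/* Hypothetical task: at most one lock it is currently blocked on. */
struct dl_task {
    struct dl_lock *waiting_on;   /* NULL when the task is runnable */
};

/* Hypothetical lock: at most one owner. */
struct dl_lock {
    struct dl_task *owner;        /* NULL when the lock is free */
};

/*
 * Return true if 'self' blocking on 'want' would close a cycle:
 * follow want -> owner -> lock that owner waits on -> its owner ...
 * and report a deadlock as soon as the walk reaches 'self'.
 * Assumes the existing chain is itself cycle-free, which holds if
 * every blocking lock() ran this check first.
 */
bool dl_would_deadlock(const struct dl_task *self, const struct dl_lock *want)
{
    const struct dl_lock *l = want;

    assert(self != NULL);
    while (l != NULL && l->owner != NULL) {
        if (l->owner == self)
            return true;          /* also covers "I already own this lock" */
        l = l->owner->waiting_on; /* next link of the ownership chain */
    }
    return false;                 /* ended at a free lock or a runnable owner */
}
```

The first iteration is the simple "do I own it already" test; every
further iteration is one more "whose owner is waiting for a lock" step.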

-- Inaky


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Daniel Walker
On Sun, 2005-04-10 at 03:53, Ingo Molnar wrote:

> ok, i've added this patch to the -45-00 release. It's looking good on my
> testsystems so far, but it will need some more testing i guess.


Yes, I ran the PI test, and just let the system run .. So it could use
more extensive testing..

Daniel 



Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread hui
On Mon, Apr 11, 2005 at 10:57:37AM +0200, Ingo Molnar wrote:
 
> * Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
>
> > Let me re-phrase then: it is a must have only on PI, to make sure you
> > don't have a loop when doing it. Maybe is a consequence of the
> > algorithm I chose. -However- it should be possible to disable it in
> > cases where you are reasonably sure it won't happen (such as kernel
> > code). In any case, AFAIR, I still did not implement it.
>
> are there cases where userspace wants to disable deadlock-detection for
> its own locks?

I'd disable it for userspace locks. There might be folks that want to
implement userspace drivers, but I can't imagine it being 'ok' to have
the kernel call out to userspace and have it block correctly. I would
expect them to do something else that's less drastic.
 
> the deadlock detector in PREEMPT_RT is pretty much specialized for
> debugging (it does all sorts of weird locking tricks to get the first
> deadlock out, and to really report it on the console), but it ought to
> be possible to make it usable for userspace-controlled locks as well.

If I understand things correctly, I'd let that be an RT app issue and
the app folks should decide what is appropriate for their setup. If
they need a deadlock detector they should decide on their own protocol.
The kernel debugging issues are completely different.

That's my two cents.

bill



RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Bill Huey (hui) [mailto:[EMAIL PROTECTED]
>
>On Mon, Apr 11, 2005 at 10:57:37AM +0200, Ingo Molnar wrote:
>>
>> * Perez-Gonzalez, Inaky <[EMAIL PROTECTED]> wrote:
>>
>> > Let me re-phrase then: it is a must have only on PI, to make sure you
>> > don't have a loop when doing it. Maybe is a consequence of the
>> > algorithm I chose. -However- it should be possible to disable it in
>> > cases where you are reasonably sure it won't happen (such as kernel
>> > code). In any case, AFAIR, I still did not implement it.
>>
>> are there cases where userspace wants to disable deadlock-detection for
>> its own locks?
>
>I'd disable it for userspace locks. There might be folks that want to
>implement userspace drivers, but I can't imagine it being 'ok' to have
>the kernel call out to userspace and have it block correctly. I would
>expect them to do something else that's less drastic.

If you are exposing the kernel locks to userspace to implement
mutexes (e.g. POSIX mutexes), deadlock checking is a feature you want
to have to comply with POSIX. According to some off the record
requirements I've been given, some applications badly need it (I have
a hard time believing that they are so broken, but heck...).

-- Inaky


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread hui
On Mon, Apr 11, 2005 at 03:31:41PM -0700, Perez-Gonzalez, Inaky wrote:
> If you are exposing the kernel locks to userspace to implement
> mutexes (eg POSIX mutexes), deadlock checking is a feature you want
> to have to comply with POSIX. According to some off the record
> requirements I've been given, some applications badly need it (I have
> a hard time believing that they are so broken, but heck...).

I'd like to read about those requirements, but, IMO a lot of the value
of various priority protocols varies greatly on the context and size (N
threads) of the application using it. If user/kernel space have to be
coupled via some thread of execution, (IMO) then it's better to separate
them with some event notification queues like signals (wake a thread
via a queue event) than to mix locks across the user/kernel space
boundary. There's tons of abuse that can be opened up with various
priority protocols with regard to RT apps and giving it a first class
entry way without consideration is kind of scary.

It's important to outline the requirements of the applications and then
see what you can do using minimal synchronization objects before
exploding that divide.

Also, Posix isn't always politically neutral nor complete regarding
various things. You have to consider the context of these things.
I'll have to think about this a bit more and review your patch more
carefully.

I'm all ears if you think I'm wrong.

bill



RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Bill Huey (hui) [mailto:[EMAIL PROTECTED]
>On Mon, Apr 11, 2005 at 03:31:41PM -0700, Perez-Gonzalez, Inaky wrote:
>> If you are exposing the kernel locks to userspace to implement
>> mutexes (eg POSIX mutexes), deadlock checking is a feature you want
>> to have to comply with POSIX. According to some off the record
>> requirements I've been given, some applications badly need it (I have
>> a hard time believing that they are so broken, but heck...).
>
>I'd like to read about those requirements, but, IMO a lot of the value

More than a formal requirement is a practical one. Some
company (leader in their field, of course) has this huge
blob of code they want to use in Linux and it has the typical
API that once upon a time was made multithreaded by just adding
a bunch of pthread_mutex_[un]lock() at the API entry point...
without realizing that some of the top level API calls also 
called other top level API calls, so they'd deadlock.

Quick fix: the usual. Enable deadlock detection and if it
returns deadlock, assume it is locked already and proceed (or
do a recursive mutex, or a trylock).
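
A hedged sketch of that quick fix using the POSIX error-checking mutex
type, where pthread_mutex_lock() returns EDEADLK when the calling thread
already owns the mutex. The wrapper names (api_mutex_init, api_enter,
api_leave) are hypothetical, and this only covers the self-relock case,
not longer ownership chains.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Initialise 'm' as an error-checking mutex, so a relock by the owner
 * returns EDEADLK instead of hanging. Returns 0 on success. */
int api_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock on API entry. *took is true only if this call acquired the mutex;
 * EDEADLK means an outer API call in this thread already holds it, so
 * the caller proceeds without owning a second reference. */
int api_enter(pthread_mutex_t *m, bool *took)
{
    int rc;

    assert(m != NULL && took != NULL);
    rc = pthread_mutex_lock(m);
    *took = (rc == 0);
    return (rc == EDEADLK) ? 0 : rc;
}

/* Unlock on API exit, but only if the matching api_enter() took the lock. */
void api_leave(pthread_mutex_t *m, bool took)
{
    if (took)
        pthread_mutex_unlock(m);
}
```

Each top level call pairs api_enter() with api_leave() and passes the
flag through, so a nested top level call leaves the outer call's lock
in place.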

And so on, and so forth...

>of various priority protocols varies greatly on the context and size (N
>threads) of the application using it. If user/kernel space have to be
>coupled via some thread of execution, (IMO) then it's better to separate
>them with some event notification queues like signals (wake a thread
>via a queue event) than to mix locks across the user/kernel space
>...

I wonder if we are confusing apples and oranges here--I don't think
we were considering cases about mixing kernel locks that are accessible
both from kernel and user space. 

The focus is to have a kernel lock entity and that user space can
use it for implementing the user space mutexes, but *not* mix
them (like in user and kernel space sharing this lock for doing 
whatever).

It is certainly something to explore, but I'd better drive your
way than do it. It's cleaner. Hides implementation details.

>It's important to outline the requirements of the applications and then
>see what you can do using minimal synchronization objects before
>exploding that divide.

I agree, but it doesn't work that well when talking about legacy 
systems...that's the problem.

>Also, Posix isn't always politically neutral nor complete regarding
>various things. You have to consider the context of these things.
>I'll have to think about this a bit more and review your patch more
>carefully.

Sure--and because most was for legacy reasons that adhered to 
POSIX strictly, it was very simple: we need POSIX this, that and
that (PI, proper adherence to scheduler policy wake up/rt-behaviour,
deadlock detection, etc). 

Fortunately in those areas POSIX is not too gray; code to the book.
Deal. 

Of course, selling it to the lkml is another story.

-- Inaky


Re: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread hui
On Mon, Apr 11, 2005 at 04:28:25PM -0700, Perez-Gonzalez, Inaky wrote:
> >From: Bill Huey (hui) [mailto:[EMAIL PROTECTED]
...
> API that once upon a time was made multithreaded by just adding
> a bunch of pthread_mutex_[un]lock() at the API entry point...
> without realizing that some of the top level API calls also
> called other top level API calls, so they'd deadlock.

Oh crap.

> Quick fix: the usual. Enable deadlock detection and if it
> returns deadlock, assume it is locked already and proceed (or
> do a recursive mutex, or a trylock).

You have to be joking me ? geez.
...
> It is certainly something to explore, but I'd better drive your
> way than do it. It's cleaner. Hides implementation details.

> I agree, but it doesn't work that well when talking about legacy
> systems...that's the problem.

Yeah, ok, I understand what's going on now. There isn't a notion
of projecting priority across into the Unix/Linux kernel traditionally
which is why it seemed so bizarre.

> Sure--and because most was for legacy reasons that adhered to
> POSIX strictly, it was very simple: we need POSIX this, that and
> that (PI, proper adherence to scheduler policy wake up/rt-behaviour,
> deadlock detection, etc).

Some of this stuff sounds like recursive locking. Would this be a
better expression to solve the top level API locking problem
you're referring to ?

> Fortunately in those areas POSIX is not too gray; code to the book.
> Deal.

I would think that there will have to be a graph discontinuity
between user/kernel spaces at kernel entry and exit for the deadlock
detector. Can't say about issues at fork time, but I would expect
that those objects would have to be destroyed when the process exits.

The current RT (Ingo's) lock isn't recursive nor is the deadlock
detector the last time I looked. Do you think that this is a problem
for legacy apps if it gets overloaded to be the userspace futex
as well ? (assuming I'm understanding all of this correctly)

> Of course, selling it to the lkml is another story.

I would think that pushing as much of this into userspace would
make the kernel hooks for it more acceptable. Don't know.

/me thinks more

bill



RE: [PATCH] Priority Lists for the RT mutex

2005-04-11 Thread Perez-Gonzalez, Inaky
>From: Bill Huey (hui) [mailto:[EMAIL PROTECTED]
>
>> Quick fix: the usual. Enable deadlock detection and if it
>> returns deadlock, assume it is locked already and proceed (or
>> do a recursive mutex, or a trylock).
>
>You have to be joking me ? geez.
>...

This is way *more* common than you think--I've seen it around
some big multithreaded OSS packages [can't remember which now]. 

>> Sure--and because most was for legacy reasons that adhered to
>> POSIX strictly, it was very simple: we need POSIX this, that and
>> that (PI, proper adherence to scheduler policy wake up/rt-behaviour,
>> deadlock detection, etc).
>
>Some of this stuff sounds like recursive locking. Would this be a
>better expression to solve the top level API locking problem
>you're referring to ?

Bingo. That's another way to fix it. Luckily, recursive locking
can be safely and quickly done in user space (I own this lock,
ergo I just inc the lock count).
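
As a rough illustration of the "I own it, so just bump the count" idea,
here is a user-space recursive lock layered on a plain pthread mutex.
All names (rec_mutex, rec_lock, rec_unlock) are hypothetical and this is
not how NPTL implements it; the owner id is assumed to be the address of
a thread-local variable, stored atomically so non-owners can read it
safely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space recursive mutex built on a plain pthread mutex. */
struct rec_mutex {
    pthread_mutex_t inner;     /* the real (non-recursive) lock */
    atomic_uintptr_t owner;    /* id of the owning thread, 0 when free */
    unsigned depth;            /* nesting count, touched only by the owner */
};

/* The address of a thread-local variable serves as a unique thread id. */
static _Thread_local char rec_self_marker;

static uintptr_t rec_self(void)
{
    return (uintptr_t)&rec_self_marker;
}

int rec_init(struct rec_mutex *m)
{
    atomic_init(&m->owner, 0);
    m->depth = 0;
    return pthread_mutex_init(&m->inner, NULL);
}

void rec_lock(struct rec_mutex *m)
{
    if (atomic_load(&m->owner) == rec_self()) {
        m->depth++;            /* already mine: only count the nesting */
        return;
    }
    pthread_mutex_lock(&m->inner);
    atomic_store(&m->owner, rec_self());
    m->depth = 1;
}

void rec_unlock(struct rec_mutex *m)
{
    assert(atomic_load(&m->owner) == rec_self() && m->depth > 0);
    if (--m->depth == 0) {
        atomic_store(&m->owner, 0);   /* clear ownership before releasing */
        pthread_mutex_unlock(&m->inner);
    }
}
```

Only the owner ever sees its own id in the owner field, so the fast path
needs no kernel involvement at all.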

The problem with deadlocks is when the scenario gets more complex
and you are trying to lock a mutex and the owner is waiting for
a mutex whose owner is waiting for a mutex that you own...this
more commonly happens when you don't know what the heck is going
on in the code, which unfortunately is very common among people who
inherit big pieces of stacks to maintain.

>> Fortunately in those areas POSIX is not too gray; code to the book.
>> Deal.
>
>I would think that there will have to be a graph discontinuity
>between user/kernel spaces at kernel entry and exit for the deadlock
>detector. Can't say about issues at fork time, but I would expect
>that those objects would have to be destroyed when the process exits.

fork time is not an issue, as POSIX makes fork and threads incompatible
(in a nutshell, only the thread calling fork() survives, all the mutexes
are [IIRC] reinitialized or something like that...).

>The current RT (Ingo's) lock isn't recursive nor is the deadlock
>detector the last time I looked. Do you think that this is a problem
>for legacy apps if it gets overloaded to be the userspace futex
>as well ? (assuming I'm understanding all of this correctly)

It should not be on the recursive side; as I said, that is easy to do
[currently NPTL does it with the futexes]. The deadlock stuff gets
hairier, but it's not such a big deal when you have your data
structures set up. It takes time, though.

>> Of course, selling it to the lkml is another story.
>
>I would think that pushing as much of this into userspace would
>make the kernel hooks for it more acceptable. Don't know.

Agreed. Deadlock checking though, has to be done in the kernel. For
the generic case it is the only way to do it sanely.

-- Inaky 



Re: [PATCH] Priority Lists for the RT mutex

2005-04-10 Thread Ingo Molnar

* Sven-Thorsten Dietrich <[EMAIL PROTECTED]> wrote:

> On Fri, 2005-04-08 at 08:28 +0200, Ingo Molnar wrote:
> > * Daniel Walker <[EMAIL PROTECTED]> wrote:
> > 
> > >   This patch adds the priority list data structure from Inaky 
> > > Perez-Gonzalez to the Preempt Real-Time mutex.
> > 
> > this one looks really clean.
> > 
> > it makes me wonder, what is the current status of fusyn's? Such a light 
> > datastructure would be much more mergeable upstream than the former 
> > 100-queues approach.
> 
> Ingo,
> 
> Joe Korty has been doing a lot of work on Fusyn recently.
> 
> Fusyn is now a generic implementation, similar to the RT mutex. SMP 
> scalability is somewhat better for decoupled locks on PI (last I 
> checked). It has the extra bulk required to support user space.
> 
> The major issue that concerns Fusyn and the real-time patch is 
> that merging the kernel portion of Fusyn creates a collision of 
> redundant PI implementations with respect to the RT mutex.

i'd not mind merging the extra bits to PREEMPT_RT to enable fusyn's, if 
they come in small, clean steps. E.g. Daniel's plist.h stuff was nice 
and clean.

> There are a few mistakes on the page, (RT mutex does not do priority 
> ceiling), but for the most part the info is current.

is priority ceiling coming in via some POSIX feature that fusyn's need
to address? What would be needed precisely - a way to set a priority for
a lock (independently of the owner's task priority), and let that
control the PI mechanism?

Unless i'm missing something, this could be implemented by detaching
lock->owner_prio from lock->owner - via e.g. negative values. Thus some
minimal code would check whether we need the owner's priority in the PI
logic, or the semaphore's "own" priority level.
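
One possible reading of that idea, as a sketch only: a non-negative
owner_prio is the usual cached priority of the owner, while a negative
value marks the field as detached and encodes the lock's own level. The
struct layout, the helper names and the -prio-1 encoding are all
assumptions made for illustration; the real rt.c fields may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task; lower prio value = higher priority. */
struct pc_task {
    int prio;
};

/* Hypothetical lock. owner_prio >= 0: cached priority of the owner (PI).
 * owner_prio < 0: detached from the owner, encodes the lock's own level. */
struct pc_mutex {
    struct pc_task *owner;
    int owner_prio;
};

/* Encode a fixed lock priority (>= 0) as a negative owner_prio value. */
int pc_encode_ceiling(int ceiling)
{
    assert(ceiling >= 0);
    return -ceiling - 1;
}

/* Priority the PI logic should use for this lock. */
int pc_lock_prio(const struct pc_mutex *lock)
{
    assert(lock != NULL);
    if (lock->owner_prio < 0)
        return -lock->owner_prio - 1;   /* the semaphore's own level */
    return lock->owner_prio;            /* the owner's priority */
}
```

The single sign test is the "minimal code" in question: everything else
in the PI path would keep calling one helper and not care which kind of
lock it is looking at.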

i.e. this doesnt seem to really affect the core design of RT mutexes.

OTOH, deadlock detection is another issue. It's quite expensive and i'm 
not sure we want to make it a runtime thing. But for fusyn's deadlock 
detection and safe teardown on owner-exit is a must-have i suspect?

> If the RT mutex could be exposed in non PREEMPT_RT configurations, the 
> fulock portion could be superseded by the RT mutex, and the remaining 
> pieces merged in. Or vice versa.

sure, RT mutexes could be exposed in !PREEMPT_RT.

Ingo


Re: [PATCH] Priority Lists for the RT mutex

2005-04-10 Thread Ingo Molnar

* Daniel Walker <[EMAIL PROTECTED]> wrote:

> Description:
>   This patch adds the priority list data structure from Inaky 
> Perez-Gonzalez to the Preempt Real-Time mutex.

ok, i've added this patch to the -45-00 release. It's looking good on my 
testsystems so far, but it will need some more testing i guess.

Ingo


RE: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Perez-Gonzalez, Inaky
>From: Daniel Walker [mailto:[EMAIL PROTECTED]
>
>> Current tip of development has some issues with conditional variables
>> and broadcasts (requeue stuff) that I need to sink my teeth in. Joe
>> Korty is fixing up a lot of corner cases I wasn't catching, but
>> other than that is doing fine.
>
>You try to get out, and they suck you right back in.

Don't mention it :] That's why I want to get some more people
hooked up to this...so I can move on to do other things :)

>> How long ago since you saw it? I also implemented the futex redirection
>> stuff we discussed some months ago.
>
>It's been a while since I've seen the fusyn scheduler changes. I have
>the current fusyn CVS, I'll take a look at it.

All that stuff is in futex.c; bear in mind what I said at
the confcall, it is just a hacky proof-of-concept--it doesn't
even implement the async interface.

It kind of works, but is not all that solid [last time I tried
the JVMs locked up].

-- Inaky


RE: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Daniel Walker
On Fri, 2005-04-08 at 14:25, Perez-Gonzalez, Inaky wrote:
> I concur with Daniel. If we can decide how to deal with that (toss
> one out, keep one, merge them, whatever), we could reuse all the user
> space glue that is the hardest part to get right.

I have a preference for the Real-Time PI, but that's just because I've
worked with it more. Ingo's really the one that should be making those
choices though, since he has the biggest influence over what goes into
sched.c ..

> Current tip of development has some issues with conditional variables
> and broadcasts (requeue stuff) that I need to sink my teeth in. Joe
> Korty is fixing up a lot of corner cases I wasn't catching, but 
> other than that is doing fine.

You try to get out, and they suck you right back in.

> How long ago since you saw it? I also implemented the futex redirection
> stuff we discussed some months ago.

It's been a while since I've seen the fusyn scheduler changes. I have
the current fusyn CVS, I'll take a look at it.

Daniel




RE: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Perez-Gonzalez, Inaky
>From: Daniel Walker [mailto:[EMAIL PROTECTED]
>
>On Thu, 2005-04-07 at 23:28, Ingo Molnar wrote:
>
>> this one looks really clean.
>>
>> it makes me wonder, what is the current status of fusyn's? Such a light
>> datastructure would be much more mergeable upstream than the former
>> 100-queues approach.
>
>   Inaky was telling me that 100 queues approach is two years old.
>
>The biggest problem is that fusyn has its own PI system .. So it's not
>clear if that will work with the RT mutex. I was thinking the PI stuff
>could be made generic so fusyn, maybe futex, can use it also.

I concur with Daniel. If we can decide how to deal with that (toss
one out, keep one, merge them, whatever), we could reuse all the user
space glue that is the hardest part to get right.

Current tip of development has some issues with conditional variables
and broadcasts (requeue stuff) that I need to sink my teeth in. Joe
Korty is fixing up a lot of corner cases I wasn't catching, but 
other than that is doing fine.

How long ago since you saw it? I also implemented the futex redirection
stuff we discussed some months ago.

-- Inaky


Re: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Daniel Walker

On Thu, 2005-04-07 at 23:28, Ingo Molnar wrote:

> this one looks really clean.
> 
> it makes me wonder, what is the current status of fusyn's? Such a light 
> datastructure would be much more mergeable upstream than the former 
> 100-queues approach.


Inaky was telling me that the 100 queues approach is two years old.

The biggest problem is that fusyn has its own PI system .. So it's not
clear if that will work with the RT mutex. I was thinking the PI stuff
could be made generic so fusyn, maybe futex, can use it also.

Daniel



Re: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Sven-Thorsten Dietrich
On Fri, 2005-04-08 at 08:28 +0200, Ingo Molnar wrote:
> * Daniel Walker <[EMAIL PROTECTED]> wrote:
> 
> > This patch adds the priority list data structure from Inaky 
> > Perez-Gonzalez to the Preempt Real-Time mutex.
> 
> this one looks really clean.
> 
> it makes me wonder, what is the current status of fusyn's? Such a light 
> datastructure would be much more mergeable upstream than the former 
> 100-queues approach.

Ingo,

Joe Korty has been doing a lot of work on Fusyn recently.

Fusyn is now a generic implementation, similar to the RT mutex. SMP
scalability is somewhat better for decoupled locks on PI (last I
checked). It has the extra bulk required to support user space.

The major issue that concerns Fusyn and the real-time patch is that
merging the kernel portion of Fusyn creates a collision of redundant PI
implementations with respect to the RT mutex.

The issues are outlined here:

http://developer.osdl.org/dev/mutexes/

There are a few mistakes on the page, (RT mutex does not do priority
ceiling), but for the most part the info is current.

If the RT mutex could be exposed in non PREEMPT_RT configurations,
the fulock portion could be superseded by the RT mutex, and the
remaining pieces merged in. Or vice versa.

We discussed the scenarios recently. Any guidance you can offer to help
us out would be greatly appreciated.

http://lists.osdl.org/pipermail/robustmutexes/2005-April/000532.html


Sven




Re: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Ingo Molnar

* Daniel Walker <[EMAIL PROTECTED]> wrote:

>   This patch adds the priority list data structure from Inaky 
> Perez-Gonzalez to the Preempt Real-Time mutex.

this one looks really clean.

it makes me wonder, what is the current status of fusyn's? Such a light 
datastructure would be much more mergeable upstream than the former 
100-queues approach.

Ingo
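
[Note on the data-structure point above. The thread does not show the
patch itself, so what follows is only a minimal sketch of the general
idea of a single priority-sorted waiter list, as opposed to what was
presumably one queue per priority level in the older approach. All
names here (prio_node, prio_list, prio_list_add, prio_list_pop) are
hypothetical and are not taken from Inaky's plist code or from rt.c.
It assumes the kernel convention that a lower prio value means a
higher priority.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the plist from the patch. */
struct prio_node {
    int prio;                 /* assumed: lower value = higher priority */
    struct prio_node *next;
};

struct prio_list {
    struct prio_node *head;   /* highest-priority waiter, NULL if empty */
};

void prio_list_init(struct prio_list *pl)
{
    pl->head = NULL;
}

/* Insert in sorted order; waiters of equal priority stay FIFO. */
void prio_list_add(struct prio_list *pl, struct prio_node *node)
{
    struct prio_node **link = &pl->head;

    assert(node != NULL);
    /* Walk past every node that should be served before the new one. */
    while (*link != NULL && (*link)->prio <= node->prio)
        link = &(*link)->next;
    node->next = *link;
    *link = node;
}

/* Remove and return the highest-priority waiter, or NULL if empty. */
struct prio_node *prio_list_pop(struct prio_list *pl)
{
    struct prio_node *first = pl->head;

    if (first != NULL) {
        pl->head = first->next;
        first->next = NULL;
    }
    return first;
}
```

[The per-lock cost in this sketch is one pointer regardless of how many
priority levels exist, which is the sense in which such a structure is
"light". Insertion is linear in the number of waiters here; a real
implementation may well make a different trade-off.]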


RE: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Perez-Gonzalez, Inaky
> From: Daniel Walker [mailto:[EMAIL PROTECTED]
>
> On Thu, 2005-04-07 at 23:28, Ingo Molnar wrote:
>
> > this one looks really clean.
> >
> > it makes me wonder, what is the current status of fusyn's? Such a light
> > datastructure would be much more mergeable upstream than the former
> > 100-queues approach.
>
> Inaky was telling me that the 100-queues approach is two years old.
>
> The biggest problem is that fusyn has its own PI system, so it's not
> clear if that will work with the RT mutex. I was thinking the PI stuff
> could be made generic so that fusyn, and maybe futex, can use it too.

I concur with Daniel. If we can decide how to deal with that (toss
one out, keep one, merge them, whatever), we could reuse all the user
space glue that is the hardest part to get right.

Current tip of development has some issues with condition variables
and broadcasts (requeue stuff) that I need to sink my teeth into. Joe
Korty is fixing up a lot of corner cases I wasn't catching, but
other than that it is doing fine.

How long has it been since you saw it? I also implemented the futex
redirection stuff we discussed some months ago.

-- Inaky


RE: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Daniel Walker
On Fri, 2005-04-08 at 14:25, Perez-Gonzalez, Inaky wrote:
> I concur with Daniel. If we can decide how to deal with that (toss
> one out, keep one, merge them, whatever), we could reuse all the user
> space glue that is the hardest part to get right.

I have a preference for the Real-Time PI, but that's just because I've
worked with it more. Ingo's really the one who should be making those
choices though, since he has the biggest influence over what goes into
sched.c.

> Current tip of development has some issues with condition variables
> and broadcasts (requeue stuff) that I need to sink my teeth into. Joe
> Korty is fixing up a lot of corner cases I wasn't catching, but
> other than that it is doing fine.

You try to get out, and they suck you right back in.

> How long has it been since you saw it? I also implemented the futex
> redirection stuff we discussed some months ago.

It's been a while since I've seen the fusyn scheduler changes. I have
the current fusyn CVS, I'll take a look at it.

Daniel




RE: [PATCH] Priority Lists for the RT mutex

2005-04-08 Thread Perez-Gonzalez, Inaky
> From: Daniel Walker [mailto:[EMAIL PROTECTED]
>
> > Current tip of development has some issues with condition variables
> > and broadcasts (requeue stuff) that I need to sink my teeth into. Joe
> > Korty is fixing up a lot of corner cases I wasn't catching, but
> > other than that it is doing fine.
>
> You try to get out, and they suck you right back in.

Don't mention it :] That's why I want to get some more people
hooked up to this...so I can move on to do other things :)

> > How long has it been since you saw it? I also implemented the futex
> > redirection stuff we discussed some months ago.
>
> It's been a while since I've seen the fusyn scheduler changes. I have
> the current fusyn CVS, I'll take a look at it.

All that stuff is in futex.c; bear in mind what I said at
the confcall, it is just a hacky proof-of-concept--it doesn't
even implement the async interface.

It kind of works, but is not all that solid [last time I tried,
the JVMs locked up].

-- Inaky