Re: [PATCH 0/7] RT: (RFC) RT-Overload/Sched enhancements

2007-10-12 Thread Gregory Haskins
On Fri, 2007-10-12 at 12:29 +0200, Peter Zijlstra wrote:

> I'm wondering why we need the cpu prio management stuff.

I know we covered most of this on IRC, but let me recap so everyone can
follow the thread:

1) The cpupri alg is just one search alg vs the other.  I think we are
all in agreement that it is not clear if one has an advantage over the
other without some empirical data.  So that aspect still remains to be
investigated.  For now, either can be interchanged as the main
functionality is what uses the search, not the search alg itself.

2) The patch series makes some innovations above the current state of
the push-rt patch, which I will try to summarize here for consideration:

A) CPU Priority should be updated due to PI changes as well as
ctx-switch

B) The search algorithm in the cpupri alg employs a priority amongst
eligible CPUs: last-run (for cache-affinity), this_cpu (for
lower-overhead preemption), and finally any other cpu in the
cpus_allowed.  It would be ideal to see the other alg provide similar
priority.

C) The search and pusher functions are separated.  Search is useful in
circumstances outside the push_rt_task functions (see (D))

D) The primary patch addresses one case where we need to redistribute
(high-pri preemption).  There are 3 in total.  The series adds support
for a second case (low-pri RT wakeup).

E) We push until equilibrium instead of just a single task.
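
To make the ordering in (B) concrete, here is a rough user-space sketch
(not code from the series).  The names pick_target_cpu, cpu_eligible and
NR_CPUS, and the plain bitmask standing in for cpus_allowed, are
assumptions made for illustration.  As in the kernel, a numerically lower
prio value means a higher priority, so a CPU is a candidate only when its
current prio value is greater than the task's.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8   /* hypothetical; small enough to fit a 32-bit mask */

/* A CPU is eligible if the task may run there and would preempt what runs now. */
static int cpu_eligible(const int cpu_prio[NR_CPUS], uint32_t allowed,
                        int task_prio, int cpu)
{
    if (cpu < 0 || cpu >= NR_CPUS)
        return 0;
    return (allowed & (UINT32_C(1) << cpu)) && cpu_prio[cpu] > task_prio;
}

/*
 * Preference order from (B): the CPU the task last ran on (cache affinity),
 * then the current CPU (cheaper preemption), then any other allowed CPU.
 * Returns the chosen CPU, or -1 if no CPU can take the task.
 */
int pick_target_cpu(const int cpu_prio[NR_CPUS], uint32_t allowed,
                    int task_prio, int last_cpu, int this_cpu)
{
    int cpu;

    if (cpu_eligible(cpu_prio, allowed, task_prio, last_cpu))
        return last_cpu;
    if (cpu_eligible(cpu_prio, allowed, task_prio, this_cpu))
        return this_cpu;
    for (cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_eligible(cpu_prio, allowed, task_prio, cpu))
            return cpu;
    return -1;
}
```

The "any other cpu" step here simply takes the first eligible CPU in index
order; whether it should instead prefer the lowest-priority CPU is exactly
the kind of question the empirical data in (1) would have to answer.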

3) Having a distinct cpu-priority layer (regardless of search-alg) will
have (IMO) interesting potential going forward.  For instance, we could
have an optional notifier that gets kicked whenever we change
priority-class.  This would allow for some interesting RT related
enhancements by allowing system-level components to register for
priority changes.  For example: APIC TPR, or KVM hypercalls (for RT
guests, on an RT host).  This is more theoretical and half-baked at this
point, but it was something I have been kicking around.
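
Purely as an illustration of the notifier idea in (3), a user-space sketch
might look like the following.  Nothing here exists in the series: the
struct, the three priority classes and every function name are
hypothetical, and a real version would need locking and per-CPU state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NOTIFIERS 4   /* hypothetical fixed-size table */

/* Hypothetical priority classes tracked per CPU. */
enum prio_class { CLASS_IDLE, CLASS_NORMAL, CLASS_RT };

typedef void (*prio_notifier_fn)(int cpu, enum prio_class old_class,
                                 enum prio_class new_class, void *ctx);

struct cpu_prio_layer {
    enum prio_class cur;                  /* current class of this CPU */
    prio_notifier_fn fn[MAX_NOTIFIERS];   /* registered callbacks */
    void *ctx[MAX_NOTIFIERS];             /* opaque argument per callback */
    size_t count;
};

/* Register a callback; returns 0 on success, -1 if the table is full. */
int prio_layer_register(struct cpu_prio_layer *l, prio_notifier_fn fn, void *ctx)
{
    assert(fn != NULL);
    if (l->count >= MAX_NOTIFIERS)
        return -1;
    l->fn[l->count] = fn;
    l->ctx[l->count] = ctx;
    l->count++;
    return 0;
}

/* Update the class and kick the notifiers only when the class changes. */
void prio_layer_set(struct cpu_prio_layer *l, int cpu, enum prio_class new_class)
{
    enum prio_class old_class = l->cur;
    size_t i;

    if (old_class == new_class)
        return;
    l->cur = new_class;
    for (i = 0; i < l->count; i++)
        l->fn[i](cpu, old_class, new_class, l->ctx[i]);
}

/* Example listener: counts class transitions into an int passed via ctx. */
void count_transitions(int cpu, enum prio_class old_class,
                       enum prio_class new_class, void *ctx)
{
    (void)cpu;
    (void)old_class;
    (void)new_class;
    ++*(int *)ctx;
}
```

A listener for something like the APIC TPR or a KVM hypercall would take
the place of count_transitions here; the point is only that the callback
fires on class changes rather than on every priority update.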


That's all I can think of for now.

Regards,
-Greg

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/7] RT: (RFC) RT-Overload/Sched enhancements

2007-10-12 Thread Gregory Haskins
On Fri, 2007-10-12 at 07:47 -0400, Steven Rostedt wrote:
> --
> 
> On Fri, 12 Oct 2007, Peter Zijlstra wrote:
> 
> >
> > And for that, steve's rq->curr_prio field seems quite suitable.
> >
> > so instead of the:
> >   for (3 tries)
> >     find lowest cpu
> >     try push
> >
> > we do:
> >
> >   cpu_hotplug_lock();
> >   cpus_and(mask, p->cpus_allowed, online_cpus);
> >   for_each_cpu_mask(i, mask) {
> >     if (cpu_rq(i)->curr_prio > p->prio && push_task(p, i))
> >       break;
> >   }
> >   cpu_hotplug_unlock();
> 

IMO we should try to logically separate the "search" and "push"
functionality (for instance, see my series).  The search is useful
outside the "push" operation for cases where we are waking a
lower-priority task that doesn't preempt on the current RQ.  In that
case, we can try to wake it directly on the optimal queue instead of
waking locally and then pushing away.
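
As a rough user-space sketch of what I mean (not code from the series):
find_lowest_cpu only searches, and a wakeup path such as the hypothetical
select_wake_cpu below can reuse it without involving any push logic.  The
names, the 4-CPU limit and the bitmask standing in for cpus_allowed are
assumptions; a lower prio value means a higher priority, as in the kernel.

```c
#include <assert.h>

#define NR_CPUS 4      /* hypothetical */
#define NO_CPU  (-1)

/*
 * Search only: among allowed CPUs whose current task the new task would
 * preempt, return the one running the lowest-priority work (largest value).
 */
int find_lowest_cpu(const int curr_prio[NR_CPUS], unsigned allowed, int task_prio)
{
    int cpu, best = NO_CPU;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!(allowed & (1u << cpu)))
            continue;
        if (curr_prio[cpu] <= task_prio)
            continue;   /* task would not preempt here */
        if (best == NO_CPU || curr_prio[cpu] > curr_prio[best])
            best = cpu;
    }
    return best;
}

/*
 * Wakeup placement: if the task preempts the local CPU, wake it there.
 * Otherwise reuse the search to wake it directly on a better runqueue,
 * falling back to the local CPU when nothing else can take it.
 * Assumes this_cpu is within the task's allowed mask.
 */
int select_wake_cpu(const int curr_prio[NR_CPUS], unsigned allowed,
                    int task_prio, int this_cpu)
{
    int target;

    assert(allowed & (1u << this_cpu));
    if (task_prio < curr_prio[this_cpu])
        return this_cpu;
    target = find_lowest_cpu(curr_prio, allowed, task_prio);
    return target != NO_CPU ? target : this_cpu;
}
```

A push path would call the same find_lowest_cpu and then do the actual
migration, so the search policy lives in one place.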

> The thing I'm worried about is that we pushed off a rt task that is higher
> in prio than another rt task on another cpu, and we'll cause a bunch of
> rt task bouncing.  That is what I'm trying to avoid.

I agree, though I think we will achieve that once the third and final
part of the algorithm is implemented.  Stay tuned for an update in that
area.





Re: [PATCH 0/7] RT: (RFC) RT-Overload/Sched enhancements

2007-10-12 Thread Steven Rostedt

--

On Fri, 12 Oct 2007, Peter Zijlstra wrote:

>
> And for that, steve's rq->curr_prio field seems quite suitable.
>
> so instead of the:
>   for (3 tries)
>     find lowest cpu
>     try push
>
> we do:
>
>   cpu_hotplug_lock();
>   cpus_and(mask, p->cpus_allowed, online_cpus);
>   for_each_cpu_mask(i, mask) {
>     if (cpu_rq(i)->curr_prio > p->prio && push_task(p, i))
>       break;
>   }
>   cpu_hotplug_unlock();

The thing I'm worried about is that we pushed off a rt task that is higher
in prio than another rt task on another cpu, and we'll cause a bunch of
rt task bouncing.  That is what I'm trying to avoid.

BTW, my logging has shown that I have yet to hit a 2nd try (but I admit,
this is a very limited test set).

-- Steve


Re: [PATCH 0/7] RT: (RFC) RT-Overload/Sched enhancements

2007-10-12 Thread Peter Zijlstra
Hi Gregory,

On Thu, 2007-10-11 at 17:59 -0400, Gregory Haskins wrote:
> The current series applies to 23-rt1-pre1.
> 
> This is a snapshot of the current work-in-progress for the rt-overload
> enhancements.  The primary motivation for the series is to improve the
> algorithm for distributing RT tasks to keep the highest tasks active.  The
> current system tends to blast IPIs "shotgun" style, and we aim to reduce that
> overhead where possible.  We mitigate this behavior by trying to place tasks
> on the ideal runqueue before an overload even occurs.
> 
> Note that this series is *not* currently stable.  There is at
> least one bug resulting in a hard-lock.  And the hard-lock could be masking
> other yet-to-be-discovered issues.
> 
> My primary motivation for sending it out right now is to share the latest
> series with Peter Zijlstra and Steven Rostedt.  However, in the interest of
> keeping the development open we are sending to a wider distribution.
> Comments/suggestions from anyone are, of course, welcome.  But please note
> this is not quite ready for prime-time in any capacity. 
> 
> The series includes patches from both Steven and myself, with serious
> input/guidance/discussion from Peter.

I'm wondering why we need the cpu prio management stuff. I'm thinking we
might just use any cpus_allowed cpu that has a lesser priority than the
task we're trying to migrate.

And for that, steve's rq->curr_prio field seems quite suitable.

so instead of the:
  for (3 tries)
    find lowest cpu
    try push

we do:

  cpu_hotplug_lock();
  cpus_and(mask, p->cpus_allowed, online_cpus);
  for_each_cpu_mask(i, mask) {
    if (cpu_rq(i)->curr_prio > p->prio && push_task(p, i))
      break;
  }
  cpu_hotplug_unlock();
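
For anyone who wants to play with the behaviour outside the kernel, the
loop above can be modelled in user space roughly as follows.  This is only
a sketch: push_first_fit, NR_CPUS and the plain bitmasks are made up for
the example, hotplug locking is left out, and the push is assumed to always
succeed (the real push_task() can fail, in which case the loop moves on).

```c
#include <assert.h>

#define NR_CPUS 4   /* hypothetical */

/*
 * First-fit push: walk the allowed, online CPUs in index order and place
 * the task on the first one whose current prio value is greater (that is,
 * lower priority) than the task's.  Returns the CPU used, or -1 if none.
 */
int push_first_fit(int curr_prio[NR_CPUS], unsigned allowed, unsigned online,
                   int task_prio)
{
    unsigned mask = allowed & online;   /* models cpus_and() */
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!(mask & (1u << cpu)))
            continue;
        if (curr_prio[cpu] > task_prio) {
            curr_prio[cpu] = task_prio;   /* the pushed task now runs here */
            return cpu;
        }
    }
    return -1;
}
```

With CPU prio values {30, 50, 90, 99} and a task at prio 40, this picks
CPU 1 because it is the first fit, even though CPUs 2 and 3 are running
lower-priority work.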


