Paul Jackson wrote:
> Andrew wrote:
>>  (and bear in mind that Paul has a track record of being wrong
>>  on this :))
> 
> heh - I saw that <grin>.
> 
> Max - Andrew's about right, as usual.  You answered my initial
> questions on this patch set adequately, but hard real time is
> not my expertise, so in the final analysis, other than my saying
> I don't have any more objections, my input doesn't mean much
> either way.

I honestly think this one is a no-brainer, and I do not think it will hurt
Paul's track record :).
Paul initially disagreed with me, and that's when he was wrong ;-))
 
Andrew, I looked at this in detail, and here is the explanation I sent to
Paul a few days ago (a slightly shortened/updated version).

--------
I thought some more about your proposal to use the sched_load_balance flag
in cpusets instead of extending cpu_isolated_map. I looked at the cpusets
and cgroups code, and here are my thoughts on this.
Here is the list of issues with the sched_load_balance flag from the CPU
isolation perspective:
-- 
(1) Boot-time isolation is not possible. There is currently no way to set
up a cpuset at boot time, so for example we would not be able to isolate
cpus from irqs and workqueues at boot. Not a major issue, but still an
inconvenience (see the boot-parameter sketch below).

-- 
(2) There is currently no easy way to figure out which cpuset a cpu belongs
to in order to query its sched_load_balance flag. To do that we would need
a method that iterates over all active cpusets and checks their
cpus_allowed masks, which implies holding the cgroup and cpuset mutexes.
It's not clear whether it's ok to do that from the contexts CPU isolation
happens in (apic, sched, workqueue).
The cgroup/cpuset API seems designed for top-down access, i.e. adding a cpu
to a set and then recomputing domains, which makes perfect sense for the
common cpuset use case but is not what cpu isolation needs.
In other words, I think it's much simpler and cleaner to use
cpu_isolated_map for isolation purposes: no locks, no races, etc. (see the
sketch below).
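
To make the difference concrete, here is roughly what the cpuset lookup
would have to look like versus the isolation check. Note that
find_cpuset_for_cpu() is a hypothetical helper (it does not exist today)
standing in for the walk over all active cpusets, and
is_sched_load_balance() is currently private to kernel/cpuset.c; the
cgroup_lock() is what makes this unusable from irq/sched context:

	/* hypothetical - what querying sched_load_balance would need */
	static int cpu_load_balanced(int cpu)
	{
		struct cpuset *cs;
		int balanced = 1;

		cgroup_lock();			/* may sleep */
		cs = find_cpuset_for_cpu(cpu);	/* hypothetical helper:
						 * walks all active cpusets */
		if (cs)
			balanced = is_sched_load_balance(cs);
		cgroup_unlock();
		return balanced;
	}

	/* versus the isolation check, safe from any context */
	static inline int cpu_isolated(int cpu)
	{
		return cpu_isset(cpu, cpu_isolated_map);
	}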

-- 
(3) cpusets are a bit too dynamic :). What I mean by this is that the
sched_load_balance flag can be changed at any time without bringing a CPU
offline. That means we would need some notifier mechanism for killing and
restarting workqueue threads when the flag changes. We would also need
logic that makes sure a user does not disable load balancing on all cpus,
because that would effectively kill workqueues on every cpu.
This particular case is already handled very nicely in my patches: the
isolated bit can be set only while the cpu is offline, and it cannot be set
on the first online cpu. Workqueues and other subsystems already handle cpu
hotplug events nicely and can easily ignore isolated cpus when they come
online (see the sketch below).
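
For illustration, this is roughly the shape the existing hotplug callback
in kernel/workqueue.c takes with the isolation check added (a simplified
sketch, not the exact patch; cpu_isolated() is the map test shown above):

	static int __devinit workqueue_cpu_callback(struct notifier_block *nfb,
						    unsigned long action,
						    void *hcpu)
	{
		unsigned int cpu = (unsigned long)hcpu;

		/* isolated cpus never get per-cpu workqueue threads,
		 * whether they are coming online or going away */
		if (cpu_isolated(cpu))
			return NOTIFY_OK;

		switch (action) {
		case CPU_UP_PREPARE:
			/* create the per-cpu worker threads as usual */
			break;
		case CPU_DEAD:
			/* and tear them down as usual */
			break;
		}
		return NOTIFY_OK;
	}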

--
#1 is probably unfixable. #2 and #3 can be fixed, but at the expense of
extra complexity across the board. I seriously doubt that I would be able
to push that through review ;-).

Also, personally I still think cpusets and CPU isolation attack two
different problems. cpusets is about partitioning cpus and memory nodes and
managing tasks; most of the cgroups/cpuset APIs are designed to deal with
tasks.
CPU isolation is much simpler and sits at a lower layer. It deals with
IRQs, kernel per-cpu threads, etc.
The only intersection I see is that both features affect scheduling
domains. CPU isolation is again simple here: it uses the existing logic in
sched.c and does not change anything in this area (see below).
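
For reference, the sched.c behavior I am relying on is roughly this:
isolated cpus are simply left out when scheduling domains are built
(simplified from what arch_init_sched_domains() does today):

	static int arch_init_sched_domains(const cpumask_t *cpu_map)
	{
		cpumask_t non_isolated;

		/* isolated cpus are excluded from domain construction,
		 * so the scheduler never balances load onto them */
		cpus_andnot(non_isolated, *cpu_map, cpu_isolated_map);
		return build_sched_domains(&non_isolated);
	}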

---------

Andrew, hopefully that clarifies it. Let me know if you're not convinced.

Max