* Nick Piggin <[EMAIL PROTECTED]> wrote: > [...] Although I'd imagine it may be something distros may want. For > example, a generic x86-64 kernel for both AMD and Intel systems could > easily have SMT and NUMA turned on.
yes, that's true - in fact reducing the number of separate kernel packages is of utmost importance to all distributions. (I'm not sure we are there yet with CONFIG_NUMA, but small steps wont hurt.) > I agree with the downside of exercising less code paths though. if we make CONFIG_NUMA good enough on small boxes so that distributors can turn it on then in the long run the loss could be offset by the win the extra QA gives. > >is there any case where we'd want to simplify the domain tree? One more > >domain level is just one (and very minor) aspect of CONFIG_NUMA - i'd > >not want to run a CONFIG_NUMA kernel on a non-NUMA box, even if the > >domain tree got optimized. Hm? > > I guess there is the SMT issue too, and even booting an SMP kernel on > a UP system. Also small ia64 NUMA systems will probably have one > redundant NUMA level. i think most factors of not running an SMP kernel on a UP box are not due scheduler overhead: the biggest cost is spinlock overhead. Someone should try a little prototype: use the 'alternate instructions' framework to patch out calls to spinlock functions to NOPs, and benchmark the resulting kernel against UP. If it's "good enough", distros will use it. Having just a single binary kernel RPM that supports everything from NUMA through SMP to UP is the holy grail of distros. (especially the ones that offer commercial support and services.) this is probably not possible on x86 - e.g. it would probably be expensive (in terms of runtime cost) to make the PAE/non-PAE decision runtime (the distro boot kernel needs to be non-PAE). But for newer arches like x64 it should be easier. > If/when topologies get more complex (for example, the recent Altix > discussions we had with Paul), it will be generally easier to set up > all levels in a generic way, then weed them out using something like > this, rather than put the logic in the domain setup code. ok. That should also make it easier to put more of the arch domain setup code into sched.c. E.g. i'm still uneasy about it having so much scheduler code in arch/ia64/kernel/domain.c, and all the ripple effects. (the #ifdefs, include file impact, etc.) Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/