Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Andrew Doran
On Wed, Jun 16, 2010 at 03:56:44PM -0700, Chuck Silvers wrote:

> On Mon, Jun 14, 2010 at 03:25:56PM +, Andrew Doran wrote:
> > On Sun, Jun 13, 2010 at 10:59:45PM -0700, Chuck Silvers wrote:
> > > hi folks,
> > > 
> > > ok, more progress.  linux32 is working now and I fixed a few other bugs
> > > along the way.
> > > 
> > > the updated diff is in
> > > ftp://ftp.netbsd.org/pub/NetBSD/misc/chs/linux/diff.linux-nptl-take2.36
> > 
> > - mips pcb_tls. Can you use curlwp->l_private instead and add a 
> >   cpu_lwp_setprivate() ala i386 to handle this?  As it would be the same
> >   mechanism that we'd use for native TLS.  There is a __HAVE flag for
> >   this in machine/types.h as far as I remember, see sys_lwp.c.  I created
> >   patches for a bunch of other architectures to do this, mjf@ is sitting
> >   on them I think.
> 
> ok, I did that for all the platforms where I added TLS code for linux.
> actually, I created an lwp_setprivate() which sets l->l_private and calls
> cpu_lwp_setprivate() if there is one, and changed all the linux TLS code
> and sys__lwp_setprivate() to use that.  for mips, there isn't any
> hardware register so we don't need a cpu_lwp_setprivate() there.

Ah right.. The rdhwr thing can just read l_private. 
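
For readers following along, the wrapper described above (record the pointer in l_private, then call the MD hook if the port has one) might look roughly like the sketch below. It is a userspace model only: struct demo_lwp, demo_tls_register, DEMO_HAVE_CPU_LWP_SETPRIVATE and the demo_ function names are all made up for illustration and are not the code in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct lwp; only the field used here. */
struct demo_lwp {
	void *l_private;
};

/* Stand-in for a hardware TLS register on ports that have one. */
static void *demo_tls_register;

/* Assumed flag name; define it only on "ports" that provide an MD hook. */
#define DEMO_HAVE_CPU_LWP_SETPRIVATE

#ifdef DEMO_HAVE_CPU_LWP_SETPRIVATE
static int
demo_cpu_lwp_setprivate(struct demo_lwp *l, void *ptr)
{
	(void)l;
	demo_tls_register = ptr;	/* real code would load a register */
	return 0;
}
#endif

/* MI wrapper: always record the pointer, then call the MD hook if present. */
int
demo_lwp_setprivate(struct demo_lwp *l, void *ptr)
{
	int error = 0;

	assert(l != NULL);
	l->l_private = ptr;
#ifdef DEMO_HAVE_CPU_LWP_SETPRIVATE
	error = demo_cpu_lwp_setprivate(l, ptr);
#endif
	return error;
}
```

On a port without a TLS register (the mips case above), the flag would simply be left undefined and readers of the value would go straight to l_private.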
 
> > - In x86 sys_machdep.c, I'd feel better if wrmsr() and set of pcb_gs etc 
> >   were bracketed with kpreempt_disable()/kpreempt_enable().  Likewise
> >   memcpy() to pcb_gsd and friends in Linux compat code.
> > 
> > - For the Linux compat setting of %gs/%fs, I'd rather this was done via
> >   a function call into native x86 code because in the past we've ended
> >   up with stragglers in this code, where someone working on compat does
> >   the wrong thing or where someone working on x86 fails to update the
> >   compat code. Not a strong opinion just a preference.
> 
> with the above change to use lwp_setprivate(), the linux code is now
> free of anything that fiddles with TLS thread state directly.
> 
> 
> > - FYI I think I disabled Linux ptrace() because I was concerned about
> >   potential security issues and bitrot in the code. Dunno if that's
> >   still the case.
> 
> you're right, more would be needed there to be safe.
> I put the checks to disallow ptrace on multi-LWP processes
> back the way they were (and added one on powerpc, where it was missing).
> 
> 
> > - Re: the Linux +ucas_int() hack, preemption implies MULTIPROCESSOR
> >   so the kpreempt toggles aren't needed.. Maybe worthwhile as a sort
> >   of documentation though.
> >   We may context switch during the copyout so I don't see how this
> >   can be atomic.  If copyin() is somehow wacky I guess we could switch
> >   there too.  Any reason not to say "implement user space CAS or your
> >   port loses Linux emulation"?
> 
> interesting, I didn't realize that preemption wasn't supported on
> non-MP kernels.

This is mainly down to the fact that we need kernel_lock to bracket "legacy"
sections of code that aren't preemption safe.  I think MULTIPROCESSOR
should be sent off to the glue factory but that's another discussion :-).

> I think the hacky version is actually safe.  the copyin() will have
> fetched the page into memory and created a pmap mapping for it.
> in a UP kernel, nothing can happen between the copyin() and the copyout()
> that could change the page's contents or invalidate the mapping,
> so the copyout() can't fault and the value that copyin() read can't
> have changed before the copyout().

Ok, maybe a brief comment above would do.
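
For reference, the shape of the sequence being discussed (read the word, compare, write it back only on a match) can be modelled in userspace as below. This is only a sketch: demo_copyin(), demo_copyout() and demo_ucas_int() are hypothetical stand-ins built on memcpy(), and nothing here is atomic by itself. Any safety would have to come from the surrounding guarantees argued about in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for copyin()/copyout(): plain memcpy, never fail. */
static int
demo_copyin(const void *uaddr, void *kaddr, size_t len)
{
	memcpy(kaddr, uaddr, len);
	return 0;
}

static int
demo_copyout(const void *kaddr, void *uaddr, size_t len)
{
	memcpy(uaddr, kaddr, len);
	return 0;
}

/*
 * Compare-and-swap built from a read followed by a conditional write.
 * The value that was read is returned through *ret on success.
 */
int
demo_ucas_int(int *uptr, int old, int new_val, int *ret)
{
	int cur, error;

	assert(uptr != NULL && ret != NULL);
	/* A kernel version would disable preemption around this window. */
	error = demo_copyin(uptr, &cur, sizeof(cur));
	if (error == 0 && cur == old)
		error = demo_copyout(&new_val, uptr, sizeof(new_val));
	if (error == 0)
		*ret = cur;
	return error;
}
```

The caller decides whether the swap happened by comparing *ret with the expected old value, which is the usual CAS convention.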

> as for just disabling COMPAT_LINUX on these platforms,
> I really don't like disabling other people's features
> just because it would be convenient for me.

Perhaps my choice of words was bad.  What we've found is some ports 
lagging behind in the feature stakes, and what works is to say "please
implement this because some point down the road X will break without it".
The presumption here is that it's unreasonable to ask one person to go
implement feature Z on N different ports.  Given the short timeframe for
this change I don't object to the CAS hack.. It is horrible though!

> > - The dup code for fork1() code makes me uncomfortable.  Maybe it's
> >   worthwhile changing our native code so that LIDs are always allocated
> >   from the PID table or something along those lines?  Tend to think these
> > >   should be globally unique within the system and not just within a process.
> >   Could also be of potential help with things like inter-process pthread
> >   objects in shared memory.
> 
> could you be more specific about the duplicated code that concerns you?
> do you mean the "Set the new LWP running" chunk at the end of
> linux_clone_nptl()?  that's copied from sys__lwp_create() rather than
> fork1(), BTW.

That type of code, right.
 
> there are 7 callers (8 with my new one) of lwp_create(),
> and all of them have different code for making the new LWP runnable
> (or not) afterward.  do you suppose it would be worthwhile to
> collect all of that together into 

Re: Enabling built-in modules earlier in init

2010-06-17 Thread Antti Kantee
On Wed Jun 16 2010 at 15:36:30 -0700, Paul Goyette wrote:
> The attached diffs add one more routine, module_init3() which gets 
> called from init_main() right after module_class_init(MODULE_CLASS_ANY). 
> module_init3() walks the list of builtin modules that have not already 
> been init'd and marks them disabled.
> 
> Tested briefly on my home systems and appears to work.
> 
> Any objections to committing this?

I'd still hook it to the end of module_class_init(MODULE_CLASS_ANY)
instead of adding more randomly numbered module_init() calls.
The other benefit from doing so is that you get it done atomically,
which is always worthwhile, and doubly so when it's a low hanging fruit
like here.

> @@ -416,6 +434,7 @@ module_init_class(modclass_t class)
>          * init.
>          */
>         if (module_do_builtin(mi->mi_name, NULL) != 0) {
> +               mod->mod_disabled = true;
>                 TAILQ_REMOVE(&module_builtins, mod, mod_chain);
>                 TAILQ_INSERT_TAIL(&bi_fail, mod, mod_chain);
>         }

Why do you mark it as disabled?  Doesn't this conflict with the "it
might succeed in a later module_init_class()" idea you presented earlier?

module_disabled = true/false in multiple places looks a little
error-prone.  Now that struct module is growing more and more members,
maybe we can just have an object allocator which initializes the value and
afterwards the only acceptable mutation for module_disabled is setting
it to true (might make sense to rename the variable to something like
module_virgin and flip the polarity, though).


Re: Enabling built-in modules earlier in init

2010-06-17 Thread Paul Goyette

On Thu, 17 Jun 2010, Antti Kantee wrote:


> On Wed Jun 16 2010 at 15:36:30 -0700, Paul Goyette wrote:
> > The attached diffs add one more routine, module_init3() which gets
> > called from init_main() right after module_class_init(MODULE_CLASS_ANY).
> > module_init3() walks the list of builtin modules that have not already
> > been init'd and marks them disabled.
> >
> > Tested briefly on my home systems and appears to work.
> >
> > Any objections to committing this?
>
> I'd still hook it to the end of module_class_init(MODULE_CLASS_ANY)
> instead of adding more randomly numbered module_init() calls.
> The other benefit from doing so is that you get it done atomically,
> which is always worthwhile, and doubly so when it's a low hanging fruit
> like here.


I dislike adding another special-case-overload.  In my mind, having 
module_class_init(MODULE_CLASS_ANY) have the additional effect of 
"disabling" un-init'd modules is not much different from automatically 
registering modules in MODULE_CLASS_SECMODEL.  This thread has already 
noted that such registration belongs in each module's modcmd().



> > @@ -416,6 +434,7 @@ module_init_class(modclass_t class)
> >          * init.
> >          */
> >         if (module_do_builtin(mi->mi_name, NULL) != 0) {
> > +               mod->mod_disabled = true;
> >                 TAILQ_REMOVE(&module_builtins, mod, mod_chain);
> >                 TAILQ_INSERT_TAIL(&bi_fail, mod, mod_chain);
> >         }
>
> Why do you mark it as disabled?  Doesn't this conflict with the "it
> might succeed in a later module_init_class()" idea you presented earlier?


Yes, it does conflict.  I've removed this line, and updated the comment 
block above.



> module_disabled = true/false in multiple places looks a little
> error-prone.  Now that struct module is growing more and more members,
> maybe we can just have an object allocator which initializes the value and
> afterwards the only acceptable mutation for module_disabled is setting
> it to true (might make sense to rename the variable to something like
> module_virgin and flip the polarity, though).


HeHe - module_virgin would seem to be a bit obscure to me if I hadn't 
written the code.  Perhaps module_autoload_ok would be acceptable (with 
a state flip)?  Or module_require_force (with no flip)?


The attached diffs use module_autoload_ok instead of module_disabled 
(and state flip), provide an object allocator, and a single mutation 
function.  It also avoids disabling the module if it is unloaded by the 
auto-unload thread;  only modules that are explicitly unloaded by name 
should be prevented from future autoloads.
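
The "object allocator plus single mutation function" pattern described above can be sketched as follows. This is an illustration under assumptions, not the attached diff: struct demo_module and both demo_ function names are hypothetical, and only the mod_autoload_ok member name comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, cut-down module record; only the fields needed here. */
struct demo_module {
	const char *mod_name;
	bool mod_autoload_ok;
};

/* The allocator is the only place the flag is initialized. */
struct demo_module *
demo_module_alloc(const char *name)
{
	struct demo_module *mod = calloc(1, sizeof(*mod));

	if (mod != NULL) {
		mod->mod_name = name;
		mod->mod_autoload_ok = true;
	}
	return mod;
}

/* The only permitted mutation: true -> false, never back again. */
void
demo_module_forbid_autoload(struct demo_module *mod)
{
	assert(mod != NULL);
	mod->mod_autoload_ok = false;
}
```

With this arrangement, an explicit unload by name would call the mutation function, while the auto-unload path would leave the flag alone.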


I've renamed both module_init2() and module_init3() to more descriptive 
routine names, but for now I've kept _init3() as a separate function and 
not made it part of module_init_class().  I'll be doing some testing of 
this over the next day or two before committing.


As always, review, comments, and suggestions (and yes, even criticisms!) 
are welcomed and solicited.   :)




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-
Index: sys/sys/module.h
===
RCS file: /cvsroot/src/sys/sys/module.h,v
retrieving revision 1.23
diff -u -p -r1.23 module.h
--- sys/sys/module.h    24 May 2010 03:50:25 -  1.23
+++ sys/sys/module.h    17 Jun 2010 12:53:24 -
@@ -90,6 +90,7 @@ typedef struct module {
    time_t  mod_autotime;
    void    *mod_ctf;
    u_int   mod_fbtentries; /* DTrace FBT entrie count */
+   bool    mod_autoload_ok;
 } module_t;
 
 /*
@@ -120,7 +121,8 @@ extern struct modlist   module_builtins;
 extern u_int   module_gen;
 
 void   module_init(void);
-void   module_init2(void);
+void   module_start_unload_thread(void);
+void   module_builtin_no_autoload(void);
 void   module_init_md(void);
 void   module_init_class(modclass_t);
 int    module_prime(void *, size_t);
Index: sys/kern/init_main.c
===
RCS file: /cvsroot/src/sys/kern/init_main.c,v
retrieving revision 1.420
diff -u -p -r1.420 init_main.c
--- sys/kern/init_main.c    10 Jun 2010 20:54:53 -  1.420
+++ sys/kern/init_main.c    17 Jun 2010 12:53:24 -
@@ -430,7 +430,7 @@ main(void)
    loginit();
 
    /* Second part of module system initialization. */
-   module_init2();
+   module_start_unload_thread();
 
    /* Initialize the file systems. */
 #ifdef NVNODE_IMPLICIT
@@ -594,9 

Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Mindaugas Rasiukevicius
Chuck Silvers  wrote:
> 
> the updated diff is at:
> ftp://ftp.netbsd.org/pub/NetBSD/misc/chs/linux/diff.linux-nptl-take2.39
> 

The emulation hook, which performs proc_free_pid(l->l_lid), is called before
LWP gets removed from the global list - so there is a small window where LID
uniqueness is not preserved.  How about adding LP_LIDPID flag and moving
proc_free_pid() into lwp_exit(), which makes it symmetric with lwp_create()?

-- 
Mindaugas


Simplelock v.s lock

2010-06-17 Thread Putrycy
Hey guys. I am a noob in NetBSD, though I had some experience with Linux before.
During my work, I've run into a problem where I need to use
Linux-like spin locks.
I found out that I need to use either lock or simplelock in NetBSD.
My question is: what is the exact difference between those two?
According to the manual:
'struct simplelock
              Provides a simple spinning mutex.  A processor will busy-wait
              while trying to acquire a simplelock.  The simplelock operations
              are implemented with machine-dependent locking primitives.
              Simplelocks are usually used only by the high-level lock manager
              and to protect short, critical sections of code.  Simplelocks
              are the only locks that can be used inside an interrupt handler.
              For a simplelock to be used in an interrupt handler, care must
              be taken to disable the interrupt, acquire the lock, do any pro-
              cessing, release the simplelock and re-enable the interrupt.
              This procedure is necessary to avoid deadlock between the inter-
              rupt handler and other threads executing on the same processor.
     struct lock
              Provides a high-level lock supporting sleeping/spinning until
              the lock can be acquired.  The lock manager supplies both exclu-
              sive-access and shared-access locks, with recursive exclusive-
              access locks within a single thread.  It also allows upgrading a
              shared-access lock to an exclusive-access lock, as well as down-
              grading an exclusive-access lock to a shared-access lock.'
Is the only difference that simplelocks are 'turning interrupts off',
and the other is not?
Thanks in advance for any reply.


Re: Simplelock v.s lock

2010-06-17 Thread Adam Hoka
On Thu, 17 Jun 2010 16:53:38 +0200
Putrycy  wrote:

> Hey guys. I am a noob in NetBSD, though I had some experience with Linux before.
> During my work, I've run into a problem where I need to use
> Linux-like spin locks.
> I found out that I need to use either lock or simplelock in NetBSD.
> My question is: what is the exact difference between those two?
> According to the manual:
> 'struct simplelock
>               Provides a simple spinning mutex.  A processor will busy-wait
>               while trying to acquire a simplelock.  The simplelock operations
>               are implemented with machine-dependent locking primitives.
>               Simplelocks are usually used only by the high-level lock manager
>               and to protect short, critical sections of code.  Simplelocks
>               are the only locks that can be used inside an interrupt handler.
>               For a simplelock to be used in an interrupt handler, care must
>               be taken to disable the interrupt, acquire the lock, do any pro-
>               cessing, release the simplelock and re-enable the interrupt.
>               This procedure is necessary to avoid deadlock between the inter-
>               rupt handler and other threads executing on the same processor.
>      struct lock
>               Provides a high-level lock supporting sleeping/spinning until
>               the lock can be acquired.  The lock manager supplies both exclu-
>               sive-access and shared-access locks, with recursive exclusive-
>               access locks within a single thread.  It also allows upgrading a
>               shared-access lock to an exclusive-access lock, as well as down-
>               grading an exclusive-access lock to a shared-access lock.'
> Is the only difference that simplelocks are 'turning interrupts off',
> and the other is not?
> Thanks in advance for any reply.

Both of them are obsolete as of 5.0; use the newer mutex API. But for the record:

a spin lock enters a loop and keeps polling the lock, thus wasting
resources if the lock is held too long. a sleeping lock gives up execution
of the thread and puts it into the turnstile, to be signaled when the lock
is released.
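
To make the "keeps polling" part concrete, here is a minimal userspace sketch of a spin lock built on C11 atomics. It is not the NetBSD mutex or simplelock implementation; struct demo_spinlock and the demo_spin_*() names are made up for this example. A sleeping lock would differ at the marked spot: instead of looping, it would put the thread to sleep until the holder releases the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin lock: one flag, set while the lock is held. */
struct demo_spinlock {
	atomic_flag held;
};

void
demo_spin_init(struct demo_spinlock *sl)
{
	assert(sl != NULL);
	atomic_flag_clear(&sl->held);
}

/* One polling attempt: returns true if we took the lock. */
bool
demo_spin_trylock(struct demo_spinlock *sl)
{
	return !atomic_flag_test_and_set_explicit(&sl->held,
	    memory_order_acquire);
}

/* Busy-wait until the lock is free; the CPU does no useful work here. */
void
demo_spin_lock(struct demo_spinlock *sl)
{
	while (!demo_spin_trylock(sl))
		continue;	/* a sleeping lock would block here instead */
}

void
demo_spin_unlock(struct demo_spinlock *sl)
{
	atomic_flag_clear_explicit(&sl->held, memory_order_release);
}
```

The busy-wait loop is why spin locks only make sense for short critical sections: every waiter burns a CPU for as long as the lock stays held.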

-- 
NetBSD - Simplicity is prerequisite for reliability


Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Chuck Silvers
On Thu, Jun 17, 2010 at 10:25:59AM +, Andrew Doran wrote:
> > > - Re: the Linux +ucas_int() hack, [...]
> > 
> > I think the hacky version is actually safe.  the copyin() will have
> > fetched the page into memory and created a pmap mapping for it.
> > in a UP kernel, nothing can happen between the copyin() and the copyout()
> > that could change the page's contents or invalidate the mapping,
> > so the copyout() can't fault and the value that copyin() read can't
> > have changed before the copyout().
> 
> Ok, maybe a brief comment above would do.

as you pointed out later, it's not safe if the page is COW at that point.
I'll change these to use the RAS stuff.


> > as for just disabling COMPAT_LINUX on these platforms,
> > I really don't like disabling other people's features
> > just because it would be convenient for me.
> 
> Perhaps my choice of words was bad.  What we've found is some ports 
> lagging behind in the feature stakes, and what works is to say "please
> implement this because some point down the road X will break without it".
> The presumption here is that it's unreasonable to ask one person to go
> implement feature Z on N different ports.  Given the short timeframe for
> this change I don't object to the CAS hack.. It is horrible though!

when I've tried to get people to do that kind of thing in the past,
I haven't had a lot of success, so these days I just assume I have to
do it all myself.  it does take more of my time, but the work gets done
a lot sooner and it's less stressful for me.


> > > - The dup code for fork1() code makes me uncomfortable.  Maybe it's
> > >   worthwhile changing our native code so that LIDs are always allocated
> > >   from the PID table or something along those lines?  Tend to think these
> > > >   should be globally unique within the system and not just within a process.
> > >   Could also be of potential help with things like inter-process pthread
> > >   objects in shared memory.
> > 
> > could you be more specific about the duplicated code that concerns you?
> > do you mean the "Set the new LWP running" chunk at the end of
> > linux_clone_nptl()?  that's copied from sys__lwp_create() rather than
> > fork1(), BTW.
> 
> That type of code, right.
>  
> > there are 7 callers (8 with my new one) of lwp_create(),
> > and all of them have different code for making the new LWP runnable
> > (or not) afterward.  do you suppose it would be worthwhile to
> > collect all of that together into an lwp_launch() or somesuch,
> > with a bunch of flags to select the behaviour?
> 
> They all have different requirements.. I would prefer us to limit the
> proliferation of low-level stuff outside of kern/ but, I don't have a
> strong opinion, it's fine as it was.

ok, I'll leave it as-is for now and look at merging all that stuff later.


> > I'd rather not get into changing LID allocation behaviour of native
> > processes as part of this linux work.
> 
> Sure, it's something we can revisit.
> 
> > > - When resetting l_lid, for safety we should hold p_lock (although
> > >   during early fork we'd probably get away with it due to the process
> > >   being SIDL).
> > >   If we take an approach like the above then we wouldn't need to reset
> > >   l_lid at all.
> > 
> > I added a flag to lwp_create() to have it allocate a new PID to use for
> > the LID of the new thread.  now the only places that the LID of an existing
> > LWP is changed are in execve1() (which already did that) and in the
> > linux fork and exec callbacks (at which point the process has only 1 LWP).
> > do we need to take p_lock in these places?
> 
> In fork is the process still SIDL?  In that case no.. Exec yes I think
> we'd need p_lock held there since the process is still "live" as it were.
> p_reflock may be held to lock out the debugger (can't remember) but
> otherwise the proc is visible.  Re: the existing LID reset in execve1()
> I'll have a look and possibly open a PR.  The only reason to reset the
> LID there is so it looks pretty for ps/top, there should be no assumptions
> about LID 1.

ok.  p_reflock is already held across the relevant portion of execve1(),
so that's not a problem.   I'll add the p_lock usage where needed.


-Chuck


Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Matthew Mondor
On Thu, 17 Jun 2010 10:25:59 +
Andrew Doran  wrote:

> This is mainly down to the fact that we need kernel_lock to bracket "legacy"
> sections of code that aren't preemption safe.  I think MULTIPROCESSOR
> should be sent off to the glue factory but that's another discussion :-).

Is there any way that performance for the uniprocessor case could be
preserved, where some synchronization/preemption-safe blocks are
unnecessary, without having conditional code when MULTIPROCESSOR?

Or is it that for uniprocessor the same precautions are always required
on -current now?

Thanks,
-- 
Matt


Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Thor Lancelot Simon
On Thu, Jun 17, 2010 at 12:07:43PM -0400, Matthew Mondor wrote:
> On Thu, 17 Jun 2010 10:25:59 +
> Andrew Doran  wrote:
> 
> > This is mainly down to the fact that we need kernel_lock to bracket "legacy"
> > sections of code that aren't preemption safe.  I think MULTIPROCESSOR
> > should be sent off to the glue factory but that's another discussion :-).
> 
> Is there any way that performance for the uniprocessor case could be
> preserved, where some synchronization/preemption-safe blocks are
> unnecessary, without having conditional code when MULTIPROCESSOR?

Generally speaking, performance for the uniprocessor case is, in fact,
preserved, because the actual bus-locking operations are patched away
at startup time.

However, only x86 currently supports this, I believe.  Very similar code
is required by part of DTrace (FBT) and I wish someone had both the time
and the skills to work on it for more architectures.  But that's not me.

Thor


Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Chuck Silvers
On Thu, Jun 17, 2010 at 03:27:06PM +0100, Mindaugas Rasiukevicius wrote:
> Chuck Silvers  wrote:
> > 
> > the updated diff is at:
> > ftp://ftp.netbsd.org/pub/NetBSD/misc/chs/linux/diff.linux-nptl-take2.39
> > 
> 
> The emulation hook, which performs proc_free_pid(l->l_lid), is called before
> LWP gets removed from the global list - so there is a small window where LID
> uniqueness is not preserved.  How about adding LP_LIDPID flag and moving
> proc_free_pid() into lwp_exit(), which makes it symmetric with lwp_create()?

good point.  moving that into lwp_exit() sounds fine, I'll do that.

an updated diff (with this plus the p_lock changes) is at the usual
ftp://ftp.netbsd.org/pub/NetBSD/misc/chs/linux/diff.linux-nptl-take2.40

-Chuck


Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Antti Kantee
On Thu Jun 17 2010 at 12:21:41 -0400, Thor Lancelot Simon wrote:
> On Thu, Jun 17, 2010 at 12:07:43PM -0400, Matthew Mondor wrote:
> > On Thu, 17 Jun 2010 10:25:59 +
> > Andrew Doran  wrote:
> > 
> > > This is mainly down to the fact that we need kernel_lock to bracket "legacy"
> > > sections of code that aren't preemption safe.  I think MULTIPROCESSOR
> > > should be sent off to the glue factory but that's another discussion :-).
> > 
> > Is there any way that performance for the uniprocessor case could be
> > preserved, where some synchronization/preemption-safe blocks are
> > unnecessary, without having conditional code when MULTIPROCESSOR?
> 
> Generally speaking, performance for the uniprocessor case is, in fact,
> preserved, because the actual bus-locking operations are patched away
> at startup time.

On a slight tangent, since I can't remember if I've mentioned this on
a list before:

If your workload partitions somewhat easily, you can get uniprocessor
locking performance on a multiprocessor host by running several rump
kernels with one virtual CPU each.  Since exclusion is done already at
the level of the CPU scheduler, all bus locking is optimized out for
normal locks.  E.g. if you have 16 host CPUs, you can have 16 worker
kernels all with uniprocessor locking.  Plus, you can have your 16 CPUs
executing networking code in parallel on NetBSD ...


re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread matthew green

> On Thu, Jun 17, 2010 at 12:07:43PM -0400, Matthew Mondor wrote:
> > On Thu, 17 Jun 2010 10:25:59 +
> > Andrew Doran  wrote:
> > 
> > > This is mainly down to the fact that we need kernel_lock to bracket "legacy"
> > > sections of code that aren't preemption safe.  I think MULTIPROCESSOR
> > > should be sent off to the glue factory but that's another discussion :-).
> > 
> > Is there any way that performance for the uniprocessor case could be
> > preserved, where some synchronization/preemption-safe blocks are
> > unnecessary, without having conditional code when MULTIPROCESSOR?
> 
> Generally speaking, performance for the uniprocessor case is, in fact,
> preserved, because the actual bus-locking operations are patched away
> at startup time.
> 
> However, only x86 currently supports this, I believe.  Very similar code
> is required by part of DTrace (FBT) and I wish someone had both the time
> and the skills to work on it for more architectures.  But that's not me.

i ran a bunch of measurements for sparc64 when i changed GENERIC to be
MP by default (it's necessary for now, until we parse the numa-like
memory maps and avoid memory not available without depending on the CPU
it is physically attached to.)

i couldn't observe any difference in a bunch of real world and micro-
benchmark cases, so i never even bothered with patching away code that
"isn't necessary" (though i've written code to do said patching.)


.mrg.


Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Thor Lancelot Simon
On Fri, Jun 18, 2010 at 04:09:30AM +1000, matthew green wrote:
> 
> i ran a bunch of measurements for sparc64 when i changed GENERIC to be
> MP by default (it's necessary for now, until we parse the numa-like
> memory maps and avoid memory not available without depending on the CPU
> it is physically attached to.)
> 
> i couldn't observe any difference in a bunch of real world and micro-
> benchmark cases, so i never even bothered with patching away code that
> "isn't necessary" (though i've written code to do said patching.)

In the x86 case, what's patched away on uniprocessors is the prefix
that causes locked bus transactions.  The effect is definitely measurable!

FBT is a little different.  FBT patches *in* the code that actually
implements the trace points when you turn them on.  Since the trace
points may include conditional branches and other code features which
could have an impact on the execution of _other_ code as a side effect,
it's nice to know they aren't there at all unless you turn them on; the
cpu cannot (mis-)speculatively execute code that is not there. :-)

Thor


Re: updating COMPAT_LINUX for linux 2.6.x support (take 2)

2010-06-17 Thread Antti Kantee
On Fri Jun 18 2010 at 04:09:30 +1000, matthew green wrote:
> > Generally speaking, performance for the uniprocessor case is, in fact,
> > preserved, because the actual bus-locking operations are patched away
> > at startup time.
> > 
> > However, only x86 currently supports this, I believe.  Very similar code
> > is required by part of DTrace (FBT) and I wish someone had both the time
> > and the skills to work on it for more architectures.  But that's not me.
> 
> i ran a bunch of measurements for sparc64 when i changed GENERIC to be
> MP by default (it's necessary for now, until we parse the numa-like
> memory maps and avoid memory not available without depending on the CPU
> it is physically attached to.)
> 
> > i couldn't observe any difference in a bunch of real world and micro-
> benchmark cases, so i never even bothered with patching away code that
> "isn't necessary" (though i've written code to do said patching.)

I did a microbenchmark a while ago on i386 and it showed a staggering
difference of tens of %.  Of course, the benchmark didn't do I/O (it
tested tmpfs).  But when you counted the number of locks the kernel
code paths took and compared that against the cost of a bus lock, the
math worked out.  I don't know how expensive a bus lock is on sparc64,
but at least on i386 it's very expensive (comparatively ;).