Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-22 Thread Rusty Russell
On Fri, 2007-09-21 at 09:37 -0400, Mathieu Desnoyers wrote:
> > Good points.  Well I'd say hiding it all behind a friendly
> > "immediate_set()" interface is the best option then.
> 
> Then we can't benefit of the __init section to have the code removed
> after boot. I don't see the point in doing so.

Because having a magic early version is a bad burden to place on
programmers.  It's an admission of failure that we cannot create a
simpler interface.

AFAICT it's a handful of bytes which would be freed.

> > Is this really worth worrying about?  Isn't there already a problem with
> > printk() in nmi?
> 
> printk() is just an example taken from my current instrumentation. Let's
> say I plan to instrument kmalloc() (which will happen someday): it's
> used in every context, including NMI.

Again, I don't think calling kmalloc in NMI is valid.  Unless I'm
missing something, any code which uses locks is susceptible to deadlock
if used from an NMI handler.   So you really can't do much.

It's nice that you have the perfect solution.  But I'd really rather see
a sufficient solution which is 1/4 of the code and 1/10 the complexity.

Rusty.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-21 Thread Mathieu Desnoyers
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Tue, 2007-09-18 at 09:41 -0400, Mathieu Desnoyers wrote:
> > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > On Fri, 2007-09-14 at 11:32 -0400, Mathieu Desnoyers wrote:
> > > > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > > > Alternatively, if you called it "immediate_init" then the semantics
> > > > > change slightly, but are more obvious (ie. only use this when the 
> > > > > value
> > > > > isn't being accessed yet).  But it can't be __init then anyway.
> > > > > 
> > > > 
> > > > I think your idea is good. immediate_init() could be used to update the
> > > > immediate values at boot time _and_ at module load time, and we could
> > > > use an architecture specific arch_immediate_update_init() to support it.
> > > 
> > > Right.
> > > 
> > > > As for "when" to use this, it should be used at boot time when
> > > > interrupts are still disabled, still running in UP. It can also be used
> > > > at module load time before any of the module code is executed, as long
> > > > as the module code pages are writable (which they always are, for
> > > > now..). Therefore, the flag seems inappropriate for module load
> > > > arch_immediate_update_init. It cannot be put in __init section neither
> > > > though if we use it like this.
> > > 
> > > I think from a user's POV it would be nice to have a 1:1 mapping with
> > > normal initialization semantics (ie. it will work as long as you don't
> > > access this value until initialized).  And I think this would be the
> > > case.  eg:
> > > 
> > > int foo_func(void)
> > > {
> > >   if (immediate_read(&some_immediate))
> > >   return 0;
> > >   ...
> > > }
> > > 
> > > int some_init(void)
> > > {
> > >   immediate_init(some_immediate, 0);
> > >   register_foo(foo_func);
> > >   ...
> > > }
> > > 
> > 
> > There are other considerations that differs between the boot-time case
> > and the general "init" case: the write-protection flag must be
> > cleared-saved/restored when the kernel is running to patch read-only
> > text, but we don't want to modify cr0 at early boot on i386 because
> > paravirt is not executed yet (at boot time, pages are not
> > write-protected yet).
> > 
> > And I am not sure that it buys us anything to create an immediate_init()
> > when we can do exactly the same job with immediate_set. Yes, it might be
> > a bit slower, but we are not on a fast path.
> 
> Good points.  Well I'd say hiding it all behind a friendly
> "immediate_set()" interface is the best option then.
> 

Then we can't benefit of the __init section to have the code removed
after boot. I don't see the point in doing so.

> > > OK, but can you justify the use of immediates within the nmi or mce
> > > handlers?  They don't strike me as useful candidates for optimization.
> > 
> > Yes, immediate values are used by the Linux Kernel Markers, which
> > instrument many code paths, including functions called from nmi and mce
> > contexts (including printk).
> 
> Is this really worth worrying about?  Isn't there already a problem with
> printk() in nmi?
> 

printk() is just an example taken from my current instrumentation. Let's
say I plan to instrument kmalloc() (which will happen someday): it's
used in every context, including NMI. And it's not because printk is
broken that its instrumentation can afford to be broken even further,
that logic seems wrong to me. Somebody go fix printk then ;)

Mathieu

> Cheers,
> Rusty.
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-20 Thread Rusty Russell
On Tue, 2007-09-18 at 09:41 -0400, Mathieu Desnoyers wrote:
> * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > On Fri, 2007-09-14 at 11:32 -0400, Mathieu Desnoyers wrote:
> > > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > > Alternatively, if you called it "immediate_init" then the semantics
> > > > change slightly, but are more obvious (ie. only use this when the value
> > > > isn't being accessed yet).  But it can't be __init then anyway.
> > > > 
> > > 
> > > I think your idea is good. immediate_init() could be used to update the
> > > immediate values at boot time _and_ at module load time, and we could
> > > use an architecture specific arch_immediate_update_init() to support it.
> > 
> > Right.
> > 
> > > As for "when" to use this, it should be used at boot time when
> > > interrupts are still disabled, still running in UP. It can also be used
> > > at module load time before any of the module code is executed, as long
> > > as the module code pages are writable (which they always are, for
> > > now..). Therefore, the flag seems inappropriate for module load
> > > arch_immediate_update_init. It cannot be put in __init section neither
> > > though if we use it like this.
> > 
> > I think from a user's POV it would be nice to have a 1:1 mapping with
> > normal initialization semantics (ie. it will work as long as you don't
> > access this value until initialized).  And I think this would be the
> > case.  eg:
> > 
> > int foo_func(void)
> > {
> > if (immediate_read(&some_immediate))
> > return 0;
> > ...
> > }
> > 
> > int some_init(void)
> > {
> > immediate_init(some_immediate, 0);
> > register_foo(foo_func);
> > ...
> > }
> > 
> 
> There are other considerations that differs between the boot-time case
> and the general "init" case: the write-protection flag must be
> cleared-saved/restored when the kernel is running to patch read-only
> text, but we don't want to modify cr0 at early boot on i386 because
> paravirt is not executed yet (at boot time, pages are not
> write-protected yet).
> 
> And I am not sure that it buys us anything to create an immediate_init()
> when we can do exactly the same job with immediate_set. Yes, it might be
> a bit slower, but we are not on a fast path.

Good points.  Well I'd say hiding it all behind a friendly
"immediate_set()" interface is the best option then.

> > OK, but can you justify the use of immediates within the nmi or mce
> > handlers?  They don't strike me as useful candidates for optimization.
> 
> Yes, immediate values are used by the Linux Kernel Markers, which
> instrument many code paths, including functions called from nmi and mce
> contexts (including printk).

Is this really worth worrying about?  Isn't there already a problem with
printk() in nmi?

Cheers,
Rusty.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-18 Thread Mathieu Desnoyers
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Fri, 2007-09-14 at 11:32 -0400, Mathieu Desnoyers wrote:
> > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > Alternatively, if you called it "immediate_init" then the semantics
> > > change slightly, but are more obvious (ie. only use this when the value
> > > isn't being accessed yet).  But it can't be __init then anyway.
> > > 
> > 
> > I think your idea is good. immediate_init() could be used to update the
> > immediate values at boot time _and_ at module load time, and we could
> > use an architecture specific arch_immediate_update_init() to support it.
> 
> Right.
> 
> > As for "when" to use this, it should be used at boot time when
> > interrupts are still disabled, still running in UP. It can also be used
> > at module load time before any of the module code is executed, as long
> > as the module code pages are writable (which they always are, for
> > now..). Therefore, the flag seems inappropriate for module load
> > arch_immediate_update_init. It cannot be put in __init section neither
> > though if we use it like this.
> 
> I think from a user's POV it would be nice to have a 1:1 mapping with
> normal initialization semantics (ie. it will work as long as you don't
> access this value until initialized).  And I think this would be the
> case.  eg:
> 
> int foo_func(void)
> {
>   if (immediate_read(&some_immediate))
>   return 0;
>   ...
> }
> 
> int some_init(void)
> {
>   immediate_init(some_immediate, 0);
>   register_foo(foo_func);
>   ...
> }
> 

There are other considerations that differs between the boot-time case
and the general "init" case: the write-protection flag must be
cleared-saved/restored when the kernel is running to patch read-only
text, but we don't want to modify cr0 at early boot on i386 because
paravirt is not executed yet (at boot time, pages are not
write-protected yet).

And I am not sure that it buys us anything to create an immediate_init()
when we can do exactly the same job with immediate_set. Yes, it might be
a bit slower, but we are not on a fast path.


> 
> > > On an unrelated note, did you consider simply IPI-ing and doing the
> > > substitution with all CPUs stopped?  If you only updated the immediate
> > > references to this particular var, it should be fast enough not to upset
> > > the RT guys, even.
> > > 
> > 
> > Yes, I thought about this, but since I use immediate values in the
> > kernel markers, which can be put in exception handlers (including nmi,
> > mce handler), which cannot be disabled without important side-effects, I
> > don't think trying to stop the CPUs is a workable solution.
> 
> OK, but can you justify the use of immediates within the nmi or mce
> handlers?  They don't strike me as useful candidates for optimization.
> 

Yes, immediate values are used by the Linux Kernel Markers, which
instrument many code paths, including functions called from nmi and mce
contexts (including printk).

> > > Well, you can do that in asm without gcc support.  It's a little nasty:
> > > since gcc will know nothing about the function call, it can't have side
> > > effects which are visible in this function, and you'll have to save and
> > > restore *all* regs if you decide to do the function call.  But it's
> > > possible (a 5-byte nop gets changed to a call, the call does the pushes
> > > and sets the args regs, calls the function, then pops everything and
> > > rets).
> > 
> > GCC support is required if we want to embed inline functions inside
> > unlikely branches depending on immediate values (no function call
> > there). It also permits passing local variables as arguments to the
> > function call (stack setup), which would be tricky, instrumentation site
> > specific and non portable if done in assembly.
> 
> Well if this is the slow path, you don't want inline anyway.  But it
> would be horribly, horribly arch-specific, yes.
> 

Yes, doing arch specific calls without gcc support seems to be unlikely
to give us a neat portable solution.

Mathieu

> Rusty.
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-17 Thread Rusty Russell
On Fri, 2007-09-14 at 11:32 -0400, Mathieu Desnoyers wrote:
> * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > Alternatively, if you called it "immediate_init" then the semantics
> > change slightly, but are more obvious (ie. only use this when the value
> > isn't being accessed yet).  But it can't be __init then anyway.
> > 
> 
> I think your idea is good. immediate_init() could be used to update the
> immediate values at boot time _and_ at module load time, and we could
> use an architecture specific arch_immediate_update_init() to support it.

Right.

> As for "when" to use this, it should be used at boot time when
> interrupts are still disabled, still running in UP. It can also be used
> at module load time before any of the module code is executed, as long
> as the module code pages are writable (which they always are, for
> now..). Therefore, the flag seems inappropriate for module load
> arch_immediate_update_init. It cannot be put in __init section neither
> though if we use it like this.

I think from a user's POV it would be nice to have a 1:1 mapping with
normal initialization semantics (ie. it will work as long as you don't
access this value until initialized).  And I think this would be the
case.  eg:

int foo_func(void)
{
if (immediate_read(&some_immediate))
return 0;
...
}

int some_init(void)
{
immediate_init(some_immediate, 0);
register_foo(foo_func);
...
}


> > On an unrelated note, did you consider simply IPI-ing and doing the
> > substitution with all CPUs stopped?  If you only updated the immediate
> > references to this particular var, it should be fast enough not to upset
> > the RT guys, even.
> > 
> 
> Yes, I thought about this, but since I use immediate values in the
> kernel markers, which can be put in exception handlers (including nmi,
> mce handler), which cannot be disabled without important side-effects, I
> don't think trying to stop the CPUs is a workable solution.

OK, but can you justify the use of immediates within the nmi or mce
handlers?  They don't strike me as useful candidates for optimization.

> > Well, you can do that in asm without gcc support.  It's a little nasty:
> > since gcc will know nothing about the function call, it can't have side
> > effects which are visible in this function, and you'll have to save and
> > restore *all* regs if you decide to do the function call.  But it's
> > possible (a 5-byte nop gets changed to a call, the call does the pushes
> > and sets the args regs, calls the function, then pops everything and
> > rets).
> 
> GCC support is required if we want to embed inline functions inside
> unlikely branches depending on immediate values (no function call
> there). It also permits passing local variables as arguments to the
> function call (stack setup), which would be tricky, instrumentation site
> specific and non portable if done in assembly.

Well if this is the slow path, you don't want inline anyway.  But it
would be horribly, horribly arch-specific, yes.

Rusty.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-14 Thread Mathieu Desnoyers
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Thu, 2007-09-13 at 17:21 -0400, Mathieu Desnoyers wrote:
> > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > Sure, but why is that the caller's problem?  immediate_set() isn't
> > > fastpath, so why not make it do an "if (early_boot)" internally?
> > 
> > I see two reasons:
> > 1 - early_boot, or anything that looks like this, does not exist
> > currently (and the following reason might show why).
> > 2 - If we use this, we cannot declare the early code with __init, so it
> > will have to stay there forever insteaf of being removable once boot is
> > over.
> > 
> > Therefore, I think it's better to stick to an immediate_set_early
> > version.
> 
> My problem is that I don't know when to use immediate_set_early!  You
> say "early boot": what does that mean?  Is it arch-specific?  And why is
> it my problem anyway?
> 
> Since arch_immediate_update_early() is a memcpy, I don't really buy the
> bloat argument (it has some silly-looking optimization, but that should
> be removed anyway).  I'd really rather see arch_immediate_update()
> examine an internal flag and do the memcpy if it's too early for the
> full algorithm.  That's the only place which really knows, after all.
> You may need some external hook to update that internal "too early"
> flag, of course.
> 
> Alternatively, if you called it "immediate_init" then the semantics
> change slightly, but are more obvious (ie. only use this when the value
> isn't being accessed yet).  But it can't be __init then anyway.
> 

I think your idea is good. immediate_init() could be used to update the
immediate values at boot time _and_ at module load time, and we could
use an architecture specific arch_immediate_update_init() to support it.

As for "when" to use this, it should be used at boot time when
interrupts are still disabled, still running in UP. It can also be used
at module load time before any of the module code is executed, as long
as the module code pages are writable (which they always are, for
now..). Therefore, the flag seems inappropriate for module load
arch_immediate_update_init. It cannot be put in __init section neither
though if we use it like this.


> On an unrelated note, did you consider simply IPI-ing and doing the
> substitution with all CPUs stopped?  If you only updated the immediate
> references to this particular var, it should be fast enough not to upset
> the RT guys, even.
> 

Yes, I thought about this, but since I use immediate values in the
kernel markers, which can be put in exception handlers (including nmi,
mce handler), which cannot be disabled without important side-effects, I
don't think trying to stop the CPUs is a workable solution.


> > > I was thinking that we might find useful specific cases before we get
> > > GCC support, which archs can override with tricky asm if they wish.
> > > 
> > 
> > The first useful case is the Linux Kernel Markers, which really needs a
> > completely boolean if: active or inactive. That would be a good test
> > case to get gcc support.
> 
> Well, you can do that in asm without gcc support.  It's a little nasty:
> since gcc will know nothing about the function call, it can't have side
> effects which are visible in this function, and you'll have to save and
> restore *all* regs if you decide to do the function call.  But it's
> possible (a 5-byte nop gets changed to a call, the call does the pushes
> and sets the args regs, calls the function, then pops everything and
> rets).
> 

GCC support is required if we want to embed inline functions inside
unlikely branches depending on immediate values (no function call
there). It also permits passing local variables as arguments to the
function call (stack setup), which would be tricky, instrumentation site
specific and non portable if done in assembly.

Mathieu

> Cheers,
> Rusty.
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-13 Thread Rusty Russell
On Thu, 2007-09-13 at 17:21 -0400, Mathieu Desnoyers wrote:
> * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > Sure, but why is that the caller's problem?  immediate_set() isn't
> > fastpath, so why not make it do an "if (early_boot)" internally?
> 
> I see two reasons:
> 1 - early_boot, or anything that looks like this, does not exist
> currently (and the following reason might show why).
> 2 - If we use this, we cannot declare the early code with __init, so it
> will have to stay there forever insteaf of being removable once boot is
> over.
> 
> Therefore, I think it's better to stick to an immediate_set_early
> version.

My problem is that I don't know when to use immediate_set_early!  You
say "early boot": what does that mean?  Is it arch-specific?  And why is
it my problem anyway?

Since arch_immediate_update_early() is a memcpy, I don't really buy the
bloat argument (it has some silly-looking optimization, but that should
be removed anyway).  I'd really rather see arch_immediate_update()
examine an internal flag and do the memcpy if it's too early for the
full algorithm.  That's the only place which really knows, after all.
You may need some external hook to update that internal "too early"
flag, of course.

Alternatively, if you called it "immediate_init" then the semantics
change slightly, but are more obvious (ie. only use this when the value
isn't being accessed yet).  But it can't be __init then anyway.

On an unrelated note, did you consider simply IPI-ing and doing the
substitution with all CPUs stopped?  If you only updated the immediate
references to this particular var, it should be fast enough not to upset
the RT guys, even.

> > I was thinking that we might find useful specific cases before we get
> > GCC support, which archs can override with tricky asm if they wish.
> > 
> 
> The first useful case is the Linux Kernel Markers, which really needs a
> completely boolean if: active or inactive. That would be a good test
> case to get gcc support.

Well, you can do that in asm without gcc support.  It's a little nasty:
since gcc will know nothing about the function call, it can't have side
effects which are visible in this function, and you'll have to save and
restore *all* regs if you decide to do the function call.  But it's
possible (a 5-byte nop gets changed to a call, the call does the pushes
and sets the args regs, calls the function, then pops everything and
rets).

Cheers,
Rusty.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-13 Thread Mathieu Desnoyers
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Tue, 2007-09-11 at 10:27 -0400, Mathieu Desnoyers wrote: 
> > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > On Mon, 2007-09-10 at 20:45 -0400, Mathieu Desnoyers wrote:
> > > > Code patching of _live_ SMP code is allowed. This is why I went through
> > > > all this trouble on i386.
> > > 
> > > Oh, I was pretty sure it wasn't.  OK.
> > > 
> > > So now why three versions of immediate_set()?  And why are you using my
> > > lock for exclusion?  Against what?
> > > 
> > 
> > If we need to patch code at boot time, when interrupts are still
> > disabled (it happens when we parse the kernel arguments for instance),
> > we cannot afford to use IPIs to call sync_core() on each cpu, using
> > breakpoints/notifier chains could be tricky (because we are very early
> > at boot and alternatives or paravirt may not have been applied yet).
> 
> Hi Mathieu,
> 
> Sure, but why is that the caller's problem?  immediate_set() isn't
> fastpath, so why not make it do an "if (early_boot)" internally?
> 

I see two reasons:
1 - early_boot, or anything that looks like this, does not exist
currently (and the following reason might show why).
2 - If we use this, we cannot declare the early code with __init, so it
will have to stay there forever insteaf of being removable once boot is
over.

Therefore, I think it's better to stick to an immediate_set_early
version.

> > _immediate_set() has been introduced because of the way immediate values
> > are used by markers: the linux kernel markers already hold the module
> > mutex when they need to update the immediate values. Taking the mutex
> > twice makes no sence, so _immediate_set() is used when the caller
> > already holds the module mutex.
> 
> > Why not just have one immediate_set() which iterates through and fixes
> > > up all the references?
> > 
> > (reasons explained above)
> > 
> > > It can use an internal lock if you want to avoid
> > > concurrent immediate_set() calls.
> > > 
> > 
> > An internal lock won't protect against modules load/unload race. We have
> > to iterate on the module list.
> 
> Sure, but it seems like that's fairly easy to do within module.c:
> 
> /* This updates all the immediates even though only one might have
> * changed.  But it's so rare it's not worth optimizing. */
> void module_update_immediates(void)
> {
> mutex_lock(&module_mutex);
> list_for_each_entry(mod, &modules, list)
> update_immediates(mod->immediate, mod->num_immediate);
> mutex_unlock(&module_mutex);
> }
> 
> Then during module load you do:
> 
> update_immediates(mod->immediate, mod->num_immediate);
> 
> Your immediate_update() just becomes:
> 
> update_immediates(__start___immediate,
> __stop___immediate - __start___immediate);
> module_update_immediates();
> 
> update_immediates() can grab the immediate_mutex if you want.
> 

Yup, excellent idea. I just changed the linux kernel markers too.


> > > Why is it easier to patch the sites now than later?  Currently it's just
> > > churn.  You could go back and find them when this mythical patch gets
> > > merged into this mythical future gcc version.  It could well need a
> > > completely different macro style, like "cond_imm(var, code)".
> > 
> > Maybe you're right. My though was that if we have a way to express a
> > strictly boolean if() statement that can later be optimized further by
> > gcc using a jump rather than a conditionnal branch and currently emulate
> > it by using a load immediate/test/branch, we might want to do so right
> > now so we don't have to do a second code transition from
> > if (immediate_read(&var)) to immediate_if (&var) later. But you might be
> > right in that the form could potentially change anyway when the
> > implementation would come, although I don't see how.
> 
> I was thinking that we might find useful specific cases before we get
> GCC support, which archs can override with tricky asm if they wish.
> 

The first useful case is the Linux Kernel Markers, which really needs a
completely boolean if: active or inactive. That would be a good test
case to get gcc support.

Mathieu

> Cheers,
> Rusty.
> 
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-12 Thread Rusty Russell
On Tue, 2007-09-11 at 10:27 -0400, Mathieu Desnoyers wrote: 
> * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > On Mon, 2007-09-10 at 20:45 -0400, Mathieu Desnoyers wrote:
> > > Code patching of _live_ SMP code is allowed. This is why I went through
> > > all this trouble on i386.
> > 
> > Oh, I was pretty sure it wasn't.  OK.
> > 
> > So now why three versions of immediate_set()?  And why are you using my
> > lock for exclusion?  Against what?
> > 
> 
> If we need to patch code at boot time, when interrupts are still
> disabled (it happens when we parse the kernel arguments for instance),
> we cannot afford to use IPIs to call sync_core() on each cpu, using
> breakpoints/notifier chains could be tricky (because we are very early
> at boot and alternatives or paravirt may not have been applied yet).

Hi Mathieu,

Sure, but why is that the caller's problem?  immediate_set() isn't
fastpath, so why not make it do an "if (early_boot)" internally?

> _immediate_set() has been introduced because of the way immediate values
> are used by markers: the linux kernel markers already hold the module
> mutex when they need to update the immediate values. Taking the mutex
> twice makes no sence, so _immediate_set() is used when the caller
> already holds the module mutex.

> Why not just have one immediate_set() which iterates through and fixes
> > up all the references?
> 
> (reasons explained above)
> 
> > It can use an internal lock if you want to avoid
> > concurrent immediate_set() calls.
> > 
> 
> An internal lock won't protect against modules load/unload race. We have
> to iterate on the module list.

Sure, but it seems like that's fairly easy to do within module.c:

/* This updates all the immediates even though only one might have
* changed.  But it's so rare it's not worth optimizing. */
void module_update_immediates(void)
{
mutex_lock(&module_mutex);
list_for_each_entry(mod, &modules, list)
update_immediates(mod->immediate, mod->num_immediate);
mutex_unlock(&module_mutex);
}

Then during module load you do:

update_immediates(mod->immediate, mod->num_immediate);

Your immediate_update() just becomes:

update_immediates(__start___immediate,
  __stop___immediate - __start___immediate);
module_update_immediates();

update_immediates() can grab the immediate_mutex if you want.

> > Why is it easier to patch the sites now than later?  Currently it's just
> > churn.  You could go back and find them when this mythical patch gets
> > merged into this mythical future gcc version.  It could well need a
> > completely different macro style, like "cond_imm(var, code)".
> 
> Maybe you're right. My though was that if we have a way to express a
> strictly boolean if() statement that can later be optimized further by
> gcc using a jump rather than a conditionnal branch and currently emulate
> it by using a load immediate/test/branch, we might want to do so right
> now so we don't have to do a second code transition from
> if (immediate_read(&var)) to immediate_if (&var) later. But you might be
> right in that the form could potentially change anyway when the
> implementation would come, although I don't see how.

I was thinking that we might find useful specific cases before we get
GCC support, which archs can override with tricky asm if they wish.

Cheers,
Rusty.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-11 Thread Mathieu Desnoyers
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Mon, 2007-09-10 at 20:45 -0400, Mathieu Desnoyers wrote:
> > Code patching of _live_ SMP code is allowed. This is why I went through
> > all this trouble on i386.
> 
> Oh, I was pretty sure it wasn't.  OK.
> 
> So now why three versions of immediate_set()?  And why are you using my
> lock for exclusion?  Against what?
> 

If we need to patch code at boot time, when interrupts are still
disabled (it happens when we parse the kernel arguments for instance),
we cannot afford to use IPIs to call sync_core() on each cpu, using
breakpoints/notifier chains could be tricky (because we are very early
at boot and alternatives or paravirt may not have been applied yet).

So, for early boot code patching, immediate_set_early() is used. It
presumes that variables are not used when the code is patched, that we
are in UP context and it is really minimalistic.

On the other hand, immediate_set() updates kernel and module variables
while the (potentially smp) system is running. It provides coherent
variable updates even if the code referencing then is executing.

_immediate_set() has been introduced because of the way immediate values
are used by markers: the linux kernel markers already hold the module
mutex when they need to update the immediate values. Taking the mutex
twice makes no sence, so _immediate_set() is used when the caller
already holds the module mutex.


> Why not just have one immediate_set() which iterates through and fixes
> up all the references?

(reasons explained above)

> It can use an internal lock if you want to avoid
> concurrent immediate_set() calls.
> 

An internal lock won't protect against modules load/unload race. We have
to iterate on the module list.

> I understand the need for a "module_immediate_fixup()" but that can also
> use your internal lock.

It looks interesting.. can you elaborate a little bit more on this idea?
If we can find a way to encapsulate in module.c everything that needs to
touch the module list, I am all for it.

> 
> > > 2) immediate_if() needs an implementation before you introduce it.  Your
> > > assumption that it's always unlikely seems non-orthogonal.
> > 
> > I could remove the unlikely(), no problem with that. Your point about
> > this is valid. However, I would like to leave the immediate_if() there
> > because it may become very useful if someday gcc permits to extract the
> > address of a branch instruction (and to generate assembly that would not
> > be reached without doing code patching).
> 
> Why is it easier to patch the sites now than later?  Currently it's just
> churn.  You could go back and find them when this mythical patch gets
> merged into this mythical future gcc version.  It could well need a
> completely different macro style, like "cond_imm(var, code)".
> 

Maybe you're right. My though was that if we have a way to express a
strictly boolean if() statement that can later be optimized further by
gcc using a jump rather than a conditionnal branch and currently emulate
it by using a load immediate/test/branch, we might want to do so right
now so we don't have to do a second code transition from
if (immediate_read(&var)) to immediate_if (&var) later. But you might be
right in that the form could potentially change anyway when the
implementation would come, although I don't see how.

Mathieu

> Rusty.
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-10 Thread Rusty Russell
On Mon, 2007-09-10 at 20:45 -0400, Mathieu Desnoyers wrote:
> Code patching of _live_ SMP code is allowed. This is why I went through
> all this trouble on i386.

Oh, I was pretty sure it wasn't.  OK.

So now why three versions of immediate_set()?  And why are you using my
lock for exclusion?  Against what?

Why not just have one immediate_set() which iterates through and fixes
up all the references?  It can use an internal lock if you want to avoid
concurrent immediate_set() calls.

I understand the need for a "module_immediate_fixup()" but that can also
use your internal lock.

> > 2) immediate_if() needs an implementation before you introduce it.  Your
> > assumption that it's always unlikely seems non-orthogonal.
> 
> I could remove the unlikely(), no problem with that. Your point about
> this is valid. However, I would like to leave the immediate_if() there
> because it may become very useful if someday gcc permits to extract the
> address of a branch instruction (and to generate assembly that would not
> be reached without doing code patching).

Why is it easier to patch the sites now than later?  Currently it's just
churn.  You could go back and find them when this mythical patch gets
merged into this mythical future gcc version.  It could well need a
completely different macro style, like "cond_imm(var, code)".

Rusty.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-10 Thread Mathieu Desnoyers
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Sat, 2007-09-08 at 11:28 +0400, Alexey Dobriyan wrote:
> > On Thu, Sep 06, 2007 at 04:02:29PM -0400, Mathieu Desnoyers wrote:
> > > Remove "static" from module_mutex and the modules list so it can be used 
> > > by
> > > other builtin objects in the kernel. Otherwise, every code depending on 
> > > the
> > > module list would have to be put in kernel/module.c. Since the immediate 
> > > values
> > > depends on the module list but can be considered as logically different, 
> > > it
> > > makes sense to implement them in their own file.
> 
> If I understand this code correctly, then changing immediate values
> needs some exclusion to avoid patching live code.  You leave this to the
> user with some very unclear rules.
> 

I think you really misunderstood some major features of these patches
then. Doing an immediate_set on a variable has the same semantic as
setting a normal static variable. If the write to the equivalent
variable would be performed atomically (e.g. 32 bits or less variable on
a 32 bits architecture), it will also be done atomically on an
immediate value reference.

Code patching of _live_ SMP code is allowed. This is why I went through
all this trouble on i386.

So the locking is the same for an immediate value that it would be for a
static variable. If you are concerned about the fact that we have to
iterate on multiple load immediate sites to enable the variable (which
is done non atomically when there are multiple references), we end up in
a moment where references may be in a different state at a given time.
But this is no different from out of order read/writes of/from a static
variable and how older copies may be seen by another CPU : if you hope
that multiple CPUs will see a coherent copy of a static variable at a
given point in time, you clearly have to use the proper locking that
will order read/writes.

Immediate values are mostly meant to be used in contexts where we can
allow this time window where, during the update, some references will
have one value and others won't. Therefore, doing :

immediate_int_t var;
in function:
BUG_ON(immediate_read(&var) != immediate_read(&var));

might fail if it is executed while an immediate_set is done on var. But
if we would have:

volatile int var;
in function:
BUG_ON(var != var);

and, in another thread:
var++;

We _might_ race and see a different value. What is do different about
it?

When we activate a feature that is protected by a boolean, we currently
have to accept that we will, at some point, be in a state where code
paths will see the enabled value and others will see the disabled one.
(this is the choice we do in tracing, and the choice that has to be done
when using atomic updates without locking).

The only guarantee you get is that once the immediate_set is over, all
further references to the variable in memory will see the new variable.

> The result is a real mess that has nothing to do with the module mutex
> and list.  These patches need a lot more work 8(
> 

module mutex is only there to allow coherent iteration on the module
list so we can patch each module's code.

> 1) The immediate types are just kind of silly.  See per-cpu for how it
> handles this already.  DECLARE_IMMEDIATE(type, var) is probably enough.
> 

I would be ok with that.

> 2) immediate_if() needs an implementation before you introduce it.  Your
> assumption that it's always unlikely seems non-orthogonal.
> 

I could remove the unlikely(), no problem with that. Your point about
this is valid. However, I would like to leave the immediate_if() there
because it may become very useful if someday gcc permits to extract the
address of a branch instruction (and to generate assembly that would not
be reached without doing code patching).

Quoting my discussion with H. Peter Anvin:


Mathieu Desnoyers wrote:
>
> If we can change the compiler, here is what we could do:
>
> Tell GCC to put NOPs that could be altered by a branch alternative to
> some specified code. We should be able to get the instruction pointers
> (think of inlines) to these nop/branch instructions so we can change
> them dynamically.
>

Changing the compiler should be perfectly feasible, *BUT* I think we
need a transitional solution that works on existing compilers.

> I suspect this would be inherently tricky. If someone is ready to do
> this and tells me "yes, it will be there in 1 month", I am more than
> ready to switch my markers to this and help, but since the core of my
> work is kernel tracing, I don't have the time nor the ressources to
> tackle this problem.
>
> In the event that someone answers "we'll do this in the following 3
> years", I might consider to change the if (immediate(var)) into an
> immediate_if (var) so we can later proceed to the change with simple
> ifdefs without rewriting all the kernel code that would use it.

This is much more of "we'll do that in the following 1-2 years", since
we have to deal with a full gcc develop

Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-10 Thread Rusty Russell
On Sat, 2007-09-08 at 11:28 +0400, Alexey Dobriyan wrote:
> On Thu, Sep 06, 2007 at 04:02:29PM -0400, Mathieu Desnoyers wrote:
> > Remove "static" from module_mutex and the modules list so it can be used by
> > other builtin objects in the kernel. Otherwise, every code depending on the
> > module list would have to be put in kernel/module.c. Since the immediate 
> > values
> > depends on the module list but can be considered as logically different, it
> > makes sense to implement them in their own file.

If I understand this code correctly, then changing immediate values
needs some exclusion to avoid patching live code.  You leave this to the
user with some very unclear rules.

The result is a real mess that has nothing to do with the module mutex
and list.  These patches need a lot more work 8(

1) The immediate types are just kind of silly.  See per-cpu for how it
handles this already.  DECLARE_IMMEDIATE(type, var) is probably enough.

2) immediate_if() needs an implementation before you introduce it.  Your
assumption that it's always unlikely seems non-orthogonal.

3) immediate_set(), _immediate_set() and immediate_set_early()?  No
thanks!  AFAICT you really want an "init_immediate(var, val)".  This
means "you can patch all the references now, they're not executing".
Later on we could possibly have a super-stop-machine version which
ensures noone's preempted and handles the concurrent case.  Maybe.

4) With an "init" interface not a "set" interface, you don't need
locking.  Simpler.

Hope that helps,
Rusty.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/8] Immediate Values - Global Modules List and Module Mutex

2007-09-08 Thread Alexey Dobriyan
On Thu, Sep 06, 2007 at 04:02:29PM -0400, Mathieu Desnoyers wrote:
> Remove "static" from module_mutex and the modules list so it can be used by
> other builtin objects in the kernel. Otherwise, every code depending on the
> module list would have to be put in kernel/module.c. Since the immediate 
> values
> depends on the module list but can be considered as logically different, it
> makes sense to implement them in their own file.
> 
> The alternative to this would be to disable preemption in code path that need
> such synchronization, so they can be protected against module unload by
> stop_machine(), but not being able to sleep within while needing such
> synchronization is limiting.

> --- linux-2.6-lttng.orig/kernel/module.c
> +++ linux-2.6-lttng/kernel/module.c
> @@ -64,8 +64,8 @@ extern int module_sysfs_initialized;
>  
>  /* List of modules, protected by module_mutex or preempt_disable
>   * (add/delete uses stop_machine). */
> -static DEFINE_MUTEX(module_mutex);
> -static LIST_HEAD(modules);
> +DEFINE_MUTEX(module_mutex);
> +LIST_HEAD(modules);
>  static DECLARE_MUTEX(notify_mutex);
>  
>  static BLOCKING_NOTIFIER_HEAD(module_notify_list);
> --- linux-2.6-lttng.orig/include/linux/module.h
> +++ linux-2.6-lttng/include/linux/module.h
> @@ -60,6 +60,10 @@ struct module_kobject
>   struct kobject *drivers_dir;
>  };
>  
> +/* Protects the list of modules. */
> +extern struct mutex module_mutex;
> +extern struct list_head modules;

Rusty, do you still want to keep module_mutex virgin? If not, I can
backout /proc/*/wchan vs rmmod race fix et al and use muuuch simpler version.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/