Re: Can a module find out, whether another module is present?

2013-02-05 Thread Nick Kew
On Tue, 05 Feb 2013 16:04:13 -0500
Mikhail T. mi+t...@aldan.algebra.com wrote:

> Hello!
>
> What is the official way for a module to check whether another module (known
> by name) is loaded and, if so, whether its hooks (cleanup in particular) will
> be invoked before or after those of the inquirer?
>
> I don't need to affect the order -- I just need to figure out what it is...
> Thanks! Yours,

You can of course fine-tune the order, to specify your module's hooked
functions run before or after another module's.

But in general, querying another module, or knowing anything about
its cleanups, would be a violation of modularity.  If it's legitimate
for a module to expose its inner workings, it can do so by exporting
an API.

Why the questions?  Are you writing two modules that relate closely
to each other?

-- 
Nick Kew


Re: Can a module find out, whether another module is present?

2013-02-05 Thread Mikhail T.

On 05.02.2013 16:37, Nick Kew wrote:

> But in general, querying another module, or knowing anything about
> its cleanups, would be a violation of modularity.  If it's legitimate
> for a module to expose its inner workings, it can do so by exporting
> an API.
>
> Why the questions?  Are you writing two modules that relate closely
> to each other?

I'm not writing them -- they already exist. The two Tcl-modules (rivet and
websh) both destroy the Tcl-interpreter at exit. The module that gets to run
its cleanup last usually causes a crash:
https://issues.apache.org/bugzilla/show_bug.cgi?id=54162


If each module could query whether the other one is loaded too, the first one
could skip destroying the interpreter -- leaving the task to the last one. This
approach would work even if only one of them has been patched to do this.


Modularity is a great thing, of course, but when the modules use shared
data structures (from another library -- such as libtcl), they had better
cooperate, or else...


Yours,

   -mi



Re: Can a module find out, whether another module is present?

2013-02-05 Thread Graham Dumpleton
I don't know if this will be applicable in the case of those modules or not,
but mod_python and mod_wsgi have similar conflicts over Python interpreter
initialisation and destruction and have had to do a little dance over who
gets precedence to ensure things don't crash.

In the next version of mod_wsgi, though, I am dropping support for
coexistence. I want to flag that fact with a big error message and refuse
to start up if both are loaded.

What I have done is relied on the fact that mod_python
uses apr_pool_userdata_set() to set a module specific key in the module
init function to avoid doing certain interpreter initialisation on first
run through the config when Apache is started.

In other words, in mod_wsgi it will look for the mod_python key and
complain.

/*
 * No longer support using mod_python at the same time as
 * mod_wsgi as becoming too painful to hack around
 * mod_python's broken usage of threading APIs when align
 * code to the stricter API requirements of Python 3.2.
 */

userdata_key = "python_init";

apr_pool_userdata_get(&data, userdata_key, s->process->pool);
if (data) {
    ap_log_error(APLOG_MARK, APLOG_CRIT, 0, NULL,
                 "mod_wsgi (pid=%d): The mod_python module can "
                 "not be used on conjunction with mod_wsgi 4.0+. "
                 "Remove the mod_python module from the Apache "
                 "configuration.", getpid());

    return HTTP_INTERNAL_SERVER_ERROR;
}

I don't know whether the modules you are worried about follow this convention
of using apr_pool_userdata_set() to flag that module init has already been run
for the configuration, so that things which shouldn't be done twice aren't.
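
The convention described above can be sketched without Apache at all. This is
only an illustration under stated assumptions: sketch_pool and the sketch_*
functions are hypothetical stand-ins for a long-lived pool and for
apr_pool_userdata_set()/apr_pool_userdata_get(), not the APR API itself. It
shows a module flagging its first init pass under a key, and a second module
probing for that key.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_KEYS 8

/* Hypothetical stand-in for the userdata table of a long-lived pool.
 * With APR this role is played by the process pool. */
typedef struct {
    const char *keys[SKETCH_MAX_KEYS];
    const void *values[SKETCH_MAX_KEYS];
    size_t count;
} sketch_pool;

/* Any non-NULL address works as the "already initialised" marker. */
static const char sketch_marker = 1;

/* Return the value stored under key, or NULL if the key is absent. */
static const void *sketch_userdata_get(const sketch_pool *p, const char *key)
{
    size_t i;
    for (i = 0; i < p->count; i++) {
        if (strcmp(p->keys[i], key) == 0) {
            return p->values[i];
        }
    }
    return NULL;
}

/* Append a key/value pair; returns -1 if the table is full. */
static int sketch_userdata_set(sketch_pool *p, const char *key,
                               const void *value)
{
    if (p->count >= SKETCH_MAX_KEYS) {
        return -1;
    }
    p->keys[p->count] = key;
    p->values[p->count] = value;
    p->count++;
    return 0;
}

/* Returns 1 the first time it is called for key, 0 on later passes. */
static int sketch_is_first_pass(sketch_pool *p, const char *key)
{
    if (sketch_userdata_get(p, key) != NULL) {
        return 0;
    }
    sketch_userdata_set(p, key, &sketch_marker);
    return 1;
}

/* What a second module would do: look for the first module's key. */
static int sketch_other_module_present(const sketch_pool *p, const char *key)
{
    return sketch_userdata_get(p, key) != NULL;
}
```

The probe only works if the other module really sets such a key and does so
before the probing module looks for it, which is why it is a convention rather
than a guarantee.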

Graham







Re: Can a module find out, whether another module is present?

2013-02-05 Thread Jeff Trawick
On Tue, Feb 5, 2013 at 5:14 PM, Graham Dumpleton grah...@apache.org wrote:

 Don't know if will be applicable in the case of those modules or not, but
 mod_python and mod_wsgi have similar conflicts over Python interpreter
 initialisation and destruction and have had to do a little dance over who
 gets precedence to ensure things don't crash.

 In the next version of mod_wsgi though I am dropping support for
 coexistence. I want to flag that fact with a big error message and refuse
 to start up if both loaded.

 What I have done is relied on the fact that mod_python
 uses apr_pool_userdata_set() to set a module specific key in the module
 init function to avoid doing certain interpreter initialisation on first
 run through the config when Apache is started.

 In other words, in mod_wsgi it will look for the mod_python key and
 complain.

 /*
  * No longer support using mod_python at the same time as
  * mod_wsgi as becoming too painful to hack around
  * mod_python's broken usage of threading APIs when align
  * code to the stricter API requirements of Python 3.2.
  */

 userdata_key = "python_init";

 apr_pool_userdata_get(&data, userdata_key, s->process->pool);
 if (data) {
     ap_log_error(APLOG_MARK, APLOG_CRIT, 0, NULL,
                  "mod_wsgi (pid=%d): The mod_python module can "
                  "not be used on conjunction with mod_wsgi 4.0+. "
                  "Remove the mod_python module from the Apache "
                  "configuration.", getpid());

     return HTTP_INTERNAL_SERVER_ERROR;
 }

 Don't know if the modules you are worried about use this convention of
 using apr_pool_userdata_set() to flag whether module init has already been
 run or not for configuration to avoid doing stuff twice which shouldn't.


module *modp;
for (modp = ap_top_module; modp; modp = modp->next) {
    foo(modp->name);
}
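
To turn the loop above into a presence check, a lookup by name might look like
the following. It is a self-contained sketch: sketch_module is a hypothetical
stand-in carrying only the name and next fields, and the module names in the
usage note are illustrative. In a real module you would include http_config.h
and walk ap_top_module using httpd's own module type; that header also declares
ap_find_linked_module(), which may make a hand-written loop unnecessary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for httpd's module record: only the two fields
 * the loop needs. */
typedef struct sketch_module {
    const char *name;            /* e.g. "mod_rivet.c" (illustrative) */
    struct sketch_module *next;  /* next entry in the linked list */
} sketch_module;

/* Walk the list starting at top; return the entry whose name matches,
 * or NULL if no such module is in the list. */
static const sketch_module *find_module(const sketch_module *top,
                                        const char *name)
{
    const sketch_module *modp;
    for (modp = top; modp != NULL; modp = modp->next) {
        if (modp->name != NULL && strcmp(modp->name, name) == 0) {
            return modp;
        }
    }
    return NULL;
}
```

With a list holding "mod_rivet.c" followed by "mod_websh.c", looking up
"mod_websh.c" returns the second entry and looking up "mod_python.c" returns
NULL. This answers "is it loaded?" only; it says nothing about the order in
which cleanups run.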









-- 
Born in Roswell... married an alien...
http://emptyhammock.com/


Re: Can a module find out, whether another module is present?

2013-02-05 Thread Mikhail T.

On 05.02.2013 17:14, Graham Dumpleton wrote:
> In the next version of mod_wsgi though I am dropping support for coexistence.
> I want to flag that fact with a big error message and refuse to start up if
> both loaded.

I'm not sure how Python users will react, but, as a Tcl user, I'd hate to be
forced to choose one of the two modules. I'm hosting two completely unrelated
vhosts, which use the two Tcl-using modules.


On 05.02.2013 17:20, Jeff Trawick wrote:

> module *modp;
> for (modp = ap_top_module; modp; modp = modp->next) {
>     foo(modp->name);
> }

Cool! I thought of relying on the fact that server_rec's module_config is an
array of module-pointers, but the above seems more reliable. Thank you!


   -mi



Re: Can a module find out, whether another module is present?

2013-02-05 Thread Nick Kew
On Tue, 05 Feb 2013 16:48:08 -0500
Mikhail T. mi+t...@aldan.algebra.com wrote:

 I'm not writing them -- they already exist. The two Tcl-modules (rivet and 
 websh) both destroy the Tcl-interpreter at exit. The module, that gets to run 
 the clean up last usually causes a crash: 

Are you sure?  My recollection of Tcl is of creating an interpreter
when I want to use it, and destroying it after use.  Many could run
concurrently with a threaded MPM.

The correct place to ensure that calling library init and cleanup
functions more than once doesn't hurt is in the library, and
if Tcl doesn't do that, you might want to report a bug.

Failing that, you could create wrapper functions which
keep track of state, whether as binary on/off or by
reference counting.  Then fix both modules to call those
in place of the problematic ones.  You could even create
another mini-module (say, mod_tcl_lib) to do global
initialisation and cleanup, make it a prerequisite for
the others, then strip those functions altogether
from the 'real' Tcl modules.
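
A minimal sketch of the reference-counting variant, under stated assumptions:
every name below is hypothetical, real_lib_init() and real_lib_finalize() stand
in for whatever the library actually requires, and the counter would have to
live in one shared place (such as the mini-module suggested above) rather than
in each module. It is not thread-safe as written.

```c
#include <stddef.h>

/* Hypothetical shared state: how many modules currently hold the library. */
static int tcl_lib_users = 0;

/* Counts real finalisations; present only so the behaviour can be observed. */
static int tcl_lib_finalize_calls = 0;

/* Stand-ins for the real library calls. */
static void real_lib_init(void)
{
    /* library-wide initialisation would go here */
}

static void real_lib_finalize(void)
{
    tcl_lib_finalize_calls++;
    /* library-wide finalisation would go here */
}

/* Modules call this instead of initialising the library directly.
 * Only the first caller triggers the real initialisation. */
void tcl_lib_acquire(void)
{
    if (tcl_lib_users++ == 0) {
        real_lib_init();
    }
}

/* Modules call this instead of finalising the library directly.
 * Only the last release triggers the real finalisation; extra
 * releases are ignored. */
void tcl_lib_release(void)
{
    if (tcl_lib_users > 0 && --tcl_lib_users == 0) {
        real_lib_finalize();
    }
}
```

If two modules each acquire once, the first release leaves the library alone
and the second one finalises it, regardless of which module happens to run its
cleanup last.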

-- 
Nick Kew


Re: Can a module find out, whether another module is present?

2013-02-05 Thread Graham Dumpleton
The mod_python project is no longer developed and was moved into the ASF
attic. It is no longer recommended that it be used and the last official
release will not compile on current Apache versions. It only continues in
any form because some Linux distros are making their own patches so it will
compile. They can only ever keep this up for Apache 2.2 though, as 2.4
differences were too great and minor patches will not make it work there.

Graham






Re: Can a module find out, whether another module is present?

2013-02-05 Thread Mikhail T.

On 05.02.2013 17:33, Nick Kew wrote:

> Are you sure?  My recollection of Tcl is of creating an interpreter
> when I want to use it, and destroying it after use.  Many could run
> concurrently with a threaded MPM.

You are right. However, calling Tcl_Finalize -- which is what mod_rivet is doing
-- would destroy /everything/ :)

> The correct place to ensure calling library init and cleanup
> functions more than once doesn't hurt is in the library, and
> if Tcl doesn't do that, you might want to report a bug.

Personally, I doubt the Tcl_Finalize call is useful at all. It is only done (by
mod_rivet) when httpd is exiting anyway.


Yours,

   -mi



Re: Can a module find out, whether another module is present?

2013-02-05 Thread Mikhail T.

On 05.02.2013 17:30, Mikhail T. wrote:

> On 05.02.2013 17:20, Jeff Trawick wrote:
>
> > module *modp;
> > for (modp = ap_top_module; modp; modp = modp->next) {
> >     foo(modp->name);
> > }
>
> Cool! I thought of relying on the fact that server_rec's module_config is
> an array of module-pointers, but the above seems more reliable. Thank you!

BTW, is modp->module_index a reliable indication of the order in which modules
are processed? In other words, if module1's index is smaller than that of
module2, does that mean module1's hooks will be invoked prior to module2's? Or
must one process the linked list to establish the order?


   -mi



Re: Can a module find out, whether another module is present?

2013-02-05 Thread William A. Rowe Jr.
On Tue, 05 Feb 2013 17:47:48 -0500
Mikhail T. mi+t...@aldan.algebra.com wrote:

> BTW, is modp->module_index a reliable indication of order in which
> modules are processed? In other words, if module1's index is smaller
> than that of module2, does that mean, module1's hooks will be invoked
> prior to module2's? Or must one process the link-list to establish
> order?

That tells you the order in which they were loaded, i.e. the order in which
the register_hooks callbacks were processed.

As other hooks are registered, you assign them their priority. I don't know
whether the (priority, index) sort is a stable sort, and I would not rely
on it.



Re: Can a module find out, whether another module is present?

2013-02-05 Thread William A. Rowe Jr.
On Tue, 05 Feb 2013 16:48:08 -0500
Mikhail T. mi+t...@aldan.algebra.com wrote:

> On 05.02.2013 16:37, Nick Kew wrote:
> > But in general, querying another module, or knowing anything about
> > its cleanups, would be a violation of modularity.  If it's
> > legitimate for a module to expose its inner workings, it can do so
> > by exporting an API.
> >
> > Why the questions?  Are you writing two modules that relate closely
> > to each other?
>
> I'm not writing them -- they already exist. The two Tcl-modules
> (rivet and websh) both destroy the Tcl-interpreter at exit. The
> module, that gets to run the clean up last usually causes a crash:
> https://issues.apache.org/bugzilla/show_bug.cgi?id=54162

What if both attempt to register an identical apr_optional_fn for
tcl_destroy?  That way you will never have both optional functions
called.

FWIW I would call that function as a destructor of the process_pool,
which you can find by walking the config pool's parents.


Re: Can a module find out, whether another module is present?

2013-02-05 Thread Mikhail T.

On 05.02.2013 18:01, William A. Rowe Jr. wrote:

> What if both attempt to register an identical apr_optional_fn for
> tcl_destroy.  That way you will never have both optional functions
> called.

My plan was for each of the modules to skip the destruction if the OTHER module
is registered to run its clean-up AFTER it.

This way the last module in the list will always run the destructor.

> FWIW I would call that function as a destructor of the process_pool,
> which you can find by walking the config pool's parents.

That's an idea... But I think I found a Tcl-specific solution for this
particular problem -- instead of calling Tcl_Finalize(), which ruins libtcl for
everyone in the same process, mod_rivet should simply delete the Tcl-interpreter
it created (websh already limits itself to exactly that).

Let's see what the mod_rivet maintainers have to say
(https://issues.apache.org/bugzilla/attachment.cgi?id=29923&action=diff).

But this was a very educational thread nonetheless. Thank you, everyone. Yours,

   -mi



Re: Can a module find out, whether another module is present?

2013-02-05 Thread Graham Dumpleton
Is this being done in the Apache parent process or only in the child
processes?

If in the Apache parent process, you would still have to call Tcl_Finalize()
at some point, wouldn't you, to ensure that all memory is reclaimed?

One of the flaws early on in mod_python was that it didn't destroy the
Python interpreter. When an Apache restart was done, mod_python and the
Python library would be unloaded from memory. When the in-process startup
was done after rereading the configuration, Apache would load them again.
Because the library was reloaded, it came with a completely clean set of
static variables holding interpreter state, and so the interpreter had to be
reinitialised.

In other words, the unload/load that happens for modules on a restart meant
that it leaked memory into the Apache parent process, resulting in the
parent process continually growing over time when restarts were done.

Even after mod_python was fixed to destroy the interpreter, Python itself
still didn't always clean up memory completely and left some static data in
place, on the basis that if the interpreter were reinitialised in the same
process it would just reuse that data to avoid creating it again.
Unfortunately, the unload/load cycle of modules still meant that memory
leaked, and so mod_python as a result still leaks memory into the Apache
parent process.

In the end, because of Python leaking memory in this way, mod_wsgi had to
defer initialisation of the interpreter until child processes were forked,
as it simply wasn't possible to get Python to change what it did.

Graham












Re: Can a module find out, whether another module is present?

2013-02-05 Thread Nick Kew
On Tue, 5 Feb 2013 16:58:48 -0600
William A. Rowe Jr. wr...@rowe-clan.net wrote:


 That tells you what order they were loaded; what order the
 register_hooks callbacks were processed.

But it doesn't tell you the order of process cleanups, as there
are many different hooks where a module could register them.
There's more work to do.  I'd advocate an abstraction that
doesn't depend on loading order.

-- 
Nick Kew


Re: Can a module find out, whether another module is present?

2013-02-05 Thread Mikhail T.

On 05.02.2013 18:25, Graham Dumpleton wrote:
> If in the Apache parent process, you would still have to call Tcl_Finalize()
> at some point wouldn't you to ensure that all memory is reclaimed?

I don't think so, if only because after calling Tcl_Finalize(), any other calls
into libtcl are undefined -- not supposed to happen. So it cannot be done on
graceful restart anyway. From Tcl's man-page:

   Tcl_Finalize is similar to Tcl_Exit except that it does not exit from
   the current process. It is useful for cleaning up when a process is
   finished using Tcl but wishes to continue executing, and when Tcl is
   used in a dynamically loaded extension that is about to be unloaded.
   Your code should always invoke Tcl_Finalize when Tcl is being unloaded,
   to ensure proper cleanup. Tcl_Finalize can be safely called more than
   once.

> One of the flaws early on in mod_python was that it didn't destroy the Python
> interpreter. When an Apache restart was done, mod_python and the Python
> library would be unloaded from memory. When the in process startup was done
> after rereading the configuration Apache would load them again. Because it was
> reloaded it was a completely clean set of static variables holding interpreter
> state and so interpreter had to be reinitialised.

websh is already doing just the Tcl_DeleteInterp -- for the interpreter /it
created/. That seems like the right thing to do anyway.

If websh is wrong (and mod_rivet is right) in that an explicit call to
Tcl_Finalize is needed for an exiting process, then we are coming back to my
original question. Registering it as a clean-up call for the process' pool (as
wrowe@ suggested) seems like the best approach to that.


Yours,

   -mi



Re: Can a module find out, whether another module is present?

2013-02-05 Thread Graham Dumpleton
On 6 February 2013 10:53, Mikhail T. mi+t...@aldan.algebra.com wrote:

  On 05.02.2013 18:25, Graham Dumpleton wrote:

 If in the Apache parent process, you would still have to call Tcl_Finalize()
 at some point wouldn't you to ensure that all memory is reclaimed?

 I don't think so. If only because after calling Tcl_Finalize(), any other
 calls into libtcl are undefined -- not supposed to happen. So, it can not
 be done on graceful restart anyway. From Tcl's man-page:

 Tcl_Finalize is similar to Tcl_Exit except that it does not  exit  from
 the  current  process.   It is useful for cleaning up when a process is
 finished using Tcl but wishes to continue executing, and  when  Tcl  is
 used  in  a  dynamically loaded extension that is about to be unloaded.
 Your code should always invoke Tcl_Finalize when Tcl is being unloaded,
 to  ensure  proper cleanup. Tcl_Finalize can be safely called more than
 once.

  One of the flaws early on in mod_python was that it didn't destroy the
 Python interpreter. When an Apache restart was done, mod_python and the
 Python library would be unloaded from memory. When the in process startup
 was done after rereading the configuration Apache would load them again.
 Because it was reloaded it was a completely clean set of static variables
 holding interpreter state and so interpreter had to be reinitialised.

 websh is already doing just the Tcl_DeleteInterp -- for the
 interpreter *it created*. That seems like the right thing to do anyway.

 If websh is wrong (and mod_rivet is right) in that an explicit call to
 Tcl_Finalize is needed for an exiting process,


It is not the exiting process that is the problem. It is the cleanup,
unloading and then reloading of the module within the same Apache parent
process when an Apache restart/graceful is done. The main Apache parent
process isn't actually killed in this situation.

So the section of documentation you quote appears to support what I am
saying: that Tcl_Finalize() still needs to be called. After the module is
loaded and initialised again, Tcl_Init(), or whatever is used to create the
interpreter again, would be called to start over and allow a new instance of
the interpreter to be set up in the parent process before new child processes
are forked.

As I asked before, is this being done in the Apache parent process or only
in the child processes? If it is all only going on in the child processes,
the point I am making is moot, but if the interpreter is being initialised
in the Apache parent process before the fork, then it would be relevant.

Graham


Re: Can a module find out, whether another module is present?

2013-02-05 Thread Mikhail T.

On 05.02.2013 19:05, Graham Dumpleton wrote:
> So the section of documentation you quote appears to support what I am saying
> that Tcl_Finalize() still needs to be called. After the module is loaded and
> initialised again, then Tcl_Init(), or whatever is used to create it again,
> would be called to start over and allow new instance of interpreter to be
> setup in parent process before new child processes are forked.

I do not think Tcl_Init can (officially) be called after Tcl_Finalize. The
function is meant only for situations when libtcl (or a shared library using
it) is itself being unloaded (think dlclose()).

> As I asked before, is this being done in the Apache parent process or only in
> the child processes? If it is all only going on in the child processes, the
> point I am making is moot, but if the interpreter is being initialised in the
> Apache parent process before the fork, then it would be relevant.

I don't see why Tcl would be initialized in the main process. If it is, that's
probably a bug in itself.


But I'll await a response from mod_rivet maintainers.

   -mi