Re: RFC: Apache::Registry family re-design

Barrie Slaymaker Sat, 08 Sep 2001 11:10:04 -0700
On Sun, Sep 09, 2001 at 01:19:01AM +0800, Stas Bekman wrote:
> 
> Who said that we have to use prototypes? Or if we don't the sub call won't
> be optimized away?

Yup.  Gotta have that () prototype; it's a bit of a (pragmatic) kludge.
If it were done today it would probably be via a ":constant" attribute
so's you could have constant subs w/ mult. attrs.  Or something.

Anyway, see "Constant Functions" in perlsub.

> $ perl -lwe "sub NOP{};NOP('a')"

FWIW (had to give NOP a return value):

   $ perl -MO=Terse -we "sub NOP{1};NOP()"
   Use of uninitialized value in string eq at /usr/local/lib/perl5/5.6.0/i686-linux/B/Terse.pm line 31.
   LISTOP (0x8173648) leave
       OP (0x8173670) enter
       COP (0x8173610) nextstate
       UNOP (0x81735f0) entersub [1]
           UNOP (0x81735a8) null [141]
               OP (0x81735d0) pushmark
               UNOP (0x8173588) null [17]
                   SVOP (0x816e850) gv  GV (0x818c544) *NOP
   -e syntax OK
   $ perl -MO=Terse -we "sub NOP(){1};NOP()"
   Use of uninitialized value in string eq at /usr/local/lib/perl5/5.6.0/i686-linux/B/Terse.pm line 31.
   LISTOP (0x81735f0) leave
       OP (0x80f5a40) enter
       COP (0x81735b8) nextstate
       OP (0x816e850) null [5]
   -e syntax OK
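
The same effect can be seen without reading op trees. The following is a minimal sketch, not part of the original exchange: it deparses two otherwise identical code refs, one guarded by a sub with a () prototype and one guarded by a sub without it. The names trace, DEBUG_INLINED and DEBUG_CALLED are made up for illustration, and the exact deparsed text varies between perl versions, so the sketch only looks at whether the guarded call survives compilation.

```perl
use strict;
use warnings;
use B::Deparse;

# Hypothetical helper; it is never actually called in this sketch.
sub trace { print STDERR "trace: @_\n" }

sub DEBUG_INLINED () { 0 }    # empty prototype: eligible for constant inlining
sub DEBUG_CALLED     { 0 }    # no prototype: remains an ordinary sub call

my $deparse = B::Deparse->new;

# Two identical bodies that differ only in which guard they use.
my $inlined = $deparse->coderef2text(
    sub { trace('x') if DEBUG_INLINED; return 1 } );
my $called = $deparse->coderef2text(
    sub { trace('x') if DEBUG_CALLED; return 1 } );

# With the () prototype the guard is folded at compile time and the
# trace() call disappears; without it the call is still in the code.
print "with ():\n$inlined\n\nwithout ():\n$called\n";
```

This is the property that makes a prototyped constant usable as a compile-time switch: code guarded by a false constant costs nothing at run time.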


> Yes, you understood it correctly. In fact I didn't yet think about the
> details. I suppose we can always use CGI.pm approach and build the code
> via Autoload. then we can really optimize things ourselves, by writing
> only the code that we need. This will make the code writing harder, but
> it's possible. The real problem will be with XS versions of the subs,
> where it simply won't work, unless we use lots of #IFDEFS, I guess that's
> the only way to go there.

Hmmm, Inline comes to mind, but that requires a C compiler at run-time.

> Nuh, we should stay away from using globals. Hmm, is it possible to write
> the NOP in XS? it still has to be CODE reference. Any XS gurus?

No, the sub call to the XS code would be optimized away; that's the
whole point of the constant sub optimization.

> I guess just like Apache::DBI, we can make some Apache::Reload::Registry
> and make it and Apache::Reaload to be aware of each other, and interact in
> the way you suggest.

Just trying to get that code sharing & reuse thang going...

> > > Hmm, that's an interesting idea. But won't it confuse users even more?
> > > this will require them to move all the none-handler code up, or we need
> > > something like: ##__REGISTRY_HANDLER_END__ as well.
> >
> > My experience is that most CGI scripts require munging to run nicely
> > under Apache::Registry anyway, but YMMV.  Just looking to make the
> > munging easier.
> 
> I doubt that your code example in your original post is a typical one.

Yup, it was too clean to start with.

> Usually people scatter their handler code across many subs and I'm not
> sure how ##__REGISTRY_HANDLER_END__ can really help.

All those subs would go before ##__REGISTRY_HANDLER__ or after your
##__REGISTRY_HANDLER_END__
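
A rough sketch of what such partitioning might look like, assuming the two marker comments each sit on a line of their own. The function name partition_script is hypothetical, and nothing like it exists in Apache::Registry as far as this thread shows.

```perl
use strict;
use warnings;

# Split script source into (before, handler body, after) using the
# marker comments.  With no markers, the whole script is the body.
sub partition_script {
    my ($src) = @_;
    my ( $before, $body, $after ) = ( '', $src, '' );

    if ( $src =~ /\A(.*?)^##__REGISTRY_HANDLER__[^\n]*\n(.*)\z/ms ) {
        ( $before, $body ) = ( $1, $2 );

        # The end marker is optional; without it the body runs to EOF.
        if ( $body =~ /\A(.*?)^##__REGISTRY_HANDLER_END__[^\n]*\n?(.*)\z/ms ) {
            my ( $head, $tail ) = ( $1, $2 );
            ( $body, $after ) = ( $head, $tail );
        }
    }
    return ( $before, $body, $after );
}

my @parts = partition_script(<<'END');
sub helper { 1 }
##__REGISTRY_HANDLER__
print helper();
##__REGISTRY_HANDLER_END__
sub other { 2 }
END

print "before: $parts[0]body: $parts[1]after: $parts[2]";
```

Only the middle part would be wrapped in the generated handler sub; the named subs on either side would be compiled once at package level.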

> But what problem are we trying to solve here?

closures.  Basically, if you can't be bothered to (or afford to)
refactor your code, then having the parser "sense" lexical globals and
named sub decls in some way would be helpful.  Given the difficulties of
parsing Perl, I figured maybe defining a tag that let you tell the
parser to partition your code for you might be cool.  Or maybe not; been
a long time since I had to convert a crufty ol' CGI.

> That shouldn't be a problem. If we design this system as flexible as
> possible, we can try to provide a new handler which does what you suggest
> and then if it works well and doesn't add a big overhead, than we can
> replace the previous one with the latter one. Adding
> ##__REGISTRY_HANDLER__ shouldn't add any overhead at all, but that's only
> in case where the script is preloaded.

preloaded?  Or loaded once?  I'm assuming the latter; meaning the script
isn't loaded and compiled every request, not that it's preloaded at
(re)start time.

> register some hook to do something. The only difference is that we allow
> only one hook per single phase. Or do you think you may want to run a few
> different hooks and make it really like Apache hooks, where each hook
> decides whether it wants to handle the current phase or not? I think that
> would slow things down, since we will need to provide the whole mechanism
> with calls.

No, one handler is fine with a few overloadable (either through OO
inheritance or your cookery) subroutines.
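
As an illustration of the OO-inheritance flavour of this, a minimal sketch: one handler with a fixed control flow, plus a couple of small methods a subclass may override. All package and method names here (Sketch::Registry, package_for, needs_compile) are hypothetical and are not taken from any existing module.

```perl
use strict;
use warnings;

package Sketch::Registry;    # hypothetical base class

sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

# Overridable step: map a script file name to a package name.
sub package_for {
    my ( $self, $file ) = @_;
    ( my $name = $file ) =~ s/\W+/_/g;
    return "Sketch::ROOT::$name";
}

# Overridable step: the base class compiles each file only once.
sub needs_compile {
    my ( $self, $file, $mtime ) = @_;
    return !exists $self->{seen}{$file};
}

# The single handler: subclasses change the steps, not the flow.
sub handler {
    my ( $self, $file, $mtime ) = @_;
    my $action = $self->needs_compile( $file, $mtime ) ? 'compile' : 'reuse';
    $self->{seen}{$file} = $mtime;
    return join ' ', $action, $self->package_for($file);
}

package Sketch::Registry::Reloading;    # hypothetical subclass
our @ISA = ('Sketch::Registry');

# Recompile whenever the file's mtime has moved forward.
sub needs_compile {
    my ( $self, $file, $mtime ) = @_;
    my $last = $self->{seen}{$file};
    return !defined $last || $mtime > $last;
}

package main;

my $registry = Sketch::Registry::Reloading->new;
print $registry->handler( '/cgi/a.cgi', $_ ), "\n" for 10, 10, 20;
```

The constant-sub approach discussed earlier would select these steps at compile time instead of through method dispatch.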

> I guess I'm not very clear on what you are trying to accomplish.

An easy way for people to define the working environment for their
scripts.

> please give me a short example of the template you are talking about.

Here's a slightly cleaned up example from my first message, an example
of a custom one I might want to write so all my scripts get run in a
custom environment:

        package <%= $h->{package_name} %> ;

        use strict ; ## or not...
        use My::Standard::Lib ;
        use Apache::Util ;
        use Apache::File ;
        use Apache::Constants ;
        use My::Application::Lib ;
        use GD ;

        *dbh = *My::Globals::dbh ;

        ## etc., etc.

        use base qw( My::AppBase ) ;

        ## Some little utility routines...
        sub empty { ! defined $_[0] || ! length $_[0] }
        ## etc.

        sub handler {
            my ( $r ) = Apache::Request->new( shift ) ;
        #line 1 <%= $h->{script_file_name} %>
            {<%= $h->{script_body} %>} ;
            return OK ;
        }

        1 ;

Apache::Registry could have a (use-constant-optimizable) option to read a file
like this and convert it into code to be cooked like:

   *handler = *some_handler_from_your_example ;
   *mtime   = *mtime_getter ;

   sub gen_code {
       my $h = shift ;
       join(
           '',
           'package ', $h->{package_name}, ' ;

   use strict ; ## or not...
   use My::Standard::Lib ;
   use Apache::Util ;
   use Apache::File ;
   use Apache::Constants ;
   use My::Application::Lib ;
   use GD ;

   *dbh = *My::Globals::dbh ;

   ## etc., etc.

   use base qw( My::AppBase ) ;

   ## Some little utility routines...
   sub empty { ! defined $_[0] || ! length $_[0] }
   ## etc.

   sub handler {
      my ( $r ) = Apache::Request->new( shift ) ;
   #line 1 ', $h->{script_file_name}, '
      {', $h->{script_body}, '} ;
      return OK ;
   }

   1 ;' ) ;
   }

It's just a handy way of letting people define the environment that
their scripts get called in.
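
To make the template idea concrete, here is a minimal sketch of the "cooking" step, assuming the only template construct is <%= $h->{key} %>. The function cook_template, the package Demo::Cooked and the file name hello.cgi are all invented for the example, and the handler takes a plain string instead of a real request object so that it runs outside mod_perl.

```perl
use strict;
use warnings;

# Replace every <%= $h->{key} %> in the template with the value of
# that key; die if the template asks for a key that was not supplied.
sub cook_template {
    my ( $template, $h ) = @_;
    $template =~ s{<%=\s*\$h->\{(\w+)\}\s*%>}{
        defined $h->{$1} ? $h->{$1} : die "no value for template key '$1'\n"
    }ge;
    return $template;
}

# Note that the #line directive must start in the first column of the
# generated code for perl to honour it.
my $template = <<'END';
package <%= $h->{package_name} %>;
use strict;
sub handler {
    my ($r) = @_;
#line 1 "<%= $h->{script_file_name} %>"
    { <%= $h->{script_body} %> }
    return 'OK';
}
1;
END

my $code = cook_template( $template, {
    package_name     => 'Demo::Cooked',
    script_file_name => 'hello.cgi',
    script_body      => 'push @Demo::Cooked::log, __FILE__ . ":" . __LINE__;',
} );

eval $code or die $@;
my $status = Demo::Cooked::handler('fake request');
print "$status @Demo::Cooked::log\n";
```

Because of the #line directive, errors and warnings from the script body are reported against the script's own file name and line numbers and not against the generated wrapper.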

> > 'nuther issue: what's the plan to make it threaded-MPM compatible given
> > that it does a chdir() and calls
> 
> Hmm, I really have to start digging into the threading issues, I've
> ignored them so far. Is chdir() not thread-safe? If you chdir in one
> thread, does it affect other threads? Also what other calls you are
> talking about?

On a lot of OSs, the cwd is kept as a part of the kernel's per-process
structures, so all threads share the same idea of cwd.  Anytime you use
a userland threads package this is likely.
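
One possible way around the shared cwd, offered only as a sketch and not as anything agreed in this thread, is to skip the chdir() and resolve relative paths against the script's directory explicitly. The helper name script_relative and the paths below are made up; File::Spec and File::Basename are core modules.

```perl
use strict;
use warnings;
use File::Spec;
use File::Basename qw(dirname);

# Resolve $path relative to the directory holding $script_file, leaving
# the process-wide cwd untouched.  Absolute paths pass through as-is.
sub script_relative {
    my ( $script_file, $path ) = @_;
    return File::Spec->rel2abs( $path, dirname($script_file) );
}

print script_relative( '/var/www/cgi-bin/app.cgi', 'data/config.txt' ), "\n";
print script_relative( '/var/www/cgi-bin/app.cgi', '/etc/hosts' ),      "\n";
```

The obvious cost is that existing scripts which open relative paths directly would have to be changed to go through such a helper.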

There are other threading problems, like non-reentrant C RTL calls and
old, old APIs like asctime() that return pointers to static buffers
inside the C RTL.  Perl will probably come to use reentrant versions or
wrap non-reentrant ones with mutexes and copy off static buffers, or
just disable certain calls in threaded mode, but XS code is another
kettle of fish.  A lot of problems won't show up on single CPU boxes,
either, so a lot of people can't test their code in SMP MT environments
:-(.  Let alone write t/*.t scripts that try to exercise things in a MT
fashion.

Then there's APIs that are completely non threadsafe, AFAIK, like
setjmp/longjmp, which are used in Perl in some fairly critical places.

All of the above have fixes, not sure how many are practical in perl5.

- Barrie
