Bryan Harris wrote:

The fork concept can be quite confusing at first, but it is actually
quite simple, once you get used to it. Check out the output of this
little program:

[ very interesting stuff cut out ]


Wild!  Why would anyone ever use this?  Why would you ever want to clone
yourself at the current point and have both continue on running?  I guess I
could see it being a smooth way to process a hundred files all at once
(sort-of)...  I don't get it.

Servers often work this way. There's a process listening on a port, but when someone connects, it doesn't serve the request itself (that would block the server, and no one else could connect until it's done). Instead it forks a child, which serves the request, while the parent keeps listening for other requests and forking more children.
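A minimal sketch of that accept-and-fork pattern (the port number 7777 and the greeting are made up for the example, and the original process plays a one-shot client so the script actually terminates; a real server would just run the accept loop forever):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

my $port = 7777;    # assumption: this port is free on localhost

defined(my $server_pid = fork) or die "fork: $!";
unless ($server_pid) {
    # this child plays the server: listen, and fork a worker per connection
    my $listen = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => $port,
        Listen    => 5,
        ReuseAddr => 1,
    ) or die "listen: $!";
    while (my $client = $listen->accept) {
        defined(my $pid = fork) or die "fork: $!";
        if ($pid) { close $client; next }    # parent: straight back to accept
        close $listen;                       # worker: serve this one request
        print $client "hello from worker $$\n";
        exit;
    }
    exit;
}

# the original process plays a client, just to demonstrate
my $sock;
for (1 .. 50) {    # retry until the server child is listening
    $sock = IO::Socket::INET->new(PeerAddr => '127.0.0.1', PeerPort => $port)
        and last;
    select undef, undef, undef, 0.1;    # wait 100 ms between attempts
}
$sock or die "connect: $!";
my $reply = <$sock>;
print "got: $reply";
kill 'TERM', $server_pid;
waitpid $server_pid, 0;
```

Note how the listening parent closes its copy of the client socket and the worker closes its copy of the listening socket; each process keeps only the handle it actually needs.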


As for the client side, you could for example have a mail client downloading mail in the background, while the user can keep writing and sending mail at the same time, etc. Check out this example:

  #!/usr/bin/perl -w
  open F, '+>', "$0-result" or die $!;
  defined(my $pid = fork) or die $!;
  unless ($pid) {
      # child:
      sleep 5; # an important time consuming task...
      print F "123\n";
      exit;
  }
  # parent:
  print "Not yet... Do something else.\n" and sleep 1 until -s F;
  seek F, 0, 0;
  print "Done! The result is ", <F>;

It could be written better with pipes, or real temp files, but it's clearer this way. The child could do something more interesting than sleeping, such as downloading a web page from a slow server, without totally freezing our program. The parent might decide to wait only ten seconds and then give up, or retry the download from another mirror, etc.
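For the record, the pipe version alluded to above might look something like this (a sketch; the child's "task" is again just a sleep, and the blocking read replaces the parent's polling loop):

```perl
#!/usr/bin/perl
use strict;
use warnings;

pipe my $reader, my $writer or die "pipe: $!";
defined(my $pid = fork) or die "fork: $!";
unless ($pid) {
    # child: compute the "result" and send it up the pipe
    close $reader;
    sleep 2;                    # an important time consuming task...
    print $writer "123\n";
    close $writer;
    exit;
}
# parent: must close its copy of the write end, or it would never see EOF;
# the read then simply blocks until the child writes
close $writer;
my $result = <$reader>;
print "Done! The result is ", $result;
waitpid $pid, 0;
```

The waitpid at the end reaps the child, so no zombie process is left behind.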

Instead of one process, you can have a few processes running simultaneously and interacting with each other, each of them a different incarnation of the same program. It can be very powerful.

[ more interesting stuff cut out ]

with '&' your program won't wait for the other-program to finish, but
the other-program process will die when your program (the parent
process) finishes.

Why is this? Does that mean if I create a new tcsh shell, run an app, kill the tcsh shell before the app finishes, that the app will die too? Why have parent/child relationships in processes? Are there such things as grandchildren/grandparents?

Yes. Actually, there's a whole genealogy tree. The process with ID 1 (usually called init) is the ancestor of every other process running on the system. It is the only process without a parent.
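You can watch the genealogy from inside Perl: $$ is the current process ID and getppid returns the parent's. A sketch with three generations of the same program:

```perl
#!/usr/bin/perl
use strict;
use warnings;

print "grandparent: $$\n";
defined(my $pid = fork) or die "fork: $!";
unless ($pid) {
    # first fork: this process is the child
    print "child:      $$ (parent: ", getppid, ")\n";
    defined(my $pid2 = fork) or die "fork: $!";
    unless ($pid2) {
        # second fork: this process is the grandchild
        print "grandchild: $$ (parent: ", getppid, ")\n";
        exit;
    }
    waitpid $pid2, 0;    # the child reaps the grandchild
    exit;
}
waitpid $pid, 0;         # the grandparent reaps the child
```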


Killing the children makes sense when, for example, I open an xterm window and run "man fork". What I have then is a shell process which is a child of xterm, a man process which is a child of the shell, and a pager (for scrolling man's output) which is a child of man. Now, when I just close the xterm window, the pager, man and the shell are killed as well. It makes sense, because keeping them running (sleeping, actually) would be pointless.

But sometimes you don't want that. For example, your ssh connection could die; everything is killed, and you have to connect, log in once again, and start everything from the beginning. If you don't want that, then check out this great little program:

http://www.gnu.org/software/screen/

It's truly amazing.

Of course you want your program to finish without killing the child
processes in the process (pun definitely intended) and for that you need
your child processes to create new sessions with setsid().

What is a session? I've never heard of that...

A session is basically (this is an oversimplification) a bunch of processes that get killed when you close the session, because they are important only for that session (as in the xterm example). For more information, see:


  man setsid
  http://linux.ctyme.com/man/man2993.htm

  man getsid
  http://linux.ctyme.com/man/man0960.htm

  man setpgid
  http://linux.ctyme.com/man/man2984.htm
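A sketch of what setsid does in practice, assuming the usual fork-first idiom (setsid fails if the caller is already a process group leader, which a freshly forked child never is):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX 'setsid';

defined(my $pid = fork) or die "fork: $!";
unless ($pid) {
    # the child is not a process group leader, so setsid is allowed
    print "before setsid: process group ", getpgrp(0), "\n";
    setsid or die "setsid: $!";
    # now the child leads a brand-new session (and process group),
    # detached from the terminal's session, so it survives a hangup
    print "after setsid:  process group ", getpgrp(0), "\n";
    exit;
}
waitpid $pid, 0;
```

The second line printed by the child shows a process group equal to the child's own PID, because a new session leader always leads a new process group named after itself.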

You can take a look at the Proc::Daemon module on CPAN, but it's not exactly what you need, mostly because it redirects STDOUT to /dev/null while you want your processes to write to STDOUT (by the way, are you sure about that? It can result in a total mess printed on your terminal). Still, you may want to read its source to see how it works:
http://search.cpan.org/src/EHOOD/Proc-Daemon-0.03/Daemon.pm

Yes, I absolutely want all output going to STDOUT. Thanks for foreseeing a potential problem, though...

Instead of /dev/null you could also redirect STDOUT to real files (or pipes or sockets or whatever), so you could capture and access it without all the mess on your screen.
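For instance (a sketch; the file name is made up), the child can reopen its own STDOUT before printing anything, and the parent can read the capture back afterwards:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $log = "child-output.log";    # hypothetical capture file
defined(my $pid = fork) or die "fork: $!";
unless ($pid) {
    # child: reopen STDOUT, so every print from now on lands in the file;
    # the parent's STDOUT is completely unaffected
    open STDOUT, '>', $log or die "reopen STDOUT: $!";
    print "this went to the file, not the terminal\n";
    exit;
}
waitpid $pid, 0;    # wait until the child has finished writing
open my $fh, '<', $log or die "open $log: $!";
my $captured = <$fh>;
close $fh;
print "captured: $captured";
unlink $log;        # clean up the demo file
```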


use POSIX 'setsid';
sub forkrun ($) {
    my $cmd = pop;
    defined(my $pid = fork) or die "$0: fork: $!\n";
    unless ($pid) {
        setsid or die "$0: setsid: $!\n";
        exec $cmd;
    }
    return $pid;
}

Yes! This works terrifically!

Great.


How does the fork part work within subroutines?  I'm guessing the part
starting with "unless" looks to see if it's now a child process, and if so,
to quit the current perl script and start off $cmd.  Is that right?

Yes. The child goes into the unless block (it has 0 in $pid, because that's what the fork call returned in the child), while the parent skips the unless block (it has the child's PID in $pid, which is not 0) and goes directly to the return.
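A stripped-down demonstration of the two return values, with nothing here beyond what's described above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

defined(my $pid = fork) or die "fork: $!";
unless ($pid) {
    # in the child, fork returned 0, so we enter the unless block
    print "child:  fork returned $pid, my real pid is $$\n";
    exit;
}
# in the parent, fork returned the child's PID (never 0),
# so we skip the unless block entirely
print "parent: fork returned $pid\n";
waitpid $pid, 0;
```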


Actually, there should be a die or exit call after the exec, just in case the command could not be run, because otherwise the child would also return from the subroutine and run the rest of the program as well, just like the parent. So instead of:

exec $cmd;

there should be:

exec $cmd or die "exec $cmd: $!\n";

As for quitting: exec doesn't even quit the Perl script the way exit() or die() does (no object destructors and not even END blocks will be called in your process after a successful exec). The system replaces our process image with a new one, without giving our process any chance to do anything more. Try this:

  #!/usr/bin/perl -l
  BEGIN { print "BEGIN block" }
  END   { print "END block" }
  print "Program body";
  die   "I'm dying here";
  print "I'm already dead";
  print "This is never printed";

The BEGIN block is always run before the main program starts, and the END block after it ends, no matter whether execution simply reached the end of the file, or exit() was called, or even die(). But here:

  #!/usr/bin/perl -l
  BEGIN { print "BEGIN block" }
  END   { print "END block" }
  print "Program body";
  exec  "echo This is echo";
  print "I'm already dead";
  print "This is never printed";

nothing will be called after exec. If you add the -w switch or "use warnings;", perl will give you a warning:

  Statement unlikely to be reached at ./exec-test-2 line 6.
          (Maybe you meant system() when you said exec()?)

This is why it's always a good idea to have warnings turned on. For more about exec, see:

  perldoc -f exec
  http://www.perldoc.com/perl5.8.0/pod/func/exec.html

  man exec
  http://linux.ctyme.com/man/man0683.htm

  man execve
  http://linux.ctyme.com/man/man0686.htm

It works, so that part's taken care of.  Now I just have to figure out how
it works.  =)  Thanks a lot zsdc.

I'm glad I could be helpful. This subject usually causes a lot of confusion. I hope others also found it interesting and didn't mind that I was a little bit off topic.


--
ZSDC Perl and Systems Security Consulting




