Re: [Backgroundrb-devel] trouble stopping backgroundrb

Ryan Case Fri, 26 Sep 2008 16:31:01 -0700

Thanks for the patch - this works much better.

Occasionally, I still have to "pkill -9 -f backgroundrb", but most of the time just the stop script will clean up when one of the packet_worker processes dies.
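
The manual fallback described above (run the stop script, then force-kill whatever is left) could be folded into a single step. A minimal sketch, assuming the pid read from the pid file leads its own process group; `alive?`, `stop_with_escalation` and the `grace` parameter are hypothetical names, not part of backgroundrb:

```ruby
# Hypothetical helper: true while a process with this pid exists.
def alive?(pid)
  Process.kill(0, pid) # signal 0 delivers nothing; it only probes the pid
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # the process exists, but we are not allowed to signal it
end

# Send TERM to the whole process group, wait up to `grace` seconds,
# then fall back to KILL for anything that ignored TERM.
def stop_with_escalation(pid, grace: 5.0)
  pgid = Process.getpgid(pid)
  Process.kill('-TERM', pgid) # a leading minus targets the process group
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + grace
  while alive?(pid)
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      Process.kill('-KILL', pgid)
      return :killed
    end
    sleep 0.1
  end
  :terminated
end
```

Note that this signals every process in the group, so it should only be pointed at a daemon that runs in its own process group, never at a pid that shares a group with the calling shell or script.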


Thanks,
Ryan


On Sep 17, 2008, at 6:08 PM, hemant kumar wrote:

Okay folks here is a patch to "backgroundrb" script, which should fix
some issues:

diff --git a/script/backgroundrb b/script/backgroundrb
index dabf80b..8d4bb78 100755
--- a/script/backgroundrb
+++ b/script/backgroundrb
@@ -49,18 +49,9 @@ when 'stop'
  def kill_process arg_pid_file
    pid = nil
    File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
-    begin
-      pgid =  Process.getpgid(pid)
-      Process.kill('TERM', pid)
-      Process.kill('-TERM', pgid)
-      Process.kill('KILL', pid)
-    rescue Errno::ESRCH => e
-      puts "Deleting pid file"
-    rescue
-      puts $!
-    ensure
-      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
-    end
+    pgid =  Process.getpgid(pid)
+    Process.kill('-TERM', pgid)
+    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
  end
  pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
  pid_files.each { |x| kill_process(x) }

What it does is:
1. Killing by process group id is enough for the master process.
2. Do not delete the pid file if there was an exception while stopping
the daemon.
3. Do not handle exceptions silently.

Please try this and let me know how it goes.



On Wed, 2008-09-17 at 17:35 +0100, John O'Shea wrote:

Jonathan,
   Glad you raised this, I've been spending some time trying to
diagnose this exact same problem.
   The exception handling code in the "when 'stop'" block (in
script/backgroundrb) could definitely be improved somewhat:
- check that the process with 'pid' exists before trying to kill it
- rescue permission exceptions (Errno::EPERM)
- only delete the pid file if the process pid no longer exists (in
  the ensure block)
- be a little more verbose to stdout/stderr
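
One way the four suggestions above could fit together, as a minimal sketch rather than a tested replacement for the stop block. `process_exists?` and `stop_from_pid_file` are hypothetical names, and the five-second wait is an arbitrary choice:

```ruby
# Hypothetical helper: does a process with this pid exist right now?
def process_exists?(pid)
  Process.kill(0, pid) # signal 0 delivers nothing; it only probes the pid
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # it exists, but belongs to a user we may not signal
end

def stop_from_pid_file(pid_file, wait: 5.0)
  pid = File.read(pid_file).strip.to_i
  raise ArgumentError, "no usable pid in #{pid_file}" unless pid > 0

  unless process_exists?(pid)
    puts "No process #{pid}; removing stale #{pid_file}"
    return :stale
  end

  puts "Sending TERM to the process group of #{pid}"
  Process.kill('-TERM', Process.getpgid(pid))

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + wait
  while process_exists?(pid) &&
        Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    sleep 0.1
  end
  process_exists?(pid) ? :still_running : :stopped
rescue Errno::EPERM
  warn "Not permitted to signal #{pid}; keeping #{pid_file}"
  :not_permitted
ensure
  # Only drop the pid file once the process is really gone.
  if pid && pid > 0 && !process_exists?(pid) && File.exist?(pid_file)
    File.delete(pid_file)
  end
end
```

The return value tells the caller whether a later start is safe. Anything other than :stopped or :stale leaves the pid file in place, so the still-running process can be found again.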

While we are on the subject of shutdown: when the backgroundrb process
gets a HUP signal, does it wait for existing workers to complete any
work methods that are executing, or is the 'Process.kill('-TERM', pgid)'
call intended to make the OS handle this?

We use capistrano to deploy our application (stopping and restarting
backgroundrb after the rails app has been updated). It would be great
if we could have more predictability regarding shutting down
backgroundrb (i.e. have backgroundrb disable the reactor loop in
idle workers and wait for all active workers to finish methods, then
shut down).
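
From a worker's point of view, the kind of shutdown asked for here could look roughly like the following. It only illustrates the idea (stop picking up new work, finish the method in progress, then exit). It is not how backgroundrb or packet implements its reactor loop, and `DrainingWorker`, `request_stop` and `install_term_handler` are hypothetical names:

```ruby
class DrainingWorker
  attr_reader :completed

  def initialize(jobs)
    @jobs = jobs.dup
    @completed = []
    @stopping = false
  end

  # Safe to call from a signal handler: it only flips a flag.
  def request_stop
    @stopping = true
  end

  # Run queued jobs until the queue is empty or a stop was requested.
  def run
    until @stopping || @jobs.empty?
      job = @jobs.shift
      @completed << job.call # a job in progress always runs to completion
    end
    @completed
  end
end

# Wire TERM to the flag instead of letting the signal end the process mid-job.
def install_term_handler(worker)
  Signal.trap('TERM') { worker.request_stop }
end
```

With a handler like this, a stop script would send TERM and then wait for the worker to exit on its own, rather than assuming the process is gone the moment the signal has been sent.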

John.

Jonathan Wallace wrote:

Hi Ryan,

I recently ran into the same issue where the backgroundrb process
would not respond to the ./script/backgroundrb stop command. The pid file
was being deleted but the actual process was not being killed.  I'm
running packet 0.1.12 on gentoo.

I'm not exactly sure what conditions put backgroundrb into such a
state but I've decided to modify the script/backgroundrb to behave a
little differently.

My hypothesis is that if one of the Process.kill method calls in
script/backgroundrb raises an exception, the pid file is deleted even
though the kill signal is never sent. At this point, starting
and stopping backgroundrb never affects the original still-running
backgroundrb process.
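
The failure mode in this hypothesis can be reproduced in isolation. The sketch below copies the begin/rescue/ensure shape of the stop block but swaps the real Process.kill calls for a stand-in that always raises Errno::EPERM. `old_style_stop`, the `signaller` argument and the pid file name are made up for the demonstration:

```ruby
require 'tmpdir'

# Same control flow as the stop block under discussion, with the signalling
# step injected so that a failure can be forced.
def old_style_stop(pid_file, signaller)
  pid = File.read(pid_file).strip.to_i
  begin
    signaller.call(pid)
  rescue Errno::ESRCH
    puts "Deleting pid file"
  rescue
    puts $!
  ensure
    # Runs whether or not a signal was actually delivered.
    File.delete(pid_file) if File.exist?(pid_file)
  end
end

Dir.mktmpdir do |dir|
  pid_file = File.join(dir, 'backgroundrb_example.pid')
  File.write(pid_file, Process.pid.to_s) # our own pid: certainly still alive
  old_style_stop(pid_file, ->(_pid) { raise Errno::EPERM })
  puts "pid file still there? #{File.exist?(pid_file)}"
end
```

The pid file is gone afterwards even though no signal reached the process, which is still running (here, the script itself). A later stop has no pid file left to read, so it cannot find that process.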

There are a couple of reasons that I believe an exception could be
raised. Either the Process.getpgid(pid), Process.kill('TERM', pid) or
the Process.kill('-TERM', pgid) call raises an exception, or the effective
uid of the user running script/backgroundrb stop does not have
permission to kill those processes.

To fix this, we've removed the Process.getpgid and the two
Process.kill's that are sending the TERM signal.  Since we've
architected our backgroundrb jobs to be persistent and idempotent (a
db backed queue written before the feature appeared in bdrb), we'll
just use the KILL signal.

Thoughts?

Thanks,
Jonathan

On Tue, Sep 16, 2008 at 12:11 PM, Ryan Case <[EMAIL PROTECTED]> wrote:

Hi folks -

I'm having trouble getting backgroundrb to stop after one of the
packet_worker_r processes dies.

If backgroundrb is running properly,
"/path/to/application/script/backgroundrb stop" works fine, but often
one of the packet_worker_r processes dies, and the stop command no
longer works after that (it runs, but it does not stop the processes,
and so then start doesn't work).

The only thing that seems to work at that point is to manually kill
the processes that are still running, and then the start works, but
that is going to make restarting via monit a lot less clean.

Any ideas would be much appreciated!

I'm using the github version of backgroundrb, and packet 0.1.13, running on ubuntu.


Thanks!
Ryan
_______________________________________________
Backgroundrb-devel mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/backgroundrb-devel
