Hi,

One of my backgroundrb workers is dying in a bad way. It does some XMPP work 
using xmpp4r and hits a deadlock, which Ruby's monitor eventually detects and 
shuts down.  I need to track down why that is, but either way I say "bad way 
for backgroundrb" because once the worker has deadlocked, every time 
backgroundrb tries to use it I get an exception from the master_worker class, 
line 129 in worker_methods, where 
reactor.live_workers[worker_name_key].invokable_worker_methods fails because 
it ends up calling nil.invokable_worker_methods.  Clearly Packet has noticed 
that the live_worker for worker_name_key is gone, while backgroundrb still 
expects it to be there.
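To make the failure concrete, here is a minimal sketch. It assumes live_workers behaves like a plain Hash keyed by worker name, which is my assumption for illustration and not something I have checked in Packet's source; the :xmpp_worker key is made up.

```ruby
# Plain Hash standing in for reactor.live_workers (an assumption, not Packet's code).
live_workers = { :xmpp_worker => Object.new }

# Simulate the dead worker being dropped from the table.
live_workers.delete(:xmpp_worker)

begin
  # The lookup now returns nil, so the method call is made on nil.
  live_workers[:xmpp_worker].invokable_worker_methods
rescue NoMethodError => e
  error = e
end
```

Every later call for the same key takes the same path, which would explain why the exception repeats in the log.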

I should definitely fix the deadlock, but it seems like backgroundrb should 
handle bad failures like that and keep the system happy. Right now it just 
repeatedly logs an exception saying it can't call invokable_worker_methods on 
nil.  I have a couple of questions about the scenario, and a potential 
workaround that I wanted to vet:

1)      If one of Packet's live_workers dies, whose responsibility (if anyone's) 
is it to restart it?  I.e., I'm not sure whether Packet should restart it 
automatically or backgroundrb should detect the failure and restart it.

2)      I felt more comfortable updating backgroundrb (partly because I have 
it installed as a plugin, so it's easier for me to track in SCM), and I wanted 
to see whether this approach would cause issues.  
Here's what the method in master_worker.rb currently looks like:

def worker_methods worker_name_key
     reactor.live_workers[worker_name_key].invokable_worker_methods
end

I want to change that to

def worker_methods worker_name_key
     reactor.start_worker(:worker => worker_name_key) if reactor.live_workers[worker_name_key].nil?
     reactor.live_workers[worker_name_key].invokable_worker_methods
end

First, I want to know whether that is a good approach and whether these are 
the right functions to call for restarting the worker.  Second, it's currently 
dependent on worker_name_key being the same as the filename, because of how 
Packet::Reactor::start_worker is written. That doesn't seem great to me, but 
I'm unsure how to resolve it at this level.
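For reference, here is a self-contained sketch of the guard I have in mind, so the idea can be tried outside backgroundrb. FakeWorker, FakeReactor and MasterSketch are hypothetical stand-ins that only mimic the two calls used above (live_workers and start_worker(:worker => ...)); they are not Packet's or backgroundrb's real classes. The sketch also assumes start_worker registers the new worker in live_workers before it returns, which may not hold for the real reactor.

```ruby
# Hypothetical stand-in for a live worker entry.
FakeWorker = Struct.new(:invokable_worker_methods)

# Hypothetical stand-in for the reactor; not Packet's implementation.
class FakeReactor
  attr_reader :live_workers, :restarts

  def initialize
    @live_workers = {}
    @restarts = []
  end

  # Assumed shape: takes :worker => name, as in the proposed patch,
  # and registers the worker synchronously (an assumption).
  def start_worker(options)
    name = options[:worker]
    @restarts << name
    @live_workers[name] = FakeWorker.new([:send_message])
  end
end

class MasterSketch
  attr_reader :reactor

  def initialize(reactor)
    @reactor = reactor
  end

  # Restart the worker if it has vanished, then raise a descriptive error
  # (instead of NoMethodError on nil) if the restart did not bring it back.
  def worker_methods(worker_name_key)
    worker = reactor.live_workers[worker_name_key]
    if worker.nil?
      reactor.start_worker(:worker => worker_name_key)
      worker = reactor.live_workers[worker_name_key]
    end
    raise "worker #{worker_name_key} could not be restarted" if worker.nil?
    worker.invokable_worker_methods
  end
end

master = MasterSketch.new(FakeReactor.new)
result = master.worker_methods(:xmpp_worker)
```

The second nil check is there because, if the restart is asynchronous or fails, the patched method would otherwise raise the same exception as before.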

Thanks,
\Peter

