Hemant,

These latest additions to backgroundrb look pretty cool. Unfortunately,
I don't think I will be able to use it this way because in my setup I
can't run anything on the cluster nodes directly. I have to submit jobs
to a queuing system on the cluster's master node, which is why I think a
simple daemon running on the master node that polls the (remote) db for
pending jobs and then submits these to the queue would probably be
better for my case - but I'm far from being an expert on distributed
systems so any suggestions are very welcome!
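To make the idea concrete, here is a rough sketch of such a polling daemon. It is only an illustration under stated assumptions, not a tested design: `Job`, `JobPoller`, the state names and the `submitter` callable are all hypothetical, the in-memory array stands in for the jobs table in the remote db, and the lambda stands in for whatever command actually submits a job to the cluster's queuing system.

```ruby
# Hypothetical job record. In a real app this would be a row in the
# Rails database, not a Struct held in memory.
Job = Struct.new(:id, :command, :state)

# Polls a job store for pending jobs and hands each one to a submitter.
class JobPoller
  def initialize(store, submitter, interval: 30)
    @store = store          # any Enumerable of jobs (assumed)
    @submitter = submitter  # callable that submits a job to the queue
    @interval = interval    # seconds to sleep between polls
  end

  # One polling pass: submit every pending job and record the outcome.
  # Returns the ids of the jobs that were submitted successfully.
  def poll_once
    submitted = []
    @store.select { |job| job.state == :pending }.each do |job|
      begin
        @submitter.call(job)
        job.state = :submitted
        submitted << job.id
      rescue StandardError => e
        # Mark the job as failed so it is not retried on every pass.
        job.state = :failed
        warn "job #{job.id} could not be submitted: #{e.message}"
      end
    end
    submitted
  end

  # Daemon loop: poll forever, sleeping between passes.
  def run
    loop do
      poll_once
      sleep @interval
    end
  end
end

# Example wiring with an in-memory store and a fake submitter.
jobs = [
  Job.new(1, "long_task --input a", :pending),
  Job.new(2, "long_task --input b", :done),
  Job.new(3, "long_task --input c", :pending)
]
queued = []
poller = JobPoller.new(jobs, ->(job) { queued << job.command })
poller.poll_once
```

Only `poll_once` is called in the example; a real daemon would call `run` after detaching from the terminal, and would update the job's state in the database rather than on an in-memory object.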

On Wed, 2008-06-25 at 15:56 +0530, hemant wrote:
> On Wed, Jun 25, 2008 at 1:02 PM, Jack Nutting <[EMAIL PROTECTED]> wrote:
> > On Tue, Jun 24, 2008 at 7:26 PM, Frank Schwach <[EMAIL PROTECTED]> wrote:
> >> Jack,
> >> I just found your interesting post in the archive and I would like to
> >> come back to this. I need to implement something like this:
> >>
> >> I have some very long running tasks (several hours) that should run on a
> >> remote machine and talk to the database on the Rails server. I need to
> >> keep track of jobs including those that have been run in the past, so a
> >> table for background jobs with their status as you describe would be the
> >> best solution for me.
> >>
> >> I am just wondering whether backgroundrb wouldn't be a bit of an
> >> overkill in the scenario you describe? In the new "Advanced Rails
> >> Recipes" from the Pragmatic Programmers Bookshelf there is a recipe
> >> using a simple daemonized ruby process that polls the database for
> >> pending jobs and uses acts_as_state_machine to set the state of the jobs
> >> (there is also a nice BackgrounDRb recipe in the book by the way).
> >> I am just wondering if the daemonized process isn't easier to handle in
> >> this case since you don't integrate your app with backgroundrb very
> >> tightly anyway?
> >>
> >> I would be grateful for any suggestions because there seem to be lots of
> >> possible solutions for this problem and some more or less well
> >> documented plugins and I haven't used any of them before. I need a
> >> simple and robust method that doesn't have too many dependencies and
> >> doesn't require too much maintenance because I want to make the finished
> >> app available for others to install on their local systems.
> >
> > This is an interesting question, Frank. My usage of backgroundrb is
> > somewhat of an edge case, and most of what I'm doing with it could
> > definitely be done with a simpler system. I initially chose
> > backgroundrb for my project because it seemed to make the most sense
> > at the time (for what I *thought* I needed; actual needs changed with
> > further exploration of the problem space), and I was enough of a ruby
> > newbie that it felt comfortable for me to have a packaged solution
> > that (mostly) "just worked". If I were starting from scratch today, I
> > might make a different decision.
> >
> > However, it's not only inertia that keeps me using backgroundrb. For
> > one thing, backgroundrb does provide some handy things--centralized
> > logging, IPC for storing runtime status info about my processes,
> > etc--that would take some time for me to implement if I were rolling
> > my own solutions with a daemonized script, and from my perspective
> > that would be wasted time, since I have those things working today
> > thanks to backgroundrb. Another reason for me to keep it is that I
> > have a few spots in my system where I'm considering using some of
> > backgroundrb's other key features, like launching a short-lived
> > process to handle something in response to some action happening in
> > the main application
> >
>
> Well, I am working on a couple of new things with BackgrounDRb. Result
> storage and retrieval is one of them, as I mentioned in earlier mails
> when I solicited opinions from fellows who are using bdrb. You can
> check out
>
> http://github.com/gnufied/backgroundrb/commits/testcase
>
> So here is what's on this branch of BackgrounDRb, which will become
> master very soon.
>
> 1> True clustering system for clustering backgroundrb servers running
> on N nodes. Tasks are dispatched in a round robin manner, but you can
> specify the host on which you want to execute the task:
>
> MiddleMan.worker(:foo_worker).async_some_work(:args => "lol")
>
> ^^ will choose any server in a round robin manner and run the
> "some_work" method in the specified worker. You can also specify:
> :host => <local or all or "10.0.0.6:11001">
>
> which overrides the default behaviour and runs the specified method on
> the local bdrb server, all bdrb servers or the specified server.
>
> 2> Clustering is failsafe and if one bdrb node goes down, all
> requests immediately start being routed to the remaining servers.
> Once that node comes up, it automatically starts participating in
> the clustering process.
>
> 3> Results can be stored in memcache, and the register_status method
> has been replaced by a "cache" object available in all workers. Hence
> you can cache results with:
>
> [EMAIL PROTECTED] = some_data
>
> in your workers and later you can retrieve results using:
>
> MiddleMan.worker(:foo_worker).ask_result(@user.id)
>
> I would seriously recommend using memcache if you are clustering bdrb
> servers. Also, the cache object's caching mechanism is completely
> thread safe and hence can be used from within the thread pool or
> anywhere you want.
>
> 4> Apart from the memory based job queue that you can use with thread
> pools, the testcase branch implements database based job queues. So, to
> enqueue a particular task:
>
> MiddleMan.worker(:foo_worker).enq_some_task(:job_key,args)
>
> The some_task method will be automatically called in the first available
> worker and the task will be dequeued from the database. Also, jobs with
> duplicate keys automatically get rejected.
>
> Note that the above things are already working on the testcase branch. I
> think these features make bdrb a very compelling choice.
>
> Some things that I will finish in a day or two:
>
> 5> Similar to worker method invocation, with each scheduled method
> you can specify the host on which this task should run. For example,
> say you have 5 bdrb servers and you have scheduled a billing task to
> run every Sunday. Now, you don't want the billing task to run on Sunday
> on all the servers. So, by default a scheduled task will run on the
> server on which it's been created, but you can specify the host on
> which it should run.

-- 
+++++++++++++++++++++++++++++++
Dr Frank Schwach
School of Computing Sciences
University of East Anglia
Norwich, NR4 7TJ
Tel: 0044/(0)1603 - 592 405
www.cmp.uea.ac.uk
++++++++++++++++++++++++++++++++
_______________________________________________
Backgroundrb-devel mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/backgroundrb-devel
