Hello,

I have a bioinformatics workflow that I'm considering running in Kepler. The workflow analyzes about a dozen genomes and will take a few hundred hours of computing, which I'll run on our 64-node Rocks cluster.
Several Perl programs retrieve the data and analyze it, reading input from and writing results to both files and a MySQL database. Each program needs to be instantiated multiple times with different inputs.

It appears that Kepler might be able to execute these computations with its PN scheduler and ExternalExecution Actor, but:

1) we typically manage jobs on the cluster via SGE (qsub), and
2) Kepler's documentation says that "to use the ExternalExecution actor, the invoked application must be on the local computer", which implies that Kepler must be installed on all nodes of the cluster.

How would one approach this problem with Kepler? If Kepler isn't the right tool for this problem, what would you recommend?

Thanks,
Arthur

Arthur P. Goldberg, PhD

Research Scientist in Bioinformatics Group
Plant Systems Biology Laboratory
www.virtualplant.org

Visiting Academic
Computer Science Department
Courant Institute of Mathematical Sciences
www.cs.nyu.edu/artg

artg at cs.nyu.edu
New York University
212 995-4918
100 Washington Sq East
8th Floor Silver Building

