Hello,
I'm working on a project to search for transiting planets in ~10^4 light curves
collected by the Kepler space telescope. The search is implemented as a
series of ~5 Python scripts, run in the following manner:
python script1.py star 1
<...>
python script1.py star N
python script2.py star 1
<...>
python script2.py star N
and so on, where N ~ 10^4 and the total CPU time required is 3000 hours. I've
had a great deal of success running this with GNU parallel on a small cluster
that uses PBS for resource management:
cat cmdList.txt | parallel --slf $PBS_NODEFILE
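For reference, a minimal sketch of how a command list like cmdList.txt could be generated. The function names are hypothetical, the script names simply follow the script1.py, script2.py pattern shown above, the count of 5 scripts is approximate, and the star arguments are assumed to be the integers 1..N:

```python
def build_commands(n_stars, n_scripts=5):
    """Return one command per (script, star) pair, in the order shown above.

    Assumes scripts are named script1.py ... script<n_scripts>.py and that
    each takes the arguments "star <index>" (both are illustrative guesses).
    """
    commands = []
    for script_idx in range(1, n_scripts + 1):
        for star in range(1, n_stars + 1):
            commands.append(f"python script{script_idx}.py star {star}")
    return commands


def write_command_list(path, commands):
    """Write the commands to a text file, one per line."""
    with open(path, "w") as fh:
        fh.write("\n".join(commands) + "\n")
```

Calling write_command_list("cmdList.txt", build_commands(10_000)) would produce the file piped to parallel above. If script2.py needs the output of script1.py for the same star, it would be safer to write one file per script and run them one after another, since a single flat list does not enforce that ordering.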
Now, I'm trying to migrate this code to a larger computer ("carver" at NERSC).
The command no longer works because carver requires a password for each ssh
connection to a child node. It seems like MPI is the preferred method for
communicating between nodes, but since this work is embarrassingly parallel,
MPI seems like overkill.
Does anyone have any advice?
Erik