On 4/11/07, AJ Rossini <[EMAIL PROTECTED]> wrote:
> On Tuesday 10 April 2007 23:17, Ramon Diaz-Uriarte wrote:
>
> > Of course, you are right there. I think that might still be the case.
> > At the time we made our decision, and decided to go for MPI, MPI 2 was
> > already out, and MPI seemed "more like the current/future standard"
> > than PVM.
>
> That's always been the case. In fact MPI is a standard, whereas PVM always
> was an implementation defining a so-called standard.
>
Ooops, you are right. But in addition to whether or not it is a standard,
it seemed (and still seems) that "MPI is the current/future stuff", whereas
PVM seemed more like a useful but aging approach. (I am aging too, so maybe
that ain't that good an argument :-).

> > So using papply with Rmpi requires sharper programmers than using
> > snow? Hey, it is good to know I am that smarter. I'll wear that as a
> > badge :-).
>
> You are! I've never been patient enough to use plain Rmpi or rpvm except a
> few times, but for me, the advantage of snow is that you get all the
> backends, not just MPI. In fact, I've heard mention that some folks are
> sticking together a NWS backend as well.

Oh, but except for a few very simple things such as broadcasting data or
functions to all the slaves, or cleaning up, I never use Rmpi directly. I
always use papply, which is, really, a piece of cake. I am just scratching
the surface of this parallelism stuff, and I am sticking to the simple
"embarrassingly parallelizable" problems (cross-validation, bootstrap,
identical analysis on many samples, etc.). So going any deeper into MPI
(individual sends, receives, etc.) was more trouble than it seemed worth.
papply or, alternatively, clusterApplyLB, are all I've (almost ever)
needed/used.

> > Anyway, papply (with Rmpi) is not, in my experience, any harder than
> > snow (with either rpvm or Rmpi). In fact, I find papply a lot simpler
> > than snow (clusterApply and clusterApplyLB). For one thing, debugging
> > is very simple, since papply becomes lapply if no lam universe is
> > booted.
>
> In fact it might be easier, since we never put together decent aggregation
> routines.
>
> (smarter doesn't mean works harder, just more intelligently :-).

I'll take that as a compliment :-).

> > I see, though, that I might want to check PVM just for the sake of the
> > fault tolerance in snowFT.
>
> Fault tolerance is one of those very ill-defined words.
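To make the above concrete, here is a minimal sketch of the pattern I mean, assuming the CRAN "papply" package and a booted LAM/MPI universe (the fold counts, function names, and cluster size are just made-up examples for illustration). The nice part for debugging is that, with no LAM universe booted, papply degrades to a plain lapply:

```r
## A hypothetical "embarrassingly parallel" task: one job per work unit
## (e.g., one cross-validation fold, one bootstrap replicate, one sample).
library(papply)

folds <- as.list(1:10)

one_fold <- function(fold) {
  ## ... fit the model leaving out this fold; placeholder computation ...
  fold^2
}

## With a LAM universe booted, this runs on the slaves via Rmpi;
## without one, papply just calls lapply, so the same code is easy to debug.
results <- papply(folds, one_fold)

## The roughly equivalent snow version, load-balanced across workers:
## library(snow)
## cl <- makeCluster(4, type = "MPI")       # or type = "PVM" with rpvm
## results <- clusterApplyLB(cl, folds, one_fold)
## stopCluster(cl)
```

The choice of clusterApplyLB rather than clusterApply matters only when job times are uneven: the LB variant hands out the next work unit to whichever worker finishes first.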
> Specifically:
>
> #1 - mapping pRNG streams to work units, not just CPUs or dispatch order
> (both of which can differ), for reproducibility
>
> #2 - handling "failure to complete" on worker nodes gracefully.
>
> However, you'd need checkpointing or probably a miracle to handle failure
> on the master...

Aha, I hadn't thought of #1, since I am much more concerned about #2. (For
#1, and to check results, I tend to run things under controlled conditions,
where if a worker shuts down, I'll bring it back to life and start again
---not elegant, but this happens rarely enough that I don't worry too much.)

Right now, I am dealing with #2 via additional external scripts that check
that LAM universes are up, examine log files for signs of failures, modify
LAM host definition files if needed, restart LAM universes, etc., and with
checkpointing in the R code. But I think it is an ugly kludge (and a pain).
I envy the Erlang guys.

As for failure in the master... I'll take that as an act of god, so no
point in trying to defeat it via miracles :-). Actually, the scripts above
could be distributed (the checkpointing is done from the master), so this
is doable via a meta-script that runs distributed. I've just added that to
the "to-do" list.

Best,

R.

> best,
> -tony
>
> [EMAIL PROTECTED]
> Muttenz, Switzerland.
> "Commit early, commit often, and commit in a repository from which we can
> easily roll-back your mistakes" (AJR, 4Jan05).

-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.