Hi Diego

I (still) have Torque/PBS version 4.something on our old clusters.
[Most people have already switched to Slurm at this point.]

Torque/PBS comes with a tool named "pbsdsh" (for PBS distributed shell):

http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/commands/pbsdsh.htm
https://wikis.nyu.edu/display/NYUHPC/PBSDSH
http://www.ep.ph.bham.ac.uk/general/support/torquepbsdsh.html

"pbsdsh" is able to launch *serial* jobs across the nodes that you allocate for 
the job.
This allows you to run so called "embarrassingly parallel" tasks:

https://en.wikipedia.org/wiki/Embarrassingly_parallel

Embarrassingly parallel tasks (or programs) are those whose processes don't
need to communicate with each other (and therefore can run without MPI).

The link above lists some embarrassingly parallel tasks.
A trivial example would be to transliterate all uppercase letters
to lowercase in a large number of text files, i.e. to run the "tr" command
below

tr '[:upper:]' '[:lower:]' < input${i}.txt > output${i}.txt

where the file name index ${i} would be distributed across the various
processes.
"pbsdsh" can do this without MPI, because no processor ever needs to
communicate with another processor to perform its share of the work.
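
As a concrete illustration, here is a minimal (untested) sketch of a Torque
job script that uses pbsdsh for the "tr" example above.
The wrapper name "tr_wrapper.sh" and the input/output file naming are made up
for illustration, and you should check which environment variables your
Torque version sets in pbsdsh-spawned tasks (PBS_VNODENUM is the one commonly
used as a task index).

Job script:

  #!/bin/bash
  #PBS -N tr_lowercase
  #PBS -l nodes=2:ppn=4
  # launch one copy of the wrapper on every allocated core
  pbsdsh $PBS_O_WORKDIR/tr_wrapper.sh

tr_wrapper.sh (make it executable with "chmod +x" first):

  #!/bin/bash
  # each pbsdsh task sees a distinct PBS_VNODENUM (0, 1, 2, ...),
  # which we use here as the file index ${i}
  cd $PBS_O_WORKDIR
  i=$PBS_VNODENUM
  tr '[:upper:]' '[:lower:]' < input${i}.txt > output${i}.txt

If PBS_O_WORKDIR is not propagated to the spawned tasks on your system,
hard-code the working directory in the wrapper instead.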

However, pbsdsh is by no means unique.
There are other tools, independent of PBS, which can do the same as pbsdsh,
if not more.
"Pdsh" is one of them (probably the most versatile and popular):

https://github.com/chaos/pdsh
https://linux.die.net/man/1/pdsh
http://www.linux-magazine.com/Issues/2014/166/Parallel-Shells
https://www.rittmanmead.com/blog/2014/12/linux-cluster-sysadmin-parallel-command-execution-with-pdsh/


[Some examples in the links above are "parallel system administration tasks", 
some are user-level tasks.]
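
For instance, a couple of typical pdsh one-liners (the host names
node[01-04] are placeholders for your own nodes):

  # run "uptime" on four nodes over ssh
  pdsh -R ssh -w node[01-04] uptime

  # run a command on all nodes and fold identical output together
  pdsh -R ssh -w node[01-04] cat /proc/loadavg | dshbak -c

(dshbak ships with pdsh and groups hosts that produced the same output.)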

***

*However ... not all parallelizable problems are embarrassingly parallel!!!*
Actually, most are not embarrassingly parallel.
A very large class of common problems in science and engineering
is solving partial differential equations (PDEs) using finite differences (FD)
or similar methods
(finite elements, finite volumes, pseudo-spectral, etc.) through domain
decomposition.
This is a typical example of a problem that is parallelizable, but not
embarrassingly parallel.
When solving PDEs with FD through domain decomposition, you have to exchange
the data on the sub-domain halos across the processors in charge of solving the
equation on each sub-domain. This requires communication across the processors,
something that pbsdsh or pdsh cannot do, but MPI can (and so could the
predecessors of MPI: p4, PVM, etc.).
This class of problems includes most of computational fluid dynamics,
structural mechanics,
weather forecasting, climate/ocean/atmosphere modeling, geodynamics, etc.
Many problem-solving methods in molecular dynamics and computational chemistry
are not embarrassingly parallel either.
There are many other classes of parallelizable problems that are not 
embarrassingly parallel.
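
For what it's worth, launching such an MPI program under Torque/PBS is
simple, because Open MPI built with Torque (tm) support reads the node
allocation from the job itself, so no hostfile is needed.
A minimal sketch (the solver name "my_pde_solver" is hypothetical):

  #!/bin/bash
  #PBS -N pde_solver
  #PBS -l nodes=4:ppn=8
  cd $PBS_O_WORKDIR
  # Open MPI with tm support picks up the allocated nodes from Torque,
  # so there is no need for a hostfile or machinefile here
  mpirun -np 32 ./my_pde_solver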

Ian Foster's book, although somewhat out of date in several aspects,
still provides some background on this:

https://pdfs.semanticscholar.org/09ed/7308fdfb0b640077328aa4fd10ce429f511a.pdf

[Can anybody on the list suggest a recent book with this type of
comprehensive approach to parallel programming, please?
The ones I know are restricted to MPI, or OpenMP, and so on.]

Do you know what type of problem you're trying to solve,
and whether it is embarrassingly parallel or not?

I hope this helps,
Gus Correa


> On Aug 25, 2018, at 03:06, John Hearns via users <users@lists.open-mpi.org> 
> wrote:
> 
> Diego,
> I am sorry, but you are mixing different things here. PBS is a resource allocation 
> system. It will reserve the use of a compute server, or several compute 
> servers, for you to run your parallel job on. PBS can launch the MPI job - 
> there are several mechanisms for launching parallel jobs.
> MPI is an API for parallel programming. I would rather say a library, but if 
> I'm not wrong MPI is a standard for parallel programming and is technically 
> an API.
> 
> One piece of advice I would have is that you can run MPI programs from the 
> command line. So Google for 'Hello World MPI'. Write your first MPI program 
> then use mpirun from the command line.
> 
> If you have a cluster which has the PBS batch system you can then use PBS to 
> run your MPI program.
> If that is not clear, please let us know what help you need.
> 
> 
> On Sat, 25 Aug 2018 at 06:54, Diego Avesani <diego.aves...@gmail.com> wrote:
> Dear all,
> 
> I have a philosophical question.
> 
> I am reading a lot of papers where people use the Portable Batch System or a 
> job scheduler in order to parallelize their code.
> 
> What are the advantages in using MPI instead?
> 
> I am writing a report on my code, where of course I use Open MPI. So please 
> tell me how I can cite you. You deserve all the credit.
> 
> Thanks a lot,
> Thanks again,
> 
> 
> Diego
> 

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
