Re: Using parallel over several computers

Andy Loftus Tue, 14 Mar 2017 20:37:50 -0700

Anders,
Take a look at the --sqlmaster and --sqlworker options.

I use them to effectively create a job queue that any node can pull tasks
from. I do this for long-running backups on a parallel filesystem (all
nodes have read/write access to the data and the sql joblog file).

1. Create a list of "tasks" and send that to parallel invoked with the
--sqlmaster option.  The sqlmaster option will create the joblog and exit.

2. On any machine that has access to the joblog file AND the data, run
parallel with the --sqlworker option.  As new machines become available, you
can start parallel on them in the same manner.  To stop work on a
particular node, send a TERM signal (the default for kill, not SIGKILL) to
the parallel process on that node; it will stop spawning new jobs and exit
after its running tasks have completed.  Check the man page for your
version, since the signal used for this has differed between releases.

In my case, each "task" is a bash script file, and I list them, one per
line, in a tasklist file, such as:
/path/to/001.cmd
/path/to/002.cmd
...
/path/to/675.cmd
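
A tasklist like the one above could be generated rather than typed by hand.
This is only a rough sketch: the temporary directory and the three placeholder
scripts are made up for illustration, and real .cmd files would contain the
actual work.

```shell
#!/usr/bin/env bash
# Sketch: build a tasklist with one script path per line.
# "$workdir" and the placeholder *.cmd files are illustrative stand-ins.
set -eu

workdir=$(mktemp -d)
tasklist="$workdir/tasklist"

# Create a few placeholder task scripts (real ones would do the backup work).
for n in 001 002 003; do
    printf '#!/bin/bash\necho "task %s"\n' "$n" > "$workdir/$n.cmd"
done

# List the scripts in sorted order, one absolute path per line.
find "$workdir" -maxdepth 1 -type f -name '*.cmd' | sort > "$tasklist"

cat "$tasklist"
```

Sorting keeps the queue order predictable, which makes it easier to match
joblog rows back to script names later.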

The parallel sqlmaster cmdline is then:
parallel -a "/path/to/tasklist" --sqlmaster "$DBURL" bash

The DBURL is now a task queue as well as a joblog.

The parallel sqlworker cmdline is:
parallel --sqlworker "$DBURL"
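
For completeness, here is one way $DBURL might be built when the joblog is an
SQLite file on the shared filesystem. Treat it as a sketch under assumptions:
the file path and table name are hypothetical, and the URL form (slashes in
the path written as %2F, table name appended last) is my reading of the DBURL
syntax, so check it against the man page for your version.

```shell
#!/usr/bin/env bash
# Sketch: derive a DBURL for an SQLite file that every node can reach.
# Assumption: the form is sqlite3:///<path with "/" as %2F>/<table>.
set -eu

# Percent-encode the slashes of a file path for use inside a DBURL.
encode_path() {
    printf '%s' "$1" | sed 's|/|%2F|g'
}

# Hypothetical locations; replace with paths visible to all workers.
dbfile="/shared/backup/jobs.sqlite"
table="backupjobs"

DBURL="sqlite3:///$(encode_path "$dbfile")/$table"
echo "$DBURL"

# With DBURL set, the two command lines from this post would be run as:
#   parallel -a /path/to/tasklist --sqlmaster "$DBURL" bash   # once
#   parallel --sqlworker "$DBURL"                             # on each worker
```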

Some advantages here are:
+ The original (sqlmaster) host does not have to control the parallel
process and keep spawning new tasks on all the workers.
+ The worker nodes can each run at their own width (-j option).  This might
allow you to run a low task count on the worker nodes without interfering
with other users on the node.  You could even stop and restart with
different -j values as needed throughout the day.
+ Worker nodes can be started simply by running parallel on each, and
stopped by sending a TERM signal to the local parallel on that node.
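
Starting and stopping a worker on one node could be wrapped roughly as below.
This is a sketch, not what I actually run: the function names are invented,
"sleep 300" stands in for the real worker so the snippet runs without
parallel installed, and STOP_SIGNAL=TERM is an assumption (SIGKILL cannot be
caught, so a graceful stop needs a catchable signal; use whichever one your
version's man page documents).

```shell
#!/usr/bin/env bash
# Sketch: start a worker in the background, remember its PID, stop it later.
# On a real node the command would be something like:
#   start_worker parallel -j 2 --sqlworker "$DBURL"
set -eu

STOP_SIGNAL=TERM   # assumption; check the man page for your parallel version
worker_pid=""

# Run the given command in the background and record its PID.
start_worker() {
    "$@" &
    worker_pid=$!
}

# Ask the recorded worker to stop, then wait for it to exit.
stop_worker() {
    kill -s "$STOP_SIGNAL" "$worker_pid"
    wait "$worker_pid" 2>/dev/null || true
}

start_worker sleep 300   # harmless stand-in for the parallel worker
stop_worker
echo "worker $worker_pid stopped"
```

To change the width during the day, stop the worker this way and start it
again with a different -j value.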

NOTE: The sql* options have changed very recently, so make sure you
are using the most recent version of parallel.

Hope this is helpful.

Cheers,
--Andy

On Tue, Mar 14, 2017 at 10:56 AM Douglas A. Augusto <[email protected]>
wrote:

> On 14/03/2017 at 10:54,
> Anders Lind <[email protected]> wrote:
>
> > I could perhaps set this up using the ssh functionality of parallel, but I
> > would need to be able to on the fly stop some machines from running jobs,
> > since the computers belong to co-workers who sometimes need their computers
> > for their own work.
>
> Hi Anders,
>
> The following thread may interest you:
>
>    Dynamically changing remote servers list
>    https://lists.nongnu.org/archive/html/parallel/2014-08/msg00012.html
>
> Based on that, at the time I made a shell script that keeps parallel's
> sshloginfile updated by filtering out unreachable remote servers and also
> allowing the user to edit (include and/or exclude remote servers)
> on-the-fly:
>
>    https://github.com/daaugusto/gnuparallel
>
> PS: It worked with older versions of GNU Parallel (I haven't tested it with
> more recent ones yet), so your mileage may vary.
>
> --
> Douglas A. Augusto
>
