All - we have a few clusters in our environment and for some of them
there is a need to manage cron jobs running on the active node of the
cluster.

Are there any standard methods for managing these? The clusters I'm
looking at are Oracle/Sun Cluster based, but I'd imagine VCS-based ones
have similar problems.

On some of our systems the crontabs themselves are huge and contain
many entries. They are copied from the shared application filesystem
directly into /var/spool/cron/crontabs/appuser when a dummy cron
resource starts up and removed when it is shut down.

For others we have fewer crontab entries, and I was thinking of having
a simple shell script shim execute the entries from the application
user's system crontab on all cluster nodes iff the application
filesystem is mounted, but otherwise leave them alone[1].

Ideally, I think it would be far better to run an independent instance
of the cron process as the application user, solely so that it and all
processes it spawns can be managed as a fully-fledged cluster
resource - if the cluster needed to evacuate in a hurry, the cluster
management could kill the entire process group by nuking the user-mode
cron process ...
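To make that concrete, here's a rough sketch of what the START/STOP
methods for such a resource might look like. This is purely
illustrative: the pid file path and function names are made up,
setsid(1) is a Linux-ism (Solaris would need a small wrapper around
setsid(2)), and the `while` loop merely stands in for a real per-user
cron:

```shell
#!/bin/sh
# Hypothetical sketch: run a per-user cron-like supervisor in its own
# session, so its PID equals its process group ID and the STOP method
# can tear down the supervisor and everything it spawned in one go.

PIDFILE=${PIDFILE:-/tmp/usercron.pid}

start_usercron() {
    # setsid puts the supervisor in a new session; every process it
    # later spawns inherits that process group by default. The loop
    # here stands in for an actual cron instance.
    setsid sh -c 'while :; do sleep 5; done' &
    echo $! > "$PIDFILE"
}

stop_usercron() {
    pid=$(cat "$PIDFILE" 2>/dev/null) || return 0
    # A negative PID argument to kill(1) signals the whole process
    # group, not just the leader.
    kill -TERM -- "-$pid" 2>/dev/null
    rm -f "$PIDFILE"
}
```

The group-kill in the STOP method is the point of the exercise: the
cluster doesn't need to know anything about individual jobs, only the
group leader's pid.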

I've never heard of cron being used in this way though - does anything
like this exist?

I'm thinking here of situations where the system cron has started
processes on behalf of a cluster-managed app. When the cluster decides
it needs to fail a node and evacuate to the other one, what happens to
in-flight cron-initiated processes? What if something cron spawned
holds the app filesystem open and prevents it from being unmounted and
exported cleanly, for example? I'm sure there are other horrible cases.

Granted, most cron tasks are short-lived; however, cron is commonly
used to manage long-running services[2] - cron starts a process whose
first task is to determine whether another copy is already running. If
there is, it exits immediately; otherwise it forks and remains
running. But even short-lived tasks might throw a wrench into the
works if they happen to run at the wrong time.
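The pattern I'm describing usually looks something like the sketch
below (lock path and service body are illustrative; mkdir is used
because it's an atomic test-and-set in plain sh):

```shell
#!/bin/sh
# Sketch of the long-running-service-from-cron pattern: each cron
# invocation checks for a running copy and exits immediately if it
# finds one, otherwise it becomes the running instance.

LOCK=${LOCK:-/tmp/myservice.lock}

run_singleton() {
    if ! mkdir "$LOCK" 2>/dev/null; then
        # another copy already holds the lock - exit quietly,
        # as cron expects
        return 0
    fi
    # drop the lock when this instance finally exits
    trap 'rmdir "$LOCK" 2>/dev/null' EXIT
    # ... the long-running service body goes here ...
    "$@"
}
```

Note that a crash leaves a stale lock behind, which is one of the
reasons this pattern interacts so badly with cluster failover.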

If this task were not managed by the cluster resource scripts, or a
child of something that was, how would the cluster know which
processes to kill, beyond nuking everything owned by the user and
hoping for the best?

... what would Nathan do?

Regards,
Malcolm

[1] for example, in $HOME/bin/shim.sh:

|#!/bin/sh
|# exit quietly unless the target exists and is executable
|# (i.e. the application filesystem is mounted on this node)
|[ -x "$1" ] || exit 0
|exec "$@"

then the crontab entries would be modified as follows:

|# do not modify - the original is in /apps/cluster/etc/crontab
|15 3 * * * $HOME/bin/shim.sh /apps/cluster/bin/foo some thing

In this case, the crontabs would be kept in sync by having the resource
management scripts copy the current crontab from the shared filesystem
into /var/spool/cron/crontabs/appuser each time the resource is started[3]
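That sync step might look like the fragment below. It's a guess at
what the START method would do, not working code: the master path is
from the example above, and using crontab(1) rather than copying into
the spool directly - so cron notices the change without a restart -
plus running it as appuser, are my assumptions:

```shell
#!/bin/sh
# Hypothetical START-method fragment: install the master crontab from
# the shared filesystem each time the resource comes online.

sync_crontab() {
    master=$1
    # do nothing if the shared filesystem isn't mounted here
    [ -f "$master" ] || return 0
    # crontab(1) installs the file properly, so cron re-reads it
    crontab "$master"
}

# e.g. in the START method:
# sync_crontab /apps/cluster/etc/crontab
```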

[2] yes, this would be a mad way to do process management in a clustered
environment, don't do that, make it a proper cluster resource or
smf-managed service already. I know.

[3] Some of the reading I've done indicates there is something that
already exists for Oracle/Sun Cluster to keep arbitrary files in sync -
clfilesync. I haven't found it on my systems yet, but will look into
it further; even so, it doesn't address the process group or in-flight
issues above.

-- 
Malcolm Herbert
[email protected]


_______________________________________________
msosug mailing list
[email protected]
http://mexico.purplecow.org/m/listinfo/msosug
