Regarding software configuration (not so much data mining), two useful
tools for automating path/library finding for cluster admins
are softenv and modules. The NMI Build&Test software, Metronome, as
used by TeraGrid to build software, has wound up publishing softenv
keys in condor classads so that builds are routed to machines with a)
the right set of software and b) the right set of LD_LIBRARY_PATH/PATH/
ANT_HOME/whatever setup to correctly use that software.
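
To make the routing idea concrete, here is a minimal sketch in plain Python, not actual ClassAd syntax. It assumes each machine advertises a set of softenv-style keys and each build lists the keys it requires. The machine names and key strings are made up for illustration; in practice Condor's own matchmaking would do this work.

```python
# Hypothetical data: machine names and key strings are invented for this sketch.
def machines_for(required_keys, advertised):
    """Return the machines whose advertised key set covers every required key."""
    required = set(required_keys)
    return sorted(name for name, keys in advertised.items()
                  if required <= set(keys))

advertised = {
    "node-a": {"+java-1.5", "+ant-1.7", "+gcc-4.1"},
    "node-b": {"+gcc-4.1"},
    "node-c": {"+java-1.5", "+ant-1.7"},
}

# A build that needs both Java and Ant is only eligible for node-a and node-c.
matches = machines_for(["+java-1.5", "+ant-1.7"], advertised)
print(matches)
```

The subset test is the whole trick: a build is eligible for a machine only if everything it asks for is advertised there.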
The real trick is getting anyone to agree on a syntax that would let
you find the software you want. In the absence of an existing
standard for software metadata, site admins basically have to agree on
a naming scheme for common software. The benefits are pretty good,
though, even down at the user level. From their perspective, they can
take a .soft file from one TG cluster to another and get the same
software environment without knowing the particulars about /usr/local
vs. /opt vs. /soft/apps or whatever. It's easy enough to leverage
that up at the MDS/GRAM level too.
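
A rough sketch of why the same key list is portable, assuming each site keeps its own table from agreed key names to local settings. The site names, the "+ant" key and the table layout are hypothetical, and this is not how softenv itself stores its database.

```python
# Hypothetical per-site tables: same key, different install locations.
SITE_DB = {
    "site-one": {"+ant": {"ANT_HOME": "/usr/local/ant",
                          "PATH": "/usr/local/ant/bin"}},
    "site-two": {"+ant": {"ANT_HOME": "/soft/apps/ant",
                          "PATH": "/soft/apps/ant/bin"}},
}

# Variables that are prepended to rather than overwritten.
PATH_LIKE = {"PATH", "LD_LIBRARY_PATH"}

def build_env(site, keys, base_env):
    """Apply the site's settings for each key on top of a copy of base_env."""
    env = dict(base_env)
    for key in keys:
        for var, value in SITE_DB[site][key].items():
            if var in PATH_LIKE and env.get(var):
                env[var] = value + ":" + env[var]
            else:
                env[var] = value
    return env

# The user's key list stays the same; only the resolved paths differ.
user_keys = ["+ant"]
env_one = build_env("site-one", user_keys, {"PATH": "/bin"})
env_two = build_env("site-two", user_keys, {"PATH": "/bin"})
```

The user carries only `user_keys` from cluster to cluster; the knowledge of /usr/local vs. /soft/apps stays with each site's admins.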
Charles
On Apr 6, 2008, at 7:26 PM, Lengyel, Florian wrote:
-----Original Message-----
From: Jan Ploski [mailto:[EMAIL PROTECTED]
Sent: Fri 4/4/2008 9:11 PM
Hello,
"Another approach is to install and build your software in each cluster
more or less manually (using typical SCM tools like CVS/Subversion).
Once you manage to get your software deployed "everywhere", just keep a
list of configured sites and consult it when submitting jobs (this is
easy to automate if you use something like Condor-G for job submission
and match-making). In other words, create your own directory service
with the kinds of information that you need. Better yet, run some test
jobs regularly that check that your working configuration has not been
broken externally (by the cluster's admin).
VOs typically implement VO tests...
You might think that if everyone deployed their own directory services,
some terrible redundancy and waste of effort would result. From my
experience, constructing such a directory service is easy compared to
getting non-trivial software successfully built at different target
sites. Unfortunately, the application software that needs building tends
to be community-specific, so the outlook for saving effort by doing
something across communities is not good."
I had the VO in mind as the organizational unit.
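
The site list with regular test jobs described in the quoted message above could look roughly like the following sketch. The record fields and function names are my own assumptions, and a real setup would feed the test results in from whatever submits the jobs (Condor-G, for example).

```python
from dataclasses import dataclass

@dataclass
class SiteRecord:
    name: str
    software: dict              # package name -> install path at that site
    last_test_ok: bool = False  # outcome of the most recent test job

def usable_sites(directory, needed):
    """Sites that have every needed package and passed their last test job."""
    return [s.name for s in directory
            if s.last_test_ok and all(pkg in s.software for pkg in needed)]

def record_test(directory, site_name, ok):
    """Store the outcome of a test job for the named site."""
    for s in directory:
        if s.name == site_name:
            s.last_test_ok = ok
            return
    raise KeyError(site_name)

# Hypothetical sites and paths, for illustration only.
directory = [
    SiteRecord("cluster-x", {"myapp": "/home/vo/myapp"}),
    SiteRecord("cluster-y", {"myapp": "/opt/vo/myapp"}),
]
record_test(directory, "cluster-x", True)
record_test(directory, "cluster-y", False)  # e.g. broken by an admin change
print(usable_sites(directory, ["myapp"]))
```

A site drops out of the submit list as soon as its test job fails, which is the point of running the tests regularly.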
...
"The point is, if you installed the software yourself, you know where it
is and don't really need a Grid-wide service to find it. On the other
hand, if you didn't install the software, then it is either system
software, so again you don't need to query for it, or you can't trust
the installer to have done it in the exact way you would need it and to
keep it so over time. In case you trusted the installer, you would have
been informed by her how to find and use this software (e.g., what sort
of specifications to include in your jobs to make sure they execute
properly)."
This should not be a matter of trust, and it looks like an opportunity
for automation. When I was consulting at Merrill Lynch, I had to
determine which library and header files were used to build various
programs. In cases where I had a makefile, I redefined the CC macro to
point to a Perl program I had written, called "woe" for "where on
earth." This generated a report of the locations of the header and
library files that would have been used by the compiler and linker if I
were building the project.
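
For illustration only, here is a rough Python sketch of that kind of CC stand-in. It is not the original Perl program, just a guess at the general idea. It looks only at -I, -L and -l arguments and at #include lines in the named source files, and it ignores the compiler's built-in search directories.

```python
import os
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')

def parse_args(argv):
    """Split a compiler command line into include dirs, lib dirs, libs, sources."""
    inc, libdirs, libs, sources = [], [], [], []
    it = iter(argv)
    for arg in it:
        for flag, bucket in (("-I", inc), ("-L", libdirs), ("-l", libs)):
            if arg.startswith(flag):
                # Accept both "-Idir" and "-I dir".
                value = arg[2:] or next(it, "")
                if value:
                    bucket.append(value)
                break
        else:
            if arg.endswith((".c", ".cc", ".cpp")):
                sources.append(arg)
    return inc, libdirs, libs, sources

def first_hit(dirs, names):
    """Return the first existing file among names, searching dirs in order."""
    for d in dirs:
        for n in names:
            path = os.path.join(d, n)
            if os.path.isfile(path):
                return path
    return None

def report(argv):
    """Map each included header and each -l library to where it was found."""
    inc, libdirs, libs, sources = parse_args(argv)
    found = {}
    for src in sources:
        src_dir = os.path.dirname(src) or "."
        with open(src) as fh:
            for line in fh:
                m = INCLUDE_RE.match(line)
                if m:
                    found[m.group(1)] = first_hit([src_dir] + inc, [m.group(1)])
    for lib in libs:
        # Within each directory, look for the shared library before the archive.
        found["-l" + lib] = first_hit(libdirs, ["lib%s.so" % lib, "lib%s.a" % lib])
    return found
```

Wrapped in a small script that passes its command-line arguments to `report` and prints the result, this could be substituted for the compiler with something like `make CC=...`, so that each compile line produces a location report instead of an object file. A value of `None` means the file was not found in the directories given on the command line.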
Of course operating systems attempt to locate needed libraries and
programs at runtime. This is at the level of the workstation. Asking
your local sys admin will not scale to the level of the grid (or the
virtual organization).
> While I'm on the subject of tools for the end user, what about
> a shell that abstracts commands that you do from a workstation to the
> grid level? Something that might be called the "gshell."
I'm pretty sure I've heard one talk about something like that being
implemented in the development version of gLite, but again, Google
fails to locate it.
...
Regards,
Jan Ploski
Thanks for this. There is something about a grid shell project on
the dev.globus wiki at
http://dev.globus.org/wiki/Project_Ideas#Programming_Tools
Speaking of interacting with the grid using familiar OS shell
commands, I might want something like a /dev/grid-disk, which I might
install as a kernel module using insmod...
FL