Lengyel, Florian wrote:
This needed editing... take two:
These seem like good questions to me. I would like to know
if there is something for software analogous to the
domain naming service for URLs--a "Software Naming Service."
Does such a thing exist?
Hello,
I'd say that there are many approaches for software deployment out there
and that the best one depends on your application (see below) and scale
(number of execution environments, users, and versions that you care
about). The "software naming service" is likely to depend greatly on the
preferred deployment approach, which is probably why there is no such
service in Globus. Another reason is that Grid hardware resources can be
used by very independent communities who might actually be less than
happy with artificial dependencies created using some grand unified
package management scheme.
One approach is to not install anything up-front, just ship all your
statically compiled executables and data along with your jobs. Just do
it and be glad if you can. It's not always viable because of two main
issues: 1) waste of bandwidth, especially if you have noticeable amounts
of job-independent "master data" (some caching schemes might help), and 2) it
does not work at all with MPI-based parallel applications, which must be
compiled and linked on site against the local MPI library to execute at
all or to achieve good performance.
Another approach is to install and build your software in each cluster
more or less manually (using typical SCM tools like CVS/Subversion).
Once you manage to get your software deployed "everywhere", just keep a
list of configured sites and consult it when submitting jobs (this is
easy to automate if you use something like Condor-G for job submission
and match-making). In other words, create your own directory service
with the kinds of information that you need. Better yet, run some test
jobs regularly that check that your working configuration has not been
broken externally (by the cluster's admin). You might think that if
everyone deployed their own directory services, some terrible redundancy
and waste of effort would result. From my experience, constructing such
a directory service is easy compared to getting non-trivial software
successfully built at different target sites. Unfortunately, the
application software that needs building tends to be community-specific,
so the outlook for saving effort by doing something across communities
is not good.
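
To make the "own directory service" idea concrete, here is a minimal
sketch in Python. It is only an illustration under my own assumptions:
the Site record, find_sites(), record_test_result(), and all site and
package names are hypothetical, and nothing here is part of Globus or
Condor-G. The directory is just a list of sites with the software
versions deployed there, plus a flag that a periodic test job would
update.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One entry in a hand-maintained directory of configured sites."""
    name: str
    # Maps a software name to the list of versions deployed at this site.
    software: dict = field(default_factory=dict)
    # Set by a periodic test job; False means the deployment looks broken.
    last_test_ok: bool = True


def find_sites(directory, requirements):
    """Return names of sites that passed their last test job and have
    every required (software, version) pair deployed."""
    matches = []
    for site in directory:
        if not site.last_test_ok:
            continue
        if all(version in site.software.get(name, ())
               for name, version in requirements.items()):
            matches.append(site.name)
    return matches


def record_test_result(directory, site_name, ok):
    """Store the outcome of a periodic test job for one site."""
    for site in directory:
        if site.name == site_name:
            site.last_test_ok = ok
            return
    raise KeyError(site_name)


# Hypothetical example data; the site and package names are made up.
directory = [
    Site("cluster-a", {"myapp": ["1.0", "1.1"], "mpich": ["1.2.7"]}),
    Site("cluster-b", {"myapp": ["1.1"]}),
    Site("cluster-c", {"myapp": ["1.0"], "mpich": ["1.2.7"]}),
]
```

Before submitting a job you would call something like
find_sites(directory, {"myapp": "1.1"}) and pass the result on to
whatever does your job submission. A failed test job calls
record_test_result(directory, "cluster-a", False), and that site drops
out of later matches until it is fixed.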
Yet another approach is to maintain (ideally) just a single image of a
complete system and to rely on the ability of virtual machines (such as
User-Mode Linux or VMware) to execute anywhere. This has very similar
problems to the first approach mentioned above. Virtual machines may
abstract away too much of the hardware and network for good performance,
and their images may be too bulky.
If you have administrative power over your computing resources or decide
to use the VM solution, you could enforce a certain degree of
homogeneity and rely on a Linux package manager as your software
distribution mechanism. Package managers (and Unix file system layouts)
are not optimized toward maintaining multiple versions of the same
software on a single system, though. They also require some expertise to
build packages and to correctly describe their dependencies. Most people
who know how to get application software configured and compiled don't
have this sort of expertise. Furthermore, package managers invariably
make the assumption of being used by "the administrator", while in a
Grid setting each community will have its own set of "administrators".
If you prefer a package management system designed from scratch for Grid
clusters, maybe http://www.cmtsite.org/ will be interesting. I swear
there was another similar solution, but my attempt to find it again
today failed. Maybe someone else can give you some pointers.
Each of these tools has query features. Now try querying what software
is installed on a cluster, or a grid. What tools would you use? They don't
seem to exist, or if they do, they haven't made it very far in Google's
page ranking.
There seems to be no Software Naming Service that could
be queried and used, comparable to the Domain Naming Service.
The point is, if you installed the software yourself, you know where it
is and don't really need a Grid-wide service to find it. On the other
hand, if you didn't install the software, then it is either system
software, so again you don't need to query for it, or you can't trust
the installer to have done it in the exact way you would need it and to
keep it so over time. If you did trust the installer, she would have
told you how to find and use this software (e.g., what sort of
specifications to include in your jobs to make sure they execute
properly).
Perhaps you are suggesting a scenario where a stranger would like to
"browse" Grid resources to see what is installed where to select
potential job destinations. Based on my experience, this sort of
activity only makes sense within a community, which may very well have
some sort of directory services, but not at the level of a whole Grid
used by different communities.
While I'm on the subject of tools for the end user, what about
a shell that abstracts commands that you do from a workstation to the
grid level? Something that might be called the "gshell."
I'm pretty sure I've heard one talk about something like that being
implemented in the development version of gLite, but again, Google fails
to locate it.
Where is the grid equivalent of the path? Of ls?
Or for someone who wants to run a job on some collection of clusters,
but needs certain libraries, which may be installed on different
machines out there, somewhere. Is there a grid equivalent of
ldconfig? Or even of something deprecated, like LD_LIBRARY_PATH?
You might be mixing up two different roles:
1) People who deploy software
2) People who run jobs
The way I see it working is: deployers do their thing "somehow" and
deliver user interfaces and instructions for people who run jobs. People
who run jobs operate on the level of abstraction relevant to their task
(which input data should be processed, which application configuration
parameters should be used, what to do with the output). The selection of
suitable target clusters shouldn't bother end users (it often does, for
prosaic reasons such as the amount of free local disk space, local
availability of data, and unscheduled downtimes).
What if end users actually want to tweak and recompile application code?
Then the deployer's task should be to provide a reasonable user
interface for doing distributed builds and on-site testing. Once again,
the application programmers don't need to see much difference from when
they were programming for their local cluster or machine.
While I appreciate that the Globus Toolkit is intended to solve
recurrent middleware problems, where is it being used to address
the most recurring problem of all: getting users to use it?
As you wrote, Globus Toolkit is middleware. The way I see it, the
immediate users of Grid middleware are Grid application developers. The
actual end users shouldn't need to know much about this sort of thing,
just like they shouldn't need to know about Unix administration. The
real world might look different, but it's hardly a problem of Globus
Toolkit.
Regards,
Jan Ploski