Lengyel, Florian wrote:

This needed editing... take two:

These seem like good questions to me. I would like to know
if there is something for software analogous to the
domain naming service for URLs--a "Software Naming Service."
Does such a thing exist?

Hello,

I'd say that there are many approaches for software deployment out there and that the best one depends on your application (see below) and scale (number of execution environments, users, and versions that you care about). The "software naming service" is likely to depend greatly on the preferred deployment approach, which is probably why there is no such service in Globus. Another reason is that Grid hardware resources can be used by very independent communities who might actually be less than happy with artificial dependencies created using some grand unified package management scheme.

One approach is to not install anything up-front, just ship all your statically compiled executables and data along with your jobs. Just do it and be glad if you can. It's not always viable because of two main issues: 1) waste of bandwidth, especially if you have noticeable amounts of job-independent "master data" (some caching schemes might help) 2) it does not work at all with MPI-based parallel applications, which must be compiled and linked on site against the local MPI library to execute at all or to achieve good performance.
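As a rough illustration of the caching idea, here is a minimal Python sketch that keys staged "master data" by its SHA-256 digest and skips the transfer when a copy with the same digest is already cached. The function names, the cache layout, and the use of a local file copy in place of a real network transfer are all assumptions made for illustration; this is not a Globus feature.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def stage(source, cache_dir):
    """Copy source into cache_dir unless a file with the same digest is already there.

    Returns (cached_path, transferred); transferred tells whether a copy was made.
    Hypothetical helper: the local copy stands in for a real network transfer.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / sha256_of(source)
    if cached.exists():
        return cached, False
    shutil.copyfile(source, cached)
    return cached, True


with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "master.dat"
    master.write_bytes(b"job-independent reference data")
    cache = Path(tmp) / "site-cache"

    _, first = stage(master, cache)   # first job: data is copied
    _, second = stage(master, cache)  # later job: cache hit, nothing copied
    print(first, second)
```

In a real setup the digest would be computed on the submitting side and checked at the site before any data moves; the sketch keeps both ends local so that it runs anywhere.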

Another approach is to install and build your software in each cluster more or less manually (using typical SCM tools like CVS/Subversion). Once you manage to get your software deployed "everywhere", just keep a list of configured sites and consult it when submitting jobs (this is easy to automate if you use something like Condor-G for job submission and match-making). In other words, create your own directory service with the kinds of information that you need. Better yet, regularly run some test jobs that check that your working configuration has not been broken externally (by the cluster's admin). You might think that if everyone deployed their own directory services, some terrible redundancy and waste of effort would result. In my experience, constructing such a directory service is easy compared to getting non-trivial software successfully built at different target sites. Unfortunately, the application software that needs building tends to be community-specific, so the outlook for saving effort by doing something across communities is not good.
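To make the "list of configured sites" idea concrete, here is a minimal Python sketch of such a home-grown directory: each entry records what is installed at a site and whether the latest test job passed, and a lookup returns the sites that satisfy a job's requirements. The site names, package names, versions, and function names are all made up for illustration, and the sketch does not use Condor-G; in practice the matching step could be delegated to Condor-G's match-making.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One entry in a community-maintained site list (all fields hypothetical)."""
    name: str
    software: dict = field(default_factory=dict)  # package name -> installed version
    healthy: bool = True  # updated from the periodic test job


def record_test_result(sites, name, passed):
    """Store the outcome of the latest test job for the named site."""
    for site in sites:
        if site.name == name:
            site.healthy = passed


def candidate_sites(sites, requirements):
    """Return names of healthy sites that have every required package at the required version."""
    return [
        site.name
        for site in sites
        if site.healthy
        and all(site.software.get(pkg) == ver for pkg, ver in requirements.items())
    ]


sites = [
    Site("cluster-a", {"myapp": "1.2", "mpich": "1.2.7"}),
    Site("cluster-b", {"myapp": "1.2"}),
    Site("cluster-c", {"myapp": "1.1"}),
]

# A test job on cluster-b failed, e.g. after the local admin changed something.
record_test_result(sites, "cluster-b", passed=False)

print(candidate_sites(sites, {"myapp": "1.2"}))
```

With the data above, only cluster-a is offered for a job that needs myapp 1.2: cluster-b is excluded by its failed test job and cluster-c by its older version.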

Yet another approach is to maintain hopefully just a single image of a complete system and to rely on the ability of virtual machines (such as User-Mode Linux (UML) or VMware) to execute anywhere. This has very similar problems to the first-mentioned approach. Virtual machines may abstract away too much of the hardware and network for good performance, and their images may be too bulky.

If you have administrative power over your computing resources or decide to use the VM solution, you could enforce a certain degree of homogeneity and rely on a Linux package manager as your software distribution mechanism. Package managers (and Unix file system layouts) are not optimized toward maintaining multiple versions of the same software on a single system, though. They also require some expertise to build packages and to correctly describe their dependencies. Most people who know how to get application software configured and compiled don't have this sort of expertise. Furthermore, package managers invariably make the assumption of being used by "the administrator", while in a Grid setting each community will have its own set of "administrators".

If you prefer a package management system designed from scratch for Grid clusters, maybe http://www.cmtsite.org/ will be interesting. I swear there was another similar solution, but my attempt to find it again today failed. Maybe someone else can give you some pointers.

Each of these tools has query features. Now try querying what software
is installed on a cluster, or a grid.  What tools would you use? They don't
seem to exist, or if they do, they haven't made it very far in Google's page ranking.
There seems to be no Software Naming Service that could
be queried and used, comparable to the Domain Naming Service.

The point is, if you installed the software yourself, you know where it is and don't really need a Grid-wide service to find it. On the other hand, if you didn't install the software, then it is either system software, so again you don't need to query for it, or you can't trust the installer to have done it in the exact way you would need it and to keep it so over time. In case you trusted the installer, you would have been informed by her how to find and use this software (e.g., what sort of specifications to include in your jobs to make sure they execute properly).

Perhaps you are suggesting a scenario where a stranger would like to "browse" Grid resources to see what is installed where to select potential job destinations. Based on my experience, this sort of activity only makes sense within a community, which may very well have some sort of directory services, but not at the level of a whole Grid used by different communities.

While I'm on the subject of tools for the end user, what about
a shell that abstracts commands that you do from a workstation to the
grid level? Something that might be called the "gshell."

I'm pretty sure I've heard a talk about something like that being implemented in the development version of gLite, but again, Google fails to locate it.

Where is the grid equivalent of the path? Of ls?
Or for someone who wants to run a job on some collection of clusters,
but needs certain libraries, which may be installed on different
machines out there, somewhere. Is there a grid equivalent of ldconfig? Or even of something deprecated, like LD_LIBRARY_PATH?

You might be mixing up two different roles:
1) People who deploy software
2) People who run jobs

The way I see it working is: deployers do their thing "somehow" and deliver user interfaces and instructions for people who run jobs. People who run jobs operate on the level of abstraction relevant to their task (which input data should be processed, which application configuration parameters should be used, what to do with the output). The selection of suitable target clusters shouldn't bother end users (though it often does, for prosaic reasons such as the amount of free local disk space, local availability of data, and unscheduled downtimes).

What if end users actually want to tweak and recompile application code? Then the deployer's task should be to provide a reasonable user interface for doing distributed builds and on-site testing. Once again, the application programmers don't need to see much difference from when they were programming for their local cluster or machine.

While I appreciate that the globus toolkit is intended to solve
recurrent middleware problems, where is it being used to address
the most recurring problem of all: getting users to use it?

As you wrote, Globus Toolkit is middleware. The way I see it, the immediate users of Grid middleware are Grid application developers. The actual end users shouldn't need to know much about this sort of thing, just like they shouldn't need to know about Unix administration. The real world might look different, but it's hardly a problem of Globus Toolkit.

Regards,
Jan Ploski
