High-throughput computing (HTC) resources are available from national
cyberinfrastructures such as the Open Science Grid and NSF XSEDE. HTC
resources are suitable for running serial or embarrassingly parallel
user jobs. Unlike high-performance computing (HPC) resources, an HTC
environment is more distributed, loosely coupled, and managed
individually by various resource contributors. To its users, an HTC
environment presents a virtual pool of dynamically aggregated resources
that can be accessed through unified client software (e.g.,
OSG/VDT/Condor).
Integrating HTC capabilities into Airavata is important so that users can
access HTC resources as seamlessly as they access the other kinds of
computing environments supported by Airavata. At first glance, the
integration may seem straightforward: add middleware support (e.g.,
Condor or BOSCO) to Airavata. However, I am proposing a user-oriented
approach to the integration in order to fully leverage the capabilities
of HTC client software.
Ideally, an Airavata user does not care about the underlying middleware
when composing a job. What the user cares about is the computational
capability provided by the underlying resources. An HTC environment, with
support from the Condor middleware, is desirable for running:
- large batch jobs
- parameter-sweeping jobs
- stochastic jobs with the same configuration but requiring a large
number of repeated runs in order to obtain statistically reliable results
- workflow jobs that can be represented as a DAG (directed acyclic graph)
Therefore, instead of presenting a raw Condor interface to Airavata
users, interfaces tailored to the job types above will be more
useful. Technically, the Condor submit script syntax supports all of the
described jobs through job macros and DAG support. If Airavata can
bridge user job requirements and the composition of the technical Condor
submission script, HTC resources can be represented to and used by the
Airavata community more effectively.
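To make the bridging idea concrete, here is a minimal Python sketch of how
user-level job descriptions might be turned into Condor input files. It is
not Airavata code: the function names, file names, and the --seed flag are
hypothetical and only illustrate the mapping. It relies on the $(Process)
macro and the queue statement of the Condor submit syntax, and on the
JOB and PARENT ... CHILD lines of a DAGMan input file, which is where a DAG
is described in practice.

```python
def render_repeated_runs(executable, runs, log="runs.log"):
    """Stochastic job: one configuration, repeated `runs` times.

    $(Process) expands to 0..runs-1, so each run gets its own seed
    and output files. The --seed flag is a hypothetical option of
    the user's executable.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        "arguments = --seed $(Process)",
        "output = run_$(Process).out",
        "error = run_$(Process).err",
        f"log = {log}",
        f"queue {runs}",
    ]
    return "\n".join(lines) + "\n"


def render_parameter_sweep(executable, parameter_sets, log="sweep.log"):
    """Parameter sweep: one queued job per argument string."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"log = {log}",
    ]
    for index, arguments in enumerate(parameter_sets):
        lines += [
            f"arguments = {arguments}",
            f"output = sweep_{index}.out",
            f"error = sweep_{index}.err",
            "queue",
        ]
    return "\n".join(lines) + "\n"


def render_dag(jobs, edges):
    """Workflow job: DAGMan input from {name: submit_file} and (parent, child) pairs."""
    for parent, child in edges:
        if parent not in jobs or child not in jobs:
            raise ValueError(f"edge ({parent}, {child}) names an unknown job")
    lines = [f"JOB {name} {submit_file}" for name, submit_file in jobs.items()]
    lines += [f"PARENT {parent} CHILD {child}" for parent, child in edges]
    return "\n".join(lines) + "\n"
```

For example, render_repeated_runs("simulate.sh", 100) yields a submit
description ending in "queue 100", and render_dag({"A": "a.sub", "B":
"b.sub"}, [("A", "B")]) yields a two-job DAG in which B runs after A. A
tailored Airavata interface would only ask the user for the inputs of
these functions, not for the Condor syntax itself.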
The development roadmap is up to the Airavata team's design. I'm willing to
contribute a disease mapping application for testing and evaluating
the new components and capabilities developed in Airavata for this
purpose.
Thanks,
Yan