Hi, Mike. We have been using StarCluster for some time to deploy separate per-user clusters in the cloud. We maintain a custom CentOS 7 AMI that preserves binary compatibility with our Wharton HPCC system. This solution can be staff-time-intensive and/or require user training for deploying an application, launching a cluster, and moving data around.
We have also deployed Univa's cloud-bursting solution, UniCloud. I'm a fan of this product and its approach. It wraps Grid Engine with a policy engine that launches cloud nodes as needed, for example when jobs are waiting in a specific queue. This lets existing users log into our regular system and use the normal commands with a few extra aliases. The user learning curve is much gentler here: staff do a one-time account and billing setup, then users can qsub/qlogin jobs that use their own AWS cloud queue. Jobs are submitted out of a user's cloud home directory via NFS over a VPC. If you have questions, I'm happy to answer. We've worked through quite a few of the usability issues with some friendly initial users.

Cheers.

On 10:28PM Thu 05/07/15 +0000, Hutcheson, Mike wrote:
> Hi. We are working on refreshing the centralized HPC cluster resources
> that our university researchers use. I have been asked by our
> administration to look into HPC-in-the-cloud offerings as an alternative
> to purchasing or running a cluster on-site.
>
> We currently run a 173-node, CentOS-based cluster with ~120TB (soon to
> increase to 300+TB) in our datacenter. It's a standard cluster
> configuration: IB network, distributed file system (BeeGFS. I really
> like it), Torque/Maui batch. Our users run a varied workload, from
> fine-grained, MPI-based parallel apps scaling to 100s of cores to
> coarse-grained, high-throughput jobs (we're a CMS Tier-3 site) with high
> I/O requirements.
>
> Whatever we transition to, whether it be a new in-house cluster or
> something "out there", I want to minimize the amount of change or
> learning curve our users would have to experience. They should be able
> to focus on their research and not have to spend a lot of their time
> learning a new system or trying to spin one up each time they have a
> job to run.
>
> If you have worked with HPC in the cloud, either as an admin and/or as
> someone who has used cloud resources for research computing purposes, I
> would appreciate learning about your experience.
>
> Even if you haven't used the cloud for HPC computing, please feel free
> to share your thoughts or concerns on the matter.
>
> Sort of along those same lines, what are your thoughts about leasing a
> cluster and running it on-site?
>
> Thanks for your time,
>
> Mike Hutcheson
> Assistant Director of Academic and Research Computing Services
> Baylor University

-- 
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
