Re: Automate EC2 cluster termination

2009-11-11 Thread John Clarke
Hi Edmund, I'll look into what you suggested. Yes, I'm aware that I can use S3 directly, but I had problems getting it working, so I must try again. Cheers, John 2009/11/10 Edmund Kohlwey ekohl...@gmail.com You should be able to detect the status of the job in your Java main() method, just

Re: Automate EC2 cluster termination

2009-11-11 Thread John Clarke
I've never used Amazon Elastic MapReduce, as we are trying to minimise costs, but if I can't find a good way to solve my problem then I might reconsider. Cheers, John 2009/11/10 Hitchcock, Andrew a...@amazon.com Hi John, Have you considered Amazon Elastic MapReduce? (Disclaimer: I work on

Automate EC2 cluster termination

2009-11-10 Thread John Clarke
Hi, I use EC2 to run my Hadoop jobs using Cloudera's 0.18.3 AMI. It works great, but I want to automate it a bit more. I want to be able to:
- start the cluster
- copy data from S3 to the DFS
- run the job
- copy result data from the DFS to S3
- verify it all copied OK
- shut down the cluster.
I guess
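For illustration, the sequence John lists could be driven by a small runner that executes the steps in order, stops at the first failure, and always terminates the cluster at the end. This is only a sketch under assumptions: `ClusterPipeline`, its `run` method, and the step names are hypothetical, and each step is a placeholder that would need to call the real EC2, S3 and Hadoop tooling. Terminating even when a step fails is a policy choice made here to keep costs down; it would discard any results still sitting on the DFS.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

public class ClusterPipeline {

    /**
     * Runs the named steps in insertion order and stops at the first one
     * that reports failure. The shutdown action runs in every case, so the
     * cluster is not left running (and billing) after an error.
     */
    public static List<String> run(Map<String, BooleanSupplier> steps, Runnable shutdown) {
        List<String> log = new ArrayList<>();
        try {
            for (Map.Entry<String, BooleanSupplier> step : steps.entrySet()) {
                boolean ok = step.getValue().getAsBoolean();
                log.add(step.getKey() + (ok ? ": ok" : ": FAILED"));
                if (!ok) {
                    break; // skip the remaining steps
                }
            }
        } finally {
            shutdown.run();
            log.add("shutdown cluster: done");
        }
        return log;
    }

    public static void main(String[] args) {
        // Placeholder steps: each lambda stands in for real tooling calls.
        Map<String, BooleanSupplier> steps = new LinkedHashMap<>();
        steps.put("start cluster", () -> true);
        steps.put("copy data from S3 to the DFS", () -> true);
        steps.put("run the job", () -> true);
        steps.put("copy result data from DFS to S3", () -> true);
        steps.put("verify the copy", () -> true);

        Runnable shutdown = () -> System.out.println("(terminate instances here)");
        for (String line : run(steps, shutdown)) {
            System.out.println(line);
        }
    }
}
```

If losing results matters more than idle instance time, the shutdown could instead be made conditional on the verify step succeeding.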

Re: Automate EC2 cluster termination

2009-11-10 Thread Edmund Kohlwey
You should be able to detect the status of the job in your Java main() method. Either call job.waitForCompletion() and, when the job finishes running, check job.isSuccessful(), or, if you prefer, write a custom watcher thread to poll the job status manually; this will allow you to, for
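A rough sketch of the polling variant Edmund describes, reduced to a plain loop that turns the job outcome into an exit code a wrapper script could act on before terminating the cluster. `JobHandle` is a hypothetical stand-in interface so the example compiles without Hadoop on the classpath; it borrows the method names from the message, and the real signatures may differ between Hadoop versions. The exit code values and the timeout handling are assumptions as well.

```java
import java.util.concurrent.TimeUnit;

public class JobWatcher {

    /** Hypothetical stand-in for a Hadoop job handle; not a real Hadoop type. */
    public interface JobHandle {
        boolean isComplete();
        boolean isSuccessful();
    }

    /**
     * Polls the job until it completes or the timeout elapses.
     * Returns 0 if the job succeeded, 1 if it failed, and 2 if it did not
     * finish in time (or the waiting thread was interrupted).
     */
    public static int pollForExitCode(JobHandle job, long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!job.isComplete()) {
            if (System.nanoTime() - deadline >= 0) {
                return 2; // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return 2;
            }
        }
        return job.isSuccessful() ? 0 : 1;
    }

    public static void main(String[] args) {
        // Fake job that reports completion on the third poll and succeeds.
        final int[] polls = {0};
        JobHandle fake = new JobHandle() {
            public boolean isComplete() { return ++polls[0] >= 3; }
            public boolean isSuccessful() { return true; }
        };
        int code = pollForExitCode(fake, 10, 5000);
        System.out.println("exit code: " + code);
    }
}
```

In a real main() the returned code could be passed to System.exit so that the calling script only copies results and shuts the cluster down after a successful run.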

Re: Automate EC2 cluster termination

2009-11-10 Thread Hitchcock, Andrew
Hi John, Have you considered Amazon Elastic MapReduce? (Disclaimer: I work on Elastic MapReduce) http://aws.amazon.com/elasticmapreduce/ It waits for your job to finish and then automatically shuts down the cluster. With a simple command like: elastic-mapreduce --create --num-instances 10