Hi Edmund,
I'll look into what you suggested. Yes, I'm aware it's possible to use S3
directly, but I had problems getting it working - I must try again.
cheers
John
2009/11/10 Edmund Kohlwey ekohl...@gmail.com
You should be able to detect the status of the job in your java main()
method, just
I've never used Amazon Elastic MapReduce, as we are trying to minimise costs,
but if I can't find a good way to solve my problem then I might reconsider.
cheers,
John
2009/11/10 Hitchcock, Andrew a...@amazon.com
Hi John,
Have you considered Amazon Elastic MapReduce? (Disclaimer: I work on
Hi,
I use EC2 to run my Hadoop jobs using Cloudera's 0.18.3 AMI. It works great
but I want to automate it a bit more.
I want to be able to:
- start cluster
- copy data from S3 to the DFS
- run the job
- copy result data from DFS to S3
- verify it all copied ok
- shutdown the cluster.
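The middle steps of that list could be driven from a single Java main() that shells out to the hadoop CLI. A minimal sketch - the bucket names, paths, jar name, and the use of distcp here are illustrative assumptions, not details from this thread:

```java
import java.util.List;

// Sketch: the copy-in / run / copy-out steps expressed as hadoop CLI
// invocations. All bucket names and paths are placeholders. In a real
// driver each command would be run via ProcessBuilder and its exit
// code checked before moving on to the next step.
public class JobPipeline {
    static final List<String[]> STEPS = List.of(
        // copy input data from S3 into the DFS
        new String[]{"hadoop", "distcp", "s3n://my-bucket/input", "hdfs:///input"},
        // run the MapReduce job itself
        new String[]{"hadoop", "jar", "myjob.jar", "hdfs:///input", "hdfs:///output"},
        // copy result data from the DFS back to S3
        new String[]{"hadoop", "distcp", "hdfs:///output", "s3n://my-bucket/output"}
    );

    public static void main(String[] args) {
        for (String[] step : STEPS) {
            System.out.println(String.join(" ", step));
            // real use: new ProcessBuilder(step).inheritIO().start().waitFor();
        }
    }
}
```

Starting and shutting down the cluster itself would still happen outside this driver (e.g. via the Cloudera EC2 scripts), and the verify step would compare file counts or sizes after the final distcp.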
I guess
You should be able to detect the status of the job in your Java main()
method: either call job.waitForCompletion() and, when the job
finishes running, use job.isSuccessful(); or, if you want, you can
write a custom watcher thread to poll job status manually. This will
allow you to, for
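The watcher-thread variant can be sketched in a self-contained way. FakeJob below is a hypothetical stand-in for Hadoop's Job object; in a real driver you would poll job.isComplete() and job.isSuccessful() instead:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of a watcher thread polling job status. FakeJob is a stand-in
// for org.apache.hadoop.mapreduce.Job; swap its methods for
// job.isComplete() and job.isSuccessful() in a real driver.
public class JobWatcher {
    // Hypothetical stub so the sketch runs without a cluster.
    static class FakeJob {
        private final AtomicBoolean done = new AtomicBoolean(false);
        void finish() { done.set(true); }
        boolean isComplete() { return done.get(); }
        boolean isSuccessful() { return done.get(); } // pretend it succeeded
    }

    public static void main(String[] args) throws InterruptedException {
        FakeJob job = new FakeJob();

        // Watcher: poll until the job reports completion, then act on the result.
        Thread watcher = new Thread(() -> {
            try {
                while (!job.isComplete()) {
                    Thread.sleep(100); // poll interval; tune as needed
                }
                System.out.println(job.isSuccessful() ? "SUCCEEDED" : "FAILED");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        watcher.start();

        Thread.sleep(300); // simulate the job running for a while
        job.finish();      // simulate job completion
        watcher.join();
    }
}
```

The advantage of the watcher over a blocking waitForCompletion() call is that the main thread stays free to do other work, such as reporting progress or enforcing a timeout.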
Hi John,
Have you considered Amazon Elastic MapReduce? (Disclaimer: I work on Elastic
MapReduce)
http://aws.amazon.com/elasticmapreduce/
It waits for your job to finish and then automatically shuts down the cluster.
With a simple command like:
elastic-mapreduce --create --num-instances 10