My second question is about the EC2 machines: has anyone solved the hostname
problem in an automated way?
For example, if I launch an EC2 server to run a tasktracker, the hostname it
reports back to my local cluster is its internal address, so the local reduce
tasks cannot fetch the map files from the EC2 machine under the default
hostname.
I get an error:
WARN org.apache.hadoop.mapred.ReduceTask: java.net.UnknownHostException:
domU-12-31-39-00-A4-05.compute-1.internal
<question>
Is there an automated way to start a tasktracker on an EC2 machine using the
public hostname, so the local tasks can fetch the map output from the EC2
machines?
For example, something like
bin/hadoop-daemon.sh start tasktracker
host=ec2-xx-xx-xx-xx.z-2.compute-1.amazonaws.com
that I can run to start just the tasktracker with the correct hostname.
</question>
What I am trying to do is build a custom AMI that I can just launch when I
need to add extra CPU power to my cluster, with the tasktracker started
automatically via a shell script run at startup.
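A startup script along these lines might do it. This is a sketch only: it assumes the tasktracker in your Hadoop version honors the slave.host.name property, that HADOOP_HOME is set, and that conf/hadoop-site.xml is generated from a template file with a @PUBLIC_HOSTNAME@ placeholder (the template name and placeholder are made up for illustration). The EC2 instance-metadata service supplies the public DNS name.

```shell
#!/bin/sh
# Sketch: start a tasktracker on an EC2 instance so it advertises
# its public hostname instead of the internal compute-1.internal name.

# Ask the EC2 instance-metadata service for the public DNS name.
PUBLIC_HOSTNAME=`curl -s http://169.254.169.254/latest/meta-data/public-hostname`

# Fill the placeholder in a hadoop-site.xml template so that
# slave.host.name is set to the public name before the daemon starts.
sed -e "s|@PUBLIC_HOSTNAME@|$PUBLIC_HOSTNAME|" \
    $HADOOP_HOME/conf/hadoop-site.xml.template \
    > $HADOOP_HOME/conf/hadoop-site.xml

# Start only the tasktracker on this instance.
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker
```

The template would carry, among your normal settings, a slave.host.name property whose value is the @PUBLIC_HOSTNAME@ placeholder.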
Billy
"Billy Pearson" <[EMAIL PROTECTED]>
wrote in message news:[EMAIL PROTECTED]
I have a question that someone may have answered here before, but I cannot
find the answer.
Assuming I have a cluster of servers hosting a large amount of data, I want
to run a large job where the maps take a lot of CPU power to run and the
reduces only take a small amount.
I want to run the maps on a group of EC2 servers and run the reduces on
the local cluster of 10 machines.
The problem I am seeing is with the map outputs: if I run the maps on EC2,
they are stored locally on the instance.
What I am looking to do is have the map output files stored in HDFS so I can
kill the EC2 instances, since I do not need them for the reduces.
The only way I can think of to do this is to run two jobs: one map-only job
that stores its output in HDFS, and then a second job that runs the reduces
from the map outputs stored in HDFS.
Is there a way to make the mappers store their final output in HDFS?
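For what it is worth, the two-job approach above can be sketched with Hadoop streaming. A sketch only: the jar path and the mymapper.sh / myreducer.sh scripts are assumptions and the exact flags vary by Hadoop version; the key idea is that setting the number of reduce tasks to 0 makes the maps write their output directly to the job's output directory in HDFS.

```shell
# Job 1: map-only, run on the EC2 tasktrackers. With zero reduce
# tasks the map output goes straight to the HDFS output directory,
# so the EC2 instances can be killed once the job finishes.
bin/hadoop jar contrib/streaming/hadoop-streaming.jar \
    -input  /data/input \
    -output /data/mapout \
    -mapper mymapper.sh \
    -jobconf mapred.reduce.tasks=0

# Job 2: run on the local cluster. An identity mapper (cat) just
# passes the stored map output through to the reduces.
bin/hadoop jar contrib/streaming/hadoop-streaming.jar \
    -input  /data/mapout \
    -output /data/final \
    -mapper cat \
    -reducer myreducer.sh
```

In a Java job, the equivalent of the first step is calling setNumReduceTasks(0) on the JobConf.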