Hi Steve and Amit,

Thanks for your answers. I agree with you that key-based ssh is nothing to 
worry about. But I'm wondering which grid administration tasks exactly 
hadoop performs via ssh. Does it restart crashed data nodes or task 
trackers on the slaves? Or does it transfer data over the grid with ssh 
access? Where can I find a short description of what exactly hadoop needs 
ssh for? The documentation only says that I have to configure it.

Thanks & Regards
Matthias


> -----Original Message-----
> From: Steve Loughran [mailto:ste...@apache.org] 
> Sent: Wednesday, 21 January 2009 13:59
> To: core-user@hadoop.apache.org
> Subject: Re: Why does Hadoop need ssh access to master and slaves?
> 
> Amit k. Saha wrote:
> > On Wed, Jan 21, 2009 at 5:53 PM, Matthias Scherer 
> > <matthias.sche...@1und1.de> wrote:
> >> Hi all,
> >>
> >> we've made our first steps in evaluating hadoop. The setup of 2 VMs
> >> as a hadoop grid was very easy and works fine.
> >>
> >> Now our operations team wonders why hadoop has to be able to connect
> >> to the master and slaves via password-less ssh?! Can anyone give us
> >> an answer to this question?
> > 
> > 1. There has to be a way to connect to the remote hosts - slaves and a
> > secondary master, and SSH is the secure way to do it
> > 2. It has to be password-less to enable automatic logins
> > 
> 
> SSH is *a* secure way to do it, but not the only way. Other 
> management tools can bring up hadoop clusters. Hadoop ships 
> with scripted support for SSH as it is standard with Linux 
> distros and generally the best way to bring up a remote console.
> 
> Matthias,
> Your ops team should not be worrying about the SSH security, 
> as long as they keep their keys under control.
> 
> (a) Key-based SSH is more secure than passworded SSH, as 
> man-in-the-middle attacks are prevented. Passphrase-protected SSH 
> keys on external USB keys are even better.
> 
> (b) Once the cluster is up, that filesystem is pretty 
> vulnerable to anything on the LAN. You do need to lock down 
> your datacentre, or set up the firewall/routing of the 
> servers so that only trusted hosts can talk to the FS. SSH 
> becomes a detail at that point.
> 
> 
> 
