Re: ssh issues

2009-05-26 Thread Allen Wittenauer



On 5/26/09 3:40 AM, "Steve Loughran"  wrote:
> HDFS is as secure as NFS: you are trusted to be who you say you are.
> Which means that you have to run it on a secured subnet (access
> restricted to trusted hosts and/or one or two front-end servers) or
> accept that your dataset is readable and writeable by anyone on the
> network.
> 
> There is user identification going in; it is currently at the level
> where it will stop someone who lacks the rights from accidentally
> deleting the entire filesystem. Which has been known to happen.

Actually, I'd argue that HDFS is worse than even rudimentary NFS
implementations.  Off the top of my head:

a) There is no equivalent of NFS's root squash/force anonymous. Any host
can assume privilege.
 
b) There is no 'read only from these hosts'.  If you can read blocks
over Hadoop RPC, you can write as well (minus safe mode).
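For comparison, the NFS-side controls being referenced look roughly like this in an /etc/exports file (the path and host names below are hypothetical):

```text
# /etc/exports -- illustrative entries only
# root_squash maps a remote root user to an anonymous uid;
# ro restricts the listed host to reads, rw allows writes.
/data   frontend1(ro,root_squash)   trusted1(rw,root_squash)
```

HDFS at this point has neither per-host read-only exports nor root squashing, which is the substance of points a) and b).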



Re: ssh issues

2009-05-26 Thread Steve Loughran

hmar...@umbc.edu wrote:

Steve,

Security through obscurity is always a good practice from a development
standpoint and one of the reasons why tricking you out is an easy task.


:)

My most recent presentation on HDFS clusters is now online; notice how it
doesn't gloss over the security: 
http://www.slideshare.net/steve_l/hdfs-issues



Please, keep hiding relevant details from people in order to keep everyone
smiling.



HDFS is as secure as NFS: you are trusted to be who you say you are. 
Which means that you have to run it on a secured subnet (access 
restricted to trusted hosts and/or one or two front-end servers) or accept 
that your dataset is readable and writeable by anyone on the network.


There is user identification going in; it is currently at the level 
where it will stop someone who lacks the rights from accidentally 
deleting the entire filesystem. Which has been known to happen.


If the team looking after the cluster demands separate SSH keys/logins for 
every machine, then not only are they making their operations costs high; 
once you have the HDFS cluster and MR engine live, it's moot. You can push 
work out to the JobTracker, which then runs it on the machines under 
whatever userid the TaskTrackers are running as. Now, 0.20+ will run it 
under the identity of the user who claimed to be submitting the job, but 
without that, your MR jobs get the filesystem access rights of the user 
that is running the TT. But it's fairly straightforward to create a 
modified Hadoop client JAR that doesn't call whoami to get the userid and 
instead spoofs being anyone. Which means that even if you lock down the 
filesystem (no out-of-datacentre access), if I can run my Java code as MR 
jobs in your cluster, I can have unrestricted access to the filesystem by 
way of the TaskTracker server.


But Hal, if you are running Ant for your build, I'm running my code on 
your machines anyway, so you had better be glad that I'm not malicious.


-Steve


Re: ssh issues

2009-05-22 Thread Edward Capriolo
Pankil,

I used to be very confused by Hadoop and SSH keys. SSH is NOT
required: each component can be started by hand. This gem of knowledge
is hidden away in the hundreds of Digg-style articles entitled 'HOW TO
RUN A HADOOP MULTI-MASTER CLUSTER!'
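Concretely, the "by hand" path looks like this with the stock scripts shipped in the Hadoop tarball (script and role names from the 0.18-0.20 layout; run each command on the host that should carry that role):

```shell
# On the master, from the Hadoop install directory:
bin/hadoop-daemon.sh start namenode
bin/hadoop-daemon.sh start jobtracker

# On each worker, run locally (no SSH from the master involved):
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker
```

start-all.sh is essentially a loop that does the worker half of this over SSH for every host listed in the slaves file, which is the only reason it wants keys.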

The SSH keys are only required by the wrapper scripts that ship with
Hadoop, like start-all; they just kick off other scripts on a list of
nodes. I PERSONALLY dislike using SSH keys as a software component and
believe they should only be used by administrators.

We chose the Cloudera distribution
(http://www.cloudera.com/distribution). A big factor behind this was the
simple init.d scripts they provide: each Hadoop component has its own
start script, hadoop-namenode, hadoop-datanode, etc.

My suggestion is to take a look at the Cloudera startup scripts. Even
if you decide not to use the distribution, you can take their
start-up scripts and fit them to your needs.

On Fri, May 22, 2009 at 10:34 AM,   wrote:
> Steve,
>
> Security through obscurity is always a good practice from a development
> standpoint and one of the reasons why tricking you out is an easy task.
> Please, keep hiding relevant details from people in order to keep everyone
> smiling.
>
> Hal
>
>> Pankil Doshi wrote:
>>> Well, I made ssh keys with passphrases, as the systems I need to log
>>> in to require ssh with passphrases, and those systems have to be part
>>> of my cluster. So I need a way to specify -i /path/to/key and the
>>> passphrase to Hadoop beforehand.
>>>
>>> Pankil
>>>
>>
>> Well, you are trying to manage a system whose security policy is
>> incompatible with Hadoop's current shell scripts. If you push out the
>> configs and manage the lifecycle using other tools, this becomes a
>> non-issue. Don't raise the topic of HDFS security with your ops team,
>> though, as they will probably be unhappy about what is currently on offer.
>>
>> -steve
>>
>>
>
>
>


Re: ssh issues

2009-05-22 Thread hmarti2
Steve,

Security through obscurity is always a good practice from a development
standpoint and one of the reasons why tricking you out is an easy task.
Please, keep hiding relevant details from people in order to keep everyone
smiling.

Hal

> Pankil Doshi wrote:
>> Well, I made ssh keys with passphrases, as the systems I need to log
>> in to require ssh with passphrases, and those systems have to be part
>> of my cluster. So I need a way to specify -i /path/to/key and the
>> passphrase to Hadoop beforehand.
>>
>> Pankil
>>
>
> Well, you are trying to manage a system whose security policy is
> incompatible with Hadoop's current shell scripts. If you push out the
> configs and manage the lifecycle using other tools, this becomes a
> non-issue. Don't raise the topic of HDFS security with your ops team,
> though, as they will probably be unhappy about what is currently on offer.
>
> -steve
>
>




Re: ssh issues

2009-05-22 Thread Steve Loughran

Pankil Doshi wrote:

Well, I made ssh keys with passphrases, as the systems I need to log in to
require ssh with passphrases, and those systems have to be part of my
cluster. So I need a way to specify -i /path/to/key and the passphrase to
Hadoop beforehand.

Pankil



Well, you are trying to manage a system whose security policy is 
incompatible with Hadoop's current shell scripts. If you push out the 
configs and manage the lifecycle using other tools, this becomes a 
non-issue. Don't raise the topic of HDFS security with your ops team, 
though, as they will probably be unhappy about what is currently on offer.


-steve


Re: ssh issues

2009-05-22 Thread Pankil Doshi
Well, I made ssh keys with passphrases, as the systems I need to log in to
require ssh with passphrases, and those systems have to be part of my
cluster. So I need a way to specify -i /path/to/key and the passphrase to
Hadoop beforehand.

Pankil

On Thu, May 21, 2009 at 9:35 PM, Aaron Kimball  wrote:

> Pankil,
>
> That means that either you're using the wrong ssh key and it's falling back
> to password authentication, or else you created your ssh keys with
> passphrases attached; try making new ssh keys with ssh-keygen and
> distributing those to start again?
>
> - Aaron
>
> On Thu, May 21, 2009 at 3:49 PM, Pankil Doshi  wrote:
>
> > The problem is that it also prompts for the pass phrase.
> >
> > On Thu, May 21, 2009 at 2:14 PM, Brian Bockelman  > >wrote:
> >
> > > Hey Pankil,
> > >
> > > Use ~/.ssh/config to set the default key location to the proper place
> for
> > > each host, if you're going down that route.
> > >
> > > I'd remind you that SSH is only used as a convenient method to launch
> > > daemons.  If you have a preferred way to start things up on your
> cluster,
> > > you can use that (I think most large clusters don't use ssh... could be
> > > wrong).
> > >
> > > Brian
> > >
> > >
> > > On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:
> > >
> > >  Hello everyone,
> > >>
> > >> I got hint how to solve the problem where clusters have different
> > >> usernames.but now other problem I face is that i can ssh a machine by
> > >> using
> > >> -i path/to key/ ..I cant ssh them directly but I will have to always
> > pass
> > >> the key.
> > >>
> > >> Now i face problem in ssh-ing my machines.Does anyone have any ideas
> how
> > >> to
> > >> deal with that??
> > >>
> > >> Regards
> > >> Pankil
> > >>
> > >
> > >
> >
>


Re: ssh issues

2009-05-21 Thread Aaron Kimball
Pankil,

That means that either you're using the wrong ssh key and it's falling back
to password authentication, or else you created your ssh keys with
passphrases attached; try making new ssh keys with ssh-keygen and
distributing those to start again?
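Aaron's suggestion, sketched out as commands (the key filename is a placeholder; the empty `-N ""` argument is what makes the key passphrase-less):

```shell
# Generate a fresh keypair with an empty passphrase (-N ""); -q keeps it quiet.
mkdir -p "$HOME/.ssh"
ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa_hadoop" -q

# Then push the public half out to every node in the cluster
# (user and host below are placeholders):
# ssh-copy-id -i "$HOME/.ssh/id_rsa_hadoop.pub" hadoop@worker1
```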

- Aaron

On Thu, May 21, 2009 at 3:49 PM, Pankil Doshi  wrote:

> The problem is that it also prompts for the pass phrase.
>
> On Thu, May 21, 2009 at 2:14 PM, Brian Bockelman  >wrote:
>
> > Hey Pankil,
> >
> > Use ~/.ssh/config to set the default key location to the proper place for
> > each host, if you're going down that route.
> >
> > I'd remind you that SSH is only used as a convenient method to launch
> > daemons.  If you have a preferred way to start things up on your cluster,
> > you can use that (I think most large clusters don't use ssh... could be
> > wrong).
> >
> > Brian
> >
> >
> > On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:
> >
> >> Hello everyone,
> >>
> >> I got a hint on how to solve the problem where clusters have different
> >> usernames, but now the other problem I face is that I can ssh to a
> >> machine only by using -i /path/to/key; I can't ssh to them directly
> >> but will always have to pass the key.
> >>
> >> Now I face a problem in ssh-ing to my machines. Does anyone have any
> >> ideas on how to deal with that?
> >>
> >> Regards
> >> Pankil
> >>
> >
> >
>


Re: ssh issues

2009-05-21 Thread Pankil Doshi
The problem is that it also prompts for the pass phrase.

On Thu, May 21, 2009 at 2:14 PM, Brian Bockelman wrote:

> Hey Pankil,
>
> Use ~/.ssh/config to set the default key location to the proper place for
> each host, if you're going down that route.
>
> I'd remind you that SSH is only used as a convenient method to launch
> daemons.  If you have a preferred way to start things up on your cluster,
> you can use that (I think most large clusters don't use ssh... could be
> wrong).
>
> Brian
>
>
> On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:
>
>> Hello everyone,
>>
>> I got a hint on how to solve the problem where clusters have different
>> usernames, but now the other problem I face is that I can ssh to a
>> machine only by using -i /path/to/key; I can't ssh to them directly but
>> will always have to pass the key.
>>
>> Now I face a problem in ssh-ing to my machines. Does anyone have any
>> ideas on how to deal with that?
>>
>> Regards
>> Pankil
>>
>
>


Re: ssh issues

2009-05-21 Thread Brian Bockelman

Hey Pankil,

Use ~/.ssh/config to set the default key location to the proper place
for each host, if you're going down that route.


I'd remind you that SSH is only used as a convenient method to launch
daemons.  If you have a preferred way to start things up on your
cluster, you can use that (I think most large clusters don't use
ssh... could be wrong).
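The per-host stanza Brian is describing would look something like this (the host names, user, and key path are placeholders):

```shell
# Append per-host defaults so a plain "ssh worker1" picks the right key;
# after this, Hadoop's start-all.sh needs no -i flag either.
mkdir -p "$HOME/.ssh"
cat >> "$HOME/.ssh/config" <<'EOF'
Host worker1 worker2
    User hadoop
    IdentityFile ~/.ssh/id_rsa_hadoop
    IdentitiesOnly yes
EOF
chmod 600 "$HOME/.ssh/config"
```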


Brian

On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:


Hello everyone,

I got a hint on how to solve the problem where clusters have different
usernames, but now the other problem I face is that I can ssh to a
machine only by using -i /path/to/key; I can't ssh to them directly but
will always have to pass the key.

Now I face a problem in ssh-ing to my machines. Does anyone have any
ideas on how to deal with that?

Regards
Pankil




ssh issues

2009-05-21 Thread Pankil Doshi
Hello everyone,

I got a hint on how to solve the problem where clusters have different
usernames, but now the other problem I face is that I can ssh to a
machine only by using -i /path/to/key; I can't ssh to them directly but
will always have to pass the key.

Now I face a problem in ssh-ing to my machines. Does anyone have any
ideas on how to deal with that?

Regards
Pankil