Re: ssh issues
On 5/26/09 3:40 AM, "Steve Loughran" wrote:

> HDFS is as secure as NFS: you are trusted to be who you say you are.
> Which means that you have to run it on a secured subnet, with access
> restricted to trusted hosts and/or one or two front-end servers, or
> accept that your dataset is readable and writeable by anyone on the
> network.
>
> There is user identification going in; it is currently at the level
> where it will stop someone accidentally deleting the entire filesystem
> if they lack the rights. Which has been known to happen.

Actually, I'd argue that HDFS is worse than even rudimentary NFS implementations. Off the top of my head:

a) There is no equivalent of root squash/force anonymous. Any host can assume privilege.

b) There is no "read only from these hosts". If you can read blocks over Hadoop RPC, you can write as well (minus safe mode).
Re: ssh issues
hmar...@umbc.edu wrote:

> Steve,
>
> Security through obscurity is always a good practice from a development
> standpoint and one of the reasons why tricking you out is an easy task.

:) My most recent presentation on HDFS clusters is now online; notice how it doesn't gloss over the security: http://www.slideshare.net/steve_l/hdfs-issues

> Please, keep hiding relevant details from people in order to keep
> everyone smiling.

HDFS is as secure as NFS: you are trusted to be who you say you are. Which means that you have to run it on a secured subnet, with access restricted to trusted hosts and/or one or two front-end servers, or accept that your dataset is readable and writeable by anyone on the network.

There is user identification going in; it is currently at the level where it will stop someone accidentally deleting the entire filesystem if they lack the rights. Which has been known to happen.

If the team looking after the cluster demands separate SSH keys/logins for every machine, then not only are they making their operations costs high, it is also moot once you have the HDFS cluster and MR engine live. You can push out work to the JobTracker, which then runs it on the machines under whatever userid the TaskTrackers are running as. Now, 0.20+ will run it under the identity of the user who claims to be submitting the job, but without that, your MR jobs get the filesystem access rights of the user that is running the TT. And it's fairly straightforward to create a modified Hadoop client JAR that doesn't call whoami to get the userid and instead spoofs to be anyone. Which means that even if you lock down the filesystem (no out-of-datacentre access), if I can run my Java code as MR jobs in your cluster, I have unrestricted access to the filesystem by way of the TaskTracker server.

But Hal, if you are running Ant for your build, I'm running my code on your machines anyway, so you had better be glad that I'm not malicious.

-Steve
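For readers following along: on the pre-security Hadoop versions discussed in this thread, you did not even need a modified client JAR to spoof an identity. If memory serves, the client-side identity could be overridden with the hadoop.job.ugi configuration property; the exact property name and the target path below are assumptions about that era's behaviour, and this obviously only does anything against a reachable, unauthenticated cluster:

```
# Hypothetical sketch against a pre-0.20 cluster with no authentication:
# the client simply asserts a "user,group" identity and the NameNode
# believes it. Do NOT run this against a cluster you care about.
hadoop fs -D hadoop.job.ugi=hadoop,supergroup -ls /
# The same flag works for writes and deletes; spoofing is one option away.
```

This is the concrete form of Steve's point: the userid check stops accidents, not attackers.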
Re: ssh issues
Pankil,

I used to be very confused by Hadoop and SSH keys. SSH is NOT required. Each component can be started by hand. This gem of knowledge is hidden away in the hundreds of DIGG-style articles entitled "HOW TO RUN A HADOOP MULTI-MASTER CLUSTER!"

The SSH keys are only required by the shell scripts that ship with Hadoop, like start-all.sh. They are wrappers that kick off other scripts on a list of nodes. I PERSONALLY dislike using SSH keys as a software component and believe they should only be used by administrators.

We chose the Cloudera distribution: http://www.cloudera.com/distribution. A big factor behind this was the simple init.d scripts they provide. Each Hadoop component has its own start script: hadoop-namenode, hadoop-datanode, etc.

My suggestion is to take a look at the Cloudera startup scripts. Even if you decide not to use the distribution, you can look at their startup scripts and fit them to your needs.

On Fri, May 22, 2009 at 10:34 AM, wrote:

> Steve,
>
> Security through obscurity is always a good practice from a development
> standpoint and one of the reasons why tricking you out is an easy task.
> Please, keep hiding relevant details from people in order to keep
> everyone smiling.
>
> Hal
>
>> Pankil Doshi wrote:
>>> Well, I made SSH keys with passphrases, as the systems I need to log
>>> in to require passphrase-protected keys, and those systems have to be
>>> part of my cluster. So I need a way to specify -i /path/to/key and
>>> the passphrase to Hadoop beforehand.
>>>
>>> Pankil
>>
>> Well, you are trying to manage a system whose security policy is
>> incompatible with Hadoop's current shell scripts. If you push out the
>> configs and manage the lifecycle using other tools, this becomes a
>> non-issue. Don't raise the topic of HDFS security with your ops team,
>> though, as they will probably be unhappy about what is currently on
>> offer.
>>
>> -steve
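For the record, "started by hand" looks roughly like the following; the script name is the one shipped in the stock Hadoop tarball of this era, and the Cloudera init.d wrappers boil down to the same per-daemon calls, run locally on each node with no SSH involved:

```
# On the master node (paths assume the stock tarball layout):
bin/hadoop-daemon.sh start namenode
bin/hadoop-daemon.sh start jobtracker

# On each worker node, run locally:
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker

# start-all.sh is nothing more than an SSH loop over conf/slaves
# invoking hadoop-daemon.sh like this on each host, which is the
# only reason the wiki howtos tell you to set up SSH keys at all.
```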
Re: ssh issues
Steve,

Security through obscurity is always a good practice from a development standpoint and one of the reasons why tricking you out is an easy task. Please, keep hiding relevant details from people in order to keep everyone smiling.

Hal

> Pankil Doshi wrote:
>> Well, I made SSH keys with passphrases, as the systems I need to log
>> in to require passphrase-protected keys, and those systems have to be
>> part of my cluster. So I need a way to specify -i /path/to/key and
>> the passphrase to Hadoop beforehand.
>>
>> Pankil
>
> Well, you are trying to manage a system whose security policy is
> incompatible with Hadoop's current shell scripts. If you push out the
> configs and manage the lifecycle using other tools, this becomes a
> non-issue. Don't raise the topic of HDFS security with your ops team,
> though, as they will probably be unhappy about what is currently on
> offer.
>
> -steve
Re: ssh issues
Pankil Doshi wrote:

> Well, I made SSH keys with passphrases, as the systems I need to log in
> to require passphrase-protected keys, and those systems have to be part
> of my cluster. So I need a way to specify -i /path/to/key and the
> passphrase to Hadoop beforehand.
>
> Pankil

Well, you are trying to manage a system whose security policy is incompatible with Hadoop's current shell scripts. If you push out the configs and manage the lifecycle using other tools, this becomes a non-issue. Don't raise the topic of HDFS security with your ops team, though, as they will probably be unhappy about what is currently on offer.

-steve
Re: ssh issues
Well, I made SSH keys with passphrases, as the systems I need to log in to require passphrase-protected keys, and those systems have to be part of my cluster. So I need a way to specify -i /path/to/key and the passphrase to Hadoop beforehand.

Pankil

On Thu, May 21, 2009 at 9:35 PM, Aaron Kimball wrote:

> Pankil,
>
> That means that either you're using the wrong ssh key and it's falling
> back to password authentication, or else you created your ssh keys with
> passphrases attached; try making new ssh keys with ssh-keygen and
> distributing those to start again?
>
> - Aaron
>
> On Thu, May 21, 2009 at 3:49 PM, Pankil Doshi wrote:
>
>> The problem is that it also prompts for the passphrase.
>>
>> On Thu, May 21, 2009 at 2:14 PM, Brian Bockelman wrote:
>>
>>> Hey Pankil,
>>>
>>> Use ~/.ssh/config to set the default key location to the proper place
>>> for each host, if you're going down that route.
>>>
>>> I'd remind you that SSH is only used as a convenient method to launch
>>> daemons. If you have a preferred way to start things up on your
>>> cluster, you can use that (I think most large clusters don't use
>>> ssh... could be wrong).
>>>
>>> Brian
>>>
>>> On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> I got a hint on how to solve the problem where clusters have
>>>> different usernames, but now the other problem I face is that I can
>>>> only ssh to a machine by using -i /path/to/key; I can't ssh to them
>>>> directly, I always have to pass the key.
>>>>
>>>> Now I face a problem ssh-ing into my machines. Does anyone have any
>>>> ideas how to deal with that?
>>>>
>>>> Regards,
>>>> Pankil
Re: ssh issues
Pankil,

That means that either you're using the wrong ssh key and it's falling back to password authentication, or else you created your ssh keys with passphrases attached; try making new ssh keys with ssh-keygen and distributing those to start again?

- Aaron

On Thu, May 21, 2009 at 3:49 PM, Pankil Doshi wrote:

> The problem is that it also prompts for the passphrase.
>
> On Thu, May 21, 2009 at 2:14 PM, Brian Bockelman wrote:
>
>> Hey Pankil,
>>
>> Use ~/.ssh/config to set the default key location to the proper place
>> for each host, if you're going down that route.
>>
>> I'd remind you that SSH is only used as a convenient method to launch
>> daemons. If you have a preferred way to start things up on your
>> cluster, you can use that (I think most large clusters don't use
>> ssh... could be wrong).
>>
>> Brian
>>
>> On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:
>>
>>> Hello everyone,
>>>
>>> I got a hint on how to solve the problem where clusters have
>>> different usernames, but now the other problem I face is that I can
>>> only ssh to a machine by using -i /path/to/key; I can't ssh to them
>>> directly, I always have to pass the key.
>>>
>>> Now I face a problem ssh-ing into my machines. Does anyone have any
>>> ideas how to deal with that?
>>>
>>> Regards,
>>> Pankil
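Concretely, generating a fresh passphrase-less key as Aaron suggests looks like this (the key location here is a throwaway directory just for illustration; in practice you would use ~/.ssh and push the .pub half to each node):

```shell
# Create a scratch directory and generate an RSA keypair with an empty
# passphrase (-N "") non-interactively (-q); no prompt, no passphrase.
keydir=$(mktemp -d)
ssh-keygen -t rsa -N "" -f "$keydir/id_rsa" -q

# Both halves now exist. The .pub file is what gets appended to
# ~/.ssh/authorized_keys on every node, e.g.:
#   ssh-copy-id -i "$keydir/id_rsa.pub" user@worker
ls "$keydir"
```

A key made this way never prompts, which is what the Hadoop start scripts need.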
Re: ssh issues
The problem is that it also prompts for the passphrase.

On Thu, May 21, 2009 at 2:14 PM, Brian Bockelman wrote:

> Hey Pankil,
>
> Use ~/.ssh/config to set the default key location to the proper place
> for each host, if you're going down that route.
>
> I'd remind you that SSH is only used as a convenient method to launch
> daemons. If you have a preferred way to start things up on your
> cluster, you can use that (I think most large clusters don't use
> ssh... could be wrong).
>
> Brian
>
> On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:
>
>> Hello everyone,
>>
>> I got a hint on how to solve the problem where clusters have different
>> usernames, but now the other problem I face is that I can only ssh to
>> a machine by using -i /path/to/key; I can't ssh to them directly, I
>> always have to pass the key.
>>
>> Now I face a problem ssh-ing into my machines. Does anyone have any
>> ideas how to deal with that?
>>
>> Regards,
>> Pankil
Re: ssh issues
Hey Pankil,

Use ~/.ssh/config to set the default key location to the proper place for each host, if you're going down that route.

I'd remind you that SSH is only used as a convenient method to launch daemons. If you have a preferred way to start things up on your cluster, you can use that (I think most large clusters don't use ssh... could be wrong).

Brian

On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:

> Hello everyone,
>
> I got a hint on how to solve the problem where clusters have different
> usernames, but now the other problem I face is that I can only ssh to
> a machine by using -i /path/to/key; I can't ssh to them directly, I
> always have to pass the key.
>
> Now I face a problem ssh-ing into my machines. Does anyone have any
> ideas how to deal with that?
>
> Regards,
> Pankil
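Brian's ~/.ssh/config suggestion would look something like the fragment below; the hostnames, user, and key paths are made up for illustration, and per-host User entries also cover the differing-usernames problem from the original question:

```
# ~/.ssh/config -- per-host defaults picked up by plain "ssh <host>"
Host worker01.example.com
    User hadoop
    IdentityFile ~/.ssh/worker01_key

# A pattern can cover a whole cluster at once:
Host *.cluster.example.com
    User hadoop
    IdentityFile ~/.ssh/cluster_key
```

With this in place, a bare `ssh worker01.example.com` picks up the right user and key, so the Hadoop launch scripts need no -i flag at all.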
ssh issues
Hello everyone,

I got a hint on how to solve the problem where clusters have different usernames, but now the other problem I face is that I can only ssh to a machine by using -i /path/to/key; I can't ssh to them directly, I always have to pass the key.

Now I face a problem ssh-ing into my machines. Does anyone have any ideas how to deal with that?

Regards,
Pankil