SELinux does nothing for Hadoop cluster security at the data layer, which is 
why there are tools on top, not only to lock down systems, but to provide 
better data governance: where did the data come from, has it been tainted by 
merging with sensitive data, and so on.

Where it could be good is:

1. Allow Hadoop nodes to be more secure on the intranet itself. It's another 
layer in the defense-in-depth story, so if some standard Linux service on the 
system (ssh, ntpd, ...) gets compromised, the damage is partially limited. My 
home server is SELinux-enforced, for example.

2. Reduce the impact of anything malicious trying to run as a YARN-scheduled 
app.
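
On point 1, it's worth confirming that a node really is in enforcing mode 
rather than permissive, since a permissive host only logs denials. A minimal 
sketch in Python, assuming the usual selinuxfs mount at /sys/fs/selinux; the 
function name selinux_mode is made up for illustration and isn't part of 
Hadoop or any SELinux tooling:

```python
from pathlib import Path


def selinux_mode(enforce_path="/sys/fs/selinux/enforce"):
    """Return 'enforcing', 'permissive' or 'disabled' for this host.

    Assumption: selinuxfs is mounted at /sys/fs/selinux, where the
    'enforce' file holds "1" (enforcing) or "0" (permissive). If the
    file is missing we treat SELinux as disabled.
    """
    try:
        flag = Path(enforce_path).read_text().strip()
    except FileNotFoundError:
        # No selinuxfs mounted, so no policy is being applied at all.
        return "disabled"
    return "enforcing" if flag == "1" else "permissive"
```

Calling selinux_mode() on each worker node and flagging anything that isn't 
"enforcing" would be one way to check that the extra layer is actually on.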

#2 is moot until you have Kerberos up; until then the whole of HDFS is visible. 
Once you have it up, SELinux could restrict what damage a privilege-escalated 
YARN job could do to the local hosts. But I'm still reasonably confident that 
given the ability to run 200+ containers on a Hadoop cluster for a few hours I 
could (a) portscan an intranet for SMB & SharePoint hosts, and (b) open enough 
TCP connections to overload the services. 

I'm +1 to getting Hadoop to run on SELinux; I think mainly we've been lazy. 

But it's not going to keep your Hadoop-stored data safe, lock down your network 
apps, or help mitigate the intentional or unintentional damage that Hadoop code 
can do if it's on the same intranet as the rest of your organisation. Or, as AW 
on Nicholas can attest, the damage you can do by running network traffic- or 
CPU-intensive code that takes down the network or power supplies of the rest 
of the datacentre. 



> On 28 Mar 2015, at 02:33, jay vyas <jayunit100.apa...@gmail.com> wrote:
> 
> Tools like freeipa and so on are very synergistic first steps down the road
> of making hadoop more enterprise friendly.  For example, if you let freeipa
> manage users, kerberos and so on - then you can pave the way down the road
> for selinux as well (since these tools are able to work together).
> 
> I think in general, the more hadoop works with the linux community , rather
> than rebuilding its own solutions, the easier it will be to integrate in
> broader and broader deployments - so in theory working to run  selinux and
> hadoop together is probably a win-win.
> 
> On Thu, Mar 26, 2015 at 1:22 PM, Aaron T. Myers <a...@cloudera.com> wrote:
> 
>> In addition to everything Allen has already said, which I entirely agree
>> with, I'll also point out that much of the focus on Hadoop security has
>> been related to authentication, and only somewhat more recently on
>> providing advanced authorization capabilities. I'll readily admit to not
>> knowing much about SE Linux's capabilities, but my impression is that it
>> wouldn't do much to be able to help out with authentication within Hadoop,
>> and hence wouldn't have been a realistic option when Hadoop's security work
>> was started many years ago.
>> 
>> --
