Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-06 Thread Steve Loughran


> On 6 Jul 2018, at 00:04, Eric Yang  wrote:
> 
> +1 on the Non-routable IP idea. My preference is to start in Hadoop-common to
> minimize the scope and improve incrementally. However, this will be an
> incompatible change for the initial user experience on public cloud. What would
> be the right release vehicle for this work (3.2+ or 4.x)?

3.2+

As for public cloud, that's precisely where you don't want to be wide open
unless you are in some VPN setup. If you can't set up network rules there,
should you be trying to install ASF Hadoop out of the box onto a VM?

We should ping the Bigtop people here for their input.

> 
> Regards,
> Eric


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Eric Yang
+1 on the Non-routable IP idea. My preference is to start in Hadoop-common to
minimize the scope and improve incrementally. However, this will be an
incompatible change for the initial user experience on public cloud. What would
be the right release vehicle for this work (3.2+ or 4.x)?

Regards,
Eric



Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread larry mccay
+1 from me as well.



Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Steve Loughran



+1 to doing this out of the box, everywhere. Web UIs included





Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Anu Engineer
+1 on the Non-Routable idea. We like it so much that we added it to the Ozone
roadmap.
https://issues.apache.org/jira/browse/HDDS-231

If there is consensus on bringing this to Hadoop in general, we can build this 
feature in common.

--Anu


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Sean Busbey
I really, really like the approach of defaulting to allowing only non-routable
IPs. It seems like a good tradeoff among complexity of implementation, pain to
reconfigure, and level of protection.


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Todd Lipcon
The approach we took in Apache Kudu is that, if Kerberos hasn't been
enabled, we default to a whitelist of subnets. The default whitelist is
127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16, which
matches the IANA "non-routable IP" subnet list.
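
Roughly, a default-deny check against that subnet list might look like the
following Java sketch. The class and method names are made up for illustration,
and this is not Kudu's (or Hadoop's) actual code:

    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import java.util.Arrays;
    import java.util.List;

    // Illustrative only: IPv4 CIDR allowlist defaulting to the non-routable ranges.
    public final class SubnetAllowList {

      // Loopback, RFC 1918 private ranges, and link-local, as in the Kudu default.
      static final List<String> DEFAULT_CIDRS = Arrays.asList(
          "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "169.254.0.0/16");

      /** True if the caller's IPv4 address falls inside any default subnet. */
      public static boolean allowed(InetAddress caller) {
        return DEFAULT_CIDRS.stream().anyMatch(cidr -> inCidr(caller, cidr));
      }

      static boolean inCidr(InetAddress addr, String cidr) {
        byte[] raw = addr.getAddress();
        if (raw.length != 4) {
          return false;  // sketch handles IPv4 only
        }
        String[] parts = cidr.split("/");
        int prefixLen = Integer.parseInt(parts[1]);
        int mask = prefixLen == 0 ? 0 : -1 << (32 - prefixLen);
        try {
          int net = toInt(InetAddress.getByName(parts[0]).getAddress());
          return (toInt(raw) & mask) == (net & mask);
        } catch (UnknownHostException e) {
          return false;
        }
      }

      // Big-endian IPv4 bytes to int.
      static int toInt(byte[] b) {
        return ((b[0] & 0xff) << 24) | ((b[1] & 0xff) << 16) | ((b[2] & 0xff) << 8) | (b[3] & 0xff);
      }
    }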

In other words, out-of-the-box, you get a deployment that works fine within
a typical LAN environment, but won't allow some remote hacker to locate
your cluster and access your data. We thought this was a nice balance
between "works out of the box without lots of configuration" and "decent
security". In my opinion a "localhost-only by default" would be overly
restrictive since I'd usually be deploying on some datacenter or EC2
machine and then trying to access it from a client on my laptop.

We released this first a bit over a year ago if my memory serves me, and
we've had relatively few complaints or questions about it. We also made
sure that the error message that comes back to clients is pretty
reasonable, indicating the specific configuration that is disallowing
access, so if people hit the issue on upgrade they have a clear idea of what
is going on.

Of course it's not foolproof, since as Eric says, you're still likely open
to the entirety of your corporation, and you may not want that, but as he
also pointed out, that might be true even if you enable Kerberos
authentication.

-Todd


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Eric Yang
Hadoop's default configuration aims for user friendliness to increase adoption,
with security features to be enabled one by one. This approach is most
problematic for security because the system can be compromised before all the
security features are turned on.
Larry's proposal will add some safety by reminding the system admin when
security is disabled. However, reducing the number of knobs in the security
configs is likely required to make the system secure, and for the banner idea
to work without writing too much guessing logic to determine whether the UI is
secured. Penetration tests can provide better insight into what hasn't been
secured, to improve the next release. Thankfully, most Hadoop vendors have done
this work periodically to help the community secure Hadoop.

Plenty of companies advertise that if you want security, you should use
Kerberos. This statement is not entirely true. Kerberos makes security more
difficult for external parties to crack, but it shouldn't be the only method of
securing Hadoop. When the Kerberos environment is larger than the Hadoop
cluster, anyone within the Kerberos environment can access the Hadoop cluster
freely, without restriction. In large-scale enterprises, or for cloud vendors
that sublet their resources, this might not be acceptable.
 
From my point of view, a secure Hadoop release must default all settings to
localhost only and allow users to add more hosts through an authorized
whitelist of servers. This will keep the security perimeter in check. All
wildcard ACLs will need to be removed, or default to the current user/current
host only. The proxy user/host ACL list must be enforced on HTTP channels. This
is basically realigning the default configuration to a single-node cluster or a
firewalled configuration.
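
As a rough sketch of the "localhost only by default, explicit whitelist to
expand" part, in plain Java with a made-up property name (not an existing
Hadoop configuration key):

    import java.net.InetAddress;
    import java.util.HashSet;
    import java.util.Set;

    // Illustrative only: loopback callers are always allowed; anything else must be
    // listed explicitly by the admin, e.g. via a hypothetical comma-separated
    // "hadoop.security.authorized.hosts" property.
    public final class LocalhostDefaultPolicy {

      private final Set<String> authorizedHosts = new HashSet<>();

      public LocalhostDefaultPolicy(String configuredHosts) {
        if (configuredHosts != null && !configuredHosts.trim().isEmpty()) {
          for (String h : configuredHosts.split(",")) {
            authorizedHosts.add(h.trim());
          }
        }
      }

      /** Localhost works out of the box; other hosts need to be whitelisted. */
      public boolean permits(InetAddress caller) {
        return caller.isLoopbackAddress()
            || authorizedHosts.contains(caller.getHostAddress())
            || authorizedHosts.contains(caller.getCanonicalHostName());
      }
    }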

Regards,
Eric



Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread larry mccay
Hi Steve -

This is a long overdue DISCUSS thread!

Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the warning
to get to the page, like SSL exceptions in the browser do?
A similar tactic for UI access without SSL?
A new AuthenticationFilter could be added to the filter chains that blocks
API calls unless explicitly configured to be open, and obviously logs a similar
message?
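
Something along these lines, as a sketch only: a plain servlet filter with a
hypothetical init parameter, not the existing Hadoop AuthenticationFilter:

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;

    // Illustrative only: block requests unless the admin has explicitly opted in to
    // running unsecured, and log the warning banner text when that happens.
    public class UnsecuredAccessFilter implements Filter {

      private boolean insecureAccessPermitted;

      @Override
      public void init(FilterConfig conf) {
        // Hypothetical init parameter an admin has to set deliberately.
        insecureAccessPermitted =
            Boolean.parseBoolean(conf.getInitParameter("insecure.access.permitted"));
      }

      @Override
      public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
          throws IOException, ServletException {
        if (!insecureAccessPermitted) {
          System.err.println("WARNING: UNSECURED UI ACCESS - OPEN TO COMPROMISE; "
              + "request rejected because insecure.access.permitted is not set");
          ((HttpServletResponse) resp).sendError(HttpServletResponse.SC_FORBIDDEN,
              "Security is not enabled on this cluster; access is disabled by default");
          return;
        }
        chain.doFilter(req, resp);
      }

      @Override
      public void destroy() {
      }
    }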

thanks,

--larry






[DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-04 Thread Steve Loughran
Bitcoin is profitable enough to justify writing malware to run on Hadoop
clusters & schedule mining jobs: there have been a couple of incidents of this
in the wild, generally getting in through no security, well-known passwords, or
open ports.

Vendors of Hadoop-related products get to deal with their lockdown themselves, 
which they often do by installing kerberos from the outset, making users make 
up their own password for admin accounts, etc.

The ASF releases though: we just provide something insecure out the box and 
some docs saying "use kerberos if you want security"

What can we do here?

Some things to think about

* docs explaining IN CAPITAL LETTERS why you need to lock down your cluster to 
a private subnet or use Kerberos
* Anything which can be done to make Kerberos easier (?). I see there are some
outstanding patches for HADOOP-12649 which need review, but what else?

Could we have Hadoop determine when it's coming up on an open network and start 
warning? And how? 
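
One possible shape for that check, as a sketch only using the plain JDK (this
is not existing Hadoop behaviour):

    import java.net.InetAddress;
    import java.net.NetworkInterface;
    import java.net.SocketException;
    import java.util.Collections;

    // Illustrative only: at daemon start-up, warn if any interface carries a routable
    // address, i.e. one that is neither loopback, RFC 1918 site-local, nor link-local.
    public final class OpenNetworkCheck {

      public static void warnIfPubliclyReachable() throws SocketException {
        for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
          for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
            boolean privateAddr = addr.isLoopbackAddress()
                || addr.isSiteLocalAddress()   // 10/8, 172.16/12, 192.168/16
                || addr.isLinkLocalAddress();  // 169.254/16, fe80::/10
            if (!privateAddr) {
              System.err.println("WARNING: interface " + nic.getName() + " has routable address "
                  + addr.getHostAddress() + "; an unsecured cluster bound here is reachable "
                  + "from outside your LAN.");
            }
          }
        }
      }
    }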

At the very least, single-node Hadoop should be locked down. You shouldn't have
to bring up Kerberos to run it like that. And for more sophisticated multi-node
deployments, should the scripts refuse to work without Kerberos unless you pass
in some argument like "--Dinsecure-clusters-permitted"?
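
For example, a start-up guard of roughly this shape (a sketch only; the flag
name is the one suggested above, read here as a JVM system property, and the
class is hypothetical):

    // Illustrative only: refuse to launch an unsecured multi-node cluster unless the
    // operator has passed -Dinsecure-clusters-permitted=true on purpose.
    public final class InsecureStartupGuard {

      public static void checkOrDie(boolean kerberosEnabled, boolean multiNode) {
        boolean overridden = Boolean.getBoolean("insecure-clusters-permitted");
        if (multiNode && !kerberosEnabled && !overridden) {
          System.err.println("Refusing to start: security is disabled on a multi-node cluster. "
              + "Pass -Dinsecure-clusters-permitted=true to override.");
          System.exit(1);
        }
      }
    }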

Any other ideas?

