Re: [DISCUSS] Managing provider Connections via UI in managed Airflow services

Jackson, John Fri, 18 Jun 2021 13:23:24 -0700

That would certainly help a bit, but unfortunately it's not just the packages.  
It's the fact that authentication is tied to Python code that can be patched by 
anyone with permission to execute code on the web server, which in turn would 
give them access to packages or anything else they'd like.
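
To make the concern concrete, here is a minimal sketch of how code running in 
the same Python process can rebind an authorization check. The `auth` 
namespace, `is_admin`, `handle_request`, and `load_untrusted_plugin` are all 
hypothetical names made up for illustration; none of them is Airflow's actual 
API, and this is a toy model rather than a description of any real deployment.

```python
import types

# Hypothetical stand-in for a web server's auth module (not Airflow's real API).
auth = types.SimpleNamespace()


def is_admin(user: str) -> bool:
    """Original check: only users on the allow-list are admins."""
    return user in {"alice"}


auth.is_admin = is_admin


def handle_request(user: str) -> str:
    # The attribute is looked up at call time, so rebinding it takes effect.
    return "admin page" if auth.is_admin(user) else "forbidden"


before = handle_request("mallory")


def load_untrusted_plugin() -> None:
    """Simulates plugin code executing inside the web server process."""
    auth.is_admin = lambda user: True


load_untrusted_plugin()
after = handle_request("mallory")
print(before, "->", after)
```

The point of the sketch is only that, in Python, the check and the code that 
replaces it share one process, so restricting who may install packages does 
not by itself protect the check.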


What would be ideal is if the web server got its entire identity and 
capabilities from a central, secure source--think serialized DAGs, but for all 
Airflow UI extensions and configurations.  Then the UI is just that--providing 
UI--but does not contain any code that can be exploited.  It could be hardened 
and secured without impacting the extensible nature of Airflow.
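
As a rough sketch of that idea, assume UI extensions are described as plain 
data (a JSON manifest) that the web server verifies before use, instead of 
importing extension code. Everything here is hypothetical: the manifest 
layout, the `sign_manifest` and `load_verified_manifest` helpers, and the demo 
key are illustrative only. HMAC is used just to stay within the standard 
library; because the verifier would hold the same key, a real design would 
more likely use asymmetric signatures so the web server cannot sign anything.

```python
import hashlib
import hmac
import json


def sign_manifest(manifest: dict, key: bytes) -> str:
    """Sign the canonical JSON form of a manifest (done by the trusted source)."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def load_verified_manifest(payload: bytes, signature: str, key: bytes) -> dict:
    """Return the manifest only if its signature matches; otherwise raise."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("manifest signature mismatch")
    return json.loads(payload)


# Hypothetical manifest: the UI extension is data, not executable code.
key = b"demo-key-for-illustration-only"
manifest = {"menu_links": [{"label": "Docs", "href": "https://example.com/docs"}]}
payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
signature = sign_manifest(manifest, key)

loaded = load_verified_manifest(payload, signature, key)

# A modified payload no longer matches the signature and is rejected.
tampered = payload.replace(b"Docs", b"Evil")
try:
    load_verified_manifest(tampered, signature, key)
    tamper_rejected = False
except ValueError:
    tamper_rejected = True
```

Under these assumptions the web server only renders what the verified data 
describes, which is roughly the same trade-off serialized DAGs make for DAG 
files.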

On 2021-06-18, 1:00 PM, "Jarek Potiuk" <[email protected]> wrote:




    > The only way to be 100% sure that users aren't changing the way the web 
server behaves is to not permit its alteration.  UI plugins, package 
installations, and library changes are among the various vulnerabilities that 
could be exploited.  For example, I could write a plugin that patches the auth 
functions and allows everyone Admin access regardless of their predetermined 
role.  Without strict security controls there will be a limit to Airflow 
adoption amongst Enterprise customers.  For Airflow to grow, it must offer a 
secure-by-design-friendly infrastructure.  Ideally the web server is a window 
into what Airflow is doing, but does not allow access or modification to any of 
the internal behaviour of the system.

    Just a comment on this one. If this is only user vs. admin, I think
    this can be easily solved by only allowing admin users to add packages
    for the webserver, not the DAG writers. Will that solve the problem?

    J.



    On Fri, Jun 18, 2021 at 9:45 PM Jackson, John
    <[email protected]> wrote:
    >
    > Hi Folks,
    >
    > Product Manager for MWAA weighing in here, having spoken to--quite 
literally--hundreds of Airflow customers (both MWAA and in general).
    >
    > Enterprise organizations--those that use Airflow at scale--typically 
separate their "Administrators" from their "Users".  The former sets up the 
security controls, and makes sure that users can't violate their organization's 
data security while still providing access to (often sensitive) data in order 
to accomplish their business goals.  The latter are the folks writing DAGs and 
monitoring their execution, and sometimes see those security controls as a 
hindrance to the ease with which they can write their data pipelines and 
orchestration.
    >
    > The weak spot in the security model is the web based user interface.  It 
needs to be accessible to users, sitting at their laptops, with relative ease 
but cannot be permitted to perform arbitrary tasks; otherwise it can escape the 
bounds set for it.  Airflow is wonderful in that it's entirely written in Python 
and extensible.  However, that same ease of extensibility could easily be used 
to bypass the Administrator's security controls, such as auth plugins, and 
allow users access beyond what they should rightfully have (whether 
deliberately or by accident).
    >
    > The only way to be 100% sure that users aren't changing the way the web 
server behaves is to not permit its alteration.  UI plugins, package 
installations, and library changes are among the various vulnerabilities that 
could be exploited.  For example, I could write a plugin that patches the auth 
functions and allows everyone Admin access regardless of their predetermined 
role.  Without strict security controls there will be a limit to Airflow 
adoption amongst Enterprise customers.  For Airflow to grow, it must offer a 
secure-by-design-friendly infrastructure.  Ideally the web server is a window 
into what Airflow is doing, but does not allow access or modification to any of 
the internal behaviour of the system.
    >
    > Should there be some sort of signed and verified packages in the future, 
perhaps organizations will be more open to extensibility.  However, the "shared 
responsibility model" does not allow service providers, be it Astronomer, 
Google, AWS, or anyone else, to be cavalier with customers' security concerns; 
they must always default to the strictest security defaults possible.  Customers 
look to managed services to provide guard rails that prevent them from data 
breaches while still benefiting from the features and capabilities of the 
software platform.
    >
    > Cheers,
    >
    > John
    >
    > On 2021-06-18, 11:40 AM, "Jarek Potiuk" <[email protected]> wrote:
    >
    >
    >
    >
    >     I agree that this thread is probably not good for categorization of
    >     the offering but I also concur with Ash to get a better understanding
    >     of the risks involved.
    >
    >     I think I "feel" where it comes from and intuitively see that you
    >     might want to add additional or extra layers of precautions (and
    >     likely follow pressures from the internal security teams) but also
    >     Ash's point is quite important. We should get to the bottom of it, and
    >     if there are some real threats that we are not aware of, I think
    >     sharing details on [email protected] is the right thing to
    >     do.
    >
    >     Maybe we will find that other users of Airflow are also at risk and we
    >     might want to protect them (and also all managed services but also
    >     individual installations) in the future by introducing some changes in
    >     this model.
    >
    >     BTW. Subash - you do not need to have a subscription to write to
    >     [email protected]. Just send an email with the details and
    >     we will get it and we will be able to keep you in discussion when it
    >     follows. Also information for your security team
    >     https://www.apache.org/dev/pmc.html#mailing-list-private . One of the
    >     main purposes of the private@ mailing list is pre-disclosing security
    >     problems related to the project. And we are all obliged as PMCs (and
    >     all ASF members who read the list as well) to not disclose what is
    >     discussed there.
    >
    >     J,
    >
    >     On Fri, Jun 18, 2021 at 4:04 PM Ash Berlin-Taylor <[email protected]> 
wrote:
    >     >
    >     > No one has yet explained what the security concerns actually are. Is 
there some concrete thing that is a worry, or is it merely a concern that more 
things installed = marginally more risky?
    >     >
    >     > The blast radius is limited to a single Airflow deployment, and 
access is I assume sufficiently gated behind IAM perms anyway?
    >     >
    >     > By not letting users install extra modules into the webserver 
image you are also removing their ability to use third party providers, such as 
these
    >     >
    >     > 
https://github.com/great-expectations/airflow-provider-great-expectations
    >     > https://github.com/fivetran/airflow-provider-fivetran
    >     > https://github.com/anyscale/airflow-provider-ray
    >     >
    >     > -- and there are only going to be more of these over time.
    >     >
    >     > Not to mention this blocks UI plugins entirely.
    >     >
    >     > I don't quite understand why MWAA concerns itself with exactly what 
is being installed in the webserver image on top of Airflow -- the Amazon 
Shared Responsibility model would I think already cover the "AWS takes care of 
the base, 'you' take care of what is running" (but I confess I haven't re-read 
it in a number of years)
    >     >
    >     > -ash
    >     >
    >     > On Fri, Jun 18 2021 at 07:06:53 +0000, "Canapathy, Subash" 
<[email protected]> wrote:
    >     >
    >     > Irrespective of personal categorization of the managed offering's 
Airflow-ness, there are obligations to adhere to a security bar and to secure 
against any attack vectors a UI feature can introduce – and this will be true 
for any cloud service provider. I want to clarify that we were not suggesting 
changing any assumptions in the current way of packaging providers, but merely 
noting that we cannot treat this as equivalent to the earlier monorepo and add 
all 60+ of them to the base image.
    >     >
    >     >
    >     >
    >     > Going back to the original discussion, we are in the process of 
pre-installing providers with Apache 2 license right away and others will be 
added (with approved exception) based on user demand.
    >     >
    >     >
    >     >
    >     > From: Ash Berlin-Taylor <[email protected]>
    >     > Reply-To: "[email protected]" <[email protected]>
    >     > Date: Wednesday, June 16, 2021 at 1:11 AM
    >     > To: "[email protected]" <[email protected]>
    >     > Subject: RE: [EXTERNAL] [DISCUSS] Managing provider Connections via 
UI in managed Airflow services
    >     >
    >     >
    >     >
    >     > On Tue, Jun 15 2021 at 18:21:56 +0000, "Canapathy, Subash" 
<[email protected]> wrote:
    >     >
    >     > Regarding security constraints on why we disallow plugins and 
requirements on the webserver, I will have to discuss this in person on PMC but 
on a high level this comes down to remote code execution prevention on managed 
instances, opening possibilities of exploiting vulnerabilities on the 
flask-app-builder and the underlying python runtime.
    >     >
    >     >
    >     >
    >     > I'm sorry, I don't agree with this summary.
    >     >
    >     >
    >     >
    >     > Airflow's job is to run user submitted code, and to allow the UI to 
be pluggable.
    >     >
    >     >
    >     >
    >     > Are you providing Airflow, or an Airflow like service?
    >
    >
    >
    >     --
    >     +48 660 796 129
    >


    --
    +48 660 796 129
