My suggestion is to require new functionality (and maybe even updates) to 
implement an API in both local and json. That's an intermediate step towards 
no longer allowing direct DB access. 
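
To make the idea concrete, here is a rough sketch of what "an API in local and json" could look like. Everything in it is an assumption for illustration only: the names ApiClient, LocalClient, JsonClient, get_pools and list_pools_command are made up and are not the existing Airflow client code. The HTTP call is injected so the sketch runs without a server.

```python
import json
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

# NOTE: all names below are hypothetical; this is not Airflow's actual client.


class ApiClient(ABC):
    """Interface that every CLI command would program against."""

    @abstractmethod
    def get_pools(self) -> List[Dict]:
        ...


class LocalClient(ApiClient):
    """Transitional client: calls the shared API layer in-process."""

    def __init__(self, common_api: Callable[[], List[Dict]]):
        # common_api stands in for a function from the shared API layer
        self._common_api = common_api

    def get_pools(self) -> List[Dict]:
        return self._common_api()


class JsonClient(ApiClient):
    """Remote client: exchanges JSON over HTTP and never touches the DB."""

    def __init__(self, base_url: str, http_get: Callable[[str], str]):
        # http_get is injected so the example needs no running server
        self._base_url = base_url.rstrip("/")
        self._http_get = http_get

    def get_pools(self) -> List[Dict]:
        return json.loads(self._http_get(self._base_url + "/pools"))


def list_pools_command(client: ApiClient) -> List[str]:
    """A CLI command that only knows about the client interface."""
    return [pool["name"] for pool in client.get_pools()]
```

The point of the design is that the CLI command never sees a session: swapping LocalClient for JsonClient changes where the work happens, not the command itself.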

Basically I'm saying that no new or updated CLI functionality is allowed to 
use "airflow.utils.session" directly; it needs to go through 
"airflow.api.common" (or similar). 
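
A rule like this can be checked mechanically. Below is a minimal sketch, assuming a standalone check built on Python's ast module rather than a real pylint plugin; the function name find_forbidden_imports is hypothetical. It reports the line numbers of imports that pull in "airflow.utils.session".

```python
import ast
from typing import List

FORBIDDEN_MODULE = "airflow.utils.session"


def find_forbidden_imports(source: str, forbidden: str = FORBIDDEN_MODULE) -> List[int]:
    """Return line numbers of imports that pull in the forbidden module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import airflow.utils.session [as alias]
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # from airflow.utils.session import X
            # from airflow.utils import session
            names = [node.module] + [
                f"{node.module}.{alias.name}" for alias in node.names
            ]
        else:
            continue
        if any(n == forbidden or n.startswith(forbidden + ".") for n in names):
            hits.append(node.lineno)
    return sorted(hits)
```

Running it over the source of each CLI module and failing the build on a non-empty result would be one way to enforce the rule. It only catches static imports, so it is a guard rail rather than a guarantee.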

This can be done now. 

Recent FAB has some nice stuff for REST APIs, including authentication. This 
integrates well with the rest of the web stuff we have. Although I would love 
to see a gRPC implementation.

B.


> On 18 Jan 2020, at 13:19, Jarek Potiuk <[email protected]> wrote:
> 
> Absolutely yes. I think this is really what is in the current plans/roadmap
> :).
> 
> It's just a matter of when and how to enforce it. The current experimental
> API is well .... still experimental.
> 
> What we really need to do is implement a much more complete API
> support/approach - which is on the Airflow 2.0
> roadmap. AFAIK - Kamil is going to start discussions and make some
> proposals for that this week.
> And decoupling the CLI from the DB is rather high on the list for the API I believe.
> 
> I think it's really important to solve it "well" - i.e. introduce flexible
> authorisation/authentication mechanism to the api,
> have a way to decouple both client and web server from database operations
> and eventually decouple
> workers from DB access. This is our ultimate goal and I think we should
> define a broader picture/target now
> (i.e. how to design the API so that it serves all those options in the
> future) and have a plan on how to gradually
> introduce it so that several contributors/committers can take part in this
> process. One of the sequences
> of introducing the API might be for example:
> 
> 1) either graduate existing experimental API to be "official" or introduce
> a new "official"
>    API solution if we find it better
> 2) reimplement all CLI commands to use the new API
> 3) Reimplement web server to use the new API.
> 4) (very long term) decouple workers from the database.
> 
> I think forbidding CLI to access database should happen between 1) and 2) -
> when we have an "official" API solution
> in place (and it should be automatically verified - we can easily add
> a pylint plugin that can check the CLI package for
> db usage). In my opinion we cannot expect people to use the API until it goes
> out of experimental / we have agreed on a viable,
> stable long-term alternative.
> 
> Once this is in place - no new CLI command should be allowed with
> direct DB access.
> 
> J.
> 
> 
>> On Sat, Jan 18, 2020 at 12:59 PM Bolke de Bruin <[email protected]> wrote:
>> 
>> Hi All,
>> 
>> I’ve noticed that we are still implementing new features or are doing
>> refactoring of CLI commands that directly interface with the database
>> instead of using the abstractions that should be made available from the
>> API specification. Why is this an issue? The CLI is used by arbitrary users
>> to interface with Airflow operations. Airflow relies on the database to be
>> its single source of truth. A user that is able to read the configuration
>> of Airflow is currently able to manipulate the database. The CLI requires
>> database access hence the information to deal with the database is in the
>> configuration file. To improve security the CLI should use a REST API, which
>> allows for proper authn/authz and segregation of duties.
>> 
>> In the past I have introduced the experimental API with a “local_client”
>> and “json_client” implementation. The local_client still allows for direct
>> database access and its only function is to be there during the transition
>> period until the full REST API is available. After that it should be
>> deprecated and removed.
>> 
>> My suggestion is to disallow any new functionality in the CLI that directly
>> relies on “airflow.utils.session” and only allow new functionality to go
>> through the API client. For now that would mean 2 implementations: local
>> and json. Of course refactoring the current state should be on the list in
>> order to remove the “local_client”.
>> 
>> The API client should be available to other packages as well. So maybe we
>> should package the cli and client api implementations into
>> “airflow-client”.
>> 
>> What are your thoughts?
>> 
>> Thanks
>> Bolke
>> 
> 
> 
> -- 
> 
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
> 
> M: +48 660 796 129 <+48660796129>
