Hi Kaxil,

Just sent out the AIP:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-24+DAG+Persistence+in+DB+using+JSON+for+Airflow+Webserver+and+%28optional%29+Scheduler

Thanks!
Zhou


On Fri, Jul 26, 2019 at 1:33 PM Zhou Fang <zhouf...@google.com> wrote:

> Hi Kaxil,
>
> We are also working on persisting DAGs into the DB using JSON for the Airflow
> webserver in Google Composer. We aim to minimize changes to the current
> Airflow code. Happy to sync on this!
>
> Here is our progress:
> (1) Serializing DAGs using Pickle to be used in webserver
> It has been launched in Composer. I am working on the PR to upstream it:
> https://github.com/apache/airflow/pull/5594
> Currently it does not support non-Airflow operators and we are working on
> a fix.
>
> (2) Caching Pickled DAGs in DB to be used by webserver
> We have a proof-of-concept implementation and are working on an AIP now.
>
> (3) Using JSON instead of Pickle in (1) and (2)
> We decided to use JSON because Pickle is neither secure nor human readable.
> The serialization approach is very similar to (1).
>
> I will update the PR (https://github.com/apache/airflow/pull/5594) to
> replace Pickle with JSON, and send our design of (2) as an AIP next week.
> Glad to check together whether our implementation makes sense and to
> improve on it.
>
> Thanks!
> Zhou
>
>
> On Fri, Jul 26, 2019 at 7:37 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
>
>> Hi all,
>>
>> We, at Astronomer, are going to spend time working on DAG Serialisation.
>> There are 2 AIPs that are somewhat related to what we plan to work on:
>>
>>    - AIP-18 Persist all information from DAG file in DB
>>      <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-18+Persist+all+information+from+DAG+file+in+DB>
>>    - AIP-19 Making the webserver stateless
>>      <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-19+Making+the+webserver+stateless>
>>
>> We plan to use JSON as the Serialisation format and store it as a blob in
>> the metadata DB.
>>
>> *Goals:*
>>
>>    - Make Webserver Stateless
>>    - Use the same version of the DAG across Webserver & Scheduler
>>    - Keep backward compatibility and have a flag (globally & at DAG level)
>>    to turn this feature on/off
>>    - Enable DAG Versioning (extended Goal)
>>
>>
>> We will be preparing a proposal (AIP) after some research and some initial
>> work, and will open it up for suggestions from the community.
>>
>> We already had some good brainstorming sessions with Twitter folks (DanD &
>> Sumit), folks from GoDataDriven (Fokko & Bas) & Alex (from Uber), which will
>> be a good starting point for us.
>>
>> If anyone in the community is interested in this, or has experience with it
>> and wants to collaborate, please let me know and join the
>> #dag-serialisation channel on Airflow Slack.
>>
>> Regards,
>> Kaxil
>>
>
