Hi Kaxil,

Just sent out the AIP:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-24+DAG+Persistence+in+DB+using+JSON+for+Airflow+Webserver+and+%28optional%29+Scheduler

Thanks!
Zhou

On Fri, Jul 26, 2019 at 1:33 PM Zhou Fang <zhouf...@google.com> wrote:

> Hi Kaxil,
>
> We are also working on persisting DAGs into DB using JSON for Airflow
> webserver in Google Composer. We aim to minimize changes to the
> current Airflow code. Happy to get synced on this!
>
> Here is our progress:
> (1) Serializing DAGs using Pickle to be used in webserver
> It has been launched in Composer. I am working on the PR to upstream it:
> https://github.com/apache/airflow/pull/5594
> Currently it does not support non-Airflow operators and we are working on
> a fix.
>
> (2) Caching Pickled DAGs in DB to be used by webserver
> We have a proof-of-concept implementation, working on an AIP now.
>
> (3) Using JSON instead of Pickle in (1) and (2)
> Decided to use JSON because Pickle is neither secure nor human-readable.
> The serialization approach is very similar to (1).
>
> I will update the PR (https://github.com/apache/airflow/pull/5594) to
> replace Pickle with JSON, and send our design of (2) as an AIP next week.
> Glad to check together whether our implementation makes sense and make
> improvements to it.
>
> Thanks!
> Zhou
>
>
> On Fri, Jul 26, 2019 at 7:37 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
>
>> Hi all,
>>
>> We, at Astronomer, are going to spend time working on DAG Serialisation.
>> There are 2 AIPs that are somewhat related to what we plan to work on:
>>
>> - AIP-18 Persist all information from DAG file in DB
>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-18+Persist+all+information+from+DAG+file+in+DB>
>> - AIP-19 Making the webserver stateless
>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-19+Making+the+webserver+stateless>
>>
>> We plan to use JSON as the Serialisation format and store it as a blob in
>> metadata DB.
>>
>> *Goals:*
>>
>> - Make Webserver Stateless
>> - Use the same version of the DAG across Webserver & Scheduler
>> - Keep backward compatibility and have a flag (globally & at DAG level)
>> to turn this feature on/off
>> - Enable DAG Versioning (extended Goal)
>>
>>
>> We will be preparing a proposal (AIP) after some research and some initial
>> work and will open it for suggestions from the community.
>>
>> We already had some good brain-storming sessions with Twitter folks (DanD &
>> Sumit), folks from GoDataDriven (Fokko & Bas) & Alex (from Uber), which will
>> be a good starting point for us.
>>
>> If anyone in the community is interested in it or has some experience with
>> the same and wants to collaborate, please let me know and join the
>> #dag-serialisation channel on Airflow Slack.
>>
>> Regards,
>> Kaxil
>>
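
For readers following the thread, the idea of serialising a DAG to JSON and storing it as a blob in the metadata DB could look roughly like the sketch below. This is only an illustration under stated assumptions, not the implementation from the PR or the AIP: the DAG is reduced to a plain dict, an in-memory SQLite database stands in for the Airflow metadata DB, and the table name `serialized_dag` and the helper functions are all hypothetical.

```python
import json
import sqlite3

# Hypothetical, simplified stand-in for a DAG: plain data only, no callables.
EXAMPLE_DAG = {
    "dag_id": "example_dag",
    "schedule_interval": "@daily",
    "tasks": [
        {"task_id": "extract", "operator": "BashOperator", "downstream": ["load"]},
        {"task_id": "load", "operator": "BashOperator", "downstream": []},
    ],
}


def serialize_dag(dag):
    """Turn the DAG dict into a JSON string (human-readable, unlike Pickle)."""
    return json.dumps(dag, sort_keys=True)


def deserialize_dag(blob):
    """Rebuild the DAG dict from JSON; no code is executed while loading."""
    return json.loads(blob)


def create_table(conn):
    """Create the hypothetical table that holds one JSON blob per DAG."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS serialized_dag ("
        "dag_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
    )


def save_dag(conn, dag):
    """Writer side (e.g. the process that parses DAG files): upsert the blob."""
    conn.execute(
        "INSERT OR REPLACE INTO serialized_dag (dag_id, data) VALUES (?, ?)",
        (dag["dag_id"], serialize_dag(dag)),
    )
    conn.commit()


def load_dag(conn, dag_id):
    """Reader side (e.g. the webserver): fetch from the DB, never parse DAG files."""
    row = conn.execute(
        "SELECT data FROM serialized_dag WHERE dag_id = ?", (dag_id,)
    ).fetchone()
    return deserialize_dag(row[0]) if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the metadata DB
    create_table(conn)
    save_dag(conn, EXAMPLE_DAG)
    print(load_dag(conn, "example_dag")["tasks"][0]["task_id"])
```

In this sketch the reader only ever touches the database, which is the sense in which the webserver becomes stateless, and both sides see the same stored version of the DAG.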