My previous email gives more detail on why I believe CWL should be part of Airflow. Or should I elaborate further?
On 2019/11/13 21:09:52, Maxime Beauchemin <maximebeauche...@gmail.com> wrote:
> The big question is why can't it just be in its own GitHub repository and
> in its own PyPI package? Why does it have to be packaged with our PyPI
> package or live in the Airflow repo?
>
> Max
>
> On Wed, Nov 13, 2019 at 12:42 PM Andrey Kartashov <por...@porter.st> wrote:
>
> > I don't quite get what this example is supposed to prove. I agree XML is
> > too much for everything, but CWL is not XML. And our point is not to
> > bring in a new pipeline manager; we don't have any other than Airflow.
> > We do not do conversion because we have to change CWL files to send them
> > to other universities/hospitals/companies to reproduce results and get
> > improved CWL back to run it on our lovely Airflow. If we converted CWL
> > to Python and changed it in Python, we would later have to convert it
> > back to CWL. Our CWLDAG is just an Airflow DAG and can be used as usual
> > within any Airflow pipeline, with task.set_upstream or
> > task.set_downstream, to integrate a CWL pipeline into a bigger Airflow
> > pipeline.
> >
> > > Speaking of oozie-2-airflow - as one of the creators of it. Here is the
> > > repo: https://github.com/GoogleCloudPlatform/oozie-to-airflow
> > >
> > > And some basic high-level concepts for O2A:
> > >
> > >    - The world becomes a better place - one XML less at a time. This
> > >    was our motto. From what we know - Oozie XML workflows are pretty
> > >    much universally hated (well, maybe just disliked) in the data
> > >    processing world. But there are people who have 1000s of Oozie
> > >    workflows, so we wanted to give people a tool to easily convert
> > >    many XMLs to Airflow DAGs.
> > >    - It's a one-way conversion. The aim is to take Oozie XML workflows
> > >    as the source, convert them to DAGs and continue working on the
> > >    generated DAGs.
> > >    - We accept that we are not perfect. We know we can convert many
> > >    Oozie workflows and they will work, but we have a number of
> > >    limitations
> > >    (https://github.com/GoogleCloudPlatform/oozie-to-airflow#common-known-limitations).
> > >    We also have a list of issues
> > >    (https://github.com/GoogleCloudPlatform/oozie-to-airflow/issues)
> > >    that are still open and need solving for some more sophisticated
> > >    features of Oozie. But this is OK. We accept that we are not
> > >    perfect - the DAGs generated by O2A can be further manually
> > >    modified and evolved as Python DAGs to add missing features.
> > >    - The Python code generated by O2A is generated with a focus on
> > >    readability. Those Python files are well written, formatted and
> > >    structured in a way that resembles what a human programmer would
> > >    write, so that it is easy to take them forward and modify them
> > >    manually. For example, we can then add all the Airflow-specific
> > >    features Maxime mentioned - queues, xcom, callbacks, etc. So we do
> > >    not have to be perfect.