Hi J,
You always start with "Hello Andrey" (so now I feel uncomfortable :)

> I'd like to start by saying I'm a big big fan of Airflow, otherwise there
> > wouldn’t be CWL-Airflow :). I like everything about Airflow development.
> > Especially the way of adding extra packages to Airflow like Kubernetes for
> > example with just pip install 'apache-airflow[kubernetes]'. I believe it
> > would be nice to have pip install 'apache-airflow[cwl]' too.
> 
> 
> This can be easily done as Maxime said - have a separate cwl-airflow
> package in pypi and have it configured as dependency in 'cwl' extra. I see
> no problem with that. I assume you have a perfect converter that always
> produces a good and schedulable .dag file. I imagine the users will run (on
> demand) the converter to convert the cwl descriptor + job to a RESULT.py in
> the dag folder and they otherwise run a pretty standard Airflow to process
> it. Do you imagine a more "involved" integration with Airflow? If so - how
> do you imagine the use case? Could you explain how you envision the
> life-cycle of such a workflow?
> 

I found some old threads on Google Groups.
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/airbnb_airflow/_muNQ7jTAkQ/qeWCaLHiAAAJ

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/airbnb_airflow/nk2GAciOADI/gW1pWTwRDwAJ

I hope the ideas and questions in those threads do not interfere with the
current discussion. I have not mentioned anything about `cwl descriptor + job`
(Michael K probably did :(), but in the current implementation the CWL
description is static and long-lived :). One can then attach a sensor to the
top of the CWLDAG to check for files in a directory or, as we do, trigger that
DAG with
`airflow trigger_dag --conf "{\"job\":$(cat ./hg19.job)}" "bowtie-index"`, so we
start it with a kind of XCom data already filled in.
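
To make clearer what that command passes along, here is a minimal sketch in plain Python (no Airflow imports). The function names and the example job contents are hypothetical, made up for illustration only; the real hg19.job is not shown in this thread. It assumes the job file is JSON, since the shell command embeds it verbatim, and that a task would read the resulting dictionary from the DAG run's conf.

```python
import json


def build_trigger_conf(job_text: str) -> str:
    """Wrap the contents of a job file as {"job": ...}, like the --conf argument."""
    # Assumption: the job file is JSON (the shell command pastes it in as-is).
    job = json.loads(job_text)
    return json.dumps({"job": job})


def job_from_conf(conf: dict) -> dict:
    """Pull the job inputs out of a conf dictionary; empty dict if none was passed."""
    return conf.get("job", {})


# Hypothetical job contents, for illustration only.
example_job = '{"fasta_file": {"class": "File", "location": "/data/hg19.fa"}}'

conf_string = build_trigger_conf(example_job)  # what would go after --conf
conf = json.loads(conf_string)                 # what a task would see as the conf
print(job_from_conf(conf)["fasta_file"]["location"])
```

The point is only that the CWL description stays fixed in the DAG while the per-run inputs travel with the trigger.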


> >
> > Here I'm citing from http://commonwl.org: "The Common Workflow Language
> > (CWL) is an open standard for describing analysis workflows and tools in a
> > way that makes them portable and scalable across a variety of software and
> > hardware environment". So CWL is a standard and it is more about how to
> > describe a workflow in a way that everybody will understand and be able to
> > reproduce. It might not yet be widely implemented and lacks some features,
> > but it’s definitely not about a favorite programming language or pipeline
> > manager. Of course, this CWL specification brings some limitations and they
> > exist because CWL attempts to formalize the most common features of any
> > pipeline.
> >
> 
> Do you imagine that ALL Airflow users start using CWL as their main
> workflow language? For the reasons described above (Python use/Data
> Scientist approach) I think this is not going to happen for Airflow. I can
> understand that you want to run CWL workflows using Airflow, but I do not
> see Airflow DAG developers switching to use CWL as their main workflow
> language. And for people who have 100s or 1000s of DAGs there is no easy
> way to convert them to CWL (is there?).
> 

I never thought about rewriting all the Python DAG code :) into CWL. I did say
that there are people who have decided to use CWL, and they are already limited
to the environments they have, like Toil, IBM HPC and others. For those who
picked up Airflow, CWL is the solution.


> 
> > We believe that it’s worth moving towards CWL standard based on the
> > growing interest of people and big companies who run scientific pipelines.
> > There are a lot of published scientific papers. Projects such as Toil,
> > Arvados, Galaxy, Taverna and others already have solutions to run CWL
> > pipelines. People are interested in pipelines that are easy to share. Even
> > IBM released a CWL HPC executor.
> >
> 
> I understand there are papers. I also understand that there is a group of
> people who would like to have easily shareable pipelines. Why do you think
> this is important for Airflow users to have them?
> 

I do not believe that it is possible to convert anybody (so I separate existing
users and new users), but I strongly believe that there is a huge community of
biologists, chemists, astronomers, physicists, etc. that would appreciate CWL
in Airflow.


> > CWL-Airflow creates an Airflow DAG with all its steps on the fly on every
> > dagbag refresh. I think this behavior can be considered a converter. More
> > details can be found here
> > https://cwl-airflow.readthedocs.io/en/1.0.18/readme/how_it_works.html#what-s-inside
> >
> 
> It's not a question of time, but rather the question of focus (and missed
> opportunities) - we already have a full roadmap for 2.0 and beyond and we
> are all working hard on it. Since you already have a working solution - why
> do you not want to maintain it? What's the problem you want to solve by
> donating the code to the Airflow team?
> 

Because I do it in my free time, and at some point I might just have no free time.
