lboudard commented on issue #17490: URL: https://github.com/apache/airflow/issues/17490#issuecomment-898448207
I agree on this subject. The pod operator is currently missing some very handy features that the [kubernetes job controller](https://kubernetes.io/docs/concepts/workloads/controllers/job/) implements, such as a time to live after success/failure. I also agree that the distinction between the Kubernetes executor and the KubernetesPodOperator is not very clear yet.

In our use case, since we have very different DAG types living in the same Airflow instance, we run multiple images through pod operators (which we used before the Kubernetes executor and the TaskFlow API): for instance, one image to parse new batches of data and another to train models on them in a separate DAG. But that is not ideal, since the workflow dependencies are not properly bound in code. Instead of pipelines like

```
read_file | parse | feature_engineering | train_model
read_file | archive
```

which describe direct data dependencies in code (the Airflow TaskFlow way, or equivalently in Spark or Apache Beam), we rather have

```
schedule_parse_file_and_store(raw_data_batch_location)
schedule_feature_engineer(raw_data_batch_location)
schedule_train_model(feature_engineered_batch_location)
```
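For reference, the Job controller feature mentioned at the top (time to live after success/failure) is the `ttlSecondsAfterFinished` field on the Job spec, which the pod operator has no equivalent for. A minimal Job manifest sketch (the name and image are illustrative, not from this project):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: parse-batch                  # illustrative name
spec:
  ttlSecondsAfterFinished: 300       # garbage-collect the Job 5 minutes after it finishes
  template:
    spec:
      containers:
        - name: parser
          image: example.com/parser:latest   # illustrative image
      restartPolicy: Never
```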
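To make the contrast concrete, here is a minimal sketch of the data-dependency style in plain Python (no Airflow dependency; all function names and the toy data are illustrative, mirroring the pipeline names above). Each step's output is explicitly the next step's input, so the dependency graph is visible in the code itself rather than implied by scheduling:

```python
def read_file(path):
    # Stand-in for loading a raw batch from storage.
    return {"path": path, "rows": ["a,1", "b,2"]}

def parse(raw):
    # Split each CSV-ish row into fields.
    return [row.split(",") for row in raw["rows"]]

def feature_engineering(parsed):
    # Derive a trivial numeric feature per record.
    return [int(fields[1]) * 2 for fields in parsed]

def train_model(features):
    # Stand-in for training: summarize the features.
    return {"n_samples": len(features), "mean": sum(features) / len(features)}

def archive(raw):
    # Stand-in for moving the raw batch to cold storage.
    return raw["path"] + ".archived"

# Dependencies are bound in code:
#   read_file | parse | feature_engineering | train_model
#   read_file | archive
raw = read_file("batch_001.csv")
model = train_model(feature_engineering(parse(raw)))
archived = archive(raw)
```

In the scheduling style, by contrast, each `schedule_*` call only shares a storage location with the others, so the dependency between steps lives outside the code.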