Hi Manu,

Thanks for raising this question. There is a PR for moving to hdfs3: <https://github.com/apache/incubator-airflow/pull/3560>. The existing codebase already contains code that supports HA <https://github.com/apache/incubator-airflow/blob/53b89b98371c7bb993b242c341d3941e9ce09f9a/airflow/hooks/hdfs_hook.py#L92-L96>, but this might not apply to the sensor.
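
To make the "list of namenodes" idea (option 2 in the quoted message) a bit more concrete, here is a rough sketch of what client-side failover could look like. It is only an illustration: `first_active_namenode`, `NoActiveNamenodeError` and the `probe` callable are made-up names, not part of Airflow, Snakebite, hdfs3 or pyarrow, and the real sensor may need something different.

```python
from typing import Callable, Iterable


class NoActiveNamenodeError(Exception):
    """Raised when none of the configured namenodes answers the probe."""


def first_active_namenode(
    namenodes: Iterable[str],
    probe: Callable[[str], bool],
) -> str:
    """Return the first namenode for which `probe` succeeds.

    `probe` is a hypothetical callable supplied by the caller. It should
    return True when the given namenode is active, and may raise or return
    False when the namenode is in standby or unreachable.
    """
    failures = []
    for host in namenodes:
        try:
            if probe(host):
                return host
            failures.append((host, "probe returned False"))
        except Exception as exc:  # treat any error as "try the next one"
            failures.append((host, repr(exc)))
    raise NoActiveNamenodeError(
        "no active namenode among: {}".format(failures)
    )


if __name__ == "__main__":
    # Fake probe for demonstration only: "nn1" behaves like a standby node.
    def fake_probe(host: str) -> bool:
        if host == "nn1":
            raise ConnectionError("standby")
        return True

    print(first_active_namenode(["nn1", "nn2"], fake_probe))
```

In a sensor, the probe would presumably be a cheap call against each namenode (for example a status request on a known path), so that a namenode switch only costs one failed attempt before the check moves on to the next host.
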
Personally I'm not familiar with pyarrow.hdfs, so I'm not the one to judge how mature it is. We definitely need to replace Snakebite, since it is only compatible with Python 2.7.

Cheers,
Fokko

On Wed, 29 Aug 2018 at 04:29, Manu Zhang <owenzhang1...@gmail.com> wrote:

> Hi all,
>
> We've been using WebHdfsSensor happily to sense the state of upstream
> tasks outputting to HDFS, except when there is a namenode switch. I've
> opened https://issues.apache.org/jira/browse/AIRFLOW-2901 to discuss the
> HDFS HA support.
>
> There are two solutions that I can see:
>
> 1. use pyarrow.hdfs, which has HA support
> 2. allow the user to configure a list of namenodes
>
> WDYT?
>
> Thanks,
> Manu Zhang