[
https://issues.apache.org/jira/browse/AIRFLOW-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manu Zhang updated AIRFLOW-2901:
--------------------------------
Description:
If HDFS is configured with HA, we cannot use WebHdfsSensor to check for file
existence since WebHdfs cannot resolve the name service ID. Consider using
[pyarrow.hdfs|https://arrow.apache.org/docs/python/filesystems.html] as a
replacement.
Alternatively, if the dependencies of pyarrow (including libhdfs.so) are too
heavy, allow users to configure a list of namenodes.
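
A minimal sketch of the namenode-list idea, not Airflow's implementation: it
asks each configured namenode for the file status over the WebHDFS REST
endpoint (op=GETFILESTATUS) and takes the first definitive answer. The names
path_exists and fetch, and the host:port values in the usage note, are
hypothetical. It assumes a standby namenode answers with a status other than
200 or 404 (typically 403 with a StandbyException), so the loop moves on to
the next entry.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def _http_status(url, timeout=10):
    """GET `url` and return (status_code, body) without raising on 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


def path_exists(namenodes, path, fetch=_http_status):
    """Check an absolute HDFS `path` against each namenode in turn.

    `namenodes` is a list of "host:port" strings (hypothetical config value).
    Returns True or False from the first namenode that gives a definitive
    answer; raises RuntimeError if none does.
    """
    errors = []
    for nn in namenodes:
        url = "http://{}/webhdfs/v1{}?op=GETFILESTATUS".format(nn, quote(path))
        try:
            status, _body = fetch(url)
        except OSError as exc:  # connection refused, timeout, DNS failure
            errors.append((nn, str(exc)))
            continue
        if status == 200:  # active namenode, file exists
            return True
        if status == 404:  # active namenode, file not found
            return False
        # Assumption: any other status means standby or unusable; try the next.
        errors.append((nn, "HTTP {}".format(status)))
    raise RuntimeError("no namenode gave a definitive answer: {}".format(errors))
```

For example, path_exists(["nn1.example.com:50070", "nn2.example.com:50070"],
"/data/_SUCCESS") would try nn1 first and fall back to nn2 if nn1 is the
standby. The fetch parameter exists only so the HTTP call can be replaced in
tests.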
was:If HDFS is configured with HA, we cannot use WebHdfsSensor to check for
file existence since WebHdfs cannot resolve the name service ID. Consider using
[pyarrow.hdfs|https://arrow.apache.org/docs/python/filesystems.html] as a
replacement.
> WebHdfsSensor doesn't support HDFS HA
> -------------------------------------
>
> Key: AIRFLOW-2901
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2901
> Project: Apache Airflow
> Issue Type: Improvement
> Components: hooks
> Reporter: Manu Zhang
> Priority: Major
>
> If HDFS is configured with HA, we cannot use WebHdfsSensor to check for file
> existence since WebHdfs cannot resolve the name service ID. Consider using
> [pyarrow.hdfs|https://arrow.apache.org/docs/python/filesystems.html] as a
> replacement.
> Alternatively, if the dependencies of pyarrow (including libhdfs.so) are too
> heavy, allow users to configure a list of namenodes.