[ https://issues.apache.org/jira/browse/AIRFLOW-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876277#comment-16876277 ]
ASF subversion and git services commented on AIRFLOW-4862:
----------------------------------------------------------

Commit 29176da5dab9d3b4f558e5e6a94ddfa850a4f556 in airflow's branch refs/heads/v1-10-stable from Xiaodong
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=29176da ]

[AIRFLOW-4862] Allow directly using IP address as hostname (#5501)

(cherry picked from commit 9ddde7296ccbbad8c78aefd4d19d197e3e7b1893)

> Allow directly using IP address as hostname in airflow.utils.net.get_hostname()
> -------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4862
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4862
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: utils
>    Affects Versions: 1.10.3
>            Reporter: Xiaodong DENG
>            Assignee: Xiaodong DENG
>            Priority: Minor
>             Fix For: 1.10.4
>
>
> In airflow.utils.net.get_hostname(), the default function used to get the
> hostname of a node (such as a worker) is *socket.getfqdn()*, which returns
> the fully qualified domain name.
> In some cases, we need to ensure that hostnames can be resolved so that nodes
> can talk to each other. One example: if I use S3 for remote logging, the log
> is only pushed to S3 after the job has finished (either success or failure).
> While the job is still running, the webserver first checks whether the log is
> available in its own volume; if not, the webserver fetches the log from the
> worker.
> If the workers' hostnames are something like "airflow-worker-53-4bp8v" (e.g.,
> when running on OpenShift or K8S), the hostname may not be resolvable, and we
> will observe errors like the one below:
> {code:java}
> *** Log file does not exist: /opt/app-root/airflow/logs/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log
> *** Fetching from: http://airflow-worker-57-n69vb:8793/log/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log
> *** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-57-n69vb', port=8793): Max retries exceeded with url: /log/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fec3266a6d8>: Failed to establish a new connection: [Errno -2] Name or service not known',))
> {code}
>
> This may be addressed by setting up service discovery properly, but users
> may not always have the privilege to do that (like myself in my organization).
> Another solution is to change "hostname_callable" in Airflow's configuration
> ([core] section). But that only works if you have a function which returns a
> resolvable hostname and takes no arguments (due to the existing
> implementation:
> [https://github.com/apache/airflow/blob/dd08ae3469a50a145f9ae7f819ed1840fe2a5bd6/airflow/utils/net.py#L41-L45]).
>
> The change I would like to propose is: allow users to use the IP address
> directly as the hostname.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
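
For illustration only, the proposal in the quoted issue can be sketched as a zero-argument callable that returns the node's IP address rather than its name. This is a minimal sketch and not necessarily the code from the commit referenced above; the function name get_host_ip_address is used here purely as an illustrative name, and the sketch assumes each node can resolve its own fully qualified domain name locally.

```python
import socket


def get_host_ip_address():
    """Illustrative helper: return this node's IPv4 address as a string.

    It takes no arguments, matching the constraint on "hostname_callable"
    described in the issue.
    """
    # socket.getfqdn() is the default named in the issue; resolving it on the
    # node itself yields an address other nodes can reach without needing
    # name resolution for hostnames like "airflow-worker-53-4bp8v".
    # Assumption: the node can resolve its own name; otherwise
    # socket.gethostbyname() raises socket.gaierror.
    return socket.gethostbyname(socket.getfqdn())


if __name__ == "__main__":
    print(get_host_ip_address())
```

Under the same assumptions, such a function could be referenced from "hostname_callable" in the [core] section, so that the webserver fetches logs from http://<worker-ip>:8793/... instead of from an unresolvable worker name. The module path to use in the configuration depends on where the function lives and is not specified here.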