[ 
https://issues.apache.org/jira/browse/AIRFLOW-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876342#comment-16876342
 ] 

ASF subversion and git services commented on AIRFLOW-4862:
----------------------------------------------------------

Commit 29176da5dab9d3b4f558e5e6a94ddfa850a4f556 in airflow's branch 
refs/heads/v1-10-test from Xiaodong
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=29176da ]

[AIRFLOW-4862] Allow directly using IP address as hostname (#5501)


(cherry picked from commit 9ddde7296ccbbad8c78aefd4d19d197e3e7b1893)


> Allow directly using IP address as hostname in 
> airflow.utils.net.get_hostname()
> -------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4862
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4862
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: utils
>    Affects Versions: 1.10.3
>            Reporter: Xiaodong DENG
>            Assignee: Xiaodong DENG
>            Priority: Minor
>             Fix For: 1.10.4
>
>
> In airflow.utils.net.get_hostname(), the default function used to get the 
> hostname of a node (such as a worker) is *socket.getfqdn()*, which returns 
> the fully qualified domain name.
> In some cases, we need to ensure that hostnames can be resolved so that nodes 
> can talk to each other. One example: if I use S3 for remote logging, the log 
> is pushed to S3 only after the job has finished (either success or failure). 
> While the job is still running, the webserver first checks whether the log is 
> available in its own volume and, if not, fetches the log from the worker.
> If the workers' hostnames are something like "airflow-worker-53-4bp8v" (e.g., 
> when running on OpenShift or K8S), the hostname may not be resolvable, and we 
> will observe errors like the one below:
> {code:java}
> *** Log file does not exist: 
> /opt/app-root/airflow/logs/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log
> *** Fetching from: 
> http://airflow-worker-57-n69vb:8793/log/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log
> *** Failed to fetch log file from worker. 
> HTTPConnectionPool(host='airflow-worker-57-n69vb', port=8793): Max retries 
> exceeded with url: 
> /log/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log
>  (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 
> 0x7fec3266a6d8>: Failed to establish a new connection: [Errno -2] Name or 
> service not known',))
> {code}
>  
> This may be addressed by properly setting up service discovery, but users 
> may not always have the privilege to do that (like myself in my organization).
> Another solution is to change "hostname_callable" in Airflow's configuration 
> ([core] section). But this only works if you have a function that returns a 
> resolvable hostname and takes no arguments (due to the existing 
> implementation: 
> [https://github.com/apache/airflow/blob/dd08ae3469a50a145f9ae7f819ed1840fe2a5bd6/airflow/utils/net.py#L41-L45]).
>  
> The change I would like to propose is: allow users to use the node's IP 
> address directly as its hostname.
>  
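
To illustrate the proposal quoted above, here is a minimal sketch, not the
code from the commit. It assumes the "hostname_callable" value has the form
"module:function" and names a function that takes no arguments. The names
host_ip_address and resolve_hostname are hypothetical, and the loopback
fallback is an assumption made only to keep the sketch runnable on hosts
whose own name does not resolve.

```python
import importlib
import socket


def host_ip_address():
    """Hypothetical zero-argument callable: return this host's IPv4 address."""
    try:
        # Resolve the local fully qualified domain name to an IPv4 address.
        return socket.gethostbyname(socket.getfqdn())
    except OSError:
        # Assumption for the sketch only: fall back to loopback. A real
        # deployment would more likely want to fail loudly here.
        return "127.0.0.1"


def resolve_hostname(callable_path=None):
    """Sketch of a 'module:function' lookup with socket.getfqdn() as default."""
    if not callable_path:
        # No callable configured: keep the default behaviour described above.
        return socket.getfqdn()
    module_path, attr_name = callable_path.split(":")
    func = getattr(importlib.import_module(module_path), attr_name)
    # The callable is invoked without arguments, which is why a
    # zero-argument function is required.
    return func()
```

With something like this, a worker could advertise an address that other
nodes can reach without name resolution, for example by pointing
"hostname_callable" at a module that exposes host_ip_address.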



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
