[ 
https://issues.apache.org/jira/browse/AIRFLOW-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876893#comment-16876893
 ] 

ASF GitHub Bot commented on AIRFLOW-4862:
-----------------------------------------

ashb commented on pull request #5513: [AIRFLOW-4862] Fix bug for earlier change 
to allow using IP as hostname
URL: https://github.com/apache/airflow/pull/5513
 
 
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow directly using IP address as hostname in 
> airflow.utils.net.get_hostname()
> -------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4862
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4862
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: utils
>    Affects Versions: 1.10.3
>            Reporter: Xiaodong DENG
>            Assignee: Xiaodong DENG
>            Priority: Minor
>             Fix For: 1.10.5
>
>
> In airflow.utils.net.get_hostname(), the default function used to get the
> hostname of a node (such as a worker) is *socket.getfqdn()*, which returns
> the fully qualified domain name.
> In some cases, we need to ensure that hostnames can be resolved so that nodes
> can talk to each other. For example, if I use S3 for remote logging, the log
> is only pushed to S3 after the job has finished (either success or failure).
> While the job is still running, the webserver first checks whether the log is
> available in its own volume; if not, the webserver fetches the log from the
> worker.
> If the workers' hostnames are something like "airflow-worker-53-4bp8v" (e.g.,
> when running on OpenShift or K8S), the hostname may not be resolvable, and we
> will observe errors like the one below:
> {code:java}
> *** Log file does not exist: 
> /opt/app-root/airflow/logs/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log
> *** Fetching from: 
> http://airflow-worker-57-n69vb:8793/log/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log
> *** Failed to fetch log file from worker. 
> HTTPConnectionPool(host='airflow-worker-57-n69vb', port=8793): Max retries 
> exceeded with url: 
> /log/example_python_operator/sleep_for_3/2019-06-18T08:14:15.313472+00:00/2.log
>  (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 
> 0x7fec3266a6d8>: Failed to establish a new connection: [Errno -2] Name or 
> service not known',))
> {code}
>  
> This may be addressed by properly setting up service discovery, but users
> may not always have the privilege to do that (like myself in my organization).
> Another solution is to change "hostname_callable" in Airflow's configuration
> ([core] section). But this only works if you have a function that returns a
> resolvable hostname, and that function can't take any arguments (due to the
> existing implementation:
> [https://github.com/apache/airflow/blob/dd08ae3469a50a145f9ae7f819ed1840fe2a5bd6/airflow/utils/net.py#L41-L45]).
>  
> The change I would like to propose is: allow users to use the IP address
> directly as the hostname.
>  
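
For illustration, here is a minimal sketch of the idea described above. It is not Airflow's actual code. It assumes that "hostname_callable" holds a string of the form "module:function" and that the named function is called with no arguments; the names host_ip_address and resolve_hostname are hypothetical, and the 127.0.0.1 fallback is only there to keep the sketch runnable on machines whose own name does not resolve.

```python
import importlib
import socket


def host_ip_address() -> str:
    """Return an IPv4 address for this host instead of its name.

    Hypothetical helper for illustration; not taken from Airflow.
    """
    try:
        # Resolve this machine's own FQDN to an IPv4 address.
        return socket.gethostbyname(socket.getfqdn())
    except OSError:
        # Assumption: fall back to loopback if the local name cannot be resolved.
        return "127.0.0.1"


def resolve_hostname(callable_path: str = "socket:getfqdn") -> str:
    """Import "module:function" and call the function with no arguments.

    The "module:function" format is an assumption about how the
    hostname_callable setting is written.
    """
    module_name, _, func_name = callable_path.partition(":")
    func = getattr(importlib.import_module(module_name), func_name)
    return func()


if __name__ == "__main__":
    print("default callable:", resolve_hostname())
    print("IP as hostname:  ", host_ip_address())
```

Because the callable takes no arguments, a zero-argument function that returns an IP address fits the existing mechanism once it lives in an importable module, and other nodes can then reach the worker without any name resolution.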



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
