zachary-naylor opened a new issue #15116:
URL: https://github.com/apache/airflow/issues/15116


   **Apache Airflow version**: 2.0.1
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): N/A
   
   **Environment**:
   - **Cloud provider or hardware configuration**: AWS
   - **OS** (e.g. from /etc/os-release): Ubuntu WSL 20.04.2 LTS / Windows 10 19041.844
   - **Kernel** (e.g. `uname -a`): 4.4.0-19041-Microsoft x86_64 GNU/Linux
   - **Install tools**:
   - **Others**: Local debug via WSL1-backed Docker Engine v20.10.5 / docker-compose 1.28.4 (running apache/airflow:2.0.1-python3.8)
   
   **What happened**: 
   With ```AIRFLOW__CELERY__POOL``` set to 'eventlet', logs are neither written to nor retrieved from the defined S3 remote bucket (remote logging is enabled and the 'aws_default' connection exists).
   
   This produces both of these error messages in the worker console:
   - Could not verify previous log to append: maximum recursion depth exceeded while calling a Python object
   - Could not write logs to s3://####/logs/airflow2/trial_dag_airflow2/gen_extract_trial/2021-03-31T13:38:24.330747+00:00/1.log
 
   ```
   [2021-03-31 13:38:26,934: INFO/MainProcess] Airflow Connection: aws_conn_id=aws_default
   [2021-03-31 13:38:26,944: INFO/MainProcess] No credentials retrieved from Connection
   [2021-03-31 13:38:26,944: INFO/MainProcess] Creating session with aws_access_key_id=None region_name=None
   [2021-03-31 13:38:26,962: INFO/MainProcess] role_arn is None
   [2021-03-31 13:38:26,963: ERROR/MainProcess] Could not write logs to s3://####/logs/airflow2/trial_dag_airflow2/gen_extract_trial/2021-03-31T13:38:24.330747+00:00/1.log
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/log/s3_task_handler.py", line 186, in s3_write
       self.hook.load_string(
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 61, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 90, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 563, in load_string
       self._upload_file_obj(file_obj, key, bucket_name, replace, encrypt, acl_policy)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 653, in _upload_file_obj
       client = self.get_conn()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 455, in get_conn
       return self.conn
     File "/home/airflow/.local/lib/python3.8/site-packages/cached_property.py", line 36, in __get__
       value = obj.__dict__[self.func.__name__] = self.func(obj)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 437, in conn
       return self.get_client_type(self.client_type, region_name=self.region_name)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 410, in get_client_type
       return session.client(client_type, endpoint_url=endpoint_url, config=config, verify=self.verify)
     File "/home/airflow/.local/lib/python3.8/site-packages/boto3/session.py", line 258, in client
       return self._session.create_client(
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/session.py", line 826, in create_client
       credentials = self.get_credentials()
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/session.py", line 430, in get_credentials
       self._credentials = self._components.get_component(
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/session.py", line 924, in get_component
       self._components[name] = factory()
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/session.py", line 151, in _create_credential_resolver
       return botocore.credentials.create_credential_resolver(
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/credentials.py", line 72, in create_credential_resolver
       container_provider = ContainerProvider()
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/credentials.py", line 1817, in __init__
       fetcher = ContainerMetadataFetcher()
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/utils.py", line 1976, in __init__
       session = botocore.httpsession.URLLib3Session(
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/httpsession.py", line 180, in __init__
       self._manager = PoolManager(**self._get_pool_manager_kwargs())
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/httpsession.py", line 188, in _get_pool_manager_kwargs
       'ssl_context': self._get_ssl_context(),
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/httpsession.py", line 197, in _get_ssl_context
       return create_urllib3_context()
     File "/home/airflow/.local/lib/python3.8/site-packages/botocore/httpsession.py", line 72, in create_urllib3_context
       context.options |= options
     File "/usr/local/lib/python3.8/ssl.py", line 602, in options
       super(SSLContext, SSLContext).options.__set__(self, value)
     File "/usr/local/lib/python3.8/ssl.py", line 602, in options
       super(SSLContext, SSLContext).options.__set__(self, value)
     File "/usr/local/lib/python3.8/ssl.py", line 602, in options
       super(SSLContext, SSLContext).options.__set__(self, value)
     [Previous line repeated 456 more times]
   RecursionError: maximum recursion depth exceeded while calling a Python object
   ```
   Recursion errors are also recorded in the task logs shown in the Airflow UI, together with a 'Falling back to local log' entry. When the same setup is run through AWS ECS, no logs are accessible in the UI at all.
   ```
   *** Falling back to local log
   *** Reading local file: /opt/airflow/logs/trial_dag_airflow2/gen_extract_trial/2021-03-31T13:38:27.146858+00:00/1.log
   ```
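   
   To help triage outside Airflow, the behaviour of the eventlet-pooled worker can be approximated with a short standalone script. This is only a hedged sketch: the profile and key are placeholders, and it may not reproduce the RecursionError outside the Celery worker, where the ssl/botocore import order differs.
   ```
   # Hedged standalone approximation of the failing path: monkey-patch first
   # (roughly what the eventlet Celery pool does), then upload via boto3,
   # mirroring the S3 log handler's write. Profile/bucket/key are placeholders.
   import eventlet
   eventlet.monkey_patch()

   import boto3  # imported after patching, as in the worker process

   session = boto3.session.Session(profile_name="dev_access")  # placeholder profile
   s3 = session.client("s3")
   s3.put_object(
       Bucket="####",  # placeholder bucket
       Body=b"eventlet + boto3 smoke test",
       Key="var/airflow2/eventlet_smoke_test.txt",
   )
   print("upload succeeded")
   ```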
   
   **What you expected to happen**: 
   Logs should be written to the defined S3 bucket, using credentials from the mounted /home/airflow/.aws directory or inherited from the ECS task operator, without recursion errors.
   
   When 'prefork' is used, it appears an attempt is made to locate credentials within the mounted volume:
   [2021-03-31 14:41:22,863: INFO/ForkPoolWorker-15] Found credentials in shared credentials file: ~/.aws/credentials
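   
   For reference, which credential provider boto3/botocore actually resolves inside the worker container can be checked with a small snippet (a hedged diagnostic sketch; it assumes the ~/.aws mount and AWS_PROFILE are visible to the airflow user):
   ```
   # Hedged diagnostic: report whether botocore resolves credentials and via
   # which provider (e.g. "shared-credentials-file" for ~/.aws/credentials).
   import boto3

   session = boto3.session.Session()  # honours AWS_PROFILE if set
   credentials = session.get_credentials()
   print("profile:", session.profile_name)
   if credentials is None:
       print("no credentials resolved")
   else:
       print("provider:", credentials.method)
   ```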
   
   **How to reproduce it**:
   * Set and export the AWS_PROFILE variable (```AWS_PROFILE="dev_access"; export AWS_PROFILE```)
   * Use the docker-compose setup (https://github.com/apache/airflow/blob/master/docs/apache-airflow/start/docker-compose.yaml)
   * Append to or update the docker-compose [x-airflow-common] block with the following:
   ```
   image: apache/airflow:2.0.1-python3.8
   environment:
     AIRFLOW__CORE__LOAD_EXAMPLES: 'False'
     AIRFLOW__CELERY__POOL: 'eventlet'
     AIRFLOW__LOGGING__REMOTE_LOGGING: 'True'
     AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: 'aws_default'
     AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: 's3://####/logs/airflow2'
     AWS_PROFILE: ${AWS_PROFILE}
   volumes:
     -  /c/Users/user.name/.aws:/home/airflow/.aws
   ```
   * Start the docker-compose containers. Once started, ensure the following connection exists:
   ```
   * conn_id: aws_default
   * conn_type: aws
   ```
   * Two test DAGs were used. The first is a simple DAG with a PythonOperator calling a function containing a print statement. The second DAG is identical, but with the following code embedded within the function instead of the print statement (a sketch of the full second DAG is included after this list):
   ```
       import boto3
   
       s3 = boto3.client("s3")
       s3.put_object(
           Bucket="####",
           Body="Test content from Airflow2",
            Key="var/airflow2/test_log_file.txt",
       )
   ```
   * Running either of the two DAGs results in logs that cannot be read from or written to S3. With the second DAG, recursion errors are also encountered in the local task log.
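   
   For reference, a hedged sketch of what the second test DAG looks like (the DAG id and task id are taken from the log paths above; the start date and schedule are illustrative, not the exact values used):
   ```
   # Hedged sketch of the second test DAG: a single PythonOperator whose callable
   # writes a small object to S3 with boto3. Bucket name is redacted as in the report.
   from datetime import datetime

   import boto3
   from airflow import DAG
   from airflow.operators.python import PythonOperator


   def upload_test_object():
       s3 = boto3.client("s3")
       s3.put_object(
           Bucket="####",  # redacted bucket, as above
           Body="Test content from Airflow2",
           Key="var/airflow2/test_log_file.txt",
       )


   with DAG(
       dag_id="trial_dag_airflow2",
       start_date=datetime(2021, 3, 1),  # illustrative
       schedule_interval=None,
       catchup=False,
   ) as dag:
       PythonOperator(
           task_id="gen_extract_trial",
           python_callable=upload_test_object,
       )
   ```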
   
   **Anything else we need to know**: 
   This has been reproduced with both the ```apache/airflow:2.0.1-python3.8``` and ```apache/airflow:2.0.0-python3.8``` Docker images. I have not tried alternative Python versions.
   
   When AIRFLOW__CELERY__POOL is set to 'prefork' or 'solo', no recursion errors are encountered and logs are read from and written to S3 as expected.
   
   Running the same test with the Docker images ```apache/airflow:1.10.12-python3.8```, ```apache/airflow:1.10.14-python3.8``` and ```apache/airflow:1.10.15-python3.8``` (with the environment variables amended as below and Airflow 1.x compatible DAGs) succeeds in reading from and writing to S3 without recursion errors.
   ```
   environment:
     AIRFLOW__CORE__LOAD_EXAMPLES: 'False'
     AIRFLOW__CELERY__POOL: 'eventlet'
     AIRFLOW__CORE__REMOTE_LOGGING: 'True'
     AIRFLOW__CORE__REMOTE_LOG_CONN_ID: 'aws_default'
     AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: 's3://####/logs/airflow'
     AWS_PROFILE: ${AWS_PROFILE}
   volumes:
     -  /c/Users/user.name/.aws:/home/airflow/.aws
   ```
   Upgrading eventlet from 0.30.1 to 0.30.2 has no effect. Likewise, downgrading to gevent 1.5.0 and eventlet 0.25.2 has had no effect.
   
   Adding an airflow_local_settings.py to the ${AIRFLOW_HOME} directory within the Docker image, with variations of the following code at the top (following suggestions elsewhere), has had no effect:
   ```
   import eventlet.debug
   eventlet.monkey_patch()
   ```
   ```
   from gevent import monkey
   monkey.patch_ssl()
   ```
   ```
   from gevent import monkey
   monkey.patch_all()
   ```
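   
   One way to confirm whether that patching actually takes effect in the worker process is to log eventlet's own view of it (a hedged diagnostic sketch using eventlet.patcher.is_monkey_patched; it can be run from a PythonOperator callable or added temporarily to the settings file):
   ```
   # Hedged diagnostic: report which modules eventlet considers monkey-patched
   # in the current process.
   import eventlet.patcher

   for name in ("socket", "ssl", "select", "threading", "time"):
       print(name, "patched:", eventlet.patcher.is_monkey_patched(name))
   ```
   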
   This may be related to https://github.com/apache/airflow/issues/8212

