[GitHub] [airflow] mik-laj commented on a change in pull request #6644: [AIRFLOW-6047] Simplify the logging configuration template
mik-laj commented on a change in pull request #6644: [AIRFLOW-6047] Simplify the logging configuration template
URL: https://github.com/apache/airflow/pull/6644#discussion_r350099913

## File path: airflow/config_templates/airflow_local_settings.py ##

@@ -148,26 +128,61 @@
     }
 }

-REMOTE_HANDLERS = {
-    's3': {
+# Only update the handlers and loggers when CONFIG_PROCESSOR_MANAGER_LOGGER is set.
+# This is to avoid exceptions when initializing RotatingFileHandler multiple times
+# in multiple processes.
+if os.environ.get('CONFIG_PROCESSOR_MANAGER_LOGGER') == 'True':
+    DEFAULT_LOGGING_CONFIG['handlers'] \
+        .update(DEFAULT_DAG_PARSING_LOGGING_CONFIG['handlers'])
+    DEFAULT_LOGGING_CONFIG['loggers'] \
+        .update(DEFAULT_DAG_PARSING_LOGGING_CONFIG['loggers'])
+
+    # Manually create log directory for processor_manager handler as RotatingFileHandler
+    # will only create file but not the directory.
+    processor_manager_handler_config = DEFAULT_DAG_PARSING_LOGGING_CONFIG['handlers'][
+        'processor_manager']
+    directory = os.path.dirname(processor_manager_handler_config['filename'])
+    mkdirs(directory, 0o755)
+
+# Remote logging configuration
+
+# Storage bucket URL for remote logging
+# S3 buckets should start with "s3://"
+# GCS buckets should start with "gs://"
+# WASB buckets should start with "wasb"
+# just to help Airflow select correct handler
+REMOTE_BASE_LOG_FOLDER = conf.get('core', 'REMOTE_BASE_LOG_FOLDER')
+
+ELASTICSEARCH_HOST = conf.get('elasticsearch', 'HOST')
+
+REMOTE_LOGGING = conf.getboolean('core', 'remote_logging')
+
+if REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('s3://'):
+    S3_REMOTE_HANDLERS = {
         'task': {
             'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler',
             'formatter': 'airflow',

Review comment:
This is not common to all handlers, so it will be problematic.
My Stackdriver handler uses the following configuration:
https://github.com/PolideaInternal/airflow/blob/e2511a74bfdd3824845ae037e4a50de127c223d6/airflow/config_templates/airflow_local_settings.py

```python
# Excerpt: `conf` and DEFAULT_LOGGING_CONFIG are defined earlier in the settings file.
from urllib.parse import urlparse

gcp_conn_id = conf.get('core', 'REMOTE_LOG_CONN_ID', fallback=None)
# stackdriver:///airflow-tasks => airflow-tasks
REMOTE_BASE_LOG_FOLDER = urlparse(REMOTE_BASE_LOG_FOLDER).path[1:]
STACKDRIVER_REMOTE_HANDLERS = {
    'task': {
        'class': 'airflow.utils.log.stackdriver_task_handler.StackdriverTaskHandler',
        'formatter': 'airflow',
        'name': REMOTE_BASE_LOG_FOLDER,
        'gcp_conn_id': gcp_conn_id
    }
}

DEFAULT_LOGGING_CONFIG['handlers'].update(STACKDRIVER_REMOTE_HANDLERS)
```

I'm also afraid that pulling only part of the configuration out into a separate variable will make the file harder to understand. This is not ordinary code that must follow DRY rules to avoid problems. It is a configuration file in which each block serves a different purpose. The blocks look similar, but each has its own separate role. Above all, this file should be easy for our users to understand and adapt to their specific case.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
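The path-stripping step in the Stackdriver snippet above can be checked in isolation. This is a minimal sketch, not code from the pull request: the helper name `stackdriver_log_name` is hypothetical and only wraps the same `urlparse(...).path[1:]` expression.

```python
from urllib.parse import urlparse


def stackdriver_log_name(remote_base_log_folder: str) -> str:
    """Return the URL path without its leading slash.

    Hypothetical helper for illustration; it mirrors the expression
    urlparse(REMOTE_BASE_LOG_FOLDER).path[1:] from the snippet above.
    """
    return urlparse(remote_base_log_folder).path[1:]


# Three slashes: the network location is empty and the path is "/airflow-tasks".
print(stackdriver_log_name("stackdriver:///airflow-tasks"))

# Two slashes: "airflow-tasks" is parsed as the network location, so the path is empty.
print(repr(stackdriver_log_name("stackdriver://airflow-tasks")))
```

The second call suggests why the comment in the snippet spells out the three-slash form: with only two slashes the log name would silently become an empty string.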
mik-laj commented on a change in pull request #6644: [AIRFLOW-6047] Simplify the logging configuration template
URL: https://github.com/apache/airflow/pull/6644#discussion_r350742530

## File path: airflow/config_templates/airflow_local_settings.py ##

@@ -191,32 +215,6 @@
             'json_format': ELASTICSEARCH_JSON_FORMAT,
             'json_fields': ELASTICSEARCH_JSON_FIELDS
         },
-    },
-}
-
-REMOTE_LOGGING = conf.getboolean('core', 'remote_logging')
-
-# Only update the handlers and loggers when CONFIG_PROCESSOR_MANAGER_LOGGER is set.
-# This is to avoid exceptions when initializing RotatingFileHandler multiple times
-# in multiple processes.
-if os.environ.get('CONFIG_PROCESSOR_MANAGER_LOGGER') == 'True':
-    DEFAULT_LOGGING_CONFIG['handlers'] \
-        .update(DEFAULT_DAG_PARSING_LOGGING_CONFIG['handlers'])
-    DEFAULT_LOGGING_CONFIG['loggers'] \
-        .update(DEFAULT_DAG_PARSING_LOGGING_CONFIG['loggers'])
-
-    # Manually create log directory for processor_manager handler as RotatingFileHandler
-    # will only create file but not the directory.
-    processor_manager_handler_config = DEFAULT_DAG_PARSING_LOGGING_CONFIG['handlers'][
-        'processor_manager']
-    directory = os.path.dirname(processor_manager_handler_config['filename'])
-    mkdirs(directory, 0o755)
+    }

-if REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('s3://'):
-    DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['s3'])
-elif REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('gs://'):
-    DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['gcs'])
-elif REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('wasb'):
-    DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['wasb'])
-elif REMOTE_LOGGING and ELASTICSEARCH_HOST:
-    DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['elasticsearch'])
+    DEFAULT_LOGGING_CONFIG['handlers'].update(ELASTIC_REMOTE_HANDLERS)

Review comment:
I added an else statement at the end. Does it look good to you?
mik-laj commented on a change in pull request #6644: [AIRFLOW-6047] Simplify the logging configuration template
URL: https://github.com/apache/airflow/pull/6644#discussion_r350992546

## File path: airflow/config_templates/airflow_local_settings.py ##

@@ -191,32 +215,6 @@
             'json_format': ELASTICSEARCH_JSON_FORMAT,
             'json_fields': ELASTICSEARCH_JSON_FIELDS
         },
-    },
-}
-
-REMOTE_LOGGING = conf.getboolean('core', 'remote_logging')
-
-# Only update the handlers and loggers when CONFIG_PROCESSOR_MANAGER_LOGGER is set.
-# This is to avoid exceptions when initializing RotatingFileHandler multiple times
-# in multiple processes.
-if os.environ.get('CONFIG_PROCESSOR_MANAGER_LOGGER') == 'True':
-    DEFAULT_LOGGING_CONFIG['handlers'] \
-        .update(DEFAULT_DAG_PARSING_LOGGING_CONFIG['handlers'])
-    DEFAULT_LOGGING_CONFIG['loggers'] \
-        .update(DEFAULT_DAG_PARSING_LOGGING_CONFIG['loggers'])
-
-    # Manually create log directory for processor_manager handler as RotatingFileHandler
-    # will only create file but not the directory.
-    processor_manager_handler_config = DEFAULT_DAG_PARSING_LOGGING_CONFIG['handlers'][
-        'processor_manager']
-    directory = os.path.dirname(processor_manager_handler_config['filename'])
-    mkdirs(directory, 0o755)
+    }

-if REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('s3://'):
-    DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['s3'])
-elif REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('gs://'):
-    DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['gcs'])
-elif REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('wasb'):
-    DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['wasb'])
-elif REMOTE_LOGGING and ELASTICSEARCH_HOST:
-    DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['elasticsearch'])
+    DEFAULT_LOGGING_CONFIG['handlers'].update(ELASTIC_REMOTE_HANDLERS)

Review comment:
I didn't want to increase the level of indentation, but it will be clearer this way, so I made the change.
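The structure discussed in the last two comments (one outer check for remote logging, an inner prefix chain, and a final else) can be sketched as below. This is an illustration under assumptions, not the code merged in the pull request: the function `select_remote_handler` and its return values are hypothetical, and raising an error in the else branch is only one plausible choice, since the thread does not say what the added else does.

```python
from typing import Optional


def select_remote_handler(remote_logging: bool,
                          base_log_folder: str,
                          elasticsearch_host: str) -> Optional[str]:
    """Pick a remote handler key from the configuration values.

    Hypothetical helper: the names and the ValueError are assumptions
    made for this sketch.
    """
    if not remote_logging:
        # Remote logging disabled: keep the default local handlers.
        return None

    # One extra indentation level replaces repeating
    # "REMOTE_LOGGING and ..." in every branch.
    if base_log_folder.startswith('s3://'):
        return 's3'
    elif base_log_folder.startswith('gs://'):
        return 'gcs'
    elif base_log_folder.startswith('wasb'):
        return 'wasb'
    elif elasticsearch_host:
        return 'elasticsearch'
    else:
        # Assumed behavior: fail loudly instead of silently logging locally.
        raise ValueError(
            "Remote logging is enabled, but no handler matches the configuration."
        )


print(select_remote_handler(True, 's3://my-bucket/logs', ''))
print(select_remote_handler(False, 's3://my-bucket/logs', ''))
```

With the flat `elif REMOTE_LOGGING and ...` chain, a misconfigured folder falls through every branch without any signal; the nested form gives the final else a single, unambiguous meaning.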