[GitHub] [airflow] mik-laj commented on a change in pull request #9531: Support .airflowignore for plugins
mik-laj commented on a change in pull request #9531: URL: https://github.com/apache/airflow/pull/9531#discussion_r448633985 ## File path: airflow/plugins_manager.py ## @@ -164,34 +165,31 @@ def load_plugins_from_plugin_directory(): global plugins # pylint: disable=global-statement log.debug("Loading plugins from directory: %s", settings.PLUGINS_FOLDER) -# Crawl through the plugins folder to find AirflowPlugin derivatives -for root, _, files in os.walk(settings.PLUGINS_FOLDER, followlinks=True): # noqa # pylint: disable=too-many-nested-blocks -for f in files: -filepath = os.path.join(root, f) -try: -if not os.path.isfile(filepath): -continue -mod_name, file_ext = os.path.splitext( -os.path.split(filepath)[-1]) -if file_ext != '.py': -continue - -log.debug('Importing plugin module %s', filepath) - -loader = importlib.machinery.SourceFileLoader(mod_name, filepath) -spec = importlib.util.spec_from_loader(mod_name, loader) -mod = importlib.util.module_from_spec(spec) -sys.modules[spec.name] = mod -loader.exec_module(mod) -for mod_attr_value in list(mod.__dict__.values()): -if is_valid_plugin(mod_attr_value): -plugin_instance = mod_attr_value() -plugins.append(plugin_instance) -except Exception as e: # pylint: disable=broad-except -log.exception(e) -path = filepath or str(f) -log.error('Failed to import plugin %s', path) -import_errors[path] = str(e) +for file_path in find_path_from_directory( +settings.PLUGINS_FOLDER, ".airflowignore"): + +if not os.path.isfile(file_path): +continue +mod_name, file_ext = os.path.splitext(os.path.split(file_path)[-1]) +if file_ext != '.py': +continue Review comment: ```suggestion ``` Should this not be part of the find_path_from_directory function? I would like to see that there is no code in the plugins_manager.py file that is responsible for the file selection. Plugins_manager should only load the module. WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] mik-laj commented on a change in pull request #9531: Support .airflowignore for plugins : .pluginingore
mik-laj commented on a change in pull request #9531: URL: https://github.com/apache/airflow/pull/9531#discussion_r446609259 ## File path: airflow/plugins_manager.py ## @@ -164,8 +163,28 @@ def load_plugins_from_plugin_directory(): global plugins # pylint: disable=global-statement log.debug("Loading plugins from directory: %s", settings.PLUGINS_FOLDER) +patterns_by_dir: Dict[str, List[Pattern[str]]] = {} + # Crawl through the plugins folder to find AirflowPlugin derivatives -for root, _, files in os.walk(settings.PLUGINS_FOLDER, followlinks=True): # noqa # pylint: disable=too-many-nested-blocks +for root, dirs, files in os.walk(settings.PLUGINS_FOLDER, followlinks=True): # noqa # pylint: disable=too-many-nested-blocks + +patterns: List[Pattern[str]] = patterns_by_dir.get(root, []) +ignore_file = os.path.join(root, '.pluginignore') + +if os.path.isfile(ignore_file): +with open(ignore_file, 'r') as file: Review comment: I like this code now. :-) Moving this method to airflow.utils.file is a good idea. This will allow us to delete the duplicate code. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] mik-laj commented on a change in pull request #9531: Support .airflowignore for plugins : .pluginingore
mik-laj commented on a change in pull request #9531: URL: https://github.com/apache/airflow/pull/9531#discussion_r446609058 ## File path: docs/concepts.rst ## @@ -1465,3 +1465,13 @@ would not be scanned by Airflow at all. This improves efficiency of DAG finding) The scope of a ``.airflowignore`` file is the directory it is in plus all its subfolders. You can also prepare ``.airflowignore`` file for a subfolder in ``DAG_FOLDER`` and it would only be applicable for that subfolder. + + +.pluginignore +'' + +A ``.pluginignore`` file specifies the directories or files in ``PLUGINS_FOLDER`` Review comment: Why the new file name? In my opinion, you can use `.airflowignore` and it will be easier to use. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] mik-laj commented on a change in pull request #9531: Support .airflowignore for plugins : .pluginingore
mik-laj commented on a change in pull request #9531: URL: https://github.com/apache/airflow/pull/9531#discussion_r446210550 ## File path: airflow/plugins_manager.py ## @@ -164,8 +163,28 @@ def load_plugins_from_plugin_directory(): global plugins # pylint: disable=global-statement log.debug("Loading plugins from directory: %s", settings.PLUGINS_FOLDER) +patterns_by_dir: Dict[str, List[Pattern[str]]] = {} + # Crawl through the plugins folder to find AirflowPlugin derivatives -for root, _, files in os.walk(settings.PLUGINS_FOLDER, followlinks=True): # noqa # pylint: disable=too-many-nested-blocks +for root, dirs, files in os.walk(settings.PLUGINS_FOLDER, followlinks=True): # noqa # pylint: disable=too-many-nested-blocks + +patterns: List[Pattern[str]] = patterns_by_dir.get(root, []) +ignore_file = os.path.join(root, '.pluginignore') + +if os.path.isfile(ignore_file): +with open(ignore_file, 'r') as file: Review comment: Hello. This code looks very similar to: https://github.com/apache/airflow/blob/master/airflow/utils/file.py#L125 I think you can make a generator that will take the starting path and yield for each file to load. Best regards, Kamil This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org