[GitHub] [airflow] mik-laj commented on a change in pull request #9531: Support .airflowignore for plugins

2020-07-01 Thread GitBox


mik-laj commented on a change in pull request #9531:
URL: https://github.com/apache/airflow/pull/9531#discussion_r448633985



##
File path: airflow/plugins_manager.py
##
@@ -164,34 +165,31 @@ def load_plugins_from_plugin_directory():
 global plugins  # pylint: disable=global-statement
 log.debug("Loading plugins from directory: %s", settings.PLUGINS_FOLDER)
 
-# Crawl through the plugins folder to find AirflowPlugin derivatives
-for root, _, files in os.walk(settings.PLUGINS_FOLDER, followlinks=True):  
# noqa # pylint: disable=too-many-nested-blocks
-for f in files:
-filepath = os.path.join(root, f)
-try:
-if not os.path.isfile(filepath):
-continue
-mod_name, file_ext = os.path.splitext(
-os.path.split(filepath)[-1])
-if file_ext != '.py':
-continue
-
-log.debug('Importing plugin module %s', filepath)
-
-loader = importlib.machinery.SourceFileLoader(mod_name, 
filepath)
-spec = importlib.util.spec_from_loader(mod_name, loader)
-mod = importlib.util.module_from_spec(spec)
-sys.modules[spec.name] = mod
-loader.exec_module(mod)
-for mod_attr_value in list(mod.__dict__.values()):
-if is_valid_plugin(mod_attr_value):
-plugin_instance = mod_attr_value()
-plugins.append(plugin_instance)
-except Exception as e:  # pylint: disable=broad-except
-log.exception(e)
-path = filepath or str(f)
-log.error('Failed to import plugin %s', path)
-import_errors[path] = str(e)
+for file_path in find_path_from_directory(
+settings.PLUGINS_FOLDER, ".airflowignore"):
+
+if not os.path.isfile(file_path):
+continue
+mod_name, file_ext = os.path.splitext(os.path.split(file_path)[-1])
+if file_ext != '.py':
+continue

Review comment:
   ```suggestion
   ```
   Should this not be part of the find_path_from_directory function? I would 
like to see that there is no code in the plugins_manager.py file that is 
responsible for the file selection. Plugins_manager should only load the 
module. WDYT?
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #9531: Support .airflowignore for plugins : .pluginingore

2020-06-27 Thread GitBox


mik-laj commented on a change in pull request #9531:
URL: https://github.com/apache/airflow/pull/9531#discussion_r446609259



##
File path: airflow/plugins_manager.py
##
@@ -164,8 +163,28 @@ def load_plugins_from_plugin_directory():
 global plugins  # pylint: disable=global-statement
 log.debug("Loading plugins from directory: %s", settings.PLUGINS_FOLDER)
 
+patterns_by_dir: Dict[str, List[Pattern[str]]] = {}
+
 # Crawl through the plugins folder to find AirflowPlugin derivatives
-for root, _, files in os.walk(settings.PLUGINS_FOLDER, followlinks=True):  
# noqa # pylint: disable=too-many-nested-blocks
+for root, dirs, files in os.walk(settings.PLUGINS_FOLDER, 
followlinks=True):  # noqa # pylint: disable=too-many-nested-blocks
+
+patterns: List[Pattern[str]] = patterns_by_dir.get(root, [])
+ignore_file = os.path.join(root, '.pluginignore')
+
+if os.path.isfile(ignore_file):
+with open(ignore_file, 'r') as file:

Review comment:
   I like this code now. :-)
   
   Moving this method to airflow.utils.file is a good idea. This will allow us 
to delete the duplicate code.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #9531: Support .airflowignore for plugins : .pluginingore

2020-06-27 Thread GitBox


mik-laj commented on a change in pull request #9531:
URL: https://github.com/apache/airflow/pull/9531#discussion_r446609058



##
File path: docs/concepts.rst
##
@@ -1465,3 +1465,13 @@ would not be scanned by Airflow at all. This improves 
efficiency of DAG finding)
 The scope of a ``.airflowignore`` file is the directory it is in plus all its 
subfolders.
 You can also prepare ``.airflowignore`` file for a subfolder in ``DAG_FOLDER`` 
and it
 would only be applicable for that subfolder.
+
+
+.pluginignore
+''
+
+A ``.pluginignore`` file specifies the directories or files in 
``PLUGINS_FOLDER``

Review comment:
   Why the new file name? In my opinion, you can use `.airflowignore` and 
it will be easier to use.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #9531: Support .airflowignore for plugins : .pluginingore

2020-06-26 Thread GitBox


mik-laj commented on a change in pull request #9531:
URL: https://github.com/apache/airflow/pull/9531#discussion_r446210550



##
File path: airflow/plugins_manager.py
##
@@ -164,8 +163,28 @@ def load_plugins_from_plugin_directory():
 global plugins  # pylint: disable=global-statement
 log.debug("Loading plugins from directory: %s", settings.PLUGINS_FOLDER)
 
+patterns_by_dir: Dict[str, List[Pattern[str]]] = {}
+
 # Crawl through the plugins folder to find AirflowPlugin derivatives
-for root, _, files in os.walk(settings.PLUGINS_FOLDER, followlinks=True):  
# noqa # pylint: disable=too-many-nested-blocks
+for root, dirs, files in os.walk(settings.PLUGINS_FOLDER, 
followlinks=True):  # noqa # pylint: disable=too-many-nested-blocks
+
+patterns: List[Pattern[str]] = patterns_by_dir.get(root, [])
+ignore_file = os.path.join(root, '.pluginignore')
+
+if os.path.isfile(ignore_file):
+with open(ignore_file, 'r') as file:

Review comment:
   Hello.
   This code looks very similar to:
   https://github.com/apache/airflow/blob/master/airflow/utils/file.py#L125
   I think you can make a generator that will take the starting path and yield 
for each file to load. 
   Best regards,
   Kamil





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org