Yikun commented on a change in pull request #32867:
URL: https://github.com/apache/spark/pull/32867#discussion_r652421042



##########
File path: dev/sparktestsupport/modules.py
##########
@@ -16,13 +16,65 @@
 #
 
 from functools import total_ordering
+from importlib import import_module
+import inspect
 import itertools
 import os
+from pkgutil import iter_modules
 import re
+import unittest
+
+from sparktestsupport import SPARK_HOME
+
 
 all_modules = []
 
 
+def _contain_unittests_class(module_name):
+    """
+    Check if the module with specific module_name has classes are derived from unittest.TestCase.
+
+    Such as:
+    pyspark.tests.test_appsubmit, it will return True, because there is SparkSubmitTests which is
+    included under the module of pyspark.tests.test_appsubmit, inherits from unittest.TestCase.
+    ``
+
+    :param module_name: the complete name of module to be checked.
+    :return: True if contains unittest classes otherwise False.
+             An ``ModuleNotFoundError`` will raise if the module is not found
+    """
+    _module = import_module(module_name)

Review comment:
   ```Python
   Traceback (most recent call last):
     File "./dev/run-tests.py", line 32, in <module>
       import sparktestsupport.modules as modules
     File "/home/runner/work/spark/spark/dev/sparktestsupport/modules.py", line 425, in <module>
       pyspark_core = Module(
     File "/home/runner/work/spark/spark/dev/sparktestsupport/modules.py", line 122, in __init__
       discovered_goals = _discover_python_unittests(python_test_paths)
     File "/home/runner/work/spark/spark/dev/sparktestsupport/modules.py", line 73, in _discover_python_unittests
       if _contain_unittests_class(module.name):
     File "/home/runner/work/spark/spark/dev/sparktestsupport/modules.py", line 46, in _contain_unittests_class
       _module = import_module(module_name)
     File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
       return _bootstrap._gcd_import(name[level:], package, level)
   ModuleNotFoundError: No module named 'pyspark'
   ```
   
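   `pyspark` is not importable when `dev/run-tests.py` loads `sparktestsupport/modules.py`, so `import_module(module_name)` fails. The discovery should be changed to be path-based rather than import-based.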
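
   One possible shape for a path-based check, as a rough sketch rather than the change this PR should necessarily adopt: parse each test file with `ast` and look for a class whose base is named `TestCase`, so nothing is imported. The helper name `contains_unittest_class` is hypothetical, and the sketch only sees direct bases; classes that inherit through an intermediate base class would need extra handling.

```python
import ast
from pathlib import Path


def contains_unittest_class(path):
    """Return True if the file at `path` defines a class with a base named TestCase.

    Hypothetical helper (not part of the PR): it parses the source text only,
    so the module and its dependencies (e.g. pyspark) never get imported.
    """
    source = Path(path).read_text(encoding="utf-8")
    tree = ast.parse(source, filename=str(path))
    for node in ast.walk(tree):
        if not isinstance(node, ast.ClassDef):
            continue
        for base in node.bases:
            # `unittest.TestCase` is an ast.Attribute; a bare `TestCase` is an ast.Name.
            if isinstance(base, ast.Attribute):
                name = base.attr
            elif isinstance(base, ast.Name):
                name = base.id
            else:
                name = None
            if name == "TestCase":
                return True
    return False
```

   Because only the file contents are read, this works even when the file has an `import pyspark` line and `pyspark` is not on `sys.path`.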



