This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new f2bcc93  [SPARK-32812][PYTHON][TESTS] Avoid initiating a process during the main process for run-tests.py
f2bcc93 is described below

commit f2bcc9349d86be71dba491b8348ac8d83f0764a8
Author: itholic <haejoon...@naver.com>
AuthorDate: Tue Sep 8 12:22:13 2020 +0900

    [SPARK-32812][PYTHON][TESTS] Avoid initiating a process during the main process for run-tests.py
    
    ### What changes were proposed in this pull request?
    
    In certain environments, the `run-tests.py` script fails to run with the error below:
    
    ```
    Traceback (most recent call last):
     File "<string>", line 1, in <module>
    ...
    
    raise RuntimeError('''
    RuntimeError:
     An attempt has been made to start a new process before the
     current process has finished its bootstrapping phase.
    
    This probably means that you are not using fork to start your
     child processes and you have forgotten to use the proper idiom
     in the main module:
    
    if __name__ == '__main__':
     freeze_support()
     ...
    
    The "freeze_support()" line can be omitted if the program
     is not going to be frozen to produce an executable.
    Traceback (most recent call last):
    ...
     raise EOFError
    EOFError
    
    ```
    
    The reason is that the module-level `Manager().dict()` launches another process while the main process is still being initialized.
    
    It works in most environments for an unknown reason, but it is better to avoid this pattern, as Python's own error message advises.
    
    ### Why are the changes needed?
    
    To prevent the Python test script from failing.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, it fixes a test script.
    
    ### How was this patch tested?
    
    Manually ran the script after applying the fix.
    
    ```
    Running PySpark tests. Output is in /.../python/unit-tests.log
    Will test against the following Python executables: ['/.../python3', 'python3.8']
    Will test the following Python tests: ['pyspark.sql.dataframe']
    /.../python3 python_implementation is CPython
    /.../python3 version is: Python 3.8.5
    python3.8 python_implementation is CPython
    python3.8 version is: Python 3.8.5
    Starting test(/.../python3): pyspark.sql.dataframe
    Starting test(python3.8): pyspark.sql.dataframe
    Finished test(/.../python3): pyspark.sql.dataframe (33s)
    Finished test(python3.8): pyspark.sql.dataframe (34s)
    Tests passed in 34 seconds
    ```
    
    Closes #29666 from itholic/SPARK-32812.
    
    Authored-by: itholic <haejoon...@naver.com>
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
    (cherry picked from commit c8c082ce380b2357623511c6625503fb3f1d65bf)
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
---
 python/run-tests.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/python/run-tests.py b/python/run-tests.py
index c34e48a..9a95c96 100755
--- a/python/run-tests.py
+++ b/python/run-tests.py
@@ -53,7 +53,7 @@ def print_red(text):
     print('\033[31m' + text + '\033[0m')
 
 
-SKIPPED_TESTS = Manager().dict()
+SKIPPED_TESTS = None
 LOG_FILE = os.path.join(SPARK_HOME, "python/unit-tests.log")
 FAILURE_REPORTING_LOCK = Lock()
 LOGGER = logging.getLogger()
@@ -141,6 +141,7 @@ def run_individual_python_test(target_dir, test_name, pyspark_python):
             skipped_counts = len(skipped_tests)
             if skipped_counts > 0:
                 key = (pyspark_python, test_name)
+                assert SKIPPED_TESTS is not None
                 SKIPPED_TESTS[key] = skipped_tests
             per_test_output.close()
         except:
@@ -293,4 +294,5 @@ def main():
 
 
 if __name__ == "__main__":
+    SKIPPED_TESTS = Manager().dict()
     main()


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
