Imran Rashid created SPARK-25344:
------------------------------------

             Summary: Break large tests.py files into smaller files
                 Key: SPARK-25344
                 URL: https://issues.apache.org/jira/browse/SPARK-25344
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 2.4.0
            Reporter: Imran Rashid


We've got a ton of tests in one humongous tests.py file, rather than having 
them broken out into smaller files.

Having one huge file isn't great for code organization, and it also makes the 
test parallelization in run-tests.py less effective, since a parallel run can 
finish no sooner than its slowest file.  On my laptop, tests.py takes 150s, and 
the next longest test file takes only 20s.  There are similarly large files in 
other pyspark modules, e.g. sql/tests.py, ml/tests.py, mllib/tests.py, 
streaming/tests.py.
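
To illustrate the parallelization point, here is a rough sketch of how one 
long file bounds the wall-clock time of a parallel run. It is not how 
run-tests.py actually schedules work: the greedy scheduler, the worker count of 
4, and every duration other than the 150s and 20s quoted above are assumptions 
made up for the example.

```python
import heapq


def makespan(durations, workers):
    """Wall-clock time of a greedy schedule: each file, longest first,
    goes to whichever worker currently has the least work."""
    loads = [0] * workers
    heapq.heapify(loads)
    for d in sorted(durations, reverse=True):
        # Pop the least-loaded worker and give it the next file.
        heapq.heappush(loads, heapq.heappop(loads) + d)
    return max(loads)


# 150s and 20s come from the report; the other durations are hypothetical.
one_big_file = [150, 20, 15, 10, 5]

# Hypothetical split of the 150s file into six 25s files.
split_files = [25] * 6 + [20, 15, 10, 5]

print(makespan(one_big_file, workers=4))  # 150: bounded by the big file
print(makespan(split_files, workers=4))   # 50: work spreads across workers
```

Under these assumptions the total amount of test time is the same (200s) in 
both cases; only the granularity of the files changes.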

It seems that, at least for some of these files, the tests are already broken 
into independent test classes, so it shouldn't be too hard to just move them 
into their own files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

