Josh Rosen created SPARK-8422:
---------------------------------

             Summary: Introduce a module abstraction inside of dev/run-tests
                 Key: SPARK-8422
                 URL: https://issues.apache.org/jira/browse/SPARK-8422
             Project: Spark
          Issue Type: Sub-task
          Components: Build, Project Infra
            Reporter: Josh Rosen
            Assignee: Josh Rosen


At a high level, we have Spark modules / components which

1. are affected by file changes (e.g. a module is associated with a set of 
source files, so changes to those files change the module),
2. contain a set of tests to run, triggered via Maven, SBT, or Python / R 
scripts, and
3. depend on other modules and have dependent modules: if we change core, then 
every downstream test should be run; if we change only MLlib, then we can skip 
the SQL tests but should probably still run the Python MLlib tests.
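
For concreteness, here is a rough sketch in Python of what such a module 
description might look like.  The class name, fields, and example modules below 
are purely illustrative, not a proposed final API:

{code:python}
# Illustrative sketch only: one class captures, in a single place, the three
# properties listed above (which files affect the module, which tests it owns,
# and which modules it depends on).  All names and fields are hypothetical.

class Module(object):
    def __init__(self, name, dependencies, source_file_prefixes,
                 sbt_test_goals=(), python_test_goals=()):
        self.name = name
        self.dependencies = dependencies            # list of Module objects
        self.source_file_prefixes = source_file_prefixes
        self.sbt_test_goals = sbt_test_goals        # e.g. ["mllib/test"]
        self.python_test_goals = python_test_goals  # e.g. ["pyspark.mllib.tests"]

    def contains_file(self, filename):
        """True if a change to this file affects this module."""
        return any(filename.startswith(p) for p in self.source_file_prefixes)


# Example instances; the dependency graph is declared directly in the data:
core = Module("core", [], ["core/"], sbt_test_goals=["core/test"])
sql = Module("sql", [core], ["sql/"], sbt_test_goals=["sql/test"])
mllib = Module("mllib", [core], ["mllib/"], sbt_test_goals=["mllib/test"],
               python_test_goals=["pyspark.mllib.tests"])
{code}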

Right now, the per-module logic is spread across a few different places inside 
of the {{dev/run-tests}} script: we have one function that describes how to 
detect changes for all modules, another function that (implicitly) deals with 
module dependencies, etc.

Instead, I propose that we introduce a class for describing a module, use 
instances of this class to build up a dependency graph, then phrase the "find 
which tests to run" operations in terms of that graph.  I think that this will 
be easier to understand / maintain.
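
As a sketch of how the "find which tests to run" step could be phrased as a 
graph traversal (again illustrative only; the module names and dependency edges 
below are placeholders, not the real Spark graph):

{code:python}
# Illustrative sketch only: once the dependency graph is declared up front,
# deciding which tests to run becomes a simple traversal over that graph.

# Direct dependencies: module -> modules it depends on.
DEPENDENCIES = {
    "core": [],
    "sql": ["core"],
    "mllib": ["core"],
    "pyspark-mllib": ["mllib"],
}


def modules_to_test(changed_modules):
    """Return the changed modules plus everything downstream of them."""
    to_test = set(changed_modules)
    added = True
    while added:  # keep expanding until the downstream set stops growing
        added = False
        for module, deps in DEPENDENCIES.items():
            if module not in to_test and any(d in to_test for d in deps):
                to_test.add(module)
                added = True
    return to_test


# Changing core triggers every downstream module's tests; changing only MLlib
# skips the SQL tests but still picks up the Python MLlib tests.
print(sorted(modules_to_test({"core"})))   # ['core', 'mllib', 'pyspark-mllib', 'sql']
print(sorted(modules_to_test({"mllib"})))  # ['mllib', 'pyspark-mllib']
{code}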


