[ 
https://issues.apache.org/jira/browse/SPARK-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277646#comment-14277646
 ] 

Imran Rashid commented on SPARK-4746:
-------------------------------------

I submitted a PR: https://github.com/apache/spark/pull/4048

My choice of which tests to mark as IntegrationTests was somewhat arbitrary, 
but it does have the benefit that (a) all unit tests now run in about 5 mins 
on my laptop and (b) you don't need to do a full {{mvn package}} before 
running the unit tests.

I took a brief look at what was taking up the time in the remaining tests. 
Most of the tests are pretty fast; a few account for most of the time. In 
fact, it looks like we can get the time down by ~50% if we move the slowest 
10 remaining tests to integration tests as well:

ConnectionManagerSuite: - sendMessageReliably timeout 
MapOutputTrackerSuite: - remote fetch exceeds akka frame size 
RDDSuite: - takeSample 
XORShiftRandomSuite: - XORShift generates valid random numbers 
ExternalSorterSuite: - cleanup of intermediate files in sorter 
SparkListenerSuite: - local metrics 
ContextCleanerSuite: - automatically cleanup RDD + shuffle + broadcast 
ExternalSorterSuite: - cleanup of intermediate files in sorter, bypass merge-sort 
ExternalSorterSuite: - sorting without aggregation, with spill 
ExternalSorterSuite: - empty partitions with spilling 
TaskSetManagerSuite: - abort the job if total size of results is too large 

Of course, you have to go a little deeper to get more gains, but you could 
still go a lot further: e.g., if you move the 60 slowest tests, you're down to 
25% of the time. This may be opening the door to an endless debate about where 
we draw the line between a unit test and an integration test. But one nice 
thing about the test-tag based approach is that we could go finer grained if 
we want, and it's pretty easy for a developer to customize which set of tests 
they want to run when developing locally.
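
For anyone unfamiliar with scalatest tags, here is a minimal sketch of the 
idea. It is not the code from the PR: the tag name 
{{org.example.tags.IntegrationTest}}, the suite name and both test bodies are 
made up for illustration, and it assumes ScalaTest 3.1+ (older versions use 
{{org.scalatest.FunSuite}} instead of {{AnyFunSuite}}).

```scala
import org.scalatest.Tag
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical tag; the string is the name the test runner filters on.
object IntegrationTest extends Tag("org.example.tags.IntegrationTest")

// Hypothetical suite mixing one fast test and one slow, tagged test.
class ExampleSuite extends AnyFunSuite {

  // Untagged: runs in every test invocation.
  test("fast unit test") {
    assert(1 + 1 == 2)
  }

  // Tagged: skipped whenever the runner excludes the IntegrationTest tag.
  test("slow end-to-end test", IntegrationTest) {
    Thread.sleep(100) // stand-in for expensive setup
    assert(Seq(3, 1, 2).sorted == Seq(1, 2, 3))
  }
}
```

With sbt, something like 
{{testOnly ExampleSuite -- -l org.example.tags.IntegrationTest}} should skip 
the tagged test ({{-l}} excludes a tag), while {{-n}} with the same tag name 
runs only the tagged tests. Going finer grained would just mean defining more 
tags along the same lines.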

> integration tests should be separated from faster unit tests
> ------------------------------------------------------------
>
>                 Key: SPARK-4746
>                 URL: https://issues.apache.org/jira/browse/SPARK-4746
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Imran Rashid
>            Priority: Trivial
>
> Currently there isn't a good way for a developer to skip the longer 
> integration tests.  This can slow down local development.  See 
> http://apache-spark-developers-list.1001551.n3.nabble.com/Spurious-test-failures-testing-best-practices-td9560.html
> One option is to use scalatest's notion of test tags to tag all integration 
> tests, so they could easily be skipped


