[ https://issues.apache.org/jira/browse/SPARK-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15651525#comment-15651525 ]
Sean Owen commented on SPARK-9487:
----------------------------------

I'm not against just committing the Scala (or Java) changes separately, though that tends to make some tests less consistent while making others more consistent. Right now the WIP PR doesn't make all of the Scala changes yet, right? Are there similar issues there? It wouldn't hurt to figure out these test failures, if that's all there is in Python, and get them all done at once. I think some of it is just expected variation due to different distributions of the data, but it bears some reading of the tests to see if that makes sense.

> Use the same num. worker threads in Scala/Python unit tests
> -----------------------------------------------------------
>
>                 Key: SPARK-9487
>                 URL: https://issues.apache.org/jira/browse/SPARK-9487
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, Spark Core, SQL, Tests
>    Affects Versions: 1.5.0
>            Reporter: Xiangrui Meng
>              Labels: starter
>         Attachments: ContextCleanerSuiteResults, HeartbeatReceiverSuiteResults
>
>
> In Python we use `local[4]` for unit tests, while in Scala/Java we use
> `local[2]` and `local` for some unit tests in SQL, MLlib, and other
> components. If the operation depends on partition IDs, e.g., a random number
> generator, this will lead to different results in Python and Scala/Java. It
> would be nice to use the same number in all unit tests.
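
The partition-ID dependence described in the issue can be illustrated without Spark at all. The sketch below is a plain-Python simulation, not Spark code: the helper names (`split_into_partitions`, `sample_per_partition`) are hypothetical, and seeding each partition with `seed + partition_id` is an assumption made for illustration rather than a statement of how any particular Spark operator derives its seeds. It only shows why the same seed can give different samples when the data is cut into 2 partitions versus 4.

```python
import random


def split_into_partitions(data, num_partitions):
    """Cut data into contiguous, near-equal chunks (illustrative only)."""
    n = len(data)
    return [
        data[(i * n) // num_partitions:((i + 1) * n) // num_partitions]
        for i in range(num_partitions)
    ]


def sample_per_partition(data, num_partitions, fraction, seed):
    """Bernoulli-sample each partition with its own RNG.

    Assumption: each partition's RNG is seeded with seed + partition_id.
    """
    sampled = []
    for pid, part in enumerate(split_into_partitions(data, num_partitions)):
        rng = random.Random(seed + pid)
        sampled.extend(x for x in part if rng.random() < fraction)
    return sampled


data = list(range(100))

# Same data, same seed, different partition counts
# (loosely analogous to local[2] versus local[4]).
two_partitions = sample_per_partition(data, 2, 0.5, seed=42)
four_partitions = sample_per_partition(data, 4, 0.5, seed=42)

print("2 partitions:", len(two_partitions), "elements sampled")
print("4 partitions:", len(four_partitions), "elements sampled")
print("identical samples:", two_partitions == four_partitions)
```

Each configuration is reproducible on its own, so a test with hard-coded expected values passes under one partition count and can fail under another, which is the inconsistency the issue asks to remove.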