Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22631#discussion_r223202913 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala --- @@ -110,7 +112,7 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper { } test("cast string to timestamp") { - for (tz <- ALL_TIMEZONES) { + for (tz <- Random.shuffle(ALL_TIMEZONES).take(50)) { --- End diff -- @srowen I think the point here is to test that this works with different timezones. In `DateExpressionsSuite`, for instance, we test only 3-4 timezones. I don't think it makes sense to test every of the 650 possible timezones: if it works with many of them, then it means that the code is generic and respects timezones. We can also define a fixed subset of timezones, but IMHO taking randomly some of them provides the additional safety that if there is a specific single timezone creating problem, we are able to identify it on several subsequent runs. We have many place where we generate data randomly in the test, so we already have randomness in the tests. I think the rationale behind them is the same: if the function works with some different data, then it generalize properly.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org