Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19816

@felixcheung, I just tried lowering this by default and running the tests. It seems some tests fail. For example, if we lower `spark.sql.shuffle.partitions` to 5, these fail additionally:

```
Failed -------------------------------------------------------------------------
1. Failure: spark.als (@test_mllib_recommendation.R#36) ------------------------
predictions$prediction not equal to c(-0.1380762, 2.6258414, -1.5018409).
3/3 mismatches (average diff: 2.75)
[1]  2.626 - -0.138 ==  2.76
[2] -1.502 -  2.626 == -4.13
[3] -0.138 - -1.502 ==  1.36

2. Failure: pivot GroupedData column (@test_sparkSQL.R#1921) -------------------
`sum1` not equal to `correct_answer`.
Component “year”: Mean relative difference: 0.0004961548
Component “Python”: Mean relative difference: 0.0952381
Component “R”: Mean relative difference: 0.5454545

3. Failure: pivot GroupedData column (@test_sparkSQL.R#1922) -------------------
`sum2` not equal to `correct_answer`.
Component “year”: Mean relative difference: 0.0004961548
Component “Python”: Mean relative difference: 0.0952381
Component “R”: Mean relative difference: 0.5454545

4. Failure: pivot GroupedData column (@test_sparkSQL.R#1923) -------------------
`sum3` not equal to `correct_answer`.
Component “year”: Mean relative difference: 0.0004961548
Component “Python”: Mean relative difference: 0.0952381
Component “R”: Mean relative difference: 0.5454545

5. Failure: pivot GroupedData column (@test_sparkSQL.R#1924) -------------------
`sum4` not equal to correct_answer[, c("year", "R")].
Component “year”: Mean relative difference: 0.0004961548
Component “R”: Mean relative difference: 0.5454545
```

Cases combining a shuffle with an R worker don't look very frequent (to be clear, a shuffle without R would be fine, IIUC). I don't have a strong opinion on lowering: if we don't lower it, some tests in the future could hit this problem again; if we do lower it, the required change looks rather large for a case that might not come up often.
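As a side note on why changing the partition count can perturb aggregate results at all: the grouping of rows into shuffle partitions determines the order in which partial results are combined, and floating-point addition is not associative, so the same values can sum to slightly different totals. (The much larger ALS mismatches are more likely the fitted model itself being partition-sensitive; this sketch only illustrates the general order-sensitivity, with no Spark involved.)

```python
# Illustration only, not Spark code: floating-point addition is not
# associative, so the grouping imposed by the number of partitions can
# change the exact result of summing the same values.
values = [0.1] * 10

# "One partition": a single left-to-right sequential sum.
single = sum(values)

# "Two partitions": sum each half separately, then combine the partials.
partitioned = sum(values[:5]) + sum(values[5:])

print(single)                 # 0.9999999999999999
print(partitioned)            # 1.0
print(single == partitioned)  # False
```

This is why numeric test expectations that were tuned under one partition count can start failing under another, even when both results are "correct" up to floating-point error.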