GitHub user juliuszsompolski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21228#discussion_r185792840

    --- Diff: R/pkg/R/functions.R ---
    @@ -3184,6 +3191,7 @@ setMethod("create_map",
     #' collect(select(df2, collect_list(df2$gear)))
     #' collect(select(df2, collect_set(df2$gear)))}
     #' @note collect_list since 2.3.0
    +#' @note the function is non-deterministic because its result depends on order of rows.
    --- End diff --

    For collect_list and collect_set, maybe word it: "the function is non-deterministic, because the order of collected results depends on the order of rows, which may be non-deterministic after a shuffle"
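To illustrate the suggested wording, here is a minimal sketch in plain Python (no Spark involved) of why the order of collected results can vary. It assumes a toy model in which a shuffle only changes the order in which partitions arrive. The helpers `collect_list` and `collect_set` below are hypothetical stand-ins written for this example, not Spark's implementations. In particular, the first-occurrence ordering used for the set is a simplification.

```python
# Toy model: each inner list is one partition of the "gear" column.
# Assumption: a shuffle only changes the order in which partitions arrive.
partitions = [[4, 4], [3, 5], [6, 3]]


def collect_list(arrived):
    """Hypothetical stand-in: concatenate values in arrival order."""
    return [value for part in arrived for value in part]


def collect_set(arrived):
    """Hypothetical stand-in: distinct values, kept in first-seen order."""
    return list(dict.fromkeys(collect_list(arrived)))


# Two runs over the same data with different partition arrival orders.
arrival_a = partitions
arrival_b = list(reversed(partitions))

run_a = collect_list(arrival_a)
run_b = collect_list(arrival_b)
set_a = collect_set(arrival_a)
set_b = collect_set(arrival_b)

print(run_a, run_b)  # same elements, different order
print(set_a, set_b)  # same distinct values, different order
```

In this model the collected elements are the same in both runs and only their order differs, which is the distinction the proposed note tries to make.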