Xiao Li created SPARK-13380:
-------------------------------

             Summary: Document Rand(seed) and Randn(seed) Return Indeterministic Results When Data Partitions are not fixed
                 Key: SPARK-13380
                 URL: https://issues.apache.org/jira/browse/SPARK-13380
             Project: Spark
          Issue Type: Documentation
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Xiao Li
            Priority: Minor

The rand and randn functions are commonly used with a seed argument. Intuitively, the results of rand and randn should be deterministic when a seed value is provided.

For example, MS SQL Server also has a rand function. Its documentation describes the seed parameter as follows:

Seed is an integer expression (tinyint, smallint, or int) that gives the seed value. If seed is not specified, the SQL Server Database Engine assigns a seed value at random. For a specified seed value, the result returned is always the same.

Update: the current implementation cannot generate deterministic results when the data partitions are not fixed. This PR documents this limitation in the function descriptions.

@jkbradley hit this issue and provided an example in the following JIRA: https://issues.apache.org/jira/browse/SPARK-13333
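
The effect of partitioning on a seeded generator can be illustrated with a small toy model in plain Python. This is only a sketch, not Spark code: it assumes each partition seeds its own generator with the user seed plus the partition index, which is an assumption about the implementation and is not stated in this issue. The names rand_column, rows, and num_partitions are hypothetical.

```python
import random


def rand_column(rows, num_partitions, seed):
    """Simulate a seeded rand() column over rows split into contiguous partitions.

    Assumption (toy model only): every partition gets its own generator,
    seeded with seed + partition_index.
    """
    size = -(-len(rows) // num_partitions)  # ceiling division: rows per partition
    values = []
    for p in range(num_partitions):
        chunk = rows[p * size:(p + 1) * size]
        rng = random.Random(seed + p)  # per-partition generator
        values.extend(rng.random() for _ in chunk)
    return values


rows = list(range(8))

# Same seed, same partitioning: the generated column is reproducible.
same_a = rand_column(rows, num_partitions=2, seed=42)
same_b = rand_column(rows, num_partitions=2, seed=42)

# Same seed, different partitioning: rows land in different generator streams.
diff = rand_column(rows, num_partitions=4, seed=42)
```

Under this model, same_a and same_b are identical, while diff agrees with them only for the rows that happen to sit at the same position of the same partition. A fixed seed is therefore reproducible only as long as the partitioning of the data is also fixed.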