GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/14284

    [SPARK-16633] [SPARK-16642] Fixes three issues related to window functions

    ## What changes were proposed in this pull request?

    This PR contains three changes.

    First, this PR changes the behavior of lead/lag back to Spark 1.6's behavior, which is described below:
    1. lead/lag respect null input values, which means that if the offset row exists and the input value is null, the result will be null instead of the default value.
    2. If the offset row does not exist, the default value will be used.
    3. OffsetWindowFunction's nullable setting also considers the nullability of its input (because of the first change).

    Second, this PR fixes the evaluation of lead/lag when the input expression is a literal. This fix is a result of the first change. In current master, if a literal is used as the input expression of a lead or lag function, the result will be this literal even if the offset row does not exist.

    Third, this PR makes ResolveWindowFrame not fire if a window function is not resolved.

    ## How was this patch tested?

    New tests in SQLWindowFunctionSuite

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yhuai/spark lead-lag

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14284.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14284

----
commit 78e69018ecaffb9598f4ea2b51900850ee3fb988
Author: Yin Huai <yh...@databricks.com>
Date:   2016-07-20T06:56:50Z

    Add regression tests

commit da5f36f5daa16c4aba605cb939b313c92274b24e
Author: Yin Huai <yh...@databricks.com>
Date:   2016-07-20T07:22:17Z

    Fix SPARK-16642

commit 02ee1915ab2519c876f60162ff00aaa155142eec
Author: Yin Huai <yh...@databricks.com>
Date:   2016-07-20T08:43:04Z

    OffsetWindowFunction's nullable should also check its input's nullable field.

commit 506393b3eec45f7b62615adfe317a230e8de4128
Author: Yin Huai <yh...@databricks.com>
Date:   2016-07-20T08:43:28Z

    Change the behavior of lead/lag back to Spark 1.6's behavior, which is explained below:
    * When the offset row does not exist, default values will be used.
    * lead/lag always respect null input values.

----
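
The lead/lag semantics this PR restores can be illustrated with a small stand-alone model. The sketch below is not Spark code and does not use any Spark API: the helper names `lag` and `lead`, the list standing in for one ordered window partition, and the use of `None` for SQL NULL are all assumptions made for illustration.

```python
from typing import Any, List, Sequence


def lag(values: Sequence[Any], offset: int = 1, default: Any = None) -> List[Any]:
    """Hypothetical model of lag over one ordered partition (not Spark's API)."""
    result = []
    for i in range(len(values)):
        j = i - offset  # index of the offset row
        if 0 <= j < len(values):
            # The offset row exists: return its value, even when it is None (NULL).
            result.append(values[j])
        else:
            # The offset row does not exist: fall back to the default value.
            result.append(default)
    return result


def lead(values: Sequence[Any], offset: int = 1, default: Any = None) -> List[Any]:
    """Hypothetical model of lead: the same rule, looking forward instead of back."""
    return lag(values, -offset, default)


# A null input value is respected; the default is used only past the partition edge.
print(lag([1, None, 3], offset=1, default=0))
print(lead([1, None, 3], offset=1, default=0))

# A literal input modeled as a constant column: the first row has no offset row,
# so it gets the default rather than the literal.
print(lag(["x", "x", "x"], offset=1, default="d"))
```

In this model the default appears only where the offset index falls outside the partition, which is the distinction the first two changes in the description rely on.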