This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c0297de  [MINOR][PYSPARK][SQL][DOC] Fix rowsBetween doc in Window
c0297de is described below

commit c0297dedd829a92cca920ab8983dab399f8f32d5
Author: Liang-Chi Hsieh <vii...@gmail.com>
AuthorDate: Fri Jun 14 09:56:37 2019 +0900

    [MINOR][PYSPARK][SQL][DOC] Fix rowsBetween doc in Window

    ## What changes were proposed in this pull request?

    I suspect that the doc of the `rowsBetween` methods in Scala and PySpark is wrong, because:

    ```scala
    scala> val df = Seq((1, "a"), (2, "a"), (3, "a"), (4, "a"), (5, "a"), (6, "a")).toDF("id", "category")
    df: org.apache.spark.sql.DataFrame = [id: int, category: string]

    scala> val byCategoryOrderedById = Window.partitionBy('category).orderBy('id).rowsBetween(-1, 2)
    byCategoryOrderedById: org.apache.spark.sql.expressions.WindowSpec = org.apache.spark.sql.expressions.WindowSpec7f04de97

    scala> df.withColumn("sum", sum('id) over byCategoryOrderedById).show()
    +---+--------+---+
    | id|category|sum|
    +---+--------+---+
    |  1|       a|  6|    # sum from index 0 to (0 + 2): 1 + 2 + 3 = 6
    |  2|       a| 10|    # sum from index (1 - 1) to (1 + 2): 1 + 2 + 3 + 4 = 10
    |  3|       a| 14|
    |  4|       a| 18|
    |  5|       a| 15|
    |  6|       a| 11|
    +---+--------+---+
    ```

    So the frame (-1, 2) for the row with index 5, as described in the doc, should range from index 4 to index 7.

    ## How was this patch tested?

    N/A, just a doc change.

    Closes #24864 from viirya/window-spec-doc.

    Authored-by: Liang-Chi Hsieh <vii...@gmail.com>
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
---
 python/pyspark/sql/window.py                                          | 2 +-
 sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/python/pyspark/sql/window.py b/python/pyspark/sql/window.py
index 65c3ff5..9e02758a 100644
--- a/python/pyspark/sql/window.py
+++ b/python/pyspark/sql/window.py
@@ -101,7 +101,7 @@ class Window(object):
         An offset indicates the number of rows above or below the current row, the frame for the
         current row starts or ends. For instance, given a row based sliding frame with a lower bound
         offset of -1 and a upper bound offset of +2. The frame for row with index 5 would range from
-        index 4 to index 6.
+        index 4 to index 7.
 
         >>> from pyspark.sql import Window
         >>> from pyspark.sql import functions as func
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala b/sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala
index 9a4ad44..cd1c198 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala
@@ -129,7 +129,7 @@ object Window {
    * An offset indicates the number of rows above or below the current row, the frame for the
    * current row starts or ends. For instance, given a row based sliding frame with a lower bound
    * offset of -1 and a upper bound offset of +2. The frame for row with index 5 would range from
-   * index 4 to index 6.
+   * index 4 to index 7.
    *
    * {{{
    *   import org.apache.spark.sql.expressions.Window

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
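
The frame arithmetic behind the corrected doc line can be checked without a Spark session. The following is a minimal pure-Python sketch, not Spark's implementation: the helper names `frame_bounds` and `row_frame_sums` are hypothetical, and it assumes a single partition already ordered by `id`, with frames clipped at the partition edges as the `show()` output in the commit message suggests.

```python
def frame_bounds(index, lower, upper):
    """Unclipped row-frame bounds for a row, as the doc sentence describes them."""
    return index + lower, index + upper


def row_frame_sums(values, lower, upper):
    """Sum each row's frame [i + lower, i + upper], clipped to the partition.

    Assumption: this mimics sum(...) over rowsBetween(lower, upper) for one
    ordered partition. An empty frame sums to 0 here, which is a simplification.
    """
    n = len(values)
    sums = []
    for i in range(n):
        start = max(0, i + lower)    # clip at the first row of the partition
        end = min(n - 1, i + upper)  # clip at the last row of the partition
        sums.append(sum(values[start:end + 1]))
    return sums


ids = [1, 2, 3, 4, 5, 6]
sums = row_frame_sums(ids, -1, 2)
print(frame_bounds(5, -1, 2))  # (4, 7)
print(sums)                    # [6, 10, 14, 18, 15, 11]
```

The sums agree with the `sum` column in the commit message, and the bounds for index 5 come out as 4 to 7, which is the value the patch puts into both docstrings.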