Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22305#discussion_r232485612
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala ---
    @@ -27,17 +27,62 @@ import org.apache.spark.api.python.{ChainedPythonFunctions, PythonEvalType}
     import org.apache.spark.rdd.RDD
     import org.apache.spark.sql.catalyst.InternalRow
     import org.apache.spark.sql.catalyst.expressions._
    -import org.apache.spark.sql.catalyst.plans.physical._
    -import org.apache.spark.sql.execution.{GroupedIterator, SparkPlan, UnaryExecNode}
    +import org.apache.spark.sql.catalyst.plans.physical.{AllTuples, ClusteredDistribution, Distribution, Partitioning}
    +import org.apache.spark.sql.execution.{ExternalAppendOnlyUnsafeRowArray, SparkPlan}
     import org.apache.spark.sql.execution.arrow.ArrowUtils
    -import org.apache.spark.sql.types.{DataType, StructField, StructType}
    +import org.apache.spark.sql.execution.window._
    +import org.apache.spark.sql.types._
     import org.apache.spark.util.Utils
     
    +/**
    + * This class calculates and outputs windowed aggregates over the rows in a single partition.
    + *
    + * It is very similar to [[WindowExec]] and has similar logic. The main difference is that this
    + * node doesn't not compute any window aggregation values. Instead, it computes the lower and
    + * upper bound for each window (i.e. window bounds) and pass the data and indices to python work
    + * to do the actual window aggregation.
    + *
    + * It currently materialize all data associated with the same partition key and pass them to
    --- End diff --
    
    tiny typo: `materialize` -> `materializes` and `pass` -> `passes`.
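    
    For readers following the thread, here is a minimal, illustrative sketch of the kind of user-facing query the quoted doc comment describes: a grouped-aggregate pandas UDF evaluated over a window, where all rows sharing the partition key are materialized and sent to the Python worker. The DataFrame and column names (`df`, `id`, `v`) are made up for the example, not taken from the PR.
    
    ```python
    # Illustrative sketch only: a grouped-aggregate pandas UDF applied over an
    # unbounded window. Rows with the same "id" form one window partition, whose
    # data is shipped to the Python worker for aggregation, as the class comment
    # above describes.
    from pyspark.sql import SparkSession, Window
    from pyspark.sql.functions import pandas_udf, PandasUDFType
    
    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame(
        [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)], ("id", "v"))
    
    @pandas_udf("double", PandasUDFType.GROUPED_AGG)
    def mean_udf(v):
        # Receives the window's values for column "v" as a pandas Series.
        return v.mean()
    
    # Unbounded window over each partition key.
    w = Window.partitionBy("id").rowsBetween(
        Window.unboundedPreceding, Window.unboundedFollowing)
    
    df.withColumn("mean_v", mean_udf(df["v"]).over(w)).show()
    ```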


---
