tdas commented on a change in pull request #33093: URL: https://github.com/apache/spark/pull/33093#discussion_r662282624
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala
##########

@@ -280,6 +280,51 @@ class KeyValueGroupedDataset[K, V] private[sql](
         child = logicalPlan))
   }

+  /**
+   * (Scala-specific)
+   * Applies the given function to each group of data, while maintaining a user-defined per-group
+   * state. The result Dataset will represent the objects returned by the function.
+   * For a static batch Dataset, the function will be invoked once per group. For a streaming
+   * Dataset, the function will be invoked for each group repeatedly in every trigger, and
+   * updates to each group's state will be saved across invocations.
+   * See [[org.apache.spark.sql.streaming.GroupState]] for more details.
+   *
+   * @tparam S The type of the user-defined state. Must be encodable to Spark SQL types.
+   * @tparam U The type of the output objects. Must be encodable to Spark SQL types.
+   * @param func Function to be called on every group.
+   * @param timeoutConf Timeout Conf, see GroupStateTimeout for more details
+   * @param initialState The user provided state that will be initialized when the first batch
+   *                     of data is processed in the streaming query. The user defined function
+   *                     will be called on the state data even if there are no other values in
+   *                     the group. To convert a Dataset ds of type Dataset[(K, S)] to a

Review comment:
```
To convert a Dataset `ds` of type `Dataset[(K, S)]` .... to a `KeyValueGroupedDataset[K, S]`
```
change in other places as well

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
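The Scaladoc under review describes converting a `Dataset[(K, S)]` into the `KeyValueGroupedDataset[K, S]` that is passed as `initialState`. A minimal sketch of that conversion and of the `mapGroupsWithState` overload it feeds, assuming Spark 3.2+ (where this PR's API landed); the dataset contents and the running-count logic are illustrative, not taken from the PR:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}

object InitialStateSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("sketch").getOrCreate()
    import spark.implicits._

    // A Dataset[(K, S)] holding the initial per-key state (hypothetical data)...
    val initialCounts = Seq(("apple", 2L), ("orange", 1L)).toDS()
    // ...converted to a KeyValueGroupedDataset[K, S], as the Scaladoc describes:
    // group by the key component, then keep only the state component as the value.
    val initialState = initialCounts.groupByKey(_._1).mapValues(_._2)

    val events = Seq("apple", "apple", "banana").toDS()

    // The overload documented in the diff: timeout config plus initial state,
    // followed by the per-group state-update function.
    val counts = events
      .groupByKey(identity)
      .mapGroupsWithState(GroupStateTimeout.NoTimeout, initialState) {
        (key: String, values: Iterator[String], state: GroupState[Long]) =>
          // The function also runs for keys present only in the initial state.
          val newCount = state.getOption.getOrElse(0L) + values.size
          state.update(newCount)
          (key, newCount)
      }

    counts.show()
    spark.stop()
  }
}
```

Note that for a batch `Dataset`, per the Scaladoc, the function fires once per group, so "apple" would combine its initial count of 2 with the two incoming events in a single invocation.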
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org