This is an automated email from the ASF dual-hosted git repository.

viirya pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new 9bfc5b1  [MINOR][SS][DOCS] Point to correct examples of Arbitrary Stateful Operations
9bfc5b1 is described below

commit 9bfc5b14c9b0fbb50dd537a509f2e094e1c5779e
Author: Liang-Chi Hsieh <vii...@gmail.com>
AuthorDate: Thu Oct 28 09:22:42 2021 -0700

    [MINOR][SS][DOCS] Point to correct examples of Arbitrary Stateful Operations

    ### What changes were proposed in this pull request?

    This fixes incorrect example links in Structured Streaming Programming Guide.

    ### Why are the changes needed?

    StructuredSessionization.scala and JavaStructuredSessionization.java are now using session window expression, not `flatMapGroupsWithState`. The section talks about arbitrary stateful operations and should point to another examples.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Doc change only.

    Closes #34408 from viirya/fix-ss-doc.

    Authored-by: Liang-Chi Hsieh <vii...@gmail.com>
    Signed-off-by: Liang-Chi Hsieh <vii...@gmail.com>
    (cherry picked from commit 5b2bbcef6854c495c32b37e383dd5f1f6ce23dd4)
    Signed-off-by: Liang-Chi Hsieh <vii...@gmail.com>
---
 docs/structured-streaming-programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index 18dfbec..4642d44 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -1806,7 +1806,7 @@ However, as a side effect, data from the slower streams will be aggressively dro
 this configuration judiciously.

 ### Arbitrary Stateful Operations
-Many usecases require more advanced stateful operations than aggregations. For example, in many usecases, you have to track sessions from data streams of events. For doing such sessionization, you will have to save arbitrary types of data as state, and perform arbitrary operations on the state using the data stream events in every trigger. Since Spark 2.2, this can be done using the operation `mapGroupsWithState` and the more powerful operation `flatMapGroupsWithState`. Both operations a [...]
+Many usecases require more advanced stateful operations than aggregations. For example, in many usecases, you have to track sessions from data streams of events. For doing such sessionization, you will have to save arbitrary types of data as state, and perform arbitrary operations on the state using the data stream events in every trigger. Since Spark 2.2, this can be done using the operation `mapGroupsWithState` and the more powerful operation `flatMapGroupsWithState`. Both operations a [...]

 Though Spark cannot check and force it, the state function should be implemented with respect to the semantics of the output mode. For example, in Update mode Spark doesn't expect that the state function will emit rows which are older than current watermark plus allowed late record delay, whereas in Append mode the state function can emit these rows.