Github user ahmed-mahran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14234#discussion_r71073767
  
    --- Diff: docs/structured-streaming-programming-guide.md ---
    @@ -620,16 +603,14 @@ df.groupBy("type").count()
     ### Window Operations on Event Time
     Aggregations over a sliding event-time window are straightforward with 
Structured Streaming. The key idea is that window-based aggregations are very 
similar to grouped aggregations. In a grouped aggregation, aggregate values 
(e.g. counts) are maintained for each unique value in the user-specified 
grouping column. In the case of window-based aggregations, aggregate values 
are maintained for each window that the event-time of a row falls into. Let's 
understand this with an illustration.
     
    -Imagine our quick example is modified and the stream now contains lines 
along with the time when each line was generated. Instead of running word 
counts, we want to count words within 10 minute windows, updating every 5 
minutes. That is, word counts for words received in the 10 minute windows 
12:00 - 12:10, 12:05 - 12:15, 12:10 - 12:20, etc. Note that 12:00 - 12:10 
means data that arrived after 12:00 but before 12:10. Now, consider a word 
that was received at 12:07. This word should increment the counts 
corresponding to two windows, 12:00 - 12:10 and 12:05 - 12:15. So the counts 
will be indexed by both the grouping key (i.e. the word) and the window 
(which can be calculated from the event-time).
    --- End diff --
    
    Added anchor `[quick example](#quick-example)`
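
The window assignment described in the diff (an event at 12:07 landing in both the 12:00 - 12:10 and 12:05 - 12:15 windows) can be sketched in plain Python. This is a minimal illustration of the bucketing logic only; the helper name `sliding_windows` is hypothetical and is not Spark's API (in Spark this is expressed with the `window` function, e.g. `window($"timestamp", "10 minutes", "5 minutes")`).

```python
from datetime import datetime, timedelta

def sliding_windows(event_time, window_minutes=10, slide_minutes=5):
    """Return every (start, end) window that event_time falls into,
    for windows of length window_minutes sliding every slide_minutes.
    Hypothetical helper illustrating the idea, not Spark's implementation."""
    window = timedelta(minutes=window_minutes)
    slide = timedelta(minutes=slide_minutes)
    # Align to the slide grid: find the latest window start at or
    # before event_time.
    epoch = datetime(1970, 1, 1)
    start = event_time - (event_time - epoch) % slide
    windows = []
    # Step back one slide at a time while the window still covers the event.
    while start + window > event_time:
        windows.append((start, start + window))
        start -= slide
    return sorted(windows)

# A word received at 12:07 increments counts for exactly two windows:
# 12:00 - 12:10 and 12:05 - 12:15.
print(sliding_windows(datetime(2016, 7, 15, 12, 7)))
```

Counts would then be maintained per (word, window) pair, which is why a windowed aggregation behaves like a grouped aggregation with the window as an extra grouping key.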


