Burak Yavuz created SPARK-14160:
-----------------------------------

             Summary: Windowing for structured streaming
                 Key: SPARK-14160
                 URL: https://issues.apache.org/jira/browse/SPARK-14160
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
            Reporter: Burak Yavuz

This JIRA tracks the status of event time windowing operations for Continuous queries. The proposed API is as follows.

A window takes 3 parameters:
 1. Time column. This will generally be the event time column of the record, but it should also be possible to use ingestion time through an expression.
 2. Window length.
 3. Slide interval (optional). A new window of the length given in (2) starts at every slide interval. If no slide interval is provided, tumbling windows are generated.

Examples:

Consider the following schema for our data:
{code}
sensor_id, measurement, timestamp
{code}

To generate 30 second tumbling windows and average the measurement values for each sensor, we may write something like the following, using the Dataset/DataFrame API:
{code}
df.window("timestamp", 30.seconds)
  .groupBy("sensor_id")
  .agg(mean("measurement"))
{code}
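
To make the difference between tumbling and sliding windows concrete, here is a minimal sketch in plain Scala (no Spark dependency) of how one record's time could be mapped to the windows that contain it. The names WindowSketch, Window and windowsFor are hypothetical and not part of the proposed API. The sketch also assumes that windows are aligned to time 0 and are half-open, [start, end), which this issue does not specify.

```scala
// Hypothetical helper, not part of Spark: lists the [start, end) windows
// that contain a given event time. All values are in seconds.
object WindowSketch {
  final case class Window(start: Long, end: Long)

  // Assumption: windows are aligned to time 0 and are half-open, [start, end).
  def windowsFor(ts: Long, length: Long, slide: Option[Long] = None): Seq[Window] = {
    // No slide interval means tumbling windows: the slide equals the length.
    val step = slide.getOrElse(length)
    require(length > 0 && step > 0 && step <= length, "need 0 < slide <= length")
    // Start of the latest window that begins at or before ts.
    val lastStart = Math.floorDiv(ts, step) * step
    // Walk back one slide at a time while the window still covers ts.
    Iterator.iterate(lastStart)(_ - step)
      .takeWhile(start => start + length > ts)
      .map(start => Window(start, start + length))
      .toList
      .reverse
  }
}
```

Under these assumptions, windowsFor(45L, 30L) places a record with timestamp 45 in the single tumbling window [30, 60). With a slide interval of 10 seconds, windowsFor(45L, 30L, Some(10L)) places it in three overlapping windows: [20, 50), [30, 60) and [40, 70).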