Vinod KC created SPARK-54507:
--------------------------------
Summary: Add time_bucket function for TIME type bucketing
Key: SPARK-54507
URL: https://issues.apache.org/jira/browse/SPARK-54507
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.1.0
Reporter: Vinod KC
Add a new `time_bucket()` SQL function that buckets TIME values into fixed-width intervals, returning the start time of each bucket. This enables histogram generation and time-of-day pattern analysis for TIME columns.
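Illustrative scalar behavior (a sketch of the proposal only; the exact signature and edge-case semantics will be settled in the implementation):
{code:sql}
-- Each TIME value maps to the start of its bucket.
SELECT time_bucket(INTERVAL '15' MINUTE, TIME'09:37:45');  -- expected: 09:30:00
SELECT time_bucket(INTERVAL '1' HOUR, TIME'14:42:00');     -- expected: 14:00:00
{code}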
Currently, Spark SQL users cannot easily group TIME values by intervals (e.g., 15-minute or 1-hour buckets) without complex manual calculations.
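For comparison, the closest workaround today is manual arithmetic on the time fields, and it only yields a numeric bucket index rather than a TIME-typed bucket start. A minimal sketch (assuming a hypothetical table events(id, event_time TIME) and that hour()/minute() accept TIME values):
{code:sql}
-- Manual 30-minute bucketing: derive minutes since midnight, then
-- floor-divide by the bucket width. The result is a bucket index
-- (0, 1, 2, ...), not the bucket's start TIME.
SELECT (HOUR(event_time) * 60 + MINUTE(event_time)) DIV 30 AS bucket_index,
       COUNT(*) AS cnt
FROM events
GROUP BY bucket_index
ORDER BY bucket_index;
{code}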
Common use cases that require this function:
- Retail: Analyze customer traffic by 30-minute slots to optimize staffing
- Healthcare: Group appointments by 15-minute intervals for scheduling optimization
- Manufacturing: Aggregate sensor readings by hourly buckets to detect production patterns
- DevOps: Bucket system events by 5-minute intervals for performance monitoring
Industry precedent:
- SQL Server 2022: `DATE_BUCKET()` supports TIME type bucketing
- TimescaleDB: `time_bucket()` is one of its most popular functions
Example:
{code:sql}
WITH time_bucket_data AS (
  SELECT * FROM VALUES
    (1, TIME'09:15:30.123456'),
    (2, TIME'09:37:45.654321'),
    (3, TIME'10:05:12.987654'),
    (4, TIME'14:42:00.111222'),
    (5, TIME'14:55:30.333444')
  AS t(id, event_time)
)
SELECT time_bucket(INTERVAL '30' MINUTE, event_time) AS time_slot,
       COUNT(*) AS cnt
FROM time_bucket_data
GROUP BY time_slot
ORDER BY time_slot;
{code}
Output:
{code}
+---------+---+
|time_slot|cnt|
+---------+---+
| 09:00:00|  1|
| 09:30:00|  1|
| 10:00:00|  1|
| 14:30:00|  2|
+---------+---+
{code}