cgivre opened a new pull request #1680: Drill-7077: Add Function to Facilitate 
Time Series Analysis
URL: https://github.com/apache/drill/pull/1680
 
 
   When analyzing time-based data, you will often have to aggregate by time 
grains. While some time grains are easy to calculate, others, such as 
quarter, can be quite difficult. These functions enable a user to quickly and 
easily aggregate data by various units of time. Usage is as follows:
   ```
   SELECT <fields>
   FROM <data>
   GROUP BY nearestDate(<timestamp_column>, <time increment>)
   ```
   Suppose a user wants to count the number of hits on a web server per 
15-minute interval. The query might look like this:
   
   ```
   SELECT nearestDate(`eventDate`, '15MINUTE' ) AS eventDate,
   COUNT(*) AS hitCount
   FROM dfs.`log.httpd`
   GROUP BY nearestDate(`eventDate`, '15MINUTE')
   ```
   The function currently supports the following time units:
    * YEAR
    * QUARTER
    * MONTH
    * WEEK_SUNDAY
    * WEEK_MONDAY
    * DAY
    * HOUR
    * HALF_HOUR / 30MIN
    * QUARTER_HOUR / 15MIN
    * MINUTE
    * 30SECOND
    * 15SECOND
    * SECOND
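
   To make the bucketing concrete, here is a minimal Python sketch of what 
each time unit could mean. It is not the code in this pull request: it assumes 
that `nearestDate` truncates a timestamp down to the start of the enclosing 
interval, and the name `nearest_date` is hypothetical.

   ```python
   from datetime import datetime, timedelta

   # Units that divide the hour or the minute into fixed steps.
   MINUTE_STEPS = {"HALF_HOUR": 30, "30MIN": 30,
                   "QUARTER_HOUR": 15, "15MIN": 15, "MINUTE": 1}
   SECOND_STEPS = {"30SECOND": 30, "15SECOND": 15, "SECOND": 1}


   def nearest_date(ts: datetime, unit: str) -> datetime:
       """Floor ts to the start of its interval (assumed semantics)."""
       day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
       if unit == "YEAR":
           return day.replace(month=1, day=1)
       if unit == "QUARTER":
           # Quarters start in months 1, 4, 7 and 10.
           first_month = 3 * ((ts.month - 1) // 3) + 1
           return day.replace(month=first_month, day=1)
       if unit == "MONTH":
           return day.replace(day=1)
       if unit == "WEEK_SUNDAY":
           # weekday() is 0 for Monday and 6 for Sunday.
           return day - timedelta(days=(ts.weekday() + 1) % 7)
       if unit == "WEEK_MONDAY":
           return day - timedelta(days=ts.weekday())
       if unit == "DAY":
           return day
       if unit == "HOUR":
           return ts.replace(minute=0, second=0, microsecond=0)
       if unit in MINUTE_STEPS:
           step = MINUTE_STEPS[unit]
           return ts.replace(minute=ts.minute - ts.minute % step,
                             second=0, microsecond=0)
       if unit in SECOND_STEPS:
           step = SECOND_STEPS[unit]
           return ts.replace(second=ts.second - ts.second % step,
                             microsecond=0)
       raise ValueError(f"unsupported time unit: {unit}")
   ```

   Under that assumption, a hit logged at 14:47:52 would be counted in the 
14:45:00 bucket for QUARTER_HOUR and in the 14:00:00 bucket for HOUR.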
   
   There are two versions of the function: one accepts a date and an 
interval, and the other accepts a string, a format string, and an interval.
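
   The string version can be thought of as a parse step followed by the same 
truncation. The Python sketch below illustrates that idea only. The name 
`nearest_date_from_string` is hypothetical, it covers just three units, and 
it uses Python `strptime` format codes, which differ from whatever 
format-string syntax the Drill function itself accepts.

   ```python
   from datetime import datetime

   # Fields to zero out for each unit (only a small subset, for illustration).
   RESET_FIELDS = {
       "DAY": dict(hour=0, minute=0, second=0, microsecond=0),
       "HOUR": dict(minute=0, second=0, microsecond=0),
       "MINUTE": dict(second=0, microsecond=0),
   }


   def nearest_date_from_string(value: str, fmt: str, unit: str) -> datetime:
       """Parse value with fmt, then floor it to the given unit (assumed)."""
       if unit not in RESET_FIELDS:
           raise ValueError(f"unsupported time unit: {unit}")
       parsed = datetime.strptime(value, fmt)
       return parsed.replace(**RESET_FIELDS[unit])


   bucket = nearest_date_from_string(
       "2019-03-05 14:47:52", "%Y-%m-%d %H:%M:%S", "HOUR")
   print(bucket)
   ```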

