[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-04-09 Thread Bridget Bevens (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813987#comment-16813987
 ] 

Bridget Bevens commented on DRILL-7077:
---

[~cgivre] the info is posted here: 
https://drill.apache.org/docs/date-time-functions-and-arithmetic/#nearestdate 
Let me know if I need to change anything.

Thanks,
Bridget

> Add Function to Facilitate Time Series Analysis
> ---
>
> Key: DRILL-7077
> URL: https://issues.apache.org/jira/browse/DRILL-7077
> Project: Apache Drill
> Issue Type: New Feature
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Major
> Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>
> When analyzing time-based data, you will often have to aggregate by time 
> grains. While some time grains are easy to calculate, others, such as 
> quarter, can be quite difficult. These functions enable a user to quickly and 
> easily aggregate data by various units of time. Usage is as follows:
> {code:java}
> SELECT <fields>
> FROM <data>
> GROUP BY nearestDate(<timestamp_column>, <time_unit>){code}
> So let's say that a user wanted to count the number of hits on a web server 
> per 15-minute interval; the query might look like this:
> {code:java}
> SELECT nearestDate(`eventDate`, 'QUARTER_HOUR') AS eventDate,
> COUNT(*) AS hitCount
> FROM dfs.`log.httpd`
> GROUP BY nearestDate(`eventDate`, 'QUARTER_HOUR'){code}
> Currently supports the following time units:
>  * YEAR
>  * QUARTER
>  * MONTH
>  * WEEK_SUNDAY
>  * WEEK_MONDAY
>  * DAY
>  * HOUR
>  * HALF_HOUR / 30MIN
>  * QUARTER_HOUR / 15MIN
>  * MINUTE
>  * 30SECOND
>  * 15SECOND
>  * SECOND
>  
>  
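As a rough illustration of what grouping by a 15-minute grain means, the sketch below floors each timestamp to the start of its quarter hour and counts events per bucket. It is plain Java and does not call Drill; the class and method names (QuarterHourBuckets, floorToQuarterHour, countHits) are made up for this example.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Drill: mimics "GROUP BY quarter hour" in memory.
class QuarterHourBuckets {

  // Floors a timestamp to the start of its 15-minute interval (:00, :15, :30 or :45).
  static LocalDateTime floorToQuarterHour(LocalDateTime t) {
    int flooredMinute = (t.getMinute() / 15) * 15;
    return t.truncatedTo(ChronoUnit.HOURS).plusMinutes(flooredMinute);
  }

  // Counts events per 15-minute bucket, with buckets sorted by time.
  static Map<LocalDateTime, Long> countHits(List<LocalDateTime> events) {
    Map<LocalDateTime, Long> counts = new TreeMap<>();
    for (LocalDateTime event : events) {
      counts.merge(floorToQuarterHour(event), 1L, Long::sum);
    }
    return counts;
  }
}
```

Passing a list of event timestamps to countHits gives one entry per quarter hour, which is the same shape of result the SQL query above is meant to produce.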



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-04-09 Thread Bridget Bevens (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813895#comment-16813895
 ] 

Bridget Bevens commented on DRILL-7077:
---

Hi [~cgivre],

I'm trying this function and may be doing something wrong, but 15SECOND and 
30SECOND are not working for me:

select nearestdate(CAST(COLUMNS[2] as timestamp), '30SECOND') as nearest_second from dfs.samples.`/bee/time.csv`;

Error: SYSTEM ERROR: DrillRuntimeException: [30SECOND] is not a valid time statement. Expecting: [YEAR, QUARTER, MONTH, WEEK_SUNDAY, WEEK_MONDAY, DAY, HOUR, HALF_HOUR, QUARTER_HOUR, MINUTE, HALF_MINUTE, QUARTER_MINUTE, SECOND]

Fragment 0:0

Please, refer to logs for more information.

[Error Id: f119202e-ec24-4670-83c2-14b4a7f83ebf on doc23.lab:31010] (state=,code=0)

apache drill> select nearestdate(CAST(COLUMNS[2] as timestamp), 'SECOND') as nearest_second from dfs.samples.`/bee/time.csv`;
+-----------------------+
| nearest_second        |
+-----------------------+
| 2018-01-01 05:10:15.0 |
| 2017-02-02 01:02:03.0 |
| 2003-04-06 07:11:11.0 |
+-----------------------+
3 rows selected (0.191 seconds)

Are 15SECOND and 30SECOND supported?

Thanks,
Bridget
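The error above is consistent with the interval string being matched against enum constant names, as the TimeInterval enum in the pull request does with valueOf. The sketch below reproduces that lookup in plain Java and adds an alias table so that names such as 30SECOND would resolve. The alias table and the IntervalNames class are assumptions for illustration only; nothing in this thread says the patch accepts aliases.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Drill code.
class IntervalNames {

  // Same constant names as listed in the error message.
  enum TimeInterval {
    YEAR, QUARTER, MONTH, WEEK_SUNDAY, WEEK_MONDAY, DAY, HOUR,
    HALF_HOUR, QUARTER_HOUR, MINUTE, HALF_MINUTE, QUARTER_MINUTE, SECOND
  }

  // Assumed alias table mapping the names from the issue description to enum constants.
  static final Map<String, TimeInterval> ALIASES = Map.of(
      "30MIN", TimeInterval.HALF_HOUR,
      "15MIN", TimeInterval.QUARTER_HOUR,
      "30SECOND", TimeInterval.HALF_MINUTE,
      "15SECOND", TimeInterval.QUARTER_MINUTE);

  // Resolves an alias first, then falls back to the exact enum name.
  // Throws IllegalArgumentException for anything else, like valueOf does.
  static TimeInterval parse(String name) {
    String key = name.toUpperCase(Locale.ROOT);
    TimeInterval alias = ALIASES.get(key);
    return alias != null ? alias : TimeInterval.valueOf(key);
  }
}
```

Without such a table, TimeInterval.valueOf("30SECOND") throws IllegalArgumentException, which matches the behavior reported here; HALF_MINUTE and QUARTER_MINUTE are the names the error message lists for those grains.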




[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806446#comment-16806446
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

asfgit commented on pull request #1680: DRILL-7077: Add Function to Facilitate 
Time Series Analysis
URL: https://github.com/apache/drill/pull/1680
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801571#comment-16801571
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

cgivre commented on issue #1680: DRILL-7077: Add Function to Facilitate Time 
Series Analysis
URL: https://github.com/apache/drill/pull/1680#issuecomment-476565199
 
 
   @arina-ielchiieva Commits squashed.  Thanks for the review.
 



[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800832#comment-16800832
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

arina-ielchiieva commented on issue #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#issuecomment-476264054
 
 
   +1, LGTM.
   Please squash the commits.
 



[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800825#comment-16800825
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

ihuzenko commented on pull request #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r268716085
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestNearestDateFunctions.java
 ##
 @@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.time.LocalDateTime;
+
+import static org.junit.Assert.assertTrue;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestNearestDateFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testNearestDate() throws Exception {
 
 Review comment:
   ok, cool)  
 



[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800823#comment-16800823
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

ihuzenko commented on pull request #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r268714887
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/NearestDateUtils.java
 ##
 @@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+
+import java.time.temporal.TemporalAdjusters;
+import java.time.LocalDateTime;
+import java.time.DayOfWeek;
+import java.time.temporal.ChronoUnit;
+import java.util.Arrays;
+
+public class NearestDateUtils {
+  /**
+   * Specifies the time grouping to be used with the nearest date function
+   */
+  private enum TimeInterval {
+    YEAR,
+    QUARTER,
+    MONTH,
+    WEEK_SUNDAY,
+    WEEK_MONDAY,
+    DAY,
+    HOUR,
+    HALF_HOUR,
+    QUARTER_HOUR,
+    MINUTE,
+    HALF_MINUTE,
+    QUARTER_MINUTE,
+    SECOND
+  }
+
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(NearestDateUtils.class);
+
+  /**
+   * This function takes a Java LocalDateTime object, and an interval string and returns
+   * the nearest date closest to that time.  For instance, if you specified the date as 2018-05-04 and YEAR, the function
+   * will return 2018-01-01
+   *
+   * @param d the original datetime before adjustments
+   * @param interval The interval string to deduct from the supplied date
+   * @return the modified LocalDateTime
+   */
+  public final static java.time.LocalDateTime getDate(java.time.LocalDateTime d, String interval) {
+    java.time.LocalDateTime newDate = d;
+    int year = d.getYear();
+    int month = d.getMonth().getValue();
+    int day = d.getDayOfMonth();
+    int hour = d.getHour();
+    int minute = d.getMinute();
+    int second = d.getSecond();
+    TimeInterval adjustmentAmount;
+    try {
+      adjustmentAmount = TimeInterval.valueOf(interval.toUpperCase());
+    } catch (IllegalArgumentException e) {
+      throw new DrillRuntimeException(String.format("[%s] is not a valid time statement. Expecting: %s", interval, Arrays.asList(TimeInterval.values())));
+    }
+    switch (adjustmentAmount) {
+      case YEAR:
+        newDate = LocalDateTime.of(year, 1, 1, 0, 0, 0);
+        break;
+      case QUARTER:
+        newDate = LocalDateTime.of(year, (month / 3) * 3 + 1, 1, 0, 0, 0);
+        break;
+      case MONTH:
+        newDate = LocalDateTime.of(year, month, 1, 0, 0, 0);
+        break;
+      case WEEK_SUNDAY:
+        newDate = newDate.with(TemporalAdjusters.previousOrSame(DayOfWeek.SUNDAY))
+            .truncatedTo(ChronoUnit.DAYS);
+        break;
+      case WEEK_MONDAY:
+        newDate = newDate.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY))
+            .truncatedTo(ChronoUnit.DAYS);
+        break;
+      case DAY:
+        newDate = LocalDateTime.of(year, month, day, 0, 0, 0);
+        break;
+      case HOUR:
+        newDate = LocalDateTime.of(year, month, day, hour, 0, 0);
+        break;
+      case HALF_HOUR:
+        if (minute >= 30) {
+          minute = 30;
+        } else {
+          minute = 0;
+        }
+        newDate = LocalDateTime.of(year, month, day, hour, minute, 0);
+        break;
+      case QUARTER_HOUR:
+        if (minute >= 45) {
+          minute = 45;
+        } else if (minute >= 30) {
+          minute = 30;
+        } else if (minute >= 15) {
+          minute = 15;
+        } else {
+          minute = 0;
+        }
+        newDate = LocalDateTime.of(year, month, day, hour, minute, 0);
+        break;
+      case MINUTE:
+        newDate = LocalDateTime.of(year, month, day, hour, minute, 0);
+        break;
+      case HALF_MINUTE:
+        if (second >= 30) {
+          second = 30;
+        } else {
+          second = 0;
+        }
+        newDate = LocalDateTime.of(year, month, day, hour, minute, second);
+        break;
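
The QUARTER case in the diff quoted above derives the first month of the quarter from a one-based month with (month / 3) * 3 + 1. The sketch below evaluates that expression next to a variant that shifts the month to zero-based before dividing, so the two can be compared for months at the end of a quarter. The QuarterStart class and its method names are invented for this illustration and are not part of the patch.

```java
import java.time.LocalDateTime;

// Hypothetical comparison of two ways to find the first month of a quarter.
class QuarterStart {

  // Expression as it appears in the quoted diff (month is 1..12).
  static int quotedFirstMonth(int month) {
    return (month / 3) * 3 + 1;
  }

  // Variant that converts to a zero-based month before the integer division.
  static int zeroBasedFirstMonth(int month) {
    return ((month - 1) / 3) * 3 + 1;
  }

  // Start of the calendar quarter containing d, using the zero-based variant.
  static LocalDateTime startOfQuarter(LocalDateTime d) {
    return LocalDateTime.of(d.getYear(), zeroBasedFirstMonth(d.getMonthValue()), 1, 0, 0, 0);
  }
}
```

For February both expressions give month 1, which is the only quarter input the test in this pull request exercises. For March the quoted expression gives 4 and for December it gives 13, while the zero-based variant gives 1 and 10, so inputs in the last month of a quarter are worth checking.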

[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800394#comment-16800394
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

cgivre commented on pull request #1680: DRILL-7077: Add Function to Facilitate 
Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r268484737
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/NearestDateUtils.java
 ##
 @@ -0,0 +1,149 @@

[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800393#comment-16800393
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

cgivre commented on pull request #1680: DRILL-7077: Add Function to Facilitate 
Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r268484310
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestNearestDateFunctions.java
 ##
 @@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.time.LocalDateTime;
+
+import static org.junit.Assert.assertTrue;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestNearestDateFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testNearestDate() throws Exception {
 
 Review comment:
   @ihuzenko 
   If the values are incorrect, the test method will output something like "received X, expecting Y", so it's pretty easy to see what's wrong.
 



[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800391#comment-16800391
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

cgivre commented on pull request #1680: DRILL-7077: Add Function to Facilitate 
Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r268484099
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/NearestDateUtils.java
 ##
 @@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+
+import java.time.temporal.TemporalAdjusters;
+import java.time.LocalDateTime;
+import java.time.DayOfWeek;
+import java.time.temporal.ChronoUnit;
+import java.util.Arrays;
+
+public class NearestDateUtils {
+  /**
+   * Specifies the time grouping to be used with the nearest date function
+   */
+  private enum TimeInterval {
+    YEAR,
+    QUARTER,
+    MONTH,
+    WEEK_SUNDAY,
+    WEEK_MONDAY,
+    DAY,
+    HOUR,
+    HALF_HOUR,
+    QUARTER_HOUR,
+    MINUTE,
+    HALF_MINUTE,
+    QUARTER_MINUTE,
+    SECOND
+  }
+
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(NearestDateUtils.class);
 
 Review comment:
   Fixed
 



[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800390#comment-16800390
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

cgivre commented on pull request #1680: DRILL-7077: Add Function to Facilitate 
Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r268484081
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestNearestDateFunctions.java
 ##
 @@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.time.LocalDateTime;
+
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestNearestDateFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testNearestDate() throws Exception {
+    String query = "SELECT nearestDate( TO_TIMESTAMP('2019-02-01 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'YEAR') AS nearest_year, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-01 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER') AS nearest_quarter, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'MONTH') AS nearest_month, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'DAY') AS nearest_day, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'WEEK_SUNDAY') AS nearest_week_sunday, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'WEEK_MONDAY') AS nearest_week_monday, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'HOUR') AS nearest_hour, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:42:00', 'yyyy-MM-dd HH:mm:ss'), 'HALF_HOUR') AS nearest_half_hour, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:48:00', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER_HOUR') AS nearest_quarter_hour, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'MINUTE') AS nearest_minute, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:22', 'yyyy-MM-dd HH:mm:ss'), 'HALF_MINUTE') AS nearest_30second, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:22', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER_MINUTE') AS nearest_15second, " +
+        "nearestDate( TO_TIMESTAMP('2019-02-15 07:22:31', 'yyyy-MM-dd HH:mm:ss'), 'SECOND') AS nearest_second " +
+        "FROM (VALUES(1))";
+    testBuilder()
+        .sqlQuery(query)
+        .unOrdered()
+        .baselineColumns("nearest_year",
+            "nearest_quarter",
+            "nearest_month",
+            "nearest_day",
+            "nearest_week_sunday",
+            "nearest_week_monday",
+            "nearest_hour",
+            "nearest_half_hour",
+            "nearest_quarter_hour",
+            "nearest_minute",
+            "nearest_30second",
+            "nearest_15second",
+            "nearest_second")
+        .baselineValues(LocalDateTime.of(2019, 1, 1, 0, 0, 0),  //Year
+            LocalDateTime.of(2019, 1, 1, 0, 0, 0),  //Quarter
+            LocalDateTime.of(2019, 2, 1, 0, 0, 0), //Month
+            LocalDateTime.of(2019, 2, 15, 0, 0, 0), //Day
+            LocalDateTime.of(2019, 2, 10, 0, 0, 0), //Week Sunday
+            LocalDateTime.of(2019, 2, 11, 0, 0, 0),