vvysotskyi commented on a change in pull request #2040: DRILL-7668: Allow Time
Bucket Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r403725422
##########
File path:
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
##########
@@ -97,9 +99,88 @@ public void eval() {
long timestamp = inputDate.value;
// Get the interval in milliseconds
- long intervalToAdd = interval.value;
+ long groupByInterval = interval.value;
- out.value = timestamp - (timestamp % intervalToAdd);
+ out.value = timestamp - (timestamp % groupByInterval);
+ }
+ }
+
+ /**
+ * This function is used for facilitating time series analysis by creating buckets of time intervals. See
+ * https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/ for usage. The function takes two arguments:
+ * 1. The timestamp (as a Drill timestamp)
+ * 2. The desired bucket interval IN milliseconds
+ *
+ * The function returns a BIGINT of the nearest time bucket.
+ */
+ @FunctionTemplate(name = "time_bucket",
+ scope = FunctionTemplate.FunctionScope.SIMPLE,
+ nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+ public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+ @Param
+ TimeStampHolder inputDate;
+
+ @Param
+ BigIntHolder interval;
+
+ @Output
+ TimeStampHolder out;
+
+ @Override
+ public void setup() {
+ }
+
+ @Override
+ public void eval() {
+ // Get the timestamp in milliseconds
+ long timestamp = inputDate.value;
+
+ // Get the interval in milliseconds
+ long groupByInterval = interval.value;
+
+ java.time.Instant instant = java.time.Instant.ofEpochMilli(timestamp - (timestamp % groupByInterval));
+ java.time.LocalDateTime localDate = instant.atZone(java.time.ZoneId.of("UTC")).toLocalDateTime();
+
+ out.value = localDate.atZone(java.time.ZoneId.of("UTC")).toInstant().toEpochMilli();
Review comment:
@cgivre, could you please explain what happens here? First you calculate
the bucket start in milliseconds, then create an `Instant` from it, convert
that to a `LocalDateTime` in the `UTC` time zone, convert that to a
`ZonedDateTime`, convert that back to an `Instant`, and finally convert it
back to milliseconds.
Are all these transformations required? Usually, UDFs shouldn't apply a
time zone to the values they handle.
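To make the question concrete, here is a minimal standalone sketch (not code from the PR) that compares the plain arithmetic with the same value pushed through the conversions quoted above. The class name `TimeBucketRoundTrip` and both method names are hypothetical, and plain `long` values stand in for the Drill holders. Because both conversions use the same fixed `UTC` zone, the round trip should hand back the milliseconds it started with.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical helper class for illustration only; not part of Drill.
public class TimeBucketRoundTrip {

  // Bucket start computed with plain arithmetic on epoch milliseconds.
  static long bucketDirect(long timestamp, long interval) {
    return timestamp - (timestamp % interval);
  }

  // Same bucket start, passed through the conversion chain quoted in the diff.
  static long bucketViaJavaTime(long timestamp, long interval) {
    Instant instant = Instant.ofEpochMilli(timestamp - (timestamp % interval));
    LocalDateTime local = instant.atZone(ZoneId.of("UTC")).toLocalDateTime();
    return local.atZone(ZoneId.of("UTC")).toInstant().toEpochMilli();
  }

  public static void main(String[] args) {
    long interval = 60_000L; // one-minute buckets, in milliseconds
    long[] samples = {0L, 86_399_999L, 1586000123456L};
    for (long ts : samples) {
      // Both columns are expected to match for every sample.
      System.out.println(ts + " -> " + bucketDirect(ts, interval)
          + " / " + bucketViaJavaTime(ts, interval));
    }
  }
}
```

If the two results always agree, as this sketch suggests, the assignment could presumably be reduced to `out.value = timestamp - (timestamp % groupByInterval);` with no `java.time` calls at all.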