Copilot commented on code in PR #2149:
URL: https://github.com/apache/auron/pull/2149#discussion_r3025221300
##########
spark-extension/src/main/scala/org/apache/spark/sql/auron/NativeConverters.scala:
##########
@@ -952,6 +952,8 @@ object NativeConverters extends Logging {
buildTimePartExt("Spark_Minute", e.children.head, isPruningExpr,
fallback)
case e: Second if datetimeExtractEnabled =>
buildTimePartExt("Spark_Second", e.children.head, isPruningExpr,
fallback)
+ case e: MonthsBetween if datetimeExtractEnabled =>
Review Comment:
`MonthsBetween` conversion is guarded by `datetimeExtractEnabled`, but the
backing config option `SparkAuronConfiguration.DATETIME_EXTRACT_ENABLED` is
documented as enabling only “datetime extract operations”
(spark-extension/src/main/java/org/apache/auron/spark/configuration/SparkAuronConfiguration.java:506-510).
This makes native `months_between` unexpectedly disabled by default and
couples it to an unrelated feature flag. Consider introducing a dedicated
config for `months_between` (or a broader datetime-functions flag), or updating
the existing config’s name/description if it is intended to gate more than
extract functions.
```suggestion
case e: MonthsBetween =>
```
##########
native-engine/datafusion-ext-functions/src/spark_dates.rs:
##########
@@ -643,4 +808,120 @@ mod tests {
Ok(())
}
+
+ fn months_between_args(
+ timestamp1_ms: Option<i64>,
+ timestamp2_ms: Option<i64>,
+ round_off: Option<bool>,
+ timezone: Option<&str>,
+ ) -> [ColumnarValue; 4] {
+ [
+
ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(timestamp1_ms, None)),
+
ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(timestamp2_ms, None)),
+ ColumnarValue::Scalar(ScalarValue::Boolean(round_off)),
+
ColumnarValue::Scalar(ScalarValue::Utf8(timezone.map(str::to_string))),
+ ]
+ }
+
+ #[test]
+ fn test_spark_months_between_ignores_time_for_same_day() -> Result<()> {
+ let out = spark_months_between(&months_between_args(
+ Some(utc_ms(2024, 3, 15, 23, 59, 59)),
+ Some(utc_ms(2024, 1, 15, 0, 0, 0)),
+ Some(true),
+ Some("UTC"),
+ ))?
Review Comment:
The new unit tests validate core `months_between` behavior in UTC, but they
don’t cover the session time zone/DST-sensitive paths (e.g., non-UTC zones, DST
gaps/overlaps) even though the implementation has dedicated logic for local day
boundaries. Adding at least one test case with a DST-transition zone (like
America/New_York) would help prevent regressions in the timezone-aware
computation.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]