Copilot commented on code in PR #2129:
URL: https://github.com/apache/auron/pull/2129#discussion_r3007143666


##########
native-engine/datafusion-ext-functions/src/spark_dates.rs:
##########
@@ -46,6 +46,29 @@ pub fn spark_day(args: &[ColumnarValue]) -> Result<ColumnarValue> {
     Ok(ColumnarValue::Array(date_part(&input, DatePart::Day)?))
 }
 
+/// `spark_dayofweek(date/timestamp/compatible-string)`
+///
+/// Matches Spark's `dayofweek()` semantics:
+/// Sunday = 1, Monday = 2, ..., Saturday = 7.
+pub fn spark_dayofweek(args: &[ColumnarValue]) -> Result<ColumnarValue> {
+    let input = cast(&args[0].clone().into_array(1)?, &DataType::Date32)?;
+    let input = input
+        .as_any()
+        .downcast_ref::<Date32Array>()
+        .expect("internal cast to Date32 must succeed");
+
+    let epoch = NaiveDate::from_ymd_opt(1970, 1, 1).expect("1970-01-01 must be a valid date");
+    let dayofweek = Int32Array::from_iter(input.iter().map(|opt_days| {
+        opt_days.and_then(|days| {
+            epoch
+                .checked_add_signed(Duration::days(days as i64))
+                .map(|date| date.weekday().num_days_from_sunday() as i32 + 1)

Review Comment:
   `spark_dayofweek` computes the weekday by materializing a `NaiveDate` per row (epoch + `Duration::days`) and relies on `checked_add_signed`, which yields NULL for Date32 values outside chrono's supported range. This is both slower than necessary and can diverge from Spark/Date32 semantics for extreme (but still representable) dates. Consider computing the day-of-week directly from the Date32 day offset with modular arithmetic (e.g., based on the known weekday of 1970-01-01) to avoid both chrono's range limits and the per-row date construction.
   ```suggestion
       // Date32 is days since 1970-01-01. 1970-01-01 is a Thursday.
       // If we number weekdays so that Sunday = 0, ..., Saturday = 6,
       // then 1970-01-01 corresponds to 4. For an offset `days`,
       // weekday_index = (days + 4) mod 7 gives 0 = Sunday, ..., 6 = Saturday.
       // Spark wants Sunday = 1, ..., Saturday = 7, so we add 1.
       let dayofweek = Int32Array::from_iter(input.iter().map(|opt_days| {
           opt_days.map(|days| {
               let weekday_index = (days as i64 + 4).rem_euclid(7);
               weekday_index as i32 + 1
   ```



##########
spark-extension-shims-spark/src/test/scala/org/apache/auron/AuronFunctionSuite.scala:
##########
@@ -117,6 +117,17 @@ class AuronFunctionSuite extends AuronQueryTest with BaseAuronSQLSuite {
     }
   }
 
+  test("dayofweek function") {
+    withTable("t1") {
+      sql("create table t1(c1 date) using parquet")
+      sql("insert into t1 values(date'2009-07-30')")
+
+      checkSparkAnswerAndOperator("select dayofweek(c1) from t1")
+      checkAnswer(sql("select dayofweek(c1) from t1"), Seq(Row(5)))
+      checkAnswer(sql("select dayofweek('2009-07-30')"), Seq(Row(5)))
+    }

Review Comment:
   The new test only covers a DATE column input and a string literal. Since the PR claims `dayofweek()` supports TIMESTAMP and is null-safe, please add assertions for (1) NULL input returning NULL and (2) a TIMESTAMP input (column or literal) matching Spark's result, ideally with an explicit session `timeZone` so the expected behavior is stable across environments.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]