sdf-jkl opened a new issue, #22115:
URL: https://github.com/apache/datafusion/issues/22115
### Describe the bug
`date_part('isodow', ...)` and `EXTRACT(isodow FROM ...)` return values in
`0..=6` (Mon=0, …, Sun=6) instead of PostgreSQL's documented `1..=7` (Mon=1, …,
Sun=7).
Root cause: `datafusion/functions/src/datetime/date_part.rs:242` maps
`"isodow"` to `DatePart::DayOfWeekMonday0`, which arrow implements via chrono's
`Weekday::num_days_from_monday()` — defined as `0..=6` with Mon=0.
Every value is off by `-1`, and the range is wrong: df can return `0` but never
`7`, while PG returns `7` but never `0`.
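The off-by-one is pure numbering, not calendar math. A self-contained sketch (no chrono dependency, using Sakamoto's day-of-week algorithm as a stand-in for the calendar computation) shows the Monday-0 value that df currently returns versus the Monday-1 value PG documents:

```rust
/// Day of week via Sakamoto's method: 0 = Sunday ..= 6 = Saturday.
fn sakamoto(y: i32, m: u32, d: u32) -> u32 {
    const T: [i32; 12] = [0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4];
    let y = if m < 3 { y - 1 } else { y };
    ((y + y / 4 - y / 100 + y / 400 + T[(m - 1) as usize] + d as i32) % 7) as u32
}

fn main() {
    // 2000-01-01 was a Saturday.
    let sunday0 = sakamoto(2000, 1, 1);
    assert_eq!(sunday0, 6); // Saturday in Sunday-0 numbering
    // Monday-0 numbering (what num_days_from_monday() / df's isodow yields): 5.
    let monday0 = (sunday0 + 6) % 7;
    assert_eq!(monday0, 5);
    // PostgreSQL's isodow (Mon=1 ..= Sun=7): Saturday -> 6.
    assert_eq!(monday0 + 1, 6);
}
```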
### To Reproduce
```sql
-- 2000-01-01 was a Saturday; PG isodow should be 6
SELECT date_part('isodow', CAST('2000-01-01' AS DATE));
-- df returns: 5
-- 2020-09-08 was a Tuesday; PG isodow should be 2
SELECT EXTRACT(isodow FROM to_timestamp('2020-09-08T12:00:00+00:00'));
-- df returns: 1
```
Both wrong values exactly match `DayOfWeekMonday0` semantics. The bug is
also baked into the test corpus at
`datafusion/sqllogictest/test_files/datetime/date_part.slt:1243` and
`:1248,1253,1258`, so the test suite passes against the buggy output.
### Expected behavior
PostgreSQL-compatible `isodow` per ISO 8601:
| Day | PG `isodow` | df returns |
|---|---|---|
| Mon | 1 | **0** |
| Tue | 2 | **1** |
| Wed | 3 | **2** |
| Thu | 4 | **3** |
| Fri | 5 | **4** |
| Sat | 6 | **5** |
| Sun | 7 | **6** |
### Additional context
#### Caller-visible breakage
- `WHERE date_part('isodow', d) = 7` returns zero rows in df (PG: all
Sundays).
- `WHERE date_part('isodow', d) = 0` returns Mondays in df (PG: zero rows /
domain error).
- Cross-engine SQL silently diverges; users porting from PG/DuckDB/Snowflake
get value-shifted results.
#### Coordination with the Spark wrapper (two bugs that cancel today)
`datafusion/spark/src/function/datetime/date_part.rs:127-133` post-evaluates
a `+ 1` for both `dow` and `isodow`:
```rust
match part {
    "dow" | "isodow" => date_part_expr + 1,
    _ => date_part_expr,
}
```
Per the [Spark
docs](https://spark.apache.org/docs/4.2.0-preview2/api/sql/datetime-functions/):
- `DAYOFWEEK` (alias `DOW`) — Sun(1) → Sat(7)
- `DAYOFWEEK_ISO` (alias `DOW_ISO`) — Mon(1) → Sun(7), ISO 8601
For a known Friday (`2011-05-06`, Spark's own `extract.sql` fixture), real
Spark returns:
- `extract(dayofweek)` = 6
- `extract(dayofweek_iso)` = 5
The df-spark wrapper currently produces those same values, but the two `+1`s
do semantically different jobs:
- **`dow`'s `+1` is a real convention shift**: df's `dow` is PG's 0..=6
Sun=0; Spark's `dayofweek` is 1..=7 Sun=1. Same anchor, different numbering —
`+1` bridges them. Right answer for the right reason.
- **`isodow`'s `+1` is bug compensation**: df's `isodow` and Spark's
`dayofweek_iso` should have the *same* convention (Mon=1, ISO 8601). No shift
should be needed. The `+1` works only because df's `isodow` is wrong by `-1`.
Right answer by accident — `BUG_in_df + extra_+1_in_wrapper =
correct_observable_output`.
Fixing df's `isodow` standalone silently breaks Spark's `dayofweek_iso`
(becomes 2..=8). The Spark wrapper's `isodow` branch must drop the `+ 1` in the
same PR.
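The cancellation, and the breakage from a df-only fix, can be sketched with plain integers (function names below are illustrative models, not df's internals):

```rust
// Toy model of the two layers, with Monday-0 numbering as the input.
fn df_isodow_buggy(monday0: u32) -> u32 { monday0 }       // bug: off by -1 vs PG
fn df_isodow_fixed(monday0: u32) -> u32 { monday0 + 1 }   // PG/ISO: Mon=1 ..= Sun=7

// Spark wrapper today: unconditional +1 on top of df's isodow.
fn spark_dow_iso(df_isodow: u32) -> u32 { df_isodow + 1 }

fn main() {
    let friday_monday0 = 4; // 2011-05-06 was a Friday
    // Today: bug + compensating +1 = correct observable output (5).
    assert_eq!(spark_dow_iso(df_isodow_buggy(friday_monday0)), 5);
    // After fixing df alone, the wrapper emits 6 (range 2..=8) — breakage.
    assert_eq!(spark_dow_iso(df_isodow_fixed(friday_monday0)), 6);
    // Fix both together: drop the wrapper's +1 and pass df's value through.
    assert_eq!(df_isodow_fixed(friday_monday0), 5);
}
```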
#### Proposed fix chain
1. **arrow-rs**: add `DatePart::DayOfWeekMonday1` variant returning
`num_days_from_monday() + 1` (1..=7, Mon=1). Non-breaking thanks to
`#[non_exhaustive]`. Symmetric with the existing `Week`/`WeekISO` and
`Year`/`YearISO` pairs.
2. **datafusion `date_part.rs:242`**: `"isodow" =>
DatePart::DayOfWeekMonday1`.
3. **datafusion Spark `date_part.rs:127-133`**: drop `"isodow"` from the
`+ 1` match (keep `"dow"` — the genuine convention shift).
4. **datafusion `date_part.slt`**: update lines 1243 (5→6) and
1248/1253/1258 (1→2) to PG-correct expectations.
Steps 2–4 ship as one datafusion PR once step 1 lands in arrow. An interim
datafusion PR can use `DayOfWeekMonday0 + 1` array arithmetic in step 2 to ship
the fix today, and swap to the cleaner variant when it ships in arrow.
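The interim step-2 arithmetic is a minimal sketch: evaluate `DayOfWeekMonday0`, then add 1 element-wise, preserving nulls. Modeled here with `Option<i32>` standing in for a nullable `Int32Array` (the helper name is hypothetical, not df's actual code):

```rust
// Hypothetical helper: shift Monday-0 values (Mon=0 ..= Sun=6) to
// PG's isodow (Mon=1 ..= Sun=7), passing nulls through unchanged.
fn isodow_from_monday0(values: &[Option<i32>]) -> Vec<Option<i32>> {
    values.iter().map(|v| v.map(|m0| m0 + 1)).collect()
}

fn main() {
    let out = isodow_from_monday0(&[Some(0), Some(5), Some(6), None]);
    assert_eq!(out, vec![Some(1), Some(6), Some(7), None]);
}
```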
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]