[
https://issues.apache.org/jira/browse/SPARK-56769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rito Takeuchi updated SPARK-56769:
----------------------------------
Description:
h2. Background
`DateTimeUtils.truncTimestamp` for the WEEK / MONTH / QUARTER / YEAR levels
currently routes through:
{code:scala}
case _ => // Try to truncate date levels
  val dDays = microsToDays(micros, zoneId)
  daysToMicros(truncDate(dDays, level), zoneId)
{code}
`microsToDays` allocates `Instant` + `ZonedDateTime` + `LocalDate` per row;
`daysToMicros` allocates `LocalDate` + `ZonedDateTime` + `Instant`. `truncDate`
itself allocates one more `LocalDate` for MONTH/YEAR (in `getDayOfMonth` /
`getDayInYear`) and *two* for QUARTER (the existing implementation goes
through `IsoFields.DAY_OF_QUARTER`, a `TemporalField` whose adjustment
produces a fresh `LocalDate`). The result is 167-218 ns/row on JDK 17 GH
Actions runners.
SPARK-56663 introduced the offset-arithmetic + DST-equality-guard pattern
for the time-level units (MINUTE / HOUR / DAY) and confirmed that the same
pattern is sound for any unit that evenly divides {{MICROS_PER_DAY}}. The
date-level branch is a natural extension.
h2. Proposal
Add a `truncDateFast` helper paralleling `truncToUnitFast` from SPARK-56663:
# Resolve the zone offset at `micros` once.
# Compute the local epoch-day by integer division: {{Math.floorDiv(micros +
offsetMicros, MICROS_PER_DAY)}}.
# Run the existing `truncDate(localDays, level)` (pure integer math for WEEK;
one `LocalDate` alloc for MONTH/YEAR).
# Convert the truncated day back to UTC micros: {{truncatedDays *
MICROS_PER_DAY - offsetMicros}}.
# Verify the offset at the candidate equals the offset at the original
(the SPARK-30766 / SPARK-30857 DST guard); fall back to the slow
`microsToDays` / `daysToMicros` path if not.
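The five steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual patch: `offsetMicrosAt`, `truncToMonth`, `truncMonthSlow` and `truncMonthFast` are hypothetical names, only the MONTH level is shown, and days are kept as `Long` for simplicity.

{code:scala}
import java.time.{Instant, LocalDate, ZoneId}
import java.time.temporal.ChronoUnit

object TruncDateFastSketch {
  val MICROS_PER_SECOND = 1000000L
  val MICROS_PER_DAY = 86400L * MICROS_PER_SECOND

  // Hypothetical helper: zone offset in microseconds at a UTC instant.
  private def offsetMicrosAt(micros: Long, zoneId: ZoneId): Long = {
    val instant = Instant.ofEpochSecond(Math.floorDiv(micros, MICROS_PER_SECOND))
    zoneId.getRules.getOffset(instant).getTotalSeconds * MICROS_PER_SECOND
  }

  // Stand-in for the existing truncDate(days, level), MONTH level only.
  private def truncToMonth(days: Long): Long =
    LocalDate.ofEpochDay(days).withDayOfMonth(1).toEpochDay

  // Stand-in for the slow microsToDays / daysToMicros path.
  def truncMonthSlow(micros: Long, zoneId: ZoneId): Long = {
    val instant = Instant.EPOCH.plus(micros, ChronoUnit.MICROS)
    val localDays = instant.atZone(zoneId).toLocalDate.toEpochDay
    val start = LocalDate.ofEpochDay(truncToMonth(localDays)).atStartOfDay(zoneId).toInstant
    ChronoUnit.MICROS.between(Instant.EPOCH, start)
  }

  def truncMonthFast(micros: Long, zoneId: ZoneId): Long = {
    val offsetMicros = offsetMicrosAt(micros, zoneId)                     // step 1
    val localDays = Math.floorDiv(micros + offsetMicros, MICROS_PER_DAY)  // step 2
    val truncatedDays = truncToMonth(localDays)                           // step 3
    val candidate = truncatedDays * MICROS_PER_DAY - offsetMicros         // step 4
    // Step 5: accept the candidate only if the offset did not change.
    if (offsetMicrosAt(candidate, zoneId) == offsetMicros) candidate
    else truncMonthSlow(micros, zoneId)
  }
}
{code}

For a timestamp in mid-March 2024 in `America/Los_Angeles`, the offset at the original instant (PDT) differs from the offset at the start of the month (PST), so the guard in step 5 should route that row to the slow path; a mid-July timestamp stays on the fast path.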
Also rewrite `TRUNC_TO_QUARTER` from `IsoFields.DAY_OF_QUARTER` (a
`TemporalField` whose adjustment produces a fresh `LocalDate`) to a direct
`withMonth(firstMonthOfQuarter).withDayOfMonth(1)` chain on the existing
`LocalDate`. This saves one allocation plus the field-adjustment overhead.
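As an illustration of the quarter rewrite, the two styles could look like the sketch below. The object and method names are hypothetical, and the `firstMonthOfQuarter` formula is an assumption about how the first month would be derived; the actual patch may compute it differently.

{code:scala}
import java.time.LocalDate
import java.time.temporal.IsoFields

object QuarterTruncSketch {
  // Existing style: adjust through the IsoFields.DAY_OF_QUARTER field.
  def viaIsoField(d: LocalDate): LocalDate =
    d.`with`(IsoFields.DAY_OF_QUARTER, 1L)

  // Proposed style: months 1-3 map to 1, 4-6 to 4, 7-9 to 7, 10-12 to 10.
  def viaDirectChain(d: LocalDate): LocalDate = {
    val firstMonthOfQuarter = ((d.getMonthValue - 1) / 3) * 3 + 1
    // withMonth clamps the day to the last valid day of the target month,
    // so dates such as May 31 are safe before withDayOfMonth(1) runs.
    d.withMonth(firstMonthOfQuarter).withDayOfMonth(1)
  }
}
{code}

Both functions should return the same date for any input, for example 2024-04-01 for 2024-05-31.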
h2. Benchmark
`DateTimeBenchmark` Truncation, wholestage on, ns/row, on a 12th Gen Intel
i7-1260P:
|| level || master baseline || this PR || speedup ||
| WEEK | 165.2 | 78.2 | 2.11x |
| MONTH | 181.9 | 92.2 | 1.97x |
| MM | 182.2 | 92.5 | 1.97x |
| MON | 182.9 | 92.7 | 1.97x |
| QUARTER | 216.8 | 108.8 | 1.99x |
| YEAR | 205.2 | 96.7 | 2.12x |
| YYYY | 205.8 | 96.9 | 2.12x |
| YY | 206.3 | 96.0 | 2.15x |
Stacked on top of SPARK-56663, the cumulative speedup vs master is in the
same range, since this PR only affects truncation levels that SPARK-56663
did not touch.
h2. Out of scope
* `trunc(date, ...)` (date input, no zoneId) -- this PR only changes the
`timestamp -> date_trunc` flow. The `TruncDate` expression bypasses
`truncTimestamp` entirely; the only change visible to it is the
`TRUNC_TO_QUARTER` rewrite (which `trunc(date, ...)` doesn't use in the
benchmark today).
* MICROSECOND / MILLISECOND / SECOND / MINUTE / HOUR / DAY -- handled by
SPARK-56663.
* Per-instance offset cache -- a separate optimization that would amortize
the {{rules.getOffset}} call across rows. Would benefit both this PR's
and SPARK-56663's paths. Out of scope here.
* Integer-only calendar arithmetic (Hinnant-style) -- would eliminate the
remaining `LocalDate` allocation inside `truncDate` for MONTH/YEAR and
push date-level units to the same floor as time-level units. Out of
scope here.
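For reference, the integer-only idea in the last bullet could look roughly like the sketch below for the MONTH level. It follows the well-known `civil_from_days` computation from Howard Hinnant's date algorithms; the object and method names are hypothetical, and nothing here is part of this PR.

{code:scala}
object IntegerMonthTruncSketch {
  // Day of month (1-31) for a day count since 1970-01-01 in the proleptic
  // Gregorian calendar, computed without allocating a LocalDate.
  def dayOfMonth(epochDay: Long): Int = {
    val z = epochDay + 719468L                  // shift the epoch to 0000-03-01
    val era = Math.floorDiv(z, 146097L)         // 400-year cycle index
    val doe = z - era * 146097L                 // day of era, [0, 146096]
    val yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365 // [0, 399]
    val doy = doe - (365 * yoe + yoe / 4 - yoe / 100)               // [0, 365]
    val mp = (5 * doy + 2) / 153                // month index from March, [0, 11]
    (doy - (153 * mp + 2) / 5 + 1).toInt
  }

  // First day of the month containing epochDay, as an epoch day.
  def truncToMonth(epochDay: Long): Long =
    epochDay - (dayOfMonth(epochDay) - 1)
}
{code}

A YEAR variant would additionally need the month and year from the same computation, which is omitted here.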
h2. Related
* SPARK-56663 - introduced the offset-arithmetic fast path for
MINUTE / HOUR / DAY; this PR extends the same pattern to the date-level units.
* SPARK-33404 - introduced the slow path that this family of changes is
recovering from.
* SPARK-30766 / SPARK-30857 - the DST-correctness invariants from these
fixes are preserved here via the offset-equality guard.
> Add fast path for date_trunc WEEK/MONTH/QUARTER/YEAR
> ----------------------------------------------------
>
> Key: SPARK-56769
> URL: https://issues.apache.org/jira/browse/SPARK-56769
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 5.0.0
> Reporter: Rito Takeuchi
> Priority: Major