cloud-fan opened a new pull request, #55815:
URL: https://github.com/apache/spark/pull/55815
### What changes were proposed in this pull request?
Five small cleanups in the segment-tree moving-frame window code introduced
by #55422:
1. `AggregateProcessor.evaluate(source, target)` -- replace the
buffer-layout contract `assert` with `require`. The existing comment notes the
contract is "invisible at the call site and easy to break from either end" and
that the check exists to "surface drift loudly instead of producing silently
garbled output". But `assert` is not guaranteed to run in production (Scala's
`Predef.assert` is `@elidable` and can be compiled out), so a
future refactor that changes `WindowSegmentTree`'s buffer layout could silently
produce wrong results in production while passing tests. `require` is not
elidable, so the safeguard stays active everywhere.
2. `WindowEvaluatorFactoryBase.scala` -- fix terminology in the `def
processor` comment. The comment says `Keep as def (by-name)`, but `def
processor(index: Int)` is a parameterized method, not a by-name parameter (`=>
T`). Reword to `Keep as def (lazy / per-call)`.
3. `WindowEvaluatorFactoryBase.eligibleForSegTree` -- add a defensive `case
_ => false` to the `frameType match` so that a future addition to the sealed
`FrameType` trait falls through to `false` instead of throwing `MatchError` at
runtime.
4. `WindowEvaluatorFactoryBase.estimateMaxCachedBlocks` -- add a comment
justifying the `+ 2` slack in the cached-block budget (one boundary block at
each end of the frame's interval), since the magic number was not previously
explained.
5. `WindowSegmentTreeSuite.scala` -- fix indentation of 11 `test(` blocks
that were declared at 4-space indent (inconsistent with the file's 2-space
convention and the 3 other correctly-indented tests in the same file).
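The terminology point in item 2 can be illustrated with a minimal sketch. The names below (`DefVsByName`, `builds`, `useTwice`) are hypothetical and are not taken from `WindowEvaluatorFactoryBase`; only the shape `def processor(index: Int)` mirrors the description above. A parameterized `def` re-runs its body on every call, while "by-name" refers specifically to a parameter declared as `=> T`.

```scala
object DefVsByName {
  // Counts how many times the (hypothetical) processor body has run.
  var builds = 0

  // A parameterized method: the body runs on every call.
  // This is the shape of `def processor(index: Int)`; it is NOT a by-name parameter.
  def processor(index: Int): String = {
    builds += 1
    s"processor-$index"
  }

  // A by-name parameter (`=> String`): the argument expression is
  // re-evaluated each time `p` is referenced inside the method.
  def useTwice(p: => String): String = p + "," + p

  def main(args: Array[String]): Unit = {
    processor(0)
    processor(0)
    println(builds) // two calls, two evaluations of the method body

    useTwice(processor(1))
    println(builds) // the by-name argument was evaluated twice more
  }
}
```

Both forms defer evaluation, which is presumably why the original comment conflated them, but only the second is what Scala calls by-name.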
A separate follow-up is needed to add RANGE-frame coverage to
`WindowBenchmark` -- the current benchmark is RowFrame-only -- but that
requires regenerating the committed JDK 17/21/25 results files and is deferred.
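Returning to the `+ 2` slack in item 4: the following sketch checks, by brute force, why two extra blocks are enough when a contiguous frame is laid over fixed-size blocks. It is an illustration under stated assumptions, not the formula used by `estimateMaxCachedBlocks`; the names `blocksTouched`, `budget`, and `worstCase`, and the assumption that the budget is `width / blockSize + 2`, are hypothetical.

```scala
object BlockBudgetSketch {
  // Number of fixed-size blocks overlapped by the row interval
  // [start, start + width - 1], assuming width >= 1 and blockSize >= 1.
  def blocksTouched(start: Int, width: Int, blockSize: Int): Int =
    (start + width - 1) / blockSize - start / blockSize + 1

  // Hypothetical budget: the blocks a frame can fully span, plus one
  // partially covered boundary block at each end of the interval.
  def budget(width: Int, blockSize: Int): Int = width / blockSize + 2

  // Worst case over every alignment of the interval against block boundaries.
  def worstCase(width: Int, blockSize: Int): Int =
    (0 until blockSize).map(start => blocksTouched(start, width, blockSize)).max

  def main(args: Array[String]): Unit = {
    // Check that the budget is never exceeded for a range of small sizes.
    val ok = (for {
      blockSize <- 1 to 16
      width <- 1 to 64
    } yield worstCase(width, blockSize) <= budget(width, blockSize)).forall(identity)
    println(s"budget holds for all tested sizes: $ok")
  }
}
```

For example, a 2-row frame over 4-row blocks can straddle a block boundary and touch 2 blocks, which is exactly `2 / 4 + 2`, so under these assumptions the slack cannot be reduced to `+ 1`.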
### Why are the changes needed?
Review-comment-style follow-ups. Nothing here changes behavior beyond the
`assert -> require` swap, which only fires if an internal contract is violated
-- in which case throwing in production is the desired behavior.
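As a rough illustration of what the swap changes for callers: in Scala, a failing `assert` throws `AssertionError` and is marked `@elidable`, so it can be compiled out, whereas a failing `require` throws `IllegalArgumentException` and is not elidable. The sketch below uses hypothetical helpers (`checkWithAssert`, `checkWithRequire`, `failureOf`) and a made-up length comparison; it is not the actual check in `AggregateProcessor.evaluate`.

```scala
object AssertVsRequire {
  // Hypothetical stand-in for the buffer-layout check, written with assert.
  def checkWithAssert(sourceLen: Int, targetLen: Int): Unit =
    assert(sourceLen == targetLen, s"layout mismatch: $sourceLen vs $targetLen")

  // The same hypothetical check, written with require.
  def checkWithRequire(sourceLen: Int, targetLen: Int): Unit =
    require(sourceLen == targetLen, s"layout mismatch: $sourceLen vs $targetLen")

  // Runs the body and reports the simple class name of whatever it throws, if anything.
  def failureOf(body: => Unit): Option[String] =
    try { body; None }
    catch { case t: Throwable => Some(t.getClass.getSimpleName) }

  def main(args: Array[String]): Unit = {
    // With default compiler settings this reports an AssertionError;
    // if assertions are elided at compile time it reports None.
    println(failureOf(checkWithAssert(3, 4)))
    // Reports an IllegalArgumentException regardless of elision settings.
    println(failureOf(checkWithRequire(3, 4)))
    // A satisfied contract throws nothing in either form.
    println(failureOf(checkWithRequire(4, 4)))
  }
}
```

One consequence is that any code that catches the failure sees a different exception type after the swap, which matters only when the contract is already violated.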
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing tests cover all touched code paths. The `require` change is
exercised by the same callers that exercise the `assert`; the test indentation
fix is whitespace-only; the comment and `case _` changes have no runtime effect.
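The claim that the `case _` change has no runtime effect today can be sketched as follows. The `FrameType` hierarchy and the eligibility logic below are simplified, hypothetical stand-ins, not the code in `WindowEvaluatorFactoryBase`; the point is only that a trailing wildcard leaves the result unchanged for every case that already exists.

```scala
object FrameTypeFallback {
  // Hypothetical stand-ins for the sealed frame-type hierarchy.
  sealed trait FrameType
  case object RowFrame extends FrameType
  case object RangeFrame extends FrameType

  // Before: every current case is listed, with no fallback.
  def eligibleWithoutFallback(frameType: FrameType): Boolean = frameType match {
    case RowFrame   => true
    case RangeFrame => true
  }

  // After: same cases, plus a defensive wildcard that returns false.
  def eligibleWithFallback(frameType: FrameType): Boolean = frameType match {
    case RowFrame   => true
    case RangeFrame => true
    case _          => false
  }

  def main(args: Array[String]): Unit = {
    // Both versions agree on every frame type that exists today.
    val agree = Seq[FrameType](RowFrame, RangeFrame)
      .forall(ft => eligibleWithoutFallback(ft) == eligibleWithFallback(ft))
    println(s"versions agree on existing cases: $agree")
  }
}
```

One trade-off worth noting: for a sealed trait, the wildcard also silences the compiler's non-exhaustive-match warning, so a newly added frame type would be treated as ineligible without any compile-time prompt to revisit this match.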
### Was this patch authored or co-authored using generative AI tooling?
Yes, Claude assisted in identifying and drafting these cleanups.