LuciferYang opened a new pull request, #55816: URL: https://github.com/apache/spark/pull/55816
### What changes were proposed in this pull request?

Extend the bulk read+widen pattern introduced in SPARK-56791 to `FloatToDoubleUpdater` (Parquet FLOAT read into Spark `DoubleType`).

A new `readFloatsAsDoubles` default method on `VectorizedValuesReader` provides the per-row fallback. `VectorizedPlainValuesReader` overrides it to fetch the source bytes once via `getBuffer(total * 4)` and run a tight in-method conversion loop. `FloatToDoubleUpdater.readValues` becomes a one-line delegation.

The widen is Java's primitive float-to-double conversion: it is exact for every finite and infinite float, and a NaN float widens to a double NaN (the JVM may canonicalize the payload).

### Why are the changes needed?

On the legacy path, `FloatToDoubleUpdater.readValues` allocates a fresh `ByteBuffer` slice inside `getBuffer(4)` for every element, and that allocation dominates the loop. Collapsing N allocations into one is the same win SPARK-56791 delivered for the INT32 -> Long sibling.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

(To be updated after the GHA benchmark and test runs complete.)

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code
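
For readers unfamiliar with the bulk read+widen pattern described in this PR, the following is a minimal, self-contained sketch of the idea. It is not the Spark implementation: the class `FloatWidenSketch` and the methods `widenPerElement` and `widenBulk` are hypothetical names, a plain `double[]` stands in for the output column vector, and the little-endian layout is assumed from Parquet's PLAIN encoding.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; none of these names exist in Spark.
final class FloatWidenSketch {

    // Per-element path: one fresh 4-byte slice is created for every value.
    static void widenPerElement(ByteBuffer src, double[] out, int total) {
        for (int i = 0; i < total; i++) {
            // slice() allocates a new buffer object on each iteration.
            ByteBuffer slice = src.slice().order(ByteOrder.LITTLE_ENDIAN);
            slice.limit(4);
            src.position(src.position() + 4);
            // Implicit float-to-double widening; exact for finite and infinite values.
            out[i] = slice.getFloat();
        }
    }

    // Bulk path: one slice covering total * 4 bytes, then a tight conversion loop.
    static void widenBulk(ByteBuffer src, double[] out, int total) {
        ByteBuffer slice = src.slice().order(ByteOrder.LITTLE_ENDIAN);
        slice.limit(total * 4);
        src.position(src.position() + total * 4);
        for (int i = 0; i < total; i++) {
            out[i] = slice.getFloat();
        }
    }
}
```

Both methods produce the same widened values; the bulk variant simply replaces N slice allocations with one, which is the effect the PR attributes to the `getBuffer(total * 4)` override.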
