Re: [PR] [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies [spark]

2026-04-07 Thread via GitHub


LuciferYang commented on code in PR #55230:
URL: https://github.com/apache/spark/pull/55230#discussion_r3049360732


##
core/benchmarks/PlatformBenchmark-results.txt:
##
@@ -2,277 +2,277 @@
 Platform Byte Access
 

 
-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-AMD EPYC 7763 64-Core Processor
+OpenJDK 64-Bit Server VM 17.0.18+0 on Mac OS X 26.2
+Apple M4 Pro
 Byte Access:               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
 
-putByte: On-heap                      63             63           0      1590.5           0.6       1.0X
-putByte: Off-heap                     80             80           0      1245.7           0.8       0.8X
-getByte: On-heap                      59             59           0      1685.7           0.6       1.1X
-getByte: Off-heap                     47             48           1      2112.6           0.5       1.3X
+putByte: On-heap                      13             13           0      7745.2           0.1       1.0X
+putByte: Off-heap                     14             18           4      7026.1           0.1       0.9X
+getByte: On-heap                      25             26           0      3922.1           0.3       0.5X
+getByte: Off-heap                     25             27           3      3953.4           0.3       0.5X
 
 
 

 Platform Short Access
 

 
-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-AMD EPYC 7763 64-Core Processor
+OpenJDK 64-Bit Server VM 17.0.18+0 on Mac OS X 26.2
+Apple M4 Pro
 Short Access:              Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
 
-putShort: On-heap                     63             63           1      1586.2           0.6       1.0X
-putShort: Off-heap                   119            119           0       839.2           1.2       0.5X

Review Comment:
   This was pre-existing noise, a flaw in the earlier results. If this 
optimization proves effective, we can fix it as part of this effort. haha



---

Re: [PR] [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies [spark]

2026-04-07 Thread via GitHub


LuciferYang commented on code in PR #55230:
URL: https://github.com/apache/spark/pull/55230#discussion_r3049355930


##
core/benchmarks/PlatformBenchmark-results.txt:
##
@@ -2,277 +2,277 @@
 Platform Byte Access
 

 
-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-AMD EPYC 7763 64-Core Processor
+OpenJDK 64-Bit Server VM 17.0.18+0 on Mac OS X 26.2
+Apple M4 Pro

Review Comment:
   (Screenshot: https://github.com/user-attachments/assets/f9896be8-4ca7-4c80-b6e2-6072b2102937)
   
   You can find a similar workflow under your personal repository, then trigger 
the job here. You can run the Java 17, 21 & 25 tasks separately and have them 
submit the results automatically.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies [spark]

2026-04-07 Thread via GitHub


LuciferYang commented on code in PR #55230:
URL: https://github.com/apache/spark/pull/55230#discussion_r3049349054


##
core/benchmarks/PlatformBenchmark-results.txt:
##
@@ -2,277 +2,277 @@
 Platform Byte Access
 

 
-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-AMD EPYC 7763 64-Core Processor
+OpenJDK 64-Bit Server VM 17.0.18+0 on Mac OS X 26.2
+Apple M4 Pro

Review Comment:
   We need to run this with GitHub Actions.






Re: [PR] [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies [spark]

2026-04-07 Thread via GitHub


AngersZh commented on code in PR #55230:
URL: https://github.com/apache/spark/pull/55230#discussion_r3048946292


##
common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:
##
@@ -240,6 +240,10 @@ public static void setMemory(long address, byte value, long size) {
 
   public static void copyMemory(
     Object src, long srcOffset, Object dst, long dstOffset, long length) {
+    if (length <= UNSAFE_COPY_THRESHOLD) {

Review Comment:
   We used Parca to verify this, for this API only. The gain suggests that in 
prod most copies are smaller than 1 MB.
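
   For readers following the thread, the idea behind the 
`length <= UNSAFE_COPY_THRESHOLD` check can be sketched roughly as below. This 
is a minimal illustration and not the PR's code: it uses plain `byte[]` arrays 
and `System.arraycopy` instead of `Unsafe`, and the class `CopySketch`, the 
constant `THRESHOLD` (assumed to be 1 MiB), and the chunked fallback loop are 
all assumptions made for the example. Overlap handling is left out.

```java
import java.util.Arrays;

public class CopySketch {
  // Hypothetical threshold, assumed to be 1 MiB for this illustration.
  static final int THRESHOLD = 1024 * 1024;

  /** Copies length bytes and returns how many underlying copy calls were made. */
  static int copy(byte[] src, int srcOff, byte[] dst, int dstOff, int length) {
    if (length <= THRESHOLD) {
      // Fast path: a single call with no loop bookkeeping.
      System.arraycopy(src, srcOff, dst, dstOff, length);
      return 1;
    }
    // Slow path (assumed): copy in chunks of at most THRESHOLD bytes.
    int calls = 0;
    while (length > 0) {
      int size = Math.min(length, THRESHOLD);
      System.arraycopy(src, srcOff, dst, dstOff, size);
      length -= size;
      srcOff += size;
      dstOff += size;
      calls++;
    }
    return calls;
  }

  public static void main(String[] args) {
    byte[] src = new byte[3 * THRESHOLD + 5];
    Arrays.fill(src, (byte) 7);
    byte[] dst = new byte[src.length];
    System.out.println("small copy calls: " + copy(src, 0, dst, 0, 64));
    System.out.println("large copy calls: " + copy(src, 0, dst, 0, src.length));
    System.out.println("contents equal:   " + Arrays.equals(src, dst));
  }
}
```

   In this sketch a copy at or below the threshold costs one call, while a 
larger copy is split into several. Whether skipping the loop is measurable on 
a real JVM is exactly what the benchmarks discussed in this thread would need 
to show.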






Re: [PR] [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies [spark]

2026-04-07 Thread via GitHub


LuciferYang commented on code in PR #55230:
URL: https://github.com/apache/spark/pull/55230#discussion_r3048933881


##
common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:
##
@@ -240,6 +240,10 @@ public static void setMemory(long address, byte value, long size) {
 
   public static void copyMemory(
     Object src, long srcOffset, Object dst, long dstOffset, long length) {
+    if (length <= UNSAFE_COPY_THRESHOLD) {

Review Comment:
   @AngersZh Is it end-to-end, or just this API? Do we get consistent 
performance gains on Java 17, 21, and 25? If you think `PlatformBenchmark` does 
not capture this improvement, could you provide reproducible jmh benchmark code 
and the results run on GitHub Actions?
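
   As a starting point for such a reproduction, a rough timing harness could 
look like the sketch below. It is not a JMH benchmark and is no substitute for 
one: it uses `System.nanoTime` with a manual warm-up, plain `byte[]` arrays, 
and `System.arraycopy` rather than `Platform.copyMemory`. The class 
`CopyTimingSketch`, the constant `CHUNK` (assumed to be 1 MiB), and the chosen 
sizes and iteration counts are assumptions for illustration only.

```java
public class CopyTimingSketch {
  // Hypothetical chunk size used by the looped variant (assumed 1 MiB).
  static final int CHUNK = 1024 * 1024;

  // Direct variant: one copy call, as a fast path would do.
  static void direct(byte[] src, byte[] dst, int length) {
    System.arraycopy(src, 0, dst, 0, length);
  }

  // Chunked variant: always goes through the loop, even for small lengths.
  static void chunked(byte[] src, byte[] dst, int length) {
    int off = 0;
    while (length > 0) {
      int size = Math.min(length, CHUNK);
      System.arraycopy(src, off, dst, off, size);
      off += size;
      length -= size;
    }
  }

  /** Returns the average nanoseconds per copy; length and iters must be >= 1. */
  static double time(boolean useDirect, int length, int iters) {
    byte[] src = new byte[length];
    byte[] dst = new byte[length];
    long start = System.nanoTime();
    for (int i = 0; i < iters; i++) {
      src[0] = (byte) i; // vary the input between iterations
      if (useDirect) {
        direct(src, dst, length);
      } else {
        chunked(src, dst, length);
      }
    }
    long elapsed = System.nanoTime() - start;
    // Sanity check that the last copy actually happened.
    if (dst[0] != (byte) (iters - 1)) {
      throw new AssertionError("copy failed");
    }
    return (double) elapsed / iters;
  }

  public static void main(String[] args) {
    int[] sizes = {8, 64, 1024, 16 * 1024};
    for (int size : sizes) {
      time(true, size, 50_000);  // warm-up runs, results discarded
      time(false, size, 50_000);
      System.out.printf("%6d B  direct %.1f ns  chunked %.1f ns%n",
          size, time(true, size, 200_000), time(false, size, 200_000));
    }
  }
}
```

   Numbers from a harness like this are noisy and depend on the JVM and the 
machine, so they would only be a first sanity check before a proper JMH run on 
GitHub Actions across Java 17, 21, and 25.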






Re: [PR] [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies [spark]

2026-04-07 Thread via GitHub


AngersZh commented on code in PR #55230:
URL: https://github.com/apache/spark/pull/55230#discussion_r3048911397


##
common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:
##
@@ -240,6 +240,10 @@ public static void setMemory(long address, byte value, long size) {
 
   public static void copyMemory(
     Object src, long srcOffset, Object dst, long dstOffset, long length) {
+    if (length <= UNSAFE_COPY_THRESHOLD) {

Review Comment:
   > IIRC, I should have already taken note of this potential optimization 
point, but I remember there wasn't any significant optimization effect.
   
   In our production environment, we see a gain of nearly 1%.






Re: [PR] [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies [spark]

2026-04-06 Thread via GitHub


LuciferYang commented on code in PR #55230:
URL: https://github.com/apache/spark/pull/55230#discussion_r3043337643


##
common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:
##
@@ -240,6 +240,10 @@ public static void setMemory(long address, byte value, long size) {
 
   public static void copyMemory(
     Object src, long srcOffset, Object dst, long dstOffset, long length) {
+    if (length <= UNSAFE_COPY_THRESHOLD) {

Review Comment:
   IIRC, I should have already taken note of this potential optimization point, 
but I remember there wasn't any significant optimization effect.






Re: [PR] [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies [spark]

2026-04-06 Thread via GitHub


LuciferYang commented on code in PR #55230:
URL: https://github.com/apache/spark/pull/55230#discussion_r3043331562


##
common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:
##
@@ -240,6 +240,10 @@ public static void setMemory(long address, byte value, long size) {
 
   public static void copyMemory(
     Object src, long srcOffset, Object dst, long dstOffset, long length) {
+    if (length <= UNSAFE_COPY_THRESHOLD) {

Review Comment:
   The microbenchmark results of `PlatformBenchmark.scala` need to be refreshed 
to verify the effectiveness.


