jhrotko commented on PR #949:
URL: https://github.com/apache/arrow-java/pull/949#issuecomment-3729300767

   
   Test Configuration
   - Vector length: 10,000 elements
   - Old Version: Commit e35ed7c3 (UuidHolder with `ArrowBuf buffer` + `int start`)
   - New Version: uuid-improvement branch (UuidHolder with `long mostSigBits` + `long leastSigBits`)
   - JMH Settings: 2 warmup iterations, 3 measurement iterations, GC profiler enabled
   
   ## 1. getUuidDirectly
   Read UUID objects directly via `vector.getObject(i)`
   
   | Metric | Old (buffer+offset) | New (two-longs) | Change |
   |--------|---------------------|-----------------|--------|
   | **Time (µs/op)** | 10.007 ± 1.404 | 9.960 ± 0.460 | **-0.5%** (0.047 µs faster) |
   | **Memory Allocation (B/op)** | 0.012 ± 0.147 | 0.012 ± 0.145 |  **Same** |
   | **Allocation Rate (MB/sec)** | 0.001 ± 0.014 | 0.001 ± 0.014 |  **Same** |
   | **GC Count** | ≈ 0 | ≈ 0 | **Same** |
   
   --> Essentially identical performance (the difference is within the error margins)
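One way to read the "Change" column against the ± values is to check whether the two reported intervals overlap. The sketch below is a hypothetical helper, not part of the benchmark code in this PR; it treats each `mean ± error` pair as a plain interval, which is a rough heuristic rather than a formal significance test.

```java
import java.util.Locale;

// Hypothetical helper for reading the tables; not taken from the PR.
final class BenchmarkDelta {
    /** Relative change of newMean against oldMean, in percent. */
    static double percentChange(double oldMean, double newMean) {
        return (newMean - oldMean) / oldMean * 100.0;
    }

    /** True when [oldMean ± oldErr] and [newMean ± newErr] share at least one point. */
    static boolean intervalsOverlap(double oldMean, double oldErr,
                                    double newMean, double newErr) {
        return oldMean - oldErr <= newMean + newErr
            && newMean - newErr <= oldMean + oldErr;
    }

    public static void main(String[] args) {
        // getUuidDirectly: 10.007 ± 1.404 (old) vs 9.960 ± 0.460 (new)
        System.out.printf(Locale.ROOT, "getUuidDirectly: %+.1f%%, overlap=%b%n",
            percentChange(10.007, 9.960),
            intervalsOverlap(10.007, 1.404, 9.960, 0.460));
        // getWithUuidHolder: 9.934 ± 0.162 (old) vs 16.488 ± 0.121 (new)
        System.out.printf(Locale.ROOT, "getWithUuidHolder: %+.1f%%, overlap=%b%n",
            percentChange(9.934, 16.488),
            intervalsOverlap(9.934, 0.162, 16.488, 0.121));
    }
}
```

With the numbers from this report, the `getUuidDirectly` intervals overlap, while the `getWithUuidHolder` intervals do not.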
   
   ## 2. getWithNullableUuidHolder
   Read into `NullableUuidHolder` (supports null values)
   
   | Metric | Old (buffer+offset) | New (two-longs) | Change |
   |--------|---------------------|-----------------|--------|
   | **Time (µs/op)** | 9.918 ± 0.120 | 9.979 ± 0.621 | ≈ **+0.6%** (0.061 µs slower) |
   | **Memory Allocation (B/op)** | 0.011 ± 0.144 | 0.011 ± 0.144 |  **Same** |
   | **Allocation Rate (MB/sec)** | 0.001 ± 0.014 | 0.001 ± 0.014 |  **Same** |
   | **GC Count** | ≈ 0 | ≈ 0 |  **Same** |
   
   --> Essentially identical performance, both allocation-free
   
   ## 3. getWithUuidHolder
   Read into `UuidHolder` (non-nullable)
   
   | Metric | Old (buffer+offset) | New (two-longs) | Change |
   |--------|---------------------|-----------------|--------|
   | **Time (µs/op)** | 9.934 ± 0.162 | 16.488 ± 0.121 | ⚠️ **+66.0%** (6.554 µs slower) |
   | **Memory Allocation (B/op)** | 0.011 ± 0.144 | 0.019 ± 0.239 | **+73%** (still ~0 bytes) |
   | **Allocation Rate (MB/sec)** | 0.001 ± 0.014 | 0.001 ± 0.014 |  **Same** |
   | **GC Count** | ≈ 0 | ≈ 0 |  **Same** |
   
   --> 66% performance regression, well outside the error margins
   - Old approach: the holder keeps a direct buffer reference (~10 µs/op)
   - New approach: reads two longs and reverses their bytes (~16 µs/op)
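To make "read two longs + byte reversal" concrete, here is a minimal sketch using `java.nio.ByteBuffer` as a stand-in for `ArrowBuf`. It assumes the 16 UUID bytes are stored big-endian and that the underlying loads are little-endian, which is why each long needs a `Long.reverseBytes` call. The class and method names are hypothetical and this is not the code from the PR.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.UUID;

// Illustrative only: ByteBuffer stands in for ArrowBuf, names are hypothetical.
final class UuidBytesSketch {
    /** Serializes a UUID as 16 big-endian bytes, most significant long first. */
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16) // ByteBuffer defaults to BIG_ENDIAN
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();
    }

    /** Reads two longs with little-endian loads, then reverses the bytes of each. */
    static UUID readWithReversal(byte[] data, int start) {
        ByteBuffer le = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        long mostSigBits = Long.reverseBytes(le.getLong(start));
        long leastSigBits = Long.reverseBytes(le.getLong(start + Long.BYTES));
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        UUID original = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        UUID roundTripped = readWithReversal(toBytes(original), 0);
        System.out.println(original.equals(roundTripped)); // prints "true"
    }
}
```

The buffer+offset holder defers this work until the bytes are actually consumed, whereas a two-longs holder pays for both loads and both reversals on every read. That may account for part of the gap measured here, although the report does not break the cost down.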
   
   ## 4. setUuidDirectly
   Write UUID objects directly via `vector.setSafe(i, uuid)`
   
   | Metric | Old (buffer+offset) | New (two-longs) | Change |
   |--------|---------------------|-----------------|--------|
   | **Time (µs/op)** | 87.047 ± 2.229 | 89.154 ± 5.116 | ≈ **+2.4%** (2.107 µs slower) |
   | **Memory Allocation (B/op)** | 320000.101 ± 1.262 | 320000.103 ± 1.290 | **Same** |
   | **Allocation Rate (MB/sec)** | 3505.783 ± 88.171 | 3422.955 ± 195.204 | **-2.4%** |
   | **GC Count** | 174 | 170 |  **-2.3%** |
   | **GC Time (ms)** | 79 | 78 |  **-1.3%** |
   
   --> Essentially identical performance
   - High allocation is expected: 10,000 UUID objects at 32 bytes each is 320,000 B/op
   
   ## 5. setWithNullableUuidHolder
   Write using `NullableUuidHolder` with nullability support
   
   | Metric | Old (buffer+offset) | New (two-longs) | Change |
   |--------|---------------------|-----------------|--------|
   | **Time (µs/op)** | 25.415 ± 4.776 | 25.052 ± 1.187 | **-1.4%** (0.363 µs faster) |
   | **Memory Allocation (B/op)** | 0.029 ± 0.364 | 0.029 ± 0.362 |  **Same** |
   | **Allocation Rate (MB/sec)** | 0.001 ± 0.014 | 0.001 ± 0.014 |  **Same** |
   | **GC Count** | ≈ 0 | ≈ 0 |  **Same** |
   
   --> Essentially identical performance
   
   ## 6. setWithUuidHolder
   Write using `UuidHolder` (non-nullable)
   
   | Metric | Old (buffer+offset) | New (two-longs) | Change |
   |--------|---------------------|-----------------|--------|
   | **Time (µs/op)** |  **FAILED** | 91.831 ± 7.582 |  **FIXED** |
   | **Memory Allocation (B/op)** |  **N/A** | 320000.106 ± 1.328 |  **Works** |
   | **Allocation Rate (MB/sec)** |  **N/A** | 3323.156 ± 272.962 |  **Works** |
   | **GC Count** |  **N/A** | 165 |  **Works** |
   | **GC Time (ms)** |  **N/A** | 75 | **Works** |
   
   **Error in Old Version**:
   ```
   java.lang.NullPointerException: Cannot set UUID vector, the source buffer is null.
        at org.apache.arrow.vector.UuidVector.set(UuidVector.java:241)
   ```
   
   --> Offset implementation was broken!
   - The old buffer+offset approach had a NullPointerException bug
   - The new two-longs approach fixes this bug and works correctly
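The comment does not say how the old holder ended up with a null buffer, so the following is only a sketch of the general hazard. A holder that carries a reference to an external buffer can be passed to `set` before anything has populated it, while a holder made of two primitive longs always has a defined value. `BufferHolder`, `LongsHolder`, and the `set` overloads are hypothetical stand-ins, and `ByteBuffer` replaces `ArrowBuf`.

```java
import java.nio.ByteBuffer;
import java.util.Objects;

// Hypothetical stand-ins for the two holder shapes; not the Arrow classes.
final class HolderShapesSketch {
    /** Old shape: a view into a buffer owned elsewhere. */
    static final class BufferHolder {
        ByteBuffer buffer; // stays null until some reader fills the holder
        int start;
    }

    /** New shape: the value itself, held as two primitives. */
    static final class LongsHolder {
        long mostSigBits;
        long leastSigBits;
    }

    /** Copies 16 bytes out of the holder's buffer; fails if the buffer was never set. */
    static void set(ByteBuffer dest, int index, BufferHolder holder) {
        Objects.requireNonNull(holder.buffer,
            "Cannot set UUID vector, the source buffer is null.");
        for (int i = 0; i < 16; i++) {
            dest.put(index * 16 + i, holder.buffer.get(holder.start + i));
        }
    }

    /** Writes the two longs directly; there is no reference that could be null. */
    static void set(ByteBuffer dest, int index, LongsHolder holder) {
        dest.putLong(index * 16, holder.mostSigBits);
        dest.putLong(index * 16 + 8, holder.leastSigBits);
    }

    public static void main(String[] args) {
        ByteBuffer dest = ByteBuffer.allocate(32);
        set(dest, 0, new LongsHolder()); // fine: writes sixteen zero bytes
        try {
            set(dest, 1, new BufferHolder()); // buffer field is still null
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```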
   
   ## 7. setWithWriter
   Write using `UuidWriterImpl` (extension writer pattern)
   
   | Metric | Old (buffer+offset) | New (two-longs) | Change |
   |--------|---------------------|-----------------|--------|
   | **Time (µs/op)** | 62.174 ± 8.932 | 61.747 ± 3.117 | **-0.7%** (0.427 µs faster) |
   | **Memory Allocation (B/op)** | 213312.072 ± 0.912 | 213312.076 ± 1.036 | **Same** |
   | **Allocation Rate (MB/sec)** | 3272.006 ± 473.174 | 3294.462 ± 165.505 | **+0.7%** |
   | **GC Count** | 162 | 164 | ≈ **+1.2%** |
   | **GC Time (ms)** | 73 | 76 | ≈ **+4.1%** |
   
   --> Essentially identical performance
   
   
   
   
   

