jamesfredley opened a new pull request, #2377:
URL: https://github.com/apache/groovy/pull/2377

   ## Summary
   
   Based on https://github.com/apache/groovy/pull/2374, but applied to master (Groovy 6) instead of [GROOVY_4_0_X](https://github.com/apache/groovy/tree/GROOVY_4_0_X).
   
   This PR addresses the invokedynamic performance problem reported in GROOVY-10307. It reduces the performance impact of metaclass changes on call sites by replacing the global SwitchPoint invalidation mechanism with targeted per-call-site cache invalidation.
   
   ## Problem
   
   When any metaclass changes in Groovy, the global `SwitchPoint` is invalidated, causing **all** invokedynamic call sites across the entire application to fall back and re-link. This creates significant overhead in applications that frequently modify metaclasses (e.g., Grails applications with dynamic finders or runtime mixins).
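
The effect can be reproduced with plain `java.lang.invoke`, independent of Groovy. The following is a minimal sketch, not Groovy's actual linking code: the names `GlobalSwitchPointSketch`, `fast`, `slow`, `guardedSite`, and `call` are made up for illustration, and only `SwitchPoint`, `MutableCallSite`, `MethodHandles`, and `MethodType` are real JDK APIs. Two unrelated call sites share one `SwitchPoint`, and invalidating it pushes both onto their fallback path at once.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.invoke.SwitchPoint;

final class GlobalSwitchPointSketch {
    static String fast() { return "cached"; }   // stands in for a linked, cached target
    static String slow() { return "relink"; }   // stands in for the fallback/re-link path

    /** Builds a call site whose cached target is guarded by the shared SwitchPoint. */
    static MutableCallSite guardedSite(SwitchPoint sp) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);
            MethodHandle fast = lookup.findStatic(GlobalSwitchPointSketch.class, "fast", type);
            MethodHandle slow = lookup.findStatic(GlobalSwitchPointSketch.class, "slow", type);
            return new MutableCallSite(sp.guardWithTest(fast, slow));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes the call site through its dynamic invoker. */
    static String call(MutableCallSite site) {
        try {
            return (String) site.dynamicInvoker().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        SwitchPoint shared = new SwitchPoint();
        MutableCallSite a = guardedSite(shared);
        MutableCallSite b = guardedSite(shared);
        System.out.println(call(a) + " / " + call(b)); // both still use the cached target
        SwitchPoint.invalidateAll(new SwitchPoint[] { shared });
        System.out.println(call(a) + " / " + call(b)); // both now take the fallback path
    }
}
```

In this sketch neither call site is related to whatever triggered the invalidation, yet both lose their fast path, which is the behavior the PR sets out to avoid.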
   
   ## Solution
   
   This PR implements a more targeted invalidation strategy:
   
   1. **Disable global SwitchPoint guard by default** - The 
`SwitchPoint.guardWithTest()` wrapper is now optional and disabled by default. 
This prevents mass invalidation of all call sites when any metaclass changes.
   
   2. **Track all call sites via WeakReference set** - All `CacheableCallSite` 
instances are registered in a concurrent set using weak references, allowing 
targeted invalidation without preventing garbage collection.
   
   3. **Add `clearCache()` method to CacheableCallSite** - When a metaclass 
changes, we can now clear the LRU cache and reset the fallback count on 
specific call sites rather than invalidating everything.
   
   4. **Targeted invalidation on metaclass change** - 
`invalidateSwitchPoints()` now iterates through registered call sites, clearing 
caches and resetting targets as needed.
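
Steps 2 to 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `SketchCallSite` is a hypothetical stand-in for `CacheableCallSite`, `SketchCallSiteRegistry` stands in for the registry in `IndyInterface`, the cache size of 4 is arbitrary, and resetting the call site target is left out.

```java
import java.lang.ref.WeakReference;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for CacheableCallSite: a small LRU cache plus a fallback counter. */
final class SketchCallSite {
    private static final int CACHE_SIZE = 4; // assumed size, for illustration only
    private final Map<String, Object> lruCache =
            new LinkedHashMap<String, Object>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                    return size() > CACHE_SIZE; // evict the least recently used entry
                }
            };
    private long fallbackCount;

    synchronized void cache(String receiverClass, Object target) { lruCache.put(receiverClass, target); }
    synchronized void recordFallback() { fallbackCount++; }
    synchronized int cacheSize() { return lruCache.size(); }
    synchronized long fallbackCount() { return fallbackCount; }

    /** Drops all cached targets and resets the fallback counter. */
    synchronized void clearCache() {
        lruCache.clear();
        fallbackCount = 0;
    }
}

/** Hypothetical registry: weak references, so registration never keeps a call site alive. */
final class SketchCallSiteRegistry {
    private static final Set<WeakReference<SketchCallSite>> ALL_CALL_SITES =
            ConcurrentHashMap.newKeySet();

    static void registerCallSite(SketchCallSite site) {
        ALL_CALL_SITES.add(new WeakReference<>(site));
    }

    /** Clears every live call site's cache and prunes references to collected call sites. */
    static int invalidateAll() {
        int cleared = 0;
        for (Iterator<WeakReference<SketchCallSite>> it = ALL_CALL_SITES.iterator(); it.hasNext();) {
            SketchCallSite site = it.next().get();
            if (site == null) {
                it.remove();
            } else {
                site.clearCache();
                cleared++;
            }
        }
        return cleared;
    }

    public static void main(String[] args) {
        SketchCallSite site = new SketchCallSite();
        registerCallSite(site);
        site.cache("java.lang.String", "some linked target");
        System.out.println("cleared " + invalidateAll() + " call site(s), cache size now " + site.cacheSize());
    }
}
```

`WeakReference` does not override `equals` or `hashCode`, so each registration is a distinct set entry, and entries whose referent has been collected are pruned lazily during the next invalidation pass.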
   
   ## Changes
   
   ### `CacheableCallSite.java`
   - Added `clearCache()` method to clear LRU cache and reset fallback count
   
   ### `IndyInterface.java`
   - Added `ALL_CALL_SITES` WeakReference set to track all call sites
   - Added `registerCallSite(CacheableCallSite)` method
   - Modified `invalidateSwitchPoints()` to clear caches on all registered call 
sites
   - Register call sites during bootstrap
   
   ### `Selector.java`
   - Added `INDY_SWITCHPOINT_GUARD` system property flag (default: `false`)
   - Made the SwitchPoint guard conditional based on the flag
   
   ### `JarJarTask.groovy` (unrelated build fix)
   - Changed `@InputFiles @Classpath` to `@Input` on `untouchedFiles` field
   - Fixes Windows build issue where glob patterns containing `*` were treated 
as literal file paths
   
   ## Configuration
   
   The SwitchPoint guard behavior can be controlled via system property:
   
   ```bash
   # Use new targeted invalidation (default)
   java -jar myapp.jar
   
   # Revert to old global SwitchPoint behavior
   java -Dgroovy.indy.switchpoint.guard=true -jar myapp.jar
   ```
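
A flag-gated guard of this kind could look roughly like the sketch below. It is an assumption about the shape of the change, not the code in `Selector.java`: `GuardFlagSketch`, `maybeGuard`, and `callString` are hypothetical names, and reading the property with `Boolean.getBoolean` is assumed here. Only the property name `groovy.indy.switchpoint.guard` comes from the command line above.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.SwitchPoint;

final class GuardFlagSketch {
    // Property name as used on the command line above; how Selector reads it is assumed here.
    static final String GUARD_PROPERTY = "groovy.indy.switchpoint.guard";

    /** True only when the property is set to "true"; an unset property means false. */
    static boolean guardEnabled() {
        return Boolean.getBoolean(GUARD_PROPERTY);
    }

    /** Wraps the target in the SwitchPoint guard only when the flag is on. */
    static MethodHandle maybeGuard(boolean enabled, SwitchPoint sp,
                                   MethodHandle target, MethodHandle fallback) {
        return enabled ? sp.guardWithTest(target, fallback) : target;
    }

    /** Invokes a no-argument handle that returns a String. */
    static String callString(MethodHandle handle) {
        try {
            return (String) handle.invoke();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        MethodHandle target = MethodHandles.constant(String.class, "target");
        MethodHandle fallback = MethodHandles.constant(String.class, "fallback");
        SwitchPoint sp = new SwitchPoint();
        MethodHandle linked = maybeGuard(guardEnabled(), sp, target, fallback);
        SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
        // Unguarded handles ignore the invalidation; guarded ones switch to the fallback.
        System.out.println(callString(linked));
    }
}
```

With the flag off, the linked handle is the target itself, so invalidating the `SwitchPoint` has no effect on it and invalidation has to happen through some other route, such as clearing call site caches.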
   
   ## Benchmark Results
   
   Tested using a dedicated benchmark suite measuring metaclass invalidation 
impact: https://github.com/jamesfredley/groovy-indy-performance
   
   ## Complete Benchmark Comparison (3-Run Averages)
   
   **Test Date:** February 4, 2026  
   **Test Machine:** Windows 11, 20 cores, 4GB max heap  
   **Java Version:** 17.0.18 (Amazon Corretto)
   
   ### Versions Tested
   
   | Version | Description |
   |---------|-------------|
   | **4.0.30** | Groovy 4.0.30 from Maven Central (baseline) |
   | **6-snapshot** | Groovy 6.0.0-SNAPSHOT clean master (no optimizations) |
   | **6-snapshot-opt** | Groovy 6.0.0-SNAPSHOT with this PR's optimizations |
   
   ### Optimizations Applied in 6-snapshot-opt
   
   1. **Disabled global SwitchPoint guard** - `INDY_SWITCHPOINT_GUARD` defaults 
to `false`
   2. **Call site registry** - Track all call sites via `WeakReference` set
   3. **Cache invalidation** - Clear individual call site caches on metaclass 
change
   4. **Target reset** - Reset call site targets to default on invalidation
   
   ---
   
   ## 🎯 KEY METRIC: Metaclass Invalidation Stress Test
   
   This test measures the performance impact when metaclass changes occur 
during execution.
   Lower ratio = better (less performance degradation from metaclass changes).
   
   | Metric | 4.0.30 | 6-snapshot | 6-snapshot-opt |
   |--------|--------|------------|----------------|
   | Run 1 | 72.66x | 83.16x | **67.33x** |
   | Run 2 | 103.63x | 77.54x | **76.90x** |
   | Run 3 | 107.92x | 103.71x | **71.26x** |
   | **Average** | 94.74x | 88.14x | **71.83x** |
   | Baseline (no changes) | **5.61 ms** | 5.84 ms | 6.02 ms |
   | With metaclass changes | 515.11 ms | 505.83 ms | **431.28 ms** |
   
   ### Key Finding
   
   **6-snapshot-opt reduces metaclass invalidation impact by:**
   - **19% vs 6-snapshot** (71.83x vs 88.14x)
   - **24% vs 4.0.30** (71.83x vs 94.74x)
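
For readers who want to check the arithmetic, the averages and percentages can be recomputed from the per-run ratios in the table above. The sketch below assumes the reported average is the plain mean of the three per-run ratios and that the reduction is measured relative to the slower version; `InvalidationRatioSketch` and its method names are made up for illustration.

```java
final class InvalidationRatioSketch {
    /** Arithmetic mean of the given values. */
    static double mean(double... values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Relative reduction of `optimized` versus `reference`, in percent. */
    static double reductionPercent(double reference, double optimized) {
        return (reference - optimized) / reference * 100.0;
    }

    public static void main(String[] args) {
        // Per-run ratios copied from the stress test table.
        double v4 = mean(72.66, 103.63, 107.92);
        double snapshot = mean(83.16, 77.54, 103.71);
        double optimized = mean(67.33, 76.90, 71.26);

        System.out.printf("averages: %.2fx, %.2fx, %.2fx%n", v4, snapshot, optimized);
        System.out.printf("reduction vs 6-snapshot: %.1f%%%n", reductionPercent(snapshot, optimized));
        System.out.printf("reduction vs 4.0.30: %.1f%%%n", reductionPercent(v4, optimized));
    }
}
```

Under those assumptions the reductions come out at roughly 18.5% and 24.2%, consistent with the rounded figures above.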
   
   ---
   
   ## Comprehensive Benchmark Suite (3-Run Averages)
   
   | Benchmark | 4.0.30 | 6-snapshot | 6-snapshot-opt |
   |-----------|--------|------------|----------------|
   | **Loop Benchmarks** | | | |
   | Loop: each + toString | **31.72 ms** | 49.23 ms | 50.37 ms |
   | Loop: collect | **52.12 ms** | 75.74 ms | 75.80 ms |
   | Loop: findAll | **113.37 ms** | 146.15 ms | 147.52 ms |
   | **Method Invocation** | | | |
   | Method: simple instance | **8.81 ms** | 27.53 ms | 27.74 ms |
   | Method: with params | **10.33 ms** | **27.49 ms** | 29.32 ms |
   | Method: static | **7.54 ms** | 26.49 ms | 26.56 ms |
   | Method: polymorphic | **1.86 s** | 2.03 s | **1.86 s** |
   | **Closures** | | | |
   | Closure: creation + call | **24.93 ms** | 34.87 ms | 34.27 ms |
   | Closure: reused | **20.59 ms** | 27.70 ms | 27.85 ms |
   | Closure: nested | 39.16 ms | **37.46 ms** | 38.27 ms |
   | Closure: curried | **159.16 ms** | 191.60 ms | 192.33 ms |
   | **Properties** | | | |
   | Property: read/write | **19.62 ms** | 71.62 ms | 73.83 ms |
   | **Collections** | | | |
   | Collection: each | **108.84 ms** | 127.27 ms | 128.85 ms |
   | Collection: collect | **119.86 ms** | 138.99 ms | **138.79 ms** |
   | Collection: inject | **133.79 ms** | 153.63 ms | 155.63 ms |
   | **GStrings** | | | |
   | GString: simple | **101.42 ms** | 118.48 ms | **118.47 ms** |
   | GString: multi-value | **114.82 ms** | 130.68 ms | **129.68 ms** |
   | **Call Site Performance** | | | |
   | Monomorphic call site | 124.68 ms | 117.98 ms | **117.21 ms** |
   | Polymorphic call site | 3.73 s | 3.79 s | **3.56 s** |
   
   ---
   
   ## Closure Benchmark Suite (3-Run Averages)
   
   | Benchmark | 4.0.30 | 6-snapshot | 6-snapshot-opt |
   |-----------|--------|------------|----------------|
   | Simple closure creation | 29.71 ms | **14.86 ms** | 16.20 ms |
   | Closure reuse | 19.25 ms | **10.08 ms** | 10.99 ms |
   | Multi-param closure | 38.58 ms | **19.25 ms** | 21.48 ms |
   | Closure with capture | 19.60 ms | **7.48 ms** | 7.66 ms |
   | Closure modify capture | 12.87 ms | 4.73 ms | **4.29 ms** |
   | Closure delegation | 27.94 ms | 27.97 ms | **25.87 ms** |
   | Nested closures | 58.24 ms | 26.16 ms | **25.84 ms** |
   | Curried closure | **294.13 ms** | 337.07 ms | 332.51 ms |
   | Closure composition | **59.20 ms** | 70.86 ms | 73.05 ms |
   | Closure spread | **2.05 s** | 2.66 s | 2.66 s |
   | Closure.call() | 22.32 ms | **8.56 ms** | 8.99 ms |
   | Closure trampoline | **53.48 ms** | 60.26 ms | 59.67 ms |
   
   ---
   
   ## Loop Benchmark Suite (3-Run Averages)
   
   | Benchmark | 4.0.30 | 6-snapshot | 6-snapshot-opt |
   |-----------|--------|------------|----------------|
   | Original: each + toString | **45.54 ms** | 48.94 ms | 49.51 ms |
   | Simple: each only | **39.93 ms** | 47.65 ms | 48.95 ms |
   | Closure call | 19.54 ms | **2.58 ms** | 2.74 ms |
   | Method call | 5.32 ms | 6.02 ms | **5.33 ms** |
   | Nested loops | **74.47 ms** | 79.96 ms | 79.68 ms |
   | Loop with collect | **88.51 ms** | 106.90 ms | 105.39 ms |
   | Loop with findAll | **202.83 ms** | 233.18 ms | 232.28 ms |
   
   ---
   
   ## Method Invocation Benchmark Suite (3-Run Averages)
   
   | Benchmark | 4.0.30 | 6-snapshot | 6-snapshot-opt |
   |-----------|--------|------------|----------------|
   | Simple instance method | 7.69 ms | 6.17 ms | **5.75 ms** |
   | Method with parameters | 7.89 ms | **7.85 ms** | 8.08 ms |
   | Method with object param | 10.89 ms | **10.61 ms** | 10.64 ms |
   | Static method | **3.20 ms** | 3.42 ms | 3.48 ms |
   | Static method with params | 7.87 ms | 8.16 ms | **7.67 ms** |
   | Interface method | 3.24 ms | 3.81 ms | **3.69 ms** |
   | Dynamic typed calls | **3.26 ms** | **3.26 ms** | 3.31 ms |
   | Property access | **21.50 ms** | N/A | N/A |
   | GString method | **192.53 ms** | N/A | N/A |
   
   ---
   
   ## Raw Data: Individual Run Results
   
   ### Metaclass Invalidation Ratios
   
   | Run | 4.0.30 | 6-snapshot | 6-snapshot-opt |
   |-----|--------|------------|----------------|
   | 1 | 72.66x | 83.16x | 67.33x |
   | 2 | 103.63x | 77.54x | 76.90x |
   | 3 | 107.92x | 103.71x | 71.26x |
   
   ### With Metaclass Changes (ms)
   
   | Run | 4.0.30 | 6-snapshot | 6-snapshot-opt |
   |-----|--------|------------|----------------|
   | 1 | 517.31 | 515.21 | 430.71 |
   | 2 | 508.25 | 504.47 | 429.30 |
   | 3 | 519.77 | 497.80 | 433.84 |
   
   ### Baseline (No Metaclass Changes) (ms)
   
   | Run | 4.0.30 | 6-snapshot | 6-snapshot-opt |
   |-----|--------|------------|----------------|
   | 1 | 7.12 | 6.20 | 6.40 |
   | 2 | 4.90 | 6.51 | 5.58 |
   | 3 | 4.82 | 4.80 | 6.09 |
   
   
   
   
   ## Related
   
   - JIRA: https://issues.apache.org/jira/browse/GROOVY-10307
   - Original PR: https://github.com/apache/groovy/pull/2374
   - Test project: https://github.com/jamesfredley/groovy-indy-performance
   - Since this PR is against Groovy 6, the Grails 7 test project will not run
   

