jamesfredley opened a new pull request, #2377: URL: https://github.com/apache/groovy/pull/2377
## Summary

Based on https://github.com/apache/groovy/pull/2374, but applied to master (Groovy 6) instead of [GROOVY_4_0_X](https://github.com/apache/groovy/tree/GROOVY_4_0_X).

This PR addresses the invokedynamic performance problem reported in GROOVY-10307. The optimization reduces the performance impact of metaclass changes on call sites by replacing the global SwitchPoint invalidation mechanism with targeted per-call-site cache invalidation.

## Problem

When any metaclass changes in Groovy, the global `SwitchPoint` is invalidated, causing **all** invokedynamic call sites across the entire application to fall back and re-link. This creates significant overhead in applications that frequently modify metaclasses (e.g., Grails applications with dynamic finders, runtime mixins, etc.).

## Solution

This PR implements a more targeted invalidation strategy:

1. **Disable global SwitchPoint guard by default** - The `SwitchPoint.guardWithTest()` wrapper is now optional and disabled by default. This prevents mass invalidation of all call sites when any metaclass changes.
2. **Track all call sites via WeakReference set** - All `CacheableCallSite` instances are registered in a concurrent set using weak references, allowing targeted invalidation without preventing garbage collection.
3. **Add `clearCache()` method to CacheableCallSite** - When a metaclass changes, we can now clear the LRU cache and reset the fallback count on specific call sites rather than invalidating everything.
4. **Targeted invalidation on metaclass change** - `invalidateSwitchPoints()` now iterates through registered call sites, clearing caches and resetting targets as needed.
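To illustrate steps 2 to 4, here is a minimal, self-contained sketch of a weak-reference call site registry with per-site cache clearing. It is not the PR's code: `CallSiteRegistrySketch`, `SketchCallSite`, and `invalidateCallSites()` are hypothetical names, and the real `CacheableCallSite` and `IndyInterface` classes may differ in structure and locking.

```java
import java.lang.ref.WeakReference;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: all names are illustrative, not taken from the PR's source.
class CallSiteRegistrySketch {

    /** Stand-in for a call site that owns a small cache and a fallback counter. */
    static final class SketchCallSite {
        private final Map<String, Object> cache = new LinkedHashMap<>();
        private long fallbackCount;

        synchronized void put(String receiverClassName, Object handle) {
            cache.put(receiverClassName, handle);
        }

        synchronized void recordFallback() {
            fallbackCount++;
        }

        synchronized int cacheSize() {
            return cache.size();
        }

        synchronized long fallbackCount() {
            return fallbackCount;
        }

        /** Drops all cached entries and resets the fallback counter. */
        synchronized void clearCache() {
            cache.clear();
            fallbackCount = 0;
        }
    }

    // Weak references let unreachable call sites be garbage collected.
    // WeakReference does not override equals/hashCode, so every reference
    // is a distinct entry in the set.
    private static final Set<WeakReference<SketchCallSite>> ALL_CALL_SITES =
            ConcurrentHashMap.newKeySet();

    static void registerCallSite(SketchCallSite site) {
        ALL_CALL_SITES.add(new WeakReference<>(site));
    }

    /** Prunes collected call sites, clears the live ones, and returns how many were cleared. */
    static int invalidateCallSites() {
        ALL_CALL_SITES.removeIf(ref -> ref.get() == null);
        int cleared = 0;
        for (WeakReference<SketchCallSite> ref : ALL_CALL_SITES) {
            SketchCallSite site = ref.get();
            if (site != null) {
                site.clearCache();
                cleared++;
            }
        }
        return cleared;
    }

    public static void main(String[] args) {
        SketchCallSite site = new SketchCallSite();
        registerCallSite(site);
        site.put("java.lang.String", "cached-handle");
        site.recordFallback();
        int cleared = invalidateCallSites();
        System.out.println("cleared=" + cleared + ", cacheSize=" + site.cacheSize());
    }
}
```

The point of the sketch is the cost model: invalidation work is proportional to the number of live call sites and only resets their caches, instead of forcing every call site to re-link through a shared `SwitchPoint`.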
## Changes

### `CacheableCallSite.java`

- Added `clearCache()` method to clear LRU cache and reset fallback count

### `IndyInterface.java`

- Added `ALL_CALL_SITES` WeakReference set to track all call sites
- Added `registerCallSite(CacheableCallSite)` method
- Modified `invalidateSwitchPoints()` to clear caches on all registered call sites
- Register call sites during bootstrap

### `Selector.java`

- Added `INDY_SWITCHPOINT_GUARD` system property flag (default: `false`)
- Made the SwitchPoint guard conditional based on the flag

### `JarJarTask.groovy` (unrelated build fix)

- Changed `@InputFiles @Classpath` to `@Input` on `untouchedFiles` field
- Fixes Windows build issue where glob patterns containing `*` were treated as literal file paths

## Configuration

The SwitchPoint guard behavior can be controlled via system property:

```bash
# Use new targeted invalidation (default)
java -jar myapp.jar

# Revert to old global SwitchPoint behavior
java -Dgroovy.indy.switchpoint.guard=true -jar myapp.jar
```

## Benchmark Results

Tested using a dedicated benchmark suite measuring metaclass invalidation impact: https://github.com/jamesfredley/groovy-indy-performance

## Complete Benchmark Comparison (3-Run Averages)

- **Test Date:** February 4, 2026
- **Test Machine:** Windows 11, 20 cores, 4GB max heap
- **Java Version:** 17.0.18 (Amazon Corretto)

### Versions Tested

| Version | Description |
|---------|-------------|
| **4.0.30** | Groovy 4.0.30 from Maven Central (baseline) |
| **6-snapshot** | Groovy 6.0.0-SNAPSHOT clean master (no optimizations) |
| **6-snapshot-opt** | Groovy 6.0.0-SNAPSHOT with this PR's optimizations |

### Optimizations Applied in 6-snapshot-opt

1. **Disabled global SwitchPoint guard** - `INDY_SWITCHPOINT_GUARD` defaults to `false`
2. **Call site registry** - Track all call sites via `WeakReference` set
3. **Cache invalidation** - Clear individual call site caches on metaclass change
4. **Target reset** - Reset call site targets to default on invalidation

---

## 🎯 KEY METRIC: Metaclass Invalidation Stress Test

This test measures the performance impact when metaclass changes occur during execution. The ratio is the run time with metaclass changes divided by the baseline run time. Lower ratio = better (less performance degradation from metaclass changes).

| Metric | 4.0.30 | 6-snapshot | 6-snapshot-opt |
|--------|--------|------------|----------------|
| Run 1 | 72.66x | 83.16x | **67.33x** |
| Run 2 | 103.63x | **77.54x** | 76.90x |
| Run 3 | 107.92x | 103.71x | **71.26x** |
| **Average** | 94.74x | 88.14x | **71.83x** |
| Baseline (no changes) | **5.61 ms** | 5.84 ms | 6.02 ms |
| With metaclass changes | 515.11 ms | 505.83 ms | **431.28 ms** |

### Key Finding

**6-snapshot-opt reduces metaclass invalidation impact by:**

- **19% vs 6-snapshot** (71.83x vs 88.14x)
- **24% vs 4.0.30** (71.83x vs 94.74x)

---

## Comprehensive Benchmark Suite (3-Run Averages)

| Benchmark | 4.0.30 | 6-snapshot | 6-snapshot-opt |
|-----------|--------|------------|----------------|
| **Loop Benchmarks** | | | |
| Loop: each + toString | **31.72 ms** | 49.23 ms | 50.37 ms |
| Loop: collect | **52.12 ms** | 75.74 ms | 75.80 ms |
| Loop: findAll | **113.37 ms** | 146.15 ms | 147.52 ms |
| **Method Invocation** | | | |
| Method: simple instance | **8.81 ms** | 27.53 ms | 27.74 ms |
| Method: with params | **10.33 ms** | **27.49 ms** | 29.32 ms |
| Method: static | **7.54 ms** | 26.49 ms | 26.56 ms |
| Method: polymorphic | **1.86 s** | 2.03 s | **1.86 s** |
| **Closures** | | | |
| Closure: creation + call | **24.93 ms** | 34.87 ms | 34.27 ms |
| Closure: reused | **20.59 ms** | 27.70 ms | 27.85 ms |
| Closure: nested | 39.16 ms | **37.46 ms** | 38.27 ms |
| Closure: curried | **159.16 ms** | 191.60 ms | 192.33 ms |
| **Properties** | | | |
| Property: read/write | **19.62 ms** | 71.62 ms | 73.83 ms |
| **Collections** | | | |
| Collection: each | **108.84 ms** | 127.27 ms | 128.85 ms |
| Collection: collect | **119.86 ms** | 138.99 ms | **138.79 ms** |
| Collection: inject | **133.79 ms** | 153.63 ms | 155.63 ms |
| **GStrings** | | | |
| GString: simple | **101.42 ms** | 118.48 ms | **118.47 ms** |
| GString: multi-value | **114.82 ms** | 130.68 ms | **129.68 ms** |
| **Call Site Performance** | | | |
| Monomorphic call site | 124.68 ms | 117.98 ms | **117.21 ms** |
| Polymorphic call site | 3.73 s | 3.79 s | **3.56 s** |

---

## Closure Benchmark Suite (3-Run Averages)

| Benchmark | 4.0.30 | 6-snapshot | 6-snapshot-opt |
|-----------|--------|------------|----------------|
| Simple closure creation | 29.71 ms | **14.86 ms** | 16.20 ms |
| Closure reuse | 19.25 ms | **10.08 ms** | 10.99 ms |
| Multi-param closure | 38.58 ms | **19.25 ms** | 21.48 ms |
| Closure with capture | 19.60 ms | **7.48 ms** | 7.66 ms |
| Closure modify capture | 12.87 ms | 4.73 ms | **4.29 ms** |
| Closure delegation | 27.94 ms | 27.97 ms | **25.87 ms** |
| Nested closures | 58.24 ms | 26.16 ms | **25.84 ms** |
| Curried closure | **294.13 ms** | 337.07 ms | 332.51 ms |
| Closure composition | **59.20 ms** | 70.86 ms | 73.05 ms |
| Closure spread | **2.05 s** | 2.66 s | 2.66 s |
| Closure.call() | **22.32 ms** | 8.56 ms | 8.99 ms |
| Closure trampoline | **53.48 ms** | 60.26 ms | 59.67 ms |

---

## Loop Benchmark Suite (3-Run Averages)

| Benchmark | 4.0.30 | 6-snapshot | 6-snapshot-opt |
|-----------|--------|------------|----------------|
| Original: each + toString | **45.54 ms** | 48.94 ms | 49.51 ms |
| Simple: each only | **39.93 ms** | 47.65 ms | 48.95 ms |
| Closure call | 19.54 ms | **2.58 ms** | 2.74 ms |
| Method call | 5.32 ms | 6.02 ms | **5.33 ms** |
| Nested loops | **74.47 ms** | 79.96 ms | 79.68 ms |
| Loop with collect | **88.51 ms** | 106.90 ms | 105.39 ms |
| Loop with findAll | **202.83 ms** | 233.18 ms | 232.28 ms |

---

## Method Invocation Benchmark Suite (3-Run Averages)

| Benchmark | 4.0.30 | 6-snapshot | 6-snapshot-opt |
|-----------|--------|------------|----------------|
| Simple instance method | 7.69 ms | 6.17 ms | **5.75 ms** |
| Method with parameters | **7.89 ms** | 7.85 ms | 8.08 ms |
| Method with object param | **10.89 ms** | 10.61 ms | 10.64 ms |
| Static method | **3.20 ms** | 3.42 ms | 3.48 ms |
| Static method with params | **7.87 ms** | 8.16 ms | 7.67 ms |
| Interface method | 3.24 ms | 3.81 ms | **3.69 ms** |
| Dynamic typed calls | **3.26 ms** | **3.26 ms** | 3.31 ms |
| Property access | **21.50 ms** | N/A | N/A |
| GString method | **192.53 ms** | N/A | N/A |

---

## Raw Data: Individual Run Results

### Metaclass Invalidation Ratios

| Run | 4.0.30 | 6-snapshot | 6-snapshot-opt |
|-----|--------|------------|----------------|
| 1 | 72.66x | 83.16x | 67.33x |
| 2 | 103.63x | 77.54x | 76.90x |
| 3 | 107.92x | 103.71x | 71.26x |

### With Metaclass Changes (ms)

| Run | 4.0.30 | 6-snapshot | 6-snapshot-opt |
|-----|--------|------------|----------------|
| 1 | 517.31 | 515.21 | 430.71 |
| 2 | 508.25 | 504.47 | 429.30 |
| 3 | 519.77 | 497.80 | 433.84 |

### Baseline (No Metaclass Changes) (ms)

| Run | 4.0.30 | 6-snapshot | 6-snapshot-opt |
|-----|--------|------------|----------------|
| 1 | 7.12 | 6.20 | 6.40 |
| 2 | 4.90 | 6.51 | 5.58 |
| 3 | 4.82 | 4.80 | 6.09 |

## Related

- JIRA: https://issues.apache.org/jira/browse/GROOVY-10307
- Original PR: https://github.com/apache/groovy/pull/2374
- Test project: https://github.com/jamesfredley/groovy-indy-performance
- Since this PR is against Groovy 6, the Grails 7 test project will not run
