[
https://issues.apache.org/jira/browse/GROOVY-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18062302#comment-18062302
]
ASF GitHub Bot commented on GROOVY-10307:
-----------------------------------------
jamesfredley opened a new pull request, #2390:
URL: https://github.com/apache/groovy/pull/2390
## Summary
Adds three new JMH benchmark classes in `org.apache.groovy.perf.grails` that
reproduce the real-world invokedynamic performance regression observed in
Grails 7 applications (GROOVY-10307).
## Benchmark Classes
### CallSiteInvalidationBench (11 benchmarks)
Tests the core SwitchPoint invalidation mechanism - the root cause of the
regression. Demonstrates that modifying ANY metaclass triggers global
invalidation affecting ALL call sites:
- **Cross-type invalidation** at 3 frequencies (every 100/1000/10000
iterations)
- **Same-type invalidation** for comparison
- **Multiple call sites** scaling (5 concurrent call sites)
- **Burst-then-steady-state** simulating framework startup
Key result: `baselineHotLoop` (0.49ms) vs `crossTypeInvalidationEvery1000`
(467ms) = **~950x overhead**
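
To illustrate why one metaclass change can affect unrelated call sites, here is a minimal sketch using the JDK's `java.lang.invoke.SwitchPoint` directly. It is not Groovy's actual indy runtime code: the class name `SharedSwitchPointDemo` and the `fast`/`slow` methods are hypothetical stand-ins for a cached dispatch path and the re-resolution path. It assumes, as the description above states, that every call site is guarded by the same global SwitchPoint.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

final class SharedSwitchPointDemo {
    // Hypothetical stand-ins: "fast" is the cached target, "slow" the fallback
    // that a real runtime would use to re-resolve the method.
    static String fast(String site) { return site + ":fast"; }
    static String slow(String site) { return site + ":slow"; }

    static String[] run() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, String.class);
            MethodHandle fast = lookup.findStatic(SharedSwitchPointDemo.class, "fast", type);
            MethodHandle slow = lookup.findStatic(SharedSwitchPointDemo.class, "slow", type);

            // One SwitchPoint shared by two otherwise unrelated call sites.
            SwitchPoint global = new SwitchPoint();
            MethodHandle siteA = global.guardWithTest(fast, slow);
            MethodHandle siteB = global.guardWithTest(fast, slow);

            String a1 = (String) siteA.invokeExact("A");
            String b1 = (String) siteB.invokeExact("B");

            // A single invalidation (think: any metaclass change) flips both sites.
            SwitchPoint.invalidateAll(new SwitchPoint[] { global });

            String a2 = (String) siteA.invokeExact("A");
            String b2 = (String) siteB.invokeExact("B");
            return new String[] { a1, b1, a2, b2 };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        // prints "A:fast, B:fast, A:slow, B:slow"
        System.out.println(String.join(", ", run()));
    }
}
```

In a real runtime the fallback would re-link each site under a fresh SwitchPoint, so every invalidation pays the re-resolution cost at every call site, and the JIT has to discard code compiled against the old guard.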
### MetaclassVariationBench (9 benchmarks)
Tests GORM-like patterns where domain classes get per-instance
ExpandoMetaClass enhancements:
- **Shared vs per-instance metaclass** dispatch
- **Multi-class startup simulation** (4 domain types enhanced in sequence)
- **Dynamic finder** calls via static metaclass injection
- **Per-instance EMC with ongoing churn** (worst case)
Key result: `baselineSharedMetaclass` (2.0ms) vs `perInstanceMetaclass`
(420ms) = **~207x overhead**
### GrailsWorkloadBench (14 benchmarks)
Tests patterns extracted from the
[grails7-performance-regression](https://github.com/jglapa/grails7-performance-regression)
demo app's `PerformanceTestService`:
- **Collection closure chains**
(`.findAll{}.collect{}.groupBy{}.collectEntries{}`)
- **Spread operator** (`employees*.firstName`)
- **Nested closure delegation** (3-level criteria-like DSL)
- **GString interpolation** with dynamic property access
- **Dynamic property by name** (`this."$field"`)
- **Project metrics aggregation** (`.count{}`, `.sum{}`, map building)
- **Full analysis** combining all patterns
Key result: `baselineCollectionClosureChain` (199ms) vs the same chain with
invalidation (1631ms) = **~8.2x overhead** (matches the demo app's observed
8.2x bootRun regression)
## Verification Results
Each benchmark has a baseline (no metaclass changes) and an invalidation
variant (periodic cross-type metaclass modifications). The delta quantifies the
SwitchPoint invalidation overhead.
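
As a worked example of that delta, the overhead factor is the invalidation variant's time divided by the baseline's time. The sketch below recomputes two of the key results from the timings quoted in this description; the class and method names are illustrative only and are not part of the PR.

```java
final class OverheadRatio {
    /** Ratio of the invalidation variant's time to the baseline's time (same units). */
    static double overhead(double baselineMs, double invalidationMs) {
        return invalidationMs / baselineMs;
    }

    public static void main(String[] args) {
        // Timings in milliseconds, as quoted in the benchmark sections.
        double callSite = overhead(0.49, 467);      // baselineHotLoop vs crossTypeInvalidationEvery1000
        double closureChain = overhead(199, 1631);  // baselineCollectionClosureChain vs with invalidation
        System.out.println("call site overhead factor: " + Math.round(callSite));
        System.out.println("closure chain overhead factor: " + Math.round(closureChain * 10) / 10.0);
    }
}
```

These come to roughly 953 and 8.2, consistent with the reported ~950x and ~8.2x.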
The same benchmarks were also run against PR #2377's optimization (disabling
SwitchPoint guards + explicit cache clearing):
- **Baselines regressed 34-2851%** (steady-state dispatch worse without
SwitchPoint guards)
- **Invalidation benchmarks mostly regressed** (62-183% worse in most cases)
- **Only 3 of 33 benchmarks improved** (10-28% faster in specific complex
patterns)
The benchmarks confirm that the SwitchPoint guard is critical for JIT
optimization of stable call sites, and removing it (as PR #2377 does) trades
steady-state performance for marginal churn resilience.
## Running
```bash
# All new benchmarks
./gradlew :performance:jmh \
  -PbenchInclude="CallSiteInvalidation|MetaclassVariation|GrailsWorkload"
# Individual benchmark class
./gradlew :performance:jmh -PbenchInclude=CallSiteInvalidation
```
## Related
- [GROOVY-10307](https://issues.apache.org/jira/browse/GROOVY-10307)
- [grails-core #15293](https://github.com/apache/grails-core/issues/15293)
- [Demo app](https://github.com/jglapa/grails7-performance-regression)
showing 5-8x regression
- PR #2377 (SwitchPoint guard optimization - benchmarked here)
> Groovy 4 runtime performance on average 2.4x slower than Groovy 3
> -----------------------------------------------------------------
>
> Key: GROOVY-10307
> URL: https://issues.apache.org/jira/browse/GROOVY-10307
> Project: Groovy
> Issue Type: Bug
> Components: bytecode, performance
> Affects Versions: 4.0.0-beta-1, 3.0.9
> Environment: OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9
> (build 11.0.11+9)
> OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)
> WIN10 (tests) / REL 8 (web application)
> IntelliJ 2021.2
> Reporter: mgroovy
> Priority: Major
> Attachments: groovy_3_0_9_gc.png, groovy_3_0_9_loop2.png,
> groovy_3_0_9_loop4.png, groovy_3_0_9_mem.png, groovy_4_0_0_b1_loop2.png,
> groovy_4_0_0_b1_loop4.png, groovy_4_0_0_b1_loop4_gc.png,
> groovy_4_0_0_b1_loop4_mem.png,
> groovysql_performance_groovy4_2_xx_yy_zzzz.groovy, loops.groovy,
> profile3.txt, profile4-loops.txt, profile4.txt, profile4d.txt
>
>
> Groovy 4.0.0-beta-1 runtime performance in our framework is on average 2 to 3
> times slower compared to using Groovy 3.0.9 (regular i.e. non-INDY)
> * Our framework and application code is written entirely in Groovy, spread
> over multiple IntelliJ modules
> ** mixed @CompileDynamic/@TypeChecked and @CompileStatic
> ** No Java classes left in project, i.e. no cross compilation occurs
> * We build using IntelliJ 2021.2 Groovy build process, then run / deploy the
> compiled class files
> ** We do _not_ use a Groovy based DSL, nor do we execute Groovy scripts
> during execution
> * Performance degradation when using Groovy 4.0.0-beta-1 instead of Groovy
> 3.0.9 (non-INDY):
> ** The performance of the largest of our web applications has dropped 3x
> (startup) / 2x (table refresh) respectively
> *** Stack: Tomcat/Vaadin/Ebean plus framework generated SQL
> ** Our test suite runs about 2.4 times as long as before (120 min when using
> G4, compared to about 50 min with G3)
> *** JUnit 5
> *** test suite also contains no scripts / dynamic code execution
> *** Individual test performance varies: a small number of tests run faster,
> but the majority are slower, with some extreme cases taking nearly 10x as long
> to finish
> * Using Groovy 3.0.9 INDY displays nearly identical performance degradation,
> so it seems that the use of invoke dynamic is somehow at fault