[jira] [Commented] (GROOVY-10307) Groovy 4 runtime performance on average 2.4x slower than Groovy 3

ASF GitHub Bot (Jira) Mon, 02 Mar 2026 16:10:07 -0800


    [ 
https://issues.apache.org/jira/browse/GROOVY-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18062302#comment-18062302
 ]


ASF GitHub Bot commented on GROOVY-10307:
-----------------------------------------

jamesfredley opened a new pull request, #2390:
URL: https://github.com/apache/groovy/pull/2390

   ## Summary
   
   Adds three new JMH benchmark classes in `org.apache.groovy.perf.grails` that 
reproduce the real-world invokedynamic performance regression observed in 
Grails 7 applications (GROOVY-10307).
   
   ## Benchmark Classes
   
   ### CallSiteInvalidationBench (11 benchmarks)
   
   Tests the core SwitchPoint invalidation mechanism - the root cause of the 
regression. Demonstrates that modifying ANY metaclass triggers global 
invalidation affecting ALL call sites:
   
   - **Cross-type invalidation** at 3 frequencies (every 100/1000/10000 
iterations)
   - **Same-type invalidation** for comparison
   - **Multiple call sites** scaling (5 concurrent call sites)
   - **Burst-then-steady-state** simulating framework startup
   
   Key result: `baselineHotLoop` (0.49ms) vs `crossTypeInvalidationEvery1000` 
(467ms) = **~950x overhead**
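
For readers unfamiliar with the mechanism, here is a minimal plain-JDK sketch of why invalidating one shared `SwitchPoint` affects every call site it guards. It assumes call sites share a single guard, as the "global invalidation" description above suggests. It is not code from this PR or from Groovy's indy runtime: `SharedSwitchPointDemo` and its methods are hypothetical, and only the `java.lang.invoke` classes are real JDK APIs.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

// Hypothetical demo class; not part of the Groovy code base or this PR.
final class SharedSwitchPointDemo {
    // Stands in for an already-linked, JIT-friendly call target.
    static String cached(String site) { return site + ":cached"; }

    // Stands in for the slow path that would re-resolve the method.
    static String relink(String site) { return site + ":relink"; }

    /** Calls two call sites guarded by ONE SwitchPoint, before and after invalidation. */
    static String[] run() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, String.class);
            MethodHandle fast = lookup.findStatic(SharedSwitchPointDemo.class, "cached", type);
            MethodHandle slow = lookup.findStatic(SharedSwitchPointDemo.class, "relink", type);

            // Assumption: a single guard shared by unrelated call sites.
            SwitchPoint global = new SwitchPoint();
            MethodHandle siteA = global.guardWithTest(fast, slow);
            MethodHandle siteB = global.guardWithTest(fast, slow);

            String a1 = (String) siteA.invokeExact("A");
            String b1 = (String) siteB.invokeExact("B");

            // One invalidation, e.g. triggered by a change relevant only to site A...
            SwitchPoint.invalidateAll(new SwitchPoint[] { global });

            // ...sends BOTH sites to their fallback path.
            String a2 = (String) siteA.invokeExact("A");
            String b2 = (String) siteB.invokeExact("B");
            return new String[] { a1, b1, a2, b2 };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", run()));
    }
}
```

Running `main` prints `A:cached, B:cached, A:relink, B:relink`: after the single `invalidateAll` call, site B falls back even though nothing specific to it changed.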
   
   ### MetaclassVariationBench (9 benchmarks)
   
   Tests GORM-like patterns where domain classes get per-instance 
ExpandoMetaClass enhancements:
   
   - **Shared vs per-instance metaclass** dispatch
   - **Multi-class startup simulation** (4 domain types enhanced in sequence)
   - **Dynamic finder** calls via static metaclass injection
   - **Per-instance EMC with ongoing churn** (worst case)
   
   Key result: `baselineSharedMetaclass` (2.0ms) vs `perInstanceMetaclass` 
(420ms) = **~207x overhead**
   
   ### GrailsWorkloadBench (14 benchmarks)
   
   Tests patterns extracted from the [grails7-performance-regression](https://github.com/jglapa/grails7-performance-regression) demo app's `PerformanceTestService`:
   
   - **Collection closure chains** (`.findAll{}.collect{}.groupBy{}.collectEntries{}`)
   - **Spread operator** (`employees*.firstName`)
   - **Nested closure delegation** (3-level criteria-like DSL)
   - **GString interpolation** with dynamic property access
   - **Dynamic property by name** (`this."$field"`)
   - **Project metrics aggregation** (`.count{}`, `.sum{}`, map building)
   - **Full analysis** combining all patterns
   
   Key result: `baselineCollectionClosureChain` (199ms) vs the same chain with invalidation (1631ms) = **~8.2x overhead** (matches the demo app's observed 8.2x bootRun regression)
   
   ## Verification Results
   
   Each benchmark has a baseline (no metaclass changes) and an invalidation 
variant (periodic cross-type metaclass modifications). The delta quantifies the 
SwitchPoint invalidation overhead.
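
The baseline-versus-invalidation pairing can be sketched in plain Java as follows. This is a hypothetical, deterministic stand-in and not one of the benchmark classes: `PeriodicInvalidationSketch` counts how often a call site has to relink instead of measuring time with JMH, and its relink logic is a simplification rather than Groovy's actual call-site code.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.invoke.SwitchPoint;

// Hypothetical sketch; names and structure are assumptions, not Groovy internals.
final class PeriodicInvalidationSketch {
    private MutableCallSite site;
    private MethodHandle fast;
    private MethodHandle fallback;
    private SwitchPoint guard;
    int relinks; // how many times the slow path had to relink the site

    PeriodicInvalidationSketch() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(int.class, int.class);
            fast = lookup.findStatic(PeriodicInvalidationSketch.class, "target", type);
            fallback = lookup
                .findVirtual(PeriodicInvalidationSketch.class, "relinkThenCall", type)
                .bindTo(this);
            site = new MutableCallSite(fallback); // unlinked: the first call links it
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static int target(int x) { return x + 1; }

    // Slow path: install a fresh guard in front of the fast target, then call it.
    int relinkThenCall(int x) {
        relinks++;
        guard = new SwitchPoint();
        site.setTarget(guard.guardWithTest(fast, fallback));
        return target(x);
    }

    /** invalidateEvery == 0 is the baseline; otherwise the guard is invalidated periodically. */
    int run(int iterations, int invalidateEvery) {
        try {
            MethodHandle invoker = site.dynamicInvoker();
            int sum = 0;
            for (int i = 0; i < iterations; i++) {
                if (invalidateEvery > 0 && i > 0 && i % invalidateEvery == 0) {
                    SwitchPoint.invalidateAll(new SwitchPoint[] { guard });
                }
                sum += (int) invoker.invokeExact(i);
            }
            return sum;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        PeriodicInvalidationSketch baseline = new PeriodicInvalidationSketch();
        baseline.run(10_000, 0);
        PeriodicInvalidationSketch churn = new PeriodicInvalidationSketch();
        churn.run(10_000, 1_000);
        System.out.println("baseline relinks=" + baseline.relinks
            + ", churn relinks=" + churn.relinks);
    }
}
```

With these parameters the baseline links once and the churn variant relinks ten times (the initial link plus nine invalidations). In the real benchmarks the cost of each relink and the lost JIT optimization are what the timing delta captures.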
   
   Also tested against PR #2377's optimization (disabling SwitchPoint guards + 
explicit cache clearing):
   
   - **Baselines regressed 34-2851%** (steady-state dispatch is slower without SwitchPoint guards)
   - **Invalidation benchmarks mostly regressed** (62-183% worse in most cases)
   - **Only 3 of 33 benchmarks improved** (10-28% faster in specific complex patterns)
   
   The benchmarks confirm that the SwitchPoint guard is critical for JIT 
optimization of stable call sites, and removing it (as PR #2377 does) trades 
steady-state performance for marginal churn resilience.
   
   ## Running
   
   ```bash
   # All new benchmarks
   ./gradlew :performance:jmh -PbenchInclude="CallSiteInvalidation|MetaclassVariation|GrailsWorkload"
   
   # Individual benchmark class
   ./gradlew :performance:jmh -PbenchInclude=CallSiteInvalidation
   ```
   
   ## Related
   
   - [GROOVY-10307](https://issues.apache.org/jira/browse/GROOVY-10307)
   - [grails-core #15293](https://github.com/apache/grails-core/issues/15293)
   - [Demo app](https://github.com/jglapa/grails7-performance-regression) showing 5-8x regression
   - PR #2377 (SwitchPoint guard optimization - benchmarked here)




> Groovy 4 runtime performance on average 2.4x slower than Groovy 3
> -----------------------------------------------------------------
>
>                 Key: GROOVY-10307
>                 URL: https://issues.apache.org/jira/browse/GROOVY-10307
>             Project: Groovy
>          Issue Type: Bug
>          Components: bytecode, performance
>    Affects Versions: 4.0.0-beta-1, 3.0.9
>         Environment: OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 
> (build 11.0.11+9)
> OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)
> WIN10 (tests) / REL 8 (web application)
> IntelliJ 2021.2 
>            Reporter: mgroovy
>            Priority: Major
>         Attachments: groovy_3_0_9_gc.png, groovy_3_0_9_loop2.png, 
> groovy_3_0_9_loop4.png, groovy_3_0_9_mem.png, groovy_4_0_0_b1_loop2.png, 
> groovy_4_0_0_b1_loop4.png, groovy_4_0_0_b1_loop4_gc.png, 
> groovy_4_0_0_b1_loop4_mem.png, 
> groovysql_performance_groovy4_2_xx_yy_zzzz.groovy, loops.groovy, 
> profile3.txt, profile4-loops.txt, profile4.txt, profile4d.txt
>
>
> Groovy 4.0.0-beta-1 runtime performance in our framework is on average 2 to 3 
> times slower compared to using Groovy 3.0.9 (regular i.e. non-INDY)
> * Our complete framework and application code is completely written in 
> Groovy, spread over multiple IntelliJ modules
> ** mixed @CompileDynamic/@TypeChecked and @CompileStatic
> ** No Java classes left in project, i.e. no cross compilation occurs
> * We build using IntelliJ 2021.2 Groovy build process, then run / deploy the 
> compiled class files
> ** We do _not_ use a Groovy based DSL, nor do we execute Groovy scripts 
> during execution
> * Performance degradation when using Groovy 4.0.0-beta-1 instead of Groovy 
> 3.0.9 (non-INDY):
> ** The performance of the largest of our web applications has dropped 3x 
> (startup) / 2x (table refresh) respectively
> *** Stack: Tomcat/Vaadin/Ebean plus framework generated SQL
> ** Our test suite runs about 2.4 times as long as before (120 min when using 
> G4, compared to about 50 min with G3)
> *** JUnit 5 
> *** test suite also contains no scripts / dynamic code execution
> *** Individual test performance varies: A small number of tests runs faster, 
> but the majority is slower, with some extreme cases taking nearly 10x as long 
> to finish
> * Using Groovy 3.0.9 INDY displays nearly identical performance degradation, 
> so it seems that the use of invoke dynamic is somehow at fault



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
