On 09.05.22 17:41, MG wrote:
Hi Jochen,
since I am not feeling well right now, just some quick answers to some
of your questions:
1. I am not at liberty to release the source code from our project, not
even partially, so these JARs are currently all I can provide. I
felt when I reported this problem initially, there were doubts that
it even exists, so after I could not pinpoint the problem on my
side, I invested the time to extract these project parts, strip
any historical references to other modules etc., and supply them
in compiled form.
I think I can test whether my assumption holds. I wrote a little
program that generates a class with many methods plus one more method
that calls all of them, one by one. The methods are empty and take no
parameters. I then feed this to the compiler and measure the time for
instantiating each class and running its methods.
With 1000 methods per class I get these results on my machine for
Groovy 3.0.9 non-indy:
1 Class: 36ms
10 Classes: 114ms
100 Classes: 985ms
1000 Classes: 8227ms
The last case means that more than one million method calls are
executed across the test classes.
Then I repeat the test with Groovy 4.0.2 indy (slowdown factor relative
to the Groovy 3 numbers in parentheses):
1 Class: 137ms (* 3.81)
10 Classes: 670ms (* 5.88)
100 Classes: 3711ms (* 3.77)
1000 Classes: 108235ms (* 13.16)
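The factors in parentheses are simply the Groovy 4 indy time divided by the Groovy 3 non-indy time for the same number of classes. A minimal sketch to recompute them, rounded to two decimals (the map and variable names are mine, chosen for illustration; only the millisecond values come from the runs above):

```groovy
// Timings in ms copied from the two runs, keyed by number of classes.
def groovy3NonIndy = [1: 36, 10: 114, 100: 985, 1000: 8227]
def groovy4Indy    = [1: 137, 10: 670, 100: 3711, 1000: 108235]

// Slowdown factor = indy time / non-indy time, rounded to two decimals.
def factors = groovy3NonIndy.collectEntries { classes, baseMs ->
    double factor = groovy4Indy[classes] / baseMs
    [(classes): factor.round(2)]
}

factors.each { classes, factor -> println "$classes classes: * $factor" }
```

The interesting part is not the individual values but that the factor grows sharply at 1000 classes instead of staying roughly constant.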
I would not classify these numbers as a proper test, but they do hint at
what I had assumed: creating the initial callsites in indy is too
expensive.
The script I used:

// arguments: number of classes, number of methods per class
def CLASSES = args[0] as Integer
def METHODS = args[1] as Integer

// build METHODS empty methods plus one call to each of them
def methods = ""
def calls = ""
METHODS.times { i ->
    methods += "\ndef foo$i(){}"
    calls += "\nfoo$i()"
}

// compile all test classes up front, outside the timed section
def gcl = new GroovyClassLoader()
def testClasses = (1..CLASSES).collect {
    def testClass = """
class MyPerfTest$it{
$methods
def run() {
$calls
}
}
"""
    gcl.parseClass(testClass)
}

// timed section: instantiate each class and visit every callsite once
def time1 = System.nanoTime()
for (clazz : testClasses) {
    clazz.newInstance().run()
}
def time2 = System.nanoTime()
def timeDiff = (time2 - time1) / 1000000
println "Time = $timeDiff ms"

While this is surely not "real world", my goal was to test callsite
creation, and I think the test does this well enough. Why exactly this
scenario? Because I know indy can perform well in micro-benchmarks,
where you crunch numbers in tight loops. If there is no problem
there, then it must lie with the many callsites you tend to visit only
once, as is the case here.
2. As I have said in previous posts, the performance degradation occurs
with non-Indy vs Indy. In fact this is how I could pinpoint that
Indy was to blame: After realizing that our test suite ran for 3h
under Groovy 4, instead of 1h under Groovy 3, and that this was not
restricted to a few tests becoming really slow, but was a more or
less general phenomenon (which was puzzling), I ran the test suite
under Groovy 3 Indy (for the first time), and got basically the same
reduced performance.
I have actually suspected this for a long time. That is unfortunate,
since I sadly have barely any time for Groovy these days, and checking
this can take a very long time.
[...]
4. Throwing away the first test loop: It is a bit of an academic
discussion, since Groovy 4 performance is always 2 to 3 times
slower in any case. You can execute the test script interleaved
between Groovy 3 / 4 multiple times to see how measurements develop,
but I would say it is clear that Groovy 4 is always way slower than
Groovy 3 for this test, and that is all that matters :-)
In my little test your factor of 2 to 3 is a factor of 3 to 4 for me,
and that was my best case.
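On throwing away the first loop: a minimal sketch of how one could time several rounds separately, so that the first round (which pays for creating the callsites) can be compared with the later ones. The helper `timeRounds` and the class `WarmupProbe` are hypothetical names for illustration and are not part of my test script; a real check would use the generated classes with many methods instead of this tiny one.

```groovy
// Hypothetical helper: run the closure `rounds` times, return elapsed ms per round.
def timeRounds(int rounds, Closure work) {
    (1..rounds).collect {
        long start = System.nanoTime()
        work()
        (System.nanoTime() - start) / 1_000_000
    }
}

// Tiny stand-in for the generated test classes (assumption: two empty methods).
def gcl = new GroovyClassLoader()
def clazz = gcl.parseClass('''
class WarmupProbe {
    def foo0() {}
    def foo1() {}
    def run() { foo0(); foo1() }
}
''')

// Reuse one instance so later rounds hit callsites that already exist.
def instance = clazz.getDeclaredConstructor().newInstance()
def timings = timeRounds(3) { instance.run() }
timings.eachWithIndex { ms, round -> println "round ${round + 1}: $ms ms" }
```

If only the first round is slow, the cost sits in callsite creation; if all rounds are slow, it is somewhere else.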
bye Jochen