Thanks, I had not looked at the master branch, ClassInfo source looks
quite a bit cleaner there already :)
Regarding programmatic cleanup (GROOVY-7646), I think that is a good
idea, but in the details there might be some obstacles.
For example this sequence of calls to GroovyShell:
def shell = new GroovyShell()
def script1 = shell.parse("42")
assert script1.class.name == "Script1"
def script2 = shell.parse("new Script1().run()")
assert script2.run() == 42
def script3 = shell.parse("99", "Nintetynine")
assert script3.class.name == "Nintetynine"
def file = new File("Fiftyfive.groovy")
file.setText("55")
def script4 = shell.parse(file)
assert script4.class.name == "Fiftyfive"
So, classes accumulate (in the GroovyClassLoader) and can be addressed
by their names in subsequent scripts. (And for more complex script
expressions, more than one class might be the result of compilation,
e.g. with closures or inner classes, enums, etc.)
I would estimate that in the case where a script is run with a name
given automatically by the GroovyShell ("Script1", "Script2", ...) it
would be OK to do the cleanup (and I guess using GroovyShell that way
might be a very common case?), but when it comes to explicitly named
scripts, doing so might change behaviour of existing code.
I just took a look at GroovyScriptEngine which also has run() methods
and, if I remember correctly, it recompiles all scripts if one of them
changes (to get dependencies right), so in principle lots of classes to
cleanup for each time this happens. But I am not sure if that is
possible there, because there is also a createScript() method, so
possibly still objects/classes that are in use around.
(And I have also just started to think about Grengine in that context,
my open source library for using Groovy in a Java VM (and which almost
nobody uses ;), there it might be easier to build in such automatic
removal because the approach is more structured, although a bit less
dynamic.)
Hmn, would really be great if there was a way to achieve constant
garbage collection of Groovy classes.
Alain
On 16.05.16 18:18, John Wagenleitner wrote:
Just catching up on this thread, very interesting discussion and will
have to give the posted test code a try.
You are right about the PhantomReference and it has been removed in
master [1] along with the local cache that used it. Due to some
refactorings that were not in 2_4_X at the time it wasn't removed from
that branch. But probably should be cleaned up if any fixes for the
memory issues to ClassInfo are merged into that branch in the near future.
I think the suggestion of referencing the Class via the ClassInfo from
metaclasses/cachedclasses would be a good one, the less places the
Class is kept the better. Unfortunately since it's a protected field
on MetaClassImpl that would be a breaking change would be something
for a 3.0 as you pointed out.
So far, I have found that keeping a WeakReference to the Class in
ClassInfo allows it to be collected (mostly tested with non-ClassValue
version of ClassInfo). At least one exception is if methods are added
to the metaclass then it's required to setMetaClass(null) since the
ExpandoMetaClass is a strong reference on ClassInfo and EMC has a
strong reference to the Class. What is difficult to determine is if
keeping a WeakReference can cause any potential issues. Only possible
problem I can see is if the methods of the Class A were referenced in
the MetaMethodIndex for Class B, but I think in that case as long as
the Class B was strongly referenced the index itself would keep Class
A referenced.
In environments where lots of scripts are being parsed and run and
references to the Class are not retained, it might be worth having a
way to programmatically initiate the cleanup so as not to have to wait
for the Soft References to be collected. The extra performance costs
of clearing a few references might not be as high as consistently
hitting the upper heap limit constantly. It is something I have
looked at for GROOVY-7646 [2]. Parsed groovy classes should be
collectible by default without any intervention, but there may be
cases where an API to help speed the removal might be useful too.
[1]
https://github.com/apache/groovy/commit/e967039222dc01a59824f95d9313a3b2e7aa9f50
[2] https://github.com/apache/groovy/pull/325
On Mon, May 16, 2016 at 8:01 AM, Alain Stalder <[email protected]
<mailto:[email protected]>> wrote:
My time here is running up, other things to attend to, so here is what
I wrote about the current state of class loading and garbage
collection
in Groovy in the just updated user manual of Grengine:
https://www.grengine.ch/manual.html#the-cost-of-session-separation
--
==== The Cost of Session Separation
Although loading classes from bytecode obtained from compiling
Groovy scripts
is a lot less expensive than compiling them (plus afterwards also
loading the
resulting bytecode), it is still somewhat more expensive than one
might naively
expect and there are a few things to be aware of when operating
that way.
In the following, I will simply call classes compiled by the
Groovy compiler
from Groovy scripts/sources _Groovy classes_ and classes compiled
by the Java
compiler from Java sources _Java classes_.
* *Class Loading* +
Experimentally, loading of a typical Groovy class is often about
10 times
slower than loading a Java class with similarly complex source
code, but
both are relatively expensive operations (of the order of a
millisecond
for a small Groovy class, to give a rough indication). For Java
classes,
this is apparently mainly expensive because some security checks
have to
be made on the bytecode. For Groovy classes, it is mainly expensive
because some meta information is needed to later efficiently
call methods
dynamically, and the like.
* *Garbage Collection* +
Classes are stored in _PermGen_ (up to Java 7) resp. _Metaspace_
(Java 8
and later) plus some associated data on the Heap, at least for
Groovy
classes the latter is normally the case (meta information).
Whereas for
Java classes, unused classes appear to be usually garbage
collected from
PermGen/Metaspace continuously, with Groovy classes this
experimentally
does not happen before PermGen/Metaspace or the Heap reach a
configured
limit. Why exactly this is so and whether it is easy to change
and whether
it will change in the future, is difficult to answer for me, I
find the
code around it is rather convoluted, hard to untangle. Note that
by default
on Java VMs there is typically no limit set for Metaspace (but
there is
for PermGen), so setting a limit is crucial in practice when
using Groovy.
* *Garbage Collection Bugs* +
In the past, several Groovy versions had failed at garbage
collecting
Groovy classes and their class loaders, resulting finally in an
`OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the
Heap,
whichever limit was reached first. If when you are reading this,
Groovy
2.4.6 is (still) the newest version, make sure you set the
system property
`groovy.use.classvalue=true` in the context of Grengine. Note
that under
different circumstances, like the one described in
https://issues.apache.org/jira/browse/GROOVY-7591[GROOVY-7591
<https://issues.apache.org/jira/browse/GROOVY-7591%5BGROOVY-7591>:
Use of ClassValue causes major memory leak] you would instead
have had to
set it to false! That Groovy bug is actually in turn due to a bug in
Oracle/OpenJDK Java VMs regarding garbage collection under some
circumstances, more precisely a bug in a new feature (`ClassValue`)
introduced in order to make thing easier(!) for dynamic
languages in the
Java VM, see
https://bugs.openjdk.java.net/browse/JDK-8136353[JDK-8136353]
<https://bugs.openjdk.java.net/browse/JDK-8136353%5BJDK-8136353%5D>.
So, if you want to use session separation with Greninge (or
otherwise want
to load many Groovy classes repeately), first set a limit on
PermGen/Metaspace,
then verify that classes can be garbage collected in an
environment close to
production and that throughput under load would be sufficient
(despite the
relatively slow class loading performance of Groovy (and Java)
classes in the
Java VM) and then use it. And don't forget to repeat this at least
when you
upgrade Groovy to a new version, but possibly also when you
upgrade Java.
Or see the next section for an alternative...
--
PS: By the way, very funny how Jochen Theodorou "garbage
collected" what I
wrote about PhantomReference to a "[...]"...
Good luck with Groovy garbage collection.
Alain