On Mon, May 16, 2016 at 1:34 PM, Alain Stalder <[email protected]> wrote:

> Thanks, I had not looked at the master branch, ClassInfo source looks
> quite a bit cleaner there already :)
>
> Regarding programmatic cleanup (GROOVY-7646), I think that is a good idea,
> but in the details there might be some obstacles.
>

I definitely agree, many obstacles usually present themselves once you
scratch the surface. :)



>
> For example this sequence of calls to GroovyShell:
>
> def shell = new GroovyShell()
> def script1 = shell.parse("42")
> assert script1.class.name == "Script1"
> def script2 = shell.parse("new Script1().run()")
> assert script2.run() == 42
>
> def script3 = shell.parse("99", "Nintetynine")
> assert script3.class.name == "Nintetynine"
>
> def file = new File("Fiftyfive.groovy")
> file.setText("55")
> def script4 = shell.parse(file)
> assert script4.class.name == "Fiftyfive"
>
> So, classes accumulate (in the GroovyClassLoader) and can be addressed by
> their names in subsequent scripts. (And for more complex script
> expressions, more than one class might be the result of compilation, e.g.
> with closures or inner classes, enums, etc.)
>


This passes with the changes in place for GROOVY-7646, though calls to
parse don't call the added clean-up code.  It still passes if I change
parse to run.



> I would estimate that in the case where a script is run with a name given
> automatically by the GroovyShell ("Script1", "Script2", ...) it would be OK
> to do the cleanup (and I guess using GroovyShell that way might be a very
> common case?), but when it comes to explicitly named scripts, doing so
> might change behaviour of existing code.
>


For quite some time GroovyShell#evaluate(Reader,String filename) was doing
this kind of cleanup [1].  Cleanup meaning removing the metaClass, the
ClassInfo from the cache and the Introspector beanInfoCache.  As long as
the ClassLoader and any of it's classes are still referenced the Classes
that result from parse/run calls would still be available.  But you are
right, there are so many ways the shell can be used it is difficult to tell
what it might break.



>
> I just took a look at GroovyScriptEngine which also has run() methods and,
> if I remember correctly, it recompiles all scripts if one of them changes
> (to get dependencies right), so in principle lots of classes to cleanup for
> each time this happens. But I am not sure if that is possible there,
> because there is also a createScript() method, so possibly still
> objects/classes that are in use around.
>
> (And I have also just started to think about Grengine in that context, my
> open source library for using Groovy in a Java VM (and which almost nobody
> uses ;), there it might be easier to build in such automatic removal
> because the approach is more structured, although a bit less dynamic.)
>
> Hmn, would really be great if there was a way to achieve constant garbage
> collection of Groovy classes.
>


I take constant to mean not waiting until heap/metaspace is filled before
collection.  If so, from that I've seen that would still require some user
intervention (java.beans.Introspector.flushFromCaches(clazz)) in order to
clear the Introspector cache which keeps a Soft Reference to main method of
a Script class which in turns references the Class.



>
>
> Alain
>
>
>

[1]
https://github.com/apache/groovy/commit/5724870025c25622015ba13c0310def5742d0b2f#diff-62f4f9c1bd5efea3ddcfe563c25f953eR459



> On 16.05.16 18:18, John Wagenleitner wrote:
>
> Just catching up on this thread, very interesting discussion and will have
> to give the posted test code a try.
>
> You are right about the PhantomReference and it has been removed in
> master [1] along with the local cache that used it.  Due to some
> refactorings that were not in 2_4_X at the time it wasn't removed from that
> branch.  But probably should be cleaned up if any fixes for the memory
> issues to ClassInfo are merged into that branch in the near future.
>
> I think the suggestion of referencing the Class via the ClassInfo from
> metaclasses/cachedclasses would be a good one, the less places the Class is
> kept the better.  Unfortunately since it's a protected field on
> MetaClassImpl that would be a breaking change would be something for a 3.0
> as you pointed out.
>
> So far, I have found that keeping a WeakReference to the Class in
> ClassInfo allows it to be collected (mostly tested with non-ClassValue
> version of ClassInfo).  At least one exception is if methods are added to
> the metaclass then it's required to setMetaClass(null) since the
> ExpandoMetaClass is a strong reference on ClassInfo and EMC has a strong
> reference to the Class.  What is difficult to determine is if keeping a
> WeakReference can cause any potential issues.  Only possible problem I can
> see is if the methods of the Class A were referenced in the MetaMethodIndex
> for Class B, but I think in that case as long as the Class B was strongly
> referenced the index itself would keep Class A referenced.
>
> In environments where lots of scripts are being parsed and run and
> references to the Class are not retained, it might be worth having a way to
> programmatically initiate the cleanup so as not to have to wait for the
> Soft References to be collected.  The extra performance costs of clearing a
> few references might not be as high as consistently hitting the upper heap
> limit constantly.  It is something I have looked at for GROOVY-7646 [2].
> Parsed groovy classes should be collectible by default without any
> intervention, but there may be cases where an API to help speed the removal
> might be useful too.
>
>
> [1]
> https://github.com/apache/groovy/commit/e967039222dc01a59824f95d9313a3b2e7aa9f50
>
> [2] https://github.com/apache/groovy/pull/325
>
>
> On Mon, May 16, 2016 at 8:01 AM, Alain Stalder <[email protected]> wrote:
>
>> My time here is running up, other things to attend to, so here is what
>> I wrote about the current state of class loading and garbage collection
>> in Groovy in the just updated user manual of Grengine:
>>
>> https://www.grengine.ch/manual.html#the-cost-of-session-separation
>> --
>> ==== The Cost of Session Separation
>>
>> Although loading classes from bytecode obtained from compiling Groovy
>> scripts
>> is a lot less expensive than compiling them (plus afterwards also loading
>> the
>> resulting bytecode), it is still somewhat more expensive than one might
>> naively
>> expect and there are a few things to be aware of when operating that way.
>>
>> In the following, I will simply call classes compiled by the Groovy
>> compiler
>> from Groovy scripts/sources _Groovy classes_ and classes compiled by the
>> Java
>> compiler from Java sources _Java classes_.
>>
>> * *Class Loading* +
>>   Experimentally, loading of a typical Groovy class is often about 10
>> times
>>   slower than loading a Java class with similarly complex source code, but
>>   both are relatively expensive operations (of the order of a millisecond
>>   for a small Groovy class, to give a rough indication). For Java classes,
>>   this is apparently mainly expensive because some security checks have to
>>   be made on the bytecode. For Groovy classes, it is mainly expensive
>>   because some meta information is needed to later efficiently call
>> methods
>>   dynamically, and the like.
>> * *Garbage Collection* +
>>   Classes are stored in _PermGen_ (up to Java 7) resp. _Metaspace_ (Java 8
>>   and later) plus some associated data on the Heap, at least for Groovy
>>   classes the latter is normally the case (meta information). Whereas for
>>   Java classes, unused classes appear to be usually garbage collected from
>>   PermGen/Metaspace continuously, with Groovy classes this experimentally
>>   does not happen before PermGen/Metaspace or the Heap reach a configured
>>   limit. Why exactly this is so and whether it is easy to change and
>> whether
>>   it will change in the future, is difficult to answer for me, I find the
>>   code around it is rather convoluted, hard to untangle. Note that by
>> default
>>   on Java VMs there is typically no limit set for Metaspace (but there is
>>   for PermGen), so setting a limit is crucial in practice when using
>> Groovy.
>> * *Garbage Collection Bugs* +
>>   In the past, several Groovy versions had failed at garbage collecting
>>   Groovy classes and their class loaders, resulting finally in an
>>   `OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the Heap,
>>   whichever limit was reached first. If when you are reading this, Groovy
>>   2.4.6 is (still) the newest version, make sure you set the system
>> property
>>   `groovy.use.classvalue=true` in the context of Grengine. Note that under
>>   different circumstances, like the one described in
>>   https://issues.apache.org/jira/browse/GROOVY-7591[GROOVY-7591:
>>   Use of ClassValue causes major memory leak] you would instead have had
>> to
>>   set it to false! That Groovy bug is actually in turn due to a bug in
>>   Oracle/OpenJDK Java VMs regarding garbage collection under some
>>   circumstances, more precisely a bug in a new feature (`ClassValue`)
>>   introduced in order to make thing easier(!) for dynamic languages in the
>>   Java VM, see
>> https://bugs.openjdk.java.net/browse/JDK-8136353[JDK-8136353].
>>
>> So, if you want to use session separation with Greninge (or otherwise want
>> to load many Groovy classes repeately), first set a limit on
>> PermGen/Metaspace,
>> then verify that classes can be garbage collected in an environment close
>> to
>> production and that throughput under load would be sufficient (despite the
>> relatively slow class loading performance of Groovy (and Java) classes in
>> the
>> Java VM) and then use it. And don't forget to repeat this at least when
>> you
>> upgrade Groovy to a new version, but possibly also when you upgrade Java.
>>
>> Or see the next section for an alternative...
>> --
>>
>> PS: By the way, very funny how Jochen Theodorou "garbage collected" what I
>> wrote about PhantomReference to a "[...]"...
>>
>> Good luck with Groovy garbage collection.
>>
>> Alain
>>
>
>
>

Reply via email to