Re: [Arch] Class unloading and VM objects reclaim
I might be wrong, but I have a few thoughts as well.

I think that when a class loader is loaded, it should manage its own heap, i.e. a memory area should be associated with each class loader for general objects (objects that will be garbage collected). Strings, however, must be shared globally to meet the requirements imposed by the Java specs.

The main concern is how strings get reclaimed, which depends on the references to them. This could be managed entirely separately by providing a stack manager that links the class loaders to the Strings; the stack memory manager could then reclaim a string once no link to it remains.

Similarly, buffers should be at the class loader's disposal. I think they must be controlled by the objects being gc'd: if the objects are reclaimed, these buffers should be too. However, if we see that buffers are being shared, we can provide a buffer manager that does the same trick as in the string pattern. It will track the links between the two, and when it finds a buffer with zero references it can reclaim the memory.

So by providing a stack memory manager, or I would say linker (class loaders and their string objects), and a buffer manager/linker, we can achieve the required goal, though with some overhead.

On 9/13/05, Tim Ellison [EMAIL PROTECTED] wrote:
> Archie Cobbs wrote:
>> Santiago Gala wrote:
>>> IIRC, the (JVM spec v2) requirement for .equals String literals to be (==) identical only holds for Strings in the same .class file, but I could be wrong.
>>
>> I believe the requirement is stronger than that. Any two String literals (i.e., String constants from class files) that are the same string must be the same object (==). It doesn't matter if they come from different classes.
>
> I agree. All identical string literals must refer to the same instance.
> http://java.sun.com/docs/books/vmspec/2nd-edition/html/ConstantPool.doc.html#73272
>
>>> In any case, interning all VM Strings in some sort of global weak set would do the trick, as shared literals would remain referenced on classloader collection, while unreferenced ones would be eligible for collection. This is typically how symbols in Lisp or Smalltalk are managed (most implementations don't even bother to use a weak structure, so symbols are never collected). Basically this is what String.intern() does, and nothing prevents us from actually using it for all Strings in the VM.
>>
>> This is how Classpath works. Strings are intern'd using a weak hash map and String.intern() is implemented in pure Java. The JVM does nothing special for strings, with the exception that when it creates them from class files it also interns them.
>
> I'd speculate that the VM interns many more strings loading classes than typical Java code calls to the API, so provided the call into Java is low-cost this works fine.
>
>> -Archie
>> Archie Cobbs * CTO, Awarix * http://www.awarix.com
>
> --
> Tim Ellison ([EMAIL PROTECTED]) IBM Java technology centre, UK.

--
Usman Bashir
Certified IBM XML Solution Developer
Certified UML Developer
Brainbench Certified Internet Professional [advanced] (BCIP)
Brainbench Certified Java Professional (BCJP)
Brainbench Certified .NET Professional
Brainbench Certified C++ Professional (BCCP)
Software engineer IT24
Faculty Member Operation Badar Lahore
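The "linker" idea in this message (track which class loaders reference each shared string or buffer, and reclaim an item once its link count drops to zero) might look roughly like the sketch below. It is only an illustration under assumptions: `SharedLinker`, `link`, and `unloadLoader` are hypothetical names that do not come from the thread or from any VM, and a real VM would track native structures rather than Java objects.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical linker: records which loaders reference each shared item. */
final class SharedLinker<L, T> {
    // shared item -> set of loaders currently linked to it
    private final Map<T, Set<L>> owners = new HashMap<>();

    /** Record that the given loader references the shared item. */
    void link(L loader, T item) {
        owners.computeIfAbsent(item, k -> new HashSet<>()).add(loader);
    }

    /** Drop every link held by a loader and return the items left with zero links. */
    Set<T> unloadLoader(L loader) {
        Set<T> reclaimable = new HashSet<>();
        owners.entrySet().removeIf(entry -> {
            entry.getValue().remove(loader);
            if (entry.getValue().isEmpty()) {
                reclaimable.add(entry.getKey());
                return true;   // no loader references this item any more
            }
            return false;      // still shared with another loader
        });
        return reclaimable;
    }

    /** True while at least one loader is linked to the item. */
    boolean isTracked(T item) {
        return owners.containsKey(item);
    }
}
```

The bookkeeping on every link and unload is the overhead the message mentions; the weak-set interning discussed elsewhere in the thread gets a similar effect for strings without explicit link counts.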
Re: [Arch] Class unloading and VM objects reclaim
usman bashir wrote:
> I think that when a class loader is loaded, it should manage its own heap, i.e. a memory area should be associated with each class loader for general objects (objects that will be garbage collected). Strings, however, must be shared globally to meet the requirements imposed by the Java specs.

Hmm.. so you're basically saying partition the entire heap according to class loader (of each instance's class)? You could do that and it would work... but what do you gain? (Not a rhetorical question, I just can't think of anything.)

By the way, no special handling for Strings is necessary; java.lang.String is always loaded by the boot loader.

-Archie
Archie Cobbs * CTO, Awarix * http://www.awarix.com
Re: [Arch] Class unloading and VM objects reclaim
> I don't see why both methods of memory allocation and clean up may not be employed. The classLoader may manage its own memory area and create objects that only it uses in the main memory area. The main memory area objects would be true objects and be gc'ed just like normal (so long as the class/classloader was given an object sig and registered its creation and release of these like any normal object would) and the local memory would be freed when the class/classLoader was removed.

In JikesRVM with MMTk, the issue of where the classloader allocates its objects is essentially configurable. You can select by call-site and/or by class which area of the heap objects are allocated into. By default all classloader-allocated objects are allocated into the standard heap, but it's fairly common when doing experiments or debugging a new collector to allocate these objects into an immortal space.

While MMTk doesn't (yet) support region-based allocation, it would be possible to have per-classloader objects allocated into regions, where they can be freed en masse. What we can't do at the moment is choose whether or not to scan these objects at GC time, but my feeling is that, particularly in a generational collector, it's not a significant overhead.

The point I want to make is that I think a good design should make the issue of classloader allocation policy configurable, if not at run time then certainly at build time.

> This also brings up questions as to if we want to control all memory allocation in the JVM instance, simply using the OS to increase and decrease the JVM size, or if the OS will be used to assign separate memory areas to separate pieces of the JVM?

MMTk divides up virtual memory by collection policy. For example, in the generational mark-sweep collector, 15% of available virtual memory is set aside at the top end of the available virtual address space for a nursery, and 60% of virtual memory is set aside for the mature space (plus other areas for large objects, metadata etc). As objects are allocated, MMTk requests memory from the OS, mapped into the appropriate space.

It is useful to be able to specify certain characteristics of the virtual memory regions, e.g. that the nursery is at higher addresses, because the efficiency of the read and write barriers depends on being able to quickly identify nursery objects, and having the nursery in high addresses means you can do that with a single comparison. Other barriers have other requirements.

Is this what you mean by "the OS will be used to assign separate memory areas to separate pieces of the JVM"? In which case, I'd argue that it's highly desirable.

cheers
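To illustrate why a nursery placed at the top of the address space allows a single-comparison test, here is a minimal sketch of a generic generational write barrier. It is not MMTk's code: `NurseryBarrier`, its methods, and the use of plain `long` values as addresses are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical layout: the nursery occupies every address at or above nurseryStart. */
final class NurseryBarrier {
    private final long nurseryStart;                          // lowest nursery address (assumed)
    private final List<Long> rememberedSlots = new ArrayList<>();

    NurseryBarrier(long nurseryStart) {
        this.nurseryStart = nurseryStart;
    }

    /** One unsigned comparison is enough because nothing is mapped above the nursery. */
    boolean inNursery(long address) {
        return Long.compareUnsigned(address, nurseryStart) >= 0;
    }

    /** Write barrier: remember mature-space slots that now point into the nursery. */
    void onReferenceStore(long slotAddress, long targetAddress) {
        if (!inNursery(slotAddress) && inNursery(targetAddress)) {
            rememberedSlots.add(slotAddress);
        }
    }

    /** Slots a nursery collection would treat as extra roots. */
    List<Long> rememberedSlots() {
        return rememberedSlots;
    }
}
```

If the nursery sat in the middle of the address range instead, `inNursery` would need both a lower and an upper bound check, which is the extra cost the message alludes to.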
Re: [Arch] Class unloading and VM objects reclaim
Archie Cobbs wrote:
> Santiago Gala wrote:
>> IIRC, the (JVM spec v2) requirement for .equals String literals to be (==) identical only holds for Strings in the same .class file, but I could be wrong.
>
> I believe the requirement is stronger than that. Any two String literals (i.e., String constants from class files) that are the same string must be the same object (==). It doesn't matter if they come from different classes.

I agree. All identical string literals must refer to the same instance.
http://java.sun.com/docs/books/vmspec/2nd-edition/html/ConstantPool.doc.html#73272

>> In any case, interning all VM Strings in some sort of global weak set would do the trick, as shared literals would remain referenced on classloader collection, while unreferenced ones would be eligible for collection. This is typically how symbols in Lisp or Smalltalk are managed (most implementations don't even bother to use a weak structure, so symbols are never collected). Basically this is what String.intern() does, and nothing prevents us from actually using it for all Strings in the VM.
>
> This is how Classpath works. Strings are intern'd using a weak hash map and String.intern() is implemented in pure Java. The JVM does nothing special for strings, with the exception that when it creates them from class files it also interns them.

I'd speculate that the VM interns many more strings loading classes than typical Java code calls to the API, so provided the call into Java is low-cost this works fine.

> -Archie
> Archie Cobbs * CTO, Awarix * http://www.awarix.com

--
Tim Ellison ([EMAIL PROTECTED]) IBM Java technology centre, UK.
Re: [Arch] Class unloading and VM objects reclaim
On Thu, 08-09-2005 at 15:51 +0100, Peter Edworthy wrote:
> (...snip...)
> [long version]
> Cleaning up Strings that are only used in a particular classloader has a slight catch, because of the 'Strings containing the same characters are the same object' clause. This requires keeping a list of all currently created String objects that is checked at String object creation time to see if a reference to an existing object should be used or a new String object created. This list has to be outside of the gc link checking or no String would be gc'ed, but the list must also be informed when a String is gc'ed so as to remove it.

IIRC, the (JVM spec v2) requirement for .equals String literals to be (==) identical only holds for Strings in the same .class file, but I could be wrong.

In any case, interning all VM Strings in some sort of global weak set would do the trick, as shared literals would remain referenced on classloader collection, while unreferenced ones would be eligible for collection. This is typically how symbols in Lisp or Smalltalk are managed (most implementations don't even bother to use a weak structure, so symbols are never collected).

Basically this is what String.intern() does, and nothing prevents us from actually using it for all Strings in the VM.

Regards
Santiago

--
VP and Chair, Apache Portals (http://portals.apache.org)
Apache Software Foundation
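A global weak set of the kind described here can be sketched in plain Java on top of `java.util.WeakHashMap`. This is an illustrative sketch only, not the Classpath or Harmony implementation; the class name `WeakInternTable` is hypothetical.

```java
import java.lang.ref.WeakReference;
import java.util.WeakHashMap;

/** Hypothetical weak intern table: canonical strings stay only while something else references them. */
final class WeakInternTable {
    // Key and value both refer weakly to the canonical instance, so the
    // table itself never keeps a string alive.
    private final WeakHashMap<String, WeakReference<String>> table = new WeakHashMap<>();

    /** Return the canonical instance equal to s, registering s if none exists. */
    synchronized String intern(String s) {
        WeakReference<String> ref = table.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            canonical = s;
            table.put(s, new WeakReference<>(s));
        }
        return canonical;
    }
}
```

Because the table holds no strong references, it sits "outside of the gc link checking" as the quoted message requires, and `WeakHashMap` drops an entry on its own once the key has been collected, so no separate notification to the list is needed.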
Re: [Arch] Class unloading and VM objects reclaim
Santiago Gala wrote:
> IIRC, the (JVM spec v2) requirement for .equals String literals to be (==) identical only holds for Strings in the same .class file, but I could be wrong.

I believe the requirement is stronger than that. Any two String literals (i.e., String constants from class files) that are the same string must be the same object (==). It doesn't matter if they come from different classes.

> In any case, interning all VM Strings in some sort of global weak set would do the trick, as shared literals would remain referenced on classloader collection, while unreferenced ones would be eligible for collection. This is typically how symbols in Lisp or Smalltalk are managed (most implementations don't even bother to use a weak structure, so symbols are never collected). Basically this is what String.intern() does, and nothing prevents us from actually using it for all Strings in the VM.

This is how Classpath works. Strings are intern'd using a weak hash map and String.intern() is implemented in pure Java. The JVM does nothing special for strings, with the exception that when it creates them from class files it also interns them.

-Archie
Archie Cobbs * CTO, Awarix * http://www.awarix.com
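The cross-class identity requirement can be observed from ordinary Java code. The small sketch below uses hypothetical class names (`LiteralsA`, `LiteralsB`, `LiteralIdentityDemo`); each class loads the literal from its own constant pool, yet both end up with the same instance.

```java
/** Two unrelated classes that each carry the literal "hello" in their own constant pool. */
final class LiteralsA {
    static String greeting() { return "hello"; }
}

final class LiteralsB {
    static String greeting() { return "hello"; }
}

final class LiteralIdentityDemo {
    /** Identical literals from different classes resolve to one interned instance. */
    static boolean sameLiteralAcrossClasses() {
        return LiteralsA.greeting() == LiteralsB.greeting();
    }

    /** A string built at run time is a separate object until it is interned. */
    static boolean runtimeStringJoinsPoolOnIntern() {
        String built = new String("hello");
        return built != LiteralsA.greeting()
                && built.intern() == LiteralsA.greeting();
    }
}
```

Both methods return `true` on a conforming VM, which is the behaviour the interning-at-class-load approach described above has to preserve.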
Re: [Arch] Class unloading and VM objects reclaim
Xiao-Feng Li wrote:
> Archie, thanks for the info, the approach looks like the second one I suggested. We can take it at first.

Hmmm.. guess I don't understand what you meant by "The former approach can reclaim as many VM objects as possible".. could you explain?

> Btw, there are some situations where I am not sure the approach you mentioned is possible. For example, we may also want to reclaim VM structures for strings that are not associated with a class loader, or to reclaim some jitted code buffers before the class loader is unloaded, or sometimes we want to put certain data/code close together for better cache/branch effects. Suggestions?

Of course, anything not associated with a class loader has to be allocated from somewhere else...

The per-loader memory could be a stack, or a (malloc style) heap. The latter allows more freedom regarding when you free things, but at the cost of higher allocation times. In many cases, per-loader structures are alloc'd once and never free'd until unloading, so stacks can work pretty well. Stacks also allow freeing on error conditions (you can always free the last thing allocated).

Whether you have per-loader data structures that may need to be freed before class unloading depends on how the VM works; in JCVM there are none, so life is simpler and a stack structure works fine. If a VM did have this, it could use a malloc style heap instead of a stack, etc.

-Archie
Archie Cobbs * CTO, Awarix * http://www.awarix.com
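The stack discipline described in this message (bump allocation, plus the ability to free the most recent allocation when an error occurs) can be sketched as follows. This is not JCVM code: `LoaderStack` and its methods are hypothetical, and offsets into a `byte[]` stand in for native pointers.

```java
/** Hypothetical per-loader memory stack; offsets into a byte array stand in for pointers. */
final class LoaderStack {
    private final byte[] memory;
    private int top;   // offset of the first free byte

    LoaderStack(int capacity) {
        memory = new byte[capacity];
    }

    /** Bump allocation: reserve size bytes and return the offset of the block. */
    int allocate(int size) {
        if (size < 0 || size > memory.length - top) {
            throw new IllegalStateException("loader stack exhausted");
        }
        int offset = top;
        top += size;
        return offset;
    }

    /** Free the most recent allocation(s) by rolling back to an earlier offset. */
    void releaseTo(int offset) {
        if (offset < 0 || offset > top) {
            throw new IllegalArgumentException("offset is not inside the allocated region");
        }
        top = offset;
    }

    /** Bytes currently in use by this loader. */
    int used() {
        return top;
    }
}
```

A caller that fails halfway through loading a class could record `used()` before it starts and call `releaseTo` with that value on failure. Freeing a block in the middle is not possible here, which is the trade-off against a malloc style heap mentioned above.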
[Arch] Class unloading and VM objects reclaim
Hi, folks,

Class unloading is an optimization Harmony wants to have, and it should be straightforward to implement the class unloading semantics proposed by the Java Language Specification. We want to push this a bit further to allow more VM data structures to be reclaimed. We call them VM objects, as opposed to App objects, because they are maintained by the VM.

There are basically two approaches to unloading classes. One is to encode the VM objects similarly to App objects, with a header, so that the GC can treat them (almost) uniformly. The other approach treats class unloading specially and reclaims a class loader together with all its associated VM objects.

The former approach can reclaim as many VM objects as possible (besides other benefits such as code placement) but requires more GC overhead, while the latter approach can reclaim a class loader's related objects all together if they are arranged properly.

I think the latter approach is good enough for Harmony to have initially. The uniform GC approach can be introduced later for memory-constrained systems. Do you folks agree?

Thanks,
xiaofeng
==
Intel Managed Runtime Division
Re: [Arch] Class unloading and VM objects reclaim
I totally agree with Xiao-Feng's proposal.

Best Regards,
William Wu
Software Engineer
Sybase Shanghai RD Center
Room 1202-1206, Building One, Zhangjiang Semiconductor Industry Park
3000 Longdong Avenue
Pudong, Shanghai 201203
Tel: 8621-68799918 ext 3081
Fax: 8621-68790199
Email: [EMAIL PROTECTED]
Re: [Arch] Class unloading and VM objects reclaim
Xiao-Feng Li wrote:
> There are basically two approaches to unloading classes. One is to encode the VM objects similarly to App objects, with a header, so that the GC can treat them (almost) uniformly. The other approach treats class unloading specially and reclaims a class loader together with all its associated VM objects.
>
> The former approach can reclaim as many VM objects as possible (besides other benefits such as code placement) but requires more GC overhead, while the latter approach can reclaim a class loader's related objects all together if they are arranged properly.

You can get the benefits of both approaches using per-class loader memory areas like SableVM and JCVM. Each class loader has its own stack of memory. All loader-related memory is allocated from that stack (including possibly java.lang.Class objects). Then when you unload the loader you free the entire stack at once. During GC you treat a class loader's stack as a single giant object.

For more info:
http://jcvm.sourceforge.net/share/jc/doc/jc.html#GC%20and%20Class%20Loaders

Cheers,
-Archie
Archie Cobbs * CTO, Awarix * http://www.awarix.com
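As a rough model of "treat a class loader's stack as a single giant object", the sketch below keeps one area per loader and lets a collection step keep or drop each area as a unit. It does not reflect SableVM or JCVM internals: `LoaderAreas`, its methods, and the use of loader names as keys are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical registry of per-loader memory areas. */
final class LoaderAreas {
    // loader name -> every VM structure allocated on behalf of that loader
    private final Map<String, List<Object>> areas = new HashMap<>();

    /** Allocate a VM structure into the area owned by the given loader. */
    <T> T allocate(String loader, T vmObject) {
        areas.computeIfAbsent(loader, k -> new ArrayList<>()).add(vmObject);
        return vmObject;
    }

    /** GC step: each area lives or dies as a unit, based only on its loader's reachability. */
    List<String> collect(Set<String> reachableLoaders) {
        List<String> unloaded = new ArrayList<>();
        areas.keySet().removeIf(loader -> {
            if (reachableLoaders.contains(loader)) {
                return false;          // loader is live: keep the whole area
            }
            unloaded.add(loader);      // loader is dead: the whole area goes at once
            return true;
        });
        return unloaded;
    }

    /** Number of structures currently held for a loader (0 if it was unloaded). */
    int objectCount(String loader) {
        List<Object> area = areas.get(loader);
        return area == null ? 0 : area.size();
    }
}
```

The collector never inspects individual structures inside an area, which is where the saving in GC overhead comes from; the cost is that nothing in an area can be reclaimed before its loader is.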
Re: [Arch] Class unloading and VM objects reclaim
Archie, thanks for the info, the approach looks like the second one I suggested. We can take it at first.

Btw, there are some situations where I am not sure the approach you mentioned is possible. For example, we may also want to reclaim VM structures for strings that are not associated with a class loader, or to reclaim some jitted code buffers before the class loader is unloaded, or sometimes we want to put certain data/code close together for better cache/branch effects. Suggestions?

Thanks,
xiaofeng
==
Intel Managed Runtime Division

On 9/8/05, Archie Cobbs [EMAIL PROTECTED] wrote:
> Xiao-Feng Li wrote:
>> There are basically two approaches to unloading classes. One is to encode the VM objects similarly to App objects, with a header, so that the GC can treat them (almost) uniformly. The other approach treats class unloading specially and reclaims a class loader together with all its associated VM objects.
>>
>> The former approach can reclaim as many VM objects as possible (besides other benefits such as code placement) but requires more GC overhead, while the latter approach can reclaim a class loader's related objects all together if they are arranged properly.
>
> You can get the benefits of both approaches using per-class loader memory areas like SableVM and JCVM. Each class loader has its own stack of memory. All loader-related memory is allocated from that stack (including possibly java.lang.Class objects). Then when you unload the loader you free the entire stack at once. During GC you treat a class loader's stack as a single giant object.
>
> For more info:
> http://jcvm.sourceforge.net/share/jc/doc/jc.html#GC%20and%20Class%20Loaders
>
> Cheers,
> -Archie
> Archie Cobbs * CTO, Awarix * http://www.awarix.com