The main GC feature here is the Thread-Local Allocation Buffer (TLAB). TLABs are on by default and are "automatically sized according to allocation patterns". The size can also be fine-tuned with the -XX:TLABSize=n configuration option. You may consider tweaking this setting to optimize runtime. Ideally, everything that one call to your function allocates should fit into a TLAB, because it is all garbage upon exit. Allocation inside a TLAB is ultra-fast and completely concurrent, since each thread allocates from its own buffer without synchronizing with other threads.
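To try the effect of the TLAB setting, a small allocation-heavy program is handy. The one below is only a rough sketch: the class name TlabDemo, the work() function and the iteration count are made up for illustration and are not taken from Lee's benchmark. HotSpot's escape analysis may also remove some of these allocations, so treat any timings as indicative at best.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TlabDemo {
    // Hypothetical workload: every iteration allocates a small array
    // that is garbage as soon as the iteration ends.
    static long work(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            int[] tmp = new int[16];
            tmp[i & 15] = i;
            sum += tmp[i & 15];
        }
        return sum;
    }

    // Runs the same workload on several threads and adds up the results.
    static long runParallel(int threads, final int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> task = () -> work(iterations);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        long start = System.nanoTime();
        long total = runParallel(threads, 50_000_000);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(threads + " threads, checksum " + total + ", " + ms + " ms");
    }
}
```

Running it once as `java -XX:TLABSize=512k TlabDemo` and once as `java -XX:-UseTLAB TlabDemo` (TLABs switched off) gives a crude comparison. The 512k value is just an example, not a recommendation.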
Configure TLAB <http://docs.oracle.com/javase/7/docs/technotes/tools/windows/java.html>

On Sunday, December 9, 2012 7:37:09 PM UTC+1, Andy Fingerhut wrote:
>
> On Dec 9, 2012, at 6:25 AM, Softaddicts wrote:
>
> > If the number of object allocations mentioned earlier in this thread are real,
> > yes vm heap management can be a bottleneck. There has to be some
> > locking done somewhere otherwise the heap would corrupt :)
> >
> > The other bottleneck can come from garbage collection which has to freeze
> > object allocation completely or partially.
> >
> > This internal process has to reclaim unreferenced objects otherwise you may end up
> > exhausting the heap. That can even suspend your app while gc is running depending
> > on the strategy used.
>
> Agreed that memory allocation and garbage collection will in some cases
> need to coordinate between threads to work in the general case of arbitrary
> allocations and GCs.
>
> However, one could have a central list of large "pages" of free memory
> (e.g. a few MBytes, or maybe even larger), and pass these out to concurrent
> memory allocators in these large chunks, and let them do small object
> allocations and GC within each thread completely concurrently.
>
> The only times locking of any kind might be needed with such a strategy
> would be when one of the parallel threads requests a new big page from the
> central free list, or returned a completely empty free page back to the
> central list that it didn't need any more. All other memory allocation and
> GC could be completely concurrent. The idea of making those "pages" large
> is that such passing pages around would be infrequent, and thus could be
> made to have no measurable synchronization overhead.
>
> That is pretty much what is happening when you run Lee's benchmark
> programs as 1 thread per JVM, but 4 or 8 different JVM processes running
> in parallel.
> In that situation the OS has the central free list of pages,
> and the JVMs manage their small object allocations and GCs completely
> concurrently without interfering with each other.
>
> If HotSpot's JVM could be configured to work like that, he would be seeing
> big speedups in a single JVM.
>
> Andy
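
The scheme described in the quoted message (a central list of large pages, with each thread allocating from its own page without locking) can be sketched roughly as below. This is a toy model, not how HotSpot implements TLABs: addresses are simulated with plain long values, the names CentralPool and LocalAllocator and the 1 MiB chunk size are made up, and collection and returning of empty pages is left out entirely.

```java
import java.util.concurrent.atomic.AtomicLong;

class ChunkAllocatorSketch {
    // Hypothetical page size; the quoted message suggests "a few MBytes".
    static final long CHUNK_SIZE = 1L << 20;

    // The central free list, reduced to a counter of simulated addresses.
    // This is the only state shared between threads.
    static final class CentralPool {
        private final AtomicLong nextChunkBase = new AtomicLong(0);
        private final AtomicLong requests = new AtomicLong(0);

        long takeChunk() {
            requests.incrementAndGet();
            return nextChunkBase.getAndAdd(CHUNK_SIZE);
        }

        long requests() {
            return requests.get();
        }
    }

    // One instance per thread: bump-pointer allocation inside the current chunk.
    static final class LocalAllocator {
        private final CentralPool pool;
        private long top = 0; // next free simulated address
        private long end = 0; // end of the current chunk (exclusive)

        LocalAllocator(CentralPool pool) {
            this.pool = pool;
        }

        long allocate(long size) {
            if (size <= 0 || size > CHUNK_SIZE) {
                throw new IllegalArgumentException("size must be in 1.." + CHUNK_SIZE);
            }
            if (top + size > end) {
                // Chunk exhausted (or none yet): the only step that touches shared state.
                top = pool.takeChunk();
                end = top + CHUNK_SIZE;
            }
            long address = top;
            top += size; // the common case: no synchronization at all
            return address;
        }
    }
}
```

With 64-byte objects and 1 MiB chunks, a thread in this model touches the shared pool once per 16384 allocations, which is the point of making the pages large.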