On 18/02/2016 20:30, Timothy Sipples wrote:
> Huge memory makes it possible to run completely new classes of workloads,
> for example ... Java heaps that never garbage collect during a batch run


I will make the same comment I made last time this topic came up - avoiding garbage collection is not usually a wise goal.

Reasons garbage collection is a good thing:

1) Processor cache. IBM has been making a big thing of processor cache, with SMF type 113 records and RNI (Relative Nest Intensity) classification of workloads, for good reason. Compared to processor cache, memory is VERY slow. I don't recall the figures (if IBM even publishes them), but the general consensus is that main storage is to processor cache what disk is to main storage. You do NOT want your data in main storage if it could be in cache.

Garbage collection moves active data together in the address space. This is very good for processor cache - it makes it more likely that data will all be in a cache closer to the processor. By eliminating dead object space, Java might even utilize cache better than languages like C++.

On the other hand, if you give Java a huge heap to avoid garbage collection, you run the real risk that active data ends up spread thinly across several GB of address space. This is almost the pathological worst case for processor cache usage.
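To make the cache argument a little more concrete, here is a rough sketch (not a proper benchmark). It reads the same number of elements twice: once packed together, once spread thinly across a much larger array. The class name, the element count and the stride of 64 ints are all my own assumptions for illustration. Real cache line sizes vary by processor, and a plain int array is only a stand-in for live objects scattered across a heap.

```java
public class LocalitySketch {
    // Sums `count` elements of `data`, starting at index 0 and stepping
    // `stride` elements each time. stride = 1 reads packed data; a large
    // stride reads the same number of elements spread over more memory.
    static long sumWithStride(int[] data, int count, int stride) {
        long sum = 0;
        int idx = 0;
        for (int i = 0; i < count; i++) {
            sum += data[idx];
            idx += stride;
        }
        return sum;
    }

    public static void main(String[] args) {
        int count = 1 << 18;  // number of "live" elements touched (assumed)
        int stride = 64;      // spacing in ints, i.e. 256 bytes (assumed)
        int[] data = new int[count * stride];  // about 64 MB of ints
        java.util.Arrays.fill(data, 1);

        // Repeat a few rounds so the JIT has a chance to warm up;
        // the early rounds are not representative.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long dense = 0;
            for (int rep = 0; rep < 50; rep++) {
                dense += sumWithStride(data, count, 1);
            }
            long t1 = System.nanoTime();
            long sparse = 0;
            for (int rep = 0; rep < 50; rep++) {
                sparse += sumWithStride(data, count, stride);
            }
            long t2 = System.nanoTime();
            System.out.printf("round %d: packed %d us, spread %d us (sums %d, %d)%n",
                    round, (t1 - t0) / 1000, (t2 - t1) / 1000, dense, sparse);
        }
    }
}
```

Both loops do the same amount of arithmetic, so any difference in the timings comes from where the data sits. The numbers will depend heavily on the machine and JVM, so treat them as an illustration only.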

2) 64-bit performance. (This one is from memory, and is probably much less significant than #1.) 64-bit Java is slightly slower than 32-bit, EXCEPT that if the heap is below a certain size (2GB?) the JVM can perform some optimizations so that performance is the same as 32-bit. Go over that boundary and you lose some performance.

3) It's pretty hard to guarantee that you will never invoke GC. The longer you have been without it, the worse it will be when it finally happens. You will at least suffer the cache penalties, and you might even have to page in parts of the address space. Then everyone wonders what happened, someone points at GC, and the conclusion is that GC mustn't be allowed to happen - when really the problem was the lack of GC up to that point.
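If you want to check whether a batch run really avoided GC, rather than assume it, the standard java.lang.management API can report collection counts and accumulated collection time. A minimal sketch follows. The class name GcReport and the runBatch() method are hypothetical stand-ins I made up for illustration; the MXBean calls themselves are standard Java.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcReport {
    // Total number of collections across all collectors in this JVM.
    // getCollectionCount() returns -1 when the count is undefined,
    // so negative values are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) total += c;
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) total += t;
        }
        return total;
    }

    // Hypothetical stand-in for the real batch work: it just allocates
    // a lot of short-lived arrays.
    static long runBatch() {
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] scratch = new byte[1024];
            scratch[0] = (byte) i;
            checksum += scratch[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();
        long checksum = runBatch();
        System.out.printf("checksum %d, collections during batch: %d, GC time: %d ms%n",
                checksum,
                totalCollections() - countBefore,
                totalCollectionMillis() - millisBefore);
    }
}
```

Whether the count stays at zero depends on the heap size and the collector in use, which is exactly why it is worth measuring instead of guessing.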

GC can be a problem for interactive workloads where response time is critical. In that case deterministic memory management (e.g. C++) might be better. However, batch jobs are about the least likely workload to suffer ill effects from regular "stop the world" GC.


--
Andrew Rowley
Black Hill Software
+61 413 302 386
