I have been battling a memory problem for a few months now with no luck. I have used profilers and everything else you could imagine. The problem is that we have never been able to reproduce it on the testing servers. It only happens on production, which does not see much load at all (about 100 users a day). I noticed something very odd yesterday, though. I am running with "-XX:+UseConcMarkSweepGC" and "-Xms512m". I was monitoring the garbage collection activity very closely, and performance was good for 3 days: I was seeing about one garbage collection every 5 minutes in catalina.out. Then, out of nowhere, the garbage collection started going crazy, at about one collection per second. I checked the CPU usage and it was stuck at around 60%. There were only 2-4 people logged in at a time. The crazy garbage collection went on for a couple of hours before it eventually crapped out with "Out of Memory" errors. It almost seems like one section of the app caused this sudden outburst, but there is no way our QA department hasn't gone over the same section. I was hoping someone has seen or heard of this before. Below is my environment.
Thanks,

Apache 2.0.47 using mod_jk2
2 * Tomcat 4.1.24
J2SDK 1.4.2_04