On 09/04/2012 09:18, Ofer Israeli wrote:
> On 08/04/2012 23:14, Stefan Mayr <ste...@mayr-stefan.de> wrote:
>> On 08.04.2012 18:41, Ofer Israeli wrote:
>>> 2012/4/6 Pid<p...@pidster.com>:
>>>> On 05/04/2012 22:17, Ofer Israeli wrote:
>>>>> Y
>>>>>
>>>>> On 5 Apr 2012, at 18:58, "Konstantin Kolinko"<knst.koli...@gmail.com> wrote:
>>>>>
>>>>>> 2012/4/5 Ofer Israeli<of...@checkpoint.com>:
>>>>>>> Mark Thomas wrote:
>>>>>>>> On 04/04/2012 17:02, Ofer Israeli wrote:
>>>>>>>>
>>>>>>>> Once you have an OOME all bets are off. The JVM needs to be restarted.
>>>>>>>> There is no guarantee of reliable operation after an OOME.
>>>>>>>>
>>>>>>>> Mark
>>>>>>>
>>>>>>> Hi Mark,
>>>>>>> I agree that in such a situation the JVM should be restarted, but it
>>>>>>> isn't restarted by Tomcat. On the other hand, Tomcat does take some
>>>>>>> precautionary actions and kills the accepting thread, but in such a
>>>>>>> case it should also close the socket that thread is listening on;
>>>>>>> otherwise it is leaving garbage around after the thread's death.
>>>>>>> Do you see any reason not to close the listening socket?
>>>>>>>
>>>>>>
>>>>>> 1. Tomcat does not start JVM thus it cannot restart it.
>>>>>>
>>>>>> You need some external tool or script or admin to perform
>>>>>> monitoring and (re)starts.
>>>>>>
>>>>>> 2. OOM can happen at a random place. Once it happens, it is likely
>>>>>> that other places will also start to fail randomly. It is also
>>>>>> likely that your attempts to recover will fail as well.
>>>>>>
>>>>>> Mark already mentioned it: "all bets are off".
>>>>>>
>>>>>> Best regards,
>>>>>> Konstantin Kolinko
>>>>>>
>>>>> Hi Konstantin,
>>>>>
>>>>> I agree regarding the OOM bringing TC to a state where it must be
>>>>> restored, but my point remains: if there is code that handles
>>>>> catching this exception and terminating the thread, why not terminate
>>>>> gracefully by closing the listening socket before killing the thread?
>>>>
>>>> And your point has been answered. After an OOM the JVM is in an
>>>> unknown, unsafe state so a restart MUST occur to restore service.
>>>>
>>>> Closing a socket gracefully after an OOM is a bit like trying to shut
>>>> one of the portholes on the Titanic, shortly after hearing a large
>>>> crashing sound.
>>>>
>>>>
>>>> There's only one place I know of where Tomcat attempts to interact
>>>> with OOM conditions and this is not one of them, so I don't believe
>>>> it's safe to say that Tomcat is deliberately handling this exception.
>>>>
>>>> NB an OOM is an Error, not an Exception - it is a subclass of
>>>> VirtualMachineError, which is thrown to indicate that the Java
>>>> Virtual Machine is broken or has run out of resources necessary for
>>>> it to continue operating.
>>>>
>>>> An Error is a subclass of Throwable that indicates serious problems
>>>> that a reasonable application should not try to catch.
>>>> </end-quote>
>>>>
>>>> If anything, the locations where Tomcat catches a Throwable should be
>>>> modified so it does *not* catch Errors, rather than continuing to do
>>>> so and then attempting a trivial tidy-up.
>>>>
>>>>
>>>> p
>>>
>>> Thanks for your input - you're right regarding the error and the fact that
>>> Tomcat is indeed catching a Throwable and not an Exception. I assume that if
>>> the Throwable were not caught, then the thread would die in any case.
>>> Although stated before that Tomcat could not kill itself in such a
>>> situation, I still wonder if it would be possible to do so. Or taking a
>>> different perspective on this: if the JVM specification is such that it
>>> cannot be trusted to continue running after an OOM, then why does it not
>>> kill itself or restart itself?
>>>
>>
>> I guess you can do this with some vendor specific JVM arguments as
>> SUNs/Oracles -XX:OnOutOfMemoryError:
>> http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
>>
>> Different findings like "kill -9 %p" let me suspect that you can use %p as a
>> variable for your current pid. With that you can either kill your current
>> instance and let your monitoring handle the rest or try to initiate the
>> restart by yourself.
>>
>> Give it a try
>>
>> Stefan
>>
> Thanks Stefan - will look into this option.

Be careful using that option; if you are also producing a heap dump you
should wait until the heap dump has finished writing before stopping the
process, or the heap dump will be incomplete and therefore useless.


p

-- 

[key:62590808]
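
The caution above about heap dumps could be handled by a small helper that the OnOutOfMemoryError command calls instead of a bare "kill -9 %p". What follows is only a rough sketch in Python, not anything Tomcat or the JVM ships. The names wait_until_stable and stop_after_dump are made up for illustration, and the dump path, polling interval and timeout are assumptions you would adapt to your own setup. It treats "the file size has stopped changing" as a sign that the dump is finished, which is a heuristic rather than a guarantee, and it uses SIGKILL, so it is POSIX-only.

```python
import os
import signal
import time


def wait_until_stable(path, interval=1.0, checks=3, timeout=600.0):
    """Poll a file until its size stops changing.

    Returns True once the file has kept the same non-zero size for
    `checks` consecutive polls, or False if `timeout` seconds pass first.
    A stable size is only a heuristic for "the dump has been written".
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # the dump file does not exist yet
        if size > 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            stable = 0
        last_size = size
        time.sleep(interval)
    return False


def stop_after_dump(pid, dump_path, **kwargs):
    """Wait for the heap dump (hypothetical path), then kill the JVM.

    The process is killed even if the wait times out, so that monitoring
    can restart the service; the return value says whether the dump
    looked complete at that point.
    """
    finished = wait_until_stable(dump_path, **kwargs)
    os.kill(pid, signal.SIGKILL)
    return finished
```

If %p really does expand to the pid, as Stefan suspects, the pid could be passed to such a script as an argument together with the dump file location. The script kills the process even on timeout because, as said earlier in the thread, the JVM cannot be trusted after an OOM anyway.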