On Mar 10, 2008, at 3:05 PM, Mike Wynholds wrote:
Scott and others-
My client has a five-node Resin Pro cluster, with each node running
version 3.1.2.
Today one of the nodes experienced an OutOfMemoryError which did
not bring Resin down but seems to have put it in a completely
unresponsive state.
Do you have <memory-free-min> set? Resin should restart itself
before it gets to OOM.
The problem with OOM is that errors and behavior start becoming
undefined. Basically, it's not possible to really handle OOM other
than restarting the system. The <memory-free-min> makes sure Resin
restarts before that situation occurs.
-- Scott
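
To make the idea behind a free-memory threshold concrete, here is a minimal Java sketch of the kind of check involved. It is not Resin's implementation: the class name MemoryFreeCheck, its method names, and the 1 MB threshold are all hypothetical, and a real server would run such a check periodically and restart itself rather than print a message.

```java
// Illustrative sketch only; not Resin's code. All names here are hypothetical.
final class MemoryFreeCheck {

    // Memory the JVM can still use: heap not yet claimed from the OS
    // (max - total) plus the unused part of the heap already claimed (free).
    static long availableBytes(long maxBytes, long totalBytes, long freeBytes) {
        return (maxBytes - totalBytes) + freeBytes;
    }

    // True when available memory has dropped below the configured minimum,
    // i.e. the point at which a restart should happen before an OOM does.
    static boolean shouldRestart(long availableBytes, long minFreeBytes) {
        return availableBytes < minFreeBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long available = availableBytes(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
        long minFree = 1024L * 1024L; // assumed threshold of 1 MB

        if (shouldRestart(available, minFree)) {
            System.out.println("Free memory low (" + available + " bytes); a restart would be triggered here");
        } else {
            System.out.println("Free memory OK (" + available + " bytes)");
        }
    }
}
```

The point of acting on a threshold rather than on the error itself is the one made above: once the JVM has actually run out of memory, behavior is undefined, so the decision has to be taken while there is still headroom.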
Within 10 minutes or so of that happening, the other four servers stopped
responding as well. Their logs show that they were continuously
getting socket timeouts while trying to communicate with the first
server for session clustering. (Stack trace below).
To be fair, this is not the only exception being thrown. We also
see our distributed EhCache system unsuccessfully trying to
replicate itself. And we *also* see the occasional Hessian
exception happening (also below). Ultimately the server just gets
so bogged down, it seems, that it needs to be restarted.
So my question is this:
Assuming a Resin node runs out of memory, is there a way for other
Resin nodes to detect that and take the same action as if the node
were actually down? I’m not sure this is really a bug, but it is
probably a good super-edge-case scenario worth thinking about.
We are currently looking at our watchdog process config to see why
it did not auto-restart Resin. I think we didn’t give enough memory
buffer for the watchdog to detect a needed restart, and our app lost
responsiveness before the watchdog could restart it. But that’s
just a theory.
I am interested in feedback from Scott and other Caucho developers
about this issue, as well as other Resin users who may have
experienced issues like this before and have any thoughts or
suggestions on the matter.
Thanks.
..mike..
--- Socket Timeout stack trace (partial) ---
[14:47:10.389] java.net.SocketTimeoutException: Read timed out
[14:47:10.389] at java.net.SocketInputStream.socketRead0(Native Method)
[14:47:10.389] at java.net.SocketInputStream.read(SocketInputStream.java:129)
[14:47:10.389] at com.caucho.vfs.TcpStream.read(TcpStream.java:163)
[14:47:10.389] at com.caucho.vfs.ReadStream.readBuffer(ReadStream.java:1001)
[14:47:10.389] at com.caucho.vfs.ReadStream.read(ReadStream.java:306)
[14:47:10.389] at com.caucho.server.cluster.ClusterStore.updateAccess(ClusterStore.java:856)
[14:47:10.389] at com.caucho.server.cluster.ClusterStore.accessServer(ClusterStore.java:823)
[14:47:10.389] at com.caucho.server.cluster.ClusterStore.accessImpl(ClusterStore.java:804)
[14:47:10.389] at com.caucho.server.cluster.ClusterObject.access(ClusterObject.java:337)
[14:47:10.389] at com.caucho.server.session.SessionImpl.setAccess(SessionImpl.java:839)
[14:47:10.389] at com.caucho.server.session.SessionManager.load(SessionManager.java:1477)
[14:47:10.389] at com.caucho.server.session.SessionManager.getSession(SessionManager.java:1335)
[14:47:10.389] at com.caucho.server.connection.AbstractHttpRequest.createSession(AbstractHttpRequest.java:1455)
[14:47:10.389] at com.caucho.server.connection.AbstractHttpRequest.getSession(AbstractHttpRequest.java:1270)
[14:47:10.389] at net.sf.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:172)
[14:47:10.389] at net.sf.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:303)
[14:47:10.389] at net.sf.acegisecurity.util.FilterChainProxy.doFilter(FilterChainProxy.java:173)
[14:47:10.389] at net.sf.acegisecurity.util.FilterToBeanProxy.doFilter(FilterToBeanProxy.java:125)
[14:47:10.389] at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:73)
--- Hessian failure stack trace ---
[14:15:00.065] Caused by: org.springframework.web.util.NestedServletException: Hessian skeleton invocation failed; nested exception is java.io.IOException: expected 'c' in hessian input at -1
[14:15:00.065] at org.springframework.remoting.caucho.HessianServiceExporter.handleRequest(HessianServiceExporter.java:150)
[14:15:00.065] at org.springframework.web.servlet.mvc.HttpRequestHandlerAdapter.handle(HttpRequestHandlerAdapter.java:49)
[14:15:00.065] at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:857)
[14:15:00.065] at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:792)
[14:15:00.065] at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:475)
[14:15:00.065] at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:440)
[14:15:00.065] at javax.servlet.http.HttpServlet.service(HttpServlet.java:153)
[14:15:00.065] at javax.servlet.http.HttpServlet.service(HttpServlet.java:91)
[14:15:00.065] at com.caucho.server.dispatch.ServletFilterChain.doFilter(ServletFilterChain.java:103)
[14:15:00.065] at net.sf.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:292)
[14:15:00.065] at taylor.tops.security.UserTrackerFilter.doFilter(UserTrackerFilter.java:27)
[14:15:00.065] at net.sf.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:303)
[14:15:00.065] at net.sf.acegisecurity.intercept.web.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:84)
[14:15:00.065] at net.sf.acegisecurity.intercept.web.SecurityEnforcementFilter.doFilter(SecurityEnforcementFilter.java:182)
[14:15:00.065] ... 18 more
[14:15:00.065] Caused by: java.io.IOException: expected 'c' in hessian input at -1
[14:15:00.065] at org.springframework.remoting.caucho.Hessian2SkeletonInvoker.invoke(Hessian2SkeletonInvoker.java:51)
[14:15:00.065] at org.springframework.remoting.caucho.HessianServiceExporter.handleRequest(HessianServiceExporter.java:147)
[14:15:00.065] ... 31 more
.....
Michael Wynholds
President
Carbon Five, Inc.
310 821 7125 x13
[EMAIL PROTECTED]
_______________________________________________
resin-interest mailing list
[email protected]
http://maillist.caucho.com/mailman/listinfo/resin-interest