On Fri, Oct 29, 2010 at 1:51 PM, Pid <p...@pidster.com> wrote:

> On 29/10/2010 11:17, Ossi wrote:
> > Hi!
> >
> > Should BackupManager work well with any number of nodes?
>
> Yes.
>
> > And with large clusters it should work even better than DeltaManager?
>
> Yes. *Should*.
>
> > We have large production clusters (10+) nodes and we have evaluated if we
> > can use BackupManager.
> >
> > In test cluster of 6 nodes it didn't work too well: much higher request
> > latency, with logs full of following errors:
> >
> > 2010-09-24 14:17:34,536 ERROR [tomcat-processor-53]
> > (org.apache.catalina.tribes.tipis.AbstractReplicatedMap) Unable to replicate
> > out data for a LazyReplicatedMap.get operation
> > org.apache.catalina.tribes.ChannelException: Operation has timed
> > out(3000 ms.).; Faulty members:tcp://{10, 1, 8, 219}:4200;
>
> It's timing out for some reason. You could try increasing the timeout.
>
Yes, I noticed that. However, it is using the same configs as with DeltaManager, and we didn't get those errors with that. What could be the reason for those timeouts? How can I find out which operation is causing the timeout? For example, is it in the initialization/starting phase (so it couldn't connect at all), or is something in replication just taking a lot of time?

I'll test this with different timeouts.

>
> Does this occur on all cluster members, or just a few?
>

Sorry, I don't remember. It has been a while since we ran those tests, and apparently the logs are gone. Gotta check this when I test next time.

>
> p
>
> >     at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
> >     at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
> >     at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
> >     at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
> >     at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
> >     at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
> >     at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
> >     at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
> >     at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
> >     at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
> >     at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
> >     at org.apache.catalina.tribes.group.RpcChannel.send(RpcChannel.java:89)
> >     at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.get(AbstractReplicatedMap.java:844)
> >     at org.apache.catalina.session.ManagerBase.findSession(ManagerBase.java:887)
> >     at org.apache.catalina.connector.Request.doGetSession(Request.java:2363)
> >     at org.apache.catalina.connector.Request.getSession(Request.java:2098)
> >     at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:833)
> >     at javax.servlet.http.HttpServletRequestWrapper.getSession(HttpServletRequestWrapper.java:216)
> >     at com.sulake.habboweb.util.TomcatSessionFixationPreventerFilter$RequestWrapper.getSession(TomcatSessionFixationPreventerFilter.java:72)
> >     .....
> >
> > Yes, I know that documentation says: "Downside of the BackupManager: not
> > quite as battle tested as the delta manager". Maybe this is it. :)
> >
> > Regards,
> > Ossi
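
For reference when trying different timeouts: the 3000 ms in the quoted error matches the documented default of the "timeout" attribute on the Tribes sender transport, and the exception is thrown from ParallelNioSender, so that is probably the setting to raise first. A minimal sketch of the relevant part of server.xml, assuming the stock NIO sender is in use; the value 10000 is only an illustrative guess, and the rest of the cluster configuration (membership, receiver, interceptors) is left out and would stay unchanged:

```xml
<!-- Sketch only: the real Cluster element contains more children than shown here. -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
  <Manager className="org.apache.catalina.ha.session.BackupManager"/>
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <!-- timeout is in milliseconds; the default is 3000.
           10000 is an assumed test value, not a recommendation. -->
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
                 timeout="10000"/>
    </Sender>
  </Channel>
</Cluster>
```

A longer timeout would only hide the symptom if one member is slow to answer the LazyReplicatedMap.get call, so it is still worth noting which members show up under "Faulty members" in the next test run.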