Hello,

We're using the clustering options of the Orion 1.4.5 server. Our setup
consists of three PCs running Windows 2000: one runs the load balancer and
the other two form one island. During our tests we can shut down servers and
power down PCs, and everything works correctly. However, when we disconnect
the network cable from one of the island PCs, state replication goes wrong.
We use the sessionservlet for testing; the following table shows the problem
we encountered ('*' marks the server that handled the request):

orion1                  orion2  sessionID
up                      up*     1
up                      up*     2
up*                     down    3
up*                     up      4
disconnect cable        up*     5
connect cable           up*     6
up                      up*     7
up*                     down    5

After the network cable is disconnected, the Orion server generates a JMS
exception but keeps running. As the table shows, once the cable is
reconnected, state is no longer replicated between the two servers. This
becomes visible when the other server is brought down and the client is
forced onto the reconnected server: the sessionID it returns (5) is an
increment of the state from before the cable was disconnected (4), although
the other server had already reached sessionID 7, so the correct value
should have been 8.
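
To make the expected and observed values concrete, here is a toy model of
the table above. It is only a sketch and not Orion code: the Node class, the
serve method and the relinkAfterReconnect flag are hypothetical names, and it
assumes that each request increments a counter and that the serving node
copies its whole state to the peer whenever the peer is reachable.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the two-node island from the table. Not Orion code;
// every name here is hypothetical and chosen for illustration only.
class ReplicationSketch {

    static class Node {
        int counter = 0;        // the replicated session state
        boolean up = true;      // server process is running
        boolean linked = true;  // node still takes part in replication
    }

    // Handle one request on `serving`, then copy the new state to `peer`
    // only if both nodes can currently exchange replication messages.
    static int serve(Node serving, Node peer) {
        serving.counter++;
        if (serving.linked && peer.up && peer.linked) {
            peer.counter = serving.counter;
        }
        return serving.counter;
    }

    // Replays the eight rows of the table and returns the sessionIDs seen.
    // relinkAfterReconnect = false models the behaviour we observed;
    // true models what we expected to happen after plugging the cable in.
    static List<Integer> run(boolean relinkAfterReconnect) {
        Node orion1 = new Node();
        Node orion2 = new Node();
        List<Integer> seen = new ArrayList<>();

        seen.add(serve(orion2, orion1));       // row 1: orion2 serves
        seen.add(serve(orion2, orion1));       // row 2: orion2 serves
        orion2.up = false;                     // orion2 is shut down
        seen.add(serve(orion1, orion2));       // row 3: orion1 serves
        orion2.up = true;                      // orion2 is started again
        seen.add(serve(orion1, orion2));       // row 4: orion1 serves
        orion1.linked = false;                 // cable pulled from orion1
        seen.add(serve(orion2, orion1));       // row 5: orion2 serves
        orion1.linked = relinkAfterReconnect;  // cable plugged back in
        seen.add(serve(orion2, orion1));       // row 6: orion2 serves
        seen.add(serve(orion2, orion1));       // row 7: orion2 serves
        orion2.up = false;                     // orion2 is shut down
        seen.add(serve(orion1, orion2));       // row 8: orion1 serves
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("observed: " + run(false));
        System.out.println("expected: " + run(true));
    }
}
```

With relinkAfterReconnect set to false the last value is 5, which matches
what we see; with true it is 8, which is what we expected.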

Has anybody encountered similar problems?

Jan Sipke van der Veen
