[ http://jira.jboss.com/jira/browse/JBAS-896?page=history ]

Scott M Stark reassigned JBAS-896:
----------------------------------

    Assign To: Bela Ban  (was: Scott M Stark)

> JMS started on both nodes in cluster after network glitch
> ---------------------------------------------------------
>
>          Key: JBAS-896
>          URL: http://jira.jboss.com/jira/browse/JBAS-896
>      Project: JBoss Application Server
>         Type: Bug
>   Components: Clustering
>     Versions: JBossAS-3.2.6 Final
>     Reporter: SourceForge User
>     Assignee: Bela Ban

>
>
> SourceForge Submitter: iankenn .
> Original posting on JBoss.org Clustering forum:
> Hi
> I'm currently developing a system which uses JMS
> queuing for async processing of messages. I'm looking
> at deploying to a cluster of two JBoss 3.2.3 servers to
> provide some level of fail-over/resilience.
> During testing of the JMS fail-over I tried killing
> one of the JBoss instances (the one running the JMS
> server) and saw that the JMS queues migrated to the
> other node. But when I tried to simulate a temporary
> loss of network connectivity between the two machines
> (by removing one of the network cables and then
> replacing it), the cluster broke and both
> machines started to run the JMS queues.
> When the network cable is reconnected, neither node
> appears to know that there is another node in the same
> partition; effectively, the cluster is not
> re-established. The only way to make the two nodes see
> each other again is to restart one of them. Is
> there something that I have misconfigured or not
> configured? I am new to clustering and would appreciate
> some advice. I am currently testing on two Windows
> machines but intend to deploy to Linux boxes.
> Thanks,
> Ian
> See posting
> http://www.jboss.org/index.html?module=bb&op=viewtopic&t=45901
> Configuration (both machines)
> OS: Windows 2000 
> JDK: 1.4.2_03
> JBoss: 3.2.3
> The attached zip contains the cluster.log files for
> both servers:
> Node 'A' - Node_A_cluster.log
> Node 'B' - Node_B_cluster.log
> Steps 
> -----
> 1. Turn on logging for clustering in /conf/log4j.xml 
> 2. Start JBoss on Node 'A'
> 3. Start JBoss on Node 'B'
> 4. Deploy EAR to farm dir on Node 'A'
>     This is farmed to Node 'B'
> 5. Submit Msg to Node 'A' (Http request to application)
> 6. Submit Msg to Node 'B' (Http request to application)
> 7. Look at the HAILSharedState ServerAddress for the
> JBoss MQ on the jmx-console - this shows the IP address
> of Node 'A' on both nodes.
> 8. Remove network cable from Node 'A'
> 9. The following messages are displayed in the console:
> Node 'A'
> 10:40:53,921 INFO  [DefaultPartition] New cluster view
> (id: 2, delta: -1) : [192.168.0.34:1099]
> 10:40:53,921 INFO  [DefaultPartition:ReplicantManager]
> Dead members: 1
> 10:40:58,015 INFO  [DefaultPartition] Suspected member:
> wizcom-desk01:4950 (additional data: 17 bytes)
> Node 'B'
> 10:40:53,376 INFO  [DefaultPartition] New cluster view
> (id: 2, delta: -1) : [192.168.0.46:1099]
> 10:40:53,376 INFO  [DefaultPartition:ReplicantManager]
> Dead members: 1
> 10:40:53,516 INFO  [HAILServerILService] Notified to
> become singleton
> 10. The jmx-console on Node 'B' now shows its own IP
> address as the HAILSharedState ServerAddress.
> 11. The jmx-console on Node 'A' still shows its own IP
> address as the HAILSharedState ServerAddress.
> 12. Reconnect the network cable to Node 'A'
> 13. The following message appears in the console:
> Node 'A'
> 10:45:05,171 INFO  [DefaultPartition] New cluster view
> (id: 3, delta: 1) : [192.168.0.34:1099, 192.168.0.46:1099]
> 10:45:05,171 INFO  [DefaultPartition:ReplicantManager]
> Merging partitions...
> 10:45:05,171 INFO  [DefaultPartition:ReplicantManager]
> Dead members: 0
> 10:45:05,187 INFO  [DefaultPartition:ReplicantManager]
> Originating groups: [[wizcom-comp2:1277 (additional
> data: 17 bytes)|2] [wizcom-comp2:1277 (additional data:
> 17 bytes)], [wizcom-desk01:4950 (additional data: 17
> bytes)|2] [wizcom-desk01:4950 (additional data: 17 bytes)]]
> 10:45:05,233 INFO  [DefaultPartition:ReplicantManager]
> Start merging members in DRM service...
> 10:45:05,655 INFO  [DefaultPartition:ReplicantManager]
> ..Finished merging members in DRM service
> Node 'B'
> 10:45:05,740 INFO  [DefaultPartition] New cluster view:
> 3 ([192.168.0.34:1099, 192.168.0.46:1099] delta: 1)
> 10:45:05,756 INFO  [DefaultPartition:ReplicantManager]
> Merging partitions...
> 10:45:05,756 INFO  [DefaultPartition:ReplicantManager]
> Dead members: 0
> 10:45:05,756 INFO  [DefaultPartition:ReplicantManager]
> Originating groups: [[wizcom-comp2:1277 (additional
> data: 17 bytes)|2] [wizcom-comp2:1277 (additional data:
> 17 bytes)], [WIZCOM-DESK01:4950 (additional data: 17
> bytes)|2] [WIZCOM-DESK01:4950 (additional data: 17 bytes)]]
> 10:45:05,818 INFO  [DefaultPartition:ReplicantManager]
> Start merging members in DRM service...
> 10:45:05,943 INFO  [HAILServerILService] Notified to
> stop acting as singleton.
> 10:45:05,943 INFO  [DefaultPartition:ReplicantManager]
> ..Finished merging members in DRM service
> 14. Refresh the HAILSharedState in the jmx-console;
> both nodes have their own IP address as the ServerAddress.
> Thanks
> Ian

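The split-brain the report describes (both nodes keeping the JMS singleton after the partition heals) is normally resolved by having every node re-run a deterministic election over the merged view, so all members agree on a single owner. A minimal, self-contained sketch of such an election rule, with hypothetical names (this is not JBoss or JGroups code):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: after a merge, every node sees the same
// combined membership view, so applying the same deterministic rule
// (here: lowest address wins) makes all nodes agree on one singleton
// owner without further coordination. Class and method names are
// hypothetical, not part of the JBoss 3.2.x API.
public class SingletonElection {

    // Deterministic election: the lexicographically smallest member
    // address in the merged view becomes the singleton owner.
    static String electSingleton(List<String> mergedView) {
        return Collections.min(mergedView);
    }

    public static void main(String[] args) {
        // Merged view corresponding to the logs above: the members of
        // both former partitions combined.
        List<String> view =
            Arrays.asList("192.168.0.46:1099", "192.168.0.34:1099");
        // Both nodes evaluate the same rule and pick the same owner;
        // the loser should then stop its local JMS service.
        System.out.println("Singleton owner: " + electSingleton(view));
    }
}
```

The logs show Node 'B' being told to stop acting as singleton during the merge, yet both jmx-consoles still report their own address afterwards, which suggests the shared state was not reconciled with the outcome of any such election.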
-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://jira.jboss.com/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira



_______________________________________________
JBoss-Development mailing list
JBoss-Development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jboss-development
