Hi,

We would like to detect the failure of a node in our JBoss cluster
sooner.  With our current JavaGroups protocol stack in
cluster-service.xml, detection takes too long, and I am the one who has
to tune it.  I have to admit I am not all that comfortable with it, even
though I have browsed the JavaGroups code and read the user doc.
Our cluster is running pretty stably with the current settings, which
are listed below.
To speed up failure detection, I was thinking of lowering the FD
protocol's timeout from 5000 to 2000 and maybe bringing the
VERIFY_SUSPECT timeout down from 3000 to 1500.
What do you guys think?
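
To make the proposal concrete, here is a sketch of just the two elements
I would touch, with only the timeout values changed from the full config
further down.  My working assumption from the user doc (not something I
have measured) is that FD suspects a member after roughly
timeout x max_tries, and VERIFY_SUSPECT then waits up to its own timeout
before confirming.  That would put us at about 4 x 5000 + 3000 = 23000 ms
today versus about 4 x 2000 + 1500 = 9500 ms with the new values.

```xml
<!-- Proposed values only; every other attribute is unchanged. -->
<!-- FD: timeout lowered from 5000 to 2000, max_tries left at 4 -->
<FD shun="true" timeout="2000" max_tries="4"
    up_thread="true" down_thread="true" />

<!-- VERIFY_SUSPECT: timeout lowered from 3000 to 1500 -->
<VERIFY_SUSPECT timeout="1500" num_msgs="2"
                up_thread="true" down_thread="true" />
```

If that estimate is right, lowering max_tries would shorten detection as
well, but I would rather leave it alone unless the timeouts alone turn
out not to be enough.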

Regards,
Sebastian

-------------------------------------------------------------
    <attribute name="PartitionConfig">
      <Config>
        <!-- UDP: Uses IP multicast for group messages and UDP packets
             for messages to individual members -->
        <UDP mcast_addr="228.1.2.3" mcast_port="45577" 
             ip_ttl="64" ip_mcast="true"
             mcast_send_buf_size="150000" mcast_recv_buf_size="80000" 
             ucast_send_buf_size="150000" ucast_recv_buf_size="80000" 
             loopback="false" />

        <!-- PING: Uses IP multicast (by default) to find initial
             members. Once found, the current coordinator can be
             determined and a unicast JOIN request will be sent to it -->
        <PING timeout="4000" num_initial_members="3" 
              up_thread="true" down_thread="true" />
        
        <!-- MERGE2: Will merge subgroups back into one group -->
        <MERGE2 min_interval="5000" max_interval="10000" />

        <!-- FD: Failure detection based on simple heartbeat
             protocol. Regularly polls members for liveness. -->
        <FD shun="true" timeout="5000" max_tries="4" 
            up_thread="true" down_thread="true" />

        <!-- VERIFY_SUSPECT: Double-checks whether suspected member is  
             really dead, otherwise suspicion generated from protocol 
             below is discarded. -->
        <VERIFY_SUSPECT timeout="3000" num_msgs="2"
                        up_thread="true" down_thread="true" />
         
        <!-- pbcast.STABLE: Deletes messages that have been seen by 
             all members (distributed message garbage collection) -->
        <pbcast.STABLE desired_avg_gossip="20000"
                       up_thread="true" down_thread="true" />

        <!-- pbcast.NAKACK: Ensures (a) message reliability and 
             (b) FIFO. Message reliability guarantees that a message 
             will be received. If not, receiver will request
             retransmission.  FIFO guarantees that all messages from 
             sender P will be received in the order P sent them -->
        <pbcast.NAKACK gc_lag="50"
                       retransmit_timeout="800,1600,2400,3000"
                       up_thread="true" down_thread="true" />

        <!-- UNICAST: Same as NAKACK for unicast messages: messages
             from sender P will not be lost (retransmission if
             necessary) and will be in FIFO order (essentially the
             same as TCP in TCP/IP, without the flow control) -->
        <UNICAST timeout="800,1600,2400,3000" window_size="100" 
                 min_threshold="10" down_thread="true" />

        <!-- FRAG: Fragments large messages into smaller ones and
             reassembles them back at the receiver side. For both 
             multicast and unicast messages. -->
        <FRAG frag_size="8192"
              down_thread="true" up_thread="true" />

        <!-- pbcast.GMS: Membership protocol. Responsible for 
             joining/leaving members and installing new views. -->
        <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
                    shun="true" print_local_addr="true" />

        <!-- pbcast.STATE_TRANSFER: State transfer protocol. -->
        <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
      </Config>
    </attribute>
-------------------------------------------------------------




