Bugs item #816108, was opened at 2003-10-01 20:10
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=816108&group_id=22866

Category: Clustering
Group: v3.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Rob Dickinson (rob_dickinson)
Assigned to: Nobody/Anonymous (nobody)
Summary: Uneven balancing with round robin policy

Initial Comment:
I'm using the JBoss clustering framework to export clustered RMI
services. My code initially works properly, but as I shut down and
restart server nodes I'm seeing some unfairness in the round robin
load balancing. It's certainly possible this isn't a JBoss bug, but a
problem in how I'm using the clustering framework.



I've observed this behavior on WinXP with both 3.2.2RC3 and 3.2.2RC4.
I'm using JDK 1.4.1_03.



THE TEST CASE



I've got a simple MBean (a "beacon") that exports an RMI server using
the JBoss HA framework, and then binds the RMI stub into local JNDI:





public class Beacon
   extends org.jboss.system.ServiceMBeanSupport
   implements BeaconMBean
{
   public void startService() throws Exception
   {
      rebind();
   }

   private void rebind() throws Exception
   {
      log.info("Rebinding...");

      // Grab HA partition
      String pname = "/HAPartition/" + partitionName;
      InitialContext context = new InitialContext();
      HAPartition partition = (HAPartition)context.lookup(pname);

      // Create HA-RMI server
      this.beaconImpl = new BeaconImpl(partition);
      this.beaconServer = new HARMIServerImpl(partition, "Beacon",
         BeaconInterface.class, beaconImpl, 0, null, null);

      // Bind server stub
      BeaconInterface stub = (BeaconInterface)
         beaconServer.createHAStub(new RoundRobin());
      context.rebind(jndiName, stub);

      log.info("Rebinding complete");
   }

   // LOTS OF DETAIL OMITTED HERE!
}





The beacon RMI server does nothing but increment a counter and log the
new value.



My simple client looks up the beacon RMI stub via JNDI and then calls
the beacon interface repeatedly. It repeats the lookup every 10 calls
to detect when the load balancing gets stuck (the repeated lookup
wouldn't otherwise be required).





import java.util.Properties;
import javax.naming.InitialContext;

public class Client
{
   public static void main(String[] args) throws Exception
   {
      Properties p = new Properties();
      p.put("java.naming.factory.initial",
            "org.jnp.interfaces.NamingContextFactory");
      p.put("java.naming.factory.url.pkgs",
            "org.jboss.naming:org.jnp.interfaces");
      p.put("jnp.partitionName", "DefaultPartition");
      p.put("java.naming.provider.url",
            "jnp://localhost:80,jnp://localhost:81");

      int count = 0;
      BeaconInterface beacon = null;
      while (true) {
         try {
            // Look up the stub again after a failure and every 10 calls
            if ((beacon == null) || count++ % 10 == 0) {
               InitialContext context = new InitialContext(p);
               beacon = (BeaconInterface)context.lookup("mybeacon");
            }
            System.out.println("OK: " + beacon.execute(null));
         } catch (Throwable t) {
            // Drop the stub so the next iteration does a fresh lookup
            beacon = null;
            t.printStackTrace();
         } finally {
            Thread.sleep(100);
         }
      }
   }
}





RUNNING THE TEST CASE



1) Download and expand the jboss-roundrobinbug.zip file.



2) You need JBoss 3.2.2rc3 somewhere on your hard disk. If this isn't
installed to 'c:\jboss322rc3', then edit the build properties in the
test project accordingly.

3) Run 'ant install' to configure your JBoss server to run the test
project. (This creates a 'beacon' server configuration, while leaving
any other servers alone.) This sets up the 'primary' JBoss server.

4) Copy your JBoss server to a new directory (like
'c:\jboss322rc3-2'). Edit the 'cluster-service.xml' file and change
the port to 81. (It's 80 by default.) This sets up the 'backup' JBoss
server.

5) Run the 'ant test' target from the Ant test project build. This
starts the client polling, which initially results in a stream of
SocketTimeoutExceptions.



6) Start the primary JBoss server (run -c beacon). Now the client
starts to make calls to the beacon. When the count is evenly divisible
by 10, the client pauses momentarily, as expected, while the
InitialContext is recreated.



0
1
2
3
4
...



7) After letting the primary run for a bit, fire up the backup JBoss
server (again, run -c beacon). Now the client starts to evenly load
balance:



511
0
512
1
513
2
514
...



8) Now stop and restart the primary JBoss server. The client now gets
stuck to the backup. If the InitialContext were not refreshed
periodically, the client would have a 50-50 shot of never calling the
primary again after it fails.



1
19533
2
19534
3
19535
4
19536
5
19537
19538
19539
19540
19541
19542
19543
19544
19545
19546
19547
6
19548
7
19549
...
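
The stickiness in the step 8 output can be measured rather than eyeballed. The following is a minimal sketch, not part of the test project: the class name RunLengthCheck and the idea of inferring the answering node from the counter values are assumptions made for illustration. It treats a value as coming from a known node when it is exactly one more than the last value seen from that node, and otherwise assumes a new node (so a restarted server whose counter resets would be counted as a new node).

```java
public class RunLengthCheck {
    /**
     * Returns the longest run of consecutive calls that appear to have
     * been answered by the same node, inferring the node from the
     * per-node counter values printed by the client.
     */
    public static int longestRun(long[] counters) {
        long[] lastSeen = new long[counters.length]; // last value per inferred node
        int nodes = 0;
        int prevNode = -1;
        int run = 0;
        int longest = 0;
        for (int i = 0; i < counters.length; i++) {
            long value = counters[i];
            int node = -1;
            // A value continues a node's sequence if it is lastSeen + 1
            for (int n = 0; n < nodes; n++) {
                if (lastSeen[n] + 1 == value) {
                    node = n;
                    break;
                }
            }
            if (node < 0) {
                node = nodes++; // no match: assume a node not seen before
            }
            lastSeen[node] = value;
            run = (node == prevNode) ? run + 1 : 1;
            longest = Math.max(longest, run);
            prevNode = node;
        }
        return longest;
    }

    public static void main(String[] args) {
        long[] step7 = {511, 0, 512, 1, 513, 2, 514};
        long[] step8 = {1, 19533, 2, 19534, 3, 19535, 4, 19536, 5,
                        19537, 19538, 19539, 19540, 19541, 19542, 19543,
                        19544, 19545, 19546, 19547, 6, 19548, 7, 19549};
        System.out.println("step 7 longest run: " + longestRun(step7));
        System.out.println("step 8 longest run: " + longestRun(step8));
    }
}
```

With two nodes and strict alternation the longest run is 1, which is what the step 7 sample gives. The step 8 sample gives 11, the stretch from 19537 through 19547 that the backup answered alone.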



Once the client starts to see the behavior in step #8, there doesn't
appear to be a way to get it unstuck. Bouncing the primary server
again, or the backup, never restores the even load balancing that was
initially evident.
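
For reference, the behavior expected from a round robin policy can be sketched as below. This is an illustrative stand-in only: it is not the JBoss RoundRobin class and makes no claim about how that class tracks its targets. The class name RoundRobinSketch and the String[] target list are hypothetical.

```java
public class RoundRobinSketch {
    private int cursor = -1; // index of the target chosen last time

    /**
     * Picks the next target in order, wrapping around at the end of the
     * list. The list may grow or shrink between calls as nodes join or
     * leave; the cursor is reduced modulo the current length each time.
     */
    public synchronized String choose(String[] targets) {
        if (targets.length == 0) {
            return null; // no node available
        }
        cursor = (cursor + 1) % targets.length;
        return targets[cursor];
    }

    public static void main(String[] args) {
        RoundRobinSketch policy = new RoundRobinSketch();
        String[] both = {"primary", "backup"};
        for (int i = 0; i < 6; i++) {
            System.out.println(policy.choose(both));
        }
    }
}
```

As long as the client sees both nodes in its target list, a policy like this never picks the same node twice in a row. A long run on the backup therefore suggests that the target list or the cursor on the client side is not what one would expect after the restart, though the report does not establish which.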



ATTEMPTED WORKAROUNDS



1. Enabling autodiscovery



I thought this would do the trick, but it still manifests problems. In
this case, after the primary server has been bounced, the client sees
a repeated set of socket timeouts between successful calls:



     [java] OK: 62
     [java] OK: 63
     [java] OK: 64
     [java] javax.naming.CommunicationException: Receive timed out.  Root exception is java.net.SocketTimeoutException: Receive timed out
     [java]     at java.net.PlainDatagramSocketImpl.receive(Native Method)
     [java]     at java.net.DatagramSocket.receive(DatagramSocket.java:680)
     [java]     at org.jnp.interfaces.NamingContext.discoverServer(NamingContext.java:1093)
     [java]     at org.jnp.interfaces.NamingContext.checkRef(NamingContext.java:1192)
     [java]     at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:514)
     [java]     at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:507)
     [java]     at javax.naming.InitialContext.lookup(InitialContext.java:347)
     [java]     at test.Client.main(Client.java:25)
     [java] javax.naming.CommunicationException: Receive timed out.  Root exception is java.net.SocketTimeoutException: Receive timed out
     [java]     at java.net.PlainDatagramSocketImpl.receive(Native Method)
     [java]     at java.net.DatagramSocket.receive(DatagramSocket.java:680)
     [java]     at org.jnp.interfaces.NamingContext.discoverServer(NamingContext.java:1093)
     [java]     at org.jnp.interfaces.NamingContext.checkRef(NamingContext.java:1192)
     [java]     at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:514)
     [java]     at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:507)
     [java]     at javax.naming.InitialContext.lookup(InitialContext.java:347)
     [java]     at test.Client.main(Client.java:25)
     [java] OK: 65
     [java] OK: 66
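
For readers unfamiliar with the setup, client-side autodiscovery is typically configured along the following lines. This is a sketch under stated assumptions, not the configuration used in this test: the report does not show it. The assumptions are that leaving out java.naming.provider.url is what makes the JNP client fall back to multicast discovery, and that the property names jnp.discoveryGroup and jnp.discoveryPort (with the values shown) apply to this release; they are recalled from JBoss HA-JNDI documentation for later releases and should be checked against 3.2.2. The class name DiscoveryProps is hypothetical.

```java
import java.util.Properties;

public class DiscoveryProps {
    /** Builds JNDI client properties that rely on discovery instead of a fixed URL. */
    public static Properties build(String partitionName) {
        Properties p = new Properties();
        p.put("java.naming.factory.initial",
              "org.jnp.interfaces.NamingContextFactory");
        p.put("java.naming.factory.url.pkgs",
              "org.jboss.naming:org.jnp.interfaces");
        p.put("jnp.partitionName", partitionName);
        // Deliberately no java.naming.provider.url (assumption: its
        // absence makes the client attempt multicast discovery).
        // Assumed property names and values; verify for your release.
        p.put("jnp.discoveryGroup", "230.0.0.4");
        p.put("jnp.discoveryPort", "1102");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(build("DefaultPartition"));
    }
}
```

The resulting Properties object would be passed to new InitialContext(p) exactly as in the client above. The stack trace shows the timeouts occurring inside NamingContext.discoverServer, that is, while the client waits for a discovery reply.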



2. JavaGroups configuration



I've tried turning loopback on/off and changing other settings. No
luck!

_______________________________________________
JBoss-Development mailing list
https://lists.sourceforge.net/lists/listinfo/jboss-development
