Bugs item #816108, was opened at 2003-10-01 20:10
Message generated for change (Comment added) made by rob_dickinson
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=816108&group_id=22866
Category: Clustering
Group: v3.2
Status: Closed
Resolution: Invalid
Priority: 5
Submitted By: Rob Dickinson (rob_dickinson)
Assigned to: Sacha Labourey (slaboure)
Summary: Uneven balancing with round robin policy
Initial Comment:
I'm using the JBoss clustering framework to export
clustered RMI services. My code initially works properly,
but as I shut down and restart server nodes I'm seeing
some unfairness in the round robin load balancing. It's
certainly possible this isn't a JBoss bug, but a problem in
how I'm using the clustering framework.
I've observed this behavior on WinXP with both 3.2.2RC3
and 3.2.2RC4. I'm using JDK 1.4.1_03.
THE TEST CASE
I've got a simple MBean (a "beacon") that exports an
RMI server using the JBoss HA framework, and then
binds the RMI stub into local JNDI:
public class Beacon
    extends org.jboss.system.ServiceMBeanSupport
    implements BeaconMBean
{
    public void startService() throws Exception
    {
        rebind();
    }

    private void rebind() throws Exception
    {
        log.info("Rebinding...");

        // Grab HA partition
        String pname = "/HAPartition/" + partitionName;
        InitialContext context = new InitialContext();
        HAPartition partition = (HAPartition) context.lookup(pname);

        // Create HA-RMI server
        this.beaconImpl = new BeaconImpl(partition);
        this.beaconServer = new HARMIServerImpl(partition, "Beacon",
            BeaconInterface.class, beaconImpl, 0, null, null);

        // Bind server stub
        BeaconInterface stub = (BeaconInterface)
            beaconServer.createHAStub(new RoundRobin());
        context.rebind(jndiName, stub);

        log.info("Rebinding complete");
    }

    // LOTS OF DETAIL OMITTED HERE!
}
The beacon RMI server does nothing but increment a
counter and log the new value.
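For readers without the attachment, here is a minimal sketch of what such a counting beacon could look like. It is not the code from the test project: the names BeaconSketch and CountingBeacon are hypothetical, the nested BeaconInterface is a stand-in with an assumed execute(Object) signature, and the HAPartition constructor argument is left out so the sketch runs without JBoss on the classpath.

```java
public class BeaconSketch {

    /** Hypothetical stand-in for the BeaconInterface in the test project. */
    interface BeaconInterface {
        Object execute(Object arg);
    }

    /** Increments a counter on every call and logs the new value. */
    static class CountingBeacon implements BeaconInterface {
        private int counter;

        // synchronized because RMI may dispatch calls on several threads
        public synchronized Object execute(Object arg) {
            counter++;
            System.out.println("Beacon count: " + counter);
            return Integer.valueOf(counter);
        }
    }

    public static void main(String[] args) {
        BeaconInterface beacon = new CountingBeacon();
        beacon.execute(null);
        beacon.execute(null);
    }
}
```

Each server node holds its own counter, which is why the client output in the steps below shows two independent sequences once both nodes are up.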
My simple client looks up the beacon's RMI stub via JNDI and
calls it repeatedly. It repeats the lookup every 10 calls to
detect when the load balancing gets stuck (this would not
otherwise be required).
import java.util.Properties;
import javax.naming.InitialContext;

public class Client
{
    public static void main(String[] args) throws Exception
    {
        // JNDI settings for the clustered naming service
        Properties p = new Properties();
        p.put("java.naming.factory.initial", "org.jnp.interfaces.NamingContextFactory");
        p.put("java.naming.factory.url.pkgs", "org.jboss.naming:org.jnp.interfaces");
        p.put("jnp.partitionName", "DefaultPartition");
        p.put("java.naming.provider.url", "jnp://localhost:80,jnp://localhost:81");

        int count = 0;
        BeaconInterface beacon = null;
        while (true) {
            try {
                // Look up the stub again after a failure and on every 10th call
                if ((beacon == null) || count++ % 10 == 0) {
                    InitialContext context = new InitialContext(p);
                    beacon = (BeaconInterface) context.lookup("mybeacon");
                }
                System.out.println("OK: " + beacon.execute(null));
            } catch (Throwable t) {
                beacon = null; // force a fresh lookup on the next pass
                t.printStackTrace();
            } finally {
                Thread.sleep(100);
            }
        }
    }
}
RUNNING THE TEST CASE
1) Download and expand the jboss-roundrobinbug.zip file.
2) You need JBoss 3.2.2rc3 somewhere on your hard
disk. If this isn't installed to 'c:\jboss322rc3', then edit
the build properties in the test project accordingly.
3) Run 'ant install' to configure your JBoss server to run
the test project. (This creates a 'beacon' server
configuration, while leaving any other servers alone.)
This sets up the 'primary' JBoss server.
4) Copy your JBoss server to a new directory
(like 'c:\jboss322rc3-2'). Edit the 'cluster-service.xml'
file and change the port to 81. (It's 80 by default.) This
sets up the 'backup' JBoss server.
5) Run the 'ant test' target from the Ant test project
build. This starts the client polling, which initially results
in a stream of SocketTimeoutExceptions.
6) Start the primary JBoss server (run -c beacon). Now
the client starts to make calls to the beacon. When the
count is evenly divisible by 10, the client pauses
momentarily as the InitialContext is recreated as
expected.
0
1
2
3
4
...
7) After letting the primary run for a bit, fire up the
backup JBoss server (again, run -c beacon). Now the
client starts to evenly load balance:
511
0
512
1
513
2
514
...
8) Now stop and restart the primary JBoss server. The
client now gets stuck on the backup. If the
InitialContext were not refreshed periodically, the client
would have a 50-50 chance of never calling the primary
again after it fails.
1
19533
2
19534
3
19535
4
19536
5
19537
19538
19539
19540
19541
19542
19543
19544
19545
19546
19547
6
19548
7
19549
...
Once the client starts to see the behavior in step #8,
there doesn't appear to be a way to get it unstuck.
Bouncing either the primary or the backup again never
restores the even load balancing seen initially.
ATTEMPTED WORKAROUNDS
1. Enabling autodiscovery
I thought this would do the trick, but the problem remains.
In this case, after the primary server has been bounced,
the client sees repeated socket timeouts between
successful calls:
[java] OK: 62
[java] OK: 63
[java] OK: 64
[java] javax.naming.CommunicationException: Receive timed out. Root exception is java.net.SocketTimeoutException: Receive timed out
[java]     at java.net.PlainDatagramSocketImpl.receive(Native Method)
[java]     at java.net.DatagramSocket.receive(DatagramSocket.java:680)
[java]     at org.jnp.interfaces.NamingContext.discoverServer(NamingContext.java:1093)
[java]     at org.jnp.interfaces.NamingContext.checkRef(NamingContext.java:1192)
[java]     at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:514)
[java]     at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:507)
[java]     at javax.naming.InitialContext.lookup(InitialContext.java:347)
[java]     at test.Client.main(Client.java:25)
[java] javax.naming.CommunicationException: Receive timed out. Root exception is java.net.SocketTimeoutException: Receive timed out
[java]     at java.net.PlainDatagramSocketImpl.receive(Native Method)
[java]     at java.net.DatagramSocket.receive(DatagramSocket.java:680)
[java]     at org.jnp.interfaces.NamingContext.discoverServer(NamingContext.java:1093)
[java]     at org.jnp.interfaces.NamingContext.checkRef(NamingContext.java:1192)
[java]     at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:514)
[java]     at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:507)
[java]     at javax.naming.InitialContext.lookup(InitialContext.java:347)
[java]     at test.Client.main(Client.java:25)
[java] OK: 65
[java] OK: 66
2. JavaGroups configuration
I've tried turning loopback on/off and changing other
settings. No luck!
----------------------------------------------------------------------
>Comment By: Rob Dickinson (rob_dickinson)
Date: 2003-10-07 17:30
Message:
Logged In: YES
user_id=748966
Thanks, Sacha. I'll try again with your changes...
Thanks for all your help!!!
----------------------------------------------------------------------
Comment By: Sacha Labourey (slaboure)
Date: 2003-10-07 10:36
Message:
Logged In: YES
user_id=95900
Hello,
This is because when you bind your object in JNDI, it is
serialized at that time, and its content is never refreshed (i.e.
reserialized) when the list of possible target nodes changes.
=> You have to detect changes and rebind your proxy. I've
added a callback object to make this easier; it will only
work with versions >= 3.2.2. Please see your modified code
in the attachment.
Cheers,
sacha
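
The effect described above can be reproduced without JBoss. The following is a simplified simulation, not the clustering framework's actual code: Stub, Registry and the node names are hypothetical stand-ins for the HA stub, the JNDI binding and the cluster members. It only illustrates that a bound object is a serialized snapshot, so a later membership change is invisible to clients until the stub is rebound.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleStubSketch {

    /** Hypothetical stub: a target list plus a round-robin cursor. */
    static class Stub implements Serializable {
        private static final long serialVersionUID = 1L;
        private final List<String> targets;
        private int cursor;

        Stub(List<String> targets) {
            this.targets = new ArrayList<>(targets); // copy taken at creation
        }

        List<String> targets() {
            return targets;
        }

        String nextTarget() {
            return targets.get(cursor++ % targets.size());
        }
    }

    /** Hypothetical registry: like a JNDI bind, it stores serialized bytes. */
    static class Registry {
        private byte[] bound;

        void rebind(Stub stub) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(stub);
            }
            bound = bytes.toByteArray();
        }

        Stub lookup() throws IOException, ClassNotFoundException {
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bound))) {
                return (Stub) in.readObject();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> members = new ArrayList<>(Arrays.asList("primary"));
        Registry registry = new Registry();
        registry.rebind(new Stub(members));

        members.add("backup"); // membership changes after the bind
        System.out.println(registry.lookup().targets()); // prints [primary]

        registry.rebind(new Stub(members)); // rebind on membership change
        System.out.println(registry.lookup().targets()); // prints [primary, backup]
    }
}
```

In the real service, the rebind would be triggered by whatever membership-change notification the framework provides, such as the callback object mentioned above.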
----------------------------------------------------------------------
_______________________________________________
JBoss-Development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development