Bugs item #1007959, was opened at 2004-08-12 06:39
Message generated for change (Comment added) made by starksm
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=1007959&group_id=22866

Category: Clustering
Group: v3.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Peter Backx (pbackx)
>Assigned to: Scott M Stark (starksm)
Summary: HA-JNDI extremely unreliable

Initial Comment:
os: linux 2.6
jdk: Sun hotspot 1.4.2_02

I have deployed 2 SLSBs on 2 different machines that
are clustered (I tried both TCP and UDP) HA-JNDI is
enabled but no farming.

Both looking for a bean from the client and looking for
a bean from another is extremely unreliable. It works
in about 10% of the cases. I have been unable to make
this bug 100% reproducable ... it appears to be
entirely random.

If I let one bean call another I get the following in
my server logs which shows that the server is actually
responding:

2004-08-12 15:31:08,759 DEBUG
[org.jboss.ha.framework.server.HAPartitionImpl]
dests=[asklepios30:7800 (additional data: 16 bytes),
10.10.12.1:7800 (additional data: 15 bytes)],
method_call=HAJNDI.lookupLocally(Another), mode=2,
timeout=60000
2004-08-12 15:31:08,759 DEBUG
[org.jboss.ha.framework.server.HAPartitionImpl]
real_dests=[asklepios30:7800 (additional data: 16
bytes), 10.10.12.1:7800 (additional data: 15 bytes)]
2004-08-12 15:31:09,027 DEBUG
[org.jboss.ha.framework.server.HAPartitionImpl]
responses: [sender=asklepios30:7800 (additional data:
16 bytes), retval=null, received=true, suspected=false]
[sender=10.10.12.1:7800 (additional data: 15 bytes),
[EMAIL PROTECTED],
received=true, suspected=false]

(I have 3 servers: 10.10.11.1 is doing the lookup and
asklepios30 and 10.10.12.1 are replying ... don't mind
the fairly complicated (for a cluster) network setup,
the problem is the same if they are all on the same
subnet and UDP multicasting)

However by the time this value gets returned by
Context.lookup it is null (most of the time).

This bug was not in 3.2.2 but because I cannot assign
the correct NodeAddress to partition members I have to
use 3.2.5

I have also attached my cluster configuration file
(@BIND_ADDR@ gets replaced by the correct address in an
ANT script)

I'm willing to debug the code myself because this
problem is really urgent to me, however I hope some one
can tell me through which calls the information flows
between the HAPartitionImpl and the lookup return. Any
pointers are appreciated.

----------------------------------------------------------------------

>Comment By: Scott M Stark (starksm)
Date: 2004-08-12 09:53

Message:
Logged In: YES 
user_id=175228

Describe the test scenario in terms of what is being looked
up, by who, and where the binding exists in the cluster. If
you have the test deployments attach them to the report.

----------------------------------------------------------------------

Comment By: Peter Backx (pbackx)
Date: 2004-08-12 09:31

Message:
Logged In: YES 
user_id=91876

Based on my trace, I started looking into the code and
here's where am I right now:

TreeHead.lookup first does a local lookup but does not find
the name (which is to be expected) after that it does a
callMethodOnCluster (on HAPartitionImpl) to do a clusterwide
lookup. This call is forwarded to the RpcDispatcher (which
is part of jgroups).

This all works ok, since the last trace message (responses:
...) is from within RpcDispatcher's callRemoteMethods method.

However something goes wrong in returning that response back
to the initial lookup call. Two points:
* One would expect at least some output from the last trace
call in TreeHead's lookup.
* Secondly it appears that the first result in the ArrayList
is returned which in this case is the wrong entry (it is for
server asklepios30 which does _not_ have the bean)

The last remark could very well cause the pretty random
behavior, but considering that there is no trace message I'm
not sure yet if this is the actual bug.

I will try to set-up my environment and compile my own jboss
(or at least jbossha.jar) tomorrow (all info above is just
from static code analysis) and see what happens.

----------------------------------------------------------------------

Comment By: Peter Backx (pbackx)
Date: 2004-08-12 08:49

Message:
Logged In: YES 
user_id=91876

Here's a trace of org.jboss.ha:

17:54:30,098 TRACE [TreeHead] lookup, name=Another
17:54:30,098 TRACE [TreeHead] lookupLocally, name=Another
17:54:30,098 TRACE [TreeHead] lookupLocally failed, name=Another
javax.naming.NameNotFoundException: Another not bound
        at
org.jnp.server.NamingServer.getBinding(NamingServer.java:495)
        at
org.jnp.server.NamingServer.getBinding(NamingServer.java:503)
        at
org.jnp.server.NamingServer.getObject(NamingServer.java:509)
        at
org.jnp.server.NamingServer.lookup(NamingServer.java:282)
        at
org.jboss.ha.jndi.TreeHead.lookupLocally(TreeHead.java:266)
        at org.jboss.ha.jndi.TreeHead.lookup(TreeHead.java:192)
        at org.jboss.ha.jndi.HAJNDI.lookup(HAJNDI.java:131)
        at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at
org.jboss.ha.framework.interfaces.HARMIClient.invoke(HARMIClient.java:187)
        at $Proxy47.lookup(Unknown Source)
        at
org.jnp.interfaces.NamingContext.lookup(NamingContext.java:530)
        at
org.jnp.interfaces.NamingContext.lookup(NamingContext.java:509)
        at javax.naming.InitialContext.lookup(Unknown Source)
        at
be.ugent.pbackx.util.HAJNDIUtil.lookup_aroundBody6(HAJNDIUtil.java:29)
        at
be.ugent.pbackx.util.HAJNDIUtil.HAJNDIlookup(HAJNDIUtil.java:29)
        at
be.ugent.pbackx.bean1.HelloWorldBean.hello_aroundBody34(HelloWorldBean.java:40)
        at
be.ugent.pbackx.bean1.HelloWorldBean$AjcClosure35.run(HelloWorldBean.java)
        at
be.ugent.pbackx.aspect.TestAspect.ajc$around$be_ugent_pbackx_aspect_TestAspect$2$dd6145a5proceed(TestAspect.java)
        at
be.ugent.pbackx.aspect.TestAspect.ajc$around$be_ugent_pbackx_aspect_TestAspect$2$dd6145a5(TestAspect.java:58)
        at
be.ugent.pbackx.bean1.HelloWorldBean.hello(HelloWorldBean.java)
        at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at
org.jboss.ejb.StatelessSessionContainer$ContainerInterceptor.invoke(StatelessSessionContainer.java:683)
        at
org.jboss.resource.connectionmanager.CachedConnectionInterceptor.invoke(CachedConnectionInterceptor.java:185)
        at
org.jboss.ejb.plugins.AbstractTxInterceptor.invokeNext(AbstractTxInterceptor.java:84)
        at
org.jboss.ejb.plugins.AbstractTxInterceptorBMT.invokeNext(AbstractTxInterceptorBMT.java:144)
        at
org.jboss.ejb.plugins.TxInterceptorBMT.invoke(TxInterceptorBMT.java:62)
        at
org.jboss.ejb.plugins.StatelessSessionInstanceInterceptor.invoke(StatelessSessionInstanceInterceptor.java:72)
        at
org.jboss.ejb.plugins.SecurityInterceptor.invoke(SecurityInterceptor.java:120)
        at
org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:191)
        at
org.jboss.ejb.plugins.CleanShutdownInterceptor.invoke(CleanShutdownInterceptor.java:264)
        at
org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
        at
org.jboss.ejb.StatelessSessionContainer.internalInvoke(StatelessSessionContainer.java:331)
        at org.jboss.ejb.Container.invoke(Container.java:723)
        at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at
org.jboss.mx.server.ReflectedDispatcher.dispatch(ReflectedDispatcher.java:60)
        at
org.jboss.mx.server.Invocation.dispatch(Invocation.java:61)
        at
org.jboss.mx.server.Invocation.dispatch(Invocation.java:53)
        at
org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
        at
org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:185)
        at
org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:473)
        at
org.jboss.invocation.jrmp.server.JRMPInvokerHA.invoke(JRMPInvokerHA.java:163)
        at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at sun.rmi.server.UnicastServerRef.dispatch(Unknown
Source)
        at sun.rmi.transport.Transport$1.run(Unknown Source)
        at
java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Unknown
Source)
        at
sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown
Source)
        at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown
Source)
        at java.lang.Thread.run(Unknown Source)
17:54:30,101 TRACE [TreeHead] calling lookupLocally(Another)
on cluster
17:54:30,101 TRACE [DefaultPartition]
callMethodOnCluster(true), objName=HAJNDI,
methodName=lookupLocally, members=[asklepios30:7800
(additional data: 16 bytes), 10.10.12.1:7800 (additional
data: 15 bytes)]
17:54:30,101 DEBUG [HAPartitionImpl] dests=[asklepios30:7800
(additional data: 16 bytes), 10.10.12.1:7800 (additional
data: 15 bytes)], method_call=HAJNDI.lookupLocally(Another),
mode=2, timeout=60000
17:54:30,103 DEBUG [HAPartitionImpl]
real_dests=[asklepios30:7800 (additional data: 16 bytes),
10.10.12.1:7800 (additional data: 15 bytes)]
17:54:30,200 DEBUG [HAPartitionImpl] responses:
[sender=asklepios30:7800 (additional data: 16 bytes),
retval=null, received=true, suspected=false]
[sender=10.10.12.1:7800 (additional data: 15 bytes),
[EMAIL PROTECTED],
received=true, suspected=false]


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=1007959&group_id=22866


-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
_______________________________________________
JBoss-Development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development

Reply via email to