Bugs item #863113, was opened at 2003-12-19 20:33
Message generated for change (Comment added) made by tpeuss
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=863113&group_id=22866

Category: Clustering
Group: v3.2
Status: Closed
Resolution: Invalid
Priority: 5
Submitted By: Jason Tetrault (airsquig)
Assigned to: Thomas Peuss (tpeuss)
Summary: Session Replication Inconsistent

Initial Comment:
This bug report is a result of discussions on the JBoss
Forums:
        http://www.jboss.org/index.html?module=bb&op=viewtopic&t=43602

Versions reproduced on:
     JBoss 3.2.3, JBoss 3.2.2, JBoss 3.2.4 RC, JBoss 4.0 RC
Version researched:
      JBoss 3.2.4 nightly
Overview:
HTTP session replication appeared inconsistent in a
clustered environment.  Further research showed that this
happens under round-robin load balancing.

Basically, a session replicates to another node once and
only once; after that, the sessions are inconsistent.

This shows up in an Apache round-robin, non-sticky
load-balanced configuration.  The bug first appears on
the third hit.

After adding some trace, I found the following:
It seems that the
org.jboss.web.tomcat.session.ClusterManager has a
local session container (sessions).  If and only if the
session is not in the local container, it will access the
HTTPSessionMBean to get the session, which calls the
org.jboss.ha.httpsession.beanimpl.ejb.ClusteredHTTPSessionBeanImpl
EJB.  All works well at this point: if node A is hit first,
then node B, node B gets the session from the EJB
that was set from node A.

Once both nodes have the session in their cache,
the ClusterManager does not appear to go back to the
MBean to re-fetch the session on get-session requests; it
just uses the version in the local sessions container.  This
means that a session will replicate ONCE, and after
that, Tomcat uses its local version.  This causes an
inconsistent session state.

It does look like each servlet container is updating
its version in the EJB, but the update is wrong because it
is based on the inconsistent session in its local cache.
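
For illustration, the lookup order described above can be modeled
in a few lines of Java.  This is only a minimal sketch, not the
actual ClusterManager source: the class name CacheFirstLookup, the
plain Map standing in for the HTTPSessionMBean/EJB store, and the
String used as the session value are all assumptions made for the
example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a cache-first session lookup; not the JBoss source. */
final class CacheFirstLookup {
    // Stand-in for the shared, replicated store (the EJB behind the MBean).
    private final Map<String, String> sharedStore;
    // Stand-in for the node-local "sessions" container.
    private final Map<String, String> localSessions = new HashMap<>();

    CacheFirstLookup(Map<String, String> sharedStore) {
        this.sharedStore = sharedStore;
    }

    /** Returns the local copy if present; otherwise loads it from the shared store. */
    String findSession(String id) {
        String local = localSessions.get(id);
        if (local != null) {
            return local; // the shared store is not consulted again for this id
        }
        String replicated = sharedStore.get(id);
        if (replicated != null) {
            localSessions.put(id, replicated);
        }
        return replicated;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("s1", "NodeA");
        CacheFirstLookup nodeB = new CacheFirstLookup(store);
        System.out.println(nodeB.findSession("s1")); // loaded from the shared store
        store.put("s1", "NodeA,NodeB,NodeA");        // another node updates the store
        System.out.println(nodeB.findSession("s1")); // still the stale local copy
    }
}
```

In this model the second lookup ignores the update made through
the shared store, which is the "replicates once" effect reported
here.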

To test this, I made a quick code change to the
ClusterManager.findSession() method to get the session
from the MBean and not use the one from sessions.
This appeared to fix the problem (you have to call
sessions.sessions.get(id) or it will break; again, I did
not spend much time on fixing it).  Exactly how you
want to fix this is up to you; I can see a few ways:

        1. Something like the change mentioned above.
        2. Making the backend call invalidate on
           the MBean to get rid of the session held in sessions.
        3. Others.
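
As a rough sketch of options 1 and 2 (again a simplified model,
not the real ClusterManager or MBean API): the class FixSketch,
its method names, and the Map standing in for the replicated
store are hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of two ways to avoid serving a stale local session. */
final class FixSketch {
    // Stand-in for the replicated store behind the MBean/EJB.
    private final Map<String, String> sharedStore;
    // Stand-in for the node-local "sessions" container.
    private final Map<String, String> localSessions = new HashMap<>();

    FixSketch(Map<String, String> sharedStore) {
        this.sharedStore = sharedStore;
    }

    /** Option 1: consult the shared store on every lookup and refresh the local copy. */
    String findSessionReadThrough(String id) {
        String replicated = sharedStore.get(id);
        if (replicated != null) {
            localSessions.put(id, replicated);
        } else {
            localSessions.remove(id); // nothing replicated: drop any local leftover
        }
        return replicated;
    }

    /** The cache-first lookup, kept unchanged for option 2. */
    String findSessionCacheFirst(String id) {
        String local = localSessions.get(id);
        if (local == null) {
            local = sharedStore.get(id);
            if (local != null) {
                localSessions.put(id, local);
            }
        }
        return local;
    }

    /** Option 2: the backend calls this when another node has written the session. */
    void invalidateLocal(String id) {
        localSessions.remove(id);
    }
}
```

Option 1 pays for a store access on every request; option 2 keeps
the local cache but depends on the backend reliably notifying
every node that holds a copy.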

Reproduction Information:

This can be reproduced with one JSP in a session-replicated
web application in a clustered environment.
The attached JSP will reproduce this problem.  Basically,
on each cluster node, change the string being appended to
the session key to the node name.
In an Apache round-robin environment, one would
expect the string in the session to look like the
following:
Hit1:
NodeA
Hit2:
NodeA,NodeB
Hit3:
NodeA,NodeB,NodeA
Hit4:
NodeA,NodeB,NodeA,NodeB
Hit5:
NodeA,NodeB,NodeA,NodeB,NodeA

Right now, this is what happens:
Hit1:
NodeA
Hit2:
NodeA,NodeB
Hit3:
NodeA,NodeA
Hit4:
NodeA,NodeB,NodeB
Hit5:
NodeA,NodeA,NodeA
Hit6:
NodeA,NodeB,NodeB,NodeB
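
Both listings above can be reproduced with a small simulation.
This is a sketch under stated assumptions, not the attached JSP
or any JBoss code: two nodes are hit alternately, each node
appends its name to one session value, and every write goes to
both the node's local cache and a shared store.  The class
RoundRobinSimulation and the cacheFirst flag are hypothetical
names for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical round-robin model of the reported behavior. */
final class RoundRobinSimulation {

    /**
     * Simulates round-robin hits over NodeA and NodeB.
     * With cacheFirst=true a node prefers its local copy of the session;
     * with cacheFirst=false it always reads the shared store.
     * Returns the session value seen after each hit.
     */
    static List<String> simulate(int hits, boolean cacheFirst) {
        String[] nodes = {"NodeA", "NodeB"};
        String id = "session-1";
        Map<String, String> sharedStore = new HashMap<>();
        List<Map<String, String>> localCaches = new ArrayList<>();
        for (int i = 0; i < nodes.length; i++) {
            localCaches.add(new HashMap<>());
        }

        List<String> results = new ArrayList<>();
        for (int hit = 0; hit < hits; hit++) {
            int n = hit % nodes.length;
            Map<String, String> local = localCaches.get(n);

            // Lookup: local copy first (if enabled), else the shared store.
            String current = cacheFirst ? local.get(id) : null;
            if (current == null) {
                current = sharedStore.get(id);
            }

            // The page appends the node name to the session value.
            String updated = (current == null) ? nodes[n] : current + "," + nodes[n];

            // Every node writes its result locally and to the shared store.
            local.put(id, updated);
            sharedStore.put(id, updated);
            results.add(updated);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println("store-first: " + simulate(6, false));
        System.out.println("cache-first: " + simulate(6, true));
    }
}
```

With cacheFirst set to true the model yields the "what happens"
sequence through Hit6; with it set to false it yields the
expected sequence.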

You do not necessarily need the Apache round-robin
configuration, but it helps.  Email me if you have
any questions.

Jason

----------------------------------------------------------------------

>Comment By: Thomas Peuss (tpeuss)
Date: 2003-12-21 18:41

Message:
Logged In: YES 
user_id=507779

Hello Sacha,

now I get the point. I will change the clustering code to
look up the session in the clustering code in every
situation. Should I apply this to 3.2 or HEAD first?

CU
Thomas

----------------------------------------------------------------------

Comment By: Sacha Labourey (slaboure)
Date: 2003-12-21 17:22

Message:
Logged In: YES 
user_id=95900

> It is not uncommon for a highly redundant, large scale 
> deployment to expect this (Of any technology, not just 
> J2EE)
> from clustered technologies.  I am working on a customer 

What are these "requirements"? Your posts (this and the first
one) fail to state them clearly. There is a bug in the code
(in that we don't always use the latest known information in
the case of dual network failure, flip-flap effect), OK, but I
don't see the "feature requirement" you mention.

----------------------------------------------------------------------

Comment By: Jason Tetrault (airsquig)
Date: 2003-12-21 17:09

Message:
Logged In: YES 
user_id=934522

Hello All,

Two notes. Looking at the code, I had the same worry Sacha
did.  This approach will only work if there is a Tomcat
container failure and restart.  It will cause faulty data if
failures result from network blips or overloaded machines.

If this is the way HTTP clustering is going to work, this at
least needs to be highlighted in the JBoss clustering
documentation.

It is not uncommon for a highly redundant, large scale 
deployment to expect this (Of any technology, not just J2EE) 
from clustered technologies.  I am working on a customer 
system right now for which this is a requirement.  I also believe
it is common for J2EE application servers to support this.
Remember, many times you have a hardware-based load
balancer in the mix as well, with replicated web servers
(which, yes, you can configure for sticky sessions).  The
question really is what happens to the load AFTER the failure
with many users in a sticky-session environment.

Cheers

Jason


----------------------------------------------------------------------

Comment By: Sacha Labourey (slaboure)
Date: 2003-12-21 16:41

Message:
Logged In: YES 
user_id=95900

Hello Thomas,

No, you shouldn't redirect back, but a second failure could
make it go back to the first node, and it wouldn't use the last
known state; this is what is described in the first scenario.

If your HashMap is just holding *my* session, then simply drop
it, as the clustering code keeps both a serialized and a
non-serialized representation of the session, so it won't cost
much.

Why do you think that the Tomcat clustering code was never
designed to work without sticky sessions? I mean, which
cases wouldn't work? As we can enable synchronous
replication, it should be OK IMHO (except for concurrent
updates), but in that case the spec is dumb anyway.

Cheers,

sacha

----------------------------------------------------------------------

Comment By: Thomas Peuss (tpeuss)
Date: 2003-12-21 16:01

Message:
Logged In: YES 
user_id=507779

Sacha,

I see the problem of short network hangs leading to a
failover to another node. But why should I redirect a client
back to his "old" node after it comes up again? The session
on the old node is dead and will be removed after some time.
The local session cache is there because the Tomcat session
manager I derived the clustering session manager from has a
HashMap where it holds its sessions. My understanding of the
clustering code is that it holds the serialized sessions and
I have to deserialize them on access (which is costly - but
what do I tell you ;-) ).

If this is no longer the case we can remove the local
session cache and use the cluster cache on every access. I
think this is more straightforward anyway.

But this is still not a bug, because the Tomcat clustering code
was never designed to work without sticky sessions.

CU
Thomas

----------------------------------------------------------------------

Comment By: Sacha Labourey (slaboure)
Date: 2003-12-21 08:44

Message:
Logged In: YES 
user_id=95900

I don't agree, Thomas: while not using sticky sessions is not
really the best thing to do (what if you have concurrent
(frame) requests to the same session?), the described
problem can occur even in sticky-session situations if we
have a set of minor network outages between the LB and the
JBoss boxes.

What is the purpose of this local session cache anyway? We 
already have the cache at the EJB level for http sessions, 
why do we need another one? Is it because its creation is 
costly?

----------------------------------------------------------------------

Comment By: Thomas Peuss (tpeuss)
Date: 2003-12-20 12:03

Message:
Logged In: YES 
user_id=507779

The JBoss HTTP clustering ONLY works with sticky sessions, for
performance reasons. Maybe we can introduce a configuration
option that allows use in a round-robin fashion (I am still
wondering why someone would want to do this).

So this is not a bug. You can add this as a feature request.

CU
Thomas

----------------------------------------------------------------------
