Pier:

Great discussion points. I really appreciate your thoughtful feedback.

My comment about Tomcat caching session data does not preclude
it from being stored in the remote session server; indeed, that would
be required. My thought was this: in a multi-node network, if multiple
consecutive requests (for the same session) are handled by the same
Tomcat node, then that node should not be forced to retrieve a copy
of the session from the session server on each request. It only
needs to go back to the session server if it doesn't have a 'valid'
copy. Remember that if another Tomcat instance causes the session to
be updated, the server will tell all the clients to invalidate that
session. So this caching still works when intervening requests are
handled by other nodes, as long as those requests do not actually
update the session attributes.
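As a sketch of that caching idea (class and method names are hypothetical,
not existing Tomcat classes): the node consults its local copy first, and
the session server's invalidation callback empties the slot so the next
lookup misses and re-fetches.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-node session cache. A Tomcat node keeps the last copy
// of each session it fetched, and only goes back to the session server
// when it has no 'valid' copy. The session server calls
// invalidate(sessionId) on every client whenever another node updates
// that session.
public class SessionCache {
    private final Map cache = new HashMap();

    // Return the locally cached copy, or null if we must ask the server.
    public synchronized Object get(String sessionId) {
        return cache.get(sessionId);
    }

    // Store the copy we fetched from (or wrote to) the session server.
    public synchronized void put(String sessionId, Object session) {
        cache.put(sessionId, session);
    }

    // Called (remotely) by the session server when some other node
    // updated this session; the next get() will miss and re-fetch.
    public synchronized void invalidate(String sessionId) {
        cache.remove(sessionId);
    }
}
```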

Notice also that, in my concept, there are no delays built into the
architecture (other than the natural delays of sending data over the
network). The session server can simply respond to callers on demand.

---------------------------------
Session Manager pool:

Let's shift gears for a minute, back to my statements about the primary
and secondary servers, and amend them as follows:

'n' Session Servers may be present. Each is aware of (at least one of)
the other Session Server(s). All are 'peers'; no notion of primary and
secondary is present, or necessary, between session servers. Each
session server communicates session additions, changes, and deletions
to the other session servers that it knows about. This opens up a
plethora of configuration possibilities that offer the kinds of trade-offs
that you mentioned.
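The fan-out of changes to known peers might look roughly like this
(a self-contained sketch; all names are hypothetical, and a real
implementation would call through the RMI SessionServerPeer stubs
instead of a plain local interface):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each session server keeps a list of known peers
// and forwards every session change to them, so that all session servers
// end up holding an up-to-date copy of all sessions.
public class PeerNotifier {
    // Stand-in for the remote peer interface.
    public interface Peer {
        void sessionUpdated(String sessionId, byte[] data);
    }

    private final List peers = new ArrayList();

    public void addPeer(Peer peer) {
        peers.add(peer);
    }

    // Called after a local addition/change/deletion; pushes the update
    // to every peer this server knows about.
    public void broadcastUpdate(String sessionId, byte[] data) {
        for (int i = 0; i < peers.size(); i++) {
            ((Peer) peers.get(i)).sessionUpdated(sessionId, data);
        }
    }
}
```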

Clients (e.g. Tomcat instances) may connect to any of the related
session servers. In the event that its active session server goes down,
it can simply point at the 'next one in the list'. This list could be
pre-configured, or, the list could be obtained from the session
server itself (which would simplify configuration). It can do this
because all session servers should have an up-to-date copy of
all Session data. So, for example:

  In a 'cluster' consisting of 10 Tomcats and 2 Session servers,
  5 of the Tomcats may be configured to (initially) talk to session
  server 1, while the other 5 may be configured to talk to session
  server 2. This gives us a way to 'load balance' the Session servers.
  If session server 1 goes down, its 5 clients could simply switch
  over to server 2.
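The 'next one in the list' failover might be sketched like this
(hypothetical names; a real client would hold RMI stubs and catch
RemoteException rather than a generic Exception):

```java
import java.util.List;

// Hypothetical sketch of client-side failover. ServerRef stands in for
// an RMI stub to a session server; fetch() throws on failure, the way a
// stub would throw RemoteException.
public class FailoverClient {
    public interface ServerRef {
        byte[] fetch(String sessionId) throws Exception;
    }

    private final List servers; // pre-configured, or obtained from a server
    private int active = 0;     // index of the server we currently talk to

    public FailoverClient(List servers) {
        this.servers = servers;
    }

    // Try the active server first; on failure, move down the list. Any
    // server will do, because every peer holds a copy of every session.
    public byte[] fetch(String sessionId) throws Exception {
        Exception last = null;
        for (int tried = 0; tried < servers.size(); tried++) {
            ServerRef ref = (ServerRef) servers.get(active);
            try {
                return ref.fetch(sessionId);
            } catch (Exception e) {
                last = e;                               // this server is down;
                active = (active + 1) % servers.size(); // try the next one
            }
        }
        if (last == null) throw new Exception("no session servers configured");
        throw last; // every server in the list failed
    }
}
```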

Clients may still cache Session objects, and may keep them around
until the session server tells them to invalidate a particular session.

This peer interface might look something like this:

public interface SessionServerPeer extends SessionClient {
    // announce this server to (or withdraw it from) a peer
    public void register(SessionServerPeer self) throws RemoteException;
    public void deregister(SessionServerPeer self) throws RemoteException;
    // lets a newly-started peer obtain the current set of sessions
    public HttpSession [] getAllSessions() throws RemoteException;
    // lets clients and peers learn about the rest of the pool
    public SessionServerPeer [] getPeers() throws RemoteException;
    public void addPeers(SessionServerPeer [] peers) throws RemoteException;
}

And while we're at it, let's add the following methods to the SessionServer
and SessionClient interfaces:

    public void setSessionAttribute(String sessionId, String attributeName,
            Object value) throws RemoteException;
    public void removeSessionAttribute(String sessionId,
            String attributeName, Object value) throws RemoteException;


One last point. I'm not certain that we'll be able to send the HttpSession
object as a parameter of type HttpSession (which relies on RMI stubs/skels
to serialize and deserialize it). Instead, we'll probably need to send this
data (the session attributes) as a byte array. The reason I say this is
that we have no control over what objects may be stored as session
attributes, other than that we can insist they be Serializable. The problem
is that if some web-application stores some custom bean in the session, I
don't think the rmic-generated code in our session server will be able to
deserialize it, since it won't know anything about the 'custom bean'.
However, since all Tomcat instances in the cluster will need to know about
the 'custom bean' anyway, we can simply turn the session (attributes) into
a byte array that is sent to the session server. The session server will
send this byte array back out to Tomcat on request. Tomcat would then need
to deserialize the byte array into a session object (or perhaps just the
session attributes).

Having said that, perhaps these methods should look like this:

    public void setSessionAttribute(String sessionId, String attributeName,
            byte [] value) throws RemoteException;
    public void removeSessionAttribute(String sessionId,
            String attributeName, byte [] value) throws RemoteException;
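For illustration, the byte-array round trip on the Tomcat side is just
plain Java serialization; something like this sketch (class name
hypothetical):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Sketch: Tomcat serializes each (Serializable) attribute value to a byte
// array before sending it to the session server, and deserializes the
// bytes it gets back. The session server itself never deserializes, so it
// never needs the web application's 'custom bean' classes on its classpath.
public class AttributeCodec {
    public static byte[] toBytes(Object value) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(buf);
        out.writeObject(value);
        out.close();
        return buf.toByteArray();
    }

    public static Object fromBytes(byte[] data) throws Exception {
        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(data));
        return in.readObject();
    }
}
```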

----- Original Message -----
From: "Pier Fumagalli" <[EMAIL PROTECTED]>
To: "Tomcat Developers List" <[EMAIL PROTECTED]>
Sent: Monday, November 12, 2001 6:04 PM
Subject: Re: Tomcat: Distributed Session Management revisited


| On 13/11/2001 12:54 am, "Tom Drake" <[EMAIL PROTECTED]> wrote:
|
| > Mika:
| >
| > Thanks for the reply. Here's some more thoughts on this subject.
| >
| > The primary problem that I see with the collaborative method
| > (e.g. extending the multicast solution) is
| > that all sessions will have to be sent to all cluster nodes. The
| > number of session updates that have to travel 'on the wire' is
| > proportional to the number of nodes in the cluster.
|
| Linear growth, that's the best we can aim for...
|
| > Further more, when a new tomcat is brought on-line, it must
| > somehow retrieve a copy of all active sessions from somewhere.
| > There is nothing in place for this currently. Using multicast
| > is problematic. If a multicast request is made then all other nodes
| > would respond with all sessions. So, some other approach would
| > need to be taken which would result in two protocols being used
| > to make this feature work. This seems too complicated.
|
| Not "that" complicated. Most of the work on elective processes has been
| done already in the scope of other projects, so, we would only need to
| adapt it to our scope...
|
| > ---------------------------------------
| > Consider this scenario:
| >
| > A user establishes a session on node 1 (of a 10 node cluster),
| > Tomcat would create a new session and transmit it to the
| > multicast port, which would then transmit 10 copies of this
| > session (1 to each cluster node).
| > Now suppose that the next request from this user is sent to
| > node 2, which causes an update to the session to occur. Again
| > 11 copies of the Session are transferred.
| > [...]
| > NOTE: remember this is UDP traffic. The more packets that
| > fly around, the greater the likelihood of dropping packets.
| > Dropped packets in this case means that some tomcat
| > instances may have stale (or no) data for a given session.
|
| Indeed... Quite huge...
|
| > ------------------------------------------
| > With a centralized session manager the following traffic would
| > occur instead:
| >
| > node 1 sends the new session to the session manager
| > node 2 requests the given session (by session id) from the session manager
| > the manager sends a copy of the session to node 2
| > node 2 updates the session and sends it back to the manager.
| > the manager invokes the 'invalidateSession(sessionId)' method on each of
| > the nodes.
| >  (note: invalidateSession only contains the value of 'SessionId' + any
| >   additional RMI overhead. This is far smaller than a complete Session
| >   object)
| >
| > The number of session copies sent as the result of an update is 2.
| > This number does not depend or vary based on the number of nodes.
| >
| > Now, let's add to the story. Let's say that Tomcat is smart enough to
| > cache Session objects in its memory space. Once Tomcat gets its hands
| > on a 'Session', it keeps it until it becomes 'too old' or an
| > 'invalidateSession(sessionId)' message is received from the remote
| > Session Manager. This could cut down the number of transfers of
| > Session data from 2 to somewhere between 1 and 2.
|
| Yes, but in this case, we don't have redundancy of sessions... So, if the
| Tomcat which has the session dies, the whole session dies with him...
|
| > -----------------------------------------------------
| > On Redundant Session Managers.
| >
| > There are a couple ways to achieve this. One way is to place two
| > Session Managers in the network. One of them is the 'active' one; the
| > other one could simply register itself as a client of the 'active'
| > server. As a client, it can obtain copies of all new and changed
| > sessions from the active server. If for some reason the active server
| > needs to be brought down, it will send a message to all of its clients
| > (including the 'dormant' session manager) indicating that it's
| > shutting down. The clients could, on receipt of this message, connect
| > to the 'next' session server (in their pre-configured list of
| > servers). The clients could simply carry on with the new server.
|
| Indeed...
|
| > If the active server simply goes off the air for some mysterious
| > reason, the clients would get a RemoteException the next time they
| > tried to talk to the server. This would be their clue to 'cut-over' to
| > the other server (as described above).
|
| But how would they know where the sessions ended up????
|
| > Last point. Sending Session delta's instead of the entire Session:
| >
| > This should be doable. The main thing that we care about is the Session
| > attributes which are changed by the application. It's up to the
| > web-application to replace these values into the Session if their
| > contents change. This is enough for us to be able to track which
| > attributes have actually changed.
|
| This can actually be done if we consider every operation on a session
| (adding/replacing/removing an attribute) an atomic operation...
|
| Let's see if I can complicate things a little bit :) (Love doing that).
|
| Let's imagine to have a pool of session managers (SA, SB, SC...) and a
| pool of servlet containers (T1, T2, T3...).
|
| The first thing we want to do is bring up our session managers. Once we
| start them SA, SB, SC and SD are available to accept sessions.
|
| Then we start our servlet containers T1, T2, T3 and T4. When a request
| comes in in any of the servlet containers, the servlet container simply
| broadcasts a message saying "who can hold a session for me?" All four
| managers will reply to that request, and the servlet container can
| "order" them. For example, if we want a redundancy level of 2, the
| container might choose SB as the "primary" session manager, and SA as
| the "replica 1" session manager; if we want a redundancy level of 3,
| the container might choose SD as "primary", SA as "replica 1" and SB as
| "replica 2".
|
| The information about "who is primary" and "who is replica X" is stored
| within the session manager itself.
|
| When one of the servlet containers needs to read or write from a
| session, it will broadcast (again) the message "who holds this
| session?". Of course, all session managers holding (primary or replica)
| that session will reply with their "status" (primary, or replica #),
| and then the servlet container will persist the data in -first of all-
| the primary session manager, -then- in all the replicas, and at the end
| return control to the caller (the thread which called
| "setAttribute/getAttribute").
|
| What happens if one of the session managers goes down? Simply that the
| servlet container will notice that something is going wrong: if
| configured with a "replica factor" of 3, it gets only 2 responses to
| "who's holding this session?", so we know for sure that one of the
| replicas (or the master) has gone down. So, simply, we can "elect" one
| of the replicas as "primary" (if the primary has gone down), and/or
| broadcast a message saying "who can be replica for this session?"...
| The session is then persisted in all three places (the two old ones,
| plus the new one), and the thing goes on...
|
| What does it give us? A lot of flexibility, in that only a little data
| is broadcast (messages such as "who can hold this session?" or "who has
| this session?" or "who can be replica for this session?"), so we avoid
| problems with UDP. Then we have sub-linear growth, in that the traffic
| over the network is only (N*(sessiondata+overhead)) where N is the
| replica factor, and the administrator is free to trade his own data
| safety (more replicas, more traffic, more redundancy) for speed (less
| replicas, less traffic, less redundancy)...
|
| We don't have a single point of failure (whohoooo!), we don't need to
| replicate sessions with linear growth on N (where N is the number of
| session managers), and we get load balancing of sessions for free...
|
| The only problem is that we need to use multicast, but that shouldn't be a
| big issue...
|
|     Pier
|
|
| --
| To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
| For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
|
|
|

