Re: Entity beans, clustering and scalability

Bobby Woolf Thu, 08 Feb 2001 13:50:07 -0800
Ken,

You're right, the EJB spec does not specify how (or even if) an EJB
container should provide load balancing for multiple EJB clients. Many EJB
container products use clustering, but traditional clustering is not the
only way for an EJB container to achieve load balancing. GemStone/J provides
load balancing in a way that doesn't burden the deployment environment with
the difficulties of clustering and gateways, a way that is transparent to
the EJB provider, deployer, and client. Here's how that works in GemStone/J.

<vendor>

GSJ has what we call Extreme Clustering, which can be thought of as VM
pooling. It's like session bean pooling, where a client is given a bean
that's available. When the client is finished, the bean is returned to the
pool and can be reused by another client. The size of the pool grows and
shrinks dynamically based on load to maximize performance while minimizing
resource consumption. In GSJ, the server can start up as many VMs as the
hardware can support, but only starts up as many as it needs to support the
current load. This pool of VMs is managed by the Activator, a service that
is built into the GSJ container and automatically used once you install GSJ.

GSJ also performs what we call Smart Load Balancing. The Activator monitors
each VM's load; it starts or stops VMs when necessary to control the size of
the pool. When an EJB client request is made on the GSJ container, GSJ
passes the request to the Activator. The Activator picks the best VM for the
job--based on each VM's load, the nature of the request, each VM's existing
resources such as pooled beans, etc.--and delegates the request to that
optimal VM. The VM then uses a server bean to fulfil the request in the
normal EJB way. If no VM is optimal because load is too high, then the
Activator will start more VMs.
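
To make the dispatch decision concrete, here is a rough sketch in plain Java.
It is not GSJ code: the names (ActivatorSketch, Vm, Activator, dispatch,
complete) and both thresholds are hypothetical, and "load" is reduced to a
count of in-flight requests, whereas the real Activator also weighs the nature
of the request and each VM's pooled resources.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch only; none of these names come from GemStone/J.
class ActivatorSketch {

    // Stand-in for one server VM; load is just the number of in-flight requests.
    static final class Vm {
        final int id;
        int activeRequests;
        Vm(int id) { this.id = id; }
    }

    static final class Activator {
        private final List<Vm> pool = new ArrayList<>();
        private final int maxRequestsPerVm; // assumed threshold for "load is too high"
        private final int maxVms;           // assumed cap standing in for hardware limits
        private int nextId = 0;

        Activator(int maxRequestsPerVm, int maxVms) {
            this.maxRequestsPerVm = maxRequestsPerVm;
            this.maxVms = maxVms;
            pool.add(new Vm(nextId++)); // start with a single VM
        }

        // Pick the least-loaded VM; start another one if every VM is saturated.
        synchronized Vm dispatch() {
            Vm best = pool.stream()
                    .min(Comparator.comparingInt((Vm vm) -> vm.activeRequests))
                    .get(); // the pool is never empty
            if (best.activeRequests >= maxRequestsPerVm && pool.size() < maxVms) {
                best = new Vm(nextId++);
                pool.add(best);
            }
            best.activeRequests++;
            return best;
        }

        // Called when a request completes; an idle VM is stopped, but one always stays.
        synchronized void complete(Vm vm) {
            vm.activeRequests--;
            if (vm.activeRequests == 0 && pool.size() > 1) {
                pool.remove(vm);
            }
        }

        synchronized int poolSize() { return pool.size(); }
    }

    public static void main(String[] args) {
        Activator activator = new Activator(2, 4);
        List<Vm> inFlight = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            inFlight.add(activator.dispatch());
        }
        System.out.println("VMs after 5 requests: " + activator.poolSize());
        for (Vm vm : inFlight) {
            activator.complete(vm);
        }
        System.out.println("VMs after all complete: " + activator.poolSize());
    }
}
```

With at most two requests per VM, five concurrent requests grow the pool to
three VMs, and the pool shrinks back to one once they all complete.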

Extreme Clustering and Smart Load Balancing work across a network of
server computers and are especially well suited to large server computers
like a Sun E10000. Most app servers start only a fixed number of VMs, and are
further limited in how many VMs and IP addresses they can use, so they
use only part of the computer's capacity, which leaves capacity unused
yet constrains scalability. In such a configuration, buying a
larger box won't increase scalability; it just increases unused capacity.
With GSJ, the app server places no limit on the number of VMs, so the pool
can grow to use all of the computer's resources. If you buy a larger box,
you'll have more computer resources for more VMs, which will give you more
scalability.

Note that all of this is totally EJB compliant and happens within the EJB
container. Your EJB client code just requests a standard EJBean from an EJB
home and uses it like normal. Both your bean and client code are totally
unaware of the pool of VMs and the load balancing. Yet your code is suddenly
much more scalable than it might otherwise be.

</vendor>

Now to answer your questions:

> 1. What controls it? Is there some piece of gateway
> software that negotiates where to go in the cluster?

Many other products require a gateway between the clients and the server
cluster to distribute the load, but GSJ does not; the GSJ "gateway" (aka the
Activator) is built in. All EJB containers intercept client requests to
control security and transactions, acquire beans from pools, etc. GSJ, in
addition to all that, determines which of the available VMs would be best to
use and therefore balances (not just distributes) the load in the process.
All of this is completely transparent to the EJB client code, provider code,
and deployment mechanism.

> 2. In a cluster, is there always only one instance
> of an entity bean for a given row?

This is vendor specific. A single bean could somehow be shared, there could
be a single bean per CORBA ORB, or each request could get its own bean copy. The spec
allows a single entity bean to be shared, but only one transaction can go
through a bean at a time. The most problematic issue with sharing an entity
bean is the possibility of write-write conflicts, so the "only one
transaction at a time" rule prevents such a conflict in the bean. At the end
of the transaction, the bean must be stored and reloaded, the details of
which are container and database specific. Thus the write-write conflicts
occur not in the entity bean, but in the database (relational, object, flat
file, whatever). So EJB sidesteps these concurrency issues and delegates
them to the database.
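
As an illustration of the "only one transaction at a time" rule for a shared
bean, here is a minimal sketch in plain Java. It is not container code: the
names (SharedEntitySketch, Row, SharedEntity, inTransaction, runDemo) are
hypothetical, a ReentrantLock stands in for whatever the container really does
to serialize transactions, and the load and store steps only mimic what
ejbLoad and ejbStore do around a transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch only; a real container does this behind the bean's back.
class SharedEntitySketch {

    // Stand-in for the database row behind the entity bean.
    static final class Row {
        int balance;
        Row(int balance) { this.balance = balance; }
    }

    // One shared instance for the row; only one "transaction" may use it at a time.
    static final class SharedEntity {
        private final Row row;
        private final ReentrantLock txLock = new ReentrantLock();
        private int cachedBalance;

        SharedEntity(Row row) { this.row = row; }

        int inTransaction(IntUnaryOperator work) {
            txLock.lock(); // other transactions wait here
            try {
                cachedBalance = row.balance;                    // load at transaction start
                cachedBalance = work.applyAsInt(cachedBalance); // business logic
                row.balance = cachedBalance;                    // store at transaction end
                return cachedBalance;
            } finally {
                txLock.unlock();
            }
        }
    }

    // Runs the given number of client threads, each incrementing the balance.
    static int runDemo(int threads, int incrementsPerThread) {
        SharedEntity entity = new SharedEntity(new Row(0));
        List<Thread> clients = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread client = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    entity.inTransaction(balance -> balance + 1);
                }
            });
            clients.add(client);
            client.start();
        }
        try {
            for (Thread client : clients) {
                client.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for clients", e);
        }
        return entity.inTransaction(balance -> balance); // read-only transaction
    }

    public static void main(String[] args) {
        System.out.println("final balance: " + runDemo(8, 1000));
    }
}
```

No update is lost, because every transaction goes through the bean one at a
time. That same lock is also why a single shared instance becomes a
bottleneck when many clients want the same row.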

> 3. If 2 is answered yes, doesn't this imply, since the EJB spec says nothing
> about load balancing, that the spec and entity beans in general are
> non-scalable? That is, multiple clients using multiple app servers in a
> cluster still need somehow to go to the same instance in one app server
> instance. That doesn't sound very scalable. In that case, clustering sounds
> pretty meaningless, and claims that EJBs are scalable would seem not to
> apply to entity beans.

You could be right, depending on the container's implementation. If hundreds
of clients were manipulating the same row of data at the same time, and if
the container would only create one entity bean instance for that row of
data, then the entity bean would become a bottleneck for those clients, no
matter how many VMs you have and how they're clustered. This is why most
container products won't do this. They'll create multiple entity bean
instances for the same row of data and at least attempt to spread the client
load across those instances. This moves the bottleneck from the entity
bean(s) to the row of data inside the database. Presumably the database is
prepared to handle all of this concurrency with efficient locking mechanisms
and caching so that the database does not become a bottleneck either.
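
Here is a small sketch of that arrangement in plain Java: two bean instances
for the same row, with the conflict surfacing at the data store rather than in
the bean. Everything in it is an assumption made for illustration. The names
(PerRequestEntitySketch, RowStore, EntityInstance, writeIfUnchanged) are
hypothetical, and the version check is optimistic concurrency control, which
is only one of the mechanisms a database might use; many use locks instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; the store stands in for a real database.
class PerRequestEntitySketch {

    // In-memory "table": key -> {value, version}.
    static final class RowStore {
        private final Map<String, int[]> rows = new HashMap<>();

        synchronized void insert(String key, int value) {
            rows.put(key, new int[] {value, 0});
        }

        // Returns a snapshot copy of {value, version}.
        synchronized int[] read(String key) {
            return rows.get(key).clone();
        }

        // The write succeeds only if nobody else has written since our read.
        synchronized boolean writeIfUnchanged(String key, int newValue, int expectedVersion) {
            int[] row = rows.get(key);
            if (row[1] != expectedVersion) {
                return false; // write-write conflict detected at the store
            }
            row[0] = newValue;
            row[1]++;
            return true;
        }
    }

    // One of possibly many bean instances representing the same row.
    static final class EntityInstance {
        private final RowStore db;
        private final String key;
        private int value;
        private int version;

        EntityInstance(RowStore db, String key) {
            this.db = db;
            this.key = key;
        }

        void load() {
            int[] snapshot = db.read(key);
            value = snapshot[0];
            version = snapshot[1];
        }

        int getValue() { return value; }
        void setValue(int value) { this.value = value; }

        boolean store() {
            return db.writeIfUnchanged(key, value, version);
        }
    }

    public static void main(String[] args) {
        RowStore db = new RowStore();
        db.insert("row-1", 100);

        EntityInstance a = new EntityInstance(db, "row-1");
        EntityInstance b = new EntityInstance(db, "row-1");
        a.load();
        b.load();

        a.setValue(a.getValue() + 50);
        b.setValue(b.getValue() - 30);
        System.out.println("a stored: " + a.store()); // first writer wins
        System.out.println("b stored: " + b.store()); // stale version, rejected

        // b retries against fresh data, as a retried transaction would.
        b.load();
        b.setValue(b.getValue() - 30);
        System.out.println("b retried: " + b.store());
        System.out.println("final value: " + db.read("row-1")[0]);
    }
}
```

Neither bean instance blocks the other while it works; the only point of
contention is the write to the row itself.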

I hope that answers your questions.

Bobby

-----
Bobby Woolf
Senior Architect
GemStone Systems, a Brokat company
[EMAIL PROTECTED]

-----Original Message-----
From: Kenneth D. Litwak [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 06, 2001 20:07
To: [EMAIL PROTECTED]
Subject: [EJB-INT] Entity beans, clustering and scalability


  Since the EJB spec says nothing so far as I know about load balancing,
vendors have been left to their own devices.  Their solutions have all
revolved around clustering to my knowledge, which could be incorrect.  I have
a few questions about clustering then.

  1.  What controls it?  Is there some piece of gateway software that
negotiates where to go in the cluster?

  2.  In a cluster, is there always only one instance of an entity bean for a
given row?

  3.  If 2 is answered yes, doesn't this imply, since the EJB spec says
nothing about load balancing, that the spec and entity beans in general are
non-scalable?  That is, multiple clients using multiple app servers in a
cluster still need somehow to go to the same instance in one app server
instance.  That doesn't sound very scalable.  In that case, clustering sounds
pretty meaningless, and claims that EJBs are scalable would seem not to apply
to entity beans.  Am I missing something in this?  Thanks.


  Ken

===========================================================================
To unsubscribe, send email to [EMAIL PROTECTED] and include in the body
of the message "signoff EJB-INTEREST".  For general help, send email to
[EMAIL PROTECTED] and include in the body of the message "help".
