Entity Bean Crisis

Dan Benanav Tue, 05 Dec 2000 17:35:41 -0800
I question the usefulness of entity beans.   Admittedly, the idea sounds
great, but try
modeling your real-life application using entity beans; you are likely
to encounter a
multitude of complexities that will lead to a less robust system and
ultimately will slow
down the development cycle.  Let me assure you that I am not a Microsoft
loyalist just
out to prove that Java is bad.  To the contrary, I am a fan of Java, and
a fan of EJB in
particular.  Amazingly enough it appears that the EJB community has
simply ignored
these and other problems with entity beans.  In general EJB is complex
and difficult to
program.  Marketing of Enterprise JavaBeans will only carry it so far.
Eventually, after
enough projects fail, the business community will catch on and abandon
entity beans
altogether.   That would be unfortunate since there is still great merit
to much of the EJB
standard and what it is trying to accomplish.

What is the real motivation for entity beans?  I believe the unfulfilled
promise of entity
beans is that they allow a programmer to create a persistent,
transactional and distributed
object model in much the same way as he or she would create a
non-persistent, non-
transactional, and non-distributed object model.  For example, suppose
you are creating
an object model that is to be run on one server and requires no
persistence of objects.
Let's suppose you have defined one of your business objects as follows:

public class BusObj
{
    private int myX1;
    private int myX2;
    public int getX1()
    {
        return myX1;
    }
    public void setX1(int x)
    {
        myX1 = x;
    }
    public int getX2()
    {
        return myX2;
    }
    public void setX2(int x)
    {
        myX2 = x;
    }
    public void doBusinessLogic(int y)
    {
        if (myX1 <= 5) {
            myX1 = myX1 + y;
        }
        else {
            myX2 = myX2 + y;
        }
    }
}

Suppose you want to make this a persistent object that maintains
transactional integrity
and that works in a distributed system.  Wouldn't it be nice if you
could basically use the
same class, perhaps with some added methods to handle the persistent,
transactional and
distributed aspects?  That is the promise of entity beans.  For the
BusObj example I
would, among other things (define the Home interface, etc.), have to
define the
BusObjBean.  That bean would have all the methods in the original
business object plus
the addition of several new methods:

import javax.ejb.EntityBean;

public class BusObjBeanAttempt1 implements EntityBean
{
    //All this is exactly as above
    private int myX1;
    private int myX2;
    public int getX1()
    {
        return myX1;
    }
    public void setX1(int x)
    {
        myX1 = x;
    }
    public int getX2()
    {
        return myX2;
    }
    public void setX2(int x)
    {
        myX2 = x;
    }
    public void doBusinessLogic(int y)
    {
        if (myX1 <= 5) {
            myX1 = myX1 + y;
        }
        else {
            myX2 = myX2 + y;
        }
    }
    //Now add the ejb specific stuff
    public void ejbLoad() {
        //SQL stuff to set myX1 and myX2
    }
    public void ejbStore() {
        //SQL stuff to update the database
    }
    //All the rest of the ejb stuff: ejbCreate, ejbRemove,
    //ejbActivate, ejbPassivate, ejbFindByPrimaryKey, setEntityContext,
    //unsetEntityContext
}

Unfortunately in a distributed system this class can lead to performance
issues if the get
and set methods are called often because they are remote calls.  So now
we must change
our design to prevent too many such calls.  A standard way to do this is
to define a
business detail object and to use that in your bean as follows:

//Serializable so that it can be passed by value across remote calls
public class BusObjDetail implements java.io.Serializable
{
    private int myX1;
    private int myX2;
    public int getX1()
    {
        return myX1;
    }
    public void setX1(int x)
    {
        myX1 = x;
    }
    public int getX2()
    {
        return myX2;
    }
    public void setX2(int x)
    {
        myX2 = x;
    }
}

import javax.ejb.EntityBean;

public class BusObjBeanAttempt2 implements EntityBean
{
    //The state now lives in a single detail object
    private BusObjDetail myBusObj;

    public void setBusObjDetail(BusObjDetail bo)
    {
        myBusObj = bo;
    }

    public BusObjDetail getBusObjDetail()
    {
        return myBusObj;
    }

    public void doBusinessLogic(int y)
    {
        if (myBusObj.getX1() <= 5) {
            myBusObj.setX1(myBusObj.getX1() + y);
        }
        else {
            myBusObj.setX2(myBusObj.getX2() + y);
        }
    }
    //Now add the ejb specific stuff
    public void ejbLoad() {
        //SQL stuff to populate myBusObj
    }
    public void ejbStore() {
        //SQL stuff to update the database
    }
    //All the rest of the ejb stuff: ejbCreate, ejbRemove,
    //ejbActivate, ejbPassivate, ejbFindByPrimaryKey, setEntityContext,
    //unsetEntityContext
}

The more properties your class has the more important it is to use a
business detail object.
Having to do this isn't so bad, but we have just encountered the first
minor inadequacy of
entity beans.   You can't really model your objects without thinking
about remote calls
and the fact that your system is distributed.

There is another problem with the above code.   Every time we call the
getBusObjDetail()
method the EJB container will first call ejbLoad to synchronize the
state of the bean with
the database as it should.  However, it will also call ejbStore after
calling
getBusObjDetail.   That is a completely unnecessary call since no
changes are being
made to the database.   Now this is a potentially serious problem since
we are doing extra
work for nothing.  The ejbLoad, method call, ejbStore cycle makes sense
for methods
that change the state of the bean but not for methods that merely query
for information.
Interestingly enough the EJB specification does not address this issue
although some
vendors do.   To handle this problem, programmers can use a dirty flag
to indicate that
the state of the bean has changed.  Unfortunately, using such a flag is
prone to
programmer errors since the flag must be properly maintained and reset
in coordination
with container callbacks to the bean.  The programmer must be keenly
aware of the
bean's life cycle and the sequence of container callbacks.
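
To make the bookkeeping concrete, here is a minimal sketch of the dirty-flag idea.  It is a plain class that mimics the ejbLoad and ejbStore callbacks rather than a real entity bean, so it runs without a container.  The class name DirtyFlagBean and the storeCount counter are illustrative names only, and the SQL is left as comments.

```java
// Sketch only: mimics the container callbacks so the dirty-flag bookkeeping
// can be shown without an EJB container. storeCount stands in for the SQL
// UPDATE that a real ejbStore would issue.
class DirtyFlagBean {
    private int myX1;
    private int myX2;
    private boolean dirty;   // true when in-memory state differs from the database
    int storeCount;          // number of simulated UPDATE statements

    public void ejbLoad() {
        // A real bean would SELECT myX1 and myX2 here.
        dirty = false;       // freshly loaded state matches the database
    }

    public void ejbStore() {
        if (!dirty) {
            return;          // read-only call: skip the UPDATE
        }
        storeCount++;        // a real bean would UPDATE the row here
        dirty = false;
    }

    // Query method: leaves the flag alone.
    public int getX1() {
        return myX1;
    }

    // Every mutator has to remember to set the flag.
    public void setX1(int x) {
        myX1 = x;
        dirty = true;
    }

    public void doBusinessLogic(int y) {
        if (myX1 <= 5) {
            myX1 = myX1 + y;
        } else {
            myX2 = myX2 + y;
        }
        dirty = true;
    }
}
```

Forgetting the flag in a single mutator silently drops that update, and forgetting to reset it in ejbLoad brings the extra writes back, which is exactly the kind of error described above.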

Even if you manage to properly use the dirty flag to avoid unnecessary
calls to the
database you may still have some performance issues.   Some vendors use
pessimistic
concurrency for entity beans.  To explain this further I must describe
another detail about
beans that I have left out.  Each bean must have a unique primary key
that can be any
java object type.  For the business object example let's just say the
primary key is of type
Integer.  To accomplish this we could just add a data member of type
Integer to the bean.
Now suppose two clients make a call to a bean with the same primary
key.   Some
containers will serialize calls to the bean.  That means that only one
client will be active
at any time.   Such an approach is called pessimistic concurrency.
Optimistic
concurrency allows both calls to proceed concurrently. Pessimistic
concurrency can be
helpful in maintaining transactional integrity, as we shall soon see.
However, for
method calls that don't update the bean, locking the bean is
unnecessary, and again
performance issues can arise.  When using pessimistic concurrency you
must design your
application carefully to avoid deadlock.  That is not always easy to do
and programmers
who haven't thought about deadlock are bound to run into trouble.
Having to avoid deadlock is
the nature of any transactional system.

While some vendors use pessimistic concurrency on a per server basis,
most if not all use
optimistic concurrency in a multi-server clustered setting.
Therefore, when
writing code you should assume optimistic concurrency if you think there
is any chance
that you might want to create a clustered application.  Otherwise you
may be forced to
entirely rewrite your application.

Up until now we have encountered a variety of problems.  There are other
problems I
won't go into detail about, but let me just say that the programmer in
general must be
keenly aware of the life cycle of a bean and the container callback
process.  If you think
you can design a real system without reading the specification carefully
you are bound to
run into trouble.

One could argue that all the problems I mentioned so far aren't major
problems but there
is one final issue that is just hard to ignore.   Suppose that the
doBusinessLogic method is
one that should not be called simultaneously on the same bean.   A
pessimistic
concurrency approach would prevent concurrent calls, but as I have said,
one cannot
assume pessimistic concurrency in a clustered environment.  Notice that
the
doBusinessLogic method may change the value of myX1.  Suppose that there
are two
simultaneous calls to doBusinessLogic(1) on the same bean.  Both bean instances
do an ejbLoad
and initialize the value of myX1 from the database.  Let's say the value
myX1 is 2.
According to the logic in the method the value of myX1 will be
incremented to 3.   That
value gets stored into the database when ejbStore is called.   Now
imagine that
doBusinessLogic is keeping track of a balance.   In such a case the
value of myX1 should
have been incremented twice, to 4, but instead ends up at 3: one of the
two updates is lost.
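
The interleaving can be replayed deterministically in a few lines.  This is only a sketch: an in-memory map stands in for the database, two plain objects stand in for the two bean instances the container would use for the same primary key, and the names LostUpdateDemo, Instance and DB are made up for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: replays the load/load/update/update/store/store interleaving.
class LostUpdateDemo {
    // Stand-in for the database table: primary key -> stored value of myX1.
    static final Map<Integer, Integer> DB = new HashMap<Integer, Integer>();

    // Stand-in for one bean instance; only myX1 is modelled.
    static class Instance {
        private final Integer key;
        private int myX1;

        Instance(Integer key) {
            this.key = key;
        }

        void ejbLoad() {
            myX1 = DB.get(key);
        }

        void ejbStore() {
            DB.put(key, myX1);
        }

        void doBusinessLogic(int y) {
            if (myX1 <= 5) {
                myX1 = myX1 + y;
            }
        }
    }

    // Returns the value left in the "database" after both calls complete.
    static int run() {
        Integer key = Integer.valueOf(1);
        DB.put(key, 2);

        Instance a = new Instance(key);
        Instance b = new Instance(key);

        a.ejbLoad();              // a reads 2
        b.ejbLoad();              // b reads 2
        a.doBusinessLogic(1);     // a now holds 3
        b.doBusinessLogic(1);     // b now holds 3
        a.ejbStore();             // database: 3
        b.ejbStore();             // database: 3 again, a's update is overwritten

        return DB.get(key);
    }
}
```

Calling LostUpdateDemo.run() returns 3, whereas running the two calls one after the other would leave 4.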

So what do we do about this problem?  Here is where the real trouble
begins.  The
simplest approach may be to set the transaction isolation mode to
TRANSACTION_SERIALIZABLE.   Unfortunately some major database vendors,
notably Oracle, do not really support a true TRANSACTION_SERIALIZABLE
mode.
Using Oracle's version of TRANSACTION_SERIALIZABLE the second
transaction to
complete would experience an SQL exception.  That exception would have
to be caught
in the ejbStore method and the transaction would have to be rolled
back using the
setRollbackOnly method.   Unfortunately one cannot retry within the ejbStore
method itself
since there is no way to start a new transaction from within the bean.
The client must catch
the exception and retry.   One could create a session bean with
TRANSACTION_NOT_SUPPORTED set and retry within that bean.   This
approach
seems ugly to me and would only work in cases where concurrent
transactions are
unlikely to occur.
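
For what it is worth, the retry itself is not much code.  The sketch below assumes each attempt is a call that runs in its own fresh transaction, for example a remote call on the bean made from a client or from a session bean that has no transaction of its own.  RetryingClient and callWithRetry are hypothetical names, not part of any EJB API.

```java
import java.util.concurrent.Callable;

// Sketch only: a generic retry wrapper for a call that may fail because a
// concurrent transaction won the race.
class RetryingClient {
    // Runs the call up to maxAttempts times and returns the first successful
    // result. If every attempt fails, the last exception is rethrown.
    // Real code would retry only on the specific exception that signals a
    // rolled-back transaction, not on every Exception as done here.
    static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();   // assumed to start a new transaction each time
            } catch (Exception e) {
                last = e;             // remember the failure and try again
            }
        }
        throw last;
    }
}
```

Under heavy contention every attempt can lose the race again, which is why this only works where concurrent transactions are rare.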

The way Oracle handles such a situation is pretty straightforward.
Before executing
doBusinessLogic one would do a "select for update" to get the values of
myX1 and
myX2.  That would prevent any other transaction from updating those
values until after
the transaction is complete.  To use this approach in the entity bean we
could either do a
"select for update" in ejbLoad or we could do it in the doBusinessLogic
method.  The
first choice would end up locking the bean for all calls including
queries and so is not a
good solution.  The second choice would lead to several unnecessary
database calls since
ejbLoad would still occur.
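
Outside of an entity bean the "select for update" approach looks roughly like the sketch below.  It is plain JDBC that manages its own transaction on the connection, which is not what you would do under container-managed transactions.  The table bus_obj and its columns id, x1 and x2 are hypothetical, as are the names BusObjDao and apply.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch only: table and column names are hypothetical.
class BusObjDao {
    static final String SELECT_FOR_UPDATE =
        "SELECT x1, x2 FROM bus_obj WHERE id = ? FOR UPDATE";
    static final String UPDATE =
        "UPDATE bus_obj SET x1 = ?, x2 = ? WHERE id = ?";

    // The business rule from doBusinessLogic, returning the new {x1, x2}.
    static int[] apply(int x1, int x2, int y) {
        if (x1 <= 5) {
            x1 = x1 + y;
        } else {
            x2 = x2 + y;
        }
        return new int[] { x1, x2 };
    }

    // Read-modify-write in one transaction. The row lock taken by
    // SELECT ... FOR UPDATE is held until commit or rollback, so a second
    // caller blocks on the SELECT instead of reading a stale value.
    static void doBusinessLogic(Connection con, int id, int y) throws SQLException {
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false);
        try {
            int x1;
            int x2;
            try (PreparedStatement ps = con.prepareStatement(SELECT_FOR_UPDATE)) {
                ps.setInt(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next()) {
                        throw new SQLException("no row with id " + id);
                    }
                    x1 = rs.getInt(1);
                    x2 = rs.getInt(2);
                }
            }
            int[] updated = apply(x1, x2, y);
            try (PreparedStatement ps = con.prepareStatement(UPDATE)) {
                ps.setInt(1, updated[0]);
                ps.setInt(2, updated[1]);
                ps.setInt(3, id);
                ps.executeUpdate();
            }
            con.commit();
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Only the call that changes state takes the lock here; a plain query would use an ordinary SELECT and never block.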

In summary it appears that the ejbLoad, method call, ejbStore cycle of
entity beans
creates too many programming difficulties.   Even in the simplest
examples programmers
are forced to understand the entity bean life cycle and callback process
and to hack the
code so that it works with the database locking approach.  Furthermore,
all this stuff is
easy and straightforward if you just use Stateless Session beans and
ordinary business
objects.  In a future article I will describe how to do that.

I look forward to comments on these criticisms.  However, I am most
interested in
comments concerning my last criticism regarding handling of concurrent
transactions
with entity beans.

Dan
