MessageContext Persistence Performance Improvement
--------------------------------------------------

                 Key: AXIS2-3210
                 URL: https://issues.apache.org/jira/browse/AXIS2-3210
             Project: Axis 2.0 (Axis2)
          Issue Type: Improvement
          Components: kernel
            Reporter: Rich Scheuerle
            Assignee: Rich Scheuerle



Background: 
-----------------
When a MessageContext is persisted (for reliable messaging), the MessageContext
object and its associated objects are written out to an ObjectOutput.  When a
MessageContext is hydrated, it is read from an ObjectInput.  The utility class,
ObjectStateUtils, provides static utility functions that add safety mechanisms
for writing and reading the data.

Problem:
--------------
The IBM performance team has profiled this code.  They found that writing and
reading these objects is time consuming.  Some of the performance penalties are
due to the use of static methods (which hinders the ability to reuse byte
buffers).  Other penalties are due to the way that we determine whether an
object can be "safely written".
This JIRA issue addresses a number of these concerns.

Scope of Changes (Important):
-------------------------------------------
These changes only amend the existing writeExternal and readExternal support.
There is no impact on any code that does not use these methods.  No other
logic or APIs are added or changed.


Specific Concerns and Solutions:
----------------------------------------------
  A) The original logic writes objects into a buffer.  If a serialization error 
occurs, the algorithm safely 
     accommodates the error.  The downside is that it is very expensive to 
write each object to a temporary buffer.

     Solution:
     A new marker interface, SafeSerializable, is introduced.  If an object
     (e.g. MessageContext) has this marker interface or is a lang wrapper
     object (e.g. String), then the object is written directly to the
     ObjectOutput.  Eliminating the extra buffer write increases throughput.
     A similar change is made to the read algorithm.  The new algorithm
     detects whether the object was written directly or whether it was
     written as a byte buffer.  If it was written directly, no extra
     buffering is needed when reading.

  B) If a buffer is needed to write or read an object, the ObjectStateUtils 
class creates a new buffer.  This 
     excessive allocation of buffers and subsequent garbage collection can 
hinder performance.

     Solution:
     The code is refactored to use two new classes: SafeObjectOutputStream
     and SafeObjectInputStream.  These classes wrap the ObjectOutput and
     ObjectInput objects and provide logic similar to ObjectStateUtils.
     The key difference is that these are not static utility classes.
     Therefore any buffers used during writing or reading can be reused for
     the life of the *Stream object.  In one series of tests, this reduced
     the number of buffers from 40 to 2 for persisting a MessageContext.

  C) When an outbound MessageContext is persisted, its associated inbound 
MessageContext (if present) is also persisted.
     The problem is that the inbound MessageContext may have a large message.  
Writing out this message can impact performance
     and in some cases causes logic errors.
  
     Solution:
     Any code that hydrates an outbound MessageContext should never need the 
message (soapenvelope) associated with the 
     inbound MessageContext.  The solution is to not persist the inbound 
message.

  D) In the current code, "marker" strings are persisted along with the data.  
These marker strings may contain a lengthy 
     correlation id.   This extra information can impact performance and file 
size.

     Solution:
     I reduced the number of "marker" strings.  The remaining marker strings 
are changed to the "common name" of the object
     being persisted.  In most cases, the log correlation id is no longer 
present in the marker string.  In addition, I made
     changes to only create a log correlation id "on demand".  The log 
correlation code uses the (synchronized) UUIDGenerator.  
     Creating the log correlation id "on demand" limits unnecessary locking.

  E) Miscellaneous.  I spent time fine tuning the algorithmic logic in 
SafeObjectInputStream and SafeObjectOutputStream
     to eliminate extra buffers (i.e. ByteArrayOutputStream optimizations).  
These are all localized changes.
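
Concerns A and B above can be sketched together in a few lines of Java.  This
is only an illustration under stated assumptions, not the code in the patch:
the class name SafeStreamSketch, the nested Marker interface (a local stand-in
for SafeSerializable), the list of "lang wrapper" types, and the polarity of
the boolean flag (true = written directly) are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Illustrative only: hypothetical names, not the Axis2 classes. */
final class SafeStreamSketch {
    /** Local stand-in for the SafeSerializable marker interface. */
    interface Marker extends Serializable {}

    private final ObjectOutput out;
    // One buffer per wrapper instance, reused for every buffered write.
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    SafeStreamSketch(ObjectOutput out) {
        this.out = out;
    }

    /** Assumed rule: nulls, marked objects, and common java.lang types go direct. */
    static boolean isDirect(Object o) {
        return o == null || o instanceof Marker || o instanceof String
                || o instanceof Number || o instanceof Boolean
                || o instanceof Character;
    }

    /** Returns true if the object was written, false if it had to be dropped. */
    boolean write(Object o) throws IOException {
        if (isDirect(o)) {
            out.writeBoolean(true);      // flag: written directly
            out.writeObject(o);
            return true;
        }
        buffer.reset();                  // discard old bytes, keep the backing array
        byte[] bytes;
        try (ObjectOutputStream tmp = new ObjectOutputStream(buffer)) {
            tmp.writeObject(o);
            tmp.flush();
            bytes = buffer.toByteArray();
        } catch (IOException e) {        // includes NotSerializableException
            bytes = null;                // unsafe object: persist a null instead
        }
        out.writeBoolean(false);         // flag: written as a byte buffer
        out.writeObject(bytes);
        return bytes != null;
    }

    /** Reads one value back, using the flag to decide whether to unwrap a buffer. */
    static Object read(ObjectInput in) throws IOException, ClassNotFoundException {
        if (in.readBoolean()) {
            return in.readObject();      // direct: no extra buffering
        }
        byte[] bytes = (byte[]) in.readObject();
        if (bytes == null) {
            return null;                 // the writer dropped this object
        }
        try (ObjectInputStream tmp =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return tmp.readObject();
        }
    }
}
```

In this sketch a String takes the direct path, a java.util.ArrayList takes the
buffered path, and a plain java.lang.Object (not serializable) is replaced by a
null.  Note that toByteArray() still copies the bytes on each buffered write;
only the internal buffer is reused.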

Other Non-performance Related Changes
------------------------------------------------------------

  i) The externalize-related code is refactored so that it all lives in the new
org.apache.axis2.context.externalize package.
 
  ii) The ObjectStateUtils class is retained for legacy reasons.  I didn't want
to remove any APIs.  The implementation
      of ObjectStateUtils is changed to delegate to the new classes.

  iii) New tests are added.

  iv) I added classes DebugObjectOutputStream and DebugObjectInputStream.  
      These classes are installed when log.isDebugEnabled() is true.  
      The classes log all method calls to and from the underlying ObjectOutput 
and ObjectInput; thus they are helpful 
      in debugging errors.

  v) Andy Gatford has provided code that uses the context classloader when 
reading persisted data.

  vi) The high level logic used to write and read the objects is generally the 
same.  The implementation of the algorithms is changed/improved.
     In some cases, this required changes to the format of the persisted data.  
An example is that each object is preceded by
     a boolean that indicates whether the object was written directly or 
written into a byte buffer.  I increased the revision id because
     I changed the format.
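
Item v above refers to reading persisted data with the context classloader.
One common JDK pattern for that is to override ObjectInputStream.resolveClass,
sketched below.  This is not the contributed code; the class name
ContextClassLoaderInputSketch and the fallback to the default resolution are
assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

/** Illustrative only: hypothetical name, not the Axis2 class. */
final class ContextClassLoaderInputSketch extends ObjectInputStream {

    ContextClassLoaderInputSketch(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        if (tccl != null) {
            try {
                // Try the thread context classloader first.
                return Class.forName(desc.getName(), false, tccl);
            } catch (ClassNotFoundException e) {
                // Fall through to the default resolution, which also
                // handles primitive type names such as "int".
            }
        }
        return super.resolveClass(desc);
    }
}
```

The stream is used like any other ObjectInputStream; only the class lookup
differs, which matters when the persisted classes are visible to the
application's classloader but not to the one that loaded the persistence code.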



Kudos
---------
Many thanks to the following people, who contributed to this work, helped with
brainstorming, helped with testing, or provided performance profiles:
Ann Robinson, Andy Gatford, Dan Zhong, Doug Larson, and Richard Slade.

Next Steps
---------------
I am attaching the patch to this JIRA.  I will be committing the patch in the 
next day or two.  Please let me know if you have any questions or concerns.

Thanks
Rich Scheuerle




