[ 
https://issues.apache.org/jira/browse/AXIS2-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rich Scheuerle updated AXIS2-3210:
----------------------------------

    Attachment: patchJIRA.txt

Code Solution

ObjectInputStreamWithCL was contributed by Andrew Gatford (Sandesha).
The remaining changes were contributed by Rich Scheuerle.

> MessageContext Persistence Performance Improvement
> --------------------------------------------------
>
>                 Key: AXIS2-3210
>                 URL: https://issues.apache.org/jira/browse/AXIS2-3210
>             Project: Axis 2.0 (Axis2)
>          Issue Type: Improvement
>          Components: kernel
>            Reporter: Rich Scheuerle
>            Assignee: Rich Scheuerle
>         Attachments: patchJIRA.txt
>
>
> MessageContext Persistence Performance Improvement
> ---------------------------------------------------------------------------------
> Background: 
> -----------------
> When a MessageContext is persisted (for reliable messaging), the
> MessageContext object and its associated objects are written to an
> ObjectOutput.  When a MessageContext is hydrated, it is read from an
> ObjectInput.  The utility class, ObjectStateUtils, provides static utility
> functions that add safety mechanisms around writing and reading the data.
> Problem:
> --------------
> The IBM performance team has profiled this code.  They found that writing
> and reading these objects is time consuming.  Some of the performance
> penalties are due to the use of static methods (which hinders the reuse of
> byte buffers).  Other penalties are due to the way that we determine whether
> an object can be "safely written".
> This JIRA issue addresses a number of these concerns.
> Scope of Changes (Important):
> -------------------------------------------
> These changes only amend the existing writeExternal and readExternal support.
> There is no impact on any code that does not use these methods.  No
> additional logic APIs are added or changed.
> Specific Concerns and Solutions:
> ----------------------------------------------
>   A) The original logic writes each object into a temporary buffer.  If a
>      serialization error occurs, the algorithm safely accommodates the error.
>      The downside is that it is very expensive to write each object to a
>      temporary buffer.
>      Solution:
>      A new marker interface, SafeSerializable, is introduced.  If an object
>      (e.g. MessageContext) has this marker interface or is a java.lang
>      wrapper object (e.g. String), then the object is written directly to the
>      ObjectOutput.  Eliminating the extra buffer write increases throughput.
>      A similar change is made to the read algorithm.  The new algorithm
>      detects whether the object was written directly or as a byte buffer.
>      If it was written directly, no extra buffering is needed when reading.
>   B) If a buffer is needed to write or read an object, the ObjectStateUtils
>      class creates a new buffer.  This excessive allocation of buffers and
>      the subsequent garbage collection can hinder performance.
>      Solution:
>      The code is refactored to use two new classes: SafeObjectOutputStream
>      and SafeObjectInputStream.  These classes wrap the ObjectOutput and
>      ObjectInput objects and provide logic similar to that of
>      ObjectStateUtils.  The key difference is that these are not static
>      utility classes.  Therefore any buffers used during writing or reading
>      can be reused for the life of the *Stream object.  In one series of
>      tests, this reduced the number of buffers from 40 to 2 for persisting a
>      MessageContext.
>   C) When an outbound MessageContext is persisted, its associated inbound
>      MessageContext (if present) is also persisted.  The problem is that the
>      inbound MessageContext may have a large message.  Writing out this
>      message can impact performance and in some cases causes logic errors.
>
>      Solution:
>      Code that hydrates an outbound MessageContext should never need the
>      message (SOAP envelope) associated with the inbound MessageContext.
>      The solution is to not persist the inbound message.
>   D) In the current code, "marker" strings are persisted along with the
>      data.  These marker strings may contain a lengthy correlation id.  This
>      extra information can impact performance and file size.
>      Solution:
>      I reduced the number of "marker" strings.  The remaining marker strings
>      are changed to the "common name" of the object being persisted.  In most
>      cases, the log correlation id is no longer present in the marker string.
>      In addition, I made changes to create a log correlation id only "on
>      demand".  The log correlation code uses the (synchronized)
>      UUIDGenerator, so creating the log correlation id "on demand" limits
>      unnecessary locking.
>   E) Miscellaneous.  I spent time fine-tuning the algorithmic logic in
>      SafeObjectInputStream and SafeObjectOutputStream to eliminate extra
>      buffers (e.g. ByteArrayOutputStream optimizations).  These are all
>      localized changes.
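
To make items A and B concrete, here is a minimal sketch of the direct-versus-buffered write with a reusable scratch buffer. It is not the code in the attached patch. SafeWriteSketch, SafeWriter, ReusableBuffer and roundTrip are hypothetical names, the nested SafeSerializable is only a stand-in for the real marker interface, and the "-1 length means the object was skipped" convention is an assumption made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SafeWriteSketch {

    /** Hypothetical stand-in for the SafeSerializable marker interface. */
    interface SafeSerializable extends Serializable {}

    /** Scratch buffer that hands out its backing array, avoiding a toByteArray() copy. */
    static final class ReusableBuffer extends ByteArrayOutputStream {
        void copyTo(ObjectOutput out) throws IOException {
            out.writeInt(count);
            out.write(buf, 0, count);
        }
    }

    /** Non-static writer: one scratch buffer lives as long as the writer does. */
    static final class SafeWriter {
        private final ObjectOutput out;
        private final ReusableBuffer scratch = new ReusableBuffer();

        SafeWriter(ObjectOutput out) { this.out = out; }

        void writeObject(Object obj) throws IOException {
            if (obj == null || obj instanceof SafeSerializable || obj instanceof String
                    || obj instanceof Number || obj instanceof Boolean
                    || obj instanceof Character) {
                out.writeBoolean(true);      // flag: written directly
                out.writeObject(obj);
                return;
            }
            scratch.reset();                 // reuse the same backing array
            boolean ok;
            try (ObjectOutputStream tmp = new ObjectOutputStream(scratch)) {
                tmp.writeObject(obj);
                ok = true;
            } catch (IOException e) {
                ok = false;                  // e.g. NotSerializableException; 'out' is untouched
            }
            out.writeBoolean(false);         // flag: written as a byte buffer
            if (ok) {
                scratch.copyTo(out);
            } else {
                out.writeInt(-1);            // assumption: -1 means "object skipped"
            }
        }
    }

    /** Reads one value, buffering only when the flag says the writer buffered. */
    static Object readObject(ObjectInput in) throws IOException, ClassNotFoundException {
        if (in.readBoolean()) {
            return in.readObject();
        }
        int length = in.readInt();
        if (length < 0) {
            return null;                     // the writer skipped this object
        }
        byte[] data = new byte[length];
        in.readFully(data);
        try (ObjectInputStream tmp = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return tmp.readObject();
        }
    }

    /** Demo helper: writes the values with one SafeWriter, then reads them back. */
    public static List<Object> roundTrip(Object... values) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
                SafeWriter writer = new SafeWriter(oos);
                for (Object value : values) {
                    writer.writeObject(value);
                }
            }
            List<Object> result = new ArrayList<>();
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
                for (int i = 0; i < values.length; i++) {
                    result.add(readObject(ois));
                }
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(1, 2));
        // The String goes direct, the list is buffered, the plain Object is skipped.
        System.out.println(roundTrip("direct", list, new Object()));
    }
}
```

Because the flag is written only after the temporary write has succeeded or failed, a serialization error never leaves a half-written object in the real stream. The scratch buffer is reset rather than reallocated for each buffered object.
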
> Other Non-performance Related Changes
> ------------------------------------------------------------
>   i) The externalize-related code is refactored so that it all lives in the
>      new org.apache.axis2.context.externalize package.
>
>   ii) The ObjectStateUtils class is retained for legacy reasons.  I didn't
>       want to remove any APIs.  The implementation of ObjectStateUtils is
>       changed to delegate to the new classes.
>   iii) New tests are added.
>   iv) I added the classes DebugObjectOutputStream and DebugObjectInputStream.
>       These classes are installed when log.isDebugEnabled() is true.
>       The classes log all method calls to and from the underlying
>       ObjectOutput and ObjectInput; thus they are helpful in debugging
>       errors.
>   v) Andy Gatford has provided code that uses the context classloader when
>      reading persisted data.
>   vi) The high-level logic used to write and read the objects is generally
>       the same, but the implementation of the algorithms is changed/improved.
>       In some cases, this required changes to the format of the persisted
>       data.  For example, each object is now preceded by a boolean that
>       indicates whether the object was written directly or written into a
>       byte buffer.  I increased the revision id because I changed the format.
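
For item v, a common way to make deserialization use the thread context classloader is to override resolveClass in an ObjectInputStream subclass. The sketch below shows that general technique only. It is not the contributed ObjectInputStreamWithCL code, and the class name ContextClassLoaderObjectInputStream and the roundTrip helper are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;

public class ContextClassLoaderObjectInputStream extends ObjectInputStream {

    public ContextClassLoaderObjectInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        if (contextLoader != null) {
            try {
                // Try the context classloader first, without initializing the class.
                return Class.forName(desc.getName(), false, contextLoader);
            } catch (ClassNotFoundException e) {
                // Fall through: the default lookup also handles primitive type names.
            }
        }
        return super.resolveClass(desc);
    }

    /** Demo helper: serializes a value, then reads it back through this stream. */
    public static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
                oos.writeObject(value);
            }
            try (ObjectInputStream ois = new ContextClassLoaderObjectInputStream(
                     new ByteArrayInputStream(sink.toByteArray()))) {
                return ois.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new ArrayList<String>(Arrays.asList("a", "b"))));
    }
}
```

This matters when the persisted classes are visible to the application's classloader but not to the loader that defined the stream-reading code, which is typical in a container.
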
> Kudos
> ---------
> Many thanks to the following people who contributed to this work, helped
> with brainstorming, helped with testing, or provided performance profiles:
> Ann Robinson, Andy Gatford, Dan Zhong, Doug Larson, and Richard Slade.
> Next Steps
> ---------------
> I am attaching the patch to this JIRA.  I will be committing the patch in
> the next day or two.  Please let me know if you have any questions or
> concerns.
> Thanks
> Rich Scheuerle
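
The "on demand" log correlation id from item D can be pictured as plain lazy initialization. This is a sketch under assumptions, not the patched MessageContext code. The class LazyCorrelationId and its methods are hypothetical, and java.util.UUID stands in for the Axis2 UUIDGenerator.

```java
import java.util.UUID;

public class LazyCorrelationId {

    private final String commonName;
    private String logCorrelationId;   // stays null until someone asks for it

    public LazyCorrelationId(String commonName) {
        this.commonName = commonName;
    }

    /** Short marker string for persistence; needs no correlation id. */
    public String getMarker() {
        return commonName;
    }

    /** Creates the id on first use, so objects that never log never pay for it. */
    public synchronized String getLogCorrelationId() {
        if (logCorrelationId == null) {
            // Assumption: java.util.UUID replaces the shared, synchronized generator.
            logCorrelationId = commonName + "@" + UUID.randomUUID();
        }
        return logCorrelationId;
    }

    public synchronized boolean hasLogCorrelationId() {
        return logCorrelationId != null;
    }

    public static void main(String[] args) {
        LazyCorrelationId ctx = new LazyCorrelationId("MessageContext");
        System.out.println(ctx.getMarker() + " has id: " + ctx.hasLogCorrelationId());
        System.out.println(ctx.getLogCorrelationId());
    }
}
```

The lock here is per object, so it does not contend across contexts the way a single shared generator lock can.
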

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

