MessageContext Persistence Performance Improvement
--------------------------------------------------
Key: AXIS2-3210
URL: https://issues.apache.org/jira/browse/AXIS2-3210
Project: Axis 2.0 (Axis2)
Issue Type: Improvement
Components: kernel
Reporter: Rich Scheuerle
Assignee: Rich Scheuerle

Background:
-----------
When a MessageContext is persisted (for reliable messaging), the MessageContext object and its associated objects are written out to the ObjectOutput. When a MessageContext is hydrated, it is read from an ObjectInput. The utility class ObjectStateUtils provides static utility functions that add safety mechanisms for writing and reading the data.

Problem:
--------
The IBM performance team has profiled this code. They found that the writing and reading of these objects is time consuming. Some of the performance penalties are due to the use of static methods (which hinders the ability to reuse byte buffers). Other penalties are due to the way that we determine whether an object can be "safely written". This JIRA issue addresses a number of these concerns.

Scope of Changes (Important):
-----------------------------
These changes only amend the existing writeExternal and readExternal support. There is no impact on any code that does not use these methods. No additional logic or APIs are added or changed.

Specific Concerns and Solutions:
--------------------------------
A) The original logic writes each object into a temporary buffer first. If a serialization error occurs, the algorithm safely accommodates the error. The downside is that it is very expensive to write each object to a temporary buffer.

Solution: A new marker interface, SafeSerializable, is introduced. If an object (e.g. MessageContext) has this marker interface or is a lang wrapper object (e.g. String), then the object is written directly to the ObjectOutput. Eliminating the extra buffer write increases throughput.
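As an illustration of the direct-versus-buffered decision described above, here is a minimal Java sketch. It is not the code from the patch. The class SafeWriteSketch, the helpers (isDirect, writeSafely, readSafely, persist, restore, flagOf) and the Header type are hypothetical, and the nested SafeSerializable interface is only a stand-in for the real marker interface. The sketch assumes that a leading boolean distinguishes the two cases and that an object which cannot be serialized is simply skipped; the patch may handle that case differently. The matching read side is included so that the round trip can be checked.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SafeWriteSketch {

    /** Hypothetical stand-in for the marker interface named in this issue. */
    interface SafeSerializable extends Serializable {}

    /** Hypothetical type that opts in to direct writes. */
    static class Header implements SafeSerializable {
        final String name;
        Header(String name) { this.name = name; }
    }

    /** True when the value may skip the temporary buffer (assumed rule). */
    static boolean isDirect(Object o) {
        return o == null || o instanceof SafeSerializable || o instanceof String
                || o instanceof Number || o instanceof Boolean || o instanceof Character;
    }

    /** Writes a flag, then the object itself or its pre-serialized bytes. Returns false if skipped. */
    static boolean writeSafely(ObjectOutput out, Object o) throws IOException {
        if (isDirect(o)) {
            out.writeBoolean(true);      // flag: written directly
            out.writeObject(o);
            return true;
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream tmp = new ObjectOutputStream(buffer)) {
            tmp.writeObject(o);          // trial write into a temporary buffer
        } catch (NotSerializableException e) {
            return false;                // nothing has reached 'out' yet
        }
        out.writeBoolean(false);         // flag: written as a byte buffer
        out.writeObject(buffer.toByteArray());
        return true;
    }

    /** Reads the flag and then either the object itself or the buffered bytes. */
    static Object readSafely(ObjectInput in) throws IOException, ClassNotFoundException {
        if (in.readBoolean()) {
            return in.readObject();      // direct: no extra buffering needed
        }
        byte[] bytes = (byte[]) in.readObject();
        try (ObjectInputStream tmp = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return tmp.readObject();
        }
    }

    /** Demo helper: returns the persisted bytes, or null if the object was skipped. */
    static byte[] persist(Object o) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            boolean written;
            try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
                written = writeSafely(out, o);
            }
            return written ? sink.toByteArray() : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: reads back one object written by persist. */
    static Object restore(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return readSafely(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Demo helper: returns the leading flag (true means "written directly"). */
    static boolean flagOf(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println("String direct?  " + flagOf(persist("text")));
        System.out.println("Header direct?  " + flagOf(persist(new Header("h"))));
        System.out.println("List direct?    " + flagOf(persist(list)));
        System.out.println("List restored:  " + restore(persist(list)));
        System.out.println("Object skipped? " + (persist(new Object()) == null));
    }
}
```

In this sketch only objects outside the "safe" set pay for the temporary buffer, which is the cost the marker interface is meant to avoid.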
A similar change is made to the read algorithm. The new algorithm detects whether the object was written directly or whether it was written as a byte buffer. If it was written directly, no extra buffering is needed when reading.

B) If a buffer is needed to write or read an object, the ObjectStateUtils class creates a new buffer. This excessive allocation of buffers and the subsequent garbage collection can hinder performance.

Solution: The code is refactored to use two new classes: SafeObjectOutputStream and SafeObjectInputStream. These classes wrap the ObjectOutput and ObjectInput objects and provide logic similar to that of ObjectStateUtils. The key difference is that these are not static utility classes. Therefore any buffers used during writing or reading are reused for the life of the *Stream object. In one series of tests, this reduced the number of buffers from 40 to 2 for persisting a MessageContext.

C) When an outbound MessageContext is persisted, its associated inbound MessageContext (if present) is also persisted. The problem is that the inbound MessageContext may have a large message. Writing out this message can impact performance and in some cases causes logic errors.

Solution: Any code that hydrates an outbound MessageContext should never need the message (SOAP envelope) associated with the inbound MessageContext. The solution is to not persist the inbound message.

D) In the current code, "marker" strings are persisted along with the data. These marker strings may contain a lengthy correlation id. This extra information can impact performance and file size.

Solution: I reduced the number of "marker" strings. The remaining marker strings are changed to the "common name" of the object being persisted. In most cases, the log correlation id is no longer present in the marker string. In addition, I made changes to only create a log correlation id "on demand". The log correlation code uses the (synchronized) UUIDGenerator.
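The "on demand" creation of the log correlation id can be pictured as plain lazy initialization. The following Java sketch is only an illustration under stated assumptions: LazyLogIdSketch, Context, nextId and generatorCalls are hypothetical names, and java.util.UUID stands in for the synchronized UUIDGenerator mentioned above. The getter is deliberately simple and is not itself thread-safe.

```java
import java.util.UUID;

class LazyLogIdSketch {

    /** Counts generator calls so the effect of laziness is observable. */
    static int generatorCalls = 0;

    /** Stand-in for a synchronized id generator (java.util.UUID is used only for illustration). */
    static synchronized String nextId() {
        generatorCalls++;
        return UUID.randomUUID().toString();
    }

    /** Hypothetical context object that needs an id only when it actually logs. */
    static class Context {
        private String logCorrelationId; // stays null until first requested

        String getLogCorrelationId() {
            if (logCorrelationId == null) {
                logCorrelationId = nextId(); // the shared lock is taken only here
            }
            return logCorrelationId;
        }
    }

    public static void main(String[] args) {
        Context quiet = new Context();  // never asks for an id, never calls the generator
        Context chatty = new Context();
        String first = chatty.getLogCorrelationId();
        String second = chatty.getLogCorrelationId();
        System.out.println("same id on repeat call: " + first.equals(second));
        System.out.println("generator calls: " + generatorCalls);
    }
}
```

A context that never asks for its id never enters the synchronized generator in this sketch.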
Creating the log correlation id "on demand" limits unnecessary locking.

E) Miscellaneous. I spent time fine-tuning the algorithmic logic in SafeObjectInputStream and SafeObjectOutputStream to eliminate extra buffers (e.g. ByteArrayOutputStream optimizations). These are all localized changes.

Other Non-performance Related Changes
-------------------------------------
i) The externalize-related code is refactored so that it all lives in the new org.apache.axis2.context.externalize package.

ii) The ObjectStateUtils class is retained for legacy reasons. I didn't want to remove any APIs. The implementation of ObjectStateUtils is changed to delegate to the new classes.

iii) New tests are added.

iv) I added the classes DebugObjectOutputStream and DebugObjectInputStream. These classes are installed when log.isDebugEnabled() is true. The classes log all method calls to and from the underlying ObjectOutput and ObjectInput; thus they are helpful in debugging errors.

v) Andy Gatford has provided code that uses the context classloader when reading persisted data.

vi) The high-level logic used to write and read the objects is generally the same. The implementation of the algorithms is changed and improved. In some cases, this required changes to the format of the persisted data. For example, each object is now preceded by a boolean that indicates whether the object was written directly or written into a byte buffer. I increased the revision id because I changed the format.

Kudos
-----
Many thanks to the following people who contributed to this work, helped with brainstorming, helped with testing, or provided performance profiles: Ann Robinson, Andy Gatford, Dan Zhong, Doug Larson, and Richard Slade.

Next Steps
----------
I am attaching the patch to this JIRA. I will be committing the patch in the next day or two. Please let me know if you have any questions or concerns.

Thanks
Rich Scheuerle

--
This message is automatically generated by JIRA.