The Rationale
=============

(1) Harmonize HTTP message processing API

The HTTP message processing API in HttpClient 3.0 is inconsistent. Methods that share similar function have inconsistent names and different argument signatures. There are several static utility classes intended to process HTTP primitives sitting in the util package.

(2) Avoid the use of objects with synchronized access

StringBuffer and ByteArrayOutputStream, used internally by HttpClient 3.0, synchronize on each object mutation, which causes significant performance degradation.

(3) Eliminate excessive garbage creation when processing HTTP messages

Refactoring
===========

(1) HttpCore _should_ now have a consistent API for HTTP message processing. Parseable HTTP primitives now come with #parse methods to convert a char array to an HTTP primitive and #parseAll methods to convert a char array to a sequence of HTTP primitives. In those cases where Object#toString() may not always do the job, HTTP primitives come with #format and #formatAll methods. All HTTP primitives can work with either String or CharArrayBuffer. The code from the static utility classes has been merged into the logically related HTTP primitive classes.

Consistency of an API means different things to different people, though, so please review and complain loudly if you disagree.

(2) StringBuffer and ByteArrayOutputStream have been replaced with the unsynchronized ByteArrayBuffer and CharArrayBuffer classes.

(3) HttpClient 3.0 produces up to roughly 50 intermediate objects per average HTTP request and 1-2 per content chunk (if chunk-encoded), primarily due to the overuse of the String#trim() and String#substring() methods. This is a lot of garbage.

The refactored code, by contrast, generates _virtually_ zero garbage (one intermediate object that I can think of) when parsing an HTTP header and zero garbage when parsing content chunks. Moreover, HTTP headers are tokenized only when needed, so unused headers never get parsed and converted to high-level objects, further reducing the amount of garbage produced while processing a request.

Reduced garbage comes at the price of somewhat uglier code, though.

* The Header class can be initialized with an instance of CharArrayBuffer, which is copied by reference, not by value.
Theoretically one can still mutate the original CharArrayBuffer instance, thus possibly rendering the Header instance corrupt. This problem could be solved by making Header an interface with two implementations: an immutable public class and a package-private class instantiated by passing a reference to a CharArrayBuffer. I just thought that would be overkill, though. Please let me know if you disagree.

* NumUtils#parseUnsignedInt is used instead of the standard Integer#parseInt to parse integer values in HTTP messages, such as the protocol version and chunk size. The parseUnsignedInt method produces no intermediate garbage, whereas the use of Integer#parseInt usually entails the creation of an intermediate String object. I understand this is a questionable design decision and will not object strongly should the majority decide that this change be reverted.

I am planning to do some benchmarking to see if the 'near-zero-garbage' HTTP message processing results in any tangible performance gains. So far it appears to have produced a negligible ~1-2% performance increase when running on JRE 1.5.

Oleg
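
To make the NumUtils#parseUnsignedInt point above more concrete, here is a minimal sketch of how an integer might be parsed directly out of a char array without creating an intermediate String. This is not the actual HttpCore source: the class name UnsignedIntSketch, the method signature and the radix parameter are all assumptions made for illustration (the radix is there because HTTP chunk sizes are hexadecimal).

```java
/**
 * Hypothetical sketch only; not the real NumUtils class.
 * The class name, signature and radix parameter are assumptions.
 */
final class UnsignedIntSketch {

    private UnsignedIntSketch() {
    }

    /**
     * Parses the digits in buf[from, to) as a non-negative int in the
     * given radix. Nothing is allocated on the success path; an
     * exception (and its message) is only created for bad input.
     */
    public static int parseUnsignedInt(char[] buf, int from, int to, int radix) {
        if (buf == null || from < 0 || to > buf.length || from >= to) {
            throw new NumberFormatException("Empty or invalid range");
        }
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new NumberFormatException("Unsupported radix");
        }
        int result = 0;
        for (int i = from; i < to; i++) {
            // Character.digit returns -1 if the char is not a digit in this radix.
            int digit = Character.digit(buf[i], radix);
            if (digit < 0) {
                throw new NumberFormatException("Invalid digit at index " + i);
            }
            // Reject values that would overflow a 32-bit signed int.
            if (result > (Integer.MAX_VALUE - digit) / radix) {
                throw new NumberFormatException("Value out of range");
            }
            result = result * radix + digit;
        }
        return result;
    }

    public static void main(String[] args) {
        // The status code sits at indices 9..11; no substring is created.
        char[] statusLine = "HTTP/1.1 200 OK".toCharArray();
        System.out.println(parseUnsignedInt(statusLine, 9, 12, 10));

        // A hexadecimal chunk size read from a chunk header.
        char[] chunkHeader = "1a2f\r\n".toCharArray();
        System.out.println(parseUnsignedInt(chunkHeader, 0, 4, 16));
    }
}
```

The caller passes offsets into the buffer it already holds, so the String#substring() and String#trim() calls that an Integer#parseInt based approach would typically need are avoided.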