[ 
https://issues.apache.org/jira/browse/HTTPCORE-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004265#comment-17004265
 ] 

Scott W Gifford commented on HTTPCORE-615:
------------------------------------------

[~olegk] The complex exception handler code is just a general re-implementation 
of try-with-resources (available since Java 7) for Java 6; I originally wrote 
this in Java 8 and backported to Java 6 since that's the version declared in 
{{pom.xml}}.  Basically all of the complexity is in making sure the original 
exception is re-thrown if another exception is thrown in the finally block, as 
try-with-resources does.  If that complexity seems like overkill to you guys I 
can remove it and use a simpler approach that is more standard in Java 6; on 
the other hand if it seems useful I can move this to its own source file and 
look for other places in the library where it might be useful.  Let me know 
what you think.
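
To make the exception-handling point concrete, here is a minimal sketch of the general idea, not the code from the pull request.  The names {{Java6Resources}}, {{IoAction}} and {{withResource}} are hypothetical, and the sketch assumes the resource is a {{java.io.Closeable}}.  It avoids Java 7+ language features, so the secondary exception from {{close()}} is dropped rather than attached as a suppressed exception.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper for illustration only; not taken from the pull request.
class Java6Resources {

    /** Stand-in for the body of a try block (hypothetical name). */
    public interface IoAction<T extends Closeable, R> {
        R apply(T resource) throws IOException;
    }

    /**
     * Runs the action, then closes the resource. If the action throws, that
     * exception is the one the caller sees, even if close() also fails.
     */
    public static <T extends Closeable, R> R withResource(T resource, IoAction<T, R> action)
            throws IOException {
        boolean succeeded = false;
        try {
            R result = action.apply(resource);
            succeeded = true;
            return result;
        } finally {
            if (succeeded) {
                // The body completed normally, so a close() failure is the
                // only failure and is allowed to propagate.
                resource.close();
            } else {
                // The body threw: keep that exception as the primary one.
                try {
                    resource.close();
                } catch (Exception secondary) {
                    // Java 6 has no Throwable.addSuppressed (added in Java 7),
                    // so the secondary failure is dropped in this sketch.
                }
            }
        }
    }
}
```

A caller would wrap the stream in {{withResource}} and do its reads or writes inside the action; the trade-off compared to real try-with-resources is that the close failure is lost instead of being recorded on the primary exception.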

 

[~michael-o] The purpose of the cache is to allow an HTTP client to be backed 
by a simple HTTP cache implementing the W3C standards for caching.  Our use for 
this is in an HTTP reverse proxy, to allow proxied objects to be cached for a 
short time in the event of an outage.  This is not new code, and the existing 
implementation is in {{httpclient-cache}}.

The purpose of the cache serializer is to allow {{HttpCacheEntry}} objects to 
be serialized into a sequence of bytes that can be stored somewhere out-of-JVM. 
 memcached is currently the only cache storage implementation that the Apache 
HTTP client supports, and just stores byte sequences under simple keys, similar 
to how files in a directory would be stored.  The interface it must implement 
is {{MemcachedCacheEntry}}, and currently the only implementation is 
{{MemcachedCacheEntryImpl}}.

{{MemcachedCacheEntryImpl}} uses Java object serialization to serialize and 
deserialize, which is widely viewed as problematic.  It is brittle: programs 
break easily unless they are extremely careful about modification and 
versioning (which is what happened in -HTTPCORE-578- and also happened to us), 
and it has well-known security problems.  Further, my understanding is the 
Java project intends to eventually make Java object serialization an optional 
feature, and likely some sites will want to leave it out.  Please let me know 
if you are unfamiliar with this view or do not share it and I will provide 
some references.  

My contributed serializer instead serializes and deserializes in a more robust 
way that does not have the problems of Java object serialization.  It takes 
advantage of the existing ability of the client to write a well-formed HTTP 
response to a stream, and has it write to a byte stream instead, then uses 
those bytes as the serialized representation; similarly it takes advantage of 
the existing ability to read a well-formed HTTP response from a stream and 
deserializes the stored stream of bytes to get its {{MemcachedCacheEntry}} 
object.  Additionally it encodes a few bits of metadata in HTTP 
pseudo-headers, whose names start with {{hc-}}.  You can see an example of the 
format in the pull request, for example 
[simpleObject.serialized|https://github.com/apache/httpcomponents-client/pull/187/commits/1253b17b7756aa55a94b530adf94925d0bb2573f#diff-e7026c663719c09b40c1ccbc7b46a12f].
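
As a rough illustration of the format idea only, the sketch below writes an entry as an HTTP-style response (status line, headers, blank line, body) and reads it back.  It is not the contributed serializer: the real one reuses the client's existing response writer and parser, while this hand-rolls both to stay self-contained.  The class {{HttpStyleEntryCodec}}, its {{Entry}} type and the pseudo-header name {{hc-request-date}} are assumptions made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the format idea; not the contributed code.
class HttpStyleEntryCodec {

    /** Simplified stand-in for a cache entry (hypothetical type). */
    public static final class Entry {
        public final String statusLine;
        public final Map<String, String> headers;
        public final long requestDateMillis;
        public final byte[] body;

        public Entry(String statusLine, Map<String, String> headers,
                     long requestDateMillis, byte[] body) {
            this.statusLine = statusLine;
            this.headers = headers;
            this.requestDateMillis = requestDateMillis;
            this.body = body;
        }
    }

    // Assumed pseudo-header name; the real names are defined in the pull request.
    private static final String REQUEST_DATE = "hc-request-date";

    /** Writes status line, metadata pseudo-header, headers, blank line, body. */
    public static byte[] serialize(Entry e) {
        StringBuilder head = new StringBuilder();
        head.append(e.statusLine).append("\r\n");
        head.append(REQUEST_DATE).append(": ").append(e.requestDateMillis).append("\r\n");
        for (Map.Entry<String, String> h : e.headers.entrySet()) {
            head.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        head.append("\r\n");
        byte[] headBytes = head.toString().getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headBytes, 0, headBytes.length);
        out.write(e.body, 0, e.body.length);
        return out.toByteArray();
    }

    /** Parses the bytes produced by serialize() back into an Entry. */
    public static Entry deserialize(byte[] bytes) {
        int headEnd = indexOfBlankLine(bytes);
        String head = new String(bytes, 0, headEnd, StandardCharsets.ISO_8859_1);
        String[] lines = head.split("\r\n");
        Map<String, String> headers = new LinkedHashMap<String, String>();
        long requestDate = 0L;
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Malformed header: " + lines[i]);
            }
            String name = lines[i].substring(0, colon).trim();
            String value = lines[i].substring(colon + 1).trim();
            if (REQUEST_DATE.equals(name)) {
                requestDate = Long.parseLong(value);  // metadata, not a real header
            } else {
                headers.put(name, value);
            }
        }
        byte[] body = Arrays.copyOfRange(bytes, headEnd + 4, bytes.length);
        return new Entry(lines[0], headers, requestDate, body);
    }

    // Returns the offset of the first CRLF CRLF, which ends the header section.
    private static int indexOfBlankLine(byte[] b) {
        for (int i = 0; i + 3 < b.length; i++) {
            if (b[i] == '\r' && b[i + 1] == '\n' && b[i + 2] == '\r' && b[i + 3] == '\n') {
                return i;
            }
        }
        throw new IllegalArgumentException("No header terminator found");
    }
}
```

The point of the design is that the stored bytes are readable text plus a body and do not depend on the Java class layout, so changing fields on the entry class does not invalidate previously stored entries the way a serialization version mismatch does.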

The core contribution is about 450 lines, and the rest is tests and benchmarks, 
including sample files used in the tests.  I am happy to remove some of the 
tests and benchmarks if you think they are overkill, but I thought they were 
useful in the initial pull request to demonstrate that the contribution is 
well-tested and performant.  Please let me know if that's a concern and I'll 
remove some of the ones that seem lower-value to me.

So the core contribution here is a more robust way to serialize/deserialize in 
the existing HTTP caching module, and the problem it solves is not very 
specialized at all; basically it means that problems like -HTTPCORE-578- are 
unlikely to happen in the future.  I do not see this being something with 
frequent changes, and once this has proven out the existing Java object-based 
serializer could be deprecated IMO.

I will edit the issue description to clarify what this does instead of what it 
does not do, and take care of the comments you had on the pull request.

Thanks for your response guys, look forward to getting this sorted out and 
contributed!

> Implement new cache serializer that is not based on Java Object Serialization
> -----------------------------------------------------------------------------
>
>                 Key: HTTPCORE-615
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-615
>             Project: HttpComponents HttpCore
>          Issue Type: New Feature
>            Reporter: Scott W Gifford
>            Priority: Major
>
> HTTPCORE-578 was caused by the brittleness of using Java Object Serialization 
> to store cache objects.  Java Object Serialization requires careful 
> understanding of what sorts of changes require a new serialization version, 
> with small mistakes leading to surprising results; further Java Object 
> Serialization has security issues, and will be an optional feature in 
> upcoming Java releases (with Jigsaw).  It would be better to have a more 
> stable serialization approach.
> Since the Apache client already knows how to communicate with HTTP, one 
> simple approach would be to serialize as if we were writing to an HTTP 
> client, and deserialize as if we were reading from an HTTP server.
> I have developed a serializer that does that, and would like to contribute it 
> back to the Apache project.


