Serialization matters in ways that are well documented. We've
discussed a number of serialization considerations for
JpaTicketRegistry, such as LOB size and deserialization errors on
upgrades. These deployment and runtime considerations are well known
to those of us who deploy JPA. By contrast, the performance
characteristics of serialization are not well known. In researching
memcached as a new ticket backend, I have recorded throughput and
storage data sufficient to claim that default Java serialization is
relatively slow and dramatically wasteful in terms of bytes per
serialized object.

The experiment compared a MemcachedTicketRegistry using the default
SerializingTranscoder [1] against a custom Transcoder implementation
based on the Kryo framework [2], which allows strict control over the
serialized form of object graphs. We developed a JMeter test plan [3]
to exercise both implementations and obtain a throughput comparison in
terms of requests per second. The following results show a throughput
improvement of roughly 25% with the custom serializer for every sample
except GET LT, whose throughput was unchanged:

SerializingTranscoder:

Sample             Avg Latency (ms)   Std Dev   Tput (req/s)
Get ST for SAML                1433       537             79
GET LT                         6798      3483             23
Get ST for CASv2               1431       525             79
CASv2 Validate                 1464       492             80
POST Credentials              11500      2952             16
SAML Validate                  1436       500             80

KryoTranscoder:

Sample             Avg Latency (ms)   Std Dev   Tput (req/s)
Get ST for SAML                1139       511             99
GET LT                         9024      2767             23
Get ST for CASv2               1141       512            101
CASv2 Validate                 1164       493            102
POST Credentials              11348      2272             20
SAML Validate                  1136       497            101
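
For anyone who wants to check the per-sample improvement, here is a
minimal sketch that computes the relative throughput change from the
two tables. The class and method names are mine, chosen only for
illustration; the numbers are copied from the results above.

```java
class ThroughputComparison {
    /** Relative change from baseline to candidate, as a percentage. */
    public static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Throughput (req/s) per sample, copied from the two result tables.
        String[] samples = {
            "Get ST for SAML", "GET LT", "Get ST for CASv2",
            "CASv2 Validate", "POST Credentials", "SAML Validate"
        };
        int[] serializingTput = {79, 23, 79, 80, 16, 80};
        int[] kryoTput = {99, 23, 101, 102, 20, 101};

        for (int i = 0; i < samples.length; i++) {
            double change = percentChange(serializingTput[i], kryoTput[i]);
            System.out.printf("%-18s %+6.1f%%%n", samples[i], change);
        }
    }
}
```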

The more relevant result for the memcached use case is a comparison of
storage size for serialized object graphs. We compared the size in
bytes of a serialized TicketGrantingTicket representing an average
case containing 1 proxy ticket, 3 service tickets, and a handful of
principal attributes. A comparison of the serialized objects [4]
indicates that Java serialization has a storage footprint more than
twice that of Kryo.
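
To give a feel for where the extra bytes come from, here is a minimal
sketch that measures the default Java serialized size of a small
object and compares it with a hand-written field-only encoding. This
is not the measurement behind [4] and it does not use Kryo:
SampleTicket is a hypothetical stand-in rather than a CAS class, and
the DataOutputStream encoding is only an assumption about what a
compact form might look like.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializedSizeDemo {
    /** Size in bytes of the object under default Java serialization. */
    public static int javaSerializedSize(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.size();
    }

    /** Size in bytes when only the field values are written, with no class metadata. */
    public static int compactSize(SampleTicket ticket) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(ticket.id);            // 2-byte length prefix + encoded characters
            out.writeLong(ticket.creationTime); // 8 bytes
            out.writeInt(ticket.useCount);      // 4 bytes
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        SampleTicket ticket =
            new SampleTicket("ST-1-abcdefghijklmnopqrstuvwxyz-cas", 1331300000000L, 1);
        System.out.println("Java serialization: " + javaSerializedSize(ticket) + " bytes");
        System.out.println("Field-only form:    " + compactSize(ticket) + " bytes");
    }
}

/** Hypothetical stand-in for a ticket; not part of CAS. */
class SampleTicket implements Serializable {
    private static final long serialVersionUID = 1L;
    final String id;
    final long creationTime;
    final int useCount;

    SampleTicket(String id, long creationTime, int useCount) {
        this.id = id;
        this.creationTime = creationTime;
        this.useCount = useCount;
    }
}
```

The default form carries the stream header, the class name, and a
descriptor for every field in addition to the values, which is why it
comes out larger even for a trivial object.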

Note that the SerializingTranscoder supports Gzip compression, which
can help reduce the storage footprint, but compression comes at the
cost of throughput.
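
The trade-off can be explored outside of memcached with a minimal
sketch like the one below. It uses java.util.zip directly rather than
the SerializingTranscoder, the attribute strings are made up, and a
single timed call is only a crude indication of cost, so treat the
numbers it prints as illustrative rather than as a benchmark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipFootprintDemo {
    /** Serializes an object with default Java serialization. */
    public static byte[] serialize(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    /** Compresses a byte array with Gzip. */
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(raw);
        }
        return bytes.toByteArray();
    }

    /** Reverses gzip(), returning the original bytes. */
    public static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                bytes.write(buffer, 0, n);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Made-up principal attributes; repetitive text compresses well.
        ArrayList<String> attributes = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            attributes.add("memberOf=cn=group-" + i + ",ou=groups,dc=example,dc=org");
        }
        byte[] raw = serialize(attributes);

        long start = System.nanoTime();
        byte[] compressed = gzip(raw);
        long elapsedMicros = (System.nanoTime() - start) / 1000;

        System.out.println("Uncompressed: " + raw.length + " bytes");
        System.out.println("Gzip:         " + compressed.length + " bytes");
        System.out.println("Gzip time:    " + elapsedMicros + " us (single call)");
    }
}
```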

These results, together with the known issues, give me confidence to
vote for a substantial refactoring of ticket backends in CAS 4.0 to
support three different ticket classes:

1. Simple (in-memory) without regard to serialized form
2. Serializable with explicit control over serialized form (e.g. via
   java.io.Externalizable)
3. JPA with no serialization
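
As an illustration of the second class of ticket, here is a minimal
sketch of explicit control via java.io.Externalizable. CompactTicket,
its fields, and the leading format-version byte are all hypothetical;
they are assumptions for the example, not a proposed CAS design.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    public static byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    public static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CompactTicket original = new CompactTicket("ST-1-example", 1331300000000L, 1);
        byte[] data = toBytes(original);
        CompactTicket copy = (CompactTicket) fromBytes(data);
        System.out.println(copy.getId() + " round-tripped in " + data.length + " bytes");
    }
}

/** Hypothetical ticket that decides exactly which bytes are written. */
class CompactTicket implements Externalizable {
    private static final byte FORMAT_VERSION = 1;

    private String id;
    private long creationTime;
    private int useCount;

    /** Externalizable requires a public no-arg constructor. */
    public CompactTicket() {}

    public CompactTicket(String id, long creationTime, int useCount) {
        this.id = id;
        this.creationTime = creationTime;
        this.useCount = useCount;
    }

    public String getId() { return id; }
    public long getCreationTime() { return creationTime; }
    public int getUseCount() { return useCount; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // A version byte lets a later release recognize the old layout.
        out.writeByte(FORMAT_VERSION);
        out.writeUTF(id);
        out.writeLong(creationTime);
        out.writeInt(useCount);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        byte version = in.readByte();
        if (version != FORMAT_VERSION) {
            throw new IOException("Unsupported ticket format version " + version);
        }
        id = in.readUTF();
        creationTime = in.readLong();
        useCount = in.readInt();
    }
}
```

The stream still records the class name and serialVersionUID, but the
field layout is entirely up to the class, which is the kind of control
that could help with both footprint and upgrade compatibility.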

I believe Scott's design using factories is a viable implementation
that can achieve the above goal.

For the short term, I would like to offer my work on
MemcachedTicketRegistry [5] for consideration for inclusion in the
3.5 release. While inclusion in the next release would make it easier
for us to use in the Spring/Summer timeframe, I'm quite certain it has
general value for memcached deployers. In any case, we plan to use
this code for our next production release at Virginia Tech.

M

[1] http://dustin.github.com/java-memcached-client/apidocs/net/spy/memcached/transcoders/SerializingTranscoder.html
[2] http://code.google.com/p/kryo/
[3] https://wiki.jasig.org/download/attachments/26510787/cas-stress-registry-dev.jmx?version=1&modificationDate=1329934450042
[4] https://gist.github.com/2007751
[5] https://github.com/serac/cas/tree/memcached-ng