Thanks Janne.
So I just tried a test with your SimplePrincipalSerializer's logic as
the basis for SimplePrincipalCollection's writeObject/readObject
implementations.
Here are the stats before this change was made (your original findings):
--------------------------
Single principal, single realm
Default serializer   Simple serializer   Size saving
423                  100                 76.36%

Multiple principals, single realm
Default serializer   Simple serializer   Size saving
577                  254                 55.98%

Multiple principals, multiple realms
Default serializer   Simple serializer   Size saving
817                  368                 54.96%
-------------------------------
Here are the stats after moving SimplePrincipalSerializer's logic into
SimplePrincipalCollection:
-------------------------------
Single principal, single realm
Default serializer   Simple serializer   Size saving
434                  100                 76.96%

Multiple principals, single realm
Default serializer   Simple serializer   Size saving
623                  254                 59.23%

Multiple principals, multiple realms
Default serializer   Simple serializer   Size saving
977                  368                 62.33%
--------------------------------
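For reference, a minimal sketch of how figures like these can be computed, assuming the numbers are byte counts of the serialized output and that "size saving" means 1 - simple/default. The class name SerializedSizeDemo and the sample map are hypothetical and are not taken from Janne's patch or its unit tests.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Hypothetical helper, not part of Shiro or of the SHIRO-226 patch.
public class SerializedSizeDemo {

    // Bytes produced when 'value' is written with default JDK serialization.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Percentage by which 'compactBytes' is smaller than 'baselineBytes'.
    static double sizeSavingPercent(int baselineBytes, int compactBytes) {
        return 100.0 * (baselineBytes - compactBytes) / baselineBytes;
    }

    public static void main(String[] args) {
        // Assumed stand-in for a realm name -> principal mapping.
        HashMap<String, String> sample = new HashMap<>();
        sample.put("realmA", "jdoe");
        System.out.println("Default-serialized HashMap size: " + serializedSize(sample));
        System.out.printf("Saving for 423 vs 100: %.2f%%%n", sizeSavingPercent(423, 100));
    }
}
```

Plugging in the first "before" row (423 and 100) reproduces the 76.36% shown there, which suggests the saving column is computed this way.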
Oddly enough, moving the relevant SimplePrincipalSerializer logic into
SimplePrincipalCollection makes the output of Java's default
serialization mechanism _larger_ for the sample data set! That means
that the HashMap serialization implementation *when using default JDK
object serialization* is more compact than manually trying to serialize
the map ourselves.
I hate Java serialization voodoo!
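For readers unfamiliar with the mechanism, here is a minimal sketch of hand-written writeObject/readObject for a realm-to-principals map. CompactPrincipals is a hypothetical stand-in: it is not Shiro's SimplePrincipalCollection and not the code from the patch, and the wire layout (realm count, then each realm name followed by its principals) is an assumption chosen for illustration. It makes no claim about the resulting size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical class for illustration only.
public class CompactPrincipals implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: skipped by default serialization and written by hand below.
    private transient Map<String, Set<Object>> realmPrincipals = new LinkedHashMap<>();

    public void add(Object principal, String realmName) {
        realmPrincipals.computeIfAbsent(realmName, k -> new LinkedHashSet<>()).add(principal);
    }

    public Set<Object> fromRealm(String realmName) {
        Set<Object> found = realmPrincipals.get(realmName);
        return found == null ? Collections.<Object>emptySet() : Collections.unmodifiableSet(found);
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(realmPrincipals.size());
        for (Map.Entry<String, Set<Object>> entry : realmPrincipals.entrySet()) {
            out.writeUTF(entry.getKey());          // realm name as a plain UTF string
            out.writeInt(entry.getValue().size());
            for (Object principal : entry.getValue()) {
                out.writeObject(principal);        // principals must be Serializable
            }
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Field initializers do not run on deserialization, so rebuild the map here.
        realmPrincipals = new LinkedHashMap<>();
        int realmCount = in.readInt();
        for (int i = 0; i < realmCount; i++) {
            String realmName = in.readUTF();
            int principalCount = in.readInt();
            Set<Object> principals = new LinkedHashSet<>();
            for (int j = 0; j < principalCount; j++) {
                principals.add(in.readObject());
            }
            realmPrincipals.put(realmName, principals);
        }
    }

    // Serializes and deserializes in memory; handy for a quick sanity check.
    static CompactPrincipals roundTrip(CompactPrincipals original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CompactPrincipals) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Even with the map written by hand, ObjectOutputStream still emits its stream header and a class descriptor for the enclosing class, so this approach cannot get as small as a standalone serializer that writes raw bytes.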
Time to see what happens when we implement Externalizable :)
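A rough sketch of what an Externalizable variant could look like, under the same assumptions as any hand-rolled format: ExternalizablePrincipals is a hypothetical name, the layout is illustrative, and no size result is implied since that test has not been run here. Externalizable requires a public class with a public no-arg constructor, and the class takes full control of what is written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical class for illustration only; not Shiro code.
public class ExternalizablePrincipals implements Externalizable {
    private static final long serialVersionUID = 1L;

    private final Map<String, Set<Object>> realmPrincipals = new LinkedHashMap<>();

    // Required: deserialization calls this constructor before readExternal.
    public ExternalizablePrincipals() {
    }

    public void add(Object principal, String realmName) {
        realmPrincipals.computeIfAbsent(realmName, k -> new LinkedHashSet<>()).add(principal);
    }

    public Set<Object> fromRealm(String realmName) {
        Set<Object> found = realmPrincipals.get(realmName);
        return found == null ? Collections.<Object>emptySet() : Collections.unmodifiableSet(found);
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(realmPrincipals.size());
        for (Map.Entry<String, Set<Object>> entry : realmPrincipals.entrySet()) {
            out.writeUTF(entry.getKey());
            out.writeInt(entry.getValue().size());
            for (Object principal : entry.getValue()) {
                out.writeObject(principal);
            }
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        realmPrincipals.clear();
        int realmCount = in.readInt();
        for (int i = 0; i < realmCount; i++) {
            String realmName = in.readUTF();
            int principalCount = in.readInt();
            Set<Object> principals = new LinkedHashSet<>();
            for (int j = 0; j < principalCount; j++) {
                principals.add(in.readObject());
            }
            realmPrincipals.put(realmName, principals);
        }
    }

    // Serializes and deserializes in memory; returns the copy.
    static ExternalizablePrincipals roundTrip(ExternalizablePrincipals original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ExternalizablePrincipals) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```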
I also think it'd be an interesting exercise to create a Serializer
implementation based on Google's Protocol Buffers project (see:
http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking)
Les
On Wed, Dec 15, 2010 at 4:35 AM, Janne Jalkanen
<[email protected]> wrote:
>
> SHIRO-226. Contains proposal patch against current trunk and corresponding
> unit tests.
>
> /janne
>
> On Dec 14, 2010, at 02:15 , Les Hazlewood wrote:
>
>> I don't see how having multiple cookies solves the data size problem.
>> In fact, I believe it makes the problem worse: instead of one cookie
>> header, now you have multiple, contributing to an even greater overall
>> request size.
>>
>> Also, if a user stores more than one principal in the collection
>> returned from a Realm, you can't just delete all but the primary
>> principal without the user's knowledge - they might not have a way to
>> reconstitute a Subject's identity properly - i.e. the primary
>> principal might have auxiliary information necessary when accessing
>> account data. You have to assume that if they provide multiple
>> principals, they actually need those to be retained for account lookup
>> later.
>>
>> My desire was to try to serialize the PrincipalCollection explicitly
>> and not delegate to the HashMap instance - as Janne suggested also. I
>> think if we do that, we may very well find that we don't have to
>> change anything because the data size will probably be within a
>> suitable range.
>>
>> It's certainly worth trying before we change anything else IMO.
>>
>> Les
>>
>> On Mon, Dec 13, 2010 at 3:02 PM, Kalle Korhonen
>> <[email protected]> wrote:
>>> On Mon, Dec 13, 2010 at 2:53 PM, Janne Jalkanen
>>> <[email protected]> wrote:
>>>> By using explicit serialization for things like realm names one should be
>>>> able to shave off a number of bytes *especially* for the very common
>>>> single-realm, single-principal case. It's a bit late over here, but I'll
>>>> try and see if I can generate some data or a patch tomorrow.
>>>
>>> Great. Using the primary principal and a cookie per realm would make
>>> this quite a bit more generic without losing any of the benefits.
>>>
>>> Kalle
>>>
>>>
>>>> On Dec 13, 2010, at 22:25 , Les Hazlewood wrote:
>>>>
>>>>> I think it is a good use case, but I think we may not be on the same page
>>>>> yet.
>>>>>
>>>>> Unless I'm mistaken, the ID that Janne was talking about was a single
>>>>> user or account id in his own application. That corresponded to one
>>>>> principal in one realm only. I don't believe he was creating an ID
>>>>> that was a pointer to the PrincipalCollection instance, for example.
>>>>>
>>>>> So the question is: how do you efficiently represent a user's
>>>>> rememberMe identity when that identity could span multiple realms, or
>>>>> where there might be multiple principals, or a combination thereof?
>>>>>
>>>>> Are you implying that we create a RememberMeDAO to save the
>>>>> PrincipalCollection instance to a datastore (which will probably be
>>>>> fronted transparently with a cache) and send out the record's ID only
>>>>> in the cookie? That sounds like an extremely complicated solution
>>>>> since you'd have to come up with a purging strategy to handle orphan
>>>>> records - it's almost like solving the Session problem over again.
>>>>>
>>>>> My personal opinion is that I'd want to figure out a way to make the
>>>>> serialization output size more compact before going down that road.
>>>>> (It's something that should be done even if a DAO was used too).
>>>>>
>>>>> Regards,
>>>>>
>>>>> Les