Hi Flavio,

We've done some more analysis using the snapshot formatter and a heap dump and 
have found the source of the snapshot bloat.

The majority of the space is taken by the longKeyMap from DataTree.  In 
the heap dump, aclKeyMap has as many entries (which is to be expected given how 
the maps are used) and takes an equally large amount of space, though 
at least aclKeyMap isn't serialised to the snapshot.

We use a custom authentication provider but because the 
AuthenticationProvider.matches method does not provide the path being operated 
on, we put the path in the ACL id.  Some of our apps generate a lot of 
one-time-use paths and consequently we end up with lots of unique ACLs.

The two ACL maps in DataTree seem to be an optimisation so that repeated usage 
of ACLs does not result in the full list being stored multiple times.  However, 
entries are never removed from these two maps, so every unique ACL makes the 
maps (and the snapshot) grow permanently.
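
To make the growth concrete, here is a simplified model of the behaviour 
described above.  It is only a sketch: the class AclIndex, the demo class and 
the use of plain strings for ACLs are made up for illustration and this is not 
the actual DataTree code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the two ACL maps; not the real DataTree.
class AclIndex {
    // Bidirectional index: id -> ACL list and ACL list -> id.
    final Map<Long, List<String>> longKeyMap = new HashMap<>();
    final Map<List<String>, Long> aclKeyMap = new HashMap<>();
    private long nextId = 0;

    // Return the id for an ACL list, allocating a new one on first sight.
    long convertAcls(List<String> acls) {
        Long id = aclKeyMap.get(acls);
        if (id == null) {
            id = ++nextId;
            aclKeyMap.put(acls, id);
            longKeyMap.put(id, acls);
        }
        return id;
    }
}

class AclIndexDemo {
    // Creates and deletes n nodes, each with a unique ACL, and reports how
    // many ACL entries are left behind afterwards.
    static int entriesAfterChurn(int n) {
        AclIndex index = new AclIndex();
        Map<String, Long> nodes = new HashMap<>();
        for (int i = 0; i < n; i++) {
            String path = "/jobs/" + i;
            List<String> acl = Collections.singletonList("custom:" + path);
            nodes.put(path, index.convertAcls(acl));
            nodes.remove(path); // the node is gone, its ACL entry is not
        }
        return index.longKeyMap.size();
    }
}
```

In this model the node map is empty after the loop, yet the ACL index holds 
one entry per path that ever existed.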

We're quite keen on fixing this as it's causing us lots of issues and we're 
happy to provide a patch but will need your opinion on the various options:
- create a third map which would hold a reference count for each ACL and be 
updated as needed when creating, deleting or setting an ACL.  When the 
reference count reaches 0, remove the entry from all the maps
- use weak references in some shape or form, though this is made harder by the 
fact that the ACL optimisation essentially needs a bidirectional index (hence 
the two maps).  We've given this one lots of thought but it would really 
require something like a ConcurrentWeakBiHashMap which just sounds wrong and 
over-engineered :)
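
A rough sketch of the reference count option, to show the shape of the idea 
rather than a finished patch.  The class RefCountedAclIndex and the method 
names acquire and release are hypothetical, ACLs are modelled as plain strings, 
and coarse synchronisation stands in for whatever locking DataTree would 
actually need.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reference-counted variant of the two ACL maps.
class RefCountedAclIndex {
    private final Map<Long, List<String>> longKeyMap = new HashMap<>();
    private final Map<List<String>, Long> aclKeyMap = new HashMap<>();
    private final Map<Long, Integer> refCounts = new HashMap<>();
    private long nextId = 0;

    // Would be called when a node is created or has its ACL set.
    synchronized long acquire(List<String> acls) {
        Long id = aclKeyMap.get(acls);
        if (id == null) {
            id = ++nextId;
            aclKeyMap.put(acls, id);
            longKeyMap.put(id, acls);
        }
        refCounts.merge(id, 1, Integer::sum);
        return id;
    }

    // Would be called when a node is deleted or its ACL is replaced.
    synchronized void release(long id) {
        Integer count = refCounts.get(id);
        if (count == null) {
            return; // unknown id: nothing to release
        }
        if (count > 1) {
            refCounts.put(id, count - 1);
        } else {
            // Last reference gone: drop the entry from all three maps.
            refCounts.remove(id);
            List<String> acls = longKeyMap.remove(id);
            aclKeyMap.remove(acls);
        }
    }

    synchronized int size() {
        return longKeyMap.size();
    }
}
```

With this shape, setting a new ACL on a node would be an acquire of the new 
list followed by a release of the old id.  The counts would presumably also 
have to be rebuilt from the nodes when a snapshot is loaded, which the sketch 
does not cover.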

The other fix that could be made is to pass the path being operated on to the 
AuthenticationProvider.  However, doing that in a backwards compatible fashion 
is not trivial and even though it would fix my problem (by allowing me to 
remove the path from the ACL id) it wouldn't fix the general problem with this 
optimisation.
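
For what it's worth, one backwards compatible shape for this would be an 
opt-in extension interface that the server checks for before falling back to 
the existing two-argument matches call.  This is purely a sketch: Provider 
below is a stand-in for the existing provider contract, and PathAwareProvider 
and AclChecker are hypothetical names that do not exist in ZooKeeper.

```java
// Stand-in for the existing provider contract (only the relevant method).
interface Provider {
    boolean matches(String id, String aclExpr);
}

// Hypothetical opt-in extension; existing providers would stay untouched.
interface PathAwareProvider extends Provider {
    boolean matches(String path, String id, String aclExpr);
}

// Hypothetical call site: prefer the path-aware method when it is available.
class AclChecker {
    static boolean check(Provider p, String path, String id, String aclExpr) {
        if (p instanceof PathAwareProvider) {
            return ((PathAwareProvider) p).matches(path, id, aclExpr);
        }
        return p.matches(id, aclExpr);
    }
}
```

Providers that only implement the existing method would keep working 
unchanged, while ours could stop encoding the path in the ACL id.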

Looking forward to hearing your thoughts on this.

Thanks,
Karol

> On 22 Feb 2015, at 14:55, Flavio Junqueira <fpjunque...@yahoo.com.INVALID> 
> wrote:
> 
> Hi Karol,
> 
> It's odd that you have such large snapshots and little data in the data tree. 
> Are you creating lots of sessions? Right now I can't think of a good reason, 
> so I suggest you use the snapshot formatter to inspect the snapshot.
> 
> -Flavio
> 
>> On 22 Feb 2015, at 14:23, Karol Dudzinski <karoldudzin...@gmail.com> wrote:
>> 
>> Hi Flavio,
>> 
>> Yes, one of our clients had a bug which caused it to go into a 
>> create/delete tight loop with zero net effect (i.e. it was deleting what it 
>> had just created). After stopping the client, the snapshot never reduced in 
>> size, so are the deletes in there permanently?
>> 
>> Thanks,
>> Karol
>> 
>> 
>>> On 22 Feb 2015, at 14:05, Flavio Junqueira <fpjunque...@yahoo.com.INVALID> 
>>> wrote:
>>> 
>>> Hi there,
>>> 
>>> Perhaps a lot of data has been deleted? In any case, you may want to use 
>>> the SnapshotFormatter to check what is in the large snapshot.
>>> 
>>> -Flavio
>>> 
>>>> On 22 Feb 2015, at 10:44, Karol Dudzinski <karoldudzin...@gmail.com> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> I was under the impression that the snapshot contained essentially an 
>>>> on-disk copy of all the data.  However, one of our clusters has a snapshot 
>>>> which is over 1GB while the mntr four letter word reports an approximate 
>>>> data size in the hundreds of KB and a node count in the low thousands.  So 
>>>> what else goes into the snapshot and how can I slim it down?
>>>> 
>>>> Thanks,
>>>> Karol
> 
