On 10/17/11 3:31 PM, Selcuk AYA wrote:
Hi all,
I am hoping to send a more detailed email as to where I am in the txn
implementation but I noticed I might hit a snag with regard to the
various logical caches we maintain.
Ahha... Caches... Pain...
We have many of them. Let's first list all the caches we are using :
- dnCache : a DN cache used to spare DN parsing. If a DN is no longer
valid, it should be removed from the cache.
- subentryCache : a DN -> Subentry Map. We should update it when a
subentry is added/modified
- accessControlXXX caches : Caches used for AccessPoint. It's a DnNode
data structure, using the DN as an entry point.
- groupCache : a cache containing the groups
- tupleCache : a cache containing the ACI tuples
- kdcReplayCache : a Kerberos cache
- referrals cache : a cache used to manage referrals
- credentialCache : a LRUMap used for authentication
- registrations : a cache of notifications
- ObjectClass caches (must, may, superiors, allowed)
- TriggerSpecCache : a cache of IDs for the triggers
- notAliasCache : This is a weird cache. It's used to know if an entry's
parent is not an alias. IMO, we should rather have an Alias cache...
These are pretty much all the caches we declare, if we exclude the index
and master table cache (entry).
Obviously, many of those caches will be impacted by any modification
done in the server.
Emmanuel is moving these caches out of the interceptors, but up until
now they lived in interceptors. They map an entry DN to a logical
value that is a predicate of the entry attributes. Currently I am
aware of the notAlias and subentry caches as such caches (there could
be more), and it is not difficult to see that people might add such
caches in their custom interceptors.
An example of the transactional execution we might have according to
the planned implementation is this:
R1, T1, T2 -> R1 is before T1 and T2 and should be isolated from them.
T1 is committed and T2 started after T1, so it should see the effects
of T1. Also, the changes of T1 are not reflected (flushed) to the
underlying partitions yet. Remember that readers merge what they read
from partitions with the changes in the unflushed part of the txn log.
Now consider how we would make R1 and T2 see a consistent state of the
notAlias and subentry caches. It seems to me the only possible way is
to go ahead like we do with entries and index values: update the cache
when the txn log is being flushed to the partitions and, when these
caches are read, merge whatever we read with the txn log. However, each
separate cache requires separate logic to handle this merge, and I am
afraid it might get complicated and slow as the number of such caches
increases. Especially, expecting a custom cache implementer to get this
right seems dubious.
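
To make the merge-on-read idea concrete, here is a minimal Java sketch for a single cache. Everything in it is hypothetical: MergingCacheView, LoggedChange and the revision numbers are illustrative names, not existing ApacheDS classes or the planned txn log API. It assumes every reader carries the revision at which it started, that logCommit is called in commit order, and that a change is flushed only once no reader older than its commit revision is still active. It is also not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; none of these names exist in ApacheDS.
final class MergingCacheView<K, V> {

    // Cache changes made by one committed transaction that is not flushed yet.
    // Optional.empty() marks a key the transaction removed from the cache.
    static final class LoggedChange<K, V> {
        final long commitRevision;
        final Map<K, Optional<V>> changes;

        LoggedChange(long commitRevision, Map<K, Optional<V>> changes) {
            this.commitRevision = commitRevision;
            this.changes = new HashMap<>(changes);
        }
    }

    // State already flushed to the underlying cache.
    private final Map<K, V> flushed = new HashMap<>();
    // Committed but unflushed changes, assumed to be appended in commit order.
    private final List<LoggedChange<K, V>> unflushed = new ArrayList<>();

    void putFlushed(K key, V value) {
        flushed.put(key, value);
    }

    void logCommit(long commitRevision, Map<K, Optional<V>> changes) {
        unflushed.add(new LoggedChange<>(commitRevision, changes));
    }

    // Merge on read: the newest logged change visible to the reader wins,
    // otherwise fall back to the flushed state.
    V read(K key, long readerRevision) {
        for (int i = unflushed.size() - 1; i >= 0; i--) {
            LoggedChange<K, V> change = unflushed.get(i);
            if (change.commitRevision <= readerRevision && change.changes.containsKey(key)) {
                return change.changes.get(key).orElse(null);
            }
        }
        return flushed.get(key);
    }

    // Apply the oldest logged change to the flushed state and drop it from the log.
    void flushOldest() {
        if (unflushed.isEmpty()) {
            return;
        }
        LoggedChange<K, V> oldest = unflushed.remove(0);
        for (Map.Entry<K, Optional<V>> e : oldest.changes.entrySet()) {
            if (e.getValue().isPresent()) {
                flushed.put(e.getKey(), e.getValue().get());
            } else {
                flushed.remove(e.getKey());
            }
        }
    }
}
```

With T1 logged at revision 5, a reader like R1 that started at revision 3 still gets the flushed value, while T2 starting at revision 6 gets T1's value. The per-cache cost mentioned above shows up in read(): every lookup walks the unflushed part of the log.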
please let me know what you think
This is exactly right. We currently don't handle any kind of transaction
system for caches, so we might very well end up with an out-of-date cache
at some point. This is dangerous... OTOH, as we are now implementing an
MVCC mechanism, keeping the caches as they are is just not an option, and
we must keep the revision for entries and DNs, as they might have been
changed by another thread...
This is not an easy issue.
In many places, we are now using ehCache to manage caches, instead of
using a LRUMap, for instance, but this is not the case for all the
caches. For instance, in some places, we are using a DnNode cache (which
is used for partitions, accessControl, etc).
One possibility would be to associate a revision with each key, assuming
that each operation will have a revision number in its context. Any
modification impacting a cache would then just create a new element with
its revision in that cache.
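
A rough Java sketch of such a revision-keyed cache, to illustrate the idea only. RevisionedCache is a hypothetical helper, not an existing ApacheDS class and not part of the CacheService or of ehCache. It assumes revisions are monotonically increasing longs taken from the operation context. Removals would need a tombstone value, since ConcurrentSkipListMap does not accept nulls.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical helper; not part of ApacheDS, CacheService or ehCache.
final class RevisionedCache<K, V> {

    // For each key, the values ordered by the revision that wrote them.
    private final Map<K, ConcurrentSkipListMap<Long, V>> versions = new ConcurrentHashMap<>();

    // A modification never overwrites: it adds a new element tagged with its revision.
    void put(K key, long revision, V value) {
        versions.computeIfAbsent(key, k -> new ConcurrentSkipListMap<>()).put(revision, value);
    }

    // Return the newest value written at or before the reader's revision, or null if none.
    V get(K key, long readRevision) {
        ConcurrentSkipListMap<Long, V> byRevision = versions.get(key);
        if (byRevision == null) {
            return null;
        }
        Map.Entry<Long, V> entry = byRevision.floorEntry(readRevision);
        return entry == null ? null : entry.getValue();
    }

    // Drop the versions that no operation at or after oldestActiveRevision can still read.
    void prune(long oldestActiveRevision) {
        for (ConcurrentSkipListMap<Long, V> byRevision : versions.values()) {
            Long newestVisible = byRevision.floorKey(oldestActiveRevision);
            if (newestVisible != null) {
                byRevision.headMap(newestVisible, false).clear();
            }
        }
    }
}
```

An operation running at revision 3 that calls get(dn, 3) keeps seeing the element written at revision 1 even after another thread adds one at revision 5. The price is that old elements pile up until something like prune() is called with the oldest revision still in use.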
In order to improve the cache management, it would be good to always use
the CacheService, which provides some methods to easily manage caches.
That may be possible except for the DnNode cache, as it's a tree
hierarchy...
We should think seriously about the best way to solve this issue, as
it's really critical.
Thanks Selcuk !
regards
Selcuk
--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com