Hi,

On Mon, Oct 21, 2013 at 6:38 AM, Chetan Mehrotra
<chetan.mehro...@gmail.com> wrote:
> Marcel - The basic concern raised is that listeners without any filter
> would cause lots of reads on the repository. These kinds of listeners
> would pull in the modifications of all sessions performing distributed
> writes. In our view this will not work well, because it puts a very
> high load on each of the cluster nodes and will likely delay delivery
> of events.
>
> Thomas - As for the theoretical bottleneck, it is quite clear: let's
> assume there are 1000 cluster nodes, and each one writes 1000 nodes
> per second. There would then be 1 million events per second on _each_
> cluster node, and 1 billion events per second in the system. It cannot
> possibly scale. Where exactly the bottleneck is (diffing, creating the
> events, whatever) doesn't matter all that much in my view.
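For concreteness, Thomas's back-of-the-envelope numbers work out like this (a trivial sketch that just restates the arithmetic; the cluster size and write rate are the assumed figures from above, not measurements):

```java
public class EventRateEstimate {
    public static void main(String[] args) {
        long clusterNodes = 1000;          // assumed cluster size
        long writesPerNodePerSec = 1000;   // assumed write rate per node

        // An unfiltered listener on any one node sees every write
        // happening anywhere in the cluster.
        long eventsPerNodePerSec = clusterNodes * writesPerNodePerSec;

        // With such a listener on every node, the total delivery
        // rate across the whole system is:
        long eventsSystemWidePerSec = eventsPerNodePerSec * clusterNodes;

        System.out.println(eventsPerNodePerSec);     // 1000000 (1 million)
        System.out.println(eventsSystemWidePerSec);  // 1000000000 (1 billion)
    }
}
```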
Instead of a repository problem (like diffing, event creation, etc.), this analysis tells me that the bottleneck here is the application that tries to listen to so many events. It doesn't matter how much we optimize the observation code inside the repository, as no non-distributed listener is ever going to be able to handle a billion events per second.

> Change the logic such that we have one Oak listener which listens for
> changes on the root and then delivers to all JCR listeners (after
> filtering by path -> nodeType -> user). Then we would be able to
> reduce the number of calls to the persistent store or cache
> considerably. So it changes the current logic, where N listeners
> independently pull in changes, to one where we have 1 Oak listener
> per node which pulls in changes and then delivers to all, serving the
> same role as the Sling listener does for the OSGi stack.

-1 This introduces the problem where a single JCR event listener can block or slow down all the other listeners.

I'm also not convinced by the assumption that the observation listeners put undue pressure on the underlying MK or its caching. Do we have some data to prove this point? My reasoning is that if in any case we have a single (potentially multiplexed, as suggested) listener that wants to read all the changed nodes, then those nodes still need to be accessed from the MK and placed in the cache. If another listener does the same thing, it will most likely find the items in the cache and not repeat the MK accesses. The end result is that the main performance cost goes to the first listener, and any additional ones come mostly for free; thus the claimed performance benefit of multiplexing observers is IMHO questionable.

More generally, the basic premise here seems to be that a single listener would need to scale to observe an entire, highly scaled repository with lots of concurrent writes. As explained by Thomas above, that simply isn't going to work in the high-end use cases.
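To illustrate the blocking concern with a toy model (the cost figures are hypothetical and this is not actual Oak observation code): with a single multiplexed dispatcher delivering synchronously, the last listener in line waits for the processing cost of everyone before it, whereas with independent listeners each one only pays its own cost.

```java
import java.util.List;

public class DispatchLatencyModel {
    // Multiplexed, synchronous delivery: listeners are served in turn,
    // so the worst-case delivery latency is the sum of all costs.
    static long multiplexedWorstCaseMs(List<Long> costsMs) {
        return costsMs.stream().mapToLong(Long::longValue).sum();
    }

    // Independent listeners: each pulls its own changes, so the
    // worst-case latency is just the slowest listener's own cost.
    static long independentWorstCaseMs(List<Long> costsMs) {
        return costsMs.stream().mapToLong(Long::longValue).max().orElse(0L);
    }

    public static void main(String[] args) {
        // Hypothetical per-event costs: two fast listeners, one slow one.
        List<Long> costsMs = List.of(1L, 1L, 500L);
        System.out.println(multiplexedWorstCaseMs(costsMs));  // 502
        System.out.println(independentWorstCaseMs(costsMs));  // 500
    }
}
```

In the independent model the slow listener still lags, but it hurts only itself; in the multiplexed model it delays delivery for every other listener on the node.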
Thus, what we really should be doing to ensure the full scalability of observation is to try to get rid of observers that listen to all changes across a cluster. And where we can't do that, we simply accept the scalability limit inherent in an application design that requires such unlimited observers.

BR,

Jukka Zitting