[ https://issues.apache.org/jira/browse/OAK-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810698#comment-13810698 ]
Alexander Klimetschek edited comment on OAK-1133 at 10/31/13 8:59 PM:
----------------------------------------------------------------------

*Clustered/external events* are a somewhat separate topic. I totally agree that you want to avoid them. AFAICS most events can be handled locally - in my experience most application use cases require local handling anyway, since there are other things local to the instance that you depend on, mostly a sticky session from the web server to ensure users see the data as quickly as possible.

But still I think an enormous waste is going on if you look at *actual listeners in applications that register broadly for all events*: you need all the eventing/threading going on, just to figure out in 80% of the cases that the event can be discarded by the listener. And the listener has to read the repository data again in a separate session just to do the check. *This is not convenience, this is reducing unnecessary work*.

Now the same principle can be used for cluster events, if you need them: I don't think an external mechanism such as JMS would really help, as you would add an extra data stream between instances that would need to send *all events*, since you cannot know whether a listener on a target instance is interested in a given event or not (you cannot assume that the listener code is shared and that the specific registration happens on all cluster nodes, which is what would allow you to filter events down to just the ones for which there are listeners on the other nodes).

There already is a cluster sync happening between instances, and once a change arrives on the target instance, the same approach as proposed here could happen: those registered filters would run (not sure if the oak {{Observer}} as hook works here) and only trigger events (local) if needed. Of course those *listener registrations would include whether they care about local or external events*.
Again, this is different from today, where you have to look at the event as it arrives in the listener to decide "oh this was external, no I don't care" after having already wasted precious resources on that event. And by default a listener would not register for external events; it would take a dedicated extra step in the registration API to do so, to discourage accidentally registering for them, based on the experience that maybe 98% of observation use cases don't need external events.


was (Author: alexander.klimetschek):
Clustered/external events is a somewhat separate topic. I totally agree that you want to avoid them. AFAICS most events can be handled locally - from my experience most application use cases require local handling anyway, since there are other things local to the instance that you depend on, mostly a sticky session from the web server to ensure users see the data as quickly as possible.

But still I think an enormous waste is going on if you look at actual listeners in applications that registers broadly for all events and you need all the eventing/threading going on, just to figure out in 80% of the cases that this event can be discarded by the listener. And the listener has to read the repository data again in a separate session just to do the check. This is not convenience, this is reducing unnecessary work.

Now the same principle can be used for cluster events, if you need them: I don't think an external mechanism such as JMS would really help, as you would add an extra data stream between instances that would need to send *all events*, since you cannot know if a listener on a target instance is interested in it or not (you cannot assume the listener code is shared and the specific registration happens on all cluster nodes, allowing you to filter out events to just the ones for which there are listeners on the other nodes).
There already is a cluster sync happening between instances and once it arrives on the target instance, the same approach as proposed here would happen: those registered filters would run (not sure if the oak {{Observer}} as hook works here) and only send out events if needed. Of course those registrations/filters would include whether they care about local or external events.

Again different from today where you have to look at the event as it arrives in the listener to decide "oh this was external, no I don't care" after having already wasted precious resources for that event. And by default a listener would not register for external events, it would need to be a very dedicated extra step in the registration API to do so, to discourage accidentally registering for them, based on the experience that maybe 98% of observation use cases don't need external events.

> Observation listener PLUS
> -------------------------
>
>                 Key: OAK-1133
>                 URL: https://issues.apache.org/jira/browse/OAK-1133
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: commons, jcr
>            Reporter: Alexander Klimetschek
>              Labels: performance
>
> Oak should provide an *extended and efficient JCR observation listener*
> mechanism to support common use cases not handled well by the restricted
> options of the JCR observation (only base path, node types and raw events).
> Those cases require listeners to register much more broadly and then filter
> out their specific cases themselves, thus putting too many events into the
> observation system and creating a huge overhead due to asynchronous access to
> the modified JCR data to do the filtering. This easily is a big performance
> bottleneck with many writes and thus many events.
> Previous discussions [on the
> list|http://markmail.org/message/oyq7fnfrveceemoh] and in OAK-1120.
> The goals should be:
> * performance: handle filtering as early as possible, during the commit,
> where access to the modified data is already present
> * provide a robust implementation for typical filtering cases
> * provide an asynchronous listener mechanism as in JCR
> * minimize the effect on the lower levels of Oak (a visible addition in
> oak-commons or oak-jcr should be enough)
> * for delete events, allow filtering on the to-be-deleted data (currently not
> possible in jcr listeners, which run after the fact)
> * if possible: design as an extension of the jcr observation to simplify
> migration for existing code
> * if possible: provide an intelligent listener that can work with pure JCR
> (aka Jackrabbit 2) as well, by falling back to in-listener-filtering
> * maybe: synchronous option using the same simple interface (instead of raw
> Oak plugins themselves); however, not sure if there is a benefit if they can
> only read data and not change or block the session commit
> Typical filtering cases:
> - paths with globbing support (for example /content/foo/*/something)
> - check for property values (equal, not equal, contains etc.), most
> importantly sling:resourceType in Sling apps
> - allow checking properties on child nodes as well, typically jcr:content
> - node types (already in jcr observation)
> - created/modified/deleted events, separate from move/copy
> - and more... a custom filter should be possible to pass through (with
> similar access as the {{Observer}})
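
To make the filtering cases above a bit more concrete, here is a rough sketch in plain Java of what evaluating a registration-time filter before any event is dispatched might look like. Everything in it is made up for illustration and is not Oak or JCR API: the names {{EarlyFilterSketch}}, {{Change}}, {{Filter}} and {{includeExternal}}, the sample paths, and the resource type values. It also assumes that {{*}} in a path glob matches exactly one path segment and that a listener only sees local changes unless it explicitly asks for external ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are Oak or JCR API.
public class EarlyFilterSketch {

    /** A change as visible during commit (or cluster sync), with the node's properties at hand. */
    static final class Change {
        final String path;
        final Map<String, String> properties;
        final boolean external; // true if the change came from another cluster node

        Change(String path, Map<String, String> properties, boolean external) {
            this.path = path;
            this.properties = properties;
            this.external = external;
        }
    }

    /** What a listener would declare once, at registration time. */
    static final class Filter {
        final String pathGlob;
        final String propertyName;
        final String propertyValue;
        final boolean includeExternal; // assumed to be an explicit opt-in

        Filter(String pathGlob, String propertyName, String propertyValue, boolean includeExternal) {
            this.pathGlob = pathGlob;
            this.propertyName = propertyName;
            this.propertyValue = propertyValue;
            this.includeExternal = includeExternal;
        }

        boolean matches(Change change) {
            if (change.external && !includeExternal) {
                return false; // dropped before any event is created
            }
            return globMatches(pathGlob, change.path)
                    && propertyValue.equals(change.properties.get(propertyName));
        }
    }

    /** Assumption: "*" matches exactly one path segment. */
    static boolean globMatches(String glob, String path) {
        String[] g = glob.split("/");
        String[] p = path.split("/");
        if (g.length != p.length) {
            return false;
        }
        for (int i = 0; i < g.length; i++) {
            if (!g[i].equals("*") && !g[i].equals(p[i])) {
                return false;
            }
        }
        return true;
    }

    /** Returns the paths of the changes that would actually become events. */
    static List<String> select(Filter filter, List<Change> changes) {
        List<String> paths = new ArrayList<>();
        for (Change change : changes) {
            if (filter.matches(change)) {
                paths.add(change.path);
            }
        }
        return paths;
    }

    /** Made-up sample data: three local changes and one external change. */
    static List<Change> sampleChanges() {
        Map<String, String> page = Collections.singletonMap("sling:resourceType", "app/page");
        Map<String, String> other = Collections.singletonMap("sling:resourceType", "app/other");
        return Arrays.asList(
                new Change("/content/foo/a/something", page, false),
                new Change("/content/foo/b/something", other, false),
                new Change("/content/bar/a/something", page, false),
                new Change("/content/foo/c/something", page, true));
    }

    public static void main(String[] args) {
        Filter localOnly = new Filter("/content/foo/*/something", "sling:resourceType", "app/page", false);
        System.out.println(select(localOnly, sampleChanges()));
    }
}
```

With the sample data, only {{/content/foo/a/something}} passes the local-only filter; the other three changes are discarded without a listener thread or a second session being involved. The same filter with {{includeExternal}} set to true would also let {{/content/foo/c/something}} through, which is the explicit opt-in for external events described in the comment.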