Re: Scalability of EventListener
On 26.10.2013, at 09:49, Justin Edelson jus...@justinedelson.com wrote:

> Why does the workflow engine have to make that decision itself? Can't that decision be expressed declaratively? They are, after all, being defined declaratively by the administrator, so it should just be a matter of mapping the administrator's selections to a filter expression and then letting the EventAdmin implementation handle the filter. For example, you could have a workflow launcher expressed as a listener with a filter of (&(resourceType=nt:file)(path=/content/dam/*/original)).

Yes, but someone needs to be able to implement that. It would have to cover more than a resource type check: any property, e.g. (!(someProperty=somevalue)). Whether this lives in the workflow engine or in the event admin doesn't matter from my POV; both are above Oak in the stack. Even firing the event admin in OSGi would be wasted resources if you can know upfront - while you have access to the content being changed - that you don't need to fire the event at all.

> NOTE - I'm not necessarily saying that everything in CQ's workflow engine is supported by Sling Resource events *today*; there might be a need to add new event properties, but that's far from impossible. My point is that it is in large part the lack of good filtering (which is available in EventAdmin and JMS) that leads to having multiple dispatching systems in the same application, and that good filtering is already available in Sling applications *because* of the Observation/EventAdmin bridge.

We are talking about performance and scalability, and I think for that it is best handled as early as possible.

Cheers,
Alex
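To make the declarative idea from this exchange concrete, here is a minimal Java sketch of how a launcher's two conditions (on resourceType and path) could be held as data and checked against an event's properties. It is plain JDK code and not the OSGi filter implementation; the class name `LauncherFilter` and its methods are assumptions made for illustration, and it only supports an AND of equality conditions with `*` wildcards.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper: an AND of property conditions, where '*' in a value matches any characters. */
final class LauncherFilter {
    private final Map<String, Pattern> conditions = new LinkedHashMap<>();

    /** Adds one condition; every condition must hold for the filter to match. */
    LauncherFilter where(String property, String valueWithWildcards) {
        String[] literals = valueWithWildcards.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*"); // each '*' becomes "any run of characters"
            }
            regex.append(Pattern.quote(literals[i]));
        }
        conditions.put(property, Pattern.compile(regex.toString()));
        return this;
    }

    /** Returns true only if all conditions match; a missing property never matches. */
    boolean matches(Map<String, String> eventProperties) {
        for (Map.Entry<String, Pattern> condition : conditions.entrySet()) {
            String actual = eventProperties.get(condition.getKey());
            if (actual == null || !condition.getValue().matcher(actual).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LauncherFilter filter = new LauncherFilter()
                .where("resourceType", "nt:file")
                .where("path", "/content/dam/*/original");
        Map<String, String> event = Map.of(
                "resourceType", "nt:file",
                "path", "/content/dam/photos/a.jpg/original");
        System.out.println(filter.matches(event));
    }
}
```

In an OSGi container the framework's own `Filter` (obtained via `FrameworkUtil.createFilter`) would normally parse and evaluate such expressions; the sketch only shows that the administrator's selections can be kept as data rather than as listener code.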
Re: Scalability of EventListener
Hi,

On Fri, Oct 25, 2013 at 4:54 PM, Alexander Klimetschek aklim...@adobe.com wrote:

> On 24.10.2013, at 16:55, Justin Edelson jus...@justinedelson.com wrote:
>
>> Given that there's already an expression syntax defined in the EventAdmin specification, wouldn't it make more sense for Oak to just use OSGi Events? That's the only way I personally could foresee us being in a position to deprecate/remove the JCR Observation/OSGi EventAdmin bridge which we have in Sling today.
>
> I don't think Oak wants to be tied to OSGi, but the expression syntax could maybe be reused.

Better IMHO would be for Oak to use the event/messaging system provided by the platform. In an OSGi environment, this would be EventAdmin. In JavaEE, it would be JMS (for which message selectors can be used for filtering).

>> FWIW, I'm not sure it's a straight comparison between the bridge we have in Sling and the CQ Workflow engine. The CQ Workflow engine *could* be implemented by registering specific OSGi event listeners (one per launcher). It just isn't (or so your email implies :).
>
> Not really, because it has to make the decision on matching events *itself*. Because of that, it is natural to only have one central listener (it quickly forks off into separate jobs once a match for a certain entry has been found).

Why does the workflow engine have to make that decision itself? Can't that decision be expressed declaratively? They are, after all, being defined declaratively by the administrator, so it should just be a matter of mapping the administrator's selections to a filter expression and then letting the EventAdmin implementation handle the filter. For example, you could have a workflow launcher expressed as a listener with a filter of (&(resourceType=nt:file)(path=/content/dam/*/original)).

NOTE - I'm not necessarily saying that everything in CQ's workflow engine is supported by Sling Resource events *today*; there might be a need to add new event properties, but that's far from impossible.

My point is that it is in large part the lack of good filtering (which is available in EventAdmin and JMS) that leads to having multiple dispatching systems in the same application, and that good filtering is already available in Sling applications *because* of the Observation/EventAdmin bridge.

Justin
Re: Scalability of EventListener
On 24.10.2013, at 16:55, Justin Edelson jus...@justinedelson.com wrote:

> Given that there's already an expression syntax defined in the EventAdmin specification, wouldn't it make more sense for Oak to just use OSGi Events? That's the only way I personally could foresee us being in a position to deprecate/remove the JCR Observation/OSGi EventAdmin bridge which we have in Sling today.

I don't think Oak wants to be tied to OSGi, but the expression syntax could maybe be reused.

> FWIW, I'm not sure it's a straight comparison between the bridge we have in Sling and the CQ Workflow engine. The CQ Workflow engine *could* be implemented by registering specific OSGi event listeners (one per launcher). It just isn't (or so your email implies :).

Not really, because it has to make the decision on matching events *itself*. Because of that, it is natural to only have one central listener (it quickly forks off into separate jobs once a match for a certain entry has been found).

What I am asking for is that Oak provides those options (match on property values etc.) directly, so that matching can be handled as early as possible, without Oak having to call an event listener in the first place (and start a new thread etc.) in case there is no match.

Cheers,
Alex
Re: Scalability of EventListener
As stated, so far the whole discussion is very theoretical - apart from the point that right now the jcr listener does a read for each and every changed node to get the resource type. This could definitely be improved with Oak: since Oak is doing a content diff, it should be possible to send this information down to the jcr listener. This would speed things up dramatically.

But without concrete numbers and performance information, all of this sounds like premature optimization. Reinventing the wheel just slightly differently, without any way to prove whether it helps, just doesn't sound right to me. We need to have a look at the whole system - it doesn't help if we talk about fixing one part of it for such scenarios while completely neglecting other parts.

Therefore (I know I repeat myself), I'm really looking forward to numbers showing the bottlenecks.

Thanks
Carsten
--
Carsten Ziegeler
cziege...@apache.org
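The improvement mentioned in this message (letting the content diff carry the resource type, so the listener need not read each changed node again) can be sketched roughly as follows. This is a plain-JDK illustration, not Oak's diff API: nodes are modelled as maps from path to properties, and the class `DiffWithResourceType` and its `Change` record are hypothetical names. Only the property name `sling:resourceType` is taken from Sling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch: a content diff that attaches the resource type to each change record. */
final class DiffWithResourceType {
    enum Kind { ADDED, CHANGED, REMOVED }

    static final String RESOURCE_TYPE = "sling:resourceType";

    static final class Change {
        final String path;
        final Kind kind;
        final String resourceType; // null if the node has no such property

        Change(String path, Kind kind, String resourceType) {
            this.path = path;
            this.kind = kind;
            this.resourceType = resourceType;
        }
    }

    /** Compares two snapshots (path -> properties) and returns changes in path order. */
    static List<Change> diff(Map<String, Map<String, String>> before,
                             Map<String, Map<String, String>> after) {
        List<Change> changes = new ArrayList<>();
        TreeSet<String> paths = new TreeSet<>(before.keySet());
        paths.addAll(after.keySet());
        for (String path : paths) {
            Map<String, String> old = before.get(path);
            Map<String, String> now = after.get(path);
            if (old == null) {
                changes.add(new Change(path, Kind.ADDED, now.get(RESOURCE_TYPE)));
            } else if (now == null) {
                changes.add(new Change(path, Kind.REMOVED, old.get(RESOURCE_TYPE)));
            } else if (!old.equals(now)) {
                changes.add(new Change(path, Kind.CHANGED, now.get(RESOURCE_TYPE)));
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> before =
                Map.of("/var/jobs/a", Map.of(RESOURCE_TYPE, "example/job"));
        Map<String, Map<String, String>> after = Map.of(
                "/var/jobs/a", Map.of(RESOURCE_TYPE, "example/job", "state", "done"),
                "/var/jobs/b", Map.of(RESOURCE_TYPE, "example/job"));
        for (Change change : diff(before, after)) {
            System.out.println(change.kind + " " + change.path + " " + change.resourceType);
        }
    }
}
```

Because the diff already has both node states in hand, adding the resource type costs a map lookup, whereas a listener that receives only a path has to go back to the repository for it.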
Re: Scalability of EventListener
I don't think it's so theoretical: a generic observation listener such as the jcr listener in Sling, which handles all events all the time with no option to reduce the scope in any way, seems like a barrier to scalability.

The goal should be that any kind of observation is always as specific as possible (by path, properties, resource type etc.), and that this is ideally handled at the Oak level, where it can be optimized the most.

The basic problem was that the JCR observation listener API doesn't allow many constraints: only path and node types, but not property values or other things. So everyone came up with their own generic dispatching listener: Sling with one that checks the resource type, the aforementioned workflow launcher in Adobe CQ, which allows any property values and other constraints, etc.

There should be one in Oak, with a few common options (property values etc.) that can be heavily optimized, and optionally a custom function that runs on the node in question and could handle more complex decisions on whether to handle the event or not (at your own risk regarding performance :)).

Cheers,
Alex
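One way to picture the proposal in this message is a registration that carries a few cheap, common constraints plus an optional custom function, all evaluated before any listener is invoked. The following Java sketch uses only the JDK; `FilteringDispatcher`, `Change` and the `register` signature are hypothetical and do not correspond to an existing Oak or JCR API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical sketch: registrations carry their own constraints, checked before any listener runs. */
final class FilteringDispatcher {
    static final class Change {
        final String path;
        final Map<String, String> properties;

        Change(String path, Map<String, String> properties) {
            this.path = path;
            this.properties = properties;
        }
    }

    private static final class Registration {
        final String pathPrefix;
        final String property;          // null means "no property constraint"
        final String value;
        final Predicate<Change> custom; // null means "no custom check"
        final Consumer<Change> listener;

        Registration(String pathPrefix, String property, String value,
                     Predicate<Change> custom, Consumer<Change> listener) {
            this.pathPrefix = pathPrefix;
            this.property = property;
            this.value = value;
            this.custom = custom;
            this.listener = listener;
        }

        /** Cheap checks first; the custom function runs last and only if they pass. */
        boolean accepts(Change change) {
            if (!change.path.startsWith(pathPrefix)) {
                return false;
            }
            if (property != null && !Objects.equals(value, change.properties.get(property))) {
                return false;
            }
            return custom == null || custom.test(change);
        }
    }

    private final List<Registration> registrations = new ArrayList<>();

    void register(String pathPrefix, String property, String value,
                  Predicate<Change> custom, Consumer<Change> listener) {
        registrations.add(new Registration(pathPrefix, property, value, custom, listener));
    }

    /** Delivers the change to matching listeners only and returns how many were invoked. */
    int dispatch(Change change) {
        int invoked = 0;
        for (Registration registration : registrations) {
            if (registration.accepts(change)) {
                registration.listener.accept(change);
                invoked++;
            }
        }
        return invoked;
    }

    public static void main(String[] args) {
        FilteringDispatcher dispatcher = new FilteringDispatcher();
        dispatcher.register("/content/dam/", "jcr:primaryType", "nt:file", null,
                change -> System.out.println("handling " + change.path));
        dispatcher.dispatch(new Change("/content/dam/a.jpg", Map.of("jcr:primaryType", "nt:file")));
        dispatcher.dispatch(new Change("/apps/x", Map.of("jcr:primaryType", "nt:folder")));
    }
}
```

The ordering in `accepts` reflects the "at your own risk" remark: the declarative constraints are the part a repository could index or optimize, while the custom function is opaque and is only reached when the cheap checks have already passed.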
Re: Scalability of EventListener
Hi Alex,

Given that there's already an expression syntax defined in the EventAdmin specification, wouldn't it make more sense for Oak to just use OSGi Events? That's the only way I personally could foresee us being in a position to deprecate/remove the JCR Observation/OSGi EventAdmin bridge which we have in Sling today.

FWIW, I'm not sure it's a straight comparison between the bridge we have in Sling and the CQ Workflow engine. The CQ Workflow engine *could* be implemented by registering specific OSGi event listeners (one per launcher). It just isn't (or so your email implies :).

The Observation/EventAdmin bridge, however, is intended to bridge these two eventing systems, and so it pretty much has to listen for all events and then rebroadcast them (in slightly mutated form). The same is true if you are building an EventAdmin/JMS bridge or a JCR Observation/JMS bridge or anything like that.

Justin
Re: Scalability of EventListener
I just realized that there might be a need to dynamically add paths to the listener. I have at least two cases in mind where the paths cannot be determined at deployment time. One is the workflow launcher in CQ (where a user defines a ruleset at runtime), which might be changed to register individual listeners. The other is a temporary need to watch for changes to specific nodes (e.g. when a cache needs to be invalidated based on content changes, so the listener only needs to react to changes of resources relevant to the invalidation logic).

While writing this I came up with a question: what about global listeners such as the replication agent of CQ, which potentially needs to act on all paths? The restrictions I have in mind are the path /content, and that it would be sufficient to have one aggregated event per page (the substructure containing all resources used for rendering one request), which needs to be touched on changes to its subcontent anyway (lastmodified).

--
Dominik
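The need described in this message, adding and removing watched paths while the system is running, can be sketched with a small thread-safe set of subtree roots. This is a plain-JDK illustration; `WatchedPaths` and its method names are hypothetical, and the example paths are made up.

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

/** Hypothetical sketch: the set of watched subtrees can change while the listener stays registered. */
final class WatchedPaths {
    // Copy-on-write suits this case: lookups happen on every change, updates are rare.
    private final Set<String> roots = new CopyOnWriteArraySet<>();

    void watch(String root) {
        roots.add(root);
    }

    void unwatch(String root) {
        roots.remove(root);
    }

    /** True if the path is a watched root or lies below one. */
    boolean isWatched(String path) {
        for (String root : roots) {
            if (root.equals("/") || path.equals(root) || path.startsWith(root + "/")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        WatchedPaths watched = new WatchedPaths();
        watched.watch("/content/site/en");            // e.g. while a cache entry depends on it
        System.out.println(watched.isWatched("/content/site/en/page"));
        watched.unwatch("/content/site/en");          // entry invalidated, interest ends
        System.out.println(watched.isWatched("/content/site/en/page"));
    }
}
```

Matching on `root + "/"` rather than on the bare prefix keeps `/content/site/en` from accidentally covering a sibling such as `/content/site/english`.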
Re: Scalability of EventListener
Hi,

On Wed, Oct 23, 2013 at 11:30 AM, Dominik Süß dominik.su...@gmail.com wrote:

> ...I have at least 2 cases in mind where the paths cannot be determined at deployment time...

We studied observation usage patterns [1] earlier this year on a few Sling-based apps and it turns out that a lot of that can be done without observation, which might help scalability.

So I'd say let's concentrate on Sling itself, and provide a way to be more specific in how observation is used, for the cases where that matters. And/or replace observation with other mechanisms where appropriate.

A simple way of making our wide observation more specific is to require users of our rebroadcast OSGi events to indicate more specifically what they're interested in. This can be enabled by a configuration switch, so that small apps that don't care require no changes, and larger systems that require scalability can turn on that switch and have to do a bit of work to provide more details. Large scalable systems require work anyway, so this is not too much to ask for IMO.

-Bertrand

[1] https://cwiki.apache.org/confluence/display/SLING/Observation+usage+patterns
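The configuration switch suggested in this message could behave roughly like the following Java sketch: with the switch off, an unscoped registration silently falls back to the whole repository; with it on, the consumer has to state what it is interested in. The class `InterestRegistry` and the `strict` flag are assumptions for illustration, not an existing Sling setting.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a "strict" switch that rejects unscoped registrations. */
final class InterestRegistry {
    private final boolean strict;
    private final List<String> interests = new ArrayList<>();

    InterestRegistry(boolean strict) {
        this.strict = strict;
    }

    /** Registers interest in a subtree; null means "everything", which strict mode refuses. */
    void register(String pathPrefix) {
        if (pathPrefix == null) {
            if (strict) {
                throw new IllegalArgumentException("strict mode: a path prefix is required");
            }
            pathPrefix = "/"; // lenient mode keeps the old global behaviour
        }
        interests.add(pathPrefix);
    }

    /** Returns a copy so callers cannot modify the registry's internal list. */
    List<String> interests() {
        return new ArrayList<>(interests);
    }

    public static void main(String[] args) {
        InterestRegistry small = new InterestRegistry(false);
        small.register(null);
        System.out.println(small.interests());

        InterestRegistry large = new InterestRegistry(true);
        large.register("/var/jobs");
        System.out.println(large.interests());
    }
}
```

Failing fast at registration time means a large deployment finds its unscoped consumers when they start, not later under load.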
Re: Scalability of EventListener
Hi,

On Tue, Oct 22, 2013 at 5:48 PM, Carsten Ziegeler cziege...@apache.org wrote:

> ...The mapping handler in the resource resolver is probably the most interesting one as it checks for nodes with some well defined properties, basically scanning the whole repository...

This is one example where latency is not a problem, so periodic queries could be used instead of observation if that's more scalable.

We're basically interested in whether anything changed in the results of a query since we last ran it, a pattern which might be optimized in Oak by taking advantage of the underlying MVCC storage.

-Bertrand
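The pattern described in this message (run a query periodically and ask what changed in its results since the last run) can be sketched in plain Java as follows. The query is modelled as a `Supplier` returning a set of paths; `QueryPoller` is a hypothetical name, and the sketch only detects paths entering or leaving the result, not modifications to nodes that still match.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

/** Hypothetical sketch: poll a query and report which result paths appeared or disappeared. */
final class QueryPoller {
    private final Supplier<Set<String>> query;
    private Set<String> lastResult = new TreeSet<>();

    QueryPoller(Supplier<Set<String>> query) {
        this.query = query;
    }

    /** Runs the query once and returns the paths that were added to or removed from the result. */
    Set<String> poll() {
        Set<String> current = new TreeSet<>(query.get());

        Set<String> changed = new TreeSet<>(current);
        changed.removeAll(lastResult);      // newly matching paths

        Set<String> gone = new TreeSet<>(lastResult);
        gone.removeAll(current);            // paths that no longer match
        changed.addAll(gone);

        lastResult = current;               // remember this snapshot for the next run
        return changed;
    }

    public static void main(String[] args) {
        Set<String> repository = new TreeSet<>();
        QueryPoller poller = new QueryPoller(() -> repository);

        repository.add("/etc/map/http/example");
        System.out.println(poller.poll());  // first run sees the new path
        System.out.println(poller.poll());  // nothing changed in between
    }
}
```

In practice `poll()` would be driven by a timer, for example `ScheduledExecutorService.scheduleWithFixedDelay`, and the polling interval becomes the latency the consumer has to tolerate.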
Re: Scalability of EventListener
So you mean instead of doing observation, doing a query periodically? This would mean that we basically say one of the main features of JCR, observation, is not usable.

Carsten
--
Carsten Ziegeler
cziege...@apache.org
Re: Scalability of EventListener
On 23 October 2013 11:25, Carsten Ziegeler cziege...@apache.org wrote:

> So you mean instead of doing observation, doing a query periodically? This would mean that we basically say, one of the main features of JCR, observation, is not usable.

Past experience says that global observation of all changes in a cluster is not usable, and is best replaced by application-specific messaging over a channel designed to scale. JCR Observation works just fine in the same memory space, but beyond that it is far too noisy for a repository performing write operations.

Ian
Re: Scalability of EventListener
I get this with global observation, but we're now talking about observation usage patterns and replacing them with queries. And these usage patterns usually observe only parts of the repository.

Carsten
--
Carsten Ziegeler
cziege...@apache.org
Re: Scalability of EventListener
On Wed, Oct 23, 2013 at 1:37 PM, Carsten Ziegeler cziege...@apache.org wrote:

> ...I get this with global observation, but we're now talking about observation usage patterns and replacing them with queries. And these usage patterns are usually only observing partial parts of the repository

I'm not saying we need to get rid of all observation, but to make sure Sling itself does not limit scalability we might need to change parts of our code to use alternatives, or at least make alternatives possible.

-Bertrand
Re: Scalability of EventListener
Ok, so I suggest we first identify these parts concretely before we speculate that there is a problem and replace it with something whose real impact we don't know.

So far, I heard that the global listener which translates jcr events into OSGi events is a problem - however, as far as I followed the discussion, this isn't a problem right now but might lead to problems in some time if huge Oak-based clustered repositories are used. But we don't have concrete numbers yet, so we could only fix this on a theoretical level.

The global listeners we have in the resource resolver for the mapping and the i18n one fall into the same category.

Is there more?

Carsten
--
Carsten Ziegeler
cziege...@apache.org
Re: Scalability of EventListener
On Wed, Oct 23, 2013 at 2:10 PM, Carsten Ziegeler cziege...@apache.org wrote:

> So far, I heard that the global listener which translates jcr events into OSGi events is a problem - however as far as I followed the discussion, this isn't a problem right now but might lead to problems in some time if huge Oak based clustered repositories are used...

Agreed.

> ...In the same category fall the global listeners we have in the resource resolver for the mapping and the i18n one.. Is there more?...

I looked at an analysis that we did earlier this year on our Sling-based app and I don't have more than that.

-Bertrand
Re: Scalability of EventListener
The analysis looks pretty good to me but does not provide answers on how to solve this. The first two topics, "Cached Content" and "Content Export, Replication to Remote Systems", are the ones where I don't see an option to get rid of the content change triggers. This might not apply to the whole tree, but does this really matter when the tree that needs to be watched contains over 90% of the data?

If I got the problem right, the overhead is created by the fact that an event object is created and sent regardless of whether a consumer cares about it or not. So it might be worth adding something that asks the listeners whether any of them would consume this event (something like an accepts() method that is handed raw metadata without generating the event object itself - more like a filter), and only then sends the event to be consumed by this and potentially other consumers.

Or did I get something completely wrong?

Cheers
Dominik
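The accepts() idea from this message can be sketched as a two-step dispatch: listeners are first asked, with raw metadata only, whether they care, and the event object is built only if at least one says yes. The Java below is a plain-JDK illustration; `LazyEventDispatcher`, the use of `BiPredicate` for the accepts check, and the map-based event are all assumptions, not an existing Sling or OSGi API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Consumer;

/** Hypothetical sketch: build the event object only when some listener has accepted the raw metadata. */
final class LazyEventDispatcher {
    private static final class Listener {
        final BiPredicate<String, String> accepts;   // (path, changeType) -> interested?
        final Consumer<Map<String, String>> handler; // receives the full event

        Listener(BiPredicate<String, String> accepts, Consumer<Map<String, String>> handler) {
            this.accepts = accepts;
            this.handler = handler;
        }
    }

    private final List<Listener> listeners = new ArrayList<>();
    private int eventsCreated;

    void register(BiPredicate<String, String> accepts, Consumer<Map<String, String>> handler) {
        listeners.add(new Listener(accepts, handler));
    }

    void dispatch(String path, String changeType) {
        List<Listener> interested = new ArrayList<>();
        for (Listener listener : listeners) {
            if (listener.accepts.test(path, changeType)) {
                interested.add(listener);
            }
        }
        if (interested.isEmpty()) {
            return; // nobody cares: no event object is built at all
        }
        Map<String, String> event = Map.of("path", path, "type", changeType);
        eventsCreated++;
        for (Listener listener : interested) {
            listener.handler.accept(event); // one event object shared by all interested listeners
        }
    }

    /** Number of event objects actually built so far. */
    int eventsCreated() {
        return eventsCreated;
    }

    public static void main(String[] args) {
        LazyEventDispatcher dispatcher = new LazyEventDispatcher();
        dispatcher.register((path, type) -> path.startsWith("/content/"),
                event -> System.out.println("got " + event.get("path")));
        dispatcher.dispatch("/content/page", "CHANGED");
        dispatcher.dispatch("/tmp/scratch", "ADDED");
        System.out.println(dispatcher.eventsCreated());
    }
}
```

The saving depends on the accepts check being much cheaper than event construction and delivery; a slow accepts function would move the cost rather than remove it.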
Re: Scalability of EventListener
Hi,

On Wed, Oct 23, 2013 at 4:01 PM, Dominik Süß dominik.su...@gmail.com wrote:

> ...This might not apply to the whole tree, but does this really matter when the tree that needs to be watched contains over 90% of the data?...

What's important is the frequency of observation events - if that 90% seldom changes, scaling won't be a problem.

> ...If I got the problem right the overhead is created by the fact that an event object is created and sent regardless if a consumer cares about it or not. So it might be worth to add something that asks the listeners if there is one that would consume this event...

That's what I meant above:

> ...A simple way of making our wide observation more specific is to require users of our rebroadcast OSGi events to indicate more specifically what they're interested in...

There are various ways of doing that: service properties, calling a service to register your interest in certain types of events, etc.

-Bertrand
Re: Scalability of EventListener
Hi Bertrand,

I'm not so sure about the seldom changes - it is not about the total amount of changes but about the amount of changes in a specific timeframe. The initial discussion started with the assumption of having to deal with tons of changes within a second, e.g. for huge imports. In some cases those imports are real imports (so new creation), so postponing the processing and delegating to some postprocessing might work - but in the case of updates this is not so easy (and I know such cases from the automotive area, where masses of technical data are synced to all markets).

Regarding the options you were stating:
- service properties: my initial thought, but static and therefore unable to address all scenarios
- registering for certain types: same problem, a few more possibilities but restricted to the capabilities of the registry

Therefore I thought of a filter-like behavior where the listeners can implement their own logic. Measuring the time such a call takes, and having a health check that reports when such logic is not performing well, would be a good way to give developers the tooling to identify where they lose performance in this area.

---
Dominik
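The measuring idea in this message can be sketched as a wrapper that times every call to a listener-supplied filter and exposes a simple health verdict. This is plain JDK Java; `TimedFilter`, `averageNanos` and `isHealthy` are hypothetical names, the threshold is arbitrary, and the wrapper is not thread-safe as written. The clock is injectable so the behaviour can be checked deterministically.

```java
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/** Hypothetical sketch: wraps a filter, records how long each call takes, and reports on it. */
final class TimedFilter<T> implements Predicate<T> {
    private final Predicate<T> delegate;
    private final LongSupplier nanoClock;
    private long calls;
    private long totalNanos;

    TimedFilter(Predicate<T> delegate) {
        this(delegate, System::nanoTime);
    }

    /** The clock is a parameter so that tests can supply a fake one. */
    TimedFilter(Predicate<T> delegate, LongSupplier nanoClock) {
        this.delegate = delegate;
        this.nanoClock = nanoClock;
    }

    @Override
    public boolean test(T candidate) {
        long start = nanoClock.getAsLong();
        try {
            return delegate.test(candidate);
        } finally {
            // Recorded even if the delegate throws, so failures still count as time spent.
            totalNanos += nanoClock.getAsLong() - start;
            calls++;
        }
    }

    /** Average time per call in nanoseconds, or 0 if the filter was never called. */
    double averageNanos() {
        return calls == 0 ? 0.0 : (double) totalNanos / calls;
    }

    /** A health check could poll this and flag filters whose average exceeds the budget. */
    boolean isHealthy(long maxAverageNanos) {
        return averageNanos() <= maxAverageNanos;
    }

    public static void main(String[] args) {
        TimedFilter<String> filter = new TimedFilter<String>(path -> path.startsWith("/content/"));
        filter.test("/content/page");
        filter.test("/tmp/scratch");
        System.out.println("average ns per call: " + filter.averageNanos());
        System.out.println("within 1 ms budget: " + filter.isHealthy(1_000_000));
    }
}
```

Because the wrapper is itself a `Predicate`, it can be dropped in wherever the unwrapped filter was used, which keeps the measurement optional.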
Re: Scalability of EventListener
On Wed, Oct 23, 2013 at 4:42 PM, Dominik Süß dominik.su...@gmail.com wrote:

> ...Regarding the options you were stating: - service properties: my initial thought, but static and can therefore not address all scenarios...

Service properties for a given service are static, but a service that has specific properties can come and go.

IMO we need to work on concrete use cases to decide on the best option.

-Bertrand
Re: Scalability of EventListener
From the use cases I know, the listeners register for a specific path and are rarely interested in anything else. Some check the resource type as well. For example, the script engines watch /libs and /apps (the resource paths, actually) for changes to scripts; the job engine checks for new jobs somewhere under /var/jobs and checks the resource type as well, etc. Many use cases check for changes (resource added, resource removed, resource changed) and then re-read the subtree based on the changes.

The mapping handler in the resource resolver is probably the most interesting one as it checks for nodes with some well defined properties, basically scanning the whole repository. And I think the i18n stuff does something similar (but this one might still be using jcr observation).

This list is by no means exhaustive for Sling, but we see that we already have two listeners scanning the whole repository and requiring access to properties.

Carsten

2013/10/22 Dominik Süß dominik.su...@gmail.com

> In a discussion [0] on the Oak mailing list it became clear that the way Sling listens to JCR repository changes and transforms all of them into events will not scale well in some of the large-scale scenarios that Oak is aiming to enable. Therefore the question was raised whether it would be feasible and/or even necessary to refactor the API and deprecate (or at least discourage) registration in a global scope as currently done.
>
> Since I do not want to copy and paste parts of the discussion, I hope that the participants of the Oak discussion will add the remaining options with some more detail about consequences than can be found in the linked discussion. It would be great to also get some feedback from consumers of the existing API about their usage, to identify how fine-grained a potential registration logic with paths/properties might need to be.
>
> Best regards
> Dominik
>
> [0] http://markmail.org/message/n5vllhjoawypteck

--
Carsten Ziegeler
cziege...@apache.org