Apache Sling Eventing and Job Handling
Page edited by Carsten Ziegeler
Overview

The Apache Sling Event Support bundle provides services for advanced event handling and job processing. While the bundle builds on the OSGi EventAdmin, its main addition is support for so-called jobs: a job is a task that has to be performed by a component, and the Sling job handling ensures that exactly one component performs it.

To get some hands-on code, you can refer to the following tutorials:

Possible Use Cases for Job Handling
Jobs (Guarantee of Processing)

In general, the eventing mechanism (OSGi EventAdmin) has no knowledge about the contents of an event. Therefore, it cannot decide whether an event is important and must be processed by someone. As the event mechanism is a "fire and forget" mechanism, there is no way for an event admin to tell whether someone has really processed the event. Processing of an event could fail, the server or bundle could be stopped, and so on.

On the other hand, there are use cases where guaranteed processing of a job is a must, and usually this comes with the requirement of processing the job exactly once. Typical examples are sending notification emails (or SMS) or post-processing of content (like thumbnail generation for images or documents).

The Sling Event Support adds the notion of a job to the OSGi EventAdmin. A job is a special OSGi event that someone has to process (do the job). The job event has the special topic =org/apache/sling/event/job= to indicate that the event contains a job. These job events are consumed by the Sling Job Handler, which ensures that someone does the job. To support different jobs and different processors of such jobs, the real topic of the event is stored in the =event.job.topic= property of the original event. When a job event (an event with the topic =org/apache/sling/event/job=) is received, a new event with the topic from the =event.job.topic= property is fired (firing this event comes with a set of rules and constraints explained below).

Persistence

The job event handler listens for all job events (all events with the topic =org/apache/sling/event/job=) and, as a first step, persists those events in the JCR repository. All job events are stored in a tree under the job root node =/var/eventing/jobs=. Persisting the job ensures proper handling in a clustered environment and allows failover handling after a bundle stop or server restart.
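The relationship between the job topic and the real topic can be illustrated with a minimal sketch. It models the event properties as a plain =Map= so that it runs without an OSGi container; in a bundle the map would typically be wrapped in an =org.osgi.service.event.Event= and handed to the EventAdmin. The class name, the method names and the example values are hypothetical; only the topic and property names come from this page.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: job event properties modelled as a plain Map, no OSGi dependency.
class JobEventSketch {
    // Topic and property names as given in the text.
    static final String JOB_TOPIC = "org/apache/sling/event/job";
    static final String PROPERTY_JOB_TOPIC = "event.job.topic";
    static final String PROPERTY_JOB_ID = "event.job.id";

    // Builds the properties of a job event; jobId may be null because the id is optional.
    static Map<String, Object> jobProperties(String realTopic, String jobId) {
        Map<String, Object> props = new HashMap<>();
        props.put(PROPERTY_JOB_TOPIC, realTopic);
        if (jobId != null) {
            props.put(PROPERTY_JOB_ID, jobId);
        }
        return props;
    }

    // True if an event with this topic has to be treated as a job.
    static boolean isJobEvent(String eventTopic) {
        return JOB_TOPIC.equals(eventTopic);
    }

    // Returns the topic under which the follow-up event for a job is fired.
    static String realTopic(Map<String, Object> jobProps) {
        Object topic = jobProps.get(PROPERTY_JOB_TOPIC);
        if (!(topic instanceof String)) {
            throw new IllegalArgumentException("job event without " + PROPERTY_JOB_TOPIC);
        }
        return (String) topic;
    }

    public static void main(String[] args) {
        // "com/example/thumbnail" and "image-42" are made-up example values.
        Map<String, Object> props = jobProperties("com/example/thumbnail", "image-42");
        if (isJobEvent(JOB_TOPIC)) {
            System.out.println("fire follow-up event with topic " + realTopic(props));
        }
    }
}
```

The sender always uses the fixed job topic and puts the topic its handlers actually listen to into the properties; the job handler swaps the two when it fires the follow-up event.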
Once a job has been processed by someone, it is removed from the repository. Jobs are stored in the repository in order to ensure that exactly one application node processes the job. The repository has a specific area (path) where all jobs are stored.

In order to distinguish a job that occurred twice from a job that is generated at the same time on several nodes, each job can be uniquely identified by its topic (property =event.job.topic=) and the =event.job.id= property. It is up to the client creating the event to ensure that the =event.job.id= property is unique and identical on all application nodes. If no job id is provided, it is up to the client to ensure that the job event is fired only once.

When the job event listener tries to write a job into the repository, it checks whether the repository already contains a job with the given =event.job.topic= and =event.job.id= properties. If the event has already been written by some other application node, it is not written again. If the event has been written by the same node, it is set to active again (=event:active= is set to true and =event:created= is updated).

Each job is stored as a separate node with the following properties:
The failover of an application node is accomplished by locking and the =event:active= flag. If a job is locked in the repository, a session-scoped lock is used. If the application node dies, the lock dies as well. Each application node observes the JCR locking properties and therefore becomes aware of unlocked event nodes with the active flag set to true. If an application node finds such a node, it locks it, updates the =event:application= information and processes it accordingly. In this case the event gets the additional property =org/apache/sling/job/retry=. Each application node periodically removes old jobs from the repository (using the scheduler).

Job Processors

To avoid timeouts and blacklisting of event handlers, the job event handler does not assume that the job has been processed just because the event could be sent successfully. It is the task of the event handler to notify the job event handler that it has processed the job. In addition, the job processing should be done in the background. The =EventUtil= class has a helper method for this: =processJob(Event, JobProcessor)=. The event handler must implement the =JobProcessor= interface, which consists of a single =process(Event)= method. When the event handler receives a job event, it calls =EventUtil.processJob(event, this)= and returns. The =process(Event)= method is then called in the background, and when it finishes, the job event handler is notified that the job is completed. If the event handler wants to do the background processing by itself or does not need background processing at all, it must signal completion of the job by calling =EventUtil.finishedJob(event)=.

By default an application node queues the jobs, which means that only one job is processed at a time. If a job can be run in parallel on one application node, the property =event.job.parallel= should be set (with any value). The job id is optional and can be used to update or reactivate jobs.
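The hand-off described under Job Processors (return from the handler immediately, do the work in the background, acknowledge when done) can be sketched in plain Java. This is not the Sling implementation: =JobProcessingSketch=, its =Processor= interface and the use of a string in place of an =Event= are hypothetical stand-ins for =EventUtil.processJob=, =JobProcessor= and =EventUtil.finishedJob=, and the boolean result of =process= is an assumption made for the sketch.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: a stand-in for the processJob / finishedJob hand-off.
class JobProcessingSketch {
    // Hypothetical counterpart of JobProcessor; returns true if the job succeeded.
    interface Processor {
        boolean process(String job);
    }

    // A single worker thread mirrors the default queuing: one job at a time.
    private final ExecutorService queue = Executors.newSingleThreadExecutor();
    private final List<String> finished = new CopyOnWriteArrayList<>();

    // Returns immediately; the processor runs in the background and the job
    // is acknowledged only if processing reports success.
    void processJob(String job, Processor processor) {
        queue.execute(() -> {
            if (processor.process(job)) {
                finishedJob(job);
            }
        });
    }

    // Acknowledges a job; a handler doing its own background work would call this directly.
    void finishedJob(String job) {
        finished.add(job);
    }

    List<String> finishedJobs() {
        return finished;
    }

    // Stops accepting jobs and waits for the queued ones to complete.
    void shutdown() {
        queue.shutdown();
        try {
            queue.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        JobProcessingSketch handler = new JobProcessingSketch();
        handler.processJob("send-mail", job -> true);       // succeeds, gets acknowledged
        handler.processJob("make-thumbnail", job -> false); // fails, stays unacknowledged
        handler.shutdown();
        System.out.println(handler.finishedJobs());
    }
}
```

The point of the pattern is that the event handler thread is released at once, so the event admin has no reason to blacklist the handler, while the job stays open until it is explicitly acknowledged.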
Distribution of Jobs

A job event is an event like any other. Therefore it is up to the client generating the event to decide whether the event should be distributed. If the event is distributed, it arrives on the remote nodes with the =event.application= property set. If the job event handler receives a job with the =event.application= property set, it does not try to write it into the repository. It just broadcasts this event asynchronously as an FYI event.

If a job event is created simultaneously on all application nodes, the event is not distributed. The application node that actually holds the lock on the stored job in the repository clears the =event.application= property when sending the event locally. All other application nodes use the =event.application= value stored in the repository when broadcasting the event locally.

Usage Patterns

Based on some usage patterns, we discuss the functionality of the eventing mechanism.

Sending User Generated Events

If a user action results in an event, the event is created on only one node in the cluster. The event object is generated and delivered to the OSGi event admin. If =event.distribute= is not explicitly set, the event is only distributed locally. If =event.distribute= is set, the cluster event handler writes the event into the repository. All nodes in the cluster observe the repository area where all events are stored. If a new event is written into that area, each application node is notified. It creates the event based on the information in the repository, clears =event.distribute= and publishes the event.

The flow can be described as follows:
Processing JCR Events

JCR events are environment-generated events and are therefore sent by the repository to each node in the cluster. In general, it is advisable not to build the application on the low-level repository events but to use application events. Therefore the observer of the JCR event should create an OSGi event based on the changes in the repository. A decision has to be made whether the event should be a job or a plain event.

The flow can be described as follows:
Sending Scheduled Events

Scheduled events are OSGi events that have been created by the environment. They are generated on each application node of the cluster through its own scheduler instance. Sending these events works the same way as sending events based on JCR events (see above). In most use cases a scheduler will send job events to ensure that exactly one application node processes the event.

Receiving OSGi Events

If you want to receive OSGi events, you can just follow the specification: receive them via a custom event handler which is registered on bundle start; a filter can be specified as a configuration property of the handler. As we follow the principle of distributing each event to every registered handler, the handler has to decide whether it will process the event. In order to avoid multiple processing of an event in a clustered environment, the event handler should check the =event.application= property. If it is not set, it is a local event and the handler should process the event. If =event.application= is set, it is a remote event and the handler should not process the event. This is a general rule of thumb; however, it is up to the handler to base its decision on =event.application= or on any other information. It is advisable to perform the local event check even in a non-clustered environment, as it makes a later migration to a cluster much easier and the check causes nearly no performance overhead. The =EventUtil= class provides a utility method =isLocalEvent(Event)= which checks for the existence of the =event.application= property and returns =true= if it is absent.

Distributed Events

In addition to the job handling, the Sling Event Support adds handling for distributed events. A distributed event is an OSGi event which is sent across JVM boundaries to a different VM. A potential use case is broadcasting information in a clustered environment.
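The local-versus-remote rule of thumb for event handlers can be sketched as follows. The sketch works on a plain property =Map= instead of an OSGi =Event= so that it runs standalone; =LocalEventCheckSketch= and its =handle= method are hypothetical, and =isLocalEvent= here only imitates the behaviour this page attributes to =EventUtil.isLocalEvent(Event)=.

```java
import java.util.Map;

// Sketch only: the local-event check on a plain property map.
class LocalEventCheckSketch {
    // Property name as given in the text; it is set by the event mechanism on remote events.
    static final String PROPERTY_APPLICATION = "event.application";

    // An event counts as local when the application property is absent.
    static boolean isLocalEvent(Map<String, ?> eventProperties) {
        return eventProperties.get(PROPERTY_APPLICATION) == null;
    }

    // Rule of thumb from the text: process local events, treat remote ones as FYI.
    static String handle(Map<String, ?> eventProperties) {
        return isLocalEvent(eventProperties) ? "processed" : "ignored";
    }

    public static void main(String[] args) {
        // "path" and "node-2" are made-up example values.
        System.out.println(handle(Map.of("path", "/content/a")));
        System.out.println(handle(Map.of("path", "/content/a", PROPERTY_APPLICATION, "node-2")));
    }
}
```

Because the check is a single property lookup, doing it in every handler costs very little, which is why the text recommends it even for single-node installations.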
Sources of Events

When it comes to applications based on Sling, there is a variety of sources from which OSGi events can be sent:
The events can either be generated inside a current user context, e.g. when the user performs an action through the UI, or outside of a user context, e.g. for scheduled events. This leads to different weights of events.

Weights of Events

We can distinguish two different weights of events, depending on how they are distributed in a clustered environment:
External events, like incoming JMS events, might fall into either the first or the second category. The receiver of such events must know the weight of the event.

Basic Principles

The foundation of the distributed event mechanism is to distribute each event to every node in a clustered environment. The event distribution mechanism has no knowledge about the intent of the event and is therefore not able to make delivery decisions by itself. It is up to the sender to decide what should happen; however, the sender must explicitly declare an event to be distributed. There are exceptions to "distributing everything to everywhere": for example, framework-related events (bundle stopped, installed etc.) should not be distributed.

The event mechanism provides additional functionality that makes it easier for event receivers to decide whether they should process an event. The event receiver can determine whether the event is a local event or comes from a remote application node. Therefore a general rule of thumb is to process events only if they are local and to regard remote events just as an FYI.

The event mechanism is an event mechanism and should not be confused with a messaging mechanism. Events are received by the event mechanism and distributed to registered listeners. Concepts like durable listeners, guarantee of processing etc. are not part of the event mechanism itself. However, there is additional support for such things, like job handling.

The application should try to use application events instead of low-level JCR events wherever possible. Therefore a bridge between JCR events and the event mechanism is required. However, a general "automatic" mapping will not be provided. It is up to the application to develop such a mapping on a per-use-case basis. There might be some support to make the mapping easier.

The event handling should be made as transparent to the developer as possible.
Therefore the additional code a developer needs to make eventing work in a clustered environment should be kept to a minimum (which will hopefully reduce possible user errors).

Event Mechanism

The event mechanism leverages the OSGi Event Admin Specification (OSGi Compendium 113). The event admin specification provides a sufficient base. It is based on the publish and subscribe mechanism. Each event is associated with a topic and data. The data consists of custom key-value pairs where the keys are strings and the values can be any object. However, to work in distributed environments it is advisable to use only string and scalar types for data. If complex objects are required, they have to be serializable.

Events can be sent either synchronously or asynchronously. It is up to the caller (the one sending the event) to decide this by choosing one of the provided methods.

The OSGi API is very simple and lightweight: sending an event is just generating the event object and calling the event admin. Receiving an event is implementing a single interface and declaring through properties which topics one is interested in. It is possible to add an additional filter (based on property values, for example).

The event handler should not take too much time to process the event. For example, the Apache Felix implementation of the event admin blacklists an event handler if it takes more than 5 seconds to process the event, regardless of whether the event is sent synchronously or asynchronously. Therefore any heavier processing has to be done in the background. The event is just the trigger to start this. The job mechanism explained in this documentation is a good way of implementing this functionality for an event handler.

The aim is to add all functionality on top of an existing event admin implementation. Therefore everything should be added by additional event handlers.

Event Handler

An event handler registers itself on a topic (or a set of topics).
It can also specify a filter for the events. The event handler is notified either synchronously or asynchronously, depending on how the event has been sent.

Events

The type of the event is specified by the hierarchically organized topic. In order to support clustering of JCR repositories and clustering of the Sling-based application instances, each event can contain the following properties; if they are absent, a default value is assumed:
While the =event.distribute= property must be set by the sender of an event (if the event should be distributed), the =event.application= property is maintained by the event mechanism. Therefore a client sending an event should never set this property itself; doing so confuses the local event handlers and results in unexpected behaviour. On remote events the =event.application= property is set by the event distribution mechanism.

Event Distribution Across Application Nodes (Cluster)

The (local) event admin is the service distributing events locally. The Sling Distributing Event Handler is a registered event handler that listens for events to be distributed. It distributes the events to remote application nodes, using the JCR repository for distribution. The distributing event handler writes the events into the repository; the distributing event handlers on the other application nodes are notified through observation and then distribute the events they read locally.

As mentioned above, the client sending an event has to mark an event to be distributed in a cluster by setting the =event.distribute= property in the event properties (through =EventUtil=). An event handler receiving such an event can distinguish it by checking the =event.application= property. If the property is not available, it is a local event; if the property is available, it is a remote event.

This distribution mechanism has the advantage that the application nodes do not need to know each other and that the distribution mechanism is independent of the event admin implementation used. Defining the filter for =event.distribute= is also very simple.

Storing Events in the Repository

Distributable events are stored in the repository; the repository has a specific area (path) where all events are stored. Each event is stored as a separate node with the following properties:
Each application node periodically removes old events from the repository (using the scheduler).

Jobs (Guarantee of Processing)

In general, the eventing mechanism has no knowledge about the contents of an event. Therefore, it cannot decide whether an event must be processed by a node. As the event mechanism is a "fire and forget" mechanism, there is no way for an event admin to tell whether someone has processed the event. On the other hand, there are use cases where guaranteed processing is a must, and usually this comes with the requirement of processing the event exactly once. Typical examples are sending notification emails (or SMS) or post-processing of content (like thumbnail generation for images or documents).

We call these events jobs to make clear that someone has to do something with the event (do the job). The special topic =org/apache/sling/event/job= indicates that the event contains a job; the real topic of the event is stored in the =event.job.topic= property. When a job event (an event with the topic =org/apache/sling/event/job=) is received, a new event with the topic from the =event.job.topic= property is fired.

The event must have the following properties:
The job event handler listens for all job events (all events with the topic =org/apache/sling/event/job=). The event handler writes the job event into the repository (into the job area), locks it, creates a new event with the topic from the =event.job.topic= property and sends the job event through the event admin. When the job is finished, the event listener unlocks the node in the repository.
Scheduler

Each Sling-based application contains a scheduler service (which is based on the open source Quartz project).
Use Cases

Post Processing (Business Processes)

A typical example of post-processing (or running a business process) is sending an email, or creating thumbnails and extracting metadata from content (as is done in DAM), which we will discuss here. An appropriate JCR observer is registered. This observer detects when new content is put into the repository or when content is changed. In these cases it creates appropriate =CONTENT_ADDED= and =CONTENT_UPDATED= OSGi events from the JCR events. In order to ensure that these actions get processed, the event is sent as a job (with the special job topic and the =topic= and =id= properties). The event admin then delivers these jobs to the registered handlers. The job event handler gets notified and (in a simplified view) sends the contained event synchronously. One of the handlers for these events is the post-processing service in DAM. The job mechanism ensures that exactly one application node does the post-processing and that the process is finished even if the application node dies during execution.

Scheduling

The scheduler is a service which uses the open source Quartz library. The scheduler has methods to start jobs periodically or with a cron definition. In addition, a service implementing either =java.lang.Runnable= or =org.quartz.Job= is started through the whiteboard pattern if it contains either a =scheduler.expression= or a =scheduler.period= configuration property. The job is started with the PID of the service; if the service has no PID, the configuration property =scheduler.name= must be set.
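The whiteboard rules for scheduled services can be written down as a small decision sketch. It is not the scheduler's code: =SchedulerWhiteboardSketch= and its methods are hypothetical, the cron property is assumed to be named =scheduler.expression=, the PID is assumed to be found under the standard OSGi key =service.pid=, and =org.quartz.Job= is left out so that the example has no dependencies.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: which services get scheduled, and under which job name.
class SchedulerWhiteboardSketch {
    static final String PROPERTY_EXPRESSION = "scheduler.expression"; // assumed name of the cron property
    static final String PROPERTY_PERIOD = "scheduler.period";
    static final String PROPERTY_NAME = "scheduler.name";
    static final String PROPERTY_PID = "service.pid"; // standard OSGi service PID key

    // A service qualifies if it is a Runnable and carries one of the two scheduling properties.
    static boolean isScheduled(Object service, Map<String, ?> props) {
        return service instanceof Runnable
                && (props.containsKey(PROPERTY_EXPRESSION) || props.containsKey(PROPERTY_PERIOD));
    }

    // The job name is the PID if present, otherwise scheduler.name, which is then mandatory.
    static String jobName(Map<String, ?> props) {
        Object pid = props.get(PROPERTY_PID);
        if (pid != null) {
            return pid.toString();
        }
        Object name = props.get(PROPERTY_NAME);
        if (name == null) {
            throw new IllegalArgumentException("neither " + PROPERTY_PID + " nor " + PROPERTY_NAME + " is set");
        }
        return name.toString();
    }

    public static void main(String[] args) {
        Runnable cleanup = () -> System.out.println("cleaning up");
        Map<String, Object> props = new HashMap<>();
        props.put(PROPERTY_PERIOD, 60L);          // made-up period value
        props.put(PROPERTY_NAME, "cleanup-job");  // made-up name, needed because no PID is set
        if (isScheduled(cleanup, props)) {
            System.out.println("would schedule job " + jobName(props));
        }
    }
}
```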