[ https://issues.apache.org/jira/browse/SLING-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242784#comment-15242784 ]
Ian Boston commented on SLING-5645:
-----------------------------------

The Jobs API can be found at
https://github.com/ieb/sling/tree/jobs_28/contrib/extensions/jobs/core/src/main/java/org/apache/sling/jobs
with the implementation in the same bundle. A Crankstart-based integration test bundle starts the Jobs Subsystem inside a cut-down OSGi container. Its dependencies are best seen by inspecting the Crankstart provisioning model:
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it/src/test/resources/provisioning-model/jobs-runtime.txt

The implementation uses the MoM API and relies on two aspects of that contract: messages in a jobs queue are delivered once and once only, and each is delivered to a single message listener. Assuming this contract is honored by the MoM implementation, the Jobs Subsystem can define a JobConsumer API with the contract that implementors guarantee that any job accepted by a job consumer will be processed. Assuming that contract is also honored, the Jobs Subsystem can distribute jobs for execution as allowed by the MoM API.

For components that have an interest in the status of jobs while they are being executed, the publish/subscribe capability of the MoM API is used to stream status events for jobs that are in the queue or being executed. There has been some off-list review and discussion of this aspect with people who have experience running MoM infrastructure at scale. The consensus is that, provided the pub/sub implementation can be made durable when required, there is no need to maintain a centralised persistence store dedicated to job status. This potentially frees job processing from the scalability limitations that a store of that nature would introduce, but places a further burden on the MoM API implementation.

There is an example implementation of a synchronous JobConsumer in the integration tests:
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it-services/src/main/java/org/apache/sling/jobs/it/services/FullySyncJob.java
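
To make the synchronous consumer idea concrete, here is a minimal sketch, not the code from the jobs_28 branch. Job, JobConsumer and deliver are hypothetical stand-ins for illustration, and an in-memory Deque plays the role of the MoM queue. It assumes that a message leaves the queue only after the consumer has finished processing it on the delivering thread, and that a job whose processing throws simply stays queued, which is a simplification of broker redelivery.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SyncJobConsumerSketch {

    /** Hypothetical stand-in for a job message; not the real org.apache.sling.jobs type. */
    static final class Job {
        final String id;
        Job(String id) { this.id = id; }
    }

    /** Hypothetical consumer contract: returning normally means the job was processed. */
    interface JobConsumer {
        void execute(Job job) throws Exception;
    }

    /**
     * Simulates a message listener handing the head of the queue to a consumer
     * on the calling (message) thread. The job is removed only if execute()
     * returns normally; if it throws, the job stays at the head of the queue.
     * Returns true when a job was processed and dequeued.
     */
    static boolean deliver(Deque<Job> queue, JobConsumer consumer) {
        Job job = queue.peekFirst();
        if (job == null) {
            return false; // nothing to deliver
        }
        try {
            consumer.execute(job); // processing happens on the message thread
            queue.pollFirst();     // processed: remove the job from the queue
            return true;
        } catch (Exception e) {
            return false;          // not dequeued; still available for a retry
        }
    }

    public static void main(String[] args) {
        Deque<Job> queue = new ArrayDeque<>();
        queue.add(new Job("job-1"));

        List<String> processed = new ArrayList<>();
        int[] attempts = {0};

        // Assumed behaviour for the demo: fail on the first attempt, succeed on the second.
        JobConsumer flaky = job -> {
            if (++attempts[0] == 1) {
                throw new IllegalStateException("transient failure");
            }
            processed.add(job.id);
        };

        System.out.println("dequeued=" + deliver(queue, flaky) + " remaining=" + queue.size());
        System.out.println("dequeued=" + deliver(queue, flaky) + " remaining=" + queue.size());
    }
}
```

Because deliver blocks until execute returns, the delivering thread can handle only one job at a time in this sketch.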
FullySyncJob is a trivial implementation, as it uses the message thread to perform the job processing. Typically that would not be viable because it would limit throughput on the consuming JVM, but it demonstrates the process of consuming a job. If the execute method on the JobConsumer throws an exception, the interface contract states that the job message will not be dequeued from the MoM implementation, which will then retry based on its configuration. ActiveMQ, the current out-of-the-box implementation, supports retry configuration on queues.

The second example implementation,
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it-services/src/main/java/org/apache/sling/jobs/it/services/AsyncJobConsumer.java
is more realistic. It uses a ThreadPoolExecutor with a fixed queue size. As jobs are dequeued they enter the local queue, which is drained by threads from the pool running Callables. If the local queue becomes full, job messages are rejected and returned to the MoM queue to be retried. When the component shuts down, the pool is drained. This implementation does not guarantee that every job starts execution under every scenario, and it would not survive a total hardware failure; however, it limits the size of the local queue to balance the risk of damage against complexity.

The structure of the PoC is probably complete, although further work on testing is ongoing.


> Provide a Jobs API and implementation suitable for widely distributed job
> processing.
> -------------------------------------------------------------------------------------
>
>                 Key: SLING-5645
>                 URL: https://issues.apache.org/jira/browse/SLING-5645
>             Project: Sling
>          Issue Type: New Feature
>            Reporter: Ian Boston
>            Assignee: Ian Boston
>
> This issue is to track work on a proof of concept to create a Jobs API and
> implementation that will work in a distributed environment where the job
> submitters and job consumers are not necessarily in the same JVM or in the
> same Sling cluster.
> Work is being done in a branch at
> https://github.com/ieb/sling/tree/jobs_28/contrib/extensions/jobs
> Since the implementation needs supporting APIs/capabilities not already
> present in Sling, there are some sub-tasks associated with this issue to
> address those.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)