[ 
https://issues.apache.org/jira/browse/SLING-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242784#comment-15242784
 ] 

Ian Boston commented on SLING-5645:
-----------------------------------

The Jobs API can be found at 
https://github.com/ieb/sling/tree/jobs_28/contrib/extensions/jobs/core/src/main/java/org/apache/sling/jobs
 with the implementation in the same bundle. A Crankstart-based IT test bundle 
starts the Jobs Subsystem inside a cut-down OSGi container. Its dependencies 
can best be seen by inspecting the Crankstart provisioning model: 
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it/src/test/resources/provisioning-model/jobs-runtime.txt.
 

The implementation uses the MoM API, leveraging two aspects of that contract: 
messages in a jobs queue are delivered once and once only, and each message is 
delivered to a single message listener. Assuming this contract is honored by 
the MoM implementation, the Jobs Subsystem can define a JobConsumer API with 
the contract that implementors will guarantee any job accepted by a job 
consumer will be processed. Assuming that contract is also honored, the Jobs 
Subsystem can distribute jobs for execution as allowed by the MoM API. 

To allow components that have an interest in the status of Jobs to follow them 
while they are being executed, the MoM API publish/subscribe capability is used 
to stream status events for jobs that are queued or being executed. There has 
been some review and discussion off-list about this aspect with those who have 
experience running MoM infrastructure at scale, and the consensus is that, 
provided the Pub/Sub implementation can be made durable when required, there is 
no need to maintain a centralised persistence store dedicated to Job status. 
This potentially frees job processing from the scalability limitations that a 
store of that nature would introduce, but places a further burden on the MoM 
API implementation.
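
As a rough illustration of the publish/subscribe idea above, the following 
sketch uses an in-memory topic. It is not the MoM API from the branch: 
JobStatusEvent, JobStatusTopic and the state names are hypothetical, and a real 
implementation would sit on a broker topic that can be made durable.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical status event; the field names are illustrative only.
final class JobStatusEvent {
    final String jobId;
    final String state;

    JobStatusEvent(String jobId, String state) {
        this.jobId = jobId;
        this.state = state;
    }
}

// Hypothetical in-memory topic. Every subscriber sees every event,
// unlike a queue, where each message goes to a single listener.
class JobStatusTopic {
    private final List<Consumer<JobStatusEvent>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<JobStatusEvent> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(JobStatusEvent event) {
        for (Consumer<JobStatusEvent> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}
```

Each interested component subscribes independently and keeps whatever view of 
job status it needs, which is why no shared status store is required in this 
model.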

There is an example implementation of a Synchronous JobConsumer in the 
integration tests 
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it-services/src/main/java/org/apache/sling/jobs/it/services/FullySyncJob.java.
 This is a trivial implementation as it uses the message thread to perform job 
processing. Typically that would not be viable, as it would limit throughput 
on the consuming JVM, but it demonstrates the process of consuming a Job. If 
the execute method on the JobConsumer throws an exception, the interface 
contract states that the Job message will not be dequeued from the MoM 
implementation, which will then retry delivery based on its configuration. 
ActiveMQ, the current OOTB implementation, supports retry configuration on 
queues.
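
The retry behaviour described above can be sketched with an in-memory stand-in 
for the queue. This is not the JobConsumer interface from the branch nor 
ActiveMQ's redelivery mechanism: SketchConsumer, SketchQueue and maxDeliveries 
are hypothetical names, and the only point shown is that a message is removed 
only when execute returns normally.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a synchronous consumer.
interface SketchConsumer {
    void execute(String job) throws Exception;
}

// Hypothetical in-memory stand-in for a MoM queue with a delivery limit.
class SketchQueue {
    private final Deque<String> messages = new ArrayDeque<>();
    private final int maxDeliveries;

    SketchQueue(int maxDeliveries) {
        this.maxDeliveries = maxDeliveries;
    }

    void add(String job) {
        messages.add(job);
    }

    int size() {
        return messages.size();
    }

    /**
     * Delivers the head message on the calling thread. Returns the number of
     * attempts used, 0 if the queue was empty, or -1 if every attempt failed.
     */
    int deliverNext(SketchConsumer consumer) {
        String job = messages.peek();
        if (job == null) {
            return 0;
        }
        for (int attempt = 1; attempt <= maxDeliveries; attempt++) {
            try {
                consumer.execute(job);
                messages.poll(); // dequeue only after successful execution
                return attempt;
            } catch (Exception e) {
                // The message stays on the queue and is redelivered.
            }
        }
        messages.poll(); // attempts exhausted; this sketch simply drops the message
        return -1;
    }
}
```

Because delivery and execution share one thread here, a slow job blocks all 
further delivery, which is the throughput limitation noted above.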

The second example implementation, 
https://github.com/ieb/sling/blob/jobs_28/contrib/extensions/jobs/it-services/src/main/java/org/apache/sling/jobs/it/services/AsyncJobConsumer.java
 is a more realistic implementation. It uses a ThreadPoolExecutor with a fixed 
queue size. As jobs are dequeued they enter the local queue, which is drained 
by threads from the pool running Callables. If the queue becomes full, Job 
messages are rejected and returned to the MoM Queue to be retried. When the 
component shuts down, the pool is drained. This implementation does not fully 
guarantee that every job starts execution under every scenario, and it would 
not survive a total hardware failure; however, it limits the size of the local 
queue to balance the risk of losing accepted jobs against complexity.
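
The bounded local queue described above can be sketched with the standard 
java.util.concurrent classes. This is not the AsyncJobConsumer from the branch: 
BoundedJobRunner and its method names are hypothetical, and it uses Runnable 
rather than Callable for brevity. A false return from offer is the point at 
which a consumer would hand the message back to the MoM queue for retry.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a consumer-side runner with a fixed-size local queue.
class BoundedJobRunner {
    private final ThreadPoolExecutor pool;

    BoundedJobRunner(int threads, int queueSize) {
        // The default rejection policy (AbortPolicy) throws
        // RejectedExecutionException once the threads and the queue are full.
        pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize));
    }

    /** Returns true if the job was accepted locally, false if it should be retried later. */
    boolean offer(Runnable job) {
        try {
            pool.execute(job);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    /** Stops accepting work and waits for already accepted jobs to finish. */
    boolean shutdown(long timeout, TimeUnit unit) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With one thread and a queue size of one, at most two accepted jobs are held in 
memory at any time, so that is the most that could be lost if the JVM died 
before they completed.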

The structure of the PoC is probably complete, although further work on testing 
is ongoing.

> Provide a Jobs API and implementation suitable for widely distributed job 
> processing.
> -------------------------------------------------------------------------------------
>
>                 Key: SLING-5645
>                 URL: https://issues.apache.org/jira/browse/SLING-5645
>             Project: Sling
>          Issue Type: New Feature
>            Reporter: Ian Boston
>            Assignee: Ian Boston
>
> This issue is to track work on a proof of concept to create a Jobs API and 
> implementation that will work in a distributed environment where the job 
> submitters and job consumers are not necessarily in the same JVM or in the 
> same Sling cluster. 
> Work is being done in a branch at 
> https://github.com/ieb/sling/tree/jobs_28/contrib/extensions/jobs
> Since the implementation needs supporting APIs/Capabilities not already 
> present in Sling, there are some sub-tasks associated with this issue to 
> address those.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
